You can run and edit these examples interactively on Galaxy
Load a Study from the MGnify API and fetch its Analyses
The MGnify API returns JSON data. The jsonapi_client package can help you load this data into Python, e.g. into a Pandas dataframe.
This example shows you how to load a MGnify Study’s Analyses from the MGnify API.
You can find all of the other “API endpoints” using the Browsable API interface in your web browser. The URL you see in the browsable API is exactly the same as the one you can use in this code.
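As a rough illustration of how a browsable URL relates to the code in this notebook: the notebook opens a session on the API base URL and passes the rest of the address as a relative path. The helper below is a hypothetical convenience (`relative_endpoint` is not part of jsonapi_client or of this notebook) that strips the base from a URL copied out of the browser.

```python
# Base URL of the MGnify API, as used in the Session call in this notebook.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def relative_endpoint(browsable_url, base=API_BASE):
    """Return the part of a browsable API URL that follows the API base.

    Hypothetical helper for illustration only.
    """
    if not browsable_url.startswith(base):
        raise ValueError(f"URL does not start with {base}")
    # Drop the base and any leading or trailing slashes.
    return browsable_url[len(base):].strip("/")


path = relative_endpoint(
    "https://www.ebi.ac.uk/metagenomics/api/v1/studies/MGYS00005292/analyses"
)
print(path)  # studies/MGYS00005292/analyses
```

The resulting path has the same shape as the `studies/{accession}/analyses` string used when fetching data.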
This is an interactive code notebook (a Jupyter Notebook). To run this code, click into each cell and press the ▶ button in the top toolbar, or press shift+enter.
Select a Study
Pick a particular Study of interest. If you followed a link to this notebook, we might already know the Study Accession. Otherwise, you can enter one or use the example:
from lib.variable_utils import get_variable_from_link_or_input

# You can also just directly set the accession variable in code, like this:
# accession = "MGYS00005292"
accession = get_variable_from_link_or_input('MGYS', 'Study Accession', 'MGYS00005292')
Using Study Accession MGYS00005292 from the link you followed.
Using "MGYS00005292" as Study Accession
Fetch data
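Fetch the Analyses for this Study from the MGnify API into a Pandas dataframe.
from jsonapi_client import Session
import pandas as pd

with Session("https://www.ebi.ac.uk/metagenomics/api/v1") as mgnify:
    # Iterate over every Analysis of the Study and keep each resource's raw JSON.
    # map() is lazy, so it must be consumed while the session is still open.
    analyses = map(lambda r: r.json, mgnify.iterate(f'studies/{accession}/analyses'))
    # Flatten the nested JSON records into one row per Analysis.
    analyses = pd.json_normalize(analyses)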
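To see what `pd.json_normalize` does without calling the API, here is a minimal offline sketch. The two records are hand-written stand-ins shaped like the nested JSON resources; their ids and values are made up for illustration and are not real MGnify data.

```python
import pandas as pd

# Hand-written records with nested "attributes" and "relationships" objects.
# All ids and values here are illustrative assumptions, not API output.
records = [
    {
        "type": "analysis-jobs",
        "id": "A1",
        "attributes": {"accession": "A1", "experiment-type": "amplicon"},
        "relationships": {"study": {"data": {"id": "S1", "type": "studies"}}},
    },
    {
        "type": "analysis-jobs",
        "id": "A2",
        "attributes": {"accession": "A2", "experiment-type": "amplicon"},
        "relationships": {"study": {"data": {"id": "S1", "type": "studies"}}},
    },
]

# Nested keys are joined with "." to form flat column names.
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```

Each nested key path becomes one column, which is why the fetched dataframe has columns such as `attributes.accession` and `relationships.study.data.id`.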
Inspect the data
The .head() method returns the first few rows of the table (five by default), which the notebook then displays.
analyses.head()
| | type | id | attributes.accession | attributes.analysis-status | attributes.pipeline-version | attributes.analysis-summary | attributes.experiment-type | attributes.is-private | attributes.complete-time | attributes.instrument-platform | attributes.instrument-model | relationships.study.data.id | relationships.study.data.type | relationships.run.data.id | relationships.run.data.type | relationships.sample.data.id | relationships.sample.data.type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | analysis-jobs | MGYA00448077 | MGYA00448077 | completed | 4.1 | [{'key': 'Submitted nucleotide sequences', 'va... | amplicon | False | 2020-01-31T08:26:49 | ILLUMINA | Illumina HiSeq 2500 | MGYS00005292 | studies | SRR6132556 | runs | SRS2065862 | samples |
| 1 | analysis-jobs | MGYA00448078 | MGYA00448078 | completed | 4.1 | [{'key': 'Submitted nucleotide sequences', 'va... | amplicon | False | 2020-01-31T08:27:25 | ILLUMINA | Illumina HiSeq 2500 | MGYS00005292 | studies | SRR6132555 | runs | SRS2065861 | samples |
| 2 | analysis-jobs | MGYA00448079 | MGYA00448079 | completed | 4.1 | [{'key': 'Submitted nucleotide sequences', 'va... | amplicon | False | 2020-01-31T08:28:04 | ILLUMINA | Illumina HiSeq 2500 | MGYS00005292 | studies | SRR6132554 | runs | SRS2065860 | samples |
| 3 | analysis-jobs | MGYA00448080 | MGYA00448080 | completed | 4.1 | [{'key': 'Submitted nucleotide sequences', 'va... | amplicon | False | 2020-01-31T08:28:42 | ILLUMINA | Illumina HiSeq 2500 | MGYS00005292 | studies | SRR6132553 | runs | SRS2065859 | samples |
| 4 | analysis-jobs | MGYA00448081 | MGYA00448081 | completed | 4.1 | [{'key': 'Submitted nucleotide sequences', 'va... | amplicon | False | 2020-01-31T08:29:18 | ILLUMINA | Illumina HiSeq 2500 | MGYS00005292 | studies | SRR6132552 | runs | SRS2065858 | samples |
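The `attributes.analysis-summary` column in the output above holds a list of dictionaries for each Analysis rather than a single value. One possible way to spread such a column into ordinary columns is sketched below. The output above is truncated, so the `value` field name is an assumption, and the sample rows and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical rows mimicking the 'attributes.analysis-summary' column.
# The "value" field name and all numbers are assumptions for illustration.
sample = pd.DataFrame({
    "attributes.accession": ["A1", "A2"],
    "attributes.analysis-summary": [
        [{"key": "Submitted nucleotide sequences", "value": "100"}],
        [{"key": "Submitted nucleotide sequences", "value": "250"}],
    ],
})


def summary_to_dict(entries):
    """Turn a list of {'key': ..., 'value': ...} entries into one dict."""
    return {entry["key"]: entry["value"] for entry in entries}


# Build one column per summary key, indexed by accession.
summary = pd.DataFrame(
    sample["attributes.analysis-summary"].map(summary_to_dict).tolist(),
    index=sample["attributes.accession"],
)
print(summary)
```

With the real dataframe, the same idea would apply to `analyses` in place of `sample`, provided the entries really do carry those two fields.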
Example: distribution of instruments used for the Analysed Samples
import matplotlib.pyplot as plt

analyses.groupby('attributes.instrument-model').size().plot(kind='pie')
plt.title('Number of Analysed Samples by instrument type');
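If you would rather see the numbers behind the pie chart, the grouped counts can be inspected directly. This is a small sketch on a hypothetical stand-in dataframe (`sample`, with made-up rows), so it runs without fetching anything.

```python
import pandas as pd

# Hypothetical stand-in for the `analyses` dataframe; values are illustrative only.
sample = pd.DataFrame({
    "attributes.instrument-model": [
        "Illumina HiSeq 2500",
        "Illumina HiSeq 2500",
        "Illumina MiSeq",
    ]
})

# groupby(...).size() counts the rows in each group: the numbers a pie chart would plot.
counts = sample.groupby("attributes.instrument-model").size()

# value_counts() gives the same counts, sorted from most to least frequent.
same_counts = sample["attributes.instrument-model"].value_counts()

print(counts)
```

Either form can be passed to `.plot(kind='pie')`; `value_counts()` simply orders the slices by frequency.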