MGnify Notebooks (previews)

MGnify Notebooks

The quantity and richness of metagenomics-derived data in MGnify grow every day. The MGnify website is the best place to start exploring and searching the MGnify database, and it lets users download modest query results as CSV tables.

For larger queries, or more complex requirements like fetching metadata from samples across multiple studies, programmatic access is a far better approach.

Programmatic access means fetching data from MGnify with a terminal command or a code script, and it goes through the MGnify API (Application Programming Interface). The API provides access to every data type in MGnify (Studies, Samples, Analyses, Annotations, MAGs, and so on) and is what lies behind the MGnify website. Using the API lets you fetch more data than is possible via the website, and helps you write reproducible analysis scripts.
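
As a rough illustration of what fetching through the API can look like, the sketch below walks a paginated listing and gathers its records. It rests on assumptions not stated on this page: that the API lives at the base URL shown, and that each page is a JSON object with a "data" list and a "links" object whose "next" entry points to the following page. The names get_json and fetch_all are made up for this example. Check the API Browser for the real layout before relying on it.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the MGnify API; confirm it in the API Browser.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def get_json(url):
    """Download one URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_all(url, fetch=get_json, max_pages=5):
    """Collect the records from up to max_pages pages of a listing.

    Assumes each page looks like
    {"data": [...], "links": {"next": <url or None>}}.
    """
    records = []
    pages = 0
    while url and pages < max_pages:
        page = fetch(url)
        records.extend(page.get("data", []))
        # Follow the link to the next page; a missing link ends the loop.
        url = (page.get("links") or {}).get("next")
        pages += 1
    return records
```

Calling fetch_all(API_BASE + "/studies", max_pages=2) would, under these assumptions, return the records from the first two pages of studies. The max_pages cap keeps an exploratory call from downloading an entire listing, and the fetch parameter lets the paging logic be tried offline with a stand-in function.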

The API can be explored interactively online, using the API Browser. Actually using the API, however, poses two hurdles. First, it requires knowledge and/or installation of tools on your computer. This might range from a command-line tool like cURL, to learning R and setting up the RStudio application, to setting up a Python environment and installing a suite of packages used for data analysis. Second, the API returns most data in JSON format: this is standard on the web, but less familiar to bioinformaticians used to TSVs and dataframes.
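
To make the gap between JSON and tables concrete, here is a minimal sketch that flattens nested records into rows and renders them as CSV, using only the Python standard library. The record layout (an "id" plus an "attributes" object) is an assumption about the API's response shape, and the sample values, the "biome" and "depth" fields, and the function names are invented for illustration.

```python
import csv
import io

# Made-up records in the assumed layout; real responses carry more fields.
sample_page = {
    "data": [
        {"id": "EXAMPLE_1", "type": "samples",
         "attributes": {"biome": "marine", "depth": 5}},
        {"id": "EXAMPLE_2", "type": "samples",
         "attributes": {"biome": "soil"}},
    ]
}


def records_to_rows(records):
    """Turn nested records into flat dicts: the id plus every attribute."""
    return [{"id": r["id"], **r.get("attributes", {})} for r in records]


def rows_to_csv(rows):
    """Render rows as CSV text, leaving blanks where a row lacks a column."""
    # Build the column list in first-seen order across all rows.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = records_to_rows(sample_page["data"])
print(rows_to_csv(rows))
```

The same list of flat dicts can also be handed to a dataframe library, for example pandas.DataFrame(rows), if one is installed.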

The MGnify Notebook Servers at EMBL and Galaxy, and the MGnifyR package, are designed to bridge these gaps. On a Notebook Server, users can launch an online R and Python coding environment in their browser, without installing anything. The environment already includes the main libraries needed for communicating with the MGnify API, analysing data, and making plots. It uses the popular JupyterLab software, which means you can code inside Notebooks: interactive code documents.

There are example Notebooks written in both R and Python, so users can pick whichever they’re more familiar with.

Preview the notebooks

Title	Author	Categories
Comparative Metagenomics	Alejandra Escobar	R
Download paginated API data to a CSV	Sandy Rogers	Python
Fetch Analyses metadata for a Study	Sandy Rogers	R
GSC23 MGnify Workshop	Tatiana Gurbich, Sandy Rogers, Virginie Grosboillot	Python
Interactive map for AtlantECO project	Kate S [Ekaterina Sakharova] (MGnify team)	Python
Load Analyses for a MGnify Study	Sandy Rogers	Python
Pathways Explorer	Alejandra Escobar, Amartya Nambiar	R
Pathways Visualisation	Alejandra Escobar	R
Search MGnify Genomes	Virginie Grosboillot	Python
Search for Samples or Studies	Sandy Rogers, Ben Allen	R