MGnify Notebooks (previews)

MGnify Notebooks

The quantity and richness of metagenomics-derived data in MGnify grow every day. The MGnify website is the best place to start exploring and searching the MGnify database, and it lets users download modestly sized query results as CSV tables.

For larger queries, or for more complex requirements such as fetching metadata from samples across multiple studies, programmatic access is a far better approach.

Programmatic access (fetching data from MGnify with a terminal command or a script) uses the MGnify API (Application Programming Interface). The API provides access to every data type in MGnify, including Studies, Samples, Analyses, Annotations and MAGs, and it is what lies behind the MGnify website. Using the API lets you fetch more data than is possible via the website, and helps you write reproducible analysis scripts.
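As a rough illustration of what programmatic access can look like, here is a minimal Python sketch that uses only the standard library. It rests on several assumptions that are not stated on this page: the base URL in `API_BASE`, the `page_size` query parameter, and a JSON:API-style response with a `data` list and a `links.next` URL for the following page. The helper names (`build_url`, `fetch_json`, `iter_records`) are hypothetical. Check the API Browser for the actual endpoints and response shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the MGnify API; confirm it in the API Browser.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def build_url(resource, **filters):
    """Build a list URL such as <API_BASE>/studies?page_size=5."""
    query = urlencode(sorted(filters.items()))
    url = f"{API_BASE}/{resource}"
    return f"{url}?{query}" if query else url


def fetch_json(url):
    """Download one page of results and decode it from JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_records(url, fetch=fetch_json, max_pages=3):
    """Yield records page by page, following the 'links.next' URL.

    Assumes a JSON:API-style payload: {"data": [...], "links": {"next": ...}}.
    max_pages caps the number of requests so a broad query cannot run away.
    """
    pages = 0
    while url and pages < max_pages:
        payload = fetch(url)
        yield from payload.get("data", [])
        url = (payload.get("links") or {}).get("next")
        pages += 1
```

With network access, something like `for record in iter_records(build_url("studies", page_size=5), max_pages=1): print(record["id"])` would list a handful of study identifiers, provided the assumptions above hold. The `fetch` argument exists so that the paging logic can be tried out against canned data without contacting the server.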

The API can be explored interactively online using the API Browser. Using it in practice, however, raises two hurdles. First, it requires knowledge of, and usually installation of, tools on your computer: anything from a command-line tool like cURL, to learning R and setting up the RStudio application, to setting up a Python environment and installing a suite of data-analysis packages. Second, the API returns most data in JSON format, which is standard on the web but less familiar to bioinformaticians used to TSVs and dataframes.
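To make the JSON-versus-table gap concrete, the following Python sketch flattens nested records into rows and writes them as CSV text. It assumes each record has the JSON:API-style shape `{"type": ..., "id": ..., "attributes": {...}}`. The function names and the example records are hypothetical and are not real MGnify data.

```python
import csv
import io


def flatten_record(record):
    """Turn one JSON:API-style record into a flat dict (one table row)."""
    row = {"id": record.get("id"), "type": record.get("type")}
    # Only top-level attributes become columns; nested values are kept as-is.
    for key, value in (record.get("attributes") or {}).items():
        row[key] = value
    return row


def records_to_csv(records):
    """Return CSV text whose header is the union of all row keys."""
    rows = [flatten_record(r) for r in records]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical records for illustration only.
example_records = [
    {"type": "samples", "id": "S1", "attributes": {"biome": "soil"}},
    {"type": "samples", "id": "S2", "attributes": {"biome": "marine", "depth": 10}},
]

print(records_to_csv(example_records))
```

Records that lack an attribute get an empty cell in that column, so rows with different fields can still share one table.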

The MGnify Notebook Servers at EMBL and Galaxy, and the MGnifyR package, are designed to bridge these gaps. Users can launch an online R and Python coding environment in their browser without installing anything. The environment already includes the main libraries needed for communicating with the MGnify API, analysing data, and making plots. It uses the popular JupyterLab software, which means you can code inside Notebooks: interactive code documents.

There are example Notebooks written in both R and Python, so users can pick whichever they’re more familiar with.

Preview the notebooks

| Title | Author | Categories |
|---|---|---|
| Comparative Metagenomics | Alejandra Escobar | R |
| Download paginated API data to a CSV | Sandy Rogers | Python |
| Fetch Analyses metadata for a Study | Sandy Rogers | R |
| GSC23 MGnify Workshop | Tatiana Gurbich, Sandy Rogers, Virginie Grosboillot | Python |
| Interactive map for AtlantECO project | Kate S [Ekaterina Sakharova] (MGnify team) | Python |
| Load Analyses for a MGnify Study | Sandy Rogers | Python |
| Pathways Explorer | Alejandra Escobar, Amartya Nambiar | R |
| Pathways Visualisation | Alejandra Escobar | R |
| Search MGnify Genomes | Virginie Grosboillot | Python |
| Search for Samples or Studies | Sandy Rogers, Ben Allen | R |