Resources
| Package/Pipeline | How to use | Processes | Features | Issues |
|---|---|---|---|---|
| DADA2 | R | Filtering and trimming reads; deduplicating reads; clustering; chimera removal; taxonomic classification | Uses error model to cluster | |
| Dadaist2 | Command-line | QC; filtering and trimming reads | Run DADA2 using command-line with less code | Must use paired reads |
| Phyloseq | R | | | |
| MicrobiomeAnalyst | Browser | Visualise data, view statistical metrics | | |
| Microbiome R package | R | Visualise data, view statistical metrics | | |
| Fastp | Command-line | Read trimming and filtering | | |
| Fastqc & multiqc | Command-line | Quality assessment | | |
| Frogs | Command-line | | | |
| Frogs | Galaxy | | | |
Visually explore data and statistics
Paper: Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data (PDF)
Tutorial: Metabarcoding 16s tutorial
Uses: Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. It visualises abundance comparisons in a heat tree format, which is a good alternative to the stacked bar charts normally used to visualise species abundance and compare conditions.
Where to find in MicrobiomeAnalyst: Data Upload > Data Inspection > Data Filter > Normalization > Analysis Overview > Heat Tree
Note
When to use:
Excellent when you don’t want to use R, and do not need to automate the output.
Can be used directly after running Dadaist2, which supplies MicrobiomeAnalyst-ready outputs.
Excellent for a first look at data.
The drawbacks are that you cannot automate the process and you lose some control over the details. However, you can download the R scripts so that you can trace back what you have done.
Pipelines
“micca (MICrobial Community Analysis) is a software pipeline for the processing of amplicon sequencing data, from raw sequences to OTU tables, taxonomy classification and phylogenetic tree inference. The pipeline can be applied to a range of highly conserved genes/spacers, such as 16S rRNA gene, Internal Transcribed Spacer (ITS) 18S and 28S rRNA. micca is an open-source, GPLv3-licensed software.”
“nfcore/ampliseq is a bioinformatics analysis pipeline used for amplicon sequencing, supporting denoising of any amplicon and, currently, taxonomic assignment of 16S, ITS and 18S amplicons. Supported is paired-end Illumina or single-end Illumina, PacBio and IonTorrent data. Default is the analysis of 16S rRNA gene amplicons sequenced paired-end with Illumina.”
Statistical analysis
phyloseq
Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking
Rhea - statistical methods package for R
Paper: Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons
Installation: Download and extract the folder into the project directory; do this individually for each project. Use the installation script "install_packages.R" to install the required packages (this part only needs doing once per R installation).
Note
When to use:
If R is relatively new to you, this is designed to be simple to run and follow.
Microbiome R packages
Paper:
Installation:
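# Install BiocManager from CRAN if it is not already available
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# The following initializes usage of Bioc devel
BiocManager::install(version = "devel")
# Install the microbiome package from Bioconductor
BiocManager::install("microbiome")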
Tutorial: tutorial pages
Note
When to use:
This package is extensive and can use data structured for use in phyloseq (a frequently used R package for diversity statistics).
It appears to be in active use, so bugs and issues are more likely to be addressed by the community and developers.
The syntax is fairly clean and simple.
e.g. running an alpha diversity analysis looks like: tab <- microbiome::alpha(pseq, index = "all") (compared with tab <- estimate_richness(data) in phyloseq).
The microbiome R package will produce more alpha diversity metrics than phyloseq. This may be of use if you intend to use a different metric.
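To make concrete what an alpha diversity table contains, here is a minimal base R sketch that computes three common indices by hand. It is only an illustration: the count table is made up, the helper names (observed, shannon, gini_simpson) are hypothetical, and neither microbiome::alpha nor estimate_richness is being reproduced here.

```r
# Toy count table: rows are samples, columns are taxa (made-up numbers).
counts <- matrix(
  c(10, 10, 10, 10,
    37,  1,  1,  1,
    20, 20,  0,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("even", "skewed", "two_taxa"),
                  c("taxonA", "taxonB", "taxonC", "taxonD"))
)

# Observed richness: number of taxa with at least one read.
observed <- function(x) sum(x > 0)

# Shannon index: -sum(p * log(p)) over the taxa that are present.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Gini-Simpson index: 1 - sum(p^2).
gini_simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

# One row per sample, one column per index.
alpha_tab <- data.frame(
  observed     = apply(counts, 1, observed),
  shannon      = apply(counts, 1, shannon),
  gini_simpson = apply(counts, 1, gini_simpson)
)
print(round(alpha_tab, 3))
```

The "even" and "skewed" samples have the same observed richness but different Shannon and Gini-Simpson values, which is why having several metrics available can matter.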
Exploring functional implications of community structures:
PICRUSt: Phylogenetic Investigation of Communities by Reconstruction of Unobserved States
Taxonomy
indicspecies: multivariate-analysis-indicator-value
“This package provides a set of functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites. It is also possible to check the statistical significance of such associations.”
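The idea of measuring the strength and significance of a species-group association can be sketched in base R. The example below computes a group-equalised indicator value (specificity times fidelity) and a simple permutation p-value. It does not call indicspecies: the data are made up, the indval helper is hypothetical, and taking the square root of the product is an assumption about the convention the package reports.

```r
# Toy abundance table: 6 sites (rows) x 2 species (columns), made-up numbers.
abund <- matrix(
  c(5, 0,
    4, 1,
    6, 0,
    0, 1,
    0, 0,
    0, 2),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("site", 1:6), c("sp1", "sp2"))
)
groups <- factor(c("A", "A", "A", "B", "B", "B"))

# Indicator value of one species for one target group:
#   specificity = mean abundance in the target group / sum of group means
#   fidelity    = fraction of target-group sites where the species occurs
indval <- function(x, groups, target) {
  group_means <- tapply(x, groups, mean)
  specificity <- unname(group_means[target]) / sum(group_means)
  fidelity <- mean(x[groups == target] > 0)
  sqrt(specificity * fidelity)  # square root is an assumed convention
}

observed_stat <- indval(abund[, "sp1"], groups, "A")

# Permutation test: shuffle the group labels and recompute the statistic.
set.seed(1)
n_perm <- 999
perm_stats <- replicate(n_perm, indval(abund[, "sp1"], sample(groups), "A"))
p_value <- (sum(perm_stats >= observed_stat) + 1) / (n_perm + 1)

cat("IndVal:", observed_stat, " p-value:", p_value, "\n")
```

With only six sites there are few distinct label permutations, so the p-value cannot get very small; real data sets with more sites give the test more resolution.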
Papers, chapters and commentary:
Author = David Ryder & Nicola Coyle
Testing Alpha Diversity
Comment in the Usearch Documentation
Paper discussing rarefaction of data
Paper discussing rarefaction of data
Chapter on species richness / alpha diversity metrics / population estimates 2001
Testing Beta Diversity
Paper on normalisation prior to using beta diversity metrics
Formats / standardisation
Different algorithms
Databases (lots of others)
Fungi - ITS2
Best practices in metabarcoding of fungi: From experimental design to results https://onlinelibrary.wiley.com/doi/10.1111/mec.16460#.YmMZICqe5zw.twitter
Nuclear ribosomal internal transcribed spacer (ITS) Note: “By reanalysing published data sets, we demonstrate that operational taxonomic units (OTUs) outperform amplified sequence variants (ASVs) in recovering fungal diversity, a finding that is particularly evident for long markers. Additionally, analysis of the full-length ITS region allows more accurate taxonomic placement of fungi and other eukaryotes compared to the ITS2 subregion.”
ANCOM
Analysis of composition of microbiomes: a novel method for studying microbial composition https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4450248/
— Author: Nicola Coyle 25/01/2022