Resources

| Package/Pipeline | How to use | Processes | Features | Issues |
|---|---|---|---|---|
| DADA2 | R | Filtering and trimming reads; deduplicating reads; clustering; chimera removal; taxonomic classification | Uses error model to cluster | |
| Dadaist2 | Command-line | QC; filtering and trimming reads | Run DADA2 using command-line with less code | Must use paired reads |
| Phyloseq | R | | | |
| MicrobiomeAnalyst | Browser | Visualise data, view statistical metrics | | |
| Microbiome R package | R | Visualise data, view statistical metrics | | |
| Fastp | Command-line | Read trimming and filtering | | |
| Fastqc & multiqc | Command-line | Quality assessment | | |
| Frogs | Command-line | | | |
| Frogs | Galaxy | Visually explore data and statistics | | |

MicrobiomeAnalyst

Paper: Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data PDF

Tutorial: Metabarcoding 16s tutorial

Uses: Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. It visualises abundance comparisons in a heat tree format, which is a good alternative to the stacked bar charts normally used to visualise species abundance and compare conditions.

Where to find in MicrobiomeAnalyst: Data Upload > Data Inspection > Data Filter > Normalization > Analysis Overview > Heat Tree

Note

When to use:

Excellent when you don’t want to use R, and do not need to automate the output.

Can be used directly after running Dadaist2, which supplies the required output files.

Excellent for a first look at data.

The drawbacks are that you cannot automate the process and you lose some control over the details. However, you can download the R scripts so that you can trace back what you have done.

Pipelines

micca

“micca (MICrobial Community Analysis) is a software pipeline for the processing of amplicon sequencing data, from raw sequences to OTU tables, taxonomy classification and phylogenetic tree inference. The pipeline can be applied to a range of highly conserved genes/spacers, such as 16S rRNA gene, Internal Transcribed Spacer (ITS) 18S and 28S rRNA. micca is an open-source, GPLv3-licensed software.”

nf-core ampliseq

“nfcore/ampliseq is a bioinformatics analysis pipeline used for amplicon sequencing, supporting denoising of any amplicon and, currently, taxonomic assignment of 16S, ITS and 18S amplicons. Supported is paired-end Illumina or single-end Illumina, PacBio and IonTorrent data. Default is the analysis of 16S rRNA gene amplicons sequenced paired-end with Illumina.”

Statistical analysis

PhyloSeq

Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking

Rhea - statistical methods package for R

Paper: Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons

Installation: Download and extract the folder into your project directory; do this individually for each project. Use the installation script “install_packages.R” to install the required packages (this part only needs to be done once per R installation).

Note

When to use:

If R is relatively new to you, this is designed to be simple to run and follow.

Microbiome R packages

Paper:

Installation:

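# Install BiocManager if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

# Optional: the following initializes usage of Bioc devel;
# skip it to stay on the current Bioconductor release
BiocManager::install(version = "devel")

# Install the microbiome package from Bioconductor
BiocManager::install("microbiome")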

Tutorial: tutorial pages

Note

When to use:

This package is extensive and can use data structured for use in phyloseq (a frequently used R package for diversity statistics).

It appears to be in active use, so bugs and issues are more likely to be addressed by the community and developers.

The syntax is fairly clean and simple

e.g. running an alpha diversity analysis looks like: tab <- microbiome::alpha(pseq, index = "all"), versus tab <- estimate_richness(data) in phyloseq.

The microbiome R package will produce more alpha diversity metrics than phyloseq. This may be of use if you intend to use a different metric.
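To make the idea of choosing between alpha diversity metrics more concrete, here is a minimal base-R sketch that computes three common indices by hand for a single sample. The counts vector and the function name alpha_sketch are hypothetical, and this is not how either package implements its own functions; in practice you would call microbiome::alpha or phyloseq's estimate_richness on a phyloseq object.

```r
# Hypothetical read counts for five taxa in one sample (illustrative only).
counts <- c(taxonA = 50, taxonB = 30, taxonC = 15, taxonD = 5, taxonE = 0)

# Hand-rolled illustration of three alpha diversity indices.
alpha_sketch <- function(x) {
  x <- x[x > 0]      # taxa with zero reads do not contribute
  p <- x / sum(x)    # relative abundance of each observed taxon
  c(
    observed    = length(x),          # observed richness
    shannon     = -sum(p * log(p)),   # Shannon index (natural log)
    inv_simpson = 1 / sum(p^2)        # inverse Simpson index
  )
}

result <- alpha_sketch(counts)
print(round(result, 3))
```

The three indices weight rare taxa differently, which is one reason it can matter which metric a package offers.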

Exploring functional implications of community structures:

PICRUSt: Phylogenetic Investigation of Communities by Reconstruction of Unobserved States

Taxonomy

indicspecies: multivariate-analysis-indicator-value

“This package provides a set of functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites. It is also possible to check the statistical significance of such associations.”
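As a rough illustration of what "strength of the relationship between species abundance and groups of sites" can mean, the base-R sketch below computes an indicator-value style statistic for one species: specificity (how concentrated the species is in a group) multiplied by fidelity (how many of the group's sites it occupies). The data and the function name indval_sketch are hypothetical. The square-root form is an assumption about how the statistic is usually reported, and the permutation test that the package uses for significance is not reproduced here.

```r
# Hypothetical abundances of one species at six sites, three sites per group.
abundance <- c(10, 8, 12, 0, 1, 0)
group     <- c("A", "A", "A", "B", "B", "B")

indval_sketch <- function(abund, grp) {
  mean_abund  <- tapply(abund, grp, mean)        # mean abundance within each group
  specificity <- mean_abund / sum(mean_abund)    # share of the summed group means
  fidelity    <- tapply(abund > 0, grp, mean)    # fraction of group sites occupied
  iv <- sqrt(specificity * fidelity)             # square-root form (assumed convention)
  setNames(as.numeric(iv), names(mean_abund))    # return a plain named vector
}

iv <- indval_sketch(abundance, group)
print(round(iv, 3))
```

A value near 1 for a group suggests the species is both largely restricted to that group and present at most of its sites.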

Papers, chapters and commentary:

Author = David Ryder & Nicola Coyle

Testing Alpha Diversity

Comment in the Usearch Documentation

Comment in the PhyloSeq FAQ

Paper discussing rarefaction of data

Paper discussing rarefaction of data

Chapter on species richness / alpha diversity metrics / population estimates 2001
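For readers unfamiliar with the rarefaction debated in the resources above, the following base-R sketch shows the basic operation: randomly subsampling a sample's reads, without replacement, down to a fixed depth. The counts, the depth of 100, and the function name rarefy_sketch are all hypothetical, and this is only an illustration of the idea rather than the procedure used by any particular tool.

```r
set.seed(42)  # fixed seed so the random subsample is reproducible

# Hypothetical read counts per taxon for one sample (illustrative only).
counts <- c(taxonA = 120, taxonB = 60, taxonC = 15, taxonD = 5)

rarefy_sketch <- function(x, depth) {
  stopifnot(depth <= sum(x))                       # cannot subsample above the total
  reads <- rep(seq_along(x), times = x)            # one entry per read, labelled by taxon index
  kept  <- reads[sample.int(length(reads), depth)] # draw reads without replacement
  out   <- tabulate(kept, nbins = length(x))       # count the kept reads per taxon
  setNames(out, names(x))
}

rarefied <- rarefy_sketch(counts, depth = 100)
print(rarefied)
```

Because reads are discarded at random, rare taxa can drop out of the subsample entirely, which is part of what the discussions listed above weigh up.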

Testing Beta Diversity

Paper on normalisation prior to using beta diversity metrics
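A small base-R sketch of why normalisation matters before computing a beta diversity metric: two hypothetical samples with identical composition but a tenfold difference in sequencing depth look very different under Bray-Curtis on raw counts, and identical once each sample is scaled to proportions. Total-sum scaling is used here only as the simplest example; it is an assumption for illustration, not necessarily the method recommended in the paper above.

```r
# Hypothetical counts: two samples (rows) by four taxa (columns), unequal depth.
counts <- rbind(
  sample1 = c(100, 50, 30, 20),
  sample2 = c(10, 5, 3, 2)
)

# Total-sum scaling: convert each sample (row) to proportions.
props <- counts / rowSums(counts)

# Bray-Curtis dissimilarity between two abundance vectors.
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

raw_bc  <- bray_curtis(counts[1, ], counts[2, ])  # driven by the depth difference
norm_bc <- bray_curtis(props[1, ], props[2, ])    # composition only

print(c(raw = raw_bc, normalised = norm_bc))
```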

Formats / standardisation

Biom format

Different algorithms

DADA2 software

Swarm software

USEARCH software

Databases (lots of others)

PR2 database

SILVA database

Fungi - ITS2

Best practices in metabarcoding of fungi: From experimental design to results https://onlinelibrary.wiley.com/doi/10.1111/mec.16460#.YmMZICqe5zw.twitter

Database: Unite

Nuclear ribosomal internal transcribed spacer (ITS) Note: “By reanalysing published data sets, we demonstrate that operational taxonomic units (OTUs) outperform amplified sequence variants (ASVs) in recovering fungal diversity, a finding that is particularly evident for long markers. Additionally, analysis of the full-length ITS region allows more accurate taxonomic placement of fungi and other eukaryotes compared to the ITS2 subregion.”

ANCOM

Analysis of composition of microbiomes: a novel method for studying microbial composition https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4450248/

— Author: Nicola Coyle 25/01/2022