Connected components of a graph

See also
  cluster_edges command

A graph is a set of points (called "nodes") in which some pairs of nodes are connected by lines (called "edges"). The figure below shows a graph with eight nodes, six edges and three connected components.

A connected component is a maximal set of nodes such that a path can be found between any two nodes. Connected components form disconnected "islands" in a graph because by definition there can be no edge between two components.
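The definition above can be illustrated with a minimal Python sketch that finds components by breadth-first search. The node labels and edge list are hypothetical. They are chosen only to match the counts given in the text (eight nodes, six edges, three components) and are not taken from the figure.

```python
from collections import defaultdict, deque

def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    # Build an adjacency map; each edge is stored in both directions.
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components

# Hypothetical example graph: eight nodes, six edges.
nodes = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"),
         ("G", "H")]

components = connected_components(nodes, edges)
print(components)
```

No edge joins nodes from two different lists in the result, which is the "island" property described above.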

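Finding the connected components of a graph is equivalent to agglomerative clustering with single (minimum) linkage: if an edge joins every pair of nodes whose distance is within a threshold, then the connected components are exactly the clusters obtained by stopping single-linkage clustering at that threshold.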
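A small Python sketch can make this equivalence concrete. It assumes points on a line with absolute difference as the distance, and an arbitrary threshold of 5. The function names are hypothetical, and the naive implementation is for illustration only; it is not how USEARCH implements clustering.

```python
from itertools import combinations

def single_linkage(points, threshold):
    """Naive agglomerative clustering with minimum linkage, stopped at threshold."""
    clusters = [{i} for i in range(len(points))]
    while True:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            # Minimum linkage: cluster distance is the closest pair of members.
            d = min(abs(points[i] - points[j])
                    for i in clusters[x] for j in clusters[y])
            if d <= threshold and (best is None or d < best[0]):
                best = (d, x, y)
        if best is None:
            break  # no remaining pair of clusters is within the threshold
        _, x, y = best
        clusters[x] |= clusters[y]
        del clusters[y]  # x < y, so index x is unaffected
    return sorted(sorted(c) for c in clusters)

def threshold_components(points, threshold):
    """Connected components of the graph with an edge for each pair within threshold."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical data: indexes 0 and 2 are 10 apart but are chained through index 1.
points = [0, 5, 10, 30, 32, 70]
threshold = 5

print(single_linkage(points, threshold))
print(threshold_components(points, threshold))
```

Both functions return the same grouping of point indexes. Points 0 and 2 land in the same cluster even though they are farther apart than the threshold, because a path of short edges connects them.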

In USEARCH, finding connected components is supported by cluster_agg (input is sequences; use -linkage min), cluster_aggd (input is a distance matrix; use -linkage min) and by cluster_edges (input is a tabbed text file specifying the edges).
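As a rough illustration of working from an edge list, the Python sketch below groups labels into connected components. It assumes each line of the file holds two node labels separated by a tab. That layout is an assumption made for this example, so check the cluster_edges documentation for the actual input format. The function name and the sequence labels are hypothetical, and the code does not call USEARCH.

```python
import csv
import io

def components_from_edge_file(handle):
    """Group labels into connected components from tab-separated edge lines."""
    parent = {}

    def find(label):
        # Union-find root lookup; unseen labels start as their own root.
        parent.setdefault(label, label)
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        a, b = row[0], row[1]
        parent[find(a)] = find(b)

    groups = {}
    for label in parent:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical edge list held in memory instead of a file on disk.
example = io.StringIO("seq1\tseq2\nseq2\tseq3\nseq4\tseq5\n")
clusters = components_from_edge_file(example)
print(clusters)
```

A node that appears in no edge never shows up in such a file, so singleton components would have to be added from the full list of labels.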