fastq_chars command

Report statistics on ASCII characters used to represent quality scores in a FASTQ file. Useful for guessing FASTQ options for a file of unknown format.

Statistics are written as progress messages to standard error output (stderr). The output can be saved to a file by specifying a log file with the -log option.

The output gives the range of ASCII values found in the quality scores and a guess at the appropriate FASTQ options. Here is typical output for a FASTQ file in the older Illumina encoding, where Q scores are offset by 64 (the Sanger encoding uses an offset of 33):

  Qmin 66, QMax 104, Range 39
  Guess: -fastq_qmin 2 -fastq_qmax 40 -fastq_ascii 64
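
The arithmetic behind such a guess can be sketched in Python. This is an illustration only, not usearch's actual code: the function names and the offset rule (any character below ASCII 59 implies offset 33, otherwise 64) are assumptions made for this example.

```python
def quality_char_range(quality_lines):
    """Return (lowest, highest) ASCII code over all quality characters."""
    codes = [ord(ch) for line in quality_lines for ch in line]
    return min(codes), max(codes)


def guess_fastq_options(lo, hi):
    """Guess the FASTQ options from the observed ASCII range.

    Assumed heuristic (not taken from usearch): offset-64 encodings
    never use characters below ASCII 59, so a lower minimum means 33.
    """
    offset = 33 if lo < 59 else 64
    return {
        "fastq_qmin": lo - offset,
        "fastq_qmax": hi - offset,
        "fastq_ascii": offset,
    }


# Two made-up quality strings spanning 'B' (ASCII 66) to 'h' (ASCII 104).
lo, hi = quality_char_range(["BBBefghh", "hgfeBBBB"])
print(f"Qmin {lo}, QMax {hi}, Range {hi - lo + 1}")  # Qmin 66, QMax 104, Range 39
print(guess_fastq_options(lo, hi))
```

With this range the sketch arrives at the same values as the guess shown above: 66 - 64 = 2 and 104 - 64 = 40.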

Also reported is the frequency of each nucleotide and the range of ASCII Q score(s) corresponding to any undetermined bases (Ns), e.g.:

  Letter           N    Freq  MaxRun
  ------  ----------  ------  ------
       A     4249438   30.0%      41
       C     3720773   26.3%      16
       G     3571073   25.2%      18
       T     2631491   18.6%      12
       N        1025    0.0%      12  Q=B
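
A table like this could be computed along the following lines. The sketch is hypothetical and not usearch's implementation; in particular it assumes that MaxRun means the longest run of consecutive identical letters within a single read, which the text above does not state.

```python
from collections import Counter
from itertools import groupby


def letter_stats(seqs):
    """Return {letter: (count, percent, max_run)} over all sequences.

    max_run is assumed to be the longest stretch of one letter
    repeated consecutively within any single sequence.
    """
    counts = Counter()
    max_run = Counter()
    for seq in seqs:
        counts.update(seq)
        for letter, run in groupby(seq):
            length = sum(1 for _ in run)
            max_run[letter] = max(max_run[letter], length)
    total = sum(counts.values())
    return {
        letter: (counts[letter], 100.0 * counts[letter] / total, max_run[letter])
        for letter in counts
    }


def n_quality_chars(seq, qual):
    """Return the set of quality characters found at N positions."""
    return {q for base, q in zip(seq, qual) if base == "N"}


# Made-up reads, for illustration only.
for letter, (n, pct, run) in sorted(letter_stats(["AAACGT", "ANNA"]).items()):
    print(f"{letter:>6} {n:>10} {pct:5.1f}% {run:>6}")
print(n_quality_chars("ANNA", "hBBh"))
```

If every N in a file has the same quality character, as with Q=B in the table above, the set returned by `n_quality_chars` has one element.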


Example command:

  usearch -fastq_chars reads.fastq -log chars.log