fastx_info command

Gives a short summary report of the sequences in a FASTA or FASTQ file. Handy for a first check on what is in a new file. The report is written to the console and can be saved to a text file using the -output option.

For very large files, the -secs option can be used to stop scanning the file after a given number of seconds and report what has been found so far.

Example report

File size 44.6M, 39.2k seqs, 21.5M letters and quals
Lengths min 267, low 539, med 544, hi 551, max 1326
Letter freqs C 31.1%, T 27.0%, A 21.9%, G 20.0%, N 0.021%
0% masked (lower-case)
EE mean 3.4; min 0.2, low 1.4, med 2.3, hi 4.0, max 123.0

"EE" means expected errors.
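
To make the EE line of the report concrete, here is a minimal Python sketch of how the expected errors of a single read can be computed from its FASTQ quality string. It assumes the common definition (the sum of the per-base error probabilities implied by the Phred scores, P = 10^(-Q/10)) and Phred+33 ASCII encoding. The function name expected_errors and the ascii_offset parameter are illustrative only; this is not the usearch implementation.

```python
def expected_errors(quality_string, ascii_offset=33):
    """Return the expected number of errors in one read.

    Assumption: qualities are Phred scores encoded as ASCII characters
    with the given offset (33 for most current FASTQ files).
    """
    total = 0.0
    for ch in quality_string:
        q = ord(ch) - ascii_offset   # Phred score Q for this base
        total += 10 ** (-q / 10)     # error probability implied by Q
    return total


if __name__ == "__main__":
    # Hypothetical quality string: "I" encodes Q40, "5" encodes Q20.
    print(expected_errors("IIII5555"))
```

Under these assumptions, a read with a few low-quality bases can have a much higher EE than its average quality score suggests, which is why the report gives the EE distribution (min, low, med, hi, max) as well as the mean.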

Example command line

usearch -fastx_info huge_reads.fastq -secs 5 -output reads_info.txt