
OTU identifiers in sequence labels
Making an OTU table
	An OTU table is made by running the usearch_global command with an
	appropriate output file option, e.g. otutabout. See
	Mapping reads to OTUs for details.
OTU sequences must have OTU identifiers
	When you run usearch_global to make the OTU table, the FASTA file 
	with the OTU sequences must have OTU identifiers in the 
	sequence labels.
OTU identifier syntax
The OTU identifier must start with the three letters OTU (case-insensitive) and
continues to the first character that is not alphanumeric or an underscore. The
identifier may appear anywhere in the label; it does not have to be the first
field. As a special case, if the identifier starts with otu=, the first four
characters are deleted. This means that you can use otu=xxx; annotations, where
xxx is the OTU identifier, which can then be any string of characters (except a
semicolon). The following labels all have the OTU identifier Otu123.
>Otu123
>Otu123;size=14;
>FA87888ZZQ;Otu123;size=14;
>FA87888ZZQ;otu=Otu123;size=14;
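The syntax rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the code USEARCH uses internally. The function name otu_identifier is hypothetical, and the sketch assumes that an otu= annotation takes priority over a plain identifier and that otu= is matched case-insensitively.

```python
import re

# otu=xxx; annotation: everything after "otu=" up to the next semicolon.
# Assumption: "otu=" is matched case-insensitively.
_ANNOTATION = re.compile(r"otu=([^;]+)", re.IGNORECASE)

# Plain identifier: "otu" in any case, followed by letters, digits or underscores.
_PLAIN = re.compile(r"otu[A-Za-z0-9_]*", re.IGNORECASE)


def otu_identifier(label):
    """Return the OTU identifier found in a FASTA label, or None if there is none."""
    label = label.lstrip(">").strip()
    # Assumption: an otu= annotation, if present, wins over a plain identifier.
    match = _ANNOTATION.search(label)
    if match:
        return match.group(1)
    match = _PLAIN.search(label)
    return match.group(0) if match else None


for example in (">Otu123",
                ">Otu123;size=14;",
                ">FA87888ZZQ;Otu123;size=14;",
                ">FA87888ZZQ;otu=Otu123;size=14;"):
    print(example, "->", otu_identifier(example))
```

A label with no OTU identifier, such as >FA87888ZZQ;size=14;, gives None in this sketch.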
How to get OTU identifiers in your labels
The simplest method is to use the option -relabel Otu when you run
cluster_otus. Alternatively,
you can write your own script to relabel an existing FASTA file.
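A relabeling script can be very short. The following Python sketch replaces every FASTA label with Otu1, Otu2 and so on, in input order. The function name relabel_fasta and its prefix parameter are hypothetical, and the sketch assumes you are happy to discard the original labels, including any size= annotations.

```python
def relabel_fasta(lines, prefix="Otu"):
    """Yield FASTA lines with each label replaced by prefix plus a running number."""
    count = 0
    for line in lines:
        if line.startswith(">"):
            # Label line: discard the old label and write the new identifier.
            count += 1
            yield f">{prefix}{count}\n"
        else:
            # Sequence line: pass through unchanged.
            yield line


# Small in-memory demonstration.
records = [">FA87888ZZQ;size=14;\n", "ACGTACGT\n", ">FA87999ZZQ;size=9;\n", "GGCCTTAA\n"]
relabeled = list(relabel_fasta(records))
print("".join(relabeled), end="")
```

To relabel a file, pass an open input file as lines and write the result with writelines on an open output file.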
	WARNING -- QIIME doesn't like underscores in OTU identifiers
	Some of my older examples use OTU identifiers like OTU_123.
	Underscores in OTU identifiers can cause problems with QIIME, apparently
	because the Newick tree file standard uses an underscore to mean a blank
	space (the problem only seems to occur when a tree file is used). Some
	USEARCH commands only allow letters, digits and underscores in OTU
	identifiers, so you can't use another punctuation symbol (e.g., a period).
	The safest choice is an identifier with letters and digits only, like
	Otu123.
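If you want to check existing identifiers against this advice, a test along the following lines may help. It is a minimal Python sketch. The function name is_safe_otu_id is hypothetical, and "safe" here means only what the warning above suggests: the identifier starts with OTU in any case and contains nothing but letters and digits.

```python
import re


def is_safe_otu_id(identifier):
    """True if the identifier starts with 'otu' (any case) and has only letters and digits."""
    return re.fullmatch(r"otu[A-Za-z0-9]*", identifier, re.IGNORECASE) is not None


for candidate in ("Otu123", "OTU_123", "Otu.123"):
    print(candidate, is_safe_otu_id(candidate))
```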