See also
  
fastq_mergepairs command
  
fastq_mergepairs options
In next-generation amplicon sequencing, PCR often creates artifacts that are 
not correctly constructed, i.e., they do not contain the intended full-length 
biological sequence. Such artifacts can arise when primers bind to a different 
region of the genome, when secondary structure forms during the reaction, and 
so on. These amplicons are usually shorter or longer than expected and can 
therefore be filtered by setting a length range for the merged read. This is 
supported by the -fastq_minmergelen and -fastq_maxmergelen options.
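
To illustrate what a length filter on merged reads does, here is a minimal Python sketch. It is not how USEARCH implements these options; the function name filter_by_length, the toy reads, and the example range of 230 to 270 are assumptions made for illustration only.

```python
def filter_by_length(records, min_len, max_len):
    """Yield (label, sequence) pairs whose length is in [min_len, max_len].

    Both bounds are inclusive. `records` is any iterable of
    (label, sequence) tuples, e.g. merged reads already parsed from FASTQ.
    """
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    for label, seq in records:
        if min_len <= len(seq) <= max_len:
            yield label, seq


# Toy merged reads (hypothetical): one too short, one plausible, one too long.
merged = [
    ("short_artifact", "A" * 120),
    ("expected", "A" * 253),
    ("long_artifact", "A" * 410),
]

kept = list(filter_by_length(merged, 230, 270))
print([label for label, _ in kept])  # ['expected']
```

Reads outside the range are simply discarded, which is why a range that is too narrow will also discard genuine sequences of unusual length.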
	Length range for 16S V4
Currently, a popular method is 2 x 250 reads of the 16S V4 hypervariable 
region. Unlike other 16S hypervariable regions, V4 has a well-conserved 
length, so you can set a narrow length range to exclude artifacts. With the 
typical primers V4F (GTGCCAGCMGCCGCGGTAA) and V4R (GGACTACHVGGGTWTCTAAT), I 
set a length range of 230 to 270. These values exceed the known variation in 
length to allow for novel outliers.
How to determine the length range for a primer pair
The search_pcr command can generate amplicon sequences given primer 
sequences and a database of known genes (or genomes). The length range of 
the predicted amplicons can then be determined with the fastx_info command. 
Be careful to include or exclude the primer sequences in the total length 
depending on whether they will appear in the reads (typically they do, but 
library preparation protocols vary widely). I recommend using a range that 
exceeds the measured minimum and maximum, because there may be novel 
outliers that are not in the database.
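
The range calculation can also be sketched in Python, assuming the predicted amplicons have already been written to a FASTA file. This is an illustration under stated assumptions, not part of USEARCH: the function names read_fasta_lengths and padded_length_range, the margin value, and the primer_bases adjustment are all hypothetical.

```python
import io


def read_fasta_lengths(handle):
    """Return the length of every sequence in a FASTA stream, in file order."""
    lengths = []
    current = None  # None until the first '>' header has been seen
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)  # sequences may wrap over several lines
    if current is not None:
        lengths.append(current)
    return lengths


def padded_length_range(lengths, margin, primer_bases=0):
    """Return (min_len, max_len) widened by `margin` on each side.

    `primer_bases` is added to both bounds; use it when the predicted
    amplicons exclude the primers but the reads will include them.
    """
    if not lengths:
        raise ValueError("no sequences found")
    low = max(1, min(lengths) + primer_bases - margin)
    high = max(lengths) + primer_bases + margin
    return low, high


# Toy FASTA (hypothetical); sequence 'b' is wrapped over two lines.
example = io.StringIO(">a\nACGTACGTAC\n>b\nACGTACGTACGT\nACG\n>c\nACGTACGTACG\n")
lengths = read_fasta_lengths(example)
print(lengths)                                 # [10, 15, 11]
print(padded_length_range(lengths, margin=5))  # (5, 20)
```

How large the margin should be is a judgment call: it trades tolerance for novel outliers against the risk of letting artifacts through.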