Muscle5

Ensemble FASTA (EFA) format

The EFA file format stores one or more multiple sequence alignments (MSAs) in a single text file. This is convenient for processing ensembles, which typically contain 16 or 100 MSAs. Each alignment begins with a header line consisting of a less-than symbol (<) followed by a label (e.g., <abc.2). The header line is followed by the MSA in aligned FASTA format. An alignment ends at the next header line or at the end of the file. Blank lines are allowed, but the first character in the file must be <.
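The rules above are simple enough to parse in a few lines. The following Python function is a minimal sketch, not part of Muscle5: the name parse_efa and its return structure are hypothetical, and it assumes that sequence data wrapped over several lines should be concatenated, as is usual for FASTA.

```python
def parse_efa(text):
    """Parse EFA text into a list of (label, [(name, aligned_seq), ...]).

    Hypothetical helper for illustration; not the Muscle5 implementation.
    """
    if not text.startswith("<"):
        raise ValueError("the first character of an EFA file must be '<'")

    ensemble = []  # one [label, records] entry per MSA
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines are allowed
        if line.startswith("<"):
            # Header line: starts a new MSA; the label follows '<'.
            ensemble.append([line[1:], []])
        elif line.startswith(">"):
            # FASTA name line: starts a new sequence in the current MSA.
            ensemble[-1][1].append([line[1:], ""])
        else:
            records = ensemble[-1][1]
            if not records:
                raise ValueError("sequence data found before a '>' line")
            # Assumption: wrapped sequence lines are concatenated.
            records[-1][1] += line

    return [(label, [(name, seq) for name, seq in records])
            for label, records in ensemble]
```

Each returned entry pairs an MSA label with its sequences in file order, so len(parse_efa(text)) gives the number of alignments in the ensemble.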
 
You can convert between multiple FASTA files and one EFA file using the fa2efa and efa_explode commands.
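To make the relationship between the two layouts concrete, here is a rough Python sketch of the same conversion on in-memory text. It does not reproduce the fa2efa or efa_explode commands or their options; the function names and the (label, FASTA text) pairing are assumptions made for illustration.

```python
def fasta_texts_to_efa(labelled_msas):
    """Join (label, aligned_fasta_text) pairs into one EFA string.

    Hypothetical helper; not the Muscle5 fa2efa command.
    """
    lines = []
    for label, fasta_text in labelled_msas:
        lines.append("<" + label)  # header line for this MSA
        # Copy the aligned FASTA lines, dropping blank ones.
        lines.extend(l for l in fasta_text.splitlines() if l.strip())
    return "\n".join(lines) + "\n"


def efa_to_fasta_texts(efa_text):
    """Split EFA text back into (label, aligned_fasta_text) pairs.

    Hypothetical helper; not the Muscle5 efa_explode command.
    """
    msas = []
    for line in efa_text.splitlines():
        if not line.strip():
            continue  # blank lines are allowed
        if line.startswith("<"):
            msas.append((line[1:].strip(), []))
        elif not msas:
            raise ValueError("EFA text must begin with a '<' header line")
        else:
            msas[-1][1].append(line)
    return [(label, "\n".join(body) + "\n") for label, body in msas]
```

Splitting the output of fasta_texts_to_efa with efa_to_fasta_texts returns the original pairs, provided the input FASTA texts contain no blank lines and end with a newline.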

Most commands that require an ensemble filename as input accept either an EFA file or a text file listing FASTA filenames or pathnames, one per line.
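A script that accepts both kinds of input needs to tell them apart. How Muscle5 does this internally is not described here; the sketch below simply relies on the rule that an EFA file must begin with <, and the function name and return values are hypothetical.

```python
def describe_ensemble_input(text):
    """Classify ensemble input text as EFA or as a list of FASTA paths.

    Returns ("efa", []) or ("filelist", [path, ...]).
    Assumption: anything that does not start with '<' is a file list.
    """
    if text.startswith("<"):
        return ("efa", [])
    # One filename or pathname per line; ignore blank lines.
    paths = [line.strip() for line in text.splitlines() if line.strip()]
    return ("filelist", paths)
```

For a file list, each returned path would then be read as an ordinary aligned FASTA file.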

Below is a simple example with two MSAs, each containing three sequences.

Example

<none.0
>SequenceA
GATTACA
>SequenceB
GAT-ACA
>SequenceC
GATTAC-
<abc.1
>SequenceA
GATTACA
>SequenceB
GA-TACA
>SequenceC
GATTAC-