augur.align module

Align multiple sequences from FASTA.

exception augur.align.AlignmentError

Bases: Exception

augur.align.check_arguments(args)
augur.align.check_duplicates(*values)
augur.align.ensure_reference_strain_present(ref_name, existing_alignment, seqs)
augur.align.generate_alignment_cmd(method, nthreads, existing_aln_fname, seqs_to_align_fname, aln_fname, log_fname)
augur.align.make_gaps_ambiguous(aln)

Replace all gaps with ‘N’ in every sequence of the alignment. TreeTime treats them as fully ambiguous and replaces them with the most likely state. This modifies the alignment in place.

Parameters

aln (MultipleSeqAlignment) – Biopython alignment
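
The gap replacement described above can be illustrated on plain strings. This is only a minimal sketch of the idea, not augur's implementation: the helper name gaps_to_n and the dict of name to sequence string are assumptions standing in for a Biopython alignment object.

```python
def gaps_to_n(sequences):
    """Replace every gap character '-' with 'N', modifying `sequences` in place.

    `sequences` is a plain dict mapping sequence name -> sequence string.
    It is a hypothetical stand-in for a Biopython alignment.
    """
    for name, seq in sequences.items():
        # Reassigning the value of an existing key is safe while iterating.
        sequences[name] = seq.replace("-", "N")


aln = {"ref": "ACG-T", "s1": "A--GT"}
gaps_to_n(aln)
print(aln)  # {'ref': 'ACGNT', 's1': 'ANNGT'}
```

As in the documented function, nothing is returned; the caller's object is changed directly.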

augur.align.prune_seqs_matching_alignment(seqs, aln)

Return a set of seqs excluding those set via exclude and print a warning message for each sequence that is excluded.

augur.align.read_alignment(fname)
augur.align.read_reference(ref_fname)
augur.align.read_sequences(*fnames)
augur.align.register_arguments(parser)
augur.align.run(args)

Parameters

args (namespace) – arguments passed in via the command-line from augur

Returns

0 for success, 1 for general error

Return type

int
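
The return-code convention above (0 for success, 1 for a general error) might be sketched as follows. The function run_sketch is a hypothetical stand-in, not augur's code, and the sequences attribute on the namespace is an assumption made for illustration.

```python
import sys
from argparse import Namespace


def run_sketch(args):
    """Hypothetical stand-in for a run(args) entry point.

    Returns 0 on success and 1 on a general error instead of raising,
    so a caller can pass the value straight to sys.exit().
    """
    try:
        # Assumed attribute: a list of input file names.
        if not args.sequences:
            raise ValueError("no sequences given")
    except ValueError as err:
        print(f"ERROR: {err}", file=sys.stderr)
        return 1
    return 0


ok = run_sketch(Namespace(sequences=["seqs.fasta"]))
failed = run_sketch(Namespace(sequences=[]))
```

A command-line wrapper would typically end with sys.exit(run_sketch(args)) so that the integer becomes the process exit status.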

augur.align.strip_non_reference(alignment_fname, reference, keep_reference=False)

Return sequences that have all insertions relative to the reference removed. The alignment is read from file and returned as a list of sequences.

Parameters
  • alignment_fname (str) – alignment file name; the file must be in FASTA format

  • reference (str) – name of reference sequence, assumed to be part of the alignment

  • keep_reference (bool, optional) – by default, the reference sequence is removed after insertions relative to it have been stripped. To keep the reference, use keep_reference=True

Returns

list of trimmed sequences, effectively a multiple alignment

Return type

list
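
Removing insertions relative to a reference amounts to dropping every alignment column in which the reference has a gap. The sketch below shows that idea on plain strings. It is an illustration under stated assumptions, not augur's implementation: the helper strip_insertions is hypothetical, it takes an in-memory dict instead of reading a FASTA file, and it returns a dict rather than a list.

```python
def strip_insertions(sequences, reference, keep_reference=False):
    """Drop all columns where the reference sequence has a gap ('-').

    `sequences` maps sequence name -> aligned sequence string (all the
    same length). `reference` is the name of the reference sequence,
    assumed to be present in `sequences`.
    """
    ref_seq = sequences[reference]
    # Columns to keep: positions where the reference is not a gap.
    keep = [i for i, base in enumerate(ref_seq) if base != "-"]

    stripped = {}
    for name, seq in sequences.items():
        if name == reference and not keep_reference:
            continue  # the reference is dropped by default
        stripped[name] = "".join(seq[i] for i in keep)
    return stripped


aln = {"ref": "AC-GT", "s1": "ACTGT", "s2": "A--GT"}
without_ref = strip_insertions(aln, "ref")
with_ref = strip_insertions(aln, "ref", keep_reference=True)
print(without_ref)  # {'s1': 'ACGT', 's2': 'A-GT'}
```

Every returned sequence has the length of the ungapped reference, which is why the result is still effectively a multiple alignment.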

augur.align.write_seqs(seqs, fname)

A wrapper around SeqIO.write with error handling

augur.align.write_uppercase_alignment_in_place(fname)