augur.ancestral
Infer ancestral sequences based on a tree.
The ancestral sequences are inferred using TreeTime. Each internal node gets assigned a nucleotide sequence that maximizes a likelihood on the tree given its descendants and its parent node. Each node then gets assigned a list of nucleotide mutations for any position that has a mismatch between its own sequence and its parent’s sequence. The node sequences and mutations are output to a node-data JSON file.
Note
The mutation positions in the node-data JSON are one-based.
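To make the one-based convention concrete, here is a minimal sketch of how a node's mutation list could be derived by comparing its sequence with its parent's. The helper name `mutations_between` and the `A3G`-style string format are assumptions for illustration only. This is not augur's or TreeTime's implementation.

```python
def mutations_between(parent_seq, child_seq):
    """Return mutation strings such as 'G3A' using one-based positions.

    Hypothetical helper: compares two aligned sequences of equal length
    and reports every position where the child differs from the parent.
    """
    if len(parent_seq) != len(child_seq):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{parent_state}{position}{child_state}"
        # start=1 yields one-based positions, matching the note above
        for position, (parent_state, child_state)
        in enumerate(zip(parent_seq, child_seq), start=1)
        if parent_state != child_state
    ]


example = mutations_between("ACGTA", "ACATC")
print(example)  # ['G3A', 'A5C']
```

The third and fifth characters differ, so the reported positions are 3 and 5, not the zero-based indices 2 and 4.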
- augur.ancestral.ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True, marginal=False, fill_overhangs=True, infer_tips=False, alphabet='nuc')
Infer ancestral sequences using TreeTime.
- Parameters
tree (Bio.Phylo.BaseTree.Tree or str) – tree or filename of tree
aln (Bio.Align.MultipleSeqAlignment or str) – alignment or filename of alignment
ref (str, optional) – reference sequence to pass to TreeTime’s TreeAnc class
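infer_gtr (bool, optional) – if True, infer a GTR substitution model from the data
marginal (bool, optional) – if True, use marginal rather than joint reconstruction of ancestral states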
fill_overhangs (bool) – In some cases, the missing data on both ends of the alignment is filled with the gap character (‘-’). If set to True, these end gaps are converted to “ambiguous” characters (‘N’ for nucleotides, ‘X’ for amino acids). Otherwise, the alignment is treated as is.
infer_tips (bool) – Since v0.7, TreeTime does not reconstruct tip states by default. This is only relevant when tip states are not exactly specified, e.g. via characters that signify ambiguous states. To replace those with the most likely state, set infer_tips=True.
alphabet (str) – alphabet to use for ancestral sequence inference. The default is ‘nuc’, the nucleotide alphabet that includes a gap character. The alternative is ‘aa’ for amino acids.
- Returns
treetime.TreeAnc instance
- Return type
treetime.TreeAnc
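The effect described for fill_overhangs can be illustrated with a small, self-contained sketch. The function `fill_end_gaps` below is a hypothetical helper written for this example. It is not TreeTime's code and only mimics the described behavior on a single sequence string: leading and trailing gap runs become the ambiguous character, while internal gaps are left untouched.

```python
def fill_end_gaps(seq, ambiguous="N", gap="-"):
    """Replace leading and trailing gap runs with an ambiguous character.

    Hypothetical illustration of the fill_overhangs idea; internal gaps
    are kept because they may represent real deletions.
    """
    without_left = seq.lstrip(gap)
    n_left = len(seq) - len(without_left)        # length of the leading gap run
    core = without_left.rstrip(gap)
    n_right = len(without_left) - len(core)      # length of the trailing gap run
    return ambiguous * n_left + core + ambiguous * n_right


print(fill_end_gaps("--AC-GT---"))                 # NNAC-GTNNN
print(fill_end_gaps("-MK-L-", ambiguous="X"))      # XMK-LX
```

For an amino acid alignment the ambiguous character would be ‘X’, as in the second call.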
- augur.ancestral.collect_mutations_and_sequences(tt, infer_tips=False, full_sequences=False, character_map=None, is_vcf=False)
Iterates over the tree and produces dictionaries with mutations and sequences for each node.
- Parameters
tt (treetime.TreeTime) – TreeTime instance with a valid ancestral reconstruction
infer_tips (bool, optional) – if true, request the reconstructed tip sequences from treetime, otherwise retain input ambiguities
full_sequences (bool, optional) – if true, add the full sequences
character_map (None, optional) – optional dictionary to map characters to a custom set.
- Returns
dictionary of mutations and sequences
- Return type
dict
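As a rough sketch of what collecting per-node data might look like, the example below walks a toy mapping of node names to sequences and mutations instead of a real TreeTime object. Everything here is an assumption for illustration: the function name `collect`, the input layout, and the `muts` and `sequence` keys. Applying the character map to both the ancestral and the derived state is also a choice made for this sketch, not a statement about augur's behavior.

```python
def collect(nodes, character_map=None, full_sequences=False):
    """Build a per-node dictionary of mutations and, optionally, sequences.

    nodes: dict mapping node name -> (sequence, mutations), where each
    mutation is a tuple (ancestral_state, zero_based_position, derived_state).
    This input layout is hypothetical.
    """
    def remap(char):
        # Fall back to the original character when no mapping is given
        if character_map is None:
            return char
        return character_map.get(char, char)

    data = {}
    for name, (sequence, mutations) in nodes.items():
        entry = {
            # +1 converts zero-based positions to one-based positions
            "muts": [f"{remap(a)}{pos + 1}{remap(d)}" for a, pos, d in mutations]
        }
        if full_sequences:
            entry["sequence"] = sequence
        data[name] = entry
    return data


toy_nodes = {
    "NODE_1": ("ACGT", []),
    "tip_A": ("AC-T", [("G", 2, "-")]),
}
print(collect(toy_nodes, character_map={"-": "N"}))
# {'NODE_1': {'muts': []}, 'tip_A': {'muts': ['G3N']}}
```

Passing `full_sequences=True` adds each node's sequence string to its entry alongside the mutation list.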
- augur.ancestral.register_parser(parent_subparsers)
- augur.ancestral.run(args)
- augur.ancestral.run_ancestral(T, aln, root_sequence=None, is_vcf=False, full_sequences=False, fill_overhangs=False, infer_ambiguous=False, marginal=False, alphabet='nuc')