cogent3.core.moltype.MolType
- class MolType(name: str, monomers: TStrOrBytes, make_seq: type[c3_sequence.Sequence] | str, gap: str | None = '-', missing: str | None = '?', complements: dict[str, str] | None = None, ambiguities: dict[str, frozenset[str]] | None = None, colors: dict[str, str] | None = None, pairing_rules: dict[frozenset[str], bool] | None = None, mw_calculator: WeightCalculator | None = None, coerce_to: Callable[[bytes], bytes] | None = None)
MolType handles operations that depend on the sequence type.
- Attributes:
alphabet: monomers
degen_alphabet: monomers + ambiguous characters
degen_gapped_alphabet: monomers + gap + ambiguous characters
gapped_alphabet: monomers + gap
gapped_missing_alphabet: monomers + gap
- gaps
is_nucleic: is a nucleic acid moltype
label: synonym for name
- matching_rules
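As a rough illustration of how the alphabet attributes above relate to the constructor arguments, the following plain-Python sketch composes them from `monomers`, `gap` and `ambiguities`. It does not use cogent3; `build_alphabets` is a hypothetical helper, the ambiguity table is a small subset chosen for the example, and the character ordering inside the real alphabets may differ.

```python
# Hypothetical helper, not part of cogent3: composes alphabets as described
# in the attribute list (monomers, plus gap, plus ambiguous characters).
def build_alphabets(monomers: str, gap: str, ambiguities: dict[str, frozenset[str]]) -> dict[str, str]:
    # ambiguous characters are the keys of the ambiguity mapping that are
    # not already canonical monomers
    ambig = "".join(c for c in ambiguities if c not in monomers)
    return {
        "alphabet": monomers,
        "gapped_alphabet": monomers + gap,
        "degen_alphabet": monomers + ambig,
        "degen_gapped_alphabet": monomers + gap + ambig,
    }


# Example inputs (assumed for illustration only)
alphabets = build_alphabets(
    monomers="TCAG",
    gap="-",
    ambiguities={
        "R": frozenset("AG"),
        "Y": frozenset("CT"),
        "N": frozenset("ACGT"),
    },
)

for name, chars in alphabets.items():
    print(f"{name}: {chars}")
```

Each successive alphabet is a superset of the plain monomers, which is why a sequence valid under `alphabet` is also valid under the more degenerate ones.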
Methods
can_match(first, second): Returns True if every position in the first sequence could match the same position in the second.
can_mispair(first, second): Returns True if any position in first could mispair with second.
complement(-> str -> bytes): converts a string or bytes into its nucleic acid complement
count_degenerate(seq[, validate]): returns the number of degenerate characters in a sequence
count_gaps(seq): returns the number of gap characters in a sequence
count_variants(seq): Counts the number of possible sequences matching the sequence, given any ambiguous characters in the sequence.
degap(-> str -> bytes): removes all gap and missing characters from a sequence
degenerate_from_seq(seq): Returns the least degenerate symbol that encompasses a set of characters
disambiguate(-> str -> bytes): Returns a non-degenerate sequence from a degenerate one.
get_css_style([colors, font_size, font_family]): returns string of CSS classes and {character: <CSS class name>, ...}
get_degenerate_positions(seq[, include_gap, ...]): Returns a list of position indices of degenerate characters in the sequence.
has_ambiguity(seq[, validate]): whether sequence has an ambiguity character
is_ambiguity(query_motif[, validate]): Returns True if query_motif is an ambiguity character in the alphabet.
is_compatible_alphabet(alphabet[, strict]): checks that characters in alphabet are equal to a bound alphabet
is_degenerate(seq[, validate]): checks if a sequence contains degenerate characters
is_gapped(seq[, validate]): checks if a sequence contains gaps
is_valid(seq): checks against most degenerate alphabet
iter_alphabets(): yields alphabets in order of most to least degenerate
make_seq(*, seq[, name, check_seq]): creates a Sequence object corresponding to the molecular type of this instance.
most_degen_alphabet(): returns the most degenerate alphabet for this instance
mw(seq[, method, delta]): Returns the molecular weight of the sequence.
random_disambiguate(-> str -> bytes): disambiguates a sequence by randomly selecting a non-degenerate character
rc(-> str -> bytes): reverse complement of a sequence
resolve_ambiguity(ambig_motif[, alphabet, ...]): Returns tuple of all possible canonical characters corresponding to ambig_motif
strand_symmetric_motifs([motif_length]): returns ordered pairs of strand complementary motifs
strip_bad(-> str): Removes any symbols not in the alphabet.
strip_bad_and_gaps(-> str): Removes any symbols not in the alphabet, and any gaps.
strip_degenerate(-> str -> bytes): removes degenerate characters
to_json(): returns a JSON-formatted string
to_regex(seq): returns a regex pattern with ambiguities expanded to a character set
to_rich_dict(**kwargs): returns dict suitable for serialisation
can_pair
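Several of the methods listed above revolve around ambiguity codes. The sketch below shows, in plain Python and without cogent3, the kind of computation that summaries such as count_variants, to_regex and rc describe for DNA. It uses the standard IUPAC nucleotide codes; the function names mirror the listing for readability only, and the actual cogent3 implementations, return types and validation behaviour are not shown here and may differ.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes. This table is an assumption of
# the sketch, not cogent3's internal data structure.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each symbol, including ambiguity codes; gap and missing
# characters map to themselves.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN-?", "TGCAYRSWMKVHDBN-?")


def count_variants(seq: str) -> int:
    """Number of non-degenerate sequences that seq could represent."""
    return prod(len(IUPAC_DNA[c]) for c in seq)


def to_regex(seq: str) -> str:
    """Expand each ambiguity code into a regex character set."""
    parts = []
    for c in seq:
        options = IUPAC_DNA[c]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


def rc(seq: str) -> str:
    """Reverse complement: complement every symbol, then reverse."""
    return seq.translate(COMPLEMENT)[::-1]


n_variants = count_variants("ARN")   # 1 * 2 * 4 possibilities
pattern = to_regex("ARYT")           # ambiguity codes become character sets
revcomp = rc("AAGT")
matches = re.fullmatch(to_regex("ARN"), "AGT") is not None
print(n_variants, pattern, revcomp, matches)
```

Keeping the ambiguity table as a single mapping means the counting, regex and complement operations all derive from the same definition, which is the role the `ambiguities` and `complements` constructor arguments appear to play for a MolType.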
Notes
The only way to create sequences is via a MolType instance. The instance defines different alphabets that are used for data conversions. Create a moltype using the get_moltype() function.