This page gives a compact mental model for misha. Use it as a
quick first read before the full Manual vignette.
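Most analyses follow the same pattern: pick a scope (where to look),
pick an iterator (in what chunks), and evaluate a track expression over it.
In misha this is usually one call to gextract,
gscreen, or gsummary.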
You are not limited to raw track names. You can pass full
expressions, for example log(dense_track + 1),
dense_track / (chip.sum + 1e-6), or
pmin(dense_track, 2).
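As a minimal sketch of passing expressions rather than bare track names, the snippet below evaluates each of the three expressions above over one region. It assumes the examples database; the region, the 100 bp iterator, and the result names are illustrative choices, and chip.sum is created here as a virtual track summing dense_track so that the second expression resolves.

```r
library(misha)
gdb.init_examples()

# Illustrative scope: first 50 kb of chromosome 1
regions <- gintervals(1, 0, 50000)

# chip.sum is a virtual track: the sum of dense_track per iterator bin
gvtrack.create("chip.sum", "dense_track", "sum")

# Each call evaluates one expression per 100 bp bin
log_out  <- gextract("log(dense_track + 1)", regions, iterator = 100)
norm_out <- gextract("dense_track / (chip.sum + 1e-6)", regions, iterator = 100)
cap_out  <- gextract("pmin(dense_track, 2)", regions, iterator = 100)
```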
All examples below assume the bundled examples database:
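The setup used by the workflow later on this page is shown here for convenience; it loads the package and points the session at the examples database.

```r
library(misha)
gdb.init_examples()  # switch the session to the bundled examples database
```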
A track is genomic signal organized over coordinates
(for example, dense_track in the examples DB).
Useful starter commands:
gtrack.ls() # list tracks in the examples DB
gtrack.info("dense_track") # inspect type/metadata
gtrack.info("sparse_track")
For intuition, you can think of dense_track as a
ChIP-seq-like coverage signal.
An interval set defines genomic regions
(chrom, start, end) where you
want to work.
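A short sketch of building scopes with gintervals, assuming the examples database is loaded. The coordinates and variable names are illustrative only.

```r
library(misha)
gdb.init_examples()

# A single region: chromosome 1, positions 0 to 50000
one_region <- gintervals(1, 0, 50000)

# Several regions at once: starts and ends are given as parallel vectors
several <- gintervals(1, c(1000, 2000), c(1020, 2020))

# Every chromosome of the current database, end to end
whole_genome <- gintervals.all()
```

Each result is an ordinary data frame, so it can be inspected or filtered with the usual R tools before being passed on as a scope.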
The iterator is the stepping policy inside the scope.
iterator = 100 -> fixed 100 bp bins
iterator = "some_sparse_track" -> iterate over that track’s intervals
iterator = some_intervals_df -> iterate over explicit regions
iterator = "my_intervals_set" -> iterate directly over an intervals set
Think of it as: scope says where, iterator says in what chunks.
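The iterator forms listed above can be compared side by side. This is a sketch under the assumption that the examples database is loaded; sparse_track is the sparse track from that database, and the windows data frame is a made-up pair of regions.

```r
library(misha)
gdb.init_examples()

regions <- gintervals(1, 0, 50000)

# Fixed 100 bp bins
by_bins <- gextract("dense_track", regions, iterator = 100)

# One row per interval of the sparse track that falls inside the scope
by_sparse <- gextract("dense_track", regions, iterator = "sparse_track")

# One row per explicit region (hypothetical windows, chosen for illustration)
windows <- gintervals(1, c(0, 10000), c(2000, 12000))
by_df <- gextract("dense_track", regions, iterator = windows)
```

The expression and scope are identical in all three calls; only the chunking changes.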
out <- gextract("dense_track", regions, iterator = 100)
log_out <- gextract("log(dense_track + 1)", regions, iterator = 100)
Create and use an intervals set as an iterator:
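One way this step might look, assuming the examples database is loaded and writable. The set name my_intervals_set and the three windows are hypothetical, and the set is removed at the end so the database is left unchanged.

```r
library(misha)
gdb.init_examples()

regions <- gintervals(1, 0, 50000)

# Hypothetical windows to store as a named intervals set
my_regions <- gintervals(1, c(0, 10000, 20000), c(5000, 15000, 25000))
gintervals.save("my_intervals_set", my_regions)

# Refer to the saved set by name when choosing the iterator
out <- gextract("dense_track", regions, iterator = "my_intervals_set")

# Clean up the illustrative set
gintervals.rm("my_intervals_set", force = TRUE)
```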
A virtual track is a named on-the-fly transformation, not stored as a physical track file.
Examples:
gvtrack.create("chip.sum", "dense_track", "sum")
out <- gextract("chip.sum", regions, iterator = 200)
You can also shift the iterator window used by the virtual track:
gvtrack.create("chip.shifted", "dense_track", "sum")
gvtrack.iterator("chip.shifted", sshift = -100, eshift = 100)
out <- gextract("chip.shifted", regions, iterator = 200)
Here, each iterator interval is expanded by 100 bp on both sides
before evaluating dense_track.
Virtual tracks are session objects (easy to list with
gvtrack.ls and delete with gvtrack.rm).
library(misha)
gdb.init_examples()
# 1) pick scope
regions <- gintervals(1, 0, 50000)
# 2) inspect available tracks
print(gtrack.ls())
# 3) extract signal with a chosen iterator
chip <- gextract("dense_track", regions, iterator = 100)
# 4) screen high-signal bins (as a simple peak-like filter)
hi_chip <- gscreen("dense_track > 0.6", regions, iterator = 100)
# 5) summarize distribution/coverage
stats <- gsummary("dense_track", regions, iterator = 100)
A PWM/PSSM is a motif model over A/C/G/T. In misha, a common pattern is:
regions <- gintervals(1, c(1000, 2000), c(1020, 2020))
seqs <- gseq.extract(regions)
pssm <- matrix(c(
0.80, 0.05, 0.10, 0.05,
0.10, 0.10, 0.70, 0.10,
0.05, 0.80, 0.05, 0.10,
0.10, 0.10, 0.10, 0.70
), ncol = 4, byrow = TRUE)
colnames(pssm) <- c("A", "C", "G", "T")
scores <- gseq.pwm(seqs, pssm, mode = "lse")
If your database has motif files under pssms/, you can
create a genome-wide PWM-energy track with
gtrack.create_pwm_energy(...).