Type: | Package |
Encoding: | UTF-8 |
Title: | Data Structure and Manipulations Tool for Host and Viral Population |
Version: | 0.0.5 |
Date: | 2019-06-14 |
Author: | Jean-Francois Rey [aut, cre] |
Maintainer: | Jean-Francois Rey <jean-francois.rey@inra.fr> |
Description: | Statistical Methods for Inferring Transmissions of Infectious Diseases from deep sequencing data (SMITID). It allows storage, indexing and querying of sequence-space-time host and viral population data. |
License: | GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE] |
LazyData: | true |
BuildVignettes: | true |
NeedsCompilation: | no |
Biarch: | true |
URL: | https://informatique-mia.inra.fr/biosp/anr-smitid-project, https://gitlab.paca.inra.fr/SMITID/structR |
BugReports: | https://gitlab.paca.inra.fr/SMITID/structR/issues |
Depends: | methods, utils, grDevices (≥ 3.0.0), graphics (≥ 3.0.0), R (≥ 3.3.0) |
DependsNote: | BioC (>= 3.0) |
Imports: | ggplot2, sf (≥ 0.6.3), stats (≥ 3.0.2), Biostrings (≥ 2.0.0) |
ImportsNote: | BioC (>= 3.0), Recommended: Biostrings |
Suggests: | testthat (≥ 2.0) |
Collate: | 'Class-Host.R' 'Class-ViralPop.R' 'Methods-Host.R' 'Methods-ViralPop.R' 'Methods-time.R' 'SMITIDstruct.R' 'demo.R' 'diversity.R' 'index.R' |
RoxygenNote: | 6.1.1 |
Packaged: | 2019-06-14 10:05:06 UTC; jfrey |
Repository: | CRAN |
Date/Publication: | 2019-06-14 11:30:11 UTC |
Data Structure and Manipulation Tool for Host and Viral Population
Description
Statistical Methods for Inferring Transmissions of Infectious Diseases from deep sequencing data (SMITID). It allows storage, indexing and querying of sequence-space-time host and viral population data.
Details
Package: | SMITIDstruct |
Type: | Package |
Version: | 0.0.5 |
Date: | 2019-06-14 |
License: | GPL (>=2) |
The SMITIDstruct package contains functions and methods for manipulating host and viral population data that combine genetic, spatial and temporal information.
Author(s)
Jean-Francois Rey <jean-francois.rey@inra.fr>
Maintainer: Jean-Francois Rey <jean-francois.rey@inra.fr>
Examples
## Run a simulation
library("SMITIDstruct")
demo.SMITIDstruct.run()
Class Host
Description
Spatio-temporal information about Host.
Details
Object can be created by calling ...
Slots
ID
Host identifier
coordinates
Host coordinates in time (as sf)
states
Host States/Status (dob, Inf...)
sources
data.frame of infection times and IDs of the hosts that infected this host
offsprings
data.frame of infection times and IDs of the hosts infected by this host
ID_V_POP
data.frame of times and indices of viral population observations
covariates
data.frame of time, covariate and value for this host
Class ViralPop
Description
Viral population data containing genotypes
Slots
ID
Host identifier
time
Observation time, as a numeric value counted from 1970/01/01
size
Number of variants
names
list of IDs of variants sharing the same sequence
genotypes
genotypes of all variants (as a DNAStringSet)
proportions
proportion of each variant
addHost
Description
add a Host to a HostSet
Usage
addHost(lhost, id)
Arguments
lhost |
a HostSet object |
id |
a character string giving the host ID |
Value
a HostSet of Host objects with their IDs
Examples
lhost <- list()
lhost <- addHost(lhost,"42")
addIndex
Description
add a new event code to an index
Usage
addIndex(index, id_host, time, code)
Arguments
index |
an index |
id_host |
a host index in the HostSet |
time |
a time |
code |
an event code |
Value
the updated index (a row is added or an existing one is updated)
addViralObs
Description
load viral population observations into the Host objects
Usage
addViralObs(lhost, lvpop)
Arguments
lhost |
a HostSet |
lvpop |
a ViralPopSet |
Value
lhost updated with the observed viral populations
addcode
Description
add an event code to another one
Usage
addcode(code, code.add)
Arguments
code |
an existing code |
code.add |
the code to add |
Value
the merge of the two codes
alleleCount
Description
count alleles at each position
Usage
alleleCount(mat, seq.char = c("A", "T", "G", "C"))
Arguments
mat |
a matrix of genomic sequences, one sequence per row |
seq.char |
allele alphabet |
Value
a matrix with one row per unique sequence and the allele counts by position in columns
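To make the idea of per-position allele counting concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name count_alleles is hypothetical, and the output layout (alleles in rows, positions in columns) is an assumption that may differ from what alleleCount returns.

```r
# Hypothetical helper, not part of SMITIDstruct: count how often each allele
# of seq.char occurs at each position of a character matrix (one sequence
# per row, one position per column).
count_alleles <- function(mat, seq.char = c("A", "T", "G", "C")) {
  counts <- vapply(seq_len(ncol(mat)), function(j) {
    # factor() with fixed levels keeps alleles that are absent at a position
    as.integer(table(factor(mat[, j], levels = seq.char)))
  }, integer(length(seq.char)))
  dimnames(counts) <- list(seq.char, paste0("pos", seq_len(ncol(mat))))
  counts
}

# Toy data (made up for illustration): three aligned sequences of length 4
seqs <- c("ACGT", "ACGA", "TCGA")
mat <- do.call(rbind, strsplit(seqs, ""))
counts <- count_alleles(mat)
print(counts)
```

Each column of the result sums to the number of sequences, which is a quick sanity check on the input matrix.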
concatViralPop
Description
concatenate several viral populations into one ViralPop object
Usage
concatViralPop(lvpop, lid)
Arguments
lvpop |
a ViralPopSet |
lid |
vector of ViralPop IDs to concatenate |
Value
a ViralPop object whose ID is the concatenation of all IDs and whose time is set to 0.
createAViralPop
Description
Create a new ViralPop object
Usage
createAViralPop(host_id, obs_time, seq, id_seq = "seq_ID",
seq_value = "seq", prop = "prop", compact = FALSE)
Arguments
host_id |
ID of the host in which the viral population is observed |
obs_time |
time of the observation (numeric or date) |
seq |
a data.frame of sequences ID, sequences and counts |
id_seq |
column name containing the sequences ID |
seq_value |
column name containing the sequences |
prop |
column name containing the count of each sequence |
compact |
boolean, default FALSE; if TRUE, will try to group identical sequences (not implemented yet) |
createHost
Description
create a list of Host objects
Usage
createHost(list_host)
Arguments
list_host |
a character vector of host IDs |
Value
a HostSet of Host objects with their IDs
Examples
lh <- as.character(seq(1,30,1))
lhost <- createHost(lh)
createIndex
Description
create an index of time, host ID and event code
Usage
createIndex(hostlist)
Arguments
hostlist |
a HostSet |
Value
a data.frame with TIME, ID_HOST and EVENTCODE as columns
demo.SMITIDstruct.run
Description
run a demo that loads a HostSet, a ViralPopSet and an index
Usage
demo.SMITIDstruct.run()
diversity.pDistance
Description
diversity calculation using the mean pairwise distance
Usage
diversity.pDistance(vpop)
Arguments
vpop |
a ViralPop object |
Value
the mean pairwise distance of the viral population
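As an illustration of the statistic named above, the following base R sketch computes a mean pairwise distance on plain character strings. It is a simplified stand-in, not the package's code: the helper name mean_p_distance is hypothetical, sequences are assumed to be aligned and of equal length, at least two sequences are required, and variant proportions are ignored.

```r
# Hypothetical helper, not part of SMITIDstruct: mean proportion of differing
# sites over all pairs of aligned, equal-length sequences (unweighted).
mean_p_distance <- function(seqs) {
  chars <- do.call(rbind, strsplit(seqs, ""))   # one row per sequence
  pairs <- combn(nrow(chars), 2)                # all unordered pairs of rows
  d <- apply(pairs, 2, function(p) mean(chars[p[1], ] != chars[p[2], ]))
  mean(d)
}

# Toy data (made up for illustration)
pd <- mean_p_distance(c("ACGT", "ACGA", "TCGA"))
print(pd)
```

A version weighted by the proportion of each variant would be closer to a population-level diversity measure; whether diversity.pDistance applies such weights is not stated in this manual.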
diversity.sfs
Description
Allele frequency spectrum, or site frequency spectrum: the distribution of alternative allele frequencies across all sites of the genetic sequences
Usage
diversity.sfs(vpop)
Arguments
vpop |
a ViralPop object |
Value
the site frequency spectrum
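The definition above can be illustrated with a small base R sketch of a folded site frequency spectrum. This is one common variant of the statistic and not necessarily what diversity.sfs computes: the helper name folded_sfs is hypothetical, the most common allele at each site is treated as the reference, and sequences are assumed aligned and of equal length.

```r
# Hypothetical helper, not part of SMITIDstruct: for each site, count the
# sequences that do not carry the most common allele, then tabulate how many
# polymorphic sites show each count.
folded_sfs <- function(seqs) {
  chars <- do.call(rbind, strsplit(seqs, ""))   # one row per sequence
  n <- nrow(chars)
  minor <- apply(chars, 2, function(col) n - max(table(col)))
  # monomorphic sites (count 0) are dropped; possible counts are 1..n-1
  table(factor(minor[minor > 0], levels = seq_len(n - 1)))
}

# Toy data (made up for illustration): sites 1 and 4 are polymorphic
sfs <- folded_sfs(c("ACGT", "ACGA", "TCGA"))
print(sfs)
```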
getCov
Description
get Host(s) covariates
Usage
getCov(lhost, id = NA)
Arguments
lhost |
a HostSet |
id |
a vector of host IDs (default NA: all hosts in lhost) |
Value
a data.frame
getDate
Description
Convert a timestamp to a date (string)
Usage
getDate(time, format = "%Y-%m-%dT%H:%M:%S")
Arguments
time |
a timestamp or a vector of timestamps |
format |
Date format output (default %Y-%m-%dT%H:%M:%S) |
Value
time as a date string
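For orientation, the conversion described here can be sketched in base R. This is not the package's implementation: the helper name ts_to_string is hypothetical, and the sketch assumes that timestamps are seconds since 1970-01-01 in UTC.

```r
# Hypothetical helper, not part of SMITIDstruct: format numeric timestamps
# (assumed to be seconds since 1970-01-01, UTC) as date strings.
ts_to_string <- function(time, format = "%Y-%m-%dT%H:%M:%S") {
  format(as.POSIXct(time, origin = "1970-01-01", tz = "UTC"),
         format = format, tz = "UTC")
}

dates <- ts_to_string(c(0, 86400))
print(dates)
```

Fixing the time zone explicitly avoids results that depend on the locale of the machine running the code.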
getDiversity.pDistance
Description
get the pairwise distance for a host over its observed viral populations
Usage
getDiversity.pDistance(host, lvpop)
Arguments
host |
a Host object |
lvpop |
a ViralPopSet object |
Value
a data.frame with the time of observation and p_distance as columns
getDiversity.sfs
Description
get the allele frequency spectrum (site frequency spectrum) for the observed viral populations of a host
Usage
getDiversity.sfs(host, lvpop)
Arguments
host |
a Host object |
lvpop |
a ViralPopSet object |
Value
a list indexed by time that contains allele.time and count
getInfosByHostAndTime
Description
get host information: status, infectedby, coordinates and time
Usage
getInfosByHostAndTime(index, lhost)
Arguments
index |
an index |
lhost |
a list of hosts |
Value
a data.frame with colnames (id, time, infectedby, status, probabilities, X, Y)
getStates
Description
get Host(s) states
Usage
getStates(lhost, id = NA)
Arguments
lhost |
a HostSet |
id |
a vector of host IDs (default NA: all hosts in lhost) |
Value
a data.frame
getTimeLine
Description
get the timeline of a host
Usage
getTimeLine(lhost, id)
Arguments
lhost |
a HostSet |
id |
a host ID |
Value
a data.frame
getTimestamp
Description
Get the timestamp of a date
Usage
getTimestamp(date, format = "%Y-%m-%dT%H:%M:%S")
Arguments
date |
a date (as string) or a vector of dates |
format |
the date format (default %Y-%m-%dT%H:%M:%S) |
Value
timestamp of the date(s)
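The reverse conversion, from a date string to a numeric timestamp, can be sketched in base R as follows. Again this is only an illustration under stated assumptions, not the package's code: the helper name string_to_ts is hypothetical, and dates are interpreted in UTC and returned as seconds since 1970-01-01.

```r
# Hypothetical helper, not part of SMITIDstruct: parse date strings with the
# given format (assumed UTC) and return seconds since 1970-01-01.
string_to_ts <- function(date, format = "%Y-%m-%dT%H:%M:%S") {
  as.numeric(as.POSIXct(strptime(date, format = format, tz = "UTC")))
}

ts <- string_to_ts(c("1970-01-01T00:00:00", "2014-01-01T00:00:00"))
print(ts)
```

Strings that do not match the format yield NA, so checking the result with anyNA() is a cheap way to detect malformed input.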
getTransmissionTree
Description
get a transmission tree as a data.frame
Usage
getTransmissionTree(lhost, id = NA)
Arguments
lhost |
a HostSet |
id |
a vector of host IDs (default NA: all hosts) |
Value
a data.frame as source|target|time in columns
Examples
path = system.file("extdata", "data-simul/", package="SMITIDstruct")
lhost <- list()
lhost <- loadTree(lhost,paste(path,"/tree.txt",sep=''))
print(getTransmissionTree(lhost))
is.StringDate
Description
Check if a string represents a date
Usage
is.StringDate(date)
Arguments
date |
a string or a vector of strings (without NA) |
Value
TRUE if date contains a date format
is.juliendate
Description
Check if a numeric is not a timestamp
Usage
is.juliendate(time)
Arguments
time |
a numeric |
Value
TRUE if time is a Julian day, otherwise FALSE
is.timestamp
Description
Check if a numeric represents a timestamp
Usage
is.timestamp(time)
Arguments
time |
a numeric |
Value
TRUE if time >= 1971
isInCode
Description
check whether a code contains a specific code
Usage
isInCode(code, thecode)
Arguments
code |
list of codes to test |
thecode |
the code to look for |
Value
TRUE if code contains thecode, otherwise FALSE
loadCoords
Description
Load Hosts coordinates
Usage
loadCoords(lhost, dfCoords, id = "ID")
Arguments
lhost |
a HostSet |
dfCoords |
a data.frame with host ID, time, and longitude and latitude values |
id |
colname for host ID |
Value
lhost updated
Examples
path = system.file("extdata", "data-simul/", package="SMITIDstruct")
lhost <- list()
lhost <- loadTree(lhost,paste(path,"/tree.txt",sep=''))
coords <- read.table(file=paste(path,"/hosts_coords.txt",sep=''), header=TRUE, check.names=FALSE)
lhost <- loadCoords(lhost,coords)
loadCovs
Description
Load Hosts covariates
Usage
loadCovs(lhost, dfCovs, id = "ID", colCovs)
Arguments
lhost |
a HostSet |
dfCovs |
a data.frame with host ID in rows and covariates in columns |
id |
colname for host ID |
colCovs |
colnames of covariates columns |
Value
lhost updated with covariates
loadHost
Description
load host object from a file
Usage
loadHost(file = "host.txt")
Arguments
file |
a file containing hosts data |
Value
a list of Host objects (HostSet)
loadStates
Description
Load Hosts states
Usage
loadStates(lhost, dfStates, id = "ID", colStates)
Arguments
lhost |
a HostSet |
dfStates |
a data.frame with host ID and states in columns and time as value |
id |
colname for host ID |
colStates |
colnames of States columns |
Value
lhost updated
Examples
path = system.file("extdata", "data-simul/", package="SMITIDstruct")
lhost <- list()
class(lhost) <- "hostSet"
lhost <- loadTree(lhost,paste(path,"/tree.txt",sep=''))
obs <- read.table(paste(path,"/obs.txt",sep=''),header=TRUE, check.names=FALSE)
obs.states <- c(colnames(obs[-grep("ID|Tobs.*",colnames(obs))]))
lhost <- loadStates(lhost, obs, colStates=obs.states)
loadTree
Description
load sources and offsprings from file
Usage
loadTree(lhost = list(), file = "tree.txt", source = "ID-source",
receptor = "ID-receptor", tinf = "Tinf", weight = "Weight")
Arguments
lhost |
a HostSet |
file |
a file containing tree data |
source |
column name for source ID |
receptor |
column name for receptor ID |
tinf |
column name for infection Time |
weight |
column name of infection weight |
Value
the lhost parameter updated with sources and offsprings
Examples
path = system.file("extdata", "data-simul/", package="SMITIDstruct")
lhost <- list()
class(lhost) <- "hostSet"
lhost <- loadTree(lhost,paste(path,"/tree.txt",sep=''))
loadTreeDF
Description
load sources and offsprings from a data.frame
Usage
loadTreeDF(lhost = list(), df = data.frame(), source = "ID-source",
receptor = "ID-receptor", tinf = "Tinf", weight = "Weight")
Arguments
lhost |
a HostSet |
df |
a data.frame containing tree data |
source |
column name for source ID |
receptor |
column name for receptor ID |
tinf |
column name for infection Time |
weight |
column name for the infection link probability |
Value
the lhost parameter updated with sources and offsprings
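A data.frame with the default column names from the usage above could be built as in the sketch below. The values are made up for illustration, and the commented call is untested here since it requires the package to be installed. Note that check.names = FALSE is needed in base R to keep the hyphenated column names intact.

```r
# Toy transmission tree (values invented for illustration), using the default
# column names of loadTreeDF: "ID-source", "ID-receptor", "Tinf", "Weight".
tree <- data.frame(
  "ID-source"   = c("1", "1", "2"),
  "ID-receptor" = c("2", "3", "4"),
  "Tinf"        = c(10, 25, 40),
  "Weight"      = c(1, 1, 1),
  check.names = FALSE,        # keep "ID-source" rather than "ID.source"
  stringsAsFactors = FALSE
)
print(tree)

# With SMITIDstruct installed, the table could then presumably be loaded with:
# lhost <- loadTreeDF(list(), tree)
```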
loadViralObs
Description
load a ViralPop object
Usage
loadViralObs(id, time, file)
Arguments
id |
host pathogen ID |
time |
time of the observation (numeric or Date) |
file |
a fasta file |
Value
a new ViralPop object
loadViralPop
Description
Load all ViralPop objects observed in the file.obs
Usage
loadViralPop(directory, listFiles, listCol = list(id = "id", timeObs =
"time", filename = "filename"), file.extension = "fasta")
Arguments
directory |
path to the data directory |
listFiles |
a data.frame with host ID, observation time and file name (filename.fasta) |
listCol |
a list of listFiles column names ("id", "timeObs", "filename") |
file.extension |
genotype file extension |
Value
a vector of ViralPop objects
Examples
path = system.file("extdata", "data-simul/", package="SMITIDstruct")
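files <- list.files(path, pattern = "\\.fasta$", full.names=FALSE)
# strip the ".fasta" extension (6 characters)
lfileinfo <- sapply(files,function(x){return(substr(x,1,nchar(x)-6))})
# the remaining name is split on "_" into host id and observation time
splitFiles <- strsplit(lfileinfo, "_")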
listF <- cbind(data.frame(matrix(unlist(splitFiles),nrow=length(splitFiles), byrow=TRUE),
stringsAsFactors = FALSE), names(splitFiles))
colnames(listF) <- c("id", "time", "filename")
lvpop <- loadViralPop(path,listF)
loadViralPopSet
Description
load a list of viral populations
Usage
loadViralPopSet(lvpop = list(), list)
Arguments
lvpop |
a ViralPopSet (default: a new one) |
list |
a list (see details) |
Details
The list must follow this format: list$HOST_ID$TIME, followed by the elements $seq_ID, $seq and $prop. It is a list indexed by host ID, followed by a list indexed by time (of observation). The last list contains an array of seq_ID (sequence IDs), an array of seq (sequences as characters), and an array of prop (the count of each sequence). Example:
$'HOST_42'$'2014-01-01T00:00:00'$seq_ID ["SEQ_1","SEQ_2"]
$'HOST_42'$'2014-01-01T00:00:00'$seq ["ACGT","TGCA"]
$'HOST_42'$'2014-01-01T00:00:00'$prop ["46","6"]
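In R, a nested list with this shape could be written as in the sketch below. The element names seq_ID, seq and prop follow the format description, but storing the counts as numbers rather than strings is an assumption, and the commented call is untested here since it requires the package.

```r
# Nested list: host ID -> observation time -> sequence IDs, sequences, counts.
# Counts are stored as numeric values here, which is an assumption.
obs <- list(
  HOST_42 = list(
    "2014-01-01T00:00:00" = list(
      seq_ID = c("SEQ_1", "SEQ_2"),
      seq    = c("ACGT", "TGCA"),
      prop   = c(46, 6)
    )
  )
)
str(obs)

# With SMITIDstruct installed, the list could then presumably be loaded with:
# lvpop <- loadViralPopSet(list = obs)
```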
mergeCode
Description
merge a list of event codes
Usage
mergeCode(listcode)
Arguments
listcode |
a list of event codes |
Value
a code
plotDiversity.pDistance
Description
plot the mean pairwise distance of a host's viral populations over time
Usage
plotDiversity.pDistance(host, lvpop)
Arguments
host |
a Host object |
lvpop |
a ViralPopSet object |
plotDiversity.sfs
Description
plot the allele frequency spectrum of a host's viral populations over time
Usage
plotDiversity.sfs(host, lvpop)
Arguments
host |
a Host object |
lvpop |
a ViralPopSet object |
setStates
Description
set hosts states from a data.frame
Usage
setStates(lhost, dfStates, colStates = c(id = "ID", time = "time", states
= "value"))
Arguments
lhost |
a HostSet |
dfStates |
a data.frame with host ID and states and time in columns |
colStates |
vector of the column names for id, time and states |
Value
the HostSet updated
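The long layout expected here (one row per host and state) differs from the wide layout described for loadStates (one column per state, with times as values). The base R sketch below reshapes one into the other. The data and the state names dob and infected are invented for illustration, and the mapping of the "value" column to the state name is an assumption based on the default colStates.

```r
# Wide table (invented data): one row per host, one column per state,
# cell values are the times at which the state was reached.
wide <- data.frame(ID = c("1", "2"),
                   dob = c(0, 5),
                   infected = c(10, 25),
                   stringsAsFactors = FALSE)

# Reshape to long format: one row per host and state, with the columns
# ID, time and value matching the default colStates of setStates.
state_cols <- setdiff(colnames(wide), "ID")
long <- do.call(rbind, lapply(state_cols, function(s) {
  data.frame(ID = wide$ID, time = wide[[s]], value = s,
             stringsAsFactors = FALSE)
}))
print(long)
```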
simulateStates
Description
simulate states from infection sources
Usage
simulateStates(lhost)
Arguments
lhost |
a HostSet |
Value
lhost updated with states derived from the source infection times