% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/readUCSCtable.R
\name{readUCSCtable}
\alias{readUCSCtable}
\title{Read annotation files from UCSC}
\usage{
readUCSCtable(
  fiName,
  exportFileNa = NULL,
  gtf = NA,
  simplifyCols = c("gene_id", "chr", "start", "end", "strand", "frame"),
  silent = FALSE,
  callFrom = NULL
)
}
\arguments{
\item{fiName}{(character) name (and path) of file to read}

\item{exportFileNa}{(character) optional file-name to be exported, if \code{NULL} no file will be written}

\item{gtf}{(logical) specify whether file \code{fiName} is in GTF-format (see \href{https://genome.ucsc.edu/cgi-bin/hgTables}{UCSC})}

\item{simplifyCols}{(character) optional list of column-names to be used for simplification (if 6 column-headers are given): the 1st value identifies the column
used as reference to summarize all lines with this ID; for the 2nd (typically chromosome names) a representative value will be taken,
for the 3rd (typically gene start site) the minimum will be taken,
for the 4th (typically gene end site) the maximum will be taken, and for the 5th and 6th a representative value will be reported}

\item{silent}{(logical) suppress messages}

\item{callFrom}{(character) allows easier tracking of message(s) produced}
}
\value{
This function returns a matrix; optionally, the file \code{exportFileNa} may be written
}
\description{
This function allows reading and importing genomic \href{https://genome.ucsc.edu/cgi-bin/hgTables}{UCSC-annotation} data.
Files can be read as default UCSC export or as GTF-format.
In the context of proteomics we noticed that sometimes UniProt tables from UCSC are hard to match to identifiers from UniProt Fasta-files, i.e. many protein-identifiers won't match.
For this reason additional support is given to reading 'Genes and Gene Predictions': Since this table does not include protein-identifiers, a non-redundant list of ENSxxx transcript identifiers
can be exported as a file for an additional step of conversion, e.g. using a batch conversion tool at the site of \href{https://www.uniprot.org/uploadlists/}{UniProt}.
The initial genomic annotation can then be complemented using \code{\link{readUniProtExport}}.
Using this more elaborate route, we found higher coverage when adding genomic annotation to the protein-identifiers of proteomics results whose annotation is based on an initial Fasta-file.
}
\examples{
path1 <- system.file("extdata", package="wrProteo")
gtfFi <- file.path(path1, "UCSC_hg38_chr11extr.gtf.gz")
# here we'll write the file for UniProt conversion to tempdir() to keep things tidy
expFi <- file.path(tempdir(), "deUcscForUniProt2.txt")
UcscAnnot1 <- readUCSCtable(gtfFi, exportFileNa=expFi)

## results can be further combined with readUniProtExport() 
deUniProtFi <- file.path(path1, "deUniProt_hg38chr11extr.tab")
deUniPr1 <- readUniProtExport(deUniProtFi, deUcsc=UcscAnnot1,
  targRegion="chr11:1-135,086,622")  
deUniPr1[1:5,-5] 
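
## Illustrative sketch only (NOT the internal code of readUCSCtable()):
## one way to picture the summarization described for argument 'simplifyCols',
## using a small hypothetical table (all names starting with 'toy' are made up here).
## Lines sharing the same ID (1st column) are combined: a representative (here the
## first) value for chromosome, strand and frame, the minimum of the start sites
## and the maximum of the end sites.
toyTab <- data.frame(gene_id=c("g1","g1","g2"), chr=c("chr11","chr11","chr11"),
  start=c(100, 150, 500), end=c(200, 300, 650),
  strand=c("+","+","-"), frame=c(".",".","."))
toySimpl <- do.call(rbind, lapply(split(toyTab, toyTab$gene_id), function(x)
  data.frame(gene_id=x$gene_id[1], chr=x$chr[1], start=min(x$start),
    end=max(x$end), strand=x$strand[1], frame=x$frame[1])))
toySimpl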
}
\seealso{
\code{\link{readUniProtExport}}
}
