
dc.creator: Fungtammasan, Arkarachai
dc.creator: Tomaszkiewicz, Marta
dc.creator: Campos Sánchez, Rebeca
dc.creator: Eckert, Kristin A.
dc.creator: DeGiorgio, Michael
dc.creator: Makova, Kateryna D.
dc.date.accessioned: 2017-08-11T20:52:22Z
dc.date.available: 2017-08-11T20:52:22Z
dc.date.issued: 2016-07-12
dc.identifier.citation: https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msw139
dc.identifier.issn: 1537-1719
dc.identifier.issn: 0737-4038
dc.identifier.uri: https://hdl.handle.net/10669/72968
dc.description.abstract: Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA–DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next-generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately one order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD.
dc.description.sponsorship: National Institutes of Health/[R01-GM087472]/NIH/United States
dc.description.sponsorship: National Science Foundation/[DBI-0965596]/NSF/United States
dc.description.sponsorship: National Science Foundation/[OCI-0821527]/NSF/United States
dc.language.iso: en_US
dc.rights: Attribution 3.0 Costa Rica
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/cr/
dc.source: Molecular Biology and Evolution; Volume 33, Number 10. 2016
dc.subject: Microsatellites
dc.subject: Tandem repeats
dc.subject: RNA sequencing
dc.subject: RNA-DNA differences
dc.subject: Transcription errors
dc.subject: Reverse transcription errors
dc.subject: Sequencing errors
dc.subject: Error correction model
dc.title: Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats
dc.type: original article
dc.identifier.doi: 10.1093/molbev/msw139
dc.description.procedence: UCR::Vicerrectoría de Investigación::Unidades de Investigación::Ciencias Básicas::Centro de Investigación en Biología Celular y Molecular (CIBCM)


Except where otherwise noted, this item's license is described as Attribution 3.0 Costa Rica