Critical evaluation of the FANTOM3 non-coding RNA transcripts
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Critical evaluation of the FANTOM3 non-coding RNA transcripts. / Nordström, Karl J V; Mirza, Majd A I; Almén, Markus Sällman; Gloriam, David Erik Immanuel; Fredriksson, Robert; Schiöth, Helgi B.
In: Genomics, Vol. 94, No. 3, 2009, p. 169-176.
RIS
TY - JOUR
T1 - Critical evaluation of the FANTOM3 non-coding RNA transcripts
AU - Nordström, Karl J V
AU - Mirza, Majd A I
AU - Almén, Markus Sällman
AU - Gloriam, David Erik Immanuel
AU - Fredriksson, Robert
AU - Schiöth, Helgi B
N1 - Keywords: Computational Biology; Databases, Genetic; EST; Expressed Sequence Tags; FANTOM3; Genome, Human; Humans; Introns; ncRNA; Non-coding RNA; Proteins; RIKEN; RNA, Messenger; RNA, Untranslated; Sequence Analysis, RNA; snoRNA; Transcription, Genetic
PY - 2009
Y1 - 2009
AB - We studied the genomic positions of 38,129 putative ncRNAs from the RIKEN dataset in relation to protein-coding genes. We found that the dataset has 41% sense, 6% antisense, 24% intronic and 29% intergenic transcripts. Interestingly, 17,678 (47%) of the FANTOM3 transcripts were found to potentially be internally primed from longer transcripts. The highest fraction of these transcripts was found among the intronic transcripts and as many as 77% or 6929 intronic transcripts were both internally primed and unspliced. We defined a filtered subset of 8535 transcripts that did not overlap with protein-coding genes, did not contain ORFs longer than 100 residues and were not internally primed. This dataset contains 53% of the FANTOM3 transcripts associated to known ncRNA in RNAdb and expands previous similar efforts with 6523 novel transcripts. This bioinformatic filtering of the FANTOM3 non-coding dataset has generated a lead dataset of transcripts without signs of being artefacts, providing a suitable dataset for investigation with hybridization-based techniques.
KW - Former Faculty of Pharmaceutical Sciences
U2 - 10.1016/j.ygeno.2009.05.012
DO - 10.1016/j.ygeno.2009.05.012
M3 - Journal article
C2 - 19505569
VL - 94
SP - 169
EP - 176
JO - Genomics
JF - Genomics
SN - 0888-7543
IS - 3
ER -
ID: 21087519