
A Survey of Genomic Traces Reveals a Common Sequencing Error, RNA Editing, and DNA Editing


While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events is challenging. Here we use large genomic data sets, such as the two billion sequences in the NCBI Trace Archive, to look for clusters of mismatches of the same type, which are a hallmark of editing events caused by APOBEC3 and ADAR. We align 603,249,815 traces from the NCBI Trace Archive to their reference genomes. In clusters of mismatches of increasing size, at least one systematic sequencing error (G-to-A) dominates the results. It is still present in mismatches with 99% accuracy and only vanishes in mismatches at 99.99% accuracy or higher. The error appears to have entered into about 1% of the HapMap, possibly affecting other users who rely on this resource. Further investigation, using stringent quality thresholds, uncovers thousands of mismatch clusters with no apparent defects in their chromatograms. These traces provide the first reported candidates of endogenous DNA editing in human, further elucidate RNA editing in human and mouse, and reveal, for the first time, extensive RNA editing in Xenopus tropicalis. We show that the NCBI Trace Archive provides a valuable resource for investigating DNA and RNA editing, and sets the stage for a comprehensive mapping of editing events in large-scale genomic datasets.
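To make the central idea concrete, the following is a minimal Python sketch of counting same-type mismatches between a trace and its reference under a base-quality threshold. It is not the authors' pipeline. It assumes that the accuracy figures above correspond to per-base Phred quality scores (Q = -10 log10 of the error probability, so 99% is roughly Q20 and 99.99% roughly Q40), that the alignment is ungapped, and that a "cluster" is simply a minimum count of one mismatch type within a single read. The function names and the `min_qual` and `min_cluster` parameters are hypothetical.

```python
import math
from collections import Counter


def phred_from_accuracy(accuracy: float) -> float:
    """Convert per-base accuracy to a Phred score: Q = -10 * log10(1 - accuracy)."""
    return -10.0 * math.log10(1.0 - accuracy)


def mismatch_clusters(ref: str, read: str, quals: list,
                      min_qual: float = 20, min_cluster: int = 3) -> dict:
    """Count mismatches by type (e.g. 'G>A') in an ungapped alignment.

    Only positions whose base quality is at least min_qual are counted,
    and only mismatch types seen at least min_cluster times are returned.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if not (len(ref) == len(read) == len(quals)):
        raise ValueError("ref, read and quals must have the same length")
    counts = Counter()
    for r, b, q in zip(ref.upper(), read.upper(), quals):
        # Skip matches, low-quality calls, and ambiguous bases such as N.
        if r != b and q >= min_qual and r in "ACGT" and b in "ACGT":
            counts[f"{r}>{b}"] += 1
    return {kind: n for kind, n in counts.items() if n >= min_cluster}


if __name__ == "__main__":
    ref = "GGACGTGGAC"
    read = "AGACATGAAC"   # three G>A mismatches relative to ref
    quals = [40] * len(ref)
    print(mismatch_clusters(ref, read, quals))
```

Raising `min_qual` from 20 to 40 in this sketch mirrors the move from the 99% to the 99.99% accuracy threshold described above, under the stated Phred assumption.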


Published in: A Survey of Genomic Traces Reveals a Common Sequencing Error, RNA Editing, and DNA Editing. PLoS Genet 6(5): e1000954. doi:10.1371/journal.pgen.1000954
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1000954



This article was published in the journal PLOS Genetics, 2010, Issue 5.