Genome-wide association study

From Wikipedia, the free encyclopedia


In genetic epidemiology, a genome-wide association study (GWA study, or GWAS), also known as a whole genome association study (WGA study, or WGAS), is an examination of the entire genome of different individuals to see if any variant is associated with a trait. GWAS typically investigate single-nucleotide polymorphisms (SNPs), and the traits studied typically include major diseases. The first GWAS, published in 2005, compared 96 patients with age-related macular degeneration with 50 healthy controls.^[1] Today, hundreds or thousands of individuals are tested. As of December 2010, over 1,200 human GWAS had examined over 200 diseases and traits and found almost 4,000 SNP associations.^[2] GWAS identify SNPs and other DNA variants associated with a trait, but cannot on their own specify which genes are causal.^[3]^[4]^[5]

These studies normally compare the DNA of two groups of participants: people with the disease (cases) and similar people without it (controls). Each person gives a sample of cells, such as a swab of cells from the inside of the cheek. DNA is extracted from these cells and spread on gene chips, which can read millions of DNA sequences. The chip readings are loaded into computers, where they are analyzed with bioinformatics techniques. Rather than reading the entire DNA sequence, these systems usually read SNPs, which are variations at single nucleotides. If a genetic variation is more frequent in people with the disease, the variation is said to be "associated" with the disease. The associated variations are then treated as pointers to the region of the human genome where the disease-causing problem is likely to reside. In contrast to methods that specifically test one or a few genetic regions, a GWAS investigates the entire genome. These two approaches are said to be candidate-driven and non-candidate-driven, respectively.
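
The case-control comparison described above can be illustrated with a small allelic test. This is a minimal sketch, not the analysis pipeline of any study cited here: it assumes a single SNP with two alleles, uses made-up allele counts, and applies a Pearson chi-square test with one degree of freedom. The function name allelic_chi_square is hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the alternate and reference allele.
    Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele and disease status were independent.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the alternate allele is more common among cases.
chi2, p = allelic_chi_square(case_alt=120, case_ref=80, ctrl_alt=90, ctrl_ref=110)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

In a real study this test would be repeated for every SNP on the chip, which is why the significance threshold has to be far stricter than for a single test.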

Surprisingly, most of the SNP variations associated with disease are not in regions of DNA that code for a protein. Instead, they are usually in the large non-coding regions on the chromosome between genes, or in the intron sequences that are spliced out of the RNA transcript before a protein is made. These are presumably sequences of DNA that control other genes, but usually their function is not known.^[3]

Background

The human genome contains many millions of single-nucleotide polymorphisms, and thousands more variations in the number of copies of large and small segments of the genome (copy number variation). These variants may either directly cause changes in phenotype or tag nearby mutations that contain the key differences influencing individual variation and susceptibility to disease. GWA studies allow researchers to sample 500,000 or more SNPs from each subject in a study, capturing variation uniformly across the genome. To date, these studies have identified risk and protective factors for asthma, cancer, diabetes, heart disease, mental illness, and other human traits.
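
How well one SNP "tags" a nearby variant is commonly summarized by the squared correlation between the two sites, usually written r² (a linkage disequilibrium measure). The sketch below is illustrative only: the haplotype counts are invented, the function name ld_r_squared is hypothetical, and both sites are assumed to be polymorphic so that the denominator is non-zero.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Squared correlation (r^2) between two biallelic sites.

    Arguments are counts of the four haplotypes, where A/a are the alleles
    at the genotyped SNP and B/b are the alleles at the nearby variant.
    Assumes both sites are polymorphic (no allele frequency of 0 or 1).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A
    p_B = (n_AB + n_aB) / total    # frequency of allele B
    d = p_AB - p_A * p_B           # deviation from independence
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))


# Hypothetical haplotype counts: A usually travels with B.
print(round(ld_r_squared(40, 10, 10, 40), 3))
```

An r² near 1 means the genotyped SNP is an almost perfect proxy for the untyped variant; an r² near 0 means it carries almost no information about it.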

Most genetic variations are associated with the geographical and historical populations in which the mutations first arose.^[6] This ability of SNPs to tag surrounding blocks of ancient DNA (haplotypes) underlies the rationale for GWAS. For the same reason, however, studies must take account of the geographical and racial background of participants, controlling for what is called population stratification. As the peoples of the world have migrated and intermarried over many generations, these geographical variations have also been broken down and mixed over time.^[7]
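
A toy calculation shows why population stratification matters. In the sketch below, two hypothetical subpopulations differ in both allele frequency and disease frequency; within each one the allele has no effect (odds ratio 1), yet pooling them produces an apparent association. The counts are invented, and the stratified estimate uses the Mantel-Haenszel odds ratio as a simple stand-in; it is not the principal components method of the cited work.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: cases (a alt, b ref), controls (c alt, d ref)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled across strata of (a, b, c, d) counts."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical allele counts per subpopulation:
# (case alt, case ref, control alt, control ref)
strata = [
    (80, 20, 40, 10),    # population 1: allele common, many cases
    (10, 40, 40, 160),   # population 2: allele rare, few cases
]

# Ignoring ancestry: add the two tables together cell by cell.
pooled = tuple(sum(cells) for cells in zip(*strata))

pooled_or = odds_ratio(*pooled)
stratified_or = mantel_haenszel_or(strata)
print(f"pooled OR = {pooled_or:.4f}, stratified OR = {stratified_or:.4f}")
```

The pooled odds ratio is well above 1 even though neither subpopulation shows any effect, which is the kind of false signal that stratification control is meant to remove.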

Genes identified

In 2005, a GWAS found an association between age-related macular degeneration (ARMD) and a variation in the gene for complement factor H (CFH), a protein that regulates the complement system, which is involved in inflammation.^[8] This association was unexpected from previous research in ARMD and pointed to an inflammatory process in the disease. Together with four other variants, it can predict half the risk of ARMD between siblings, making this among the most successful examples of GWAS.^[3]

In 2007, a GWAS found associations between type 2 diabetes and several SNPs in the genes TCF7L2, SLC30A8 and others.^[9]

In 2007, the Wellcome Trust Case Control Consortium carried out genome-wide association studies of seven diseases: coronary heart disease, type 1 diabetes, type 2 diabetes, rheumatoid arthritis, Crohn's disease, bipolar disorder, and hypertension. The study uncovered many new genes underlying these diseases.^[10]^[11]

In many traditional genetic diseases, such as hemophilia, the responsible gene variants are always associated with the disease. Other variants are associated only with an increased risk. Disappointingly, most of the SNP variations found by GWAS are associated with only a small increase in disease risk and have only small predictive value. The median odds ratio is 1.33 per SNP; some variants carry odds ratios above 3.0, and some exceed 12.0. A common pattern is that a few variants have a large effect, but most have small effects.^[3]^[12]
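
For readers unfamiliar with the odds ratio, the following sketch computes one from a 2x2 table of allele counts, together with an approximate 95% confidence interval on the log scale (the Woolf method). The counts are invented for illustration, the function name odds_ratio_with_ci is hypothetical, and all four cells are assumed to be non-zero.

```python
import math


def odds_ratio_with_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Odds ratio and approximate confidence interval for a 2x2 table.

    The odds of carrying the alternate allele among cases are divided by
    the same odds among controls. z=1.96 gives a 95% interval.
    Assumes every cell count is greater than zero.
    """
    or_value = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) from the reciprocal cell counts.
    se_log = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# Hypothetical allele counts for one SNP.
or_value, lower, upper = odds_ratio_with_ci(120, 80, 90, 110)
print(f"OR = {or_value:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio of 1 means no association; an interval that excludes 1 suggests the difference between cases and controls is unlikely to be chance alone for that single test.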

Clinical applications

One of the challenges for GWAS in the future will be to apply the findings in a way that accelerates drug and diagnostics development. This includes better integration of genetic studies into the drug-development process and a focus on the role of genetic variation in maintaining health as a blueprint for designing new drugs and diagnostics.^[13] One such success is the identification of genetic variants associated with response to anti-hepatitis C virus treatment. For genotype 1 hepatitis C treated with pegylated interferon-alpha-2a or pegylated interferon-alpha-2b (brand names Pegasys or PEG-Intron) combined with ribavirin, a GWAS^[14] has shown that genetic polymorphisms near the human IL28B gene, which encodes interferon lambda 3, are associated with significant differences in response to the treatment. A later report demonstrated that the same genetic variants are also associated with the natural clearance of the genotype 1 hepatitis C virus.^[15]

Problems

GWA studies are necessarily hypothesis-free: that is, they search the entire genome for associations rather than focusing on small candidate areas. This aspect of GWA studies has attracted criticism as expensive "factory science". Robert Elston is a prominent proponent of linkage analysis, although he accepts that association may occasionally be useful. Methodologically, the power of association to localize a mutation translates directly into the need for extremely dense searches. This led Pearson and Manolio to note that "the GWA approach can also be problematic because the massive number of statistical tests performed presents an unprecedented potential for false-positive results".^[4] Alternative strategies such as linkage analysis act as systematic studies of variation without needing variants at each region.
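
The scale of the false-positive problem can be shown with simple arithmetic. The sketch below assumes 500,000 independent tests (the SNP count mentioned in the Background section) and a conventional per-test significance level of 0.05, and applies a Bonferroni correction. Bonferroni is one common, conservative choice and is used here as an assumption, not as the method of any cited study; the function names are hypothetical.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of tests passing `alpha` by chance if no SNP has any effect."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at `alpha`."""
    return alpha / n_tests


n_snps = 500_000   # assumed number of SNPs tested
alpha = 0.05       # assumed overall significance level

print("chance hits at p < 0.05:", expected_false_positives(alpha, n_snps))
print("Bonferroni per-SNP threshold:", bonferroni_threshold(alpha, n_snps))
```

Because nearby SNPs are correlated, the tests are not truly independent, so a Bonferroni threshold computed this way is stricter than strictly necessary.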

See also

Association mapping

References

[1] Klein RJ (March 2005). "Complement factor H polymorphism in age-related macular degeneration". Science 308: 385–9. PMID 15761122.
[2] Johnson A, O'Donnell C (2009). "An open access database of genome-wide association results". BMC Medical Genetics 10: 6. doi:10.1186/1471-2350-10-6. PMC 2639349. PMID 19161620. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2639349.
[3] Manolio TA (July 2010). "Genomewide association studies and assessment of the risk of disease" (PDF). N. Engl. J. Med. 363 (2): 166–76. doi:10.1056/NEJMra0905980. PMID 20647212. http://www.nejm.org/doi/pdf/10.1056/NEJMra0905980.
[4] Pearson TA, Manolio TA (March 2008). "How to interpret a genome-wide association study" (PDF). J. Am. Med. Ass. 299 (11): 1335–44. doi:10.1001/jama.299.11.1335. PMID 18349094. http://jama.ama-assn.org/content/299/11/1335.full.pdf+html.
[5] "Genome-Wide Association Studies". National Human Genome Research Institute. http://www.genome.gov/20019523.
[6] Novembre J, Johnson T, Bryc K, Kutalik ZN, Boyko AR, Auton A, Indap A, King KS, et al. (2008). "Genes mirror geography within Europe". Nature 456 (7218): 98–101. doi:10.1038/nature07331. PMC 2735096. PMID 18758442. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2735096.
[7] Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (August 2006). "Principal components analysis corrects for stratification in genome-wide association studies". Nat. Genet. 38 (8): 904–9. doi:10.1038/ng1847. PMID 16862161.
[8] Klein RJ, et al. (2005). "Complement factor H polymorphism in age-related macular degeneration". Science 308 (5720): 385–9. doi:10.1126/science.1109557. PMC 1512523. PMID 15761122. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1512523.
[9] Sladek R, Rocheleau G, Rung J, et al. (2007). "A genome-wide association study identifies novel risk loci for type 2 diabetes". Nature 445 (7130): 881–5. doi:10.1038/nature05616. PMID 17293876.
[10] "Largest ever study of genetics of common diseases published today" (Press release). Wellcome Trust Case Control Consortium. 2007-06-06. http://www.wtccc.org.uk/info/070606.shtml. Retrieved 2008-06-19.
[11] Wellcome Trust Case Control Consortium (2007). "Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls". Nature 447 (7145): 661–78. doi:10.1038/nature05911. PMC 2719288. PMID 17554300. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2719288.
[12] Ku CS, Loy EY, Pawitan Y, Chia KS (2010). "The pursuit of genome-wide association studies: where are we now?". Journal of Human Genetics 55 (4): 195–206. doi:10.1038/jhg.2010.19. PMID 20300123.
[13] Iadonato SP, Katze MG (2009). "Genomics: Hepatitis C virus gets personal". Nature 461 (7262): 357–8. doi:10.1038/461357a. PMID 19759611.
[14] Ge D, Fellay J, Thompson AJ, et al. (2009). "Genetic variation in IL28B predicts hepatitis C treatment-induced viral clearance". Nature 461 (7262): 399–401. doi:10.1038/nature08309. PMID 19684573.
[15] Thomas DL, Thio CL, Martin MP, et al. (2009). "Genetic variation in IL28B and spontaneous clearance of hepatitis C virus". Nature 461 (7265): 798–801. doi:10.1038/nature08463. PMID 19759533.

Reviews

Wang WY, Barratt BJ, Clayton DG, Todd JA (February 2005). "Genome-wide association studies: theoretical and practical concerns". Nat. Rev. Genet. 6 (2): 109–18. doi:10.1038/nrg1522. PMID 15716907.
Hardy J, Singleton A (April 2009). "Genomewide association studies and human disease". N. Engl. J. Med. 360 (17): 1759–68. doi:10.1056/NEJMra0808700. PMID 19369657.
Ku CS, Loy EY, Pawitan Y, Chia KS (April 2010). "The pursuit of genome-wide association studies: where are we now?". J. Hum. Genet. 55 (4): 195–206. doi:10.1038/jhg.2010.19. PMID 20300123.

External links

Whole genome association study — entry in the public domain NCI Dictionary of Cancer Terms.
Whole genome association studies — by the National Human Genome Research Institute
Performing Genome Wide Association (GWAS) using R and GenABEL — Bioinformatics tutorials
Principles for the post-GWAS functional characterisation of risk loci — by Mills I, Dall'Olio GM, Burke DF, Engelken J, Glazov EA, et al.
GWAS Central — a central database of summary-level genetic association findings
Barrett, Jeff (18 July 2010). "How to read a genome-wide association study". Genomes Unzipped. http://www.genomesunzipped.org/2010/07/how-to-read-a-genome-wide-association-study.php.
