CROSS REFERENCE TO OTHER APPLICATIONS
This application is a continuation of co-pending U.S. patent application Ser. No. 12/956,525 entitled POLYMORPHISMS ASSOCIATED WITH PARKINSON'S DISEASE filed Nov. 30, 2010 which is incorporated herein by reference for all purposes, which claims priority to U.S. Provisional Patent Application No. 61/265,304 entitled POLYMORPHISMS ASSOCIATED WITH PARKINSON'S DISEASE filed Nov. 30, 2009 and U.S. Provisional Patent Application No. 61/359,769 entitled POLYMORPHISMS ASSOCIATED WITH PARKINSON'S DISEASE filed Jun. 29, 2010 which are incorporated herein by reference for all purposes.
FIELD OF THE INVENTION
The present invention is related to polymorphisms associated with Parkinson's disease. More specifically, the invention is related to compositions, methods and systems for use in therapeutic and preventative treatment, study, diagnosis and prognosis of Parkinson's disease.
CROSS-REFERENCE TO SEQUENCE LISTING
The sequence listing included in the electronic file submitted herewith as one of the parts of this application, entitled “23MEP023_PD_Sequence_Listing”, is incorporated by reference into this application in its entirety. This sequence listing file was created on Apr. 19, 2012, and is 3 kilobytes in size.
BACKGROUND OF THE INVENTION
Parkinson's disease (PD) is a progressive degenerative disease of the central nervous system (CNS). PD is characterized by muscle rigidity, tremors, a slowness of physical movement (bradykinesia), impaired balance and coordination, and, in advanced stages, a loss of physical movement (akinesia). Over one million Americans suffer from Parkinson's disease, with a prevalence of approximately 1 in 272 or 0.37% in the United States (US Census Bureau, Population Estimates, 2004).
There is no known cure for PD. Patients are treated with drugs and physical therapy to control the symptoms, but the disease is a progressive disorder and symptoms continue to worsen throughout life. Four major classes of drugs are used to treat PD: Levodopa, direct dopamine agonists, catechol-O-methyltransferase (COMT) inhibitors and anticholinergics. Other types of drugs include selegiline (an MAO-B inhibitor), amantadine (an antiviral agent), vitamin E and hormone replacement therapy. Although these treatments may provide relief from the symptoms of PD, these noncurative drug treatments are often accompanied by side effects, such as low blood pressure, nausea, constipation, and various psychiatric or behavioral disorders (e.g., hallucinations, depression, and sleep disorders).
While the molecular bases for PD have not been fully elucidated, several genetic regions have been found to be associated with PD. The PARK1 region at 4q21 contains the alpha-synuclein (SNCA) gene. Certain mutations in this gene confer a rare autosomal dominant form of PD (Duvoisin, R. C. (1996), Recent advances in the genetics of Parkinson's disease, Adv Neurol 69:33-40; Polymeropoulos et al. (1997) Mutation in the alpha-synuclein gene identified in families with Parkinson's disease, Science 276:2045-7; and Kruger et al. (1998) Ala30Pro mutation in the gene encoding alpha-synuclein in Parkinson's disease, Nat Genet 18:106-108). The PARK2 region at 6q25-27 contains the parkin gene. The loss of function of both copies of the parkin gene confers an autosomal recessive juvenile form of PD (Abbas et al. (1999) A wide variety of mutations in the parkin gene are responsible for autosomal recessive parkinsonism in Europe, Hum Mol Genet 8:567-574; Lucking et al. (1998) Homozygous deletions in the parkin gene in European and North African families with autosomal recessive juvenile parkinsonism, Lancet 352:1355-1356; and Lucking et al. (2000) Association between early-onset Parkinson's disease and mutations in the parkin gene, N Engl J Med 342:1560-1567). Other regions believed to contain one or more genes associated with PD include PARK3 at 2p13 (autosomal dominant), PARK4 at 4p15 (autosomal dominant; same locus as PARK1), PARK5 at 4p14 (which contains a gene encoding a neuron-specific C-terminal ubiquitin hydrolase), PARK6 at 1p35 (autosomal recessive), PARK7 at 1p36 (which contains the DJ-1 gene; autosomal recessive) and PARK8 at 12p11.2-q13.1 (which contains the LRRK2 gene; autosomal dominant). Additional loci designated PARK9, PARK10, PARK11, PARK12, PARK13, PARK14, PARK15, and PARK16 have also been linked to PD.
While the molecular bases for most cases of PD are unclear, the various genetic regions that have been linked to this disease serve to illustrate the potential that the etiology of PD may involve the interaction of a large number of genetic components.
SUMMARY OF THE INVENTION
The present application provides compositions, methods and systems for determining increased or decreased risk or susceptibility of an individual to developing Parkinson's disease (PD). In one aspect, the application provides nucleic acid sequences that may be used to determine the presence or absence of nucleotides at polymorphic sites in an individual's RNA or genomic DNA that are associated with susceptibility to or protection from PD. In another aspect, the application provides a method for identifying a human subject having an increased or decreased susceptibility to PD, including the following steps: i) obtaining a nucleic acid sample from a patient; and ii) detecting in the sample the identity of the nucleotide or nucleotides at one or more polymorphic nucleotide positions listed in Tables 1-1 (SEQ ID NO: 1-8) and 2-1 (SEQ ID NO: 9).
In an additional aspect, methods of identifying a modulator of a PD phenotype are also provided. The methods include contacting a potential modulator to a gene or gene product, e.g. wherein the gene or gene product comprises or is closely linked to a polymorphism described herein (e.g. in Table 1-1 (SEQ ID NO: 1-8) and/or Table 2-1 (SEQ ID NO: 9)). An effect of the potential modulator on the gene or gene product is detected, thereby identifying whether the potential modulator modulates the phenotype.
Kits for performing any of the methods herein are another feature of the disclosure in this application. Such kits can include probes or amplicons for detecting any polymorphism herein, appropriate packaging materials, and instructions for practicing the methods.
The application also provides systems for generating a prognosis of a human subject's increased or decreased susceptibility to PD based on genotyping data spanning hundreds of thousands of single nucleotide polymorphisms (SNPs). The system may include means for storing a subject's profile comprising a set of patient-specific information including a subject's medical history, the family medical history and the subject's genetic testing results including genotypes at the various polymorphic sites listed in Tables 1-1 (SEQ ID NO: 1-8) and 2-1 (SEQ ID NO: 9).
The invention also provides methods of PD prognosis based on expression profiling. Such methods include determining the expression levels of at least 2 and no more than 5,000 genes in a subject, wherein at least two of the genes are selected from the group consisting of MCCC1, TMEM175, RIT2, GAK, DGKQ, RIN, SYT4, STBD1, SCARB2, HLA-DRB1, HLA-DQA1, LOC729862, PGDB3P2 and LRRK2, wherein the expression levels are used to create an expression profile. In one aspect, the expression levels are used for PD prognosis by comparing expression levels of the genes in a human subject to expression levels of the genes in a control human subject known to have Parkinson's disease (PD) and/or a healthy control human subject known to not have PD, wherein similarity of expression profiles in the subject and the control subject having PD is suggestive that the subject has a higher likelihood of having or developing Parkinson's disease, and similarity of the expression profiles of the subject and a control subject not having PD is suggestive that the subject has a lower likelihood of having or developing Parkinson's disease.
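By way of non-limiting illustration, the profile comparison described above may be sketched in Python as follows. The application does not specify a similarity measure; the use of Pearson correlation, the function names, and all numeric expression values are assumptions made solely for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sd_x * sd_y)

def compare_profiles(subject, pd_control, healthy_control):
    """Report which control profile the subject's profile resembles more.

    Pearson correlation is an assumed similarity measure, not one
    prescribed by the application.
    """
    r_pd = pearson(subject, pd_control)
    r_healthy = pearson(subject, healthy_control)
    label = "PD-like" if r_pd > r_healthy else "healthy-like"
    return label, r_pd, r_healthy

# Hypothetical expression levels for three genes (illustrative values only).
label, r_pd, r_healthy = compare_profiles(
    subject=[1.0, 2.0, 3.0],
    pd_control=[2.0, 4.0, 6.0],
    healthy_control=[3.0, 2.0, 1.0],
)
print(label, r_pd, r_healthy)
```

In practice a profile would span many more genes, and other similarity or distance measures could be substituted without changing the structure of the comparison.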
Any of the above methods can further include informing the subject or a relative thereof of the presence of or susceptibility to PD, or administering a treatment regimen effective to treat or effect prophylaxis of PD in said subject.
Terms and symbols of genetics, molecular biology, biochemistry and nucleic acid used herein follow those of standard treatises and texts in the field, e.g. Kornberg and Baker, DNA Replication, Second Edition (W. H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); and the like.
All terms are to be understood with their typical meanings established in the relevant art. Without limiting any term, further clarifications for some of the terms are provided below:
The term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide, whether singular or in polymers, naturally occurring or non-naturally occurring, double-stranded or single-stranded, coding (e.g. translated gene) or non-coding (e.g. regulatory region), or any fragments, derivatives, mimetics or complements thereof. Examples of nucleic acids include oligonucleotides, nucleotides, polynucleotides, nucleic acid sequences, genomic sequences, antisense nucleic acids, DNA regions, probes, primers, genes, regulatory regions, introns, exons, open-reading frames, binding sites, target nucleic acids and allele-specific nucleic acids. A nucleic acid can include one or more polymorphisms, variations or mutations (e.g., SNPs, insertions, deletions, inversions, translocations, etc.). A nucleic acid includes analogs (e.g., phosphorothioates, phosphoramidates, methyl phosphonate, chiral-methyl phosphonates, 2-O-methyl ribonucleotides) or modified nucleic acids (e.g., modified backbone residues or linkages) or nucleic acids that are combined with carbohydrates, lipids, protein or other materials, or peptide nucleic acids (PNAs) (e.g., chromatin, ribosomes, transcriptosomes, etc.) or nucleic acids in various structures (e.g., A DNA, B DNA, Z-form DNA, siRNA, tRNA, ribozymes, etc.). A nucleic acid may also include a detectable label. The term “detectable label” as used herein refers to, for example, a luminescent label, a light scattering label or a radioactive label, or any other form of labeling that can be detected by a physical, chemical, or a biological process. Fluorescent labels include commercially available fluorescein phosphoramidites such as Fluoreprime (Pharmacia), Fluoredite (Millipore) and FAM (ABI).
The term “hybridization” as used herein refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide; triple-stranded hybridization is also theoretically possible. Hybridizations are usually performed under stringent conditions. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations. The term “specific hybridization” refers to the ability of a first nucleic acid to bind, duplex or hybridize to a second nucleic acid in a manner such that the second nucleic acid can be identified or distinguished from other components of a mixture (e.g., cellular extracts, genomic DNA, etc.). In certain embodiments, specific hybridization is performed under stringent conditions.
The term “hybridization-based assay” means any assay that relies on the formation of a stable duplex or triplex between a probe and a target nucleotide sequence for detecting or measuring such a sequence. Hybridization-based assays include, without limitation, assays based on use of oligonucleotides, such as polymerase chain reactions, oligonucleotide ligation reactions, single-base extensions of primers, circularizable probe reactions, allele-specific oligonucleotide hybridizations, either in solution phase or bound to solid phase supports, such as microarrays or microbeads.
The terms “isolated” and “purified” refer to a material that is substantially or essentially removed from or concentrated in its natural environment. For example, a purified nucleic acid is one that is separated from the nucleic acids that normally flank it or from other biological materials (e.g., other nucleic acids, proteins, lipids, cellular components, etc.) in a sample.
The term “linkage disequilibrium” refers to the preferential segregation of a particular polymorphic form with another polymorphic form at a different chromosomal location more frequently than expected by chance.
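By way of non-limiting illustration, linkage disequilibrium between two biallelic sites is commonly quantified by the coefficient D, its normalized form D′, and the squared correlation r². The following Python sketch computes these standard measures; the function name and the example frequencies are assumptions made for this example and do not appear in the application.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A at the first site
          and allele B at the second site
    p_a:  frequency of allele A at the first site
    p_b:  frequency of allele B at the second site
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two sites segregated independently.
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = (d * d) / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical frequencies: alleles A and B always occur together.
print(ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5))
# Hypothetical frequencies: the two sites are independent.
print(ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5))
```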
The term “modulate” refers to a change in expression, lifespan, function or activity of a nucleic acid or a protein. Such changes may include, for example, an increase, decrease, alteration, enhancement or inhibition of expression or activity of a nucleic acid or protein.
The term “PD-related nucleic acid” refers to a nucleic acid, or fragment, derivative (e.g., RNA), variant, polymorphism, or complement thereof, associated with resistance or susceptibility to PD including, for example, at least one or more PD-associated polymorphisms, genomic regions spanning 10 kb immediately upstream and 10 kb immediately downstream of a PD-associated polymorphism, coding and non-coding regions of an associated gene, and/or genomic regions spanning 10 kb immediately upstream and 10 kb immediately downstream of an associated gene, and nucleotide variants thereof.
The term “phenotype” is a trait or collection of traits that is/are observable in an individual or population. The trait can be quantitative (a quantitative trait, or QTL) or qualitative. For example, susceptibility to Parkinson's disease is a phenotype that can be identified according to the methods and compositions of the application as described herein.
A “PD susceptibility phenotype” is a phenotype that displays a predisposition towards developing Parkinson's disease in an individual. A phenotype that displays a predisposition for PD, can, for example, show a higher likelihood that the disease will develop in an individual with the phenotype than in members of a relevant general population under a given set of environmental conditions (diet, physical activity regime, geographic location, etc.).
The terms “polymorphism”, “polymorphic nucleotide”, “polymorphic site” or “polymorphic nucleotide position” refer to a position in a nucleic acid that possesses the quality or character of occurring in several different forms. A nucleic acid may be naturally or non-naturally polymorphic, e.g., having one or more sequence differences (e.g., additions, deletions and/or substitutions) as compared to a reference sequence. A reference sequence may be based on publicly available information (e.g., the U.C. Santa Cruz Human Genome Browser Gateway (genome.ucsc.edu/cgi-bin/hgGateway) or the NCBI website (www.ncbi.nlm.nih.gov)) or may be determined by a practitioner of the present invention using methods well known in the art (e.g., by sequencing a reference nucleic acid). A nucleic acid polymorphism is characterized by two or more “alleles”, or versions of the nucleic acid sequence. Typically, an allele of a polymorphism that is identical to a reference sequence is referred to as a “reference allele” and an allele of a polymorphism that is different from a reference sequence is referred to as an “alternate allele”, or sometimes a “variant allele”. As used herein, the term “major allele” refers to the more frequently occurring allele at a given polymorphic site, and “minor allele” refers to the less frequently occurring allele, as present in the general or study population. The term “risk allele” as used herein refers to an allele of a genetic polymorphism associated with an increased risk for PD. The term “protective allele” as used herein refers to the allele associated with a decreased risk for PD.
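By way of non-limiting illustration, the major and minor alleles at a biallelic site may be determined from genotype counts in a study population, as in the following Python sketch. The function name and the genotype counts are hypothetical and are used only for this example.

```python
def allele_summary(genotype_counts):
    """Compute allele frequencies and the major/minor allele.

    genotype_counts maps a two-letter diploid genotype (e.g. "TG")
    to the number of individuals carrying it.
    """
    allele_counts = {}
    for genotype, n in genotype_counts.items():
        for allele in genotype:
            allele_counts[allele] = allele_counts.get(allele, 0) + n
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    freqs = {a: c / total for a, c in allele_counts.items()}
    ranked = sorted(freqs, key=freqs.get, reverse=True)
    return {"frequencies": freqs, "major": ranked[0], "minor": ranked[-1]}

# Hypothetical counts for a T/G polymorphism in 100 individuals.
summary = allele_summary({"TT": 60, "TG": 30, "GG": 10})
print(summary)
```

Whether the major or the minor allele is the risk allele is a separate question that is answered by the association data, not by the frequencies alone.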
The term “single nucleotide polymorphism” or “SNP” refers to a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations). A single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site. Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 depicts a Manhattan plot of the GWAS, showing the distribution of p-values along the genome, with chromosomes arranged along the X-axis, and all SNPs associated with p-values under 10e-7 represented with an “x”.
FIG. 2 depicts a quantile-quantile plot of the GWAS performed by the applicants. There is no evidence of population structure biasing the results, as the plot shows no inflation. The genomic inflation factor (Devlin and Roeder (1999) Genomic control for association studies, Biometrics 55:997-1004) was calculated to be 1.11.
FIG. 3 depicts plots of the p-values surrounding significant SNPs around the MCCC1/LAMP3 region. SNPs with a p-value under 10e-6 are shown.
FIG. 4 depicts plots of the p-values surrounding significant SNPs around the GAK/DGKQ/TMEM175 region. SNPs with a p-value under 10e-6 are shown.
FIG. 5 depicts plots of the p-values surrounding significant SNPs around the RIT2 region. SNPs with a p-value under 10e-6 are shown.
FIG. 6 depicts plots of the p-values surrounding significant SNPs around the SCARB2 region. SNPs with a p-value under 10e-6 are shown.
FIG. 7 depicts plots of the p-values surrounding significant SNPs around the HLA-DRB1 and HLA-DQA1 regions. SNPs with a p-value under 10e-6 are shown.
FIG. 8 depicts plots of the p-values surrounding significant SNPs around the LOC729862 region. SNPs with a p-value under 10e-6 are shown.
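By way of non-limiting illustration, the genomic inflation factor referred to in connection with FIG. 2 is conventionally computed as the median of the observed one-degree-of-freedom chi-square association statistics divided by the median of the chi-square distribution with one degree of freedom (approximately 0.455). The following Python sketch shows that calculation; the function name and the input values are assumptions for this example and are not the applicants' data.

```python
from statistics import median

# Median of the chi-square distribution with one degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Genomic inflation factor from per-SNP 1-df chi-square statistics."""
    if not chi2_stats:
        raise ValueError("at least one test statistic is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN

# Hypothetical test statistics (illustrative values only).
print(genomic_inflation([0.10, 0.45, 0.50, 1.20, 3.80]))
```

A value near 1 indicates that the bulk of the test statistics follows the null distribution.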
DETAILED DESCRIPTION OF THE INVENTION
The invention provides a set of novel polymorphisms associated with Parkinson's disease (PD). Identification of such polymorphisms is useful for the development and design of diagnostic or prognostic assays for PD. The polymorphisms may also have additional applications, including diagnostic and prognostic use in Parkinson's disease-related conditions, therapeutic treatments for Parkinson's disease, genetic linkage analysis and positional cloning.
Polymorphisms of the Invention
A genome-wide association study (GWAS) was performed to search for novel genetic variants associated with PD. Such studies have proven successful in identifying many hundreds of genetic associations to a wide range of diseases (Hirschhorn, J N (2009) Genomewide Association Studies—Illuminating Biologic Pathways, N Engl J Med 360: 1699-1701). Briefly, a GWAS is performed by collecting genome-wide SNP data on a large number of cases and controls and then testing each of the many (typically over 500,000) SNPs that were typed for significant frequency differences between cases and controls. A significant frequency difference is evidence that the SNP is associated with the disease; however, because of the large multiple-testing burden incurred by performing hundreds of thousands of tests, the association must be highly significant to be considered true. Typical standards (employed by the applicants in the present application) require a p-value of under 1e-7 and replication of the association in an independent sample.
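By way of non-limiting illustration, a per-SNP test for an allele frequency difference between cases and controls may be sketched in Python as follows. The application does not state which statistical test was used; the one-degree-of-freedom allelic chi-square test shown here, the function names, and the allele counts are assumptions made for this example. Only the 1e-7 threshold is taken from the text above.

```python
import math

P_VALUE_THRESHOLD = 1e-7  # significance threshold stated in the text

def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-square test on a 2x2 table of allele counts.

    case_a, case_b: counts of the two alleles among cases
    ctrl_a, ctrl_b: counts of the two alleles among controls
    Returns (chi2, p_value) with one degree of freedom.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # For one degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def is_genome_wide_significant(p_value, threshold=P_VALUE_THRESHOLD):
    return p_value < threshold

# Hypothetical allele counts (illustrative values only).
chi2, p = allelic_chi2(case_a=700, case_b=300, ctrl_a=500, ctrl_b=500)
print(chi2, p, is_genome_wide_significant(p))
```

Passing the threshold is only the first requirement; as stated above, the association must also replicate in an independent sample.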
The GWAS performed by the applicants identified a total of 12 SNPs that were found to be independently and significantly associated with Parkinson's disease. Of these, 8 SNPs have never before been demonstrated to be associated with PD. In addition, 4 SNPs previously associated with PD replicated in the GWAS performed by the applicants. The presence of these 4 previously identified and well-known PD-associated SNPs in the set of significantly associated SNPs serves as supporting evidence for the validity of the study design and methodology. Finally, the GWAS performed by the applicants identified a novel PD-associated SNP modifying a known PD-associated mutation in the LRRK2 gene (LRRK2 G2019S, rs34637584).
Tables 1-1 (SEQ ID NO: 1-8) and 1-2 (SEQ ID NO: 1-8).
Tables 1-1 (SEQ ID NO: 1-8) and 1-2 (SEQ ID NO: 1-8) identify the 8 novel PD-associated SNPs that are independently associated with PD. These SNPs were selected on the basis of fulfilling the following criteria: 1) a p-value under 1e-7 for association with PD; 2) replication in the National Institute of Neurological Disorders and Stroke (NINDS) Parkinson's disease dataset; 3) evidence of independent effect (significant after controlling for other SNPs in the list); and 4) no evidence of genotyping error.
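By way of non-limiting illustration, the four selection criteria listed above may be applied to a set of candidate SNPs as in the following Python sketch. The record type, its field names, and the example records are hypothetical; how replication, independence, and genotyping error are themselves assessed is outside the scope of this example.

```python
from dataclasses import dataclass

@dataclass
class SnpResult:
    """Hypothetical per-SNP summary record (field names are assumptions)."""
    snp_id: str
    p_value: float          # association p-value in the discovery GWAS
    replicated: bool        # replicated in the independent NINDS dataset
    independent: bool       # significant after controlling for other SNPs
    genotyping_error: bool  # any evidence of genotyping error

def meets_criteria(result, p_threshold=1e-7):
    """Apply the four selection criteria stated in the text."""
    return (result.p_value < p_threshold
            and result.replicated
            and result.independent
            and not result.genotyping_error)

# Hypothetical candidates (placeholder identifiers and values).
candidates = [
    SnpResult("snp_example_1", 3e-9, True, True, False),
    SnpResult("snp_example_2", 3e-9, False, True, False),
    SnpResult("snp_example_3", 5e-6, True, True, False),
]
selected = [c.snp_id for c in candidates if meets_criteria(c)]
print(selected)
```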
SNP rs10513789 (SEQ ID NO: 1) is found in the following sequence: SEQ ID NO: 1: 5′-tgatggtttttcaattttgttatgttgata[t/g]gtactgcatgataccagattacaaacaggg-3′. The major allele is T, the minor allele is G, with the major allele being the risk allele. This association lies in an intron of MCCC1. Another candidate gene in this region is LAMP3, which has been found to be overexpressed in the brains of individuals with PD.
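By way of non-limiting illustration, the bracketed notation used in the sequence above, in which the two alleles appear as [t/g] between the flanking sequences, may be expanded into one sequence per allele, for example when designing allele-specific probes. The following Python sketch assumes a single biallelic substitution written in that notation; the function name is hypothetical.

```python
import re

# Flanking sequence for rs10513789 as given in SEQ ID NO: 1 above.
SEQ_ID_NO_1 = ("tgatggtttttcaattttgttatgttgata[t/g]"
               "gtactgcatgataccagattacaaacaggg")

SNP_PATTERN = re.compile(r"([acgt]+)\[([acgt])/([acgt])\]([acgt]+)")

def expand_snp(annotated):
    """Expand 'left[x/y]right' into one full sequence per allele.

    Returns (sequences_by_allele, offset), where offset is the
    zero-based position of the polymorphic nucleotide.
    """
    match = SNP_PATTERN.fullmatch(annotated.lower())
    if match is None:
        raise ValueError("expected the form left[x/y]right with a/c/g/t only")
    left, allele_1, allele_2, right = match.groups()
    sequences = {
        allele_1: left + allele_1 + right,
        allele_2: left + allele_2 + right,
    }
    return sequences, len(left)

sequences, offset = expand_snp(SEQ_ID_NO_1)
for allele, sequence in sequences.items():
    print(allele, offset, sequence)
```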
SNP rs6599389 (SEQ ID NO: 2) is found in the following sequence:
SEQ ID NO: 2:
The major allele is G, the minor allele is A, with the minor allele being the risk allele. This SNP lies in an intron of TMEM175. Other candidate genes in this region include GAK and DGKQ. GAK is a promising candidate gene as it is differentially expressed in Parkinson's disease in the substantia nigra (Grünblatt et al. (2004) Gene expression profiling of parkinsonian substantia nigra pars compacta: alterations in ubiquitin-proteasome, heat shock protein, iron and oxidative stress regulated proteins, cell adhesion/cellular matrix and vesicle trafficking genes, J Neural Transm 111(12):1543-73).
SNP rs873785 (SEQ ID NO: 3) is found in the following sequence:
SEQ ID NO: 3: