Methods and compositions for efficient nucleic acid sequencing

USPTO Application #: 20080108074
Class: 435006000 (USPTO)
Inventor: Radoje T. Drmanac
Agent: Marshall, Gerstein & Borun LLP - Chicago, IL, US
Related Patent Categories: Chemistry: Molecular Biology And Microbiology, Measuring Or Testing Process Involving Enzymes Or Micro-organisms; Composition Or Test Strip Therefore; Processes Of Forming Such Composition Or Test Strip, Involving Nucleic Acid

Abstract: Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

The Patent Description & Claims data below is from USPTO Patent Application 20080108074.

[0001] The present application is a continuation-in-part of co-pending U.S. patent application Ser. No. 08/303,058, filed Sep. 8, 1994; which is a continuation-in-part of U.S. patent application Ser. No. 08/127,420, filed Sep. 27, 1993; the entire text and figures of which disclosures are specifically incorporated herein by reference without disclaimer.

BACKGROUND OF THE INVENTION

[0003] 1. Field of the Invention

[0004] The present invention generally relates to the field of molecular biology.
The invention particularly provides novel methods and compositions to enable highly efficient sequencing of nucleic acid molecules. The methods of the invention are suitable for sequencing long nucleic acid molecules, including chromosomes and RNA, without cloning or subcloning steps.

[0005] 2. Description of the Related Art

[0006] Nucleic acid sequencing forms an integral part of scientific progress today. Determining the sequence, i.e. the primary structure, of nucleic acid molecules and segments is important in regard to individual projects investigating a range of particular target areas. Information gained from sequencing impacts science, medicine, agriculture and all areas of biotechnology. Nucleic acid sequencing is, of course, vital to the human genome project and other large-scale undertakings, the aim of which is to further our understanding of evolution and the function of organisms and to provide an insight into the causes of various disease states.

[0007] The utility of nucleic acid sequencing is evident: for example, the Human Genome Project (HGP), a multinational effort devoted to sequencing the entire human genome, is in progress at various centers. However, progress in this area is generally both slow and costly. Nucleic acid sequences are usually determined on polyacrylamide gels that separate DNA fragments in the range of 1 to 500 bp, differing in length by one nucleotide. The actual determination of the sequence, i.e., the order of the individual A, G, C and T nucleotides, may be achieved in two ways: firstly, using the Maxam and Gilbert method of chemically degrading the DNA fragment at specific nucleotides (Maxam & Gilbert, 1977), or secondly, using the dideoxy chain termination sequencing method described by Sanger and colleagues (Sanger et al., 1977). Both methods are time-consuming and laborious.
[0008] More recently, other methods of nucleic acid sequencing have been proposed that do not employ an electrophoresis step; these methods may be collectively termed Sequencing By Hybridization or SBH (Drmanac et al., 1991; Cantor et al., 1992; Drmanac & Crkvenjakov, U.S. Pat. No. 5,202,231). Development of certain of these methods has given rise to new solid support type sequencing tools known as sequencing chips. The utility of SBH in general is evidenced by the fact that U.S. Patents have been granted on this technology. However, although SBH has the potential for increasing the speed with which nucleic acids can be sequenced, all current SBH methods still suffer from several drawbacks.

[0009] SBH can be conducted in two basic ways, often referred to as Format 1 and Format 2 (Cantor et al., 1992). In Format 1, oligonucleotides of unknown sequence, generally of about 100-1000 nucleotides in length, are arrayed on a solid support or filter so that the unknown samples themselves are immobilized (Strezoska et al., 1991; Drmanac & Crkvenjakov, U.S. Pat. No. 5,202,231). Replicas of the array are then interrogated by hybridization with sets of labeled probes of about 6 to 8 residues in length. In Format 2, a sequencing chip is formed from an array of oligonucleotides with known sequences of about 6 to 8 residues in length (Southern, WO 89/10977; Khrapko et al., 1991; Southern et al., 1992). The nucleic acids of unknown sequence are then labeled and allowed to hybridize to the immobilized oligos.

[0010] Unfortunately, both of these SBH formats have several limitations, particularly the requirement for prior DNA cloning steps. In Format 1, other significant problems include attaching the various nucleic acid pieces to be sequenced to the solid surface support or preparing a large set of longer probes.
In Format 2, major problems include labelling the nucleic acids of unknown sequence, the high noise to signal ratios that generally result, and the fact that only short sequences can be determined. Further problems of Format 2 include the secondary structure formation that prevents access to some targets and the different conditions that are necessary for probes with different GC contents. Therefore, the art would clearly benefit from a new procedure for nucleic acid sequencing, and particularly, one that avoids the tedious processes of cloning and/or subcloning.

SUMMARY OF THE INVENTION

[0011] The present invention seeks to overcome these and other drawbacks inherent in the prior art by providing new methods and compositions for the sequencing of nucleic acids. The novel techniques described herein have been generally termed Format 3 by the inventors and represent marked improvements over the existing Format 1 and Format 2 SBH methods. In the Format 3 sequencing provided by the invention, nucleic acid sequences are determined by means of hybridization with two sets of small oligonucleotide probes of known sequences. The methods of the invention allow high discriminatory sequencing of extremely large nucleic acid molecules, including chromosomal material or RNA, without prior cloning, subcloning or amplification. Furthermore, the present methods do not require large numbers of probes, the complex synthesis of longer probes, or the labelling of a complex mixture of nucleic acid segments.

[0012] To determine the sequence of a nucleic acid according to the methods of the present invention, one would generally identify sequences from the nucleic acid by hybridizing with complementary sequences from two sets of small oligonucleotide probes (oligos) of defined length and known sequence, which cover most combinations of sequences for that length of probe.
One would then analyze the sequences identified to determine stretches of the identified sequences that overlap, and reconstruct or assemble the complete nucleic acid sequence from such overlapping sequences.

[0013] The sequencing methods may be conducted using sequential hybridization with complementary sequences from the two sets of small oligos. Alternatively, a mode described as "cycling" may be employed, in which the two sets of small oligos are hybridized with the unknown sequences simultaneously. The term "cycling" is applied because the discriminatory part of the technique comes from subsequently increasing the temperature to "melt" those hybrids that are non-complementary. Such cycling techniques are commonly employed in other areas of molecular biology, such as PCR, and will be readily understood by those of skill in the art in light of the present disclosure.

[0014] The invention is applicable to sequencing nucleic acid molecules of very long length. As a practical matter, the nucleic acid molecule to be sequenced will generally be fragmented to provide small or intermediate length nucleic acid fragments that may be readily manipulated. The term nucleic acid fragment, as used herein, most generally means a nucleic acid molecule of between about 10 base pairs (bp) and about 100 bp in length. The most preferred methods of the invention are contemplated to be those in which the nucleic acid molecule to be sequenced is treated to provide nucleic acid fragments of intermediate length, i.e., of between about 10 bp and about 40 bp.
However, it should be stressed that the present invention is not a method of completely sequencing small nucleic acid fragments; rather, it is a method of sequencing nucleic acid molecules per se, which involves determining portions of sequence from within the molecule, whether this is done using the whole molecule or, for simplicity, by first fragmenting the molecule into smaller sized sections of from about 4 to about 1000 bases.

[0015] Sequences from nucleic acid molecules are determined by hybridizing to small oligonucleotide probes of known sequence. In referring to "small oligonucleotide probes", the term "small" means probes of less than 10 bp in length, and preferably, probes of between about 4 bp and about 9 bp in length. In one exemplary sequencing embodiment, probes of about 6 bp in length are contemplated to be particularly useful. For the sets of oligos to cover all combinations of sequences for the length of probe chosen, their number will be represented by 4^F, wherein F is the length of the probe. For example, for a 4-mer, the set would contain 256 probes; for a 5-mer, the set would contain 1024 probes; for a 6-mer, 4096 probes; for a 7-mer, 16384 probes; and the like. The synthesis of oligos of this length is very routine in the art and may be achieved by automated synthesis.

[0016] In the methods of the invention, one set of the small oligonucleotide probes of known sequence, which may be termed the first set, will be attached to a solid support, i.e., immobilized on that support in such a way that they are available to take part in hybridization reactions. The other set of small oligonucleotide probes of known sequence, which may be termed the second set, will be probes that are in solution and that are labelled with a detectable label. The sets of oligos may include probes of the same or different lengths.
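As an illustration of the probe-set sizes given in paragraph [0015], the following Python sketch enumerates a complete set of probes for a chosen length and reproduces the quoted counts. It is not part of the application; the names `probe_set_size` and `all_probes` are hypothetical and are introduced here only for the example.

```python
from itertools import product

BASES = "ACGT"  # the four DNA nucleotides


def probe_set_size(probe_length: int) -> int:
    """Number of distinct probes of the given length F, i.e. 4**F."""
    return 4 ** probe_length


def all_probes(probe_length: int) -> list[str]:
    """Enumerate every possible probe sequence of the given length."""
    return ["".join(bases) for bases in product(BASES, repeat=probe_length)]


if __name__ == "__main__":
    # Set sizes for the probe lengths mentioned in paragraph [0015].
    for length in (4, 5, 6, 7):
        print(f"{length}-mer set: {probe_set_size(length)} probes")
    # Explicit enumeration of the 4-mers gives the same count as 4**4.
    print(len(all_probes(4)))
```

Explicit enumeration is cheap at these lengths; the count grows by a factor of four with every added base, reaching 65,536 for 8-mers and 262,144 for 9-mers.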
[0017] The process of sequential hybridization means that nucleic acid molecules, or fragments, of unknown sequence can be hybridized to the distinct sets of oligonucleotide probes of known sequences at separate times (FIG. 1). The nucleic acid molecules or fragments will generally be denatured, allowing hybridization, and added to the first, immobilized set of probes under discriminating hybridization conditions to ensure that only fragments with complementary sequences hybridize. Fragments with non-complementary sequences are removed and the next round of discriminating hybridization is then conducted by adding the second, labelled set of probes, in solution, to the combination of fragments and probes already formed. Labelled probes that hybridize adjacent to a fixed probe will remain attached to the support and can be detected, which is not the case when there is space between the fixed and labelled probes (FIG. 1).

[0018] The process of simultaneous hybridization means that the unknown sequence nucleic acid molecules can be contacted with the distinct sets of oligonucleotide probes of known sequences at the same time. Hybridization will occur under discriminating hybridization conditions. Fragments with non-complementary sequences are then "melted", i.e., removed by increasing the temperature, and the next round of discriminating hybridization is then conducted, allowing any complementary second probes to hybridize. Labelled probes that hybridize adjacent to a fixed probe will then be detected in the same manner.

[0019] Nucleic acid sequences that are "complementary" are those that are capable of base-pairing according to the standard Watson-Crick complementarity rules, and variations of the rules as they apply to modified bases. That is, the larger purines, or modified purines, will always base pair with the smaller pyrimidines to form only known combinations.
These include the standard pairs of Guanine with Cytosine (G:C) and of Adenine with either Thymine (A:T), in the case of DNA, or Uracil (A:U), in the case of RNA. The use of modified bases, or the so-called Universal Base (M, Nichols et al., 1994), is also contemplated.

[0020] As used herein, the term "complementary sequences" means nucleic acid sequences that are substantially complementary over their entire length and have very few base mismatches. For example, nucleic acid sequences of six bases in length may be termed complementary when they hybridize at five out of six positions with only a single mismatch. Naturally, nucleic acid sequences that are "completely complementary" will be nucleic acid sequences that are entirely complementary throughout their entire length and have no base mismatches.

[0021] After identifying, by hybridization to the oligos of known sequence, various individual sequences that are part of the nucleic acid fragments, these individual sequences are next analyzed to identify stretches of sequences that overlap. For example, portions of sequences in which the 5' end is the same as the 3' end of another sequence, or vice versa, are identified. The complete sequence of the nucleic acid molecule or fragment can then be delineated, i.e., it can be reconstructed from the overlapping sequences thus determined.

[0022] The processes of identifying overlapping sequences and reconstructing the complete sequence will generally be achieved by computational analysis. For example, if a labelled probe 5'-TTTTTT-3' hybridizes to the spot containing the fixed probe 5'-AAAAAA-3', a 12-mer sequence from within the nucleic acid molecule is defined, namely 5'-AAAAAATTTTTT-3' (SEQ ID NO:1), i.e. the sequences of the two hybridized probes are combined to reveal a previously unknown sequence. The next question to be answered is which nucleotide follows the newly determined 5'-AAAAAATTTTTT-3' (SEQ ID NO:1) sequence.
There are four possibilities represented by the fixed probe 5'-AAAAAT-3' and labelled probes 5'-TTTTTA-3' for A; 5'-TTTTTT-3' for T; 5'-TTTTTC-3' for C; and 5'-TTTTTG-3' for G. If, for example, the probe 5'-TTTTTC-3' is positive and the other three are negative, then the assembled sequence is extended to 5'-AAAAAATTTTTTC-3' (SEQ ID NO:2). In the next step, an algorithm determines which of the labelled probes TTTTCA, TTTTCT, TTTTCC or TTTTCG are positive at the spot containing the fixed probe AAAATT. The process is repeated until all positive (F+P) oligonucleotide sequences are used or defined as false positives.

[0023] The present invention thus provides a very effective way to sequence nucleic acid fragments and molecules of long length. Large nucleic acid molecules, as defined herein, are those molecules that need to be fragmented prior to sequencing. They will generally be of at least about 45 or 50 base pairs (bp) in length, and will most often be longer. In fact, the methods of the invention may be used to sequence nucleic acid molecules with virtually no upper limit on length, so that sequences of about 100 bp, 1 kilobase (kb), 100 kb, 1 megabase (Mb), and 50 Mb or more may be sequenced, up to and including complete chromosomes, such as human chromosomes, which are about 100 Mb in length. Such a large number is well within the scope of the present invention and sequencing this number of bases will require two sets of 8-mers or 9-mers (so that F+P≈16-18). The nucleic acids to be sequenced may be DNA, such as cDNA, genomic DNA, microdissected chromosome bands, cosmid DNA or YAC inserts, or may be RNA, including mRNA, rRNA, tRNA or snRNA.
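The stepwise extension described in paragraph [0022] can be sketched in Python. This is a minimal illustration under stated assumptions, not the inventors' algorithm: the function name `extend_sequence`, the representation of positive hybridizations as a set of (fixed probe, labelled probe) pairs, and the third pair in the example data are all hypothetical. The sketch stops as soon as the next base is ambiguous or no unused positive pair fits, and it makes no attempt to classify false positives.

```python
BASES = "ATCG"  # candidate next nucleotides, in the order used in [0022]


def extend_sequence(seed, positives, fixed_len=6, labelled_len=6):
    """Extend `seed` one base at a time using positive probe pairs.

    `positives` is a set of (fixed_probe, labelled_probe) tuples, each
    standing for a labelled probe detected at the spot of a fixed probe
    (an assumed data layout). `seed` must be one such pair joined together.
    """
    window_len = fixed_len + labelled_len
    if len(seed) != window_len:
        raise ValueError("seed must be exactly fixed_len + labelled_len bases long")

    sequence = seed
    used = {(seed[:fixed_len], seed[fixed_len:])}
    while True:
        # The last (F+P-1) bases plus one candidate base form the next window.
        tail = sequence[-(window_len - 1):]
        candidates = []
        for base in BASES:
            window = tail + base
            pair = (window[:fixed_len], window[fixed_len:])
            if pair in positives and pair not in used:
                candidates.append((base, pair))
        # Stop when no unused positive pair fits or the next base is ambiguous.
        if len(candidates) != 1:
            return sequence
        base, pair = candidates[0]
        sequence += base
        used.add(pair)


if __name__ == "__main__":
    positives = {
        ("AAAAAA", "TTTTTT"),  # defines the 12-mer seed (SEQ ID NO:1)
        ("AAAAAT", "TTTTTC"),  # extends the seed by C (SEQ ID NO:2)
        ("AAAATT", "TTTTCG"),  # hypothetical third positive, not from the text
    }
    print(extend_sequence("AAAAAATTTTTT", positives))
```

Each positive pair is consumed at most once, so the loop always terminates. A fuller treatment would also have to handle branching at repeated sequences, which this sketch simply treats as a stopping point.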