04/03/08 | #20080082273 | USPTO Class 702

Computer algorithm for automatic allele determination from fluorometer genotyping device

USPTO Application #: 20080082273
Title: Computer algorithm for automatic allele determination from fluorometer genotyping device
Abstract: Methods and systems are provided for the automated identification of allele values from data files derived from processed fluorophore emissions detected during the observation of fluorophore-labeled nucleotide probes used in analyzing polymorphic DNA. These methods are used to distinguish targeted polymorphic DNA sites rapidly and efficiently without control samples.
(end of abstract)
Agent: Kilyk & Bowersox, P.L.L.C. - Fairfax, VA, US
Inventors: Stephen Glanowski, Jeremy Heil, Emily S. Winn-Deen, Ivy A. McMullen
USPTO Application #: 20080082273 - Class: 702020000 (USPTO)
Related Patent Categories: Data Processing: Measuring, Calibrating, Or Testing, Measurement System In A Specific Environment, Biological Or Biochemical, Gene Sequence Determination
The Patent Description & Claims data below is from USPTO Patent Application 20080082273.

[0001] This application is a continuation application of U.S. patent application Ser. No. 10/085,142, filed Mar. 1, 2002, which is incorporated herein in its entirety by reference.

FIELD OF THE INVENTION

[0002] The invention relates generally to the field of DNA genotypic analysis. More particularly, the invention relates to the allelic classification of DNA samples through cluster analysis of analyzed emission spectra observed from excited fluorophore-labeled nucleotide probes. Specifically, fluorophore-labeled nucleotide probes can be used to verify DNA variations between individual samples and to verify the expression of a region of DNA in different cell lines.

BACKGROUND OF THE INVENTION

[0003] Individual DNA sequence variations are known to directly cause specific diseases or conditions, or to predispose certain individuals to specific diseases or conditions. Such variations also modulate the severity or progression of many diseases. Additionally, DNA sequence variations exist between populations. Therefore, determining DNA sequence variations is useful for making accurate diagnoses, for finding suitable therapies, and for understanding the relationship between genome variations and environmental factors in the pathogenesis of diseases and prevalence of conditions.

[0004] There are several types of DNA sequence variations. These variations include insertions, deletions, restriction fragment length polymorphisms ("RFLPs"), short tandem repeat polymorphisms ("STRPs"), and single nucleotide polymorphisms ("SNPs"). Of these, SNPs are considered the most useful in studying the relationship between DNA sequence variations and diseases and conditions because they are more common, more stable, and more amenable to being employed in large-scale studies than other sorts of variations.

[0005] Currently, a set of over 3 million putative SNPs has been identified in the human genome. It is a current goal of researchers to verify these putative SNPs and associate them with phenotypes and diseases, eventually replacing currently-used RFLP and STRP linkage analysis screening sets. In order to successfully accomplish this goal, it will be necessary for researchers to generate and analyze large amounts of genotypic data.

[0006] A number of methods have been developed which can locate or identify SNPs. These methods include dideoxy fingerprinting (ddF), fluorescently labeled ddF, denaturation fingerprinting (DnF1R and DnF2R), single-stranded conformation polymorphism analysis, denaturing gradient gel electrophoresis, heteroduplex analysis, RNase cleavage, chemical cleavage, hybridization sequencing using arrays, and direct DNA sequencing.

[0007] One method of particular relevance to the present invention employs a pair of fluorescent probes, each probe containing a different dye and specific for a different allele. In this method, the two probes are added to the DNA sample to be tested, and the mixture is amplified using PCR. If the DNA sample is homozygous for the first allele, the first probe's dye will exhibit a high degree of fluorescence and the fluorescence from the second probe's dye will be absent. Conversely, if the DNA sample is homozygous for the second allele, the second probe's dye will exhibit a high degree of fluorescence and the fluorescence from the first probe's dye will be absent. If the DNA sample is heterozygous for both alleles, then both probes should fluoresce equally. A commercial implementation of this method is APPLIED BIOSYSTEMS' TAQMAN platform, which employs APPLIED BIOSYSTEMS' PRISM 7700 and 7900HT SEQUENCE DETECTION SYSTEMS to record the fluorescence of each sample's PCR product.

[0008] A typical implementation generates amplification products from a large number of samples at a time, and measures a pair of fluorescence values, one for each dye, from each amplified sample. To classify the samples, it is useful to first plot the fluorescence values of the entire set on a two dimensional graph, and observe that the plotted points tend to cluster into separate groups according to genotype, as illustrated in FIG. 1. In this figure, a human observer can readily discern that the data fall into four groups. The first group, in the lower left-hand corner, represents samples that had no amplification or were a no template control ("NTC") reaction. The second group, in the lower right-hand corner, represents those samples homozygous for Allele 2. The third group, at the top, represents those samples homozygous for Allele 1. Finally, the fourth group, located between the second and third groups, represents the heterozygous samples. This classification is illustrated further in FIG. 2. Although it is relatively easy for a human observer to analyze this type of data, it is necessary to develop a fast, reliable, and unsupervised method of computational analysis to produce the level of throughput necessary to analyze the large amounts of genotypic data generated.

[0009] Previous methods of computational analysis have employed a family of algorithms known as clustering algorithms. A typical clustering algorithm receives raw unstructured data and processes it to form groups of data elements that are similar to each other. Clustering algorithms are well known in the field of computer science, and are typically applied in data mining applications. In a data mining application, clustering is used to identify relationships in data collections not readily observable to an expert user due to the volume of information.

[0010] A typical clustering algorithm examines the distance between data elements to find a common centroid. The centroid is the mean of the values of the data elements belonging to a cluster. Clusters are selected by the algorithm to minimize the distance between the elements contained within each cluster relative to the elements contained in other clusters. Clustering algorithms belong to the greater class of unsupervised machine learning algorithms. Supervised machine learning algorithms, including decision trees and neural networks, were also considered for application to analyzing output from a fluorometric genotyping device. However, all machine learning algorithms considered were determined to be insufficient to analyze this type of data accurately. A thorough review of an initial collection of 80 human-reviewed outputs revealed characteristics of the data that would not allow standard machine learning algorithms to work with a high degree of accuracy.
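
As a small illustration of the centroid concept described above, the following Python sketch computes the centroid of a cluster and assigns a new point to the nearest centroid. It is not taken from the application; the function names centroid and nearest_centroid and the sample coordinates are hypothetical, and it shows only the generic building blocks of a centroid-based clustering algorithm, not the method of the invention.

```python
import math

def centroid(points):
    """Return the mean of each coordinate over a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

# Hypothetical example data: two well-separated clusters.
cluster_a = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
cluster_b = [(9.0, 9.0), (11.0, 11.0)]
centroids = [centroid(cluster_a), centroid(cluster_b)]

# A new point is assigned to whichever centroid lies closest to it.
assigned = nearest_centroid((4.0, 4.0), centroids)
```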

[0011] It is an object of this invention to provide a fast, accurate, and unsupervised method of classifying genotypic samples based on fluorometric data generated from them.

SUMMARY OF THE INVENTION

[0012] In one aspect, the invention relates to a method for categorizing the members of a dataset into discrete categories. In this aspect, the dataset has a plurality of datapoints, and each datapoint has at least two numerical values associated with it. In this aspect, the method has the following steps: assigning each datapoint an angular value based on that datapoint's numerical values; sorting the dataset by angular value; calculating the differences between adjacent angular values in the sorted dataset; determining category-dividing values by identifying differences that are larger than a predetermined threshold; and classifying datapoints according to their angular values relative to the category-dividing values.
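
The steps of this aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function name classify_by_angle, the use of math.atan2 to obtain the angular value, the choice of the midpoint of a large gap as the category-dividing value, and the threshold of 0.3 radians are all assumptions made for the example.

```python
import math

def classify_by_angle(points, gap_threshold):
    """Label each (x, y) datapoint with a category index 0, 1, 2, ...

    gap_threshold stands in for the "predetermined threshold" (radians);
    its value here is an assumption, not a figure from the source.
    """
    # Step 1: assign each datapoint an angular value.
    angles = [math.atan2(y, x) for x, y in points]
    # Step 2: sort the dataset by angular value (tracking original indices).
    order = sorted(range(len(points)), key=lambda i: angles[i])
    # Steps 3 and 4: compute differences between adjacent sorted angles and
    # place a dividing value wherever a difference exceeds the threshold.
    # Using the midpoint of the gap is an assumption of this sketch.
    dividers = []
    for a, b in zip(order, order[1:]):
        if angles[b] - angles[a] > gap_threshold:
            dividers.append((angles[a] + angles[b]) / 2)
    # Step 5: a datapoint's category is the number of dividers below its angle.
    labels = [sum(angle > d for d in dividers) for angle in angles]
    return labels, dividers

# Hypothetical fluorescence pairs forming three angular groups.
points = [(10, 1), (9, 0.5), (5, 5), (6, 5.5), (1, 10), (0.5, 9)]
labels, dividers = classify_by_angle(points, gap_threshold=0.3)
```

With this sample data the sketch finds two dividing values and three categories. If the horizontal axis carried the Allele 2 dye, as in the layout described for FIG. 1, the lowest-angle category would correspond to samples homozygous for Allele 2 and the highest to samples homozygous for Allele 1.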

[0013] In a further aspect, each datapoint has exactly two numerical values, and the angular value is an arctangent of the datapoint's numerical values. In a further aspect the numerical values are normalized before the angular values are calculated.
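
A brief sketch of this aspect follows. The text does not say how the values are normalized, so min-max scaling of each dye's values to the range 0 to 1 is assumed here purely for illustration; the helper names min_max_normalize and angular_values are likewise hypothetical. math.atan2 is used in place of a plain arctangent of the ratio so that a zero horizontal value does not cause a division by zero.

```python
import math

def min_max_normalize(values):
    """Scale values linearly to [0, 1] (an assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def angular_values(xs, ys):
    """Return the angle in radians of each normalized (x, y) datapoint."""
    nx, ny = min_max_normalize(xs), min_max_normalize(ys)
    return [math.atan2(y, x) for x, y in zip(nx, ny)]

# Hypothetical raw readings for two dyes across three samples.
xs = [100.0, 300.0, 500.0]
ys = [40.0, 20.0, 0.0]
angles = angular_values(xs, ys)
```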

[0014] In a further aspect, the numerical values represent fluorometric data, wherein the different numerical values for each datapoint represent the fluorescence of a different dye.

[0015] In a further aspect, the method identifies exactly two category-dividing values and three categories. In a further aspect, these three categories represent homozygosity for a first allele, homozygosity for a second allele, and heterozygosity for both alleles.

[0016] In a further aspect, the fluorometric data is measured from the product of an amplification reaction, and the method includes a step for removing datapoints that represent either a control reaction or a failure to amplify. In this aspect, the datapoints whose Euclidean distance from the origin falls beneath a predetermined threshold are removed from any further classification.
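
The removal step might look like the following Python sketch. It assumes that the Euclidean distance is measured from the origin of the fluorescence plot, which is where the background section places the control and nonamplified samples; the function name split_by_distance and the threshold of 1.0 are hypothetical.

```python
import math

def split_by_distance(points, min_distance):
    """Separate datapoints by Euclidean distance from the origin.

    Points closer than min_distance are treated as control or
    nonamplified reactions and set aside; the rest are kept.
    """
    kept, removed = [], []
    for x, y in points:
        if math.hypot(x, y) < min_distance:
            removed.append((x, y))
        else:
            kept.append((x, y))
    return kept, removed

# Hypothetical normalized readings; two of them sit near the origin.
points = [(0.1, 0.2), (3.0, 4.0), (0.0, 0.0), (1.0, 1.0)]
kept, removed = split_by_distance(points, min_distance=1.0)
```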

[0017] In a further aspect, the results of the classification are examined to determine whether to bring them to the attention of a human user. In this aspect, the results are examined for conditions that indicate that the classification was unsuccessful. Such conditions include excess classification in one category, classification into more than three categories, absence or near absence of any classification in one or more categories, unclassified datapoints, inadequate separation from control or nonamplification reactions, clusters having angular values that are either too high or too low, clusters whose ranges of angular values are too wide, classification that is not compatible with a Hardy-Weinberg equilibrium, and control or nonamplification reactions that are too far from the origin.
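
One of the listed conditions, compatibility with Hardy-Weinberg equilibrium, can be illustrated with a chi-square goodness-of-fit statistic computed from the three genotype counts. The text does not specify which test or cutoff is used, so the statistic, the function name hardy_weinberg_chi_square, and the cutoff of 3.841 (the 5% critical value for one degree of freedom) are assumptions of this sketch.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with Hardy-Weinberg
    expectations derived from the observed allele frequencies."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one classified sample is required")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1.0 - p                       # frequency of the second allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotypes with zero expected count to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

CUTOFF = 3.841  # assumed cutoff: chi-square critical value, 1 df, alpha 0.05

def needs_review(n_aa, n_ab, n_bb):
    """Flag a classification whose counts deviate strongly from equilibrium."""
    return hardy_weinberg_chi_square(n_aa, n_ab, n_bb) > CUTOFF

balanced = hardy_weinberg_chi_square(25, 50, 25)   # counts match expectation
skewed = hardy_weinberg_chi_square(50, 0, 50)      # no heterozygotes at all
```

A result flagged this way would be one candidate for being brought to the attention of a human user; the other conditions listed above would need their own checks.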

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] FIG. 1 provides a two dimensional scatterplot of fluorometric data.

[0019] FIG. 2 provides a two dimensional scatterplot of fluorometric data classified by allele.

[0020] FIG. 3 provides a two dimensional scatterplot of raw fluorometric data.
