07/17/08 | #20080172351 | USPTO Class 706

Identifying associations using graphical models

USPTO Application #: 20080172351
Title: Identifying associations using graphical models
Abstract: Computer-executable instructions for identifying associations are described herein. By way of example, a method for facilitating developing a treatment can include employing computer-executable instructions stored on one or more computer-readable media to determine correlations and utilizing at least some of the determined correlations to develop a treatment. (end of abstract)
Agent: Amin, Turocy & Calvin, LLP - Cleveland, OH, US
Inventors: David E. Heckerman, Jonathan M. Carlson, Carl M. Kadie
USPTO Application #: 20080172351 - Class: 706 46 (USPTO)

The Patent Description & Claims data below is from USPTO Patent Application 20080172351.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of co-pending U.S. patent application Ser. No. 11/622,895 filed on Jan. 12, 2007, and entitled “IDENTIFYING ASSOCIATIONS USING GRAPHICAL MODELS,” the entirety of which is incorporated herein by reference.

BACKGROUND

The search for correlations in many types of data, such as biological data, can be difficult if the data are not exchangeable or independent and identically distributed (IID). For example, a set of DNA or amino acid sequences is rarely exchangeable because the sequences are derived from a phylogeny (e.g., an evolutionary tree). In other words, some sequences are very similar to each other but not to others due to their position in the evolutionary tree. This phylogenetic structure can confound the statistical identification of associations. For instance, although a number of candidate disease genes have been identified by genome-wide association (GWA) studies, the inability to reproduce these results in other studies is likely due in part to confounding by phylogeny. Other areas in which phylogeny may confound the statistical identification of associations include the identification of coevolving residues in proteins given a multiple sequence alignment and the identification of Human Leukocyte Antigen (HLA) alleles that mediate escape mutations of the Human Immunodeficiency Virus (HIV).
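To make the confounding concrete, the following minimal sketch (an illustration, not part of the application) builds a hypothetical two-clade data set in which X and Y each changed once on the branch leading to one clade. A test that treats the leaves as IID, here a one-sided Fisher exact test, reports a very small p-value even though the data reflect a single shared evolutionary event. The function name, the clade sizes and the data are assumptions chosen for illustration.

```python
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(top-left count >= a) under the hypergeometric null
    with all margins held fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / total

# Hypothetical leaves as (x, y) pairs: clade A inherited X=1, Y=1 from one
# ancestral change; clade B kept the ancestral state X=0, Y=0.
leaves = [(1, 1)] * 8 + [(0, 0)] * 8

# Naive 2x2 contingency table that counts every leaf as independent evidence.
a = sum(1 for x, y in leaves if x == 1 and y == 1)
b = sum(1 for x, y in leaves if x == 1 and y == 0)
c = sum(1 for x, y in leaves if x == 0 and y == 1)
d = sum(1 for x, y in leaves if x == 0 and y == 0)

p_value = fisher_exact_one_sided(a, b, c, d)
print(f"naive p-value from 16 leaves: {p_value:.2e}")
```

Sixteen leaves are counted as sixteen observations, yet only one change per variable occurred on the tree, so the apparent strength of the association is an artifact of shared ancestry.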

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

The subject matter described herein facilitates the identification of associated or correlated variables using graphical models to remove and even leverage the non-exchangeability of data. By capturing this structure, these models yield well-calibrated false discovery rates and increase discriminatory power over standard methods that assume independence. The subject matter has many applications including but not limited to vaccine design for diseases such as Human Immunodeficiency Virus (HIV) infections, Acquired Immunodeficiency Syndrome (AIDS), Hepatitis C Virus (HCV) infections and malaria infections, as well as the development of treatments for diseases/conditions based on the results of genotype-phenotype association studies in biology and medicine and/or through the elucidation of protein structure.
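As a rough illustration of how a false discovery rate can be attached to a list of association tests, the sketch below computes Benjamini-Hochberg adjusted p-values, which act as conservative q-value estimates. This section does not say how the q-values are actually obtained, so the procedure, the function name `bh_q_values` and the example p-values are all assumptions.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the tests sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values from four association tests.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = bh_q_values(example_p)
print(example_q)
```

Reporting every test whose q-value is at most 0.05 would then target an expected false discovery proportion of about 5%, provided the underlying p-values are valid.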

By way of example, generative models that account for phylogenetic structure can be employed to improve the identification of associations. The phylogenetic structure of the data can be provided or learned simultaneously with the statistical models. To determine whether an association exists between target variable(s) and one or more predictor variables, two generative models can be created: a null model and a non-null model. The null model represents the null hypothesis that the data are accounted for by the phylogenetic tree alone, and the non-null model represents the alternative hypothesis that the one or more predictor variables influence the target variable. Frequentist, Bayesian and cross-validation techniques can then be used to determine how much better the non-null model explains the observed data than the null model does, in order to assess the strength of association between the target variable and the one or more predictor variables. In the case of multiple target variables, the process described above can be repeated for each of the target variables. Optionally or alternatively, the predictor variables can be restricted for each of the multiple target variables in such a way that the resulting network of dependencies among predictor and target variables is a directed acyclic graph representing the relationships among the multiple variables.
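One frequentist way to compare the two models is a likelihood ratio test. The sketch below assumes that the null model is nested in the non-null model, that the non-null model has exactly one additional free parameter, and that the usual asymptotic chi-square approximation is acceptable; none of these choices is specified in this section, and the function name and log-likelihood values are hypothetical.

```python
from math import erfc, sqrt

def likelihood_ratio_test(loglik_null, loglik_alt):
    """Return (statistic, p_value) for nested models differing by one parameter."""
    # Twice the log-likelihood gain of the non-null model, floored at zero.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods of the two models.
stat, p_value = likelihood_ratio_test(loglik_null=-12.0, loglik_alt=-10.0)
print(f"statistic = {stat:.1f}, p-value = {p_value:.4f}")
```

Repeating the test for every predictor and target pair yields one p-value per candidate association. Bayesian or cross-validation scores, which the text also allows, would replace this function with a different comparison of the same two models.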

The non-null models include a conditional model and a directed joint model. The conditional model is based on the assumption that the target variable evolves according to a phylogenetic tree having a topology and a branch length and is influenced by the one or more predictor variables at the tips of the tree. The directed joint model is based on the assumption that the target and one or more predictor variables coevolve, but that the influence between the variables is asymmetric (e.g., the predictor variable(s) influence the target variable, but not vice versa). Other evolutionary processes are possible and are within the scope of the subject matter described herein.
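The conditional model can be sketched for binary variables as follows. A hidden variable Z evolves along the tree as a two-state Markov process, and the observed Y at each tip is derived from Z and X. The leaf rule used here (when X = 1, a hidden 0 is observed as 1 with probability `s`), the tree encoding, the fixed parameters `pi1` and `rate`, and the grid search are all illustrative assumptions, not the parameterization of the application. Setting `s = 0` recovers the null model.

```python
from math import exp, log

def transition(pi1, rate, t):
    """Two-state transition matrix P[i][j] for a branch of length t."""
    pi = (1.0 - pi1, pi1)  # stationary distribution of the process
    decay = exp(-rate * t)
    return [[pi[j] + ((1.0 if i == j else 0.0) - pi[j]) * decay
             for j in (0, 1)] for i in (0, 1)]

def leaf_likelihood(x, y, s):
    """[P(Y=y | Z=0, X=x), P(Y=y | Z=1, X=x)] under the assumed leaf rule."""
    flip = s if x == 1 else 0.0  # X only acts at the tips, and only when X=1
    return [flip, 1.0] if y == 1 else [1.0 - flip, 0.0]

def partial(node, pi1, rate, s):
    """Felsenstein pruning: likelihood of the data below node, per hidden state."""
    if node[0] == "leaf":
        _, x, y = node
        return leaf_likelihood(x, y, s)
    result = [1.0, 1.0]
    for child, length in node[1]:
        p = transition(pi1, rate, length)
        below = partial(child, pi1, rate, s)
        for i in (0, 1):
            result[i] *= p[i][0] * below[0] + p[i][1] * below[1]
    return result

def log_likelihood(tree, pi1, rate, s):
    """Log-likelihood of all tip observations, root drawn from stationarity."""
    root = partial(tree, pi1, rate, s)
    return log((1.0 - pi1) * root[0] + pi1 * root[1])

def fit_conditional(tree, pi1=0.5, rate=1.0):
    """Coarse grid search over s; returns (best log-likelihood, best s)."""
    grid = [k / 20 for k in range(20)]  # s = 0.00, 0.05, ..., 0.95
    return max((log_likelihood(tree, pi1, rate, s), s) for s in grid)

# Hypothetical tree: ("leaf", x, y) or ("node", [(child, branch_length), ...]).
# In each pair of close relatives, the tip with X=1 shows Y=1 and its sibling does not.
tree = ("node", [
    (("node", [(("leaf", 1, 1), 0.1), (("leaf", 0, 0), 0.1)]), 0.5),
    (("node", [(("leaf", 1, 1), 0.1), (("leaf", 0, 0), 0.1)]), 0.5),
])

ll_null = log_likelihood(tree, 0.5, 1.0, 0.0)
ll_alt, best_s = fit_conditional(tree)
print(f"null: {ll_null:.3f}  conditional: {ll_alt:.3f}  (s = {best_s})")
```

In this toy data set the null model must explain two independent changes on short branches, whereas the conditional model explains both with the influence of X, so its log-likelihood is higher. A fuller fit would also optimize the tree parameters under each model instead of holding them fixed.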

Although the examples described below are focused on the correlation of discrete (specifically binary) variables, the models can be generalized to multistate and continuous variables as well as to multiple predictor and target variables, thus producing a directed network (acyclic or otherwise) of relationships among multiple variables. The applications of multiple predictor variable models include but are not limited to learning the combined effects of drug and immune pressure on HIV evolution, identifying chains of compensatory mutations, learning the influence of diploid genes on phenotype and learning networks of interacting genes and proteins.

The following description and the annexed drawings set forth in detail certain illustrative aspects of the subject matter. These aspects are indicative, however, of but a few of the various ways in which the subject matter can be employed and the claimed subject matter is intended to include all such aspects and their equivalents.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 schematically illustrates the (a) overcounting and (b) undercounting of evidence for an association between X and Y.

FIG. 2 schematically illustrates two generative (graphical) models: (a) the single-variable model for Y and (b) the conditional model for Y given X. The variable Zi represents the variable Yi had there been no influence from Xi. Observed variables are shaded.

FIG. 3 shows Receiver Operating Characteristic (ROC) curves for synthetic coevolution data.

FIG. 4 shows the calibration of q-values on synthetic coevolution data.

FIG. 5 shows ROC curves for artificial conditional influence data.

FIG. 6 shows the calibration of q-values on synthetic conditional influence data.




Patent Applications in related categories:

20080195567 - Information mining using domain specific conceptual structures - A method and analytics tools for information mining incorporating domain specific knowledge and conceptual structures are disclosed, the method including: providing a first set of documents related to a first topic of interest; using a first taxonomy to categorize the first set of documents into a set of categories; providing ...


Industry Class:
Data processing: artificial intelligence
