11/29/07 | #20070276818 | USPTO Class 707

Adapting a search classifier based on user queries

USPTO Application #: 20070276818
Title: Adapting a search classifier based on user queries
Abstract: Multiple different user queries are applied to an automated classifier to identify multiple tasks. For each query, a task is provided to a user. A task selected by the user is logged and a mapping between each query and each selected task is generated. Fewer than all of the mappings are used to train a new classifier, wherein selecting fewer than all of the mappings to train the new classifier comprises selecting mappings based on when the mappings were generated. The new classifier is stored on a computer-readable storage medium.
Agent: Westman Champlin (Microsoft Corporation) - Minneapolis, MN, US
Inventors: Daniel B. Cook, Chad S. Oftedal, Scott E. Seiber, Matthew A. Goldberg
USPTO Application #: 20070276818 - Class: 707003000 (USPTO)
Related Patent Categories: Data Processing: Database And File Management Or Data Structures, Database Or File Accessing, Query Processing (i.e., Searching)
The Patent Description & Claims data below is from USPTO Patent Application 20070276818.

REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional of and claims priority from U.S. patent application Ser. No. 10/310,408, filed on Dec. 5, 2002 and entitled METHOD AND APPARATUS FOR ADAPTING A SEARCH CLASSIFIER BASED ON USER QUERIES.

BACKGROUND OF THE INVENTION

[0002] The present invention relates to text classifiers. In particular, the present invention relates to the classification of user queries.

[0003] In the past, search tools have been developed that classify user queries to identify one or more tasks or topics that the user is interested in. In some systems, this was done with simple keyword matching in which each keyword was assigned to a particular topic. In other systems, more sophisticated classifiers have been used that use the entire query to make a determination of the most likely topic or task that the user may be interested in. Examples of such classifiers include support vector machines that provide a binary classification relative to each of a set of tasks. Thus, for each task, the support vector machine is able to decide whether the query belongs to the task or not.
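The keyword-matching approach described above can be illustrated with a minimal Python sketch. The keyword table, the task names, and the function name keyword_match are hypothetical and chosen only for illustration; the application does not list actual keywords or tasks.

```python
# Hypothetical keyword-to-task table (assumption: not taken from the application).
KEYWORD_TO_TASK = {
    "flight": "book_travel",
    "hotel": "book_travel",
    "stock": "check_finance",
    "weather": "get_forecast",
}


def keyword_match(query):
    """Return the tasks whose keywords appear in the query, in first-seen order."""
    tasks = []
    for word in query.lower().split():
        task = KEYWORD_TO_TASK.get(word)
        # Skip words with no assigned task and avoid listing a task twice.
        if task is not None and task not in tasks:
            tasks.append(task)
    return tasks
```

Because each word is looked up in isolation, a matcher of this kind ignores the rest of the query, which is the limitation that whole-query classifiers are meant to address.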

[0004] Such sophisticated classifiers are trained using a set of queries that have been classified by a librarian. Based on the queries and the classification given by the librarian, the support vector machine generates a hyper-boundary between those queries that match to the task and those queries that do not match to the task. Later, when a query is applied to the support vector machine for a particular task, the distance between the query and the hyper-boundary determines the confidence level with which the support vector machine is able to identify the query as either belonging to the task or not belonging to the task.
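As a rough illustration of how the distance to the hyper-boundary can serve as a confidence measure, the following Python sketch assumes a linear boundary w.x + b = 0 over bag-of-words features. The weights, the bias, and the helper names are hypothetical; a trained support vector machine would learn its own parameters and may use a non-linear kernel.

```python
import math


def bag_of_words(query):
    """Count how often each lowercased word occurs in the query."""
    counts = {}
    for word in query.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def signed_distance(weights, bias, features):
    """Signed distance from a feature vector to the hyperplane w.x + b = 0.

    The sign says which side of the boundary the query falls on; the
    magnitude can be read as a confidence level.
    """
    score = bias + sum(weights.get(term, 0.0) * value
                       for term, value in features.items())
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return score / norm


# Hypothetical parameters for a single "book travel" task (assumption).
TRAVEL_WEIGHTS = {"flight": 3.0, "hotel": 4.0}
TRAVEL_BIAS = -1.0

inside = signed_distance(TRAVEL_WEIGHTS, TRAVEL_BIAS, bag_of_words("flight hotel"))
outside = signed_distance(TRAVEL_WEIGHTS, TRAVEL_BIAS, bag_of_words("weather"))
```

In this sketch a positive value places the query on the side of the boundary that belongs to the task, and a value near zero indicates a low-confidence decision either way.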

[0005] Although the training data provided by the librarian is essential to initially training the support vector machine, such training data limits the performance of the support vector machine over time. In particular, training data that includes current-events queries becomes dated over time and results in unwanted topics or tasks being returned to the user. Although additional librarian-created training data can be added over time to keep the support vector machines current, such maintenance of the support vector machines is time consuming and expensive. As such, a system is needed for updating search classifiers that requires less human intervention, while still maintaining a high standard of precision and recall.

SUMMARY OF THE INVENTION

[0006] Multiple different user queries are applied to an automated classifier to identify multiple tasks. For each query, a task is provided to a user. A task selected by the user is logged and a mapping between each query and each selected task is generated. Fewer than all of the mappings are used to train a new classifier, wherein selecting fewer than all of the mappings to train the new classifier comprises selecting mappings based on when the mappings were generated. The new classifier is stored on a computer-readable storage medium.
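The logging and time-based selection summarized above might be sketched in Python as follows. The Mapping record, the task names, the dates, and the word-count trainer are all hypothetical; in particular, the trainer is a simple stand-in and not the support vector machine training the application describes.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Mapping:
    """One logged pair: the query a user typed and the task the user selected."""
    query: str
    task: str
    logged_at: datetime


def select_by_time(mappings, start, end):
    """Keep only mappings generated in the half-open interval [start, end)."""
    return [m for m in mappings if start <= m.logged_at < end]


def train_word_counts(mappings):
    """Stand-in trainer: count how often each word co-occurs with each task."""
    model = defaultdict(Counter)
    for m in mappings:
        for word in m.query.lower().split():
            model[word][m.task] += 1
    return model


def classify(model, query):
    """Return the task with the most word votes, or None if no word is known."""
    votes = Counter()
    for word in query.lower().split():
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical log entries (assumption: invented for illustration).
log = [
    Mapping("world cup scores", "sports_2002", datetime(2002, 6, 1)),
    Mapping("world cup scores", "sports_2006", datetime(2006, 6, 10)),
    Mapping("world cup tickets", "sports_2006", datetime(2006, 6, 12)),
]
recent = select_by_time(log, datetime(2006, 1, 1), datetime(2007, 1, 1))
model = train_word_counts(recent)
```

Training only on the recent subset keeps the dated mapping from influencing the new model, so a query such as "world cup" resolves to the more recently selected task.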

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 is a block diagram of a computing device on which a user may enter a query under the present invention.

[0008] FIG. 2 is a block diagram of a client-server architecture under one embodiment of the present invention.

[0009] FIG. 3 is a flow diagram of a method of logging search queries and selected tasks under embodiments of the present invention.

[0010] FIG. 4 is a display showing a list of tasks provided to the user in response to their query.

[0011] FIG. 5 is a flow diagram of a system for training a classifier using logged search queries under embodiments of the present invention.

[0012] FIG. 6 is a display showing an interface for designating the training data to be used in building a classifier under one embodiment of the present invention.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0013] The present invention may be practiced within a single computing device or in a client-server architecture in which the client and server communicate through a network. FIG. 1 provides a block diagram of a single computing device on which the present invention may be practiced or which may be operated as the client in a client-server architecture.

[0014] The computing system environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 100.

[0015] The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, telephony systems, distributed computing environments that include any of the above systems or devices, and the like.

[0016] The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

[0017] With reference to FIG. 1, an exemplary system for implementing the invention includes a general-purpose computing device in the form of a computer 110. Components of computer 110 may include, but are not limited to, a processing unit 120, a system memory 130, and a system bus 121 that couples various system components including the system memory to the processing unit 120. The system bus 121 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.

[0018] Computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 110 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

[0019] The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation, FIG. 1 illustrates operating system 134, application programs 135, other program modules 136, and program data 137.
