Apparatus and method for conducting searches with a search engine for unstructured data to retrieve records enriched with structured data and generate reports based thereon
05/01/08 | #20080104542 | USPTO Class 715

USPTO Application #: 20080104542
Title: Apparatus and method for conducting searches with a search engine for unstructured data to retrieve records enriched with structured data and generate reports based thereon
Abstract: Records in databases or unstructured files are enriched with metadata and are indexed for retrieval by a search engine. In response to a search request, a graphical user interface (GUI) control based on the metadata associated with the search hits is constructed and displayed with the search results in a standard view. Selection of a metadata value via the GUI control filters the previously matched records down to those matching the selected value. The metadata in the search results is arranged in a tabular view which is embedded in the display of search results and rendered invisible until selected by the user. Reports can be constructed from an identifier of each returned record set for presenting, analyzing and modifying the data, and for generating further reports.
Agent: Levine & Mandelbaum - New York, NY, US
Inventors: Gerald D. Cohen, Radoslav P. Kotorov, Vincent Lam, Peter Lenahan
USPTO Application #: 20080104542 - Class: 715810 (USPTO)

The Patent Description & Claims data below is from USPTO Patent Application 20080104542.

BACKGROUND OF THE INVENTION

[0001]The present invention provides a method for searching an index of structured and unstructured incoming data received from remote locations on a wide area network or global network, e.g. the Internet or an enterprise intranet. More specifically, the invention provides for capturing and enriching data records with metadata or appended data, and accessing the data through the use of a search engine designed for searching unstructured or free-form data.

[0002]Such search engines are in common use. Examples presented herein have been specifically tested for use with the familiar Google® search appliance and Internet search engine. However, the teachings herein are adaptable for use with other search appliances and engines, e.g., the open source Lucene search engine licensed by the Apache Software Foundation, that are useful in searching records on the Internet and on private intranets, which business enterprises often configure to enable access to data from diverse locations.

[0003]Referring to FIG. 1 of the drawings, Internet search engines, such as Google®'s, typically index information that is `unstructured`. Each record has information of a different type than the next record. Such search engines can also index the fields in structured databases but treat the data similarly to unstructured text.

[0004]To conduct a search, users type a word or a set of keywords and then submit their natural language query to the search engine. The search engine returns a set of results, known as "hits", each of which contains:

[0005]a uniform resource locator (URL) for the source document, which can be either unstructured data (text or word processor document) or structured data (database record);

[0006]a snippet, i.e., the description of the search result, and a link, for example, to the cached source in the index.

[0007]Some search engines return all records found in a search of an index, but others, e.g., Google®, generally return only a subset of the most relevant results (URLs), usually up to 1000 hits. For example, even though a search query may have one million hits, the engine will return only the 1000 most relevant search results. If a user needs additional results, i.e., if the user was not able to find what was sought within the first 1000 results, the user would have to refine the search words by adding or replacing words and submit them as a new query. This limitation is pragmatic since the expectation is that if a user does not find the results within the top most relevant hits, it will be more efficient to refine the query than to page through all one million hits.

[0008]Search engines typically display approximately 10 search results per page. Usability studies indicate that the majority of the users, especially enterprise users, expect to find what they are looking for within the first 3 pages (30 search results). If they do not find it, they resubmit the query. This process is inefficient, because:

[0009]The user has no other way to gain insight about what may be in the search results except by reading the snippets of all of the results. Because snippets are generated by algorithms, they are sometimes unintelligible and can also be misleading.

[0010]There is no guarantee that the new results will be more useful than the old ones, given that the user refines the search without much knowledge about the structure and content of all 1000 previous results.

[0011]Unlike unstructured information, structured information has the property that the information is all of the same type, and the components of the information can be identified by tags or field names. The information that is structured may be intended for storage in relational databases for example. For each data element that is described by a `fieldname`, there is a `fieldvalue`.

[0012]Structured databases contain uniformly structured records, each of which has the same named categories of information, referred to as fields, and one or more values for each field in the records. That is, records are each composed of fieldname-value pairs, sometimes herein referred to as tag-value pairs, name-value pairs, or FIELD_Name, Field_Value pairs, such as those shown in Table 1 below.

TABLE 1

Fieldname            Value
ACCIDENTDATE         090106
TYPE_OF_ACCIDENT     auto crash
COUNTY               HUDSON
INJURED              1
NAME_OF_INJURED01    JOHN SMITH
HOSPITAL             Hackensack General
ADMITTING_DOCTOR     ROBERT JONES

[0013]Users of a search engine find information by entering a search term, which is usually matched against one or more data values. For example, if a user enters the search term "Smith", among the "hits" (search engine answer set) would be the sample record shown in Table 1 above.

[0014]However, the sample record of Table 1 would be included in the hits no matter which field had the value "Smith". That is, "Smith" could be the value of the field ADMITTING_DOCTOR, or of the field NAME_OF_INJURED01, or of the field COUNTY. Hence a search for hospital records with a patient's name of "Smith" would find records where the patient's name was "Jones" if the doctor's name was "Smith". Or a search for hospital records with a patient's name of "Smith" would find records where the patient's name was "Adams" if the patient was in an automobile accident in Smith County.
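
The field-agnostic matching described above can be illustrated with a minimal Python sketch. Only the first record is taken from Table 1 (abbreviated to three fields); the other two records and the function names `free_text_search` and `field_search` are hypothetical and are introduced here purely for illustration.

```python
# Hypothetical records; only the first is based on Table 1.
RECORDS = [
    {"NAME_OF_INJURED01": "JOHN SMITH", "ADMITTING_DOCTOR": "ROBERT JONES", "COUNTY": "HUDSON"},
    {"NAME_OF_INJURED01": "MARY JONES", "ADMITTING_DOCTOR": "ALAN SMITH", "COUNTY": "BERGEN"},
    {"NAME_OF_INJURED01": "PAUL ADAMS", "ADMITTING_DOCTOR": "ROBERT JONES", "COUNTY": "SMITH"},
]


def free_text_search(records, term):
    """Match the term against every value, ignoring field names,
    as an engine for unstructured data would."""
    term = term.lower()
    return [r for r in records if any(term in str(v).lower() for v in r.values())]


def field_search(records, field, term):
    """Match the term only against the value of one named field."""
    term = term.lower()
    return [r for r in records if term in str(r.get(field, "")).lower()]


all_hits = free_text_search(RECORDS, "Smith")
patient_hits = field_search(RECORDS, "NAME_OF_INJURED01", "Smith")
```

Under these assumptions the free-text search returns all three records, including the ones where "Smith" is the doctor or the county, while the field-restricted search returns only the record whose patient is named Smith.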

[0015]Even though the number of records having information of interest to a searcher might be very small, the number of hits could occupy many pages, most containing irrelevant information, making it very difficult for the searcher to find what was wanted. Some filtering may, therefore, be appropriate.

[0016]Search results are usually displayed in a static form, giving users almost no ability to analyze or perform any manipulation of the returned results within the search results page. At most, users can sort the results by relevance or by date, and they can do this only when they are connected to the server. If they are offline, they lose even the ability to sort by relevance or date, hence storing search results has little usefulness. These limitations severely constrain the ability of users to efficiently analyze and manipulate search results to make faster and more informed decisions.

[0017]While this limitation may not be as obvious when searching completely unstructured data, such as word processing documents, it becomes quickly apparent when users search structured data sources.

[0018]An example of such an application would be the search of retail or inventory databases. In both cases the search engine may return hundreds of records within different categories and different price ranges. A mere sequential listing of these records is not very useful. A tabular view would be more appropriate.

[0019]Users want to manipulate tabular data as well as transform it in order to make informed decisions. A dynamic tabular view offers the user the ability to sort the data by any of the available categories, such as gender, product category, sub-category, price range, price, color, etc.

[0020]In a dynamic table a user can quickly find not only the minimum price, but also the minimum price within each category. A user can also pivot the data, i.e., display product prices by brand and category in order to compare and contrast. An inventory manager can sum the quantities directly in the search results, instead of having to go to other applications to perform this task. The prior art offers no search tool having analytic capabilities and a facility for data transformation within the search results. Prior art search systems fail to make analysis, manipulation and storing of search results meaningful.
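
The operations named above (sorting by any column, finding the minimum price within each category, pivoting prices by brand and category, and summing quantities) can be sketched in plain Python. The sample rows, column names and function names below are all hypothetical; the sketch only illustrates the kind of manipulation meant and is not a description of how the invention implements it.

```python
from collections import defaultdict

# Hypothetical search hits from a retail database.
HITS = [
    {"category": "shoes", "brand": "Acme", "price": 40.0, "quantity": 3},
    {"category": "shoes", "brand": "Bolt", "price": 55.0, "quantity": 2},
    {"category": "shirts", "brand": "Acme", "price": 20.0, "quantity": 10},
    {"category": "shirts", "brand": "Bolt", "price": 25.0, "quantity": 4},
]


def sort_by(rows, column, descending=False):
    """Sort the rows by any available column."""
    return sorted(rows, key=lambda r: r[column], reverse=descending)


def min_price_by_category(rows):
    """Return the lowest price found within each category."""
    lowest = {}
    for r in rows:
        cat = r["category"]
        if cat not in lowest or r["price"] < lowest[cat]:
            lowest[cat] = r["price"]
    return lowest


def pivot_price(rows):
    """Pivot the rows into brand -> category -> price."""
    table = defaultdict(dict)
    for r in rows:
        table[r["brand"]][r["category"]] = r["price"]
    return {brand: dict(cols) for brand, cols in table.items()}


def total_quantity(rows):
    """Sum the quantity column across all rows."""
    return sum(r["quantity"] for r in rows)


cheapest_first = sort_by(HITS, "price")
lowest = min_price_by_category(HITS)
by_brand = pivot_price(HITS)
total = total_quantity(HITS)
```

With a static result list, each of these answers would require exporting the hits to another application.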

SUMMARY OF THE INVENTION

[0021]The present invention overcomes the aforementioned problems of prior art search engines by providing a method for zeroing in on the hits returned by a search that are most relevant to the user. The method of the invention winnows down the number of hits returned by a search request, thereby enabling the user to find only the one or more relevant items from a potentially much larger list of search results obtained from an inquiry to a search engine.

[0022]To structure the data of interest so that the records of information containing subsets of that data can be isolated, the data of interest is indexed by embedding tags corresponding to field names in association with each value for the field name. Thus, at least some items in the result list will have embedded tags that were placed there during the indexing process.
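
As a rough illustration of tag embedding, the following Python sketch writes each field name next to its value in the text handed to the indexer, then uses the recovered tags to filter a result list. The `[[NAME=value]]` tag syntax and the names `embed_tags`, `extract_tags` and `filter_hits` are assumptions made for this example; the passage above does not specify a tag format.

```python
import re

# Hypothetical tag syntax: [[FIELDNAME=value]]
TAG_PATTERN = re.compile(r"\[\[(\w+)=(.*?)\]\]")


def embed_tags(record):
    """Render a record as indexable text with a tag around each value."""
    return " ".join(f"[[{name}={value}]]" for name, value in record.items())


def extract_tags(text):
    """Recover the fieldname-value pairs embedded in indexed text."""
    return dict(TAG_PATTERN.findall(text))


def filter_hits(hits, field, value):
    """Keep only the hits whose embedded tag for `field` equals `value`."""
    return [h for h in hits if extract_tags(h).get(field) == value]


record = {"COUNTY": "HUDSON", "HOSPITAL": "Hackensack General"}
indexed_text = embed_tags(record)
other_text = embed_tags({"COUNTY": "BERGEN", "HOSPITAL": "Hackensack General"})
hudson_only = filter_hits([indexed_text, other_text], "COUNTY", "HUDSON")
```

Because the field name travels with the value, a hit can be kept or discarded according to which field matched rather than merely whether some value matched.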

[0023]As part of the indexing process, and prior to the indexing itself, a database record or a transaction that is being entered in the database is enriched with metadata. The metadata comes from the database as a FIELD_Name, Field_Value pair. The "FIELD_Name" is the name of the database field, and the "Field_Value" is the corresponding value for the particular record being passed through the process flow.
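
One way to picture the enrichment step is the Python sketch below, which reads a row from a database and pairs each field name with its value before anything is indexed. It uses the standard `sqlite3` module with an in-memory database purely for illustration; the table name `accidents`, the helper `enrich` and the choice of SQLite are assumptions, not details taken from the application.

```python
import sqlite3


def enrich(cursor, row):
    """Pair each database field name with the row's value for that field."""
    field_names = [column[0] for column in cursor.description]
    return list(zip(field_names, row))


# Hypothetical in-memory database holding part of the Table 1 record.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accidents (COUNTY TEXT, HOSPITAL TEXT, ADMITTING_DOCTOR TEXT)"
)
conn.execute(
    "INSERT INTO accidents VALUES (?, ?, ?)",
    ("HUDSON", "Hackensack General", "ROBERT JONES"),
)

cursor = conn.execute("SELECT * FROM accidents")
row = cursor.fetchone()
metadata = enrich(cursor, row)  # list of (FIELD_Name, Field_Value) pairs
conn.close()
```

The resulting pairs are what would accompany the record into the indexing step.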
