Automated processing of appropriateness determination of content for search listings in wide area network searches

Related Patent Categories: Data Processing: Database And File Management Or Data Structures, Database Or File Accessing

The Patent Description & Claims data below is from USPTO Patent Application 20060235824, Automated processing of appropriateness determination of content for search listings in wide area network searches.

RELATED APPLICATIONS

[0001] The present patent document is a continuation of U.S. patent application Ser. No. 10/244,051, filed Sep. 13, 2002, the entirety of which is hereby incorporated by reference.

FIELD OF THE INVENTION

[0002] This invention relates to the field of automated document content analysis, and more specifically to a mechanism for automated determination of the appropriateness of a search listing for inclusion in a wide area network search engine database.

BACKGROUND

[0003] The Internet is a wide area network having a truly global reach, interconnecting computers all over the world. That portion of the Internet generally known as the World Wide Web is a collection of inter-related data whose magnitude is truly staggering. The content of the World Wide Web (sometimes referred to as "the Web") includes, among other things, documents of the known HTML (Hyper-Text Mark-up Language) format which are transported through the Internet according to the known protocol, HTTP (Hyper-Text Transfer Protocol).

[0004] The breadth and depth of the content of the Web is amazing and overwhelming to anyone hoping to find specific information therein. Accordingly, an extremely important component of the Web is a search engine.
As used herein, a search engine is an interactive system for locating content relevant to one or more user-specified search terms, which collectively represent a search query. Through the known Common Gateway Interface (CGI), the Web can include content which is interactive, i.e., which is responsive to data specified by a human user of a computer connected to the Web. A search engine receives a search query of one or more search terms from the user and presents to the user a list of one or more documents which are determined to be relevant to the search query.

[0005] Search engines dramatically improve the efficiency with which users can locate desired information on the Web. As a result, search engines are one of the most commonly used resources of the Internet. An effective search engine can help a user locate very specific information within the billions of documents currently represented within the Web. The critical function and raison d'être of search engines is to identify the few most relevant results among the billions of available documents given a few search terms of a user's query and to do so in as little time as possible. Thus, a critical function of search engines is determination of relevance of documents to a search query.

[0006] Generally, search engines maintain a database of records associating search terms with information resources on the Web. Search engines currently acquire information about the contents of the Web primarily in several common ways. The most common is generally known as crawling the Web and the second is by submission of such information by a provider of such information or by third-parties (i.e., neither a provider of the information nor the provider of the search engine). Another common way for search engines to acquire information about the content of the Web is for human editors to create indices of information based on their review.

[0007] To understand crawling, one must first understand that documents of the Web can include references, commonly referred to as links, to other documents of the Web. Anyone who has "clicked on" a portion of a document to cause display of a referenced document has activated such a link. Crawling the Web generally refers to an automated process by which documents referenced by one document are retrieved and analyzed and documents referred to by those documents are retrieved and analyzed and the retrieval and analysis are repeated recursively. Thus, an attempt is made to automatically traverse the entirety of the Web to catalog the entirety of the contents of the Web.

[0008] Due to the fact that documents of the Web are constantly being added and/or modified and also to the sheer immensity of the Web, no Web crawler has successfully cataloged the entirety of the Web. Accordingly, providers of Web content who wish to have their content included in search engine databases directly submit their content to providers of search engines. Other providers of content and/or services available through the Internet contract with operators of search engines to have their content regularly crawled and updated such that search results include current information. Some search engines, such as the search engine provided by Overture Services, Inc. of Pasadena, Calif. (http://www.overture.com) and described in U.S. Pat. No. 6,269,361 which is incorporated herein by reference, allow providers of Internet content and/or services to compose and submit brief titles and descriptions to be associated with their content and/or services in results for a search query. As the Internet has grown and commercial activity has also grown over the Internet, some search engines have specialized in providing commercial search results presented separately from informational results with the added benefit of facilitating commercial transactions over the Internet.
One such search engine is the search engine described in the '361 patent and provided by Overture Services, Inc. as described above.

[0009] Since search engines which provide unwanted information are at a distinct disadvantage to search engines which minimize presentation of unwanted information, search engine providers have a strong interest in maximizing relevance of results provided to search queries. Providers of search engines therefore often review the content of individual search listings for desirability and appropriateness prior to including each listing in their database for real-time delivery of search results in response to a search query.

[0010] Due to the overwhelming amount of information on the Web, such review is a daunting task. In addition, content review generally has not lent itself to automation since the appropriateness of a particular search listing depends upon subtleties of human perception of both the search listing itself and the content referenced by the search listing. Operators of search engines have generally had to choose between (i) automatically generating search results of listings having questionable relevance and therefore less value to the user or (ii) manually generating more relevant search listings by human editing but on a drastically reduced scale. While manually edited search listings tend to be far more relevant and therefore far more effective in attracting users to a search engine, manual editing of search listings is very expensive in both time and resources and significantly delays availability of newly submitted search listings to users of the search engine. Delayed availability of search listings reduces the currency of search listings produced as results in response to search queries.

[0011] What is needed is a mechanism by which review of one or more search listings can be efficiently performed while maintaining accurate analysis of the impression of a given search listing on a human user seeing the search listing and/or the content referenced by the search listing.

BRIEF SUMMARY

[0012] In accordance with the present invention, candidate search listings are automatically evaluated to determine the likelihood that the search listings comport with a content policy. Specifically, candidate search listings that are determined to be lower-risk and lower-volume search listings can be automatically and quickly approved for inclusion in the search listing database for immediate serving as results in response to a real-time query by a user. Parties submitting candidate search listings for inclusion in a search engine database benefit from quick approval and availability of submitted search listings. In addition, such parties can be automatically notified of automated approval or rejection of submitted listings, providing greater satisfaction and promoting confidence in the efficiency and effectiveness of the candidate search listing evaluation process.

[0013] Another benefit of quickly and automatically approving lower-risk, lower-volume candidate search listings for inclusion in a search listing database is that valuable human resources can be dedicated to more careful editorial review of candidate search listings which are automatically determined to be either not lower-risk or not lower-volume search listings. Thus, quality of the editorial review of candidate search listings increases while efficiency of editorial review of all candidate search listings simultaneously increases.

[0014] The automated preprocessing to assess likelihood that a candidate search listing comports with the predetermined content policy generally includes quality, style, and relevance analysis.
Quality analysis assesses the nature of the content and, specifically, the likelihood and degree to which the content of the candidate search listing is objectionable. Some types of content are so objectionable as to be unilaterally prohibited by a search engine provider, and so the detection of such blocked content in a candidate search listing results in the automatic rejection of the listing and notification of the submitting source of such rejection and the reasons for the rejection. Suspect terms are terms which indicate that a more thorough review of the candidate search listing is warranted. Detection of suspect content in the search listing causes the search listing to be routed for manual review to determine whether the search listing comports with the content policy, and causes the submitter to be notified that such manual review is being undertaken. Likewise, sexual and gambling content in a search listing does not automatically flag the search listing for rejection but does flag the search listing for a more thorough, manual review by the human editor. Nonsensical junk text within a search listing, however, does cause the search listing to be automatically rejected and the submitter notified.

[0015] In automated evaluation of the style of a candidate search listing, generally three actions are possible. It should be noted that the three actions are not mutually exclusive. First, the candidate search listing can be marked for rejection and automatically sent back to the submitting source with an indication of the reasons for the rejection. Second, the candidate search listing can be flagged for manual review and routed to a human editor with notification of same to the submitter. Third, the candidate search listing can be automatically modified to comport with the predetermined style policy and, once edited, automatically included in the database.
The style policy can specify various style criteria which must be met by a search listing to be included in the search engine database, including rules on capitalization of characters, rules on punctuation, prohibitions of contact information in the search listing, prohibitions against superlatives, and similar criteria as illustrative examples.

[0016] In the automated relevance determination of a candidate search listing, the relevance of a submitted listing to a search term is determined by algorithmically screening the content of an associated web page to verify a set of relevance criteria. Relevance criteria include such things as (i) whether the associated URL address refers to an existing document, (ii) whether the referenced document contains the associated search term, and (iii) whether the search term, title, and description of the search listing are relevant to the referenced document. Such relevance criteria are only representative and could include any criteria deemed appropriate to a relevance determination. Like the evaluation of style, generally three actions are possible from an automated relevance determination. First, the search listing can be definitively considered relevant to the search term and thus approved for automatic processing. Second, the search listing can be determined marginally relevant to the search term and thus routed for manual review by a human editor. Third, the search listing can be determined to be decidedly not relevant to the search and automatically rejected.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 is a block diagram illustrating a wide area network, such as the Internet, in which a search engine according to the present invention is deployed.

[0018] FIG. 2 is a block diagram of the search engine of FIG. 1 in greater detail.

[0019] FIG. 3 is a block diagram of a search listing to be considered for inclusion in a search database in accordance with the present invention.
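
The quality and relevance triage summarized in paragraphs [0014] and [0016] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described or claimed in the application. The names `Verdict`, `Listing`, `quality_verdict`, `relevance_verdict`, and `triage` are hypothetical, as are the example term lists. Two mappings are also assumptions made for the sketch: a missing document is treated as "decidedly not relevant", and a document lacking the search term is treated as "marginally relevant". Combining the two analyses by taking the most severe verdict is likewise a simplification.

```python
import re
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Verdict(IntEnum):
    """Possible outcomes, ordered so that max() picks the most severe."""
    APPROVE = 0
    REVIEW = 1   # route to a human editor
    REJECT = 2


@dataclass
class Listing:
    search_term: str
    title: str
    description: str
    url: str


# Hypothetical term lists; a real content policy would supply these.
BLOCKED_TERMS = {"blockedword"}
SUSPECT_TERMS = {"casino", "adult"}


def quality_verdict(listing: Listing) -> Verdict:
    """Blocked content is rejected; suspect content goes to manual review."""
    text = f"{listing.title} {listing.description}".lower()
    words = set(re.findall(r"[a-z0-9']+", text))
    if words & BLOCKED_TERMS:
        return Verdict.REJECT
    if words & SUSPECT_TERMS:
        return Verdict.REVIEW
    return Verdict.APPROVE


def relevance_verdict(listing: Listing, page_text: Optional[str]) -> Verdict:
    """Apply relevance criteria (i) and (ii) to an already-fetched page.

    page_text is None when the URL did not resolve to a document.
    """
    if page_text is None:
        # Criterion (i) failed; assumed here to mean "decidedly not relevant".
        return Verdict.REJECT
    if listing.search_term.lower() not in page_text.lower():
        # Criterion (ii) failed; assumed here to mean "marginally relevant".
        return Verdict.REVIEW
    return Verdict.APPROVE


def triage(listing: Listing, page_text: Optional[str]) -> Verdict:
    """Combine both analyses; the most severe verdict wins (an assumption)."""
    return max(quality_verdict(listing), relevance_verdict(listing, page_text))
```

Under these assumptions, a listing is approved automatically only when neither analysis raises a concern, which mirrors the summary's goal of reserving human editors for listings that are not clearly low-risk. Style analysis is omitted because its three actions are described as not mutually exclusive and so do not fit a single ordered verdict.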