08/28/08 - USPTO Class 707 | #20080208817

System and method for retrieving data using agents in a distributed network

USPTO Application #: 20080208817
Title: System and method for retrieving data using agents in a distributed network
Abstract: A method and apparatus for data retrieval by a computing system and a plurality of agent computers in a distributed network is disclosed. The computing system sends a request to each agent computer to perform a search at a node. The agents perform the searches. The agents thereupon send the resulting data to the computing system for storage in a central database.



USPTO Application #: 20080208817 - Class: 707 3 (USPTO)



The Patent Description & Claims data below is from USPTO Patent Application 20080208817, System and method for retrieving data using agents in a distributed network.

CROSS-REFERENCE TO RELATED APPLICATION

The present application for patent claims priority to Provisional Application No. 60/866,433 entitled “System And Method For Tracking Target Assets And Alerting Users Of Changes On A Computer Network,” filed Nov. 20, 2006, attorney docket no. 79789-011, and assigned to the assignee hereof and hereby expressly incorporated by reference herein.

BACKGROUND

1. Field

The present invention relates generally to data retrieval in distributed networks, and more specifically to techniques for retrieving data from nodes using agent machines in the network.

2. Description of Related Art

For a variety of applications, a computing system on a network such as the internet may be tasked with retrieving data from various locations. One such application is an internet search engine. Many commercially available search engines engage in the practice of web scraping to collect data. Web scraping refers to extracting content from websites for the purpose of transforming the content into another format suitable for use in another application. In the case of an internet search engine, an automated web crawler program may explore the Internet and copy content from millions of websites. The content can then be indexed and made available to users in response to the execution of queries at the search engine website.
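As an illustration of the extract-then-index flow described above, the following Python sketch pulls the visible text out of page markup and builds a simple word-to-page index. It is only a minimal sketch and not a description of any particular search engine: the names TextExtractor, extract_text and build_index, and the sample pages, are hypothetical and do not come from the application. The pages are supplied as in-memory strings, so no network access is involved.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of script and style elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Transform page markup into plain text (the 'other format')."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for word in extract_text(html).lower().split():
            index[word.strip(".,;:!?")].add(url)
    return index


# Hypothetical sample content standing in for crawled pages.
pages = {
    "http://example.com/a": "<html><body><h1>Cheap flights</h1>"
                            "<script>var x=1;</script></body></html>",
    "http://example.com/b": "<html><body><p>Flights and hotels.</p></body></html>",
}
index = build_index(pages)
```

A query for a word then reduces to a dictionary lookup, for example index["flights"], which returns the set of pages on which the word appears.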

From the standpoint of exposure, web scraping in the search engine arena often stands to benefit both the sponsor of the scraping program and the website owner. For an established search engine like Google, website owners can benefit greatly from allowing a web crawler to search their content, because it offers the potential to enable many more users to discover and visit their websites than would otherwise do so through, for example, happening upon the website during the course of browsing. For this reason, many website owners intentionally place content in a designated area on their websites, with the anticipation that the web crawler will search these areas for content tailored specifically for use in connection with subsequent internet searches conducted by users.

For various reasons, many website owners run special blocking code on their websites that attempts to recognize automated scraping programs and prevent them from collecting data on the targeted websites. For example, a company that maintains a travel website for selling airline tickets may elect to limit access to its websites to “human” users—namely, users that are manually running a web browser on a computer and conducting queries in real time at the website over the Internet.

Many of these blocking programs work by searching for and identifying a node (i.e., a website or other network location) with a particular address that repeatedly executes searches at a target website and returns the search results to the node from the website, often in volume. This may indicate that the identified node is sponsoring a scraping process at the target website. In addition, the blocking programs may explore one or more attributes of the search itself such as, for example, whether repeated queries follow some recognizable pattern or the digital signature left by the node. These and other characteristics of the search often provide clues that the searches are automated, rather than being conducted at the direct behest of a user in real time. In short, where a node querying the target website demonstrates some or all of these characteristics, the blocking program may flag this node as one believed to be running an automated data scraping program. In this event, the blocking program may prevent future access by the node to the target website.

The businesses and website owners that represent potential targets for web-scraping programs may perceive, in certain instances, that such programs serve to dilute the import or popularity of their websites, to reduce their profitability by giving customers more purchase options from other sources, or to focus consumers on entitlements that do not necessarily benefit the specific objective of the website. As a result they may take measures such as those discussed above to attempt to limit access by certain types of web scraping-type programs, or to exclude such programs altogether from accessing the target website.

For these types of traditional blocking programs, it is generally important to the website owner that any candidate blocking program considered for use at the target website does not inadvertently prevent what they perceive to be “legitimate” users of client machines from having substantially unhindered access to information at the target website. These legitimate users may, for example, be individuals executing routine queries in a manner intended by the website for topics, products, items or assets, for purchase or otherwise.

To curtail the inadvertent blocking of the website's target audience of potential customers, many blocking programs are configured to issue block orders only to those nodes whose activity at the target website satisfies a condition. Such conditions may include, for example, the node's frequency of visiting a target website, the amount of the target website's resources used by the node, or the volume of information obtained by the node from the website. Only if one or more of these measures exceeds some predetermined threshold would the node be blocked from access. This approach represents a traditional attempt by the website owner to balance the owner's interest in preventing access to the website by unwanted scraping nodes on one hand, and preserving access to the target website for “desirable” users on the other.
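The threshold test described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the logic of any actual blocking program: the class names NodeActivity and BlockPolicy, the three measured quantities, and every numeric threshold are hypothetical values chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class NodeActivity:
    """Activity observed for one node at the target website (hypothetical fields)."""
    visits_per_hour: int
    cpu_seconds_used: float
    bytes_downloaded: int


@dataclass
class BlockPolicy:
    """Predetermined thresholds; the default values are illustrative assumptions."""
    max_visits_per_hour: int = 600
    max_cpu_seconds: float = 120.0
    max_bytes_downloaded: int = 50_000_000


def exceeded_conditions(activity, policy):
    """Return the names of all conditions whose threshold the node exceeds."""
    reasons = []
    if activity.visits_per_hour > policy.max_visits_per_hour:
        reasons.append("visit frequency")
    if activity.cpu_seconds_used > policy.max_cpu_seconds:
        reasons.append("resource usage")
    if activity.bytes_downloaded > policy.max_bytes_downloaded:
        reasons.append("data volume")
    return reasons


def should_block(activity, policy):
    """Block only when at least one condition exceeds its threshold."""
    return bool(exceeded_conditions(activity, policy))


policy = BlockPolicy()
shopper = NodeActivity(visits_per_hour=12, cpu_seconds_used=1.5, bytes_downloaded=400_000)
crawler = NodeActivity(visits_per_hour=5000, cpu_seconds_used=30.0, bytes_downloaded=2_000_000)
```

With these assumed numbers the shopper stays under every threshold and is left alone, while the crawler is blocked on visit frequency alone. Returning the list of exceeded conditions, rather than a bare yes or no, makes it easier for a site operator to tune each threshold separately.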

One problem with this conventional approach is that otherwise legitimate data collecting programs may simply be blocked wholesale by e-commerce based businesses and other websites, without regard to the numerous advantages that sponsors of these programs may offer to a variety of classes of individuals. From a legal standpoint, the objectives of the entity owning a particular web scraping program may be entirely legitimate. Such scraping programs may in actuality result in the provision of necessary or useful services and benefits to the business owner, the relevant consumer class, or both. This is particularly true where the data blocked from access constitutes government-published data, or data types involving minimal or no restrictions of use.

In the above example of the travel website, a consumer may wish to purchase an airline ticket on the Internet. To get the lowest possible price of a ticket, the consumer may well be required to spend a considerable amount of time visiting a plethora of websites, such as some of the major travel websites as well as the airlines' own websites. If, however, a data retrieval program performs these tasks (in advance or automatically at the behest of a user), and the results are somehow made available to the user in an intelligible format, then the user may be relieved of the obligation to conduct multiple time-consuming searches. The consumer may thereupon opt to return to the airlines' website, or return to the travel website after a designated time, for example, to insert the criteria obtained from a proprietor of the scraping application to obtain the lowest possible fare. None of these activities are currently feasible, however, where the scraping program is simply blocked by the target node.

As another illustration, a consumer may purchase an asset online at a target website, and an event sometime down the road may trigger the consumer's entitlement to a refund on the asset that the consumer already purchased. In the travel industry, by way of example, prices of assets such as airline tickets may be highly volatile, and hence likely to change over time. The entitlement to a refund of part of a purchase price may arise, for example, by law, or by a surreptitious provision in an agreement with an eCommerce website. In the conventional scenario, the consumer may not be notified about the discount, and thus may miss out on it altogether. Further, the consumer seeking information about a discount may be relegated to conducting multiple searches of the e-commerce website to establish to what extent, if any, the consumer is entitled to a refund. The average consumer may understandably elect not to pursue these time-consuming tasks, in which case the business owner stands to accrue an additional financial benefit as a result of the consumer's inability to access information that might otherwise entitle the consumer to a return of some of the funds used to purchase the asset in the first place.

Countless other examples relating to the utility of legitimate scraping applications in Internet eCommerce and other arenas exist.

As a result, a need persists in the art for a superior data-retrieval mechanism that overcomes the stated disadvantages.

SUMMARY

A plurality of agents may be used in a distributed network to perform queries at nodes from which information is desired. A computing system may delegate tasks to perform, such as the execution of queries, to the agents at the nodes. When the tasks are performed, information acquired from performing the tasks may be forwarded to the computing system for storage in a central database.

A computing system for retrieving data from a node using a plurality of agent computers in a distributed network may include a memory system for storing code, and a processing system associated with the memory system and configured to run the code, wherein the code when run is configured to deliver a request to each agent computer to retrieve data at the node, receive from the agent computer the data obtained in response to the request, and store the data in a database.
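The three steps recited above (deliver a request to each agent, receive the data obtained, store it in a database) can be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions, not the implementation disclosed in the application: the classes Agent and ComputingSystem, the NODES dictionary that stands in for remote nodes, the round-robin assignment of requests to agents, and the use of an in-memory SQLite table as the central database are all hypothetical choices made for the example.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for remote nodes: node name -> records available there.
NODES = {
    "node-a": ["fare LAX-JFK 310", "fare LAX-SFO 89"],
    "node-b": ["fare LAX-JFK 295"],
}


class Agent:
    """An agent computer that performs a search at a node on request."""

    def __init__(self, agent_id):
        self.agent_id = agent_id

    def perform_search(self, node, query):
        # Simulated query: return the node's records that contain the query text.
        return [record for record in NODES.get(node, []) if query in record]


class ComputingSystem:
    """Delegates requests to agents and stores the returned data centrally."""

    def __init__(self, agents):
        self.agents = agents
        self.db = sqlite3.connect(":memory:")  # central database (assumed SQLite)
        self.db.execute(
            "CREATE TABLE results (agent_id TEXT, node TEXT, query TEXT, record TEXT)"
        )

    def retrieve(self, requests):
        """requests is a list of (node, query) pairs, assigned round-robin to agents."""

        def run(indexed_request):
            i, (node, query) = indexed_request
            agent = self.agents[i % len(self.agents)]
            return agent.agent_id, node, query, agent.perform_search(node, query)

        # Step 1 and 2: deliver the requests and receive the data from the agents.
        with ThreadPoolExecutor(max_workers=len(self.agents)) as pool:
            outcomes = list(pool.map(run, enumerate(requests)))

        # Step 3: store the data; done on the calling thread that owns the connection.
        for agent_id, node, query, records in outcomes:
            self.db.executemany(
                "INSERT INTO results VALUES (?, ?, ?, ?)",
                [(agent_id, node, query, record) for record in records],
            )
        self.db.commit()

    def stored(self, query):
        """Return (agent_id, node, record) rows stored for a query, ordered by node."""
        cursor = self.db.execute(
            "SELECT agent_id, node, record FROM results WHERE query = ? ORDER BY node",
            (query,),
        )
        return cursor.fetchall()


system = ComputingSystem([Agent("agent-1"), Agent("agent-2")])
system.retrieve([("node-a", "LAX-JFK"), ("node-b", "LAX-JFK")])
rows = system.stored("LAX-JFK")
```

In this sketch the agents only read from a local dictionary; in a real deployment each agent would be a separate machine and the request and response would travel over the network. Recording the agent identifier with every row keeps track of which agent obtained which piece of data.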




Patent Applications in related categories:

20090299980 - method for searching and displaying content in a directory - An improved system and methods for searching and displaying content in a directory having a single-action process which instantaneously displays search results solely of items open and operating at the exact time of the search request, within close proximity of the requester. An exemplary method may comprise the steps of: ...

20090299982 - Apparatus and method for routing composite objects to a report server - A computer readable medium stores instructions for execution on a computer. The instructions receive a collection of composite objects. An aggregate dataset that includes a portion of contents of object instances in the collection of composite objects is created. The aggregate dataset includes contents of object instances formed by reflection, ...

20090299971 - Binary search circuit and method - A binary search circuit 36 searches a database 50, which stores pieces of data aligned in ascending or descending order, for comparison target data by binary search. Comparison circuits 36A, 36B and 36C compare pieces of data read out from databases 50A, 50B and 50C with the comparison target data. ...

20090299974 - Character sequence map generating apparatus, information searching apparatus, character sequence map generating method, information searching method, and computer product - A computer-readable recording medium stores therein a sequence-map generating program that causes a computer to execute extracting from files that include character strings written therein, a word having q (q≧2) characters; extracting from the word extracted at the extracting the word, consecutive characters from a character position s-th (1≦s≦q−r+1) from ...

20090299969 - Data warehouse system - Methods and apparatus, including computer program products, implementing and using techniques for analyzing historical data in a data warehouse. A data warehouse is provided. The data warehouse includes several database tables. Every database table has a start time column and an end time column. A query is issued to the ...

20090299972 - Device and method for updating a certificate - A method updates certificates for potential recipients. The method comprises determining whether the certificates require updating. The method comprises determining a number of the certificates that require updating. The method comprises requesting updates for each of the certificates that require updating when the number is at most a preset number ...

20090299962 - Dynamic update of a web index - Systems and methods are provided for regularly updating a web index with new or updated content, such as meta words or meta streams, for a particular web page address, such as a URL. Web page addresses and associated updated information, such as meta words, meta streams, values, and locations in ...

20090299961 - Face search in personals - A device, system and method to enable searching of personal profiles in the context of on-line dating that includes the ability to determine the personal profiles that have images that most closely resemble a target image. ...

20090299981 - Information processing device, information processing method, and program - An information processing device includes: a storage management unit configured to store and manage content files; a metadata obtaining unit configured to obtain metadata of a recommendation source content; a content selecting unit configured to select, from content files managed by the storage management unit, recommended contents to be recommended ...

20090299973 - Information searching apparatus, information managing apparatus, information searching method, information managing method, and computer product - A computer-readable recording medium stores therein an information searching program that causes a computer having access to archives including a compressed file group of compressed files that are to be searched and that have described therein character strings, to execute: sorting the compressed files in descending order of access frequency ...

20090299966 - Management of large dynamic tables - Managing a table as multiple ordered blocks of entries. Each block has a local index value for each entry, and each entry has an associated element value. The entries in the table are monotonically ordered, and the table is searchable by element value and entry index value. Each block has ...

20090299977 - Method for automatic labeling of unstructured data fragments from electronic medical records - A method for automatically labeling unstructured data from electronic medical records using a computer-based medical data processing system includes selecting a data pattern based on a desired medical finding. The selected data pattern is searched for within source data including patient records to find one or more matches. A context ...

20090299963 - Method, apparatus, and computer program product for content use assignment by exploiting social graph information - An apparatus for automatically assigning content information may include a processor. The processor may be configured to receive content information, and identify the usage type and the sub-usage type of the content information. The content information may comprise an indicator for a usage type and a sub-usage type. The processor ...

20090299968 - Methods and apparatus to save search data - Methods and apparatus to save search data are described. An example method for use in media presentation system includes receiving one or more characters to form a search string to be used in a first type of search; converting the search string to one or more keywords to be used ...

20090299960 - Methods, systems, and computer program products for automatically modifying a virtual environment based on user profile information - The subject matter described herein includes methods, systems, and computer program products for automatically modifying a virtual environment based on user profile information. According to one aspect, the method includes determining user profile information associated with a user and automatically modifying a virtual environment based on the determined user profile ...

20090299965 - Navigating product relationships within a search system - Embodiments of the present invention relate to aggregating product information from a variety of sources to generate user interfaces that allow users to navigate and discover products. Product information is aggregated from both feed and crawl sources, and product entities are identified within the aggregate product information. In some embodiments, ...

20090299984 - Partial data model exposure through client side caching - The present invention generally provides methods, articles of manufacture and systems for exposing, on a client device, fields of a data model representing an underlying database for use in building queries against the database. For some embodiments, the client device may be a device having limited resources, such as a ...

20090299964 - Presenting search queries related to navigational search queries - A method and medium are provided for determining whether search queries issued to a search engine are navigational search queries and displaying related search queries and corresponding URLs in association with a URL corresponding to a target of the navigational search query. One embodiment of the method includes receiving a ...

20090299979 - Product lifecycle information management system using ubiquitous technology - A product lifecycle information management system using ubiquitous technology is provided. The system includes a service manager that comprises a service repository for registering a service using product information in a product lifecycle and multiple interface agents (IAs) for providing an interface for the service registered in the service repository. ...

20090299970 - Social network for mail - A method for analyzing email data including: parsing a first email into one or more email attributes; searching a social network datastore that stores email attributes of other emails; retrieving history data related to one or more or the email attributes from the social network datastore; and processing the one ...

20090299975 - System and method for document analysis, processing and information extraction - The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system stores a number of diffusion coordinates, wherein the number is linear ...

20090299983 - System and method of accelerating document processing - Embodiments include methods and systems for processing XML documents. One embodiment is a system that includes a tokenizer configured to identify tokens in an XML document. A plurality of speculative processing modules are configured to receive the tokens and to at least partially process the XML document and to provide ...

20090299978 - Systems and methods for keyword and dynamic url search engine optimization - A method implemented on one or more computer processors for search engine optimization may comprise automatically determining a relevancy of the keywords, automatically assigning an inverse document frequency (IDF) value to each keyword designated highly relevant, automatically defining relationships between keywords that are determined both highly user-relevant and highly database-relevant, ...

20090299976 - Systems and methods of identifying chunks from multiple syndicated content providers - A computer receives a first set of information items from a first content provider and a second set of information items from a second content provider. For each of the first and second sets of information items, the computer retrieves the document identified by the corresponding document link from a ...

20090299967 - User advertisement click behavior modeling - Described herein is technology for, among other things, mining similar user clusters based on user advertisement click behaviors. The technology involves methods and systems for mining similar user clusters based on log data available on an online advertising platform. By building a user linkage representation based on one or more ...


Industry Class:
Data processing: database and file management or data structures
