08/09/07 - USPTO Class 711 | #20070186052

Methods and apparatus for reducing command processing latency while maintaining coherence

USPTO Application #: 20070186052
Title: Methods and apparatus for reducing command processing latency while maintaining coherence
Abstract: In a first aspect, a first method of reducing command processing latency while maintaining memory coherence is provided. The first method includes the steps of (1) providing a memory map including memory addresses available to a system; and (2) arranging the memory addresses into a plurality of groups. At least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence. Numerous other aspects are provided.



Agent: IBM Corporation Intellectual Property Law Dept. 917 - Rochester, MN, US
Inventors: Jeffrey Douglas Brown, Scott Douglas Clark, Mark S. Fredrickson, Charles Ray Johns, David John Krolak
USPTO Application #: 20070186052 - Class: 711141000 (USPTO)

Related Patent Categories: Electrical Computers And Digital Processing Systems: Memory, Storage Accessing And Control, Hierarchical Memories, Caching, Coherency

The Patent Description & Claims data below is from USPTO Patent Application 20070186052, Methods and apparatus for reducing command processing latency while maintaining coherence.

FIELD OF THE INVENTION

[0001] The present invention relates generally to computer systems, and more particularly to methods and apparatus for reducing command processing latency while maintaining coherence.

BACKGROUND

[0002] A computer system may include a plurality of bus units (e.g., logical units such as microprocessors, memory management processors, input/output (I/O) processors and/or the like), coupled via one or more buses, that may require access to one or more memories of the system. For example, the system may include a hierarchy of bus units. More specifically, the system may include a first group of bus units in a first chip and a second group of bus units in a second chip. Further, the first and second chips may be on the same or different cards of the system.

[0003] During operation, one of the bus units may issue a pending coherent command on a bus. The pending command may require access to an address (e.g., cacheline) included in a memory of the system. In a conventional system, to maintain coherence, the system requires each of the remaining bus units of the system to respond to the issuing bus unit to indicate whether the bus unit locally stores the cacheline, and if so, the state of such a locally-stored cacheline. However, due to the hierarchy of the bus units, a response from one or more of the remaining bus units to the issuing bus unit may take a long time, and therefore, increase command latency. For example, assuming the first and second chips are on the same card, if the issuing bus unit is in the first chip, respective responses from the bus units in the second chip may require a long time. If the first and second chips are on different cards, respective responses from the bus units in the second chip may require an even longer time. Accordingly, improved methods and apparatus for reducing command processing latency while maintaining coherence are desired.

SUMMARY OF THE INVENTION

[0004] In a first aspect of the invention, a first method of reducing command processing latency while maintaining memory coherence is provided. The first method includes the steps of (1) providing a memory map including memory addresses available to a system; and (2) arranging the memory addresses into a plurality of groups. At least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence.

[0005] In a second aspect of the invention, a first apparatus for reducing command processing latency while maintaining memory coherence is provided. The first apparatus includes logic and/or memory adapted to store a memory map including memory addresses available to a system. The memory addresses are arranged into a plurality of groups in which at least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence.

[0006] In a third aspect of the invention, a first computer program product is provided. The computer program product includes a medium readable by a computer, the computer readable medium having computer program code adapted to (1) provide a memory map including memory addresses available to a system; and (2) arrange the memory addresses into a plurality of groups, wherein at least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence.

[0007] In a fourth aspect of the invention, a first system for reducing command processing latency while maintaining memory coherence is provided. The first system includes (1) a plurality of bus units, wherein two or more of the bus units may be on different chips, cards or computers of the system; (2) a plurality of buses coupling the bus units; (3) a plurality of memories, each of which corresponds to one or more of the bus units; and (4) a memory map including memory addresses available to a system. The memory addresses are arranged into a plurality of groups such that at least one of the groups does not require the system, in response to a command that requires access to a memory address in such a group from a first bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence. Numerous other aspects are provided, as are systems, apparatus and computer program products in accordance with these and other aspects of the invention. Each computer program product described herein may be carried by a medium readable by a computer (e.g., a carrier wave signal, a floppy disc, a compact disc, a DVD, a hard drive, a random access memory, etc.).

[0008] Other features and aspects of the present invention will become more fully apparent from the following detailed description, the appended claims and the accompanying drawings.

BRIEF DESCRIPTION OF THE FIGURES

[0009] FIG. 1 illustrates a system adapted to reduce command processing latency while maintaining coherence in accordance with an embodiment of the present invention.

[0010] FIG. 2 illustrates a first exemplary method of reducing command processing latency while maintaining coherence in accordance with an embodiment of the present invention.

[0011] FIG. 3 illustrates a process flow of a second exemplary method of reducing command processing latency while maintaining coherence in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

[0012] The present invention provides improved methods and apparatus for reducing command processing latency while maintaining system coherence. More specifically, the present methods and apparatus may employ a system map that does not require system-wide memory coherency. A system map includes all memory addresses available to the system. However, the present methods and apparatus may arrange addresses in the system memory map into groups or domains. The system may only be required to maintain coherence of addresses included in the same group or domain. The memory map groups or domains may be based on system hardware hierarchy and/or applications intended to be executed by the system. For example, the memory map may include a first group or domain of addresses corresponding to memory addresses associated with bus units included in a first chip, a second group or domain of addresses corresponding to memory addresses associated with bus units included in a second chip, and so on. Therefore, the system may only be required to maintain coherence of memory addresses associated with bus units within the same chip. Thus, if a bus unit in the chip issues a command, only remaining bus units in the chip may be required to respond. However, memory map addresses may be arranged into groups or domains differently. For example, if a system designer or architect contemplates a first card of the system will execute a first application and a second card of the system will execute a second application, the memory map may include a first group or domain of addresses corresponding to memory addresses associated with all bus units in the first card of the system, a second group or domain of addresses corresponding to memory addresses associated with all bus units in the second card of the system, and so on. Therefore, the system may only be required to maintain coherence of memory addresses associated with bus units within the same card. Thus, if a bus unit in the card issues a command, only remaining bus units in the card may be required to respond. By reducing the coherency requirement in the manner described above, the present methods and apparatus may reduce command processing latency while maintaining system coherence.
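
By way of illustration only, the arrangement of memory addresses into groups by chip or by card may be sketched in Python as follows. The address ranges, the two-card layout and the names `Region` and `group_regions` are hypothetical and do not appear in the application; the sketch merely shows one possible way of keying groups on the hardware hierarchy.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """A contiguous address range and the hardware hosting it (hypothetical layout)."""
    start: int
    end: int   # exclusive upper bound
    card: int
    chip: int


def group_regions(regions, by="chip"):
    """Arrange regions into coherency groups keyed by chip or by card."""
    if by not in ("chip", "card"):
        raise ValueError("by must be 'chip' or 'card'")
    groups = defaultdict(list)
    for region in regions:
        # A chip is identified by (card, chip); a card by (card,) alone.
        key = (region.card, region.chip) if by == "chip" else (region.card,)
        groups[key].append(region)
    return dict(groups)


# Illustrative memory map: two cards, two chips per card (assumed values).
REGIONS = [
    Region(0x0000, 0x1000, card=0, chip=0),
    Region(0x1000, 0x2000, card=0, chip=1),
    Region(0x2000, 0x3000, card=1, chip=0),
    Region(0x3000, 0x4000, card=1, chip=1),
]

chip_groups = group_regions(REGIONS, by="chip")  # one group per chip
card_groups = group_regions(REGIONS, by="card")  # one group per card
```

Grouping by chip suits the case in which coherence is only needed among bus units of one chip, while grouping by card suits the case in which each card executes its own application.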

[0013] FIG. 1 illustrates a system adapted to reduce command processing latency while maintaining coherence in accordance with an embodiment of the present invention. With reference to FIG. 1, the system 100 may include a plurality of bus units 102, such as microprocessors, memory management processors, input/output (I/O) processors and/or the like. The bus units 102 may be coupled via one or more of a plurality of buses (e.g., processor buses) 104 included in the system 100. Two or more of the bus units 102 may be included in different chips and/or cards included in the system 100. Further, the system 100 may include a plurality of memories 106, each of which may correspond to one or more of the bus units 102.

[0014] For example, the system 100 may include a first card 108 including first and second chips 110, 112. The first chip 110 may include a first bus unit 114 coupled to a second bus unit 116 via a bus 118 included in the first chip 110. Further, the first chip 110 may include a first memory 120 corresponding to the first bus unit 114 and a second memory 122 corresponding to the second bus unit 116. However, the first chip 110 may include a larger or smaller number of memories. Further, in some embodiments, the first and second chips 110, 112 may share one or more such memories. The configuration of the second chip 112 may be the same as the first chip 110. Further, the first and second chips 110, 112 of the first card 108 may be coupled via a bus 123. Additionally, in some embodiments, the first card 108 may include a memory 124 corresponding to bus units 114, 116 included in the first and/or second chips 110, 112. Alternatively, the first card 108 may not include such memory 124.

[0015] The system 100 may include a second card 126 coupled to the first card 108 via a bus 128. Further, the system 100 may include a third card 130 coupled to the second card 126 via a bus 132. The configuration of the second and third cards 126, 130 may be the same as the first card 108. Consequently, bus units 114, 116 may communicate via the buses 104. The system 100 described above is exemplary, and therefore, the system 100 may be configured differently. For example, each chip 110, 112 of each card 108, 126, 130 may include a larger or smaller number of bus units 114, 116, buses 118 and/or memories 120, 122. Further, each card 108, 126, 130 may include a larger or smaller number of chips 110, 112 and/or memories 124. Additionally, the system 100 may include a larger or smaller number of cards 108, 126, 130, which may be coupled in the same or a different manner.
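
As a minimal sketch, the exemplary hierarchy of three cards, two chips per card and two bus units per chip may be modeled in Python as shown below. The names `BusUnit`, `build_system` and `crossing` are hypothetical, and the model omits the buses and memories; it only records where each bus unit sits and which boundary separates two bus units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusUnit:
    """Location of a bus unit in the card/chip hierarchy (hypothetical model)."""
    card: int
    chip: int
    index: int


def build_system(cards=3, chips_per_card=2, units_per_chip=2):
    """Enumerate the bus units of a system with a uniform hierarchy."""
    return [
        BusUnit(card, chip, index)
        for card in range(cards)
        for chip in range(chips_per_card)
        for index in range(units_per_chip)
    ]


def crossing(a, b):
    """Classify the boundary a message between two bus units must cross."""
    if a.card != b.card:
        return "card"
    if a.chip != b.chip:
        return "chip"
    return "none"


system = build_system()  # 3 cards x 2 chips x 2 bus units
```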

[0016] The system 100 is adapted to reduce command processing latency while maintaining coherence of memories 120, 122, 124 included therein. For example, in contrast to a conventional system, the system 100 may process a pending command requiring access to a memory address from one of the plurality of bus units 114, 116 included in the system 100 without requiring permission from all remaining bus units 114, 116 of the system 100. Permission from a remaining bus unit 114, 116 may refer to a snoop response in which the remaining bus unit 114, 116 indicates whether the bus unit 114, 116 locally stores the memory address, and if so, the status of the locally stored memory address. Assuming the system 100 is processing a pending command from the first bus unit 114 in the first chip 110 of the first card 108, a large amount of time (e.g., a large number of clock cycles) may be required for such a bus unit 114 to receive permission from a bus unit 114, 116 included in another chip 112 included in the same card 108 (e.g., due to the chip crossing involved). An even longer amount of time may be required for the first bus unit 114 in the first chip 110 of the first card 108 to receive permission from a bus unit 114, 116 included in another card 126, 130 included in the system 100 (e.g., due to the card crossing involved).
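
The effect of chip and card crossings on latency may be illustrated with a rough Python model. The cycle counts in `COST` are invented for illustration, since the application gives no figures, and the model assumes that snoop responses are gathered in parallel so that the issuing bus unit waits only for the slowest one. Bus units are represented as hypothetical `(card, chip, unit)` tuples.

```python
# Hypothetical response times in clock cycles; not taken from the application.
COST = {"none": 10, "chip": 50, "card": 200}


def crossing(issuer, other):
    """Classify the boundary between two (card, chip, unit) tuples."""
    if issuer[0] != other[0]:
        return "card"
    if issuer[1] != other[1]:
        return "chip"
    return "none"


def snoop_latency(issuer, responders):
    """Cycles until the slowest required snoop response arrives (parallel model)."""
    return max((COST[crossing(issuer, r)] for r in responders), default=0)


# Three cards, two chips per card, two bus units per chip.
units = [(card, chip, unit)
         for card in range(3) for chip in range(2) for unit in range(2)]
issuer = (0, 0, 0)
others = [u for u in units if u != issuer]

system_wide = snoop_latency(issuer, others)
card_only = snoop_latency(issuer, [u for u in others if u[0] == issuer[0]])
chip_only = snoop_latency(issuer, [u for u in others if u[:2] == issuer[:2]])
```

Under these assumed costs, restricting the set of responders to the issuer's own card or chip removes the card crossing or the chip crossing from the critical path.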

[0017] However, during system operation, different groups of bus units 114, 116 on the same chip 110, 112, bus units 114, 116 on different chips 110, 112, or bus units 114, 116 on different cards 108, 126, 130 may be employed for (e.g., to execute) different applications. Therefore, while processing a pending command from a first bus unit 114, 116 included in a first set or group of one or more bus units 114, 116 employed for a first application, the first bus unit 114, 116 may not need to know a state of memories 120, 122, 124 corresponding to bus units 114, 116 employed for different applications. Requiring the bus unit 114 that issued the pending command to await permission from every remaining bus unit 114, 116 of the system 100 during such operation to ensure memory coherence may introduce latency in command processing.

[0018] Consequently, to avoid such unnecessary command processing delay, the system 100 may employ an improved memory map 134 stored by logic (e.g., memory). The improved memory map 134 may be adapted to store memory addresses available to the system 100 (e.g., memory addresses provided by all of the plurality of memories 106), and may be implemented in hardware (e.g., logic), a computer program product and/or software executed by the system 100. The memory map 134 may enable the system 100 to reduce command processing latency while maintaining system coherence. To wit, the improved memory map 134 may enable the system 100 to reduce coherent command processing latency while maintaining coherence of memories 106 included in the system 100. More specifically, in contrast to a memory map included in the conventional system, memory addresses included in the memory map 134 may be arranged into a plurality of domains or groups 136 such that at least one of the groups does not require the system 100, in response to a command that requires access to a memory address in such a group from a bus unit 114, 116, to get permission from all remaining bus units 114, 116 included in the system 100 to maintain memory coherence. A system designer or architect may arrange memory addresses of the system memory map 134 into such groups or domains 136, which may require less than full system-wide coherency. The system 100 may take advantage of such a reduced coherency requirement to dramatically reduce command processing latency. In this manner, the memory map 134 may define a coherency domain hierarchy for a multiple bus unit (e.g., multiprocessor) memory system 100.
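
One possible software form of such a memory map is sketched below in Python, assuming that each group covers contiguous, non-overlapping address ranges. The class name `MemoryMap`, the method `group_of`, the address ranges and the group labels are hypothetical; the application does not specify how the memory map 134 is encoded.

```python
import bisect


class MemoryMap:
    """Maps an address to the coherency group that contains it (illustrative only)."""

    def __init__(self, entries):
        # entries: iterable of (start, end_exclusive, group_name);
        # ranges are assumed not to overlap.
        self._entries = sorted(entries)
        self._starts = [start for start, _, _ in self._entries]

    def group_of(self, address):
        """Return the group whose range contains the address."""
        # Find the last range starting at or before the address.
        i = bisect.bisect_right(self._starts, address) - 1
        if i >= 0:
            start, end, group = self._entries[i]
            if start <= address < end:
                return group
        raise KeyError(f"address {address:#x} is not in the memory map")


# Assumed layout: one group per chip of the first card, one for the second card.
memory_map = MemoryMap([
    (0x0000, 0x1000, "chip-110"),
    (0x1000, 0x2000, "chip-112"),
    (0x2000, 0x4000, "card-126"),
])
```

A command handler could call `memory_map.group_of(address)` to decide which bus units need to be consulted before the command is executed.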

[0019] Exemplary groups or domains may include one or more of non-coherency, chip-wide coherency, card-wide coherency, box-wide coherency (e.g., computer- or server-wide coherency) and system-wide coherency. In the non-coherency domain, before executing a command from a bus unit 114, 116 that requires access to a memory address in the domain, the system 100 may not be required to get permission from all remaining bus units 114, 116 in the system 100. Before executing the command, permission may only be required from the owner of (e.g., the bus unit 114, 116 corresponding to) the memory 120, 122, 124 that is the target of the transaction.
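
The exemplary domains may be sketched in Python as an enumeration together with a function selecting the bus units whose permission is required. Only the non-coherency behavior is stated explicitly above; the treatment of the chip-wide, card-wide, box-wide and system-wide domains below is an assumption extrapolated from the chip and card examples. The names `Scope` and `required_responders` and the `(box, card, chip, unit)` tuples are hypothetical.

```python
from enum import IntEnum
from itertools import product


class Scope(IntEnum):
    """Exemplary coherency domains, from narrowest to widest."""
    NON_COHERENT = 0
    CHIP = 1
    CARD = 2
    BOX = 3
    SYSTEM = 4


# Number of leading (box, card, chip) fields a responder must share with the issuer.
_SHARED_FIELDS = {Scope.CHIP: 3, Scope.CARD: 2, Scope.BOX: 1, Scope.SYSTEM: 0}


def required_responders(issuer, owner, scope, units):
    """Return the bus units whose permission is needed before executing a command."""
    if scope == Scope.NON_COHERENT:
        # Only the owner of the target memory must grant permission.
        return {owner}
    n = _SHARED_FIELDS[scope]
    # Assumed rule: every other bus unit inside the issuer's domain must respond.
    return {u for u in units if u != issuer and u[:n] == issuer[:n]}


# Assumed system: 2 boxes x 2 cards x 2 chips x 2 bus units.
units = list(product(range(2), repeat=4))
issuer = (0, 0, 0, 0)
owner = (0, 0, 0, 1)
```

For example, `required_responders(issuer, owner, Scope.CARD, units)` returns only the other bus units on the issuer's card under this assumed rule.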
