Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages -> Monitor Keywords
Fresh Patents
Monitor Patents Patent Organizer How to File a Provisional Patent Browse Inventors Browse Industry Browse Agents Browse Locations
     new ** File a Provisional Patent ** 
site info Site News  |  monitor Monitor Keywords  |  monitor archive Monitor Archive  |  organizer Organizer  |  account info Account Info  |  
05/11/06 | 68 views | #20060101230 | Prev - Next | USPTO Class 712 | About this Page  712 rss/xml feed  monitor keywords

Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages

USPTO Application #: 20060101230
Title: Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages
Abstract: In one embodiment, a programmable processor searches an array of N data elements in response to N/M machine instructions, where the processor has a pipeline configured to process M data elements in parallel. In response to the machine instructions, a control unit directs the pipeline to retrieve M data elements from the array of elements in a single fetch cycle, concurrently compare the data elements to M current extreme values, and update the current extreme values, as well as M references to the current extreme values, based on the comparisons. (end of abstract)
Agent: Fish & Richardson, PC - Minneapolis, MN, US
Inventors: Charles P. Roth, Ravi Kolagotla, Jose Fridman
USPTO Applicaton #: 20060101230 - Class: 712009000 (USPTO)
Related Patent Categories: Electrical Computers And Digital Processing Systems: Processing Architectures And Instruction Processing (e.g., Processors), Processing Architecture, Vector Processor, Vector Processor Operation, Concurrent
The Patent Description & Claims data below is from USPTO Patent Application 20060101230.
Brief Patent Description - Full Patent Description - Patent Application Claims  monitor keywords



CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional application of and claims priority to U.S. patent application Ser. No. 09/675,066, filed Sep. 28, 2000.

BACKGROUND

[0002] This invention relates to array searching operations for a computer.

[0003] Many conventional programmable processors, such as digital signal processors (DSP), support a rich instruction set that includes numerous instructions for manipulating arrays of data. These operations are typically computationally intensive and can require significant computing time, depending upon the number of execution units, such as multiply-accumulate units (MACs), within the processor.

DESCRIPTION OF DRAWINGS

[0004] FIG. 1 is a block diagram illustrating an example of a pipelined programmable processor.

[0005] FIG. 2 is a block diagram illustrating an example execution pipeline for the programmable processor.

[0006] FIG. 3 is a flowchart for implementing an example array manipulation machine instruction.

[0007] FIG. 4 is a flowchart of an example routine for invoking the machine instruction.

[0008] FIG. 5 is a flowchart for a single SEARCH instruction.

[0009] FIG. 6 is a flowchart where a software application issues N/M SEARCH instructions and, upon completion of the N/M SEARCH instructions, determines an extreme value for an entire array.

DESCRIPTION

[0010] FIG. 1 is a block diagram illustrating a programmable processor 2 having an execution pipeline 4 and a control unit 6. Processor 2, as explained in detail below, reduces the computational time required by array manipulation operations. In particular, processor 2 may support a machine instruction, referred to herein as the SEARCH instruction, that reduces the computational time to search an array of numbers in a pipelined processing environment.

[0011] Pipeline 4 has a number of stages for processing instructions. Each stage processes concurrently with the other stages and passes results to the next stage in pipeline 4 at each clock cycle. The final results of each instruction emerge at the end of the pipeline in rapid succession.

[0012] Control unit 6 controls the flow of instructions and data through the various stages of pipeline 4. During the processing of an instruction, for example, control unit 6 directs the various components of the pipeline 4 to fetch and decode the instruction, perform the corresponding operation and write the results back to memory or local registers.

[0013] FIG. 2 illustrates an example pipeline 4 configured according to the invention. Pipeline 4, for example, has five stages: instruction fetch (IF), decode (DEC), address calculation (AC), execute (EX) and write back (WB). Instructions are fetched from memory, or from an instruction cache, during the IF stage by fetch unit 21 and decoded within address registers 22 during the DEC stage. At the next clock cycle, the results pass to the AC stage, where data address generators 23 calculate any memory addresses that are necessary to perform the operation.

[0014] During the EX stage, execution units 25A through 25M perform the specified operation such as, for example, adding or multiplying numbers, in parallel. Execution units 25 may contain specialized hardware for performing the operations including, for example, one or more arithmetic logic units (ALU's), floating-point units (FPU) and barrel shifters. A variety of data can be applied to execution units 25 such as the addresses generated by data address generator 23, data retrieved from data memory 18 or data retrieved from data registers 24. During the final stage (WB), the results are written back to data memory or to data registers 24.

[0015] The SEARCH instruction supported by processor 2, may allow software applications to search an array of N data elements by issuing N/M search instructions, where M is the number of data elements that can be processed in parallel by execution units 25 of pipeline 4. Note, however, that a single execution unit may be capable of executing two or more operations in parallel. For example, an execution unit may include a 32-bit ALU capable of concurrently comparing two 16-bit numbers.

[0016] Generally, the sequence of SEARCH instructions allows the processor 2 to process M sets of elements in parallel to identify an "extreme value", such as a maximum or a minimum, for each set. During the execution of the search instructions, processor 2 stores references to the location of the extreme value of each of the M sets of elements. Upon completion of the N/M instructions, as described in detail below, the software application analyzes the references to the extreme values for each set to quickly identify an extreme value for the array. For example, the search instruction allows the software applications to quickly identify either the first or last occurrence of a maximum or minimum value. Furthermore, as explained in detail below, processor 2 implements the operation in a fashion suitable for vectorizing in a pipelined processor across the M execution units 25.

[0017] As described above, a software application searches an array of data by issuing N/M SEARCH machine instructions to processor 2. FIG. 3 is a flowchart illustrating an example mode of operation 300 for processor 2 when it receives a single SEARCH machine instruction. Process 300 is described with reference to identifying the last occurrence of a minimum value within the array of elements; however, process 300 can be easily modified to perform other functions such as identifying the first occurrence of a minimum value, the first occurrence of a maximum value or a last occurrence of a maximum value.

[0018] For exemplary purposes, process 300 is described in assuming M equals 2, i.e., processor 2 concurrently processes two sets of elements, each set having N/2 elements. However, the process is not limited as such and is readily extensible to concurrently process more than two sets of elements. In general, process 300 facilitates vectorization of the search process by fetching pairs of elements as a single data quantity and processing the element pairs through pipeline 4 in parallel, thereby reducing the total number of clock cycles necessary to identify the minimum value within the array. Although applicable to other architectures, process 300 is well suited for a pipelined processor 2 having multiple execution units in the EX stage. For the two sets of elements, process 300 maintains two pointer registers, P.sub.Even and P.sub.Odd, that store locations for the current extreme value within the corresponding set. In addition, process 300 maintains two accumulators, A0 and A1, that hold the current extreme values for the sets. The pointer registers and the accumulators, however, may readily be implemented as general-purpose data registers without departing from process 300.

[0019] Referring to FIG. 3, in response to each SEARCH instruction, processor 2 fetches a pair of elements in one clock cycle as a single data quantity (301). For example, processor 2 may fetch two adjacent 16-bit values as one 32-bit quantity. Next, processor 2 compares the even element of the pair to a current minimum value for the even elements (302) and the odd element of the pair to a current minimum value for the odd elements (304).

[0020] When a new minimum value for the even elements is detected, processor 2 updates accumulator A0 to hold the new minimum value and updates a pointer register P.sub.Even to hold a pointer to point to a corresponding data quantity within the array (303). Similarly, when a new minimum value for the odd elements has been detected, processor 2 updates accumulator A1 and a pointer register P.sub.Odd (305). In this example, each pointer register P.sub.Even and P.sub.Odd points to the data quantity and not the individual elements, although the process is not limited as such. Processor 2 repeats the process until all of the elements within the array have been processed (306). Because processor 2 is pipelined, element pairs may be fetched until the array is processed.

Continue reading...
Full patent description for Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages

Brief Patent Description - Full Patent Description - Patent Application Claims
Click on the above for other options relating to this Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages patent application.
###
monitor keywords

How KEYWORD MONITOR works... a FREE service from FreshPatents
1. Sign up (takes 30 seconds). 2. Fill in the keywords to be monitored.
3. Each week you receive an email with patent applications related to your keywords.  
Start now! - Receive info on patent apps like Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages or other areas of interest.
###


Previous Patent Application:
Vectorized table lookup
Next Patent Application:
Semiconductor integrated circuit
Industry Class:
Electrical computers and digital processing systems: processing architectures and instruction processing (e.g., processors)

###

FreshPatents.com Support
Thank you for viewing the Maintaining even and odd array pointers to extreme values by searching and comparing multiple elements concurrently where a pointer is adjusted after processing to account for a number of pipeline stages patent info.
IP-related news and info


Results in 0.83845 seconds


Other interesting Feshpatents.com categories:
Canon USA , Celera Genomics , Cephalon, Inc. , Cingular Wireless , Clorox , Colgate-Palmolive , Corning , Cymer ,