Stream processor with variable single instruction multiple data (SIMD) factor and common special function
USPTO Application #: 20070186082
Title: Stream processor with variable single instruction multiple data (SIMD) factor and common special function

Abstract: Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process a plurality of sets of short data in response to a received short format control signal from an instruction set and process a set of long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide additional computational functionality to the first ALU and the second ALU.

Agent: Thomas, Kayden, Horstemeyer & Risley, LLP - Atlanta, GA, US
Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
Class: 712221 (USPTO)

CROSS REFERENCE

[0001] This application claims the benefit of U.S. Provisional Application No. 60/765,571, filed on Feb. 6, 2006, which is incorporated by reference in its entirety. This application is also related to copending U.S. Utility Patent Application entitled "Dual Mode Floating Point Multiply-Accumulate Unit" filed on the same day as the present application and accorded Ser. No. ______, which is hereby incorporated by reference herein in its entirety.
[0002] The U.S. patent application entitled "SIMD Processor with Scalar Arithmetic Logic Units" filed on Jan. 29, 2003 and given Ser. No. 10/354,795 is also incorporated by reference in its entirety.

BACKGROUND

[0003] Since the year 2000, fixed function Graphics Processing Units (GPUs) have become increasingly programmable, giving a user direct and flexible control over the processing of primitive, vertex, texture, and pixel streams in graphics chips. Many current GPUs can feature programmability in the form of at least one shader (primitive, vertex, etc.) but generally can process only a few types of data (say, 32-bit floating point for vertex and 32-bit integer). The programmable shaders in the graphics pipeline are generally arranged in a sequential manner for forwarding data to fixed function units and to each other, with a data format conversion if desired.

[0004] Also generally involved in the design of GPUs are parallel multiprocessor architecture principles. Application of parallel architecture principles generally utilizes a plurality of same-type arithmetic logic units (ALUs) to process different types of stream data in non-uniform program threads. In many circumstances, it is desirable for the ALUs to process different kinds of data on every clock cycle when non-uniform program threads are interleaved.

[0005] One important issue is the implementation of complex mathematical functions (special functions) in such multiprocessor structures. There are generally two ways to implement them: a special subroutine executed on a general ALU, or a special hardware unit attached to a general ALU that produces a result at the ALU's request. Software implementation of such functions creates significant performance degradation, which might be unacceptable in the case of real-time graphics applications. In the case of multiple ALUs combined in a SIMD structure, such a unit would have to be attached to every ALU, which may significantly increase hardware overhead.
Such complex functions are not used very often in a shader program, and most of the time the special hardware units attached to each general ALU will sit idle.

[0006] This situation can be partially resolved by sharing the special function unit (SFU) among a plurality of ALUs, but in the case of a SIMD structure, a thread will be stalled until all streams get their results from the shared SFU, which processes requests sequentially. Each use of a complex mathematical function in a shader program may therefore cost several cycles of overhead. Special arrangements in the SIMD stream architecture should be made to minimize stall wait cycles and provide smooth stream processing with minimal overhead if non-uniform program threads are interleaved.

[0007] While the ALUs used in this multiprocessing manner generally sustain high throughput, the ALUs should be able to process more data streams in short format while sharing the same hardware used for the longer format. Generally speaking, current ALUs for GPUs are configured to process only one floating-point format (e.g., 32-bit IEEE format as standard) and generally experience low performance in processing lower accuracy pixel and texture data. Additionally, if another type of data format is supported, the ALU generally works with the same number of streams, with little to no throughput improvement or Single Instruction Multiple Data (SIMD) factor variability regardless of the data format. Further, current ALUs are generally not configured to arbitrarily interleave the flow of instructions (lack of support for non-uniform threads). Additionally, current dual format Multiply Accumulate (MACC) units can generally process only integer data.

[0008] Vector machines with a fixed data format and a fixed SIMD factor generally have less of a hardware load and generally process stream data relatively slowly in the case where there are fewer elements in the vector stream than the width of a vector unit.
Additionally, current graphics shader architectures generally have limited instruction set capabilities for processing data of different formats in the same instruction.

[0009] Thus, a heretofore unaddressed need exists in the industry to address the aforementioned deficiencies and inadequacies.

SUMMARY

[0010] Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process short data in response to a received short format control signal from an instruction set and process long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide computational functionality to the first ALU and the second ALU.

[0011] Additionally included are embodiments of a method for processing data in any of a plurality of different formats. At least one embodiment of the method includes determining that received data is short format data and, in response to determining that the received data is short format data, functionally dividing a first arithmetic logic unit (ALU) for processing, according to an instruction set. Other embodiments of the method include sending the processed data to a second functionally divided ALU.

[0012] Also included are embodiments of a modular stream processor configured to process data in a plurality of different formats.
At least one embodiment of a modular stream processor includes a first Arithmetic Logic Unit (ALU) configured to receive first input data and control data, the control data being configured to indicate a format associated with the received input data, the first ALU further configured to process short format input data and long format input data, according to the control data. Some embodiments include a second ALU configured to receive the control data from the first ALU, the second ALU further configured to process second input data, the second input data being related to the first input data, the second ALU being further configured to process short format input data and long format input data, according to the control data. Still other embodiments include a third ALU configured to receive the control data from the second ALU, the third ALU further configured to receive third input data, the third input data being related to the first input data and the second input data, the third ALU further configured to process short format input data and long format input data according to the control data. Some embodiments include a fourth ALU configured to receive the control data from the third ALU, the fourth ALU further configured to receive fourth input data, the fourth input data being related to the first input data, the second input data, and the third input data, the fourth ALU further configured to process short format data and long format data, according to the control data.

[0013] Other systems, methods, features, and advantages of this disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description and be within the scope of the present disclosure.

BRIEF DESCRIPTION

[0014] Many aspects of the disclosure can be better understood with reference to the following drawings.
The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views. While several embodiments are described in connection with these drawings, there is no intent to limit the disclosure to the embodiment or embodiments disclosed herein. On the contrary, the intent is to cover all alternatives, modifications, and equivalents.

[0015] FIG. 1A is a flowchart illustrating stream data processing steps that can be taken in an exemplary vector processing unit.

[0016] FIG. 1B is a flowchart illustrating stream data processing steps that can be taken in an exemplary scalar processing unit, similar to the steps illustrated in FIG. 1A.

[0017] FIG. 1C is an exemplary stream processing SIMD structure with a software implementation of complex mathematical functions.

[0018] FIG. 1D is an exemplary stream processing SIMD structure with a hardware implementation of complex mathematical functions using a private special function unit (SFU) for each ALU.

[0019] FIG. 1E is an exemplary stream processing SIMD structure with a hardware implementation of complex mathematical functions using a common SFU for all ALUs.

[0020] FIG. 1F is an exemplary stream processing SIMD structure with an implementation of complex mathematical functions using a common SFU with interleaved access to the common SFU.
Industry Class: Electrical computers and digital processing systems: processing architectures and instruction processing (e.g., processors)
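As a rough illustration of the trade-off described in paragraphs [0005] and [0006] and depicted in FIGS. 1D and 1E, the toy cycle-count model below compares a private SFU per ALU with a single shared SFU that serves requests one after another. It is only a sketch and is not taken from the application: the function name `thread_cycles`, the one-cycle cost of ordinary operations, and the non-pipelined shared SFU are all assumptions made for the example. The interleaved arrangement of FIG. 1F is not modeled.

```python
def thread_cycles(program, num_alus, sfu_latency, shared):
    """Count cycles for one SIMD thread under a toy timing model.

    program     -- list of op names; the string "sfu" marks a special function
    num_alus    -- number of ALUs (streams) executing the same instruction
    sfu_latency -- cycles the SFU needs for one request (assumed, not pipelined)
    shared      -- True: one SFU serves all ALUs sequentially (as in FIG. 1E)
                   False: every ALU has its own private SFU (as in FIG. 1D)
    """
    cycles = 0
    for op in program:
        if op == "sfu":
            # With a shared SFU the thread stalls until every stream has been
            # served in turn; with private SFUs all streams finish together.
            requests_in_series = num_alus if shared else 1
            cycles += sfu_latency * requests_in_series
        else:
            # Ordinary ALU operations are assumed to take one cycle each.
            cycles += 1
    return cycles


if __name__ == "__main__":
    shader = ["mul", "add", "sfu", "add"]  # hypothetical four-instruction shader
    print("private SFUs:", thread_cycles(shader, num_alus=4, sfu_latency=1, shared=False))
    print("shared SFU:  ", thread_cycles(shader, num_alus=4, sfu_latency=1, shared=True))
```

Under these assumptions the shared arrangement saves three SFUs but adds three stall cycles for every special function in the program, which is the overhead that paragraph [0006] says the architecture should minimize.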