Processor for executing highly efficient vliw


A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.

Panasonic Corporation - Osaka, JP
Inventors: Shuichi TAKAYAMA, Nobuo HIGAKI
USPTO Application #: #20120272044 - Class: 712205 (USPTO) - 10/25/12 - Class 712 
Electrical Computers And Digital Processing Systems: Processing Architectures And Instruction Processing (e.g., Processors) > Instruction Fetching





The Patent Description & Claims data below is from USPTO Patent Application 20120272044, Processor for executing highly efficient vliw.


BACKGROUND OF THE INVENTION

(1) Field of the Invention

The present invention relates to a processor with VLIW (Very Long Instruction Word) architecture, and in particular to a processor that executes instructions with comparatively short word length and high code efficiency.

(2) Description of the Prior Art

With the increase in demand for multimedia devices and the miniaturization of electronic circuits in recent years, there has been a growing need for microprocessors that can process multimedia data, such as audio data and image data, at high speed. One kind of processor capable of meeting this need is a processor that uses VLIW architecture, hereinafter referred to as a “VLIW processor”.

VLIW processors include a number of internal operation units and so are able to simultaneously execute a number of operations in one VLIW in parallel. Such VLIW are generated by a compiler that investigates the extent to which parallel processing is possible at the source program level and performs scheduling. For embedded microprocessors used in consumer appliances, however, it is important to suppress the code size of programs, so that 256-bit VLIW, with their high incidence of no-operation instructions (hereinafter referred to as “NOP instructions”) and resulting poor code efficiency, are far from ideal.

One example of a VLIW processor that executes instructions with relatively short word length is disclosed in Japanese Laid-Open Patent Application H09-26878. This technique teaches a data processing apparatus that is a VLIW processor for executing 32-bit instructions that can simultaneously indicate a maximum of two operations.

FIGS. 1A and 1B show the instruction format of the stated technique, with FIG. 1A showing the instruction format for simultaneously indicating two operations and FIG. 1B showing the instruction format for indicating only one operation. This technique aims to improve code efficiency by including a 2-bit value in the format field 410 that shows the number of operations in each instruction and the execution order.

The indication of a maximum of two operations by a single 32-bit instruction, however, does not achieve a sufficient degree of parallelism. There is also the problem of decreases in code efficiency of instructions when performing an operation using a constant that exceeds a given word length. As one example, when a 32-bit constant is split into an upper 16 bits and a lower 16 bits so that it can be set into registers, two 32-bit instructions are required just to indicate an operation using this constant.
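The cost described above can be illustrated with a short sketch. The function names below are hypothetical and do not come from the cited application; the code only shows the upper/lower 16-bit split and why two separate pieces have to be supplied before the full 32-bit constant is available.

```python
def split_constant32(value: int):
    """Split a 32-bit constant into (upper 16 bits, lower 16 bits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return upper, lower


def join_constant32(upper: int, lower: int) -> int:
    """Rebuild the 32-bit constant from its two 16-bit halves."""
    return ((upper & 0xFFFF) << 16) | (lower & 0xFFFF)


upper_half, lower_half = split_constant32(0x12345678)
# Each half would need its own 32-bit instruction in the scheme described
# above, so the constant alone costs two instructions.
rebuilt = join_constant32(upper_half, lower_half)
```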

SUMMARY OF THE INVENTION

In view of the stated problems, it is a first object of the present invention to provide a VLIW processor that executes instructions of comparatively short word length, but which have a high degree of parallelism and a highly efficient code structure so that several operations can be simultaneously indicated. As one example, three or more operations can be indicated by a single 32-bit instruction.

It is a second object of the present invention to provide a VLIW processor for executing instructions of a comparatively short word length that have a structure whereby the overall code efficiency will be comparatively unaffected even when handling constants of comparatively long word length.

The first object can be realized by a VLIW (Very Long Instruction Word) processor that decodes and executes an instruction that has at least two operation fields, of which a first operation field can only include one operation code for specifying an operation type and a second operation field includes a combination of one operation code and at least one operand used in an operation indicated by the second operation field, the VLIW processor including: a first decoding unit for decoding the operation code in the first operation field; a first execution unit for executing an operation indicated by the operation code in the first operation field in accordance with a decoding result of the first decoding unit; a second decoding unit for decoding the operation code in the second operation field; and a second execution unit for executing the operation indicated by the operation code in the second operation field on data which is indicated by the operands in the second operation field, in accordance with a decoding result of the second decoding unit.

By doing so, since at least one operation in the instruction can be indicated by merely inserting an operation code without an explicit indication of an operand, the word length of instructions can be reduced. As a result, a VLIW processor that executes instructions of comparatively short word length, but which have a highly efficient code structure so that several operations can be simultaneously indicated is achieved.

Here, a number of bits occupied by the operation code in the first operation field may be equal to a number of bits occupied by the operation code in the second operation field.

As a result, all operation codes that are included in an instruction will be composed of the same number of bits, which simplifies components such as the decoder circuits.

Here, the instruction may include three operation fields, wherein a third operation field in the three operation fields may occupy a same number of bits as the second operation field and may include a combination of one operation code and at least one operand, the VLIW processor further including: a third decoding unit which decodes, when an operation code is present in the third operation field, the operation code in the third operation field; and a third executing unit for executing an operation indicated by the operation code in the third operation field on data which is indicated by the operands in the third operation field, in accordance with a decoding result of the third decoding unit.

As a result, a VLIW processor with a high degree of parallelism whereby three operations can be simultaneously performed can be achieved.

Here, the first executing unit may control a control flow of a program including the instruction.

As a result, branch operations which do not normally require a large number of bits can be assigned to a short operation field. This means an instruction set with high code efficiency can be defined.

Here, the second executing unit may control transfer of the data that is indicated by the operands included in the second operation field, and the third executing unit may execute an arithmetic logic operation on the data that is indicated by the operands included in the third operation field.

As a result, data transfer to and from an external memory can be indicated by a single operation in an instruction, so that the operand access circuit that should be provided in a VLIW processor can be simplified.

The second object of the present invention can be achieved by a VLIW processor that decodes and executes an instruction that has at least two operation fields, of which a first operation field can only include one of (i) a single operation code for specifying an operation type and (ii) a constant, and a second operation field includes one of (i) a combination of one operation code and at least one operand used in an operation indicated by the second operation field and (ii) a constant, the VLIW processor including: a first decoding unit which decodes, when an operation code is present in the first operation field, the operation code in the first operation field; a first executing unit for executing an operation indicated by the operation code in the first operation field, in accordance with a decoding result of the first decoding unit; a second decoding unit which decodes, when an operation code is present in the second operation field, the operation code in the second operation field; and a second executing unit for executing an operation indicated by the operation code in the second operation field on data which is indicated by the operands in the second operation field, in accordance with a decoding result of the second decoding unit.

With the stated construction, when it is necessary to put meaningless code into an operation field in an instruction, a constant that will be used by a different operation may instead be inserted, so that a VLIW processor can be realized for executing instructions which have a high code efficiency despite having only a short word length.

Here, the instruction also includes a format field including a format code indicating whether only a constant is located in the first operation field and whether only a constant is located in the second operation field, the VLIW processor further including: a format decoding unit for decoding the format code; and a constant storage unit for extracting, when a decoding result of the format decoding unit shows that only a constant is present in at least one of the first operation field and the second operation field, the constant in the instruction and storing the extracted constant.

As a result, constants placed in an operation field can be stored in the constant storage unit for use by an operation in a later instruction, so that decreases in code efficiency can be avoided even when handling constants of a comparatively long word length using instructions of a comparatively short word length.

Here, the format field, the first operation field, the operation code in the second operation field, each operand in the second operation field, the operation code in the third operation field, and each operand in the third operation field may each occupy n bits in the instruction.

With the stated construction, all of the fields that compose an instruction have the same number of bits, which enables the internal circuits of the VLIW processor to be simplified.

Here, a VLIW processor may include: a fetch unit for fetching an L-bit instruction that includes n operation fields; and n operation units which are each associated with a different one of the n operation fields in the fetched instruction and each independently execute an operation indicated in the associated operation field in parallel with each other; the VLIW processor being characterized by the n operation fields not all being a same size, and by L not being an integer multiple of n.

With the stated construction, there is no need for all of the operation fields in an instruction to have the same word length, making it possible to define instructions with high code efficiency. As a result, a VLIW processor that executes instructions of comparatively short word length, but which have a highly efficient code structure so that several operations can be simultaneously indicated is achieved.

Here, n may be 3 and L may be 32.

The stated construction realizes a VLIW processor with a high degree of parallelism whereby three operations that are specified by a single 32-bit instruction can be simultaneously performed.
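As a quick check of the case n = 3 and L = 32, the sketch below uses the field widths given in the abstract (a 4-bit format field, a 4-bit operation field, and two 12-bit operation fields). The function name and the returned dictionary keys are illustrative assumptions only.

```python
def layout_properties(format_bits, operation_field_bits):
    """Summarize an instruction layout given its field widths in bits."""
    n = len(operation_field_bits)
    total = format_bits + sum(operation_field_bits)
    return {
        "n": n,
        "L": total,
        # True when every operation field has the same width
        "uniform_fields": len(set(operation_field_bits)) == 1,
        # Remainder of L divided by n; non-zero means L is not a multiple of n
        "L_mod_n": total % n,
    }


# Widths taken from the abstract: 4-bit format field, then 4 + 12 + 12 bits
# for the three operation fields.
props = layout_properties(4, [4, 12, 12])
```

With these widths the three operation fields are not all the same size, and 32 is not an integer multiple of 3, which matches the characterization in the text.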

Here, a number of operands included in at least one operation field out of the n operation fields may be different to a number of operands in other operation fields in the n operation fields.

With the stated construction, there is no need for every operation field in an instruction to have the same number of operands, so that instruction formats with a high degree of code efficiency can be defined.

Here, the n operation fields may include at least one operation field composed of only an operation code and at least one operation field composed of an operation code and at least one operand.

With the stated construction, the instruction word length is shorter than the case when every operation field in an instruction contains a combination of an operation code and operands, so that a VLIW processor that executes instructions which have a highly efficient code construction can be realized.

As described above, the present invention realizes a VLIW processor that executes instructions of comparatively short word length but which have a highly efficient code structure that allows several operations to be specified by a single instruction. This effect is especially noticeable for embedded processors that process multimedia data.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects, advantages and features of the invention will become apparent from the following description thereof taken in conjunction with the accompanying drawings which illustrate a specific embodiment of the invention. In the drawings:

FIGS. 1A and 1B show instruction formats used under the prior art, with FIG. 1A showing an instruction format where two operations are simultaneously indicated and FIG. 1B showing an instruction format where only one operation is indicated;

FIG. 2A shows the field structure of an instruction that is executed by the processor of the present invention;

FIGS. 2B to 2D show sixteen types of instruction format, with FIG. 2B showing triple operation instructions, FIG. 2C showing twin operation instructions, and FIG. 2D showing single operation instructions;

FIG. 3 is a table showing specific operations that are indicated by the three types of operation code, “cc”, “op1”, and “op2”, that are used in FIGS. 2B to 2D;

FIG. 4 is a block diagram showing the hardware construction of the present processor;

FIG. 5 is a block diagram showing the detailed construction of the constant register 36 of the present processor and the peripheral circuits;

FIGS. 6A to 6D are representations of different methods for storing a constant by the constant register control unit 32 shown in FIG. 5, with FIG. 6A showing the case when the format code is “0” or “1”, FIG. 6B showing the case when the format code is “4”, FIG. 6C showing the case when the format code is “5”, and FIG. 6D showing the case when the format code is “2”, “3”, or “A”;

FIG. 7 is a block diagram showing the detailed construction of the PC unit 33 of the present processor;

FIG. 8 is a flowchart showing a procedure that handles a 32-bit constant;

FIG. 9 shows an example of a program that has the present processor execute the procedure shown in FIG. 8;

FIG. 10 is a timing chart showing the operation of the present processor when executing the program shown in FIG. 9;

FIG. 11 is an example of a program that has the present processor execute a procedure that handles a 16-bit constant;

FIG. 12A shows the field definition of instructions that are executed by a standard processor;

FIG. 12B shows the instruction format of the instructions shown in FIG. 12A;

FIG. 13 shows an example of a program that has a standard processor perform the same procedure as the program shown in FIG. 9;

FIG. 14 shows an example of a program that has a standard processor execute the same procedure as the program shown in FIG. 11;

FIGS. 15A to 15D show modifications to the structure of the instructions executed by the VLIW processor of the present invention; and

FIG. 16 shows a modification to the hardware construction of the present processor to enable the execution of the instruction shown in FIG. 15A.

DESCRIPTION OF THE PREFERRED EMBODIMENT

An embodiment of the processor of the present invention is described below with reference to the figures. In this embodiment, the expression “instruction” refers to a set of code that is decoded and executed by the present processor simultaneously and in parallel, while the expression “operation” refers to a unit of processing, such as an arithmetic operation, a logic operation, a transfer, or a branch, which is executed by the present processor in parallel, as well as to the code which indicates each unit of processing.

Instruction Format

First, the structure of the instructions that are decoded and executed by the present processor will be described. The present processor is a VLIW processor that decodes and executes instructions with a fixed word length of 32 bits.

FIG. 2A shows the field structure of an instruction 50 to be executed by the present processor. FIGS. 2B to 2D, meanwhile, show sixteen instruction formats. Of these, the instruction formats in FIG. 2B simultaneously indicate three operations, the instruction formats in FIG. 2C two operations, and the instruction formats in FIG. 2D a single operation.

This instruction 50 has a fixed word length of 32 bits and is composed of eight 4-bit physical fields shown in order starting from the MSB (Most Significant Bit) as P0.0 field 51, P1.0 field 52, . . . P3.2 field 58 in FIG. 2A. Of these, the range from the P2.0 field 53 to the P2.2 field 55 is called the first operation field 59, while the range from the P3.0 field 56 to the P3.2 field 58 is called the second operation field 60.
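The field layout just described can be sketched in software as follows. This is only an illustration of the bit positions implied by the MSB-first ordering; the class and function names are hypothetical and the actual processor performs this separation in hardware.

```python
from typing import NamedTuple, Tuple


class Fields(NamedTuple):
    p0_0: int              # 4 bits, field 51 (most significant nibble)
    p1_0: int              # 4 bits, field 52
    first_op_field: int    # 12 bits, P2.0 to P2.2 (operation field 59)
    second_op_field: int   # 12 bits, P3.0 to P3.2 (operation field 60)


def split_instruction(word: int) -> Fields:
    """Split a 32-bit instruction word into its fields, MSB first."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction must be a 32-bit value")
    return Fields(
        p0_0=(word >> 28) & 0xF,
        p1_0=(word >> 24) & 0xF,
        first_op_field=(word >> 12) & 0xFFF,
        second_op_field=word & 0xFFF,
    )


def nibbles(field12: int) -> Tuple[int, int, int]:
    """Split a 12-bit operation field into its three 4-bit physical fields."""
    return ((field12 >> 8) & 0xF, (field12 >> 4) & 0xF, field12 & 0xF)


fields = split_instruction(0x4ABCDEF1)
p2_0, p2_1, p2_2 = nibbles(fields.first_op_field)
```

For the word 0x4ABCDEF1, the P0.0 field is 0x4, the P1.0 field is 0xA, the first operation field is 0xBCD, and the second operation field is 0xEF1.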

In FIGS. 2B to 2D, the legend “const” indicates a constant, and depending on the operation in which it is used, this can be a numeric constant or a character constant such as an immediate, an absolute address, or a displacement. The legend “op” represents an operation code that indicates an operation type, while the legend “Rs” indicates the register used as the source operand, “Rd” the register used as the destination operand, and “cc” an operation code indicating a branch operation that uses the stored value of a specialized 32-bit register provided in the present processor (the constant register 36 shown in FIG. 4) as the absolute address or relative address (displacement) of a branch destination.

The numerical values given directly after the codes described above show values that are used in the operation in either the first operation field 59 or the second operation field 60. As one example, for the instruction format with the format code “6”, the 4-bit constant “const1” located in the P1.0 field 52 and the 4-bit constant “const1” located in the P2.1 field 54 are combined to form an 8-bit constant that is the source operand corresponding to the operation code “op1” of the first operation field 59.

The constant “const” which is not appended with a number represents a constant to be stored in the specialized 32-bit register provided in the present processor (the constant register 36 shown in FIG. 4). As one example, for the instruction format with the format code “0”, the 4-bit constant “const” located in the P1.0 field 52 implies the constant that is to be stored in the constant register 36 which is implicitly indicated.

FIG. 3 shows specific examples of the operations that can be indicated by the three kinds of operation code “cc”, “op1”, and “op2” given in FIGS. 2B to 2D. These operations are described in detail below.

The 4-bit operation code “cc” indicates one out of sixteen types of branch instruction. Each branch instruction is specified as a branch condition and a branch format. Examples of branch conditions include “equal to ('eq')”, “not equal to ('neq')”, and “greater than ('gt')”. The branch format can be a format where the stored value of the constant register 36 serves as the absolute address of the branch destination (denoted by having no “i” attached to the instruction mnemonic), or a format where the stored value of the constant register 36 serves as a relative address (denoted by having “i” attached to the instruction mnemonic). As one example, the operation code “eq” represents an operation that branches to a destination indicated through absolute addressing when a preceding comparison finds the compared values to be equal, while the operation code “eqi” represents an operation that branches to a destination indicated through relative addressing when a preceding comparison finds the compared values to be equal.
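The mnemonic convention and the two branch formats can be sketched as below. Several points are assumptions made only for illustration: the function names are hypothetical, only the three conditions named in the text are recognized, the relative address is taken to be an offset from the address of the branch instruction itself, and addresses are assumed to wrap at 32 bits. The text does not give the 4-bit encodings of the sixteen codes, so none are modeled.

```python
# Only the conditions named in the text; the full set of sixteen is not listed.
KNOWN_CONDITIONS = {"eq", "neq", "gt"}


def parse_branch_mnemonic(mnemonic: str):
    """Return (condition, is_relative) for a branch mnemonic such as 'eq' or 'eqi'."""
    if mnemonic in KNOWN_CONDITIONS:
        return mnemonic, False
    if mnemonic.endswith("i") and mnemonic[:-1] in KNOWN_CONDITIONS:
        return mnemonic[:-1], True
    raise ValueError("unknown branch mnemonic: " + mnemonic)


def branch_target(pc: int, constant_register: int, is_relative: bool) -> int:
    """Compute the branch destination from the constant register value.

    Assumption: a relative branch adds the register value to the address
    of the branch instruction, modulo 2**32.
    """
    if is_relative:
        return (pc + constant_register) & 0xFFFFFFFF
    return constant_register & 0xFFFFFFFF


condition, is_relative = parse_branch_mnemonic("eqi")
target = branch_target(0x1000, 0x20, is_relative)
```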

The 4-bit operation code “op1” can be used to indicate an arithmetic logic operation, such as any of an “add” (addition), a “sub” (subtraction), a “mul” (multiplication), an “and” (logical AND), or an “or” (logical OR), or an operation that is an inter-register transfer, such as any of a “mov” (transfer of word (32-bit) data), a “movh” (transfer of halfword data), or a “movb” (transfer of one byte data). The 4-bit operation code “op2” can be used to indicate any of the arithmetic logic operations or inter-register transfers that can be indicated by the operation code “op1”, but can also be used to indicate a register-memory transfer operation such as an “ld” (load of one word data from memory into registers) or an “st” (store of one word data into memory from registers).

The characteristic features of the fields 51, 52, 59, and 60 shown in FIG. 2A are described below.

The P0.0 field 51 holds a 4-bit format code that specifies the format of the instruction 50. More specifically, this P0.0 field 51 specifies one of the sixteen instruction formats shown in FIGS. 2B to 2D.

The P1.0 field 52 is a field that holds a constant or an operation code for a branch operation. When a constant is located in the P1.0 field 52 (such as in the instructions with the format codes “0”, “1”, and “4” to “9”) there are cases where the constant is to be stored in the constant register 36 (such as in the instructions with the format codes “0”, “1”, “4”, and “5”), and cases where the constant forms one part of the operand in the first operation field 59 or the second operation field 60 (such as in the instructions with the format codes “5”, “7”, “8”, “9”, and “B”). When the constant in the P1.0 field 52 is to be stored in the constant register 36, there are cases where only this 4-bit constant is stored (such as in the instructions with the format codes “0” and “1”), and cases where this constant is stored together with a 12-bit constant located in either the first operation field 59 or the second operation field 60 (such as in the instructions with the format codes “4” and “5”).

When the operation code “cc” for branching is given in the P1.0 field 52 (such as in the instructions with the format codes “2”, “3”, and “A”), this indicates a branch operation that uses the stored value of the constant register 36 as the absolute address or relative address (displacement) of a branch destination.

The first operation field 59 holds either a constant or a combination of (a) an operation code for indicating an operation (such as an arithmetic logic operation or inter-register transfer) that does not involve data transfer between the present processor and the periphery (memory), and (b) source and destination operands for the operation.

The second operation field 60 can hold the same content as the first operation field 59 described above, but can also alternatively hold a combination of (a) an operation code for indicating an operation (such as memory-register transfer) that involves data transfer between the present processor and the periphery and (b) operands for the operation.

The above assignment of different operation types to certain fields rests on the premises for the present von Neumann-type processor whereby it is not necessary to process two or more branch operations simultaneously, and that only one input/output port (the operand access unit 40 shown in FIG. 4) for transferring operands is provided between the present processor and the periphery (memory).
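The placement rule for the two 12-bit operation fields can be expressed as a small check, sketched here with the mnemonics named in the description of FIG. 3. The set names and the function are hypothetical, and the operation lists are only the examples given in the text, not a complete instruction set.

```python
# Example mnemonics taken from the description of FIG. 3.
ALU_OPS = {"add", "sub", "mul", "and", "or"}
REGISTER_TRANSFER_OPS = {"mov", "movh", "movb"}
MEMORY_TRANSFER_OPS = {"ld", "st"}

# First operation field 59: no data transfer with memory.
FIRST_FIELD_OPS = ALU_OPS | REGISTER_TRANSFER_OPS
# Second operation field 60: everything the first field allows, plus memory transfers.
SECOND_FIELD_OPS = FIRST_FIELD_OPS | MEMORY_TRANSFER_OPS


def can_place(op: str, field: int) -> bool:
    """Return True if operation `op` may appear in operation field 1 or 2."""
    if field == 1:
        return op in FIRST_FIELD_OPS
    if field == 2:
        return op in SECOND_FIELD_OPS
    raise ValueError("field must be 1 or 2")
```

Because only the second field may carry a load or store, at most one memory access can be indicated per instruction, which is consistent with the single operand access port mentioned above.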

The instruction formats shown in FIGS. 2B to 2D have the following characteristic features.

First, by focusing on the constant “const”, it can be seen that there are the following three types of instruction for storing a constant in the constant register 36.

(1) When the Format Code is “0” or “1”:

In these instructions, the 4-bit constant located in the P1.0 field 52 is stored in the constant register 36.

(2) When the Format Code is “4”:

In this instruction, a 16-bit constant located in the P1.0 field 52 to P2.2 field 55 is stored in the constant register 36.

(3) When the Format Code is “5”:

In this instruction, a 16-bit constant located in the P1.0 field 52 and the P3.0 field 56 to P3.2 field 58 is stored in the constant register 36.
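The three cases above can be sketched as a function that extracts the constant destined for the constant register 36 from an instruction word. The function name is hypothetical, and the sketch assumes that the P1.0 field supplies the most significant four bits of a 16-bit constant, which this text does not state. How the extracted bits are merged with the existing contents of the register (FIGS. 6A to 6D) is not modeled.

```python
def constant_for_register(word: int):
    """Return (constant, width_in_bits) to be stored in the constant register,
    or None if the format code does not store a constant.

    Assumption: for 16-bit constants, the P1.0 field is the upper nibble.
    """
    fmt = (word >> 28) & 0xF            # P0.0 field 51: format code
    p1_0 = (word >> 24) & 0xF           # P1.0 field 52
    first_field = (word >> 12) & 0xFFF  # P2.0 to P2.2
    second_field = word & 0xFFF         # P3.0 to P3.2

    if fmt in (0x0, 0x1):
        # Case (1): only the 4-bit constant in P1.0
        return p1_0, 4
    if fmt == 0x4:
        # Case (2): P1.0 followed by P2.0 to P2.2
        return (p1_0 << 12) | first_field, 16
    if fmt == 0x5:
        # Case (3): P1.0 followed by P3.0 to P3.2
        return (p1_0 << 12) | second_field, 16
    return None
```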

Secondly, for the present processor, a maximum of three operations can be indicated by a single instruction, and in this case, as can be seen from the triple operation formats shown in FIG. 2B, either of the following combinations of operation types can be used.

(1) One operation that sets a 4-bit constant into the constant register 36 and two standard operations (when the format code is “0” or “1”).

(2) One operation that performs branching using the value set in the constant register 36 as an absolute address or a relative address and two standard operations (when the format code is “2” or “3”).

As described above, the instructions of the present processor have a highly efficient field structure that enables a maximum of three operations to be simultaneously indicated by a single 32-bit instruction.

Hardware Construction of the Processor

The hardware construction of the present processor is described below.

FIG. 4 is a block diagram showing the hardware construction of the processor of the present invention. As described above, this processor is a VLIW processor that can execute a maximum of three operations in parallel. The construction of the processor can be roughly divided into an instruction register 10, a decoder unit 20, and an execution unit 30.

The instruction register 10 is a 32-bit register that stores one instruction that has been sent from the instruction fetch unit 39.



Patent Info
Application #: US 20120272044 A1
Publish Date: 10/25/2012
Document #: 13543437
File Date: 07/06/2012
USPTO Class: 712205
Other USPTO Classes: 712E09016, 712E09028
International Class: 06F9/30
Drawings: 17

