01/25/07 | #20070022063 | USPTO Class 706

Neural processing element for use in a neural network

USPTO Application #: 20070022063
Title: Neural processing element for use in a neural network
Abstract: A neural processing element for use in a modular neural network is provided. One embodiment provides a neural network comprising an array of autonomous modules (300). The modules (300) can be arranged in a variety of configurations to form neural networks with various topologies, for example, with a hierarchical modular structure. Each module (300) contains sufficient neurons (100) to enable it to do useful work as a stand alone system, with the advantage that many modules (300) can be connected together to create a wide variety of configurations and network sizes. This modular approach results in a scaleable system that meets increased workload with an increase in parallelism and thereby avoids the usually extensive increases in training times associated with unitary implementations.
Agent: David S. Resnick - Boston, MA, US
Inventor: Neil Lightowler
USPTO Application #: 20070022063 - Class: 706015000 (USPTO)
Related Patent Categories: Data Processing: Artificial Intelligence, Neural Network
The Patent Description & Claims data below is from USPTO Patent Application 20070022063.

[0001] The present invention relates to neural networks and more particularly, but not exclusively, to an apparatus for creating, and a method of training, a neural network.

[0002] Artificial Neural Networks (ANNs) are parallel information processing systems inspired by what is known about the brain and the way it functions. They offer a computing mechanism that differs significantly from the conventional serial computer systems, not simply because they process information in a parallel manner but because they do not require explicit information about the problems they are required to tackle; instead they learn by example. However, rather than being designed and built as computing platforms, they are predominantly simulated on conventional serial computing systems in software. For small networks this approach is generally sufficient, especially when considering the improvement in processing speed that has been achieved in recent years. However, when real-time systems and large networks are required, the computational burden often requires other approaches.

[0003] The basic neuron does very little computation on its own but when large numbers of neurons are used, the total computation is often such that even the fastest of serial computers is unable to train a network in a reasonable time scale. The problem is exacerbated because, the larger the network, the more training steps are required and, consequently, the amount of computation required increases exponentially with increasing network size. There is also the added problem of inter-neuron communication, which also increases with increasing network size and must be taken into account when attempting to implement networks on parallel systems, because this communication can become a bottleneck, preventing substantial speedups for parallel implementations.

[0004] When considering parallel implementation of ANNs, it is important to consider how the system is to be parallelised. This is dependent not only on the underlying architecture/technology but also the algorithm and sometimes on the intended application itself. However, there is often more than one approach for any particular architecture and an understanding of the consequences of partitioning strategies is of great value. When using multi-processor systems, there are two basic approaches to parallelising the Self-Organising Map (SOM) algorithm; either the functionality of the network can be partitioned such that one processor may perform only one aspect of the functionality of a neuron but performs this function for a large number of neurons, or the network can be partitioned so that a set of neurons (a set typically consists of one or more neurons) is implemented on each processor in the system.
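The second partitioning strategy, in which each processor holds a set of neurons, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the use of squared Euclidean distance, and the contiguous split of neurons are all hypothetical choices, and the "processors" are simulated sequentially rather than run in parallel.

```python
# Hypothetical sketch of neuron partitioning for a SOM winner search.
# Each simulated processor holds a slice of the neurons and reports its
# local best match; a final reduction step picks the global winner.

def squared_distance(a, b):
    """Squared Euclidean distance (the metric is an assumption here)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def local_winner(weights, offset, x):
    """Return (distance, global_index) of the closest neuron in one partition."""
    best = min(range(len(weights)), key=lambda i: squared_distance(weights[i], x))
    return squared_distance(weights[best], x), offset + best

def partitioned_winner(weights, x, n_processors):
    """Split neurons into contiguous partitions and reduce the local winners."""
    size = -(-len(weights) // n_processors)  # ceiling division
    candidates = [
        local_winner(weights[start:start + size], start, x)
        for start in range(0, len(weights), size)
    ]
    # Smallest distance wins; ties go to the lower neuron index.
    return min(candidates)

weights = [[0, 0], [10, 10], [3, 4], [7, 1]]
winner = partitioned_winner(weights, [4, 4], n_processors=2)
print(winner)
```

Only one (distance, index) pair per partition has to cross the processor boundary in this sketch. Setting n_processors equal to the number of neurons gives one neuron per processor, and a smaller value places several neurons on each processor.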

[0005] Partitioning functionality of the network is an approach that has been used with transputer systems and normally results in an architecture known as a systolic array. The basic principle of the systolic array is that the traditional single processing element is replaced by an array of processing elements with inputs and outputs only occurring at each end of the array. The processing that would traditionally be carried out by a single processor is then divided amongst the processor array. Normally, each processor would perform some of the functionality of the network and that function would only be performed by that processor. The array then acts as a pipeline of processors, with data flowing in at one end and results flowing out of the other. Unfortunately, this approach is generally only appropriate for moderately sized networks because the inter-processor communication overheads become unmanageable very quickly and adding more processors does little or nothing to alleviate the problem.

[0006] When partitioning the SOM wherein one or more neurons are implemented on an individual processor, the communication overhead is lessened when compared to approaches that partition functionality but can still become a bottleneck as network size increases. Coarse grain parallelism is the term generally associated with a number of neurons implemented on each processor whereas fine grain parallelism is the term used when only a single neuron is implemented on individual processors. The communication overhead tends to become more prominent as the number of neurons per processor is reduced because traditional processors are implemented on separate devices and communication between devices has much greater overheads than communication amongst neurons on the same device. Fine grain parallelism normally results in a Single Instruction stream Multiple Data stream (SIMD) system and is suited to massively parallel architectures such as the Connection Machine.

[0007] If the implementation medium is to be in hardware such as very large scale integration (VLSI) or similar, then it may be possible to increase the level of parallelism to the extent of implementing each weight in parallel. However, this approach does little to improve overall parallelism of the system because only part of the functionality is performed at the weight level and consequently, such an approach does not lead to the most effective use of resources. The approach adopted is fine grain parallelism with a single processing element performing the functionality of a single neuron. To overcome some of the inter-processor communication problems it is suggested that several processors be implemented on a single device with strong short range communications.

Neural Network Implementations

[0008] In an attempt to overcome the limitations of general purpose parallel computing platforms, some researchers attempted to develop specialised neural network computers. Such approaches attempt to develop architectures best suited to neural networks but are normally based on the traditional parallel architectures listed above. Modifications to these basic architectural approaches have often been used in an attempt to overcome some of the traditional problems such as inter-processor communication. Others have attempted to modify existing parallel systems such as the Connection Machine to improve their usefulness as neurocomputing architectures. Some have even considered reconfigurable neurocomputer systems based on Field Programmable Gate Array (FPGA) technology but most neurocomputer systems, while useful for investigating the possibilities of ANNs, are normally too large and expensive to be used for many applications.

[0009] Driven mainly by the application domain, researchers undertook to investigate direct hardware implementation of ANNs, and as biological neural systems appear to be analogue, there was a bias towards analogue implementation. Indeed, analogue implementation of ANNs appears to be beneficial in some ways, e.g. very little hardware is required for the memory elements of such a system. However, there are also many problems with analogue implementation of ANNs because the fundamental building block of such systems is the capacitor. Due to the shortcomings of the capacitor, such as its tendency to suffer from leakage, a variety of schemes were developed to overcome these weaknesses.

[0010] Macq et al proposed an analogue approach to implementation of the SOM based on the use of currents to represent weight values. Such an approach may provide a mechanism for generating high density integration due to the small number of transistors required for each neuron, but this approach uses analogue synaptic weights based on current copiers, the principal component of which is the capacitor, which is prone to leakage. These leakage currents continuously modify the value stored by the capacitor thereby necessitating some form of refreshment to maintain reasonable precision of weight values. The main cause of this leakage is the reverse biased junction. Their proposed method of refreshment uses a converter to periodically refresh each synaptic weight. This is achieved by reading the current memorised by each cell using successive approximation and then writing back to the cell the next upper reference current. It is claimed that this approach allows for on chip learning. However, for the gain factor to reduce with time, as prescribed by Kohonen, adjustments need to be made to the reset signal, and for the neighbourhood to reduce with time the period of one of the timing circuit clocks must be adjusted. The impression given is that these changes would require manual intervention. The leakage current of capacitors also appears to be the main factor that would restrict the maximum number of memory cells in this design.

[0011] A charge based approach to implementation was suggested in "A Charge-Based On-Chip Adaptation Kohonen Neural Network" which claims that such an approach would lead to low power dissipation and compact device configurations. The approach uses switched capacitor circuits to store the weights, and the adaptive weight synapses utilise parasitic capacitances between two adjacent gates of the switched capacitor circuit to determine the learning rate. This will give a fixed learning rate, which will be different for each device manufactured due to the difficulties in manufacturing such components to exactly the same parameters from device to device. Weight integrity is also a potential problem area because, as with most analogue implementations of neural networks, weight values are stored by capacitors which have difficulty maintaining the charge held, and consequently the weight value. The authors of this paper attempt to address this issue but, for weights not being updated during a cycle, they simply regarded it as a forget effect. Unfortunately, as the number of neurons on the device increases, so too does the common node parasitic capacitance. This will require the size of the storage electrode of each neuron to be increased as network size increases to compensate.

[0012] Perhaps the most successful analogue implementations are those which utilise a pulse stream approach. It has long been known that biological neural systems use pulses to communicate between cells and simple oscillating circuits can be implemented in VLSI relatively easily. Unfortunately, the problem of analogue memory still overshadows such approaches. The main advantage of pulse stream approaches is that hardware requirements for the arithmetic units are very low compared to the equivalent digital implementation; in particular, multipliers, which can be implemented in an analogue fashion using only three transistors, require many gates in digital systems.

[0013] The problems of implementing digital multipliers and storing weight values provide two reasons that most digital implementations of the SOM have been restricted to small network sizes and are often only coprocessors rather than fully parallel implementations. The other main factor that has made a significant contribution to limiting network size is the inter-neuron communication overhead which increases exponentially with network size. Consequently, most fully digital implementations of the SOM require some modification to Kohonen's original algorithm, e.g. Ienne et al suggest two alternative modifications to the SOM algorithm for digital implementation. Van den Bout et al also propose an all digital implementation of the SOM and investigate a rapid prototyping approach towards neural network hardware development. This is facilitated by the use of Xilinx field programmable gate arrays (FPGAs) which provide a flexible platform for such endeavours and speed up construction time compared to VLSI development. Their approach uses stochastic signals to allow pseudo-analogue computation to be carried out using space efficient digital logic. A Markovian learning algorithm is used to simplify that suggested by Kohonen and the Manhattan distance metric is used in place of Euclidean distance to simplify distance calculations. Their approach towards the implementation of the SOM is later reiterated when they describe their VLSI implementation, TInMann.

[0014] Saarinen et al propose a fully digital approach to the implementation of Kohonen's SOM in order to create a neural coprocessor for PC based systems. Their approach uses three Xilinx XC3090 FPGAs to create 16 processing elements, and RAM to store both weight and input vector values. The host computer initialises the random weight values, loads up the input vector values and sets the network parameters (i.e. network size, number of inputs, gain factor and number of training steps). After the host computer has set these parameters the coprocessor system then trains the network according to the pre-specified parameters until training is complete. The architecture of the system consists of three main elements: a distance and update unit (DUU), a distance comparator unit (DCU) and an address control unit (ACU), each implemented on a separate FPGA, which is clearly a partitioning of the network functionality and is not likely to be scaleable due to the communication overheads. In addition, this implementation does not implement the standard SOM but a rather limited, one dimensional version.
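To make the host-side parameters listed above more concrete, the following is a minimal software sketch of training a one dimensional SOM. It is not the coprocessor design of Saarinen et al: the function name, the Manhattan distance metric, the fixed neighbourhood radius, the constant gain and the seeded random initialisation are all assumptions made for illustration.

```python
import random

def train_som_1d(inputs, network_size, gain, steps, radius=1, seed=0):
    """Train a one dimensional SOM; every parameter name here is illustrative."""
    rng = random.Random(seed)
    n_inputs = len(inputs[0])
    # Host-side initialisation: random weights in [0, 1).
    weights = [[rng.random() for _ in range(n_inputs)] for _ in range(network_size)]
    for step in range(steps):
        x = inputs[step % len(inputs)]  # cycle through the input vectors
        # Distance phase: Manhattan distance from every neuron to the input
        # (the metric is an assumption, not taken from the source).
        dists = [sum(abs(w - xi) for w, xi in zip(ws, x)) for ws in weights]
        # Comparison phase: the neuron with the smallest distance wins.
        winner = dists.index(min(dists))
        # Update phase: move the winner and its neighbours on the line
        # toward the input. Gain and radius stay constant in this sketch,
        # whereas Kohonen's algorithm reduces both over time.
        lo = max(0, winner - radius)
        hi = min(network_size, winner + radius + 1)
        for j in range(lo, hi):
            weights[j] = [w + gain * (xi - w) for w, xi in zip(weights[j], x)]
    return weights

inputs = [[0.0, 0.0], [1.0, 1.0]]
trained = train_som_1d(inputs, network_size=16, gain=0.5, steps=200)
print(len(trained), len(trained[0]))
```

The four arguments network_size, gain, steps and the length of each input vector correspond to the parameters the paragraph says the host computer sets before training begins.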

[0015] While more obvious than many of the digital implementation approaches used, that of Saarinen is rather typical in that it partitions functionality. Most digital implementations appear to do the same, but they maintain the whole system on a single device. The rationale behind this is that when using digital multipliers, vast resources are normally required to implement them, so it is often more effective to have a limited number but to make them fast. To avoid using excessive resources for the Modular Map implementation, very limited reduced instruction set computer (RISC) processors are suggested that use an alternative approach to multiplication which will only require a fraction of the resources needed to implement a traditional digital multiplier. In addition, while minor modifications to Kohonen's algorithm are made, its basic operation and two dimensional nature are maintained.

[0016] The paper by Ruping et al, presented simultaneously with the paper by Lightowler et al, presents a fully digital hardware implementation of the SOM which incorporates some of the same ideas as does the Modular Map design. To facilitate hardware implementation Ruping et al also use Manhattan distance instead of Euclidean distance and the gain factor is restricted to negative powers of two. A system comprising 16 devices is outlined and performance information is presented in terms of the operating speed of the system etc. Each of their devices implements 25 neurons as separate processing elements and allows for network size to be increased by using several devices. However, these devices only contain neurons; there is no local control for the neurons on a device. An external controller is required to interface with these devices and control the actions of their constituent neurons. Consequently, these devices are not autonomous as are Modular Maps, and only lateral expansion, which creates a Single Instruction stream Multiple Data stream (SIMD) architecture, has been considered as an approach towards creating larger network sizes.
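The reason these two simplifications suit digital hardware can be illustrated with a short sketch. Manhattan distance needs only subtraction and addition, and a gain of 2**-k turns the weight update into a bit shift, so no multiplier is required. The function names and the integer weight representation below are assumptions for illustration and are not taken from either paper.

```python
def manhattan(w, x):
    """Manhattan distance: absolute differences summed, no multiplication."""
    return sum(abs(wi - xi) for wi, xi in zip(w, x))

def shift_update(w, x, shift):
    """Move integer weights toward x with gain 2**-shift.

    The multiplication by the gain becomes an arithmetic right shift.
    Python's >> rounds toward negative infinity for negative differences.
    """
    return [wi + ((xi - wi) >> shift) for wi, xi in zip(w, x)]

w = [100, 20]
x = [60, 28]
updated = shift_update(w, x, 2)  # gain of 1/4
print(manhattan(w, x), updated, manhattan(updated, x))
```

With a shift of 2 each weight moves a quarter of the way toward the input, so the Manhattan distance to the input shrinks after the update.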

[0017] There have also been some commercial hardware implementations of ANNs, the number of which has been steadily growing over the last few years. They generally offer a speedup of around an order of magnitude compared to implementation on a PC alone but are predominantly coprocessors rather than stand alone systems and are not normally scaleable. However, while some of these implementations are only able to implement a single ANN paradigm, most use digital signal processing (DSP) chips, transputers or standard microprocessors, thereby allowing the system to be programmable to some extent and implement a range of standard ANNs.

[0018] The commercially available approach to implementation (i.e. accelerator cards) offers the smallest speedup of the main implementation approaches but can still offer a significant speedup compared to simulation on standard PC systems and the growing number available on the market suggests that they are useful for a range of applications. General purpose multiprocessor systems offer a further speedup but large scale systems normally have significant communication overheads. Some researchers have attempted to modify standard multiprocessor architectures to improve their application to ANNs and have increased achievable speedup by doing so but while these systems have been useful in ANN research, they are not fully scaleable and require significant financial outlay. The greatest speedups for ANN implementations have been achieved by dedicated neural network chips but the problem again has been that these systems are limited to relatively small scale systems. As an approach towards developing scaleable neural network systems, there have been some attempts at developing modular systems.

Modular System

[0019] There is considerable evidence to suggest that biological neural systems have a modular organisation at various levels. At a macroscopic level, for example, it has been found that some people have no connection between the left and right hemispheres of the brain, which does bring with it certain problems, but they are still able to function in a near to normal way, which shows that each hemisphere is able to function independently. However, it has also been noted that, while each hemisphere is almost identical physiologically, they specialise in functionality. When one begins to look closer at the cerebral hemisphere one finds that different functionality is found at different regions, even though these regions show a modular organisation and are made up of geometrically defined repetitive units. Research by Murre and Sturdy also supports this view of a modular organisation in their attempt at a quantitative analysis of the brain's connectivity. It is of interest that this modularity is also seen in relation to the topological maps formed in the neo-cortex, e.g. somatosensory maps for different parts of the body are found at different parts of the cerebral cortex and similar maps for other senses such as sound (tonotopic maps) are found in different regions again. Such evidence suggests that while the concept of topological maps which form the basis for Kohonen's self organising map is valid, the brain contains many of these maps. Consequently, it is reasonable to suggest that a modular approach should be considered when attempting to develop scaleable, and particularly large scale, implementations of the SOM.

[0020] Researchers such as Happel and Murre have approached neural network design as an evolutionary process using genetic algorithms to determine network architectures. Their investigations into the design of modular neural networks using the CALM module are intended as a study to assist with understanding of the relationship between structure and functionality in the brain but they present some findings that may also assist with the development of information processing systems. They found that the best performing network architectures derived with their approach reproduced characteristics of the vision system with the organisation of coarse and fine processing of stimuli in different pathways. They also present a range of evidence that supports the belief that the brain is highly organised and modular in its architecture.

[0021] The basic premise on which modular neural network systems are developed is that the computation performed by the network is decomposed into two or more separate modules which operate as individual entities. Not only can such approaches improve scaleability but considerable savings can be made on the learning times required for large networks, which are often rather slow. In addition, the generalisation abilities of large networks are often poor, whereas systems composed of several modules do not appear to suffer from this drawback. Research carried out by Jacobs et al using modules composed of Multi Layer Perceptrons (MLPs) used competition to split the input space into overlapping regions. Their work found that the modular approach had much improved training times compared to single large networks and gave better performance, especially where there were discontinuities within classes in the original input space. They also found, when building hierarchies of such systems, an architecture they refer to as a hierarchical mixture of experts, that the results yielded a probabilistic approach to decision tree modelling. Others, such as Hansen and Salamon, have considered ensembles of neural networks as a means of improving classification. Essentially the ensemble approach involves training several networks on the same task to achieve a more reliable output.
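The ensemble idea mentioned at the end of this paragraph can be sketched in a few lines. The source only says that several networks are trained on the same task, so the majority vote used here to combine their outputs is an assumption, and the three threshold functions are hypothetical stand-ins for independently trained networks.

```python
from collections import Counter

def ensemble_vote(classifiers, x):
    """Return the label chosen by most ensemble members for input x."""
    votes = [clf(x) for clf in classifiers]
    label, _count = Counter(votes).most_common(1)[0]
    return label

# Hypothetical stand-ins for three separately trained networks that
# disagree slightly about where the class boundary lies.
classifiers = [
    lambda x: x > 0.5,
    lambda x: x > 0.4,
    lambda x: x > 0.7,
]

decision = ensemble_vote(classifiers, 0.6)
print(decision)
```

For the input 0.6 two of the three members vote True, so the ensemble outputs True even though one member disagrees.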

[0022] A modular approach to implementation of the SOM is a valid alternative to the more traditional approaches which attempt to create single networks. Other authors such as Helge Ritter have also presented research supporting a modular approach for the SOM. There also appears to be a sound basis for modularity in biological systems and, while no attempt is being made to replicate biological systems, they are nevertheless the initial inspiration for artificial neural networks. It is also pertinent to consider that, while Man has only been attempting to develop computing systems for a matter of centuries, natural evolution had produced a range of biological computers long before Man was on this earth. Even with the latest of modern technology, Man is unable to create computers that surpass the computing abilities of biological systems, so it is suggested that Man should continue to learn from nature.
