Seria ELECTRONICĂ și TELECOMUNICAȚII TRANSACTIONS on ELECTRONICS and COMMUNICATIONS

Tom 49(63), Fascicola 1, 2004

# Adaptive Interfaces Based on FPGA Implemented Artificial Neural Network

Ștefan Oniga<sup>1</sup>, Virgil Tiponuț<sup>2</sup>, Atilla Buchman<sup>1</sup>, Daniel Mic<sup>1</sup>

Abstract - The goal of this work is to build smart interfaces with learning and adaptive capability. The key elements of the learning and adaptive behavior are artificial neural network (ANN) blocks, implemented in an FPGA using the System Generator tool for Simulink developed by Xilinx Inc. This tool allows the easy generation of Hardware Description Language (HDL) code from a system representation in Simulink. This VHDL design can then be synthesized for implementation in the Xilinx family of FPGA devices. The off-chip learning task is performed using Matlab, and the ANN's weights are transferred automatically from the Matlab workspace to the weights memory.

Keywords: smart, neural network, adaptive, learning, FPGA, prosthetic

# I. INTRODUCTION

Despite the efforts of the many universities and research organizations worldwide that are involved in designing and building natural user interfaces, the results remain insufficient because these interfaces lack adaptation and learning capabilities. Using neural networks to add learning and adaptive behavior to smart sensors is essential, and FPGAs offer an easy and attractive route to hardware implementation.

Possible applications include intelligent computer peripherals that enable people with any kind of handicap to use a computer and communicate, as well as any kind of industrial or domestic device with learning and adaptive capabilities.

The goal of this work was to develop a hardware-software codesign platform enabling the fast development of smart interfaces through the addition of sensors, hardware modules that can be easily connected, and VHDL modules that manage the sensors and basic behaviours (e.g. feature extraction, pattern recognition). Using this framework, developing a new smart device requires only the design and synthesis of new VHDL drivers for the new sensors and of new application-specific ANNs. The platform is based on low-cost general-purpose FPGA boards and needs no hardware design. This paper presents a new method for the hardware implementation of artificial neural networks (ANNs) in field-programmable logic devices (FPGAs) that can be used in smart sensor development. It also permits the development of ANN-specific modules and libraries for the System Generator tool.

The main applications for such smart devices, with embedded intelligence hidden from the user, are in the prosthetic, automotive, "domotic" and automation fields, where the trend is to produce easy-to-use devices.

# II. THE HARDWARE-SOFTWARE CODESIGN PLATFORM

Smart devices must use multisensorial interfaces with natural, adaptive behavior and learning capability. The key to the adaptive and learning behavior is a set of neural networks described in VHDL. Any new smart device application should use these ANN modules to add adaptive and learning capability.

The platform, shown in Fig. 1, was developed to provide a fast prototyping environment for adaptive interfaces and to facilitate the use of codesign techniques.

Other requirements for the development platform are:

- Exchangeability of sensors, thanks to common interfaces for any class of VHDL drivers
- Reusability of developed VHDL components
- Reduced time to market

The ADuC812 microcontroller is used to implement the data acquisition system and to adapt the sensor signals to the input requirements of the neural network. The reconfigurable device (a Xilinx XC2S50) is used to implement the neural networks and the other logic blocks of the application. The System Generator tool for Simulink, developed by Xilinx Inc., allows the easy generation of Hardware Description Language (HDL) code from a system representation in Simulink. This VHDL design can then be synthesized for implementation in the Xilinx family of FPGA devices.

<sup>1</sup> North University of Baia Mare, Electrotechnical Department, Dr. V. Babes Str., Nr. 62/A, 430083 Baia Mare, e-mail: onigas@ubm.ro

<sup>2</sup> Electronics and Telecommunications Faculty, Applied Electronic Department, Bd. V. Pârvan Nr. 2, 300223 Timişoara, e-mail: tiponut@etc.utt.ro



Fig. 1 The codesign platform

The developed framework allows the device to communicate with a PC, either to perform the off-chip training task or to transfer data for analysis. The software is designed to manage the communication protocol with Matlab via the parallel port.

The platform could be used in three ways:

1. Neural network simulation and learning of the weights,
2. Network design and hardware implementation using the System Generator tool for Simulink and Xilinx ISE,
3. Normal use of the network (propagation phase).

# III. NEURAL NETWORK DESIGN

As mentioned above, with this method neural networks are built from specific modules created with blocks from the Xilinx Blockset.

Fig. 2 shows the neural network model in Simulink. The main element of a neuron is the multiply-accumulate (MAC) block. This block can be implemented efficiently using the dedicated multipliers available in Virtex-II, Virtex-II Pro or Spartan-3 FPGAs. For example, the XC2V250 (a Virtex-II FPGA) has 24 dedicated 18-bit multiplier blocks. MAC blocks can also be implemented efficiently in FPGAs without dedicated multipliers, using the Xilinx LogiCORE Generator.



Fig. 2 Neural Network Model in Simulink

Fig. 3 presents a MAC block built from blocks in the Xilinx Blockset library. The multiply-accumulate operation is the bottleneck of FPGA implementations of ANNs because it requires a large number of logic blocks. The resources used depend to a great extent on the number of bits used to represent data and weights.



Fig. 3 Multiply-accumulate block
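As a behavioural illustration of the multiply-accumulate operation (not the authors' System Generator design), the following Python sketch models the MAC of one neuron with signed fixed-point operands. The 8-bit data and 12-bit weight widths are the ones quoted in this paper for the resource tables; the signed representation, the 24-bit accumulator, the saturation behaviour and all function names are assumptions made for the example.

```python
DATA_BITS = 8      # width of input samples (from the paper)
WEIGHT_BITS = 12   # width of weights (from the paper)
ACC_BITS = 24      # accumulator width: an assumption for this sketch


def saturate(value, bits):
    """Clamp value to the range of a signed two's-complement word."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac(inputs, weights, acc_bits=ACC_BITS):
    """Accumulate one product per step, as a serial MAC unit would."""
    acc = 0
    for x, w in zip(inputs, weights):
        x = saturate(x, DATA_BITS)      # operands limited to their word size
        w = saturate(w, WEIGHT_BITS)
        acc = saturate(acc + x * w, acc_bits)
    return acc
```

Widening `DATA_BITS` or `WEIGHT_BITS` in this model corresponds to a wider multiplier and accumulator in hardware, which is why the word lengths dominate the resource count.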

Table 1 presents the resources used by the 16-bit multiply-accumulate block; the resources used by the 16-bit multiply block alone are shown in parentheses.

Table 1

| Used resources | MAC implemented with Virtex-II dedicated multipliers | MAC implemented with Xilinx LogiCORE multipliers |
|---|---|---|
| Slices | 55 (29) | 89 (63) |
| Flip-flops | 56 (39) | 123 (106) |
| Block RAMs | 0 | 0 |
| Look-up tables | 66 (17) | 170 (121) |
| Dedicated multipliers | 1 (1) | 0 |
| % of a 50,000-gate Spartan-II | | 11.58 % |
| % of a 250,000-gate Virtex-II | 3.58 % | 5.79 % |
| % of a 1,000,000-gate Virtex-II | 1.07 % | 1.73 % |

The control logic block presented in Fig. 4 determines the neural network architecture: for example, it sets the number of neurons and the correspondence between inputs and weights. For simplicity we have assumed that every neuron in a layer is connected to all outputs of the neurons in the previous layer. Where a connection is not needed, it can be deactivated by setting the corresponding weight to zero.



Fig. 4 Control logic block
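The role of the control logic can be pictured with a small behavioural model. The sketch below assumes a single time-shared MAC and a row-major weight memory (address = neuron index × number of inputs + input index); both are assumptions for illustration, since the paper does not give the addressing scheme, and the function names are hypothetical. It also shows how a zero weight deactivates a connection.

```python
def layer_schedule(n_neurons, n_inputs):
    """Yield (neuron, input_index, weight_address) in the order a
    counter-based controller would step through them."""
    for neuron in range(n_neurons):
        for i in range(n_inputs):
            yield neuron, i, neuron * n_inputs + i


def run_layer(inputs, weight_rom, n_neurons):
    """Compute the weighted sums of a fully connected layer with one shared MAC."""
    n_inputs = len(inputs)
    sums = [0] * n_neurons
    for neuron, i, addr in layer_schedule(n_neurons, n_inputs):
        sums[neuron] += inputs[i] * weight_rom[addr]
    return sums


# Two neurons, three inputs; neuron 1 ignores input 0 through a zero weight.
weight_rom = [1, 2, 3,
              0, 4, 5]
print(run_layer([10, 20, 30], weight_rom, n_neurons=2))  # [140, 230]
```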

ROM is used to store the weights of the neuron inputs, and RAM is used as a data buffer.

Transfer function is implemented using look-up tables.
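A look-up-table transfer function can be sketched as follows. The paper does not state which activation function, table size or word length were used, so the logistic sigmoid, the 256-entry table, the 8-bit output and the input range of [-8, 8) are all assumptions chosen only to make the example concrete.

```python
import math

ADDR_BITS = 8    # 256-entry table: an assumption
OUT_BITS = 8     # output word width: an assumption
IN_RANGE = 8.0   # table covers net inputs in [-8, 8): an assumption


def build_sigmoid_lut():
    """Precompute quantised sigmoid values, as would be stored in a ROM."""
    size = 1 << ADDR_BITS
    scale = (1 << OUT_BITS) - 1
    lut = []
    for addr in range(size):
        x = (addr - size // 2) * (2 * IN_RANGE / size)
        lut.append(round(scale / (1.0 + math.exp(-x))))
    return lut


def transfer(net, lut):
    """Map a net input to a table address (with clamping) and read the entry."""
    size = len(lut)
    addr = math.floor(net * size / (2 * IN_RANGE)) + size // 2
    addr = max(0, min(size - 1, addr))
    return lut[addr]
```

In hardware the address would typically be taken directly from selected accumulator bits; the floating-point `net` argument is used here only to keep the model short.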

The resources consumed by a very simple network with one layer of 7 neurons are presented in Table 2. The resources used by the 16-bit multiply-accumulate block are shown in parentheses.

Table 2

| Used resources | MAC implemented with Virtex-II dedicated multipliers | MAC implemented with Xilinx LogiCORE multipliers |
|---|---|---|
| Slices | 80 (55) | 114 (89) |
| Flip-flops | 77 (56) | 144 (123) |
| Block RAMs | 3 (0) | 3 (0) |
| Look-up tables | 103 (66) | 207 (170) |
| Dedicated multipliers | 1 (1) | 0 |
| % of a 50,000-gate Spartan-II | | 14.84 % |
| % of a 250,000-gate Virtex-II | 5.20 % | 7.42 % |
| % of a 1,000,000-gate Virtex-II | 1.56 % | 2.22 % |

The figures shown are for an 8-bit representation of data and a 12-bit representation of weights.
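The percentage rows of Tables 1 and 2 appear to be slice utilisation figures. Assuming 768 slices for the 50,000-gate Spartan-II (XC2S50), 1,536 for the 250,000-gate Virtex-II (XC2V250) and 5,120 for the 1,000,000-gate Virtex-II (XC2V1000), values taken from the device data sheets rather than from this paper, the short check below reproduces the tabulated numbers when the result is truncated to two decimals.

```python
# Slice counts per device: assumptions based on the Xilinx data sheets,
# not stated in the paper itself.
DEVICE_SLICES = {
    "Spartan-II 50k gates": 768,
    "Virtex-II 250k gates": 1536,
    "Virtex-II 1M gates": 5120,
}


def utilisation(slices_used, device):
    """Return slice utilisation in percent, truncated to two decimals."""
    hundredths = slices_used * 10000 // DEVICE_SLICES[device]
    return hundredths / 100


# Slice counts from Tables 1 and 2 (55 and 80 rely on dedicated
# multipliers, so they are only meaningful for the Virtex-II devices).
for used in (55, 89, 80, 114):
    print(used, {dev: utilisation(used, dev) for dev in DEVICE_SLICES})
```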

The system elements are defined automatically using variables taken from the Matlab workspace. In this way the dimensions of the memories, registers and counters, as well as the constants and the number of bits per word, are loaded automatically into the Simulink representation of the ANN after the neural network has been simulated in Matlab.
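The kind of parameter derivation described here can be sketched as follows. The function and field names are hypothetical, and the sizing rules (one weight word per connection, counters just wide enough to index inputs and neurons) are assumptions about a typical design rather than the authors' actual block parameters. The example values of 13 inputs and 7 neurons are illustrative only.

```python
def bits_for(count):
    """Number of address or counter bits needed to index `count` items."""
    return max(1, (count - 1).bit_length())


def derive_layer_parameters(n_inputs, n_neurons, data_bits=8, weight_bits=12):
    """Derive memory and counter sizes for one fully connected layer."""
    rom_depth = n_inputs * n_neurons  # one weight word per connection (assumed)
    return {
        "weight_rom_depth": rom_depth,
        "weight_rom_addr_bits": bits_for(rom_depth),
        "input_counter_bits": bits_for(n_inputs),
        "neuron_counter_bits": bits_for(n_neurons),
        "data_bits": data_bits,
        "weight_bits": weight_bits,
    }


print(derive_layer_parameters(n_inputs=13, n_neurons=7))
```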

# IV. RESULTS

As presented earlier, the method was developed for the easy implementation of neural networks used in smart sensors. The application chosen for testing the method was static hand gesture recognition using a data glove equipped with optical fiber flex sensors. Fig. 5 presents the implemented configuration for gesture recognition.

The first block is a parallel port implementation that ensures correct data transfer between the data acquisition system and the gesture recognition neural network.

RNA1 is a feed-forward network that can be trained in many different ways, but one of the most ...





Fig. 5 Gesture recognition network

The second network, used for the classification task, is a simple competitive network with one layer of N2 neurons, one for each of the N2 gestures to be recognized. The last block is a BCD to 7-segment decoder that displays the number of the recognized gesture.
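The classification and display stages can be illustrated with a short behavioural sketch. It assumes a winner-take-all layer that picks the neuron with the largest weighted sum, and a 7-segment code with bit order gfedcba, active high; the paper specifies neither the competition rule nor the segment encoding, so both are assumptions, as are the function names.

```python
# Segment patterns for digits 0-9, bit order gfedcba, active high (assumed).
SEVEN_SEG = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F]


def competitive_layer(features, weights):
    """Return the index of the neuron with the largest weighted sum."""
    scores = [sum(f * w for f, w in zip(features, row)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def display_code(gesture_index):
    """Translate a gesture number (0-9) into a 7-segment pattern."""
    return SEVEN_SEG[gesture_index]


# Three gestures, each neuron tuned to one feature.
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
winner = competitive_layer([0.1, 0.9, 0.2], weights)
print(winner, hex(display_code(winner)))
```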

The complete process, from learning to propagation, is presented below.

## A. Learning phase

Training of the neural network is carried out using a given set of inputs with the corresponding outputs. The training inputs are collected via the parallel port of a personal computer running Matlab and a data acquisition program developed by the authors. The input and output sets are stored in a file and are used to determine the neural network weights.

The desired network architecture is simulated using the Neural Network Toolbox. The network weights are saved in a file and are loaded automatically from the Matlab workspace into the weight (ROM) memory represented in Simulink. Many network architectures, trained with different methods, can be simulated, and the network that performs best for the given application is chosen for hardware implementation.
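Moving trained weights into a fixed-width weight memory implies a quantisation step. The sketch below shows one plausible way to do it, assuming signed two's-complement words with 8 fractional bits; the 12-bit word length is the one quoted in this paper, while the scaling, rounding, saturation and function names are assumptions and not the authors' Matlab procedure.

```python
WEIGHT_BITS = 12  # from the paper
FRAC_BITS = 8     # number of fractional bits: an assumption


def quantise(weight, bits=WEIGHT_BITS, frac_bits=FRAC_BITS):
    """Round a real weight to a signed fixed-point integer with saturation."""
    scaled = round(weight * (1 << frac_bits))
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, scaled))


def to_rom_words(weights, bits=WEIGHT_BITS):
    """Encode quantised weights as unsigned two's-complement memory words."""
    mask = (1 << bits) - 1
    return [quantise(w, bits) & mask for w in weights]


print([format(word, "03X") for word in to_rom_words([0.5, -0.25, 7.99, -8.0])])
```

Comparing the network output before and after quantisation in simulation is a simple way to check that the chosen word length is sufficient for the application.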

## B. Implementation phase

The first step in moving the neural network from software simulation to hardware implementation is to model the network with the System Generator tool for Simulink, using Xilinx blocks or user-created, neural-network-specific blocks. One layer can be created using a single ANN block from the user-created libraries, and the block parameters (number of neurons, weights, bias) are loaded automatically from the Matlab workspace. If the designed system performs well in simulation, it can be converted to VHDL code, which System Generator produces automatically.

To increase hardware performance, most System Generator blocks are implemented in hardware using Xilinx Smart-IP (Intellectual Property) LogiCOREs. These modules make optimal use of FPGA resources to maximize performance.

During code generation, System Generator creates all the project files needed by Xilinx ISE 6.2i. By opening the generated Project Navigator project file, the System Generator design can be imported into Project Navigator, where it can be synthesized, simulated, and implemented in the Xilinx ISE 6.2i tool environment.

The ".bit" configuration file is then downloaded to the FPGA using, for example, the Parallel Cable IV and the Xilinx download program iMPACT.

## C. Propagation phase

The sensor outputs from the ADuC812 microcontroller, in 8-bit parallel format and sampled every 10 ms, are loaded into the FPGA that implements the neural network. To test the developed method we used a sensory system for an artificial hand composed of:

- A data glove as the signal source for finger and hand position
- Force sensing resistors (FSRs) to detect contact with an object and the force being exerted
- A data acquisition system built around the ADuC812 MicroConverter

The analog signals from the FSRs are converted to digital signals by the MicroConverter. It also receives serial data from the glove and outputs both signals, time-multiplexed, in 8-bit parallel format. More precisely, it outputs 7 bytes of information (5 for the finger positions and 2 for the hand position, pitch and roll), followed by 6 bytes supplied by the 6 touch-pressure sensors located on the fingertips and on the palm. The FPGA module serves as the implementation framework for the neural networks. It receives data from the data acquisition system in 8-bit parallel format and outputs the number of the recognized posture.
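The 13-byte frame described above (7 glove bytes followed by 6 pressure bytes) can be unpacked as in the following sketch. The order of the five finger bytes, the pitch-before-roll ordering and the field names are assumptions; the paper gives only the byte counts and the grouping.

```python
FRAME_LENGTH = 13  # 5 finger + 2 orientation + 6 pressure bytes (from the paper)


def parse_frame(frame):
    """Split one 13-byte acquisition frame into named fields."""
    if len(frame) != FRAME_LENGTH:
        raise ValueError(f"expected {FRAME_LENGTH} bytes, got {len(frame)}")
    data = bytes(frame)
    return {
        "fingers": list(data[0:5]),    # one byte per finger (order assumed)
        "pitch": data[5],              # hand position, pitch (order assumed)
        "roll": data[6],               # hand position, roll
        "pressure": list(data[7:13]),  # six touch-pressure sensors
    }


sample = parse_frame(range(13))
print(sample["fingers"], sample["pitch"], sample["roll"], sample["pressure"])
```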

The recognized posture can serve as feedback in a control system, can be transmitted via a signal generator to the peripheral nervous system of persons with loss of sensory nerve function, or can be used for teleoperating a robotic hand.

# V. CONCLUSIONS

This paper has presented a new method for the implementation of neural networks in FPGAs.

The main contribution of this work is the creation of the framework that permits rapid development of smart sensors with learning capability and adaptive behavior.

Another contribution is the creation of neural-network-specific modules such as MAC units and activation functions.

The proposed method makes it easy to adapt the number of neurons per layer, the weight of each input and the activation function.

A testbench was developed for the application that allows different types of neural networks with different architectures to be implemented.

Future work will focus on the development of other neural-network-specific modules, optimization of the implemented modules, and implementation of on-chip learning capability.