RESEARCH TOPICS
Since coordinating the EU CAVIAR project, the IMSE Neuromorphic
group has developed sensory and processing microchips that mimic sensing and
processing in biological systems. It also develops multi-chip and hybrid
chip-FPGA systems that scale up to higher-complexity applications. The group
also works on algorithms and sensory processing for spiking information sensing,
coding and processing. Its chips use mixed-signal, low-current and/or low-power
circuit techniques, as well as high-speed communication techniques. The group
works with mixed-signal or digital CMOS technologies, and also explores
application projections of emergent nanoscale technologies and new
devices such as memristors (EU NABAB, PNEUMA and NEURAM3 projects).
The group focuses mainly on event-driven (spiking)
frame-free vision systems, developing sensing retinas
for spatial or temporal contrast (such as DVS, Dynamic Vision Sensors), as
well as event-driven convolution processors,
which allow assembling, for example, large-scale spiking “Convolutional Neural Networks”
for high-speed object recognition. These chips and systems use AER (Address
Event Representation) communication techniques.
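In AER, each spike travels as a small address word on a shared bus. As a minimal sketch of the idea (the field widths below are illustrative assumptions, not the actual bus format of the group's chips), an event carrying a pixel address and a polarity bit can be packed into a single word:

```python
# Sketch of Address Event Representation (AER) word packing.
# Assumed (hypothetical) layout: 9-bit x, 9-bit y, 1-bit polarity.
X_BITS, Y_BITS, POL_BITS = 9, 9, 1

def pack_event(x, y, pol):
    """Pack an event into one integer word laid out as [x | y | pol]."""
    assert 0 <= x < 2**X_BITS and 0 <= y < 2**Y_BITS and pol in (0, 1)
    return (x << (Y_BITS + POL_BITS)) | (y << POL_BITS) | pol

def unpack_event(word):
    """Recover (x, y, pol) from a packed AER word."""
    pol = word & (2**POL_BITS - 1)
    y = (word >> POL_BITS) & (2**Y_BITS - 1)
    x = word >> (Y_BITS + POL_BITS)
    return x, y, pol

word = pack_event(120, 45, 1)
assert unpack_event(word) == (120, 45, 1)
```

Because events are just addresses, a receiver reconstructs activity simply by routing each incoming address to the corresponding destination, which is what makes AER convenient for multi-chip assemblies.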
Presently, the group is also involved in the
development of a fully event-based gesture
recognition vision system to be embedded in portable devices, in the context of
the H2020 EU project ECOMODE.
Event-driven retinas do not produce sequences of still
frames, as conventional video cameras do. Instead, each pixel senses light and
computes a given property (spatial contrast, temporal change) continuously in
time. Whenever this property exceeds a given threshold, the pixel sends out an
event (usually consisting of the pixel's x,y coordinates and the sign of the threshold crossing), which
is written onto one (or more) high-speed buses with asynchronous handshaking.
This way, sensors produce continuous event flows, and subsequent processors
handle them event by event.
Currently, our research targets
high-speed visual recognition using our event-based hardware. Some examples of
systems developed to date are:
Example of an
event-driven sensing and processing system.
(a) The system uses a temporal-contrast DVS retina as sensor. Each pixel sends
an event when it detects a light change. The retina observes a 5kHz spiral on an analog oscilloscope (shown in (e)),
the oscilloscope being the only source of illumination. The system in (a) comprises
a DVS sensor, a merger which arbitrates and sequences events coming from two
inputs, and an event-driven convolution processor. Bit k indicates whether the
event comes from the retina or the processor output. If it comes from the
retina, the event is convolved with the kernel in (b), and if it comes from the
processor output it is convolved with the kernel in (c) which implements a
competitive winner-takes-all type of operation: it amplifies the effect of the
event while inhibiting the effect of spatially nearby events. The result is
that the output follows the center of the cloud of input events. This is shown
in (d) where the 500 microsecond (x,y,time)
diagram shows the retina events (green dots) and the system output (red trace).
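The convolution processor in this system works per event: each incoming address stamps a programmed kernel onto an array of accumulators, and any accumulator crossing its threshold fires an output event and resets. The sketch below illustrates this mechanism under assumed sizes, kernel values and threshold (all hypothetical, not the chip's actual parameters):

```python
def event_convolution(events, kernel, shape=(32, 32), threshold=1.0):
    """Event-driven convolution sketch: every input event adds the
    kernel onto an accumulator array centred on the event address;
    any cell reaching `threshold` fires an output event and resets.
    Array size, kernel and threshold are illustrative assumptions."""
    state = [[0.0] * shape[1] for _ in range(shape[0])]
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for x, y in events:
        for dx in range(kh):
            for dy in range(kw):
                xx, yy = x + dx - kh // 2, y + dy - kw // 2
                if 0 <= xx < shape[0] and 0 <= yy < shape[1]:
                    state[xx][yy] += kernel[dx][dy]
                    if state[xx][yy] >= threshold:
                        out.append((xx, yy))
                        state[xx][yy] = 0.0
    return out

# A small excitatory-centre kernel; the centre cell fires on the
# second event at the same address.
kernel = [[0.0, 0.2, 0.0],
          [0.2, 0.6, 0.2],
          [0.0, 0.2, 0.0]]
print(event_convolution([(10, 10), (10, 10)], kernel))  # → [(10, 10)]
```

A winner-takes-all kernel like the one in (c) would follow the same scheme with a positive centre and negative surround, so each output event suppresses the accumulators of its neighbours.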
Example of event-driven
sensing and processing for object recognition.
(a) A card deck is browsed quickly in front of a DVS (as in (b)). (c) Event
rate at the retina output (eps: events per
second). The full card deck is browsed in about 650ms. The maximum event rate
produced by the retina is about 8 million eps. (d)
Image formed by collecting events during 5ms. The system, shown in (a),
comprises one retina and two AER convolution processing chips in cascade. The
first chip is programmed with a radially symmetric center-on kernel. The kernel of
the second chip, shown in (e), detects a club-like pattern by simple template
matching. (f) Sequence of events (projected in y,t) at the output of the retina (small red dots),
the output of the first convolution filter (green circles), and the output of the second filter
(black crosses). The small 3ms box in (f) is shown zoomed in (g) in y,t projection and in (h) in x,y projection. In (g) one can see that club-symbol
recognition (crosses) is available within 1.5ms of the start of the symbol's events, once the
retina has provided sufficient events for proper shape recognition. The time
needed by the fabricated event-driven processor chips to process one event ranges
from about 60 to 700ns, depending on the programmed kernel.
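A quick back-of-the-envelope check relates these per-event latencies to the retina's peak output (only the 60–700ns figures and the ~8 million eps peak come from the text above; the comparison itself is our own sketch):

```python
# Sustainable event throughput of a processor with a given per-event latency.
def max_event_rate(ns_per_event):
    """Events per second the processor can sustain at this latency."""
    return 1e9 / ns_per_event

fast, slow = max_event_rate(60), max_event_rate(700)
print(f"fastest kernel: {fast/1e6:.1f} Meps, slowest: {slow/1e6:.2f} Meps")
```

With the fastest kernels (~16.7 Meps) the processor comfortably absorbs the retina's ~8 Meps peak, whereas the slowest kernels (~1.4 Meps) cannot keep up at peak rate, which is why the programmed kernel matters for worst-case throughput.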