Introduction

Just a few years ago, this subject was only a construct in the minds of some technology visionaries. Nowadays, the convergence of computer vision, embedded systems, sensor networks and integrated image sensors allows us to envision distributed processing systems able to carry out cooperative vision tasks in real time using multiple video sources [1]. The fundamental element of such systems is the smart camera, whose functionality goes beyond the mere acquisition and transmission of images. In addition, smart cameras survey their visual field and analyze the events that take place within it. This analysis can be performed locally or in cooperation with other smart cameras. Objects are recognized, and singular events are detected and time-stamped. The final outcome of the smart camera's operation is a well-elaborated response, consisting of the activation of actuators or the transmission of an alarm code [2].

Applications of distributed smart cameras range from the control and surveillance of buildings, borders and perimeters to process monitoring, environmental control, stock control, automatic transportation, agriculture and, obviously, military applications [3]. There are many advantages to using a network composed of intelligent nodes. On one side, the computational load on a central computer at a base station responsible for analyzing the information arriving from every node would be intractable unless we could count on a supercomputer. On the other side, the costs associated with the transmission of such a data flow can seriously compromise the viability of the camera network deployment [4]. Consider that we are concentrating on applications for the monitoring of unstructured spaces, where no data or power wiring makes sense and where the extension and perimeter of the surveyed areas can vary; in this case a scalable solution is most appropriate. The elementary devices that constitute the network must present very particular characteristics. First of all, they must be very efficient in terms of energy consumption: this extends the lifetime of the network and avoids the maintenance of thousands of geographically dispersed nodes. Second, the cost of each node must be low in order to represent a competitive alternative to centralized information processing. Finally, communications between nodes must be wireless, because wiring can be unthinkable in the worst scenarios; a wireless network also has a smaller deployment cost and better scalability figures.

Several attempts have been reported concerning the application of a network of distributed smart cameras to the monitoring and surveillance of public and private spaces [5]. As already mentioned, the central element of the network is a device that is able not only to capture images, but also to interpret the visual stimulus and to transmit relevant data over the network. These abilities, in situ image processing and data transmission, are largely conditioned by the power limitations of the sensor node. The challenge lies in the design of wireless vision sensors in which the processing circuits and the interface and communication circuits are especially efficient in the use of the available energy. Concerning inter-node communications, advances in personal consumer electronics and the infrastructure for ambient intelligence have yielded well-established standards (Bluetooth, Wi-Fi) [6]. These schemes have closed specifications that allow us to univocally estimate the necessary power budget for a particular application once the communication standard and the required data rates are known [7]. In this way, scalar sensor networks, i.e., networks of sensors that provide a scalar magnitude, usually of environmental interest (temperature, humidity, chemical agent concentration, etc.), employ the physical layer described by these standards. Advances are instead introduced in the data link layer: the medium access control (MAC) and the logical link control (LLC) [8].

At the other extreme of the problem, the local and/or distributed implementation of vision algorithms within the network nodes, advances are slower. The main reason is that handling, in real time, the large amount of information contained in the visual stimulus is not an easy task. For instance, in a vision system with a conventional serialized digital processing scheme, a CMOS or CCD sensor acquires around 25 to 40 frames per second of at least 176x144 pixels in order to be practical (QCIF [9]). This means a flow of 0.63 Mpixel/s. If these are grayscale images encoded with an 8b depth, the data transmission rate goes up to 5.04 Mbps. Some of the standard wireless communication protocols cannot reach these figures: the maximum transmission rate in Bluetooth (IEEE 802.15) is 723 kbps. Using a Wi-Fi (IEEE 802.11) based network, able to support a data rate of 31.4 Mbps, the sustained transmission of this data flow will require up to 50 mW [6]. In order to establish a benchmark, consider that an AAA alkaline battery [10] operating at 1.5 V, with a capacity of 1.15 Ah, would be exhausted in less than 35 hours, and this without considering the energy required for operating the sensor. It is evident that some workaround must be found in order to avoid the transmission of such an amount of data. An option is to provide the means for efficient processing at the node level; but realizing highly intensive computation under tight power restrictions is not an easy job.
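
As a minimal sketch, the following Python fragment reproduces these back-of-the-envelope figures. All inputs (QCIF resolution, 25 fps, 8-bit grayscale, the roughly 50 mW sustained transmission power from [6] and the AAA cell of [10]) are taken from the text; the small gap between the computed 5.07 Mbps and the 5.04 Mbps quoted above comes from rounding the pixel rate to 0.63 Mpixel/s before multiplying.

```python
# Raw-video transmission budget for a wireless vision node (sketch).

WIDTH, HEIGHT = 176, 144        # QCIF frame size [9]
FPS = 25                        # frames per second
BITS_PER_PIXEL = 8              # grayscale depth

pixel_rate = WIDTH * HEIGHT * FPS           # pixels per second
bit_rate = pixel_rate * BITS_PER_PIXEL      # raw bits per second

BATTERY_ENERGY_WH = 1.5 * 1.15  # ideal 1.5 V x 1.15 Ah AAA alkaline cell [10]
TX_POWER_W = 0.050              # sustained Wi-Fi transmission power [6]

lifetime_h = BATTERY_ENERGY_WH / TX_POWER_W

print(f"Pixel rate: {pixel_rate / 1e6:.2f} Mpixel/s")            # ~0.63 Mpixel/s
print(f"Raw bit rate: {bit_rate / 1e6:.2f} Mb/s")                # ~5 Mb/s
print(f"Battery life, transmission only: {lifetime_h:.1f} h")    # < 35 h
```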

Let us consider a hypothetical application in which we need to compute the brightest points in the image. Local processing can drastically reduce the amount of data to be transmitted. For instance, some filtering can be done to reduce noise by using a 3x3 convolution mask; then a threshold is applied to detect the most brilliant spots. The location of those points is the only information that is going to be transmitted over the network. This basic processing requires 9 products, 8 additions and a comparison for each pixel, without considering the access to the image memory. For the pixel flow depicted above, the required computing power is 11.34 MOPS [11]. Having a look at some recent general-purpose microprocessors, the Intel Pentium 4EE (EE for Extreme Edition) [12] is able to compute 9726 MIPS. This is achieved with a 3.2 GHz clock while consuming 103 W, which corresponds to a computing power per unit power consumption of 0.094 MIPS/mW. For the Pentium 4EE to realize the previously described processing, a sustained supply of 113.9 mW would be required. If we suppose that the system can operate at 1.5 V and that the battery characteristics are maintained during its lifetime, the energy stored in the battery will be exhausted in a little more than 14 hours. In the case of the AMD Athlon 64 FX (Dual Core) [13], the computing power per milliwatt reaches 0.201 MIPS/mW, which means 31 hours before running out of battery. For better figures, we need to try processors that are less powerful but more efficient, for instance processors devised for portable applications (PDAs, mobile phones). The Hitachi SH7705 [14] can serve as an example: it provides a throughput of 0.865 MIPS/mW, and the same battery will now operate for 5.5 days. The NEC VR4131, with 1.54 MIPS/mW [15], will keep working with this battery for 10 days. We can also think of digital signal processors designed for intensive computation at the lowest cost; for instance, the TMS320C6411 from Texas Instruments reaches 9.6 MIPS/mW [16]. Or the ARM7TDMI, a general-purpose processor optimized to be employed in embedded systems, which contains a high-performance multiplier [17] and, when implemented in a 0.13 μm technology, reaches 11.06 MIPS/mW. In these cases, battery life reaches four months. Therefore, with the introduction of especially dedicated hardware, the efficiency of the processor is enhanced by two orders of magnitude. However, a more involved computation at the image plane will require stronger measures. Besides, we have left aside some very relevant issues in our energy balance. First of all, signal conditioning and conversion to the digital domain require a non-negligible amount of energy: for state-of-the-art A/D converters, the same data flow as in the example, 0.63 MS/s at 8b per sample, means an extra power consumption of 0.29 mW [18]. In addition, the contribution of the ADC grows with technology scaling [19]. Another important issue is the so-called memory gap [20]. Due to the different nature of the technologies for microprocessors and external memory, advances in microprocessor speed are not equally translated into advances in the speed of access to memory [21]. As a result, memory access is an unavoidable bottleneck unless architectural alternatives are considered.
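
A minimal sketch of this hypothetical node-level processing is given below: a 3x3 convolution for noise reduction followed by thresholding, so that only the coordinates of the bright spots leave the node. The averaging kernel, the threshold value and the use of Python/NumPy are illustrative choices, not part of the original description; the operation count at the end reproduces the 11.34 MOPS requirement.

```python
# Hypothetical bright-spot detector (sketch): 3x3 smoothing + threshold.
import numpy as np

KERNEL = np.full((3, 3), 1.0 / 9.0)   # illustrative 3x3 averaging mask
THRESHOLD = 200.0                     # illustrative brightness threshold

def bright_spots(image):
    """Return (row, col) coordinates of smoothed pixels above THRESHOLD."""
    rows, cols = image.shape
    spots = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0.0
            for dr in (-1, 0, 1):                 # 9 products and 8 additions ...
                for dc in (-1, 0, 1):
                    acc += KERNEL[dr + 1][dc + 1] * image[r + dr][c + dc]
            if acc > THRESHOLD:                   # ... plus 1 comparison per pixel
                spots.append((r, c))
    return spots

# Operation count behind the 11.34 MOPS requirement:
OPS_PER_PIXEL = 9 + 8 + 1          # products + additions + comparison
PIXEL_RATE = 0.63e6                # QCIF at 25 fps, as above
print(f"Required throughput: {OPS_PER_PIXEL * PIXEL_RATE / 1e6:.2f} MOPS")

# Example run on a random 8-bit QCIF frame:
frame = np.random.randint(0, 256, size=(144, 176)).astype(float)
print(f"{len(bright_spots(frame))} bright spots found")
```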

Therefore, for the efficient implementation of cooperative vision algorithms within a wireless sensor network, we need to contemplate power consumption in a comprehensive manner. We need to increase the efficiency of the whole process, covering image sensing, signal conditioning, local and distributed information processing and data coding. We are going to depart from the traditional sensing + A/D conversion + processing scheme and follow an alternative approach based on the internal organization of biological sensors. In nature, the most tedious processing tasks, those requiring a higher computational power, are assigned to aggregates of relatively simple, slow and imprecise elementary devices [22]. Even so, they result in less bulky and more energy-efficient organs than their artificial counterparts. One of the tools employed by nature to reach these optimized organs is adaptation. The architecture of the biological devices for sensory processing is adapted to the nature of the stimulus: instead of using the von Neumann architecture, they are commonly based on massively parallel analog networks [23]. Besides, the adaptation of the system parameters, by automatic training and learning, can be employed to correct disparities and attenuate errors. This allows for an acceptable accuracy with relatively imprecise circuits [24]. In a sense, this move toward a parallel architecture that allows for higher speed and reduces memory access times and energy is being introduced into the latest commercial microprocessor generations [25]. Concerning image processing, the great leap forward would be the incorporation of processing features at the pixel level, that is, a concurrent implementation of image acquisition and processing. In [26] we can find a 1b A/D converter at each pixel, several logic circuits and a couple of memories; the figure for this chip is 23.20 MIPS/mW. A similar architecture allows the chip in [27] to reach 27.5 MOPS/mW. For them, the alkaline battery of our example would last for 5 or 6 months.
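
The battery-lifetime comparison running through the previous paragraphs can be condensed into the short sketch below: it estimates how long the same 1.5 V / 1.15 Ah cell would sustain the 11.34 MOPS bright-spot workload on each device, using the efficiency figures quoted above. The grouping into a single loop is ours; the DSP and ARM entries are left out because the longer lifetimes quoted for them above presumably also credit their multiplier hardware.

```python
# Battery-lifetime estimate for sustaining the bright-spot workload (sketch).

WORKLOAD_MOPS = 11.34              # from the bright-spot example above
BATTERY_ENERGY_WH = 1.5 * 1.15     # 1.5 V x 1.15 Ah AAA cell [10]

EFFICIENCY_MIPS_PER_MW = {         # datasheet-derived figures quoted in the text
    "Intel Pentium 4EE [12]":       0.094,
    "AMD Athlon 64 FX [13]":        0.201,
    "Hitachi SH7705 [14]":          0.865,
    "NEC VR4131 [15]":              1.54,
    "Digital vision chip [26]":     23.2,
    "Analog SIMD vision chip [27]": 27.5,
}

for name, eff in EFFICIENCY_MIPS_PER_MW.items():
    power_w = WORKLOAD_MOPS / eff / 1000.0      # required supply power, W
    hours = BATTERY_ENERGY_WH / power_w
    print(f"{name:32s} {hours:8.0f} h  ({hours / 24:6.1f} days)")
```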

A higher efficiency level can still be reached by emulating the biological sensory organs at the circuit level. Thus, focal-plane processing is realized by low or moderate accuracy analog circuits, which outperform their digital counterparts in the use of energy. In this sense, our research group has dedicated a considerable effort to the VLSI implementation of nonlinear dynamic processor networks, concurrent with photosensor arrays and based on bioinspired processing models. In addition to the analog circuitry that supports the operation of the photosensor and the nonlinear network dynamics, our chips contain a distributed memory and support for logic operations at the pixel level. The result is a set of highly efficient processors, rendering 82.5 MOPS/mW [28] and 250 MOPS/mW [29]. In the experiment proposed as the basis of the comparison, these figures mean 1.5 and 4 years of operation, respectively, with the same battery. This qualitative jump is favored by the low-level emulation of retinal processing models [30]. The chip described in [29] contains a network of 2x32x32 elementary processors. By programming the interconnection weights, this chip permits the exploration of complex reaction-diffusion dynamics [31], including the generation and transmission of activation patterns, much in the same way as they occur in the vertebrate retina [32]. Thus, by using these adapted architectures, with efficient analog programmable elementary processors, we can think of a vision system on a single chip. This is exactly what we need for realizing local image processing in a network of distributed smart cameras. We have not yet considered the incorporation of a microcontroller unit (MCU) embedded in the same silicon die as the focal-plane processor. The MCU is responsible for the communications between the different elements of the system and for the execution of a high-level program [33]. In this scheme, the sensor/processor array is a specialized peripheral device, i.e., a visual co-processor that alleviates the computational load of the CPU. It is worth pointing out that the embedded MCU represents a very small fraction of the system area: in our first attempts it occupies 0.13 mm2 of the 64 mm2 taken by the sensor/processor. As the MCU does not take over the most tedious, though regular, tasks, the required clock frequency need not be high (20-50 MHz), so the extra power consumption is small. Previous works [34], [35] have placed a camera at every network node, with most of the effort dedicated to image compression and coding for broadcasting over the network. In our approach, we propose to realize image processing at the nodes by incorporating an efficient vision system on a single chip. Some reported works point in this same direction. In [36] the image is sensed and events are detected, but the complete raw image is still delivered through the net. In [37], however, high-level processing is available at the nodes: an event detector decides whether the captured image should be delivered. Closer to a distributed smart camera network are the systems reported in [38], where commercially available sensors of different spatial resolutions operate cooperatively in order to simplify the tasks to be carried out in surveillance mode and to allow operation at full resolution only when the appropriate conditions are met. Also in [39], a low-power device is employed in a conventional processing scheme.
Likewise, in [40] low-level processing is offloaded to a special processor. In all these cases, the concurrence of sensing and processing is broken, creating a bottleneck at the A/D conversion that ultimately limits the throughput of the system. In addition, the energy invested in memory accesses for low-level image processing tasks compromises the efficiency and viability of in-node processing. As a counterpoint to this approach, we intend to overcome these limitations by assigning the most computing-intensive tasks to analog processors at the focal plane. This allows, at the same time, for high speed and energy efficiency.

  1. H. Aghajan, R. Kleihorst, B. Rinner et al., "Introduction to the Issue on Distrib. Processing in Vision Networks". IEEE J. Sel. Topics Sig. Proc., Vol. 2, No. 4, pp. 445-447, August 2008.
  2. M. Bramberger, A. Doblander, A. Maier et al., "Distributed embedded smart cameras for surveillance applications". Computer, Vol. 39, No. 2, pp. 68-75, February 2006.
  3. D. Puccinelli, M. Haenggi, "Wireless Sensor Networks: Applications and Challenges of Ubiquitous Computing". IEEE Circuits and Systems Magazine, Vol. 5, No. 3, pp. 19-29, August 2005.
  4. G. J. Pottie, W. J. Kaiser, "Wireless Integrated Network Sensors". Communications of the ACM, Vol. 43, No. 5, pp. 51-58, May 2000.
  5. I. F. Akyildiz, T. Melodia, and K. R. Chowdhury, "Wireless Multimedia Sensor Networks: Applications and Testbeds". Proc. of the IEEE, Vol. 96, No. 10, pp. 1588-1605, October 2008.
  6. E. Ferro, F. Potortì, "Bluetooth and Wi-Fi Wireless Protocols: a Survey and a Comparison". IEEE Wireless Communications, Vol. 12, No. 1, pp. 12-26, Feb. 2005.
  7. L. M. Feeney and M. Nilsson, "Investigating the energy consumption of a wireless network interface in an ad hoc networking environment". Proc. 20th Annual Joint Conf. IEEE Computer and Communications Soc., Vol. 3, pp. 1548-1557, April 2001.
  8. C. Jyh-Cheng, K. M. Sivalingam, P. Agrawal, and S. Kishore, "A comparison of MAC protocols for wireless local networks based on battery power consumption". Proc. 17th Annual Joint Conf. IEEE Comp. and Comm. Soc., Vol.1, pp. 150-157, March-April 1998.
  9. "Rec. H. 261: Video Codec for Audiovisual Services at p x 64kbit/s". Recommendations of the International Telecommunications Union, Helsinki, Finland 1993.
  10. Duracell® Alkaline-Manganese Dioxide Battery MN2400 Size AAA (LR03) Datasheet.
  11. R. C. Gonzalez, R. E. Woods, Digital Image Processing. Addison-Wesley Publishing Co. Reading (MA) 1992.
  12. Intel® Pentium® 4 Processor on 0.13 Micron Process Datasheet. Document Number: 298643-012, February 2004.
  13. AMD Athlon™ 64 FX Product Data Sheet. Document No. 30431, May 2004.
  14. SH7705 Group Hardware Manual: Renesas 32-Bit RISC Microcomputer SuperH RISC Engine Family/SH7700 Series. Revision 2.00, Sept. 2003.
  15. NEC VR4131 64-/32-Bit Microprocessor Hardware: User's Manual. Document No. U15350EJ3V0UM00 (3rd edition), Jan. 2004.
  16. TMS320C6411 Fixed-Point Digital Signal Processor (Rev. I). Texas Instruments, June 2005.
  17. ARM7TDMI Technical Reference Manual. Doc. No. DDI 0029G, Rev. 3, April 2001.
  18. S. Limotyrakis, S. D. Kulchycki, D. K. Su, B. Wooley, "A 150-MS/s 8-b 71-mW CMOS Time-Interleaved ADC". IEEE J. Solid-State Circuits, Vol. 40, No. 5, pp. 1057-1067. May 2005.
  19. K. Uyttenhove, M. S. J. Steyaert, "Speed-Power-Accuracy Tradeoff in High-Speed CMOS ADCs". IEEE Trans. on Circuits and Systems-II, Vol. 49, No. 4, pp. 280-287, April 2002.
  20. M. Wilkes, The Memory Gap and the Future of High Performance Memories. Technical Report 2001.4. The Computer Laboratory. University of Cambridge, 2001.
  21. W. A. Wulf and S. A. McKee, "Hitting the Memory Wall:Implications of the Obvious". ACM SIGARCH Computer Architecture News, Vol. 23, No. 1, pp. 20-24, March 1995.
  22. P. Churchland, T. Sejnowski, The Computational Brain. Cambridge, MA. MIT Press, 1993.
  23. N. H. Franceschini, "Vision, Flies, Neurons and Robots". Proceedings of the IEEE Int. Conf. on Robotics and Automation, Vol. 3, pp. 3165, May 1995.
  24. C. Diorio, D. Hsu, M. Figueroa, "Adaptive CMOS: From Biological Inspiration to Systems-on-a-Chip". Proceedings of the IEEE, Vol. 90, No. 3, pp. 345-357, March 2002.
  25. Intel® Core™ i7 Processor Extreme Edition Datasheet. Document No. 320834-001, November 2008.
  26. T. Komuro, I. Ishii, M. Ishikawa, A. Yoshida, "A Digital Vision Chip Specialized for High-Speed Target Tracking". IEEE Transactions on Electron Devices, Vol. 50, No. 1, pp. 191-199, Jan. 2003.
  27. P. Dudek and P. J. Hicks, "A General-Purpose Processor-per-Pixel Analog SIMD Vision Chip". IEEE Transactions on Circuits and Systems-I, Vol. 52, No. 1, pp. 13-20, Jan. 2005.
  28. G. Liñán, A. Rodríguez-Vázquez, R. Carmona, F. Jiménez-Garrido, S. Espejo, R. Domínguez-Castro, "A 1000 FPS at 128x128 Vision Processor with 8-Bit Digitized I/O". IEEE Journal of Solid-State Circuits, Vol. 39, No. 7, pp. 1044-1055, Jul. 2004.
  29. R. Carmona, F. Jiménez, R. Domínguez, S. Espejo, T. Roska, Cs. Rekeczky, A. Rodríguez, "A Bio-Inspired 2-Layer Mixed-Signal Flexible Programmable Chip for Early Vision". IEEE Trans. Neural Networks, Vol. 14, No. 5, pp. 1313-1336, Sep. 2003.
  30. B. Roska, F.S. Werblin, "Vertical Interactions Across Ten Parallel, Stacked Representations in the Mammalian Retina". Nature, Vol. 410, pp. 583-587, March 2001.
  31. A. Adamatzky, et al., "Reaction-Diffusion Navigation Robot Control: from Chemical to VLSI Analogic Processors", IEEE Transactions on Circuits and Systems I, Vol. 51, No. 5, pp. 926-938, May 2004.
  32. D. Bálya, I. Petrás, T. Roska, R. Carmona and A. Rodríguez, "Implementing the Multilayer Retinal Model on the Complex-Cell CNN-UM Chip Prototype". Int. J. Bifurcations and Chaos, Vol. 14, No. 2, pp. 427-452, World Scient. Feb. 2004.
  33. F. J. Sánchez, C. M. Domínguez, R. Carmona and A. Rodríguez, A Microcontroller Unit Based on SIMPLEZ for Embedded Control of a Vision SoC. Internal Report. IMSE-CNM-CSIC. April 2006.
  34. L. Ferrigno, A. Pietrosanto, "A Low Cost Visual Sensor Node for Bluetooth Based Measurement Networks". Proc. of the Instrumentation and Measurement Technology Conference, pp. 895-900, Como (Italy), May 2004.
  35. Z.-Y. Cao, Z.-Z. Ji, M.-Z. Hu, "An Image Sensor Node for Wireless Sensor Networks". Proc. of the IEEE Comp. Soc. International Symposium on Information Technology, Coding and Computing (ITCC 2005), Vol. 2, pp. 739-745, Las Vegas, Nevada, USA, April 2005.
  36. T. Teixeira, A. G. Andreou, E. Culurciello, "Event-Based Imaging with Active Illumination in Sensor Nets". IEEE Int. Symp. Circuits and Systems, Vol. 1, pp. 644-647, Kobe (Japan) May 2005.
  37. G. Zhang, T. Yang, S. Gregori, J. Liu, F. Maloberti, "Ultra-Low Power Motion-Triggered Image Sensor for Distributed Wireless Sensor Network". Proc. of IEEE Sensor Conference, Vol. 2, pp. 1141-1146, Toronto (Canada) Oct. 2003.
  38. S. Hengstler, D. Prashanth, F. Sufen, and H. Aghajan, "MeshEye: A Hybrid-Resolution Smart Camera Mote for Applications in Distributed Intelligent Surveillance". Proc. 6th Int. Symp. Information Processing in Sensor Networks, pp. 360-369, April 2007.
  39. A. Rowe, D. Goel, and R. Rajkumar, "FireFly Mosaic: A Vision-Enabled Wireless Sensor Networking System". Proc. of the 28th IEEE International Real-Time Systems Symposium, pp. 459-468, January 2007.
  40. I. Diaz, M. Heijligers, R. Kleihorst, and A. Danilin, "An Embedded Low Power High Efficient Object Tracker for Surveillance Systems". Proc. 1st ACM/IEEE Int. Conf. Distributed Smart Cameras, pp. 372-378, September 2007.
 