In conclusion, we have demonstrated a video pre-processing sensor that acts as a convolutional neural network accelerator by handling occlusion directly at the intelligent system’s level, leading to cost-effective and latency-free real-time object detection and tracking. The core idea behind such an edge-computing computer vision camera is to introduce photovoltaic pixels that serve a dual function. First, the pixels are capable of learning the perceived object and use this knowledge to reconstruct it when it is blocked, without resorting to computationally expensive and potentially laggy cloud computing. Second, the pixels can estimate the speed of a moving object through the analog dependence of the open-circuit voltage on illumination time. This feature enables tracking of an obstructed object by predicting its most probable path from the route travelled prior to the occlusion event. We have illustrated how our sensor successfully handles some simple occlusion scenarios related to everyday situations. Increasing the number of pixels could allow the detection and tracking of more complex objects travelling in different directions.
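As a purely illustrative sketch of the second function, the Python snippet below shows one way an open-circuit voltage reading could be turned into a speed estimate and then into a predicted position during occlusion. It is not the sensor's actual processing chain: the calibration table, the assumption that the illumination time equals the time the object spends over one pixel, the constant-velocity extrapolation, and all function names are hypothetical choices made for this example.

```python
from bisect import bisect_left

# Hypothetical calibration table: open-circuit voltage (V) reached after a
# known illumination time (s). Illustrative values only, not measured data.
CALIB_VOC = [0.10, 0.20, 0.30, 0.40]
CALIB_TIME = [0.5, 1.0, 2.0, 4.0]


def illumination_time(voc):
    """Linearly interpolate the illumination time from a V_oc reading."""
    # Clamp readings that fall outside the calibrated range.
    if voc <= CALIB_VOC[0]:
        return CALIB_TIME[0]
    if voc >= CALIB_VOC[-1]:
        return CALIB_TIME[-1]
    i = bisect_left(CALIB_VOC, voc)
    v0, v1 = CALIB_VOC[i - 1], CALIB_VOC[i]
    t0, t1 = CALIB_TIME[i - 1], CALIB_TIME[i]
    return t0 + (t1 - t0) * (voc - v0) / (v1 - v0)


def estimate_speed(voc, pixel_pitch):
    """Estimate speed, assuming the object covers one pixel pitch
    during the inferred illumination time (an assumption of this sketch)."""
    return pixel_pitch / illumination_time(voc)


def predict_position(last_pos, direction, speed, elapsed):
    """Extrapolate the position along the last observed direction,
    assuming constant velocity while the object is occluded."""
    return tuple(p + d * speed * elapsed for p, d in zip(last_pos, direction))


if __name__ == "__main__":
    speed = estimate_speed(voc=0.25, pixel_pitch=1.0)
    # Object last seen at pixel (2, 0) moving along +x, occluded for 3 s.
    print(predict_position((2.0, 0.0), (1.0, 0.0), speed, elapsed=3.0))
```

In this sketch a longer illumination time maps to a lower speed, and the predicted path is simply a straight-line continuation of the route observed before the occlusion event.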

Acknowledgements

This work is supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 1 (2020-T1-001-061) and by the National Institute of Education, Singapore, under its NIE Academic Research Fund (Project Reference No.: RI 4/17 THN).

Conflict of interest

The authors declare no conflict of interest.