Machine learning and signal processing design for edge acoustic applications

Edge computing is an architecture in which data and processing are placed as close as possible to end users. Cloud computing, by contrast, performs its computations in centralized data storage and processing units, where algorithms enjoy the luxury of abundant computing power. Edge computing decentralizes such computations for reasons of privacy, latency, or energy efficiency. For his Ph.D. thesis, Emad Ibrahim used methods and practices from digital signal processing and machine learning to generate new solutions hosted on edge platforms.

Applications hosted on an edge platform require resource-aware strategies to fit an algorithm within the limited storage and processing budget of the edge device. In his Ph.D. research, Emad Ibrahim used methods, practices, and processes from both digital signal processing (DSP) and machine learning (ML), with the aim of generating new system-level solutions in which DSP techniques improve the ability of ML models to converge on an optimized solution.

Ultrasonic gestures

Any edge device also requires a means of control. Acoustics captured with commercial off-the-shelf (COTS) sensor components can provide contactless control via in-air ultrasonic gestures and/or speech. At the core of such control is a systematic combination of DSP and ML techniques.

In-air ultrasonic gestures are hand gestures that are detected by first transmitting an ultrasound signal from a device and then extracting useful information from the received echoes. This can be of great value in devices that lack enough space to host physical buttons.
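To make the echo principle concrete, here is a minimal sketch (assuming a 48 kHz consumer audio codec and a 20 kHz probe tone; all names and values are illustrative and not taken from Ibrahim’s thesis) of one common way to extract information from received echoes: cross-correlating the microphone recording with the transmitted pulse to estimate the echo delay, and hence the distance to the reflecting hand.

```python
# Illustrative sketch (not from the thesis): estimate the distance to a
# reflecting hand from the delay of an ultrasonic echo.
import numpy as np

FS = 48_000             # sample rate (Hz), typical of consumer audio hardware
F_CARRIER = 20_000      # ultrasonic probe tone just above the audible range (Hz)
SPEED_OF_SOUND = 343.0  # m/s in room-temperature air

def make_pulse(duration_s=0.002):
    """Short, windowed ultrasonic tone burst used as the probe signal."""
    t = np.arange(int(duration_s * FS)) / FS
    return np.hanning(t.size) * np.sin(2 * np.pi * F_CARRIER * t)

def echo_distance(recording, pulse):
    """Cross-correlate the recording with the transmitted pulse; the lag of
    the correlation peak is the round-trip travel time of the echo."""
    corr = np.correlate(recording, pulse, mode="valid")
    round_trip_s = int(np.argmax(np.abs(corr))) / FS
    return 0.5 * round_trip_s * SPEED_OF_SOUND  # one-way distance in metres

# Toy usage: simulate an echo arriving after 3 ms (a hand roughly 0.5 m away).
pulse = make_pulse()
recording = np.zeros(FS // 10)
delay = int(0.003 * FS)
recording[delay:delay + pulse.size] += 0.2 * pulse
print(f"estimated distance: {echo_distance(recording, pulse):.2f} m")
```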

The combination of low-cost speech recognition with ultrasonic gesture recognition opens the door to innovative applications in consumer electronics. Ultrasonic gesture detection uses the Doppler shift caused by hand (or object) motion as its building block: the shift's variations in time and space form unique, detectable patterns. Some edge devices have space to accommodate only one sensor, while others can host multiple sensors. Each scenario calls for a different computation flow and requires further research for optimum performance; Ibrahim and his colleagues explore these flows and recommend different applications.
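As a rough illustration of how such Doppler patterns can be turned into features for an ML model, the sketch below assumes a steady ultrasonic tone is being played and tracks how the received energy spreads around the carrier frequency over time; the frequencies, bandwidths, and function names are assumptions for illustration rather than Ibrahim’s implementation.

```python
# Illustrative sketch (not from the thesis): turn the Doppler shift around a
# continuously transmitted ultrasonic tone into a simple time series that a
# small classifier could learn gestures from.  Motion towards the device
# shifts received energy above the carrier, motion away shifts it below
# (f_d ≈ 2 * v * f_carrier / c).
import numpy as np
from scipy.signal import stft

FS = 48_000         # sample rate (Hz)
F_CARRIER = 20_000  # steady ultrasonic tone assumed to be playing (Hz)

def doppler_features(mic_signal, band_hz=500, nperseg=2048):
    """Spectral centroid (in Hz, relative to the carrier) per STFT frame.

    Positive values indicate net motion towards the device, negative values
    motion away; their evolution over time forms the gesture pattern.
    """
    freqs, _, spec = stft(mic_signal, fs=FS, nperseg=nperseg)
    power = np.abs(spec) ** 2
    band = (freqs > F_CARRIER - band_hz) & (freqs < F_CARRIER + band_hz)
    offsets = freqs[band] - F_CARRIER            # Hz relative to the carrier
    band_power = power[band, :]
    total = band_power.sum(axis=0) + 1e-12       # guard against division by zero
    return (offsets[:, None] * band_power).sum(axis=0) / total

# Toy usage: a tone drifting slightly upward in frequency mimics a hand
# approaching the device, so the centroid trends positive over time.
t = np.arange(FS) / FS
approaching = np.sin(2 * np.pi * (F_CARRIER + 100 * t) * t)
print(doppler_features(approaching)[:5])
```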

One such application presented in Ibrahim’s work is a Virtual Proximity Detector (VPD) that robustly detects proximity in smartphones using only the ultrasound band of the built-in speaker. Another application is detecting hand gestures with a sensor array whose form factor is small enough to be embedded in consumer electronics such as smart speakers.
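A minimal sketch of how a virtual proximity detector of this kind could work, under the assumption that the phone plays a steady ultrasonic tone and that a nearby reflector (such as the user’s ear) raises the received level at the carrier; all constants and thresholds are illustrative only.

```python
# Illustrative sketch (not from the thesis): a virtual proximity detector.
# The earpiece plays a steady ultrasonic tone; a nearby object such as the
# user's ear reflects more of it back, raising the received power at the
# carrier above a calibrated free-space baseline.
import numpy as np

FS = 48_000
F_CARRIER = 20_000

def carrier_power(mic_frame):
    """Received power at the carrier via a single-bin discrete Fourier sum."""
    n = np.arange(mic_frame.size)
    phasor = np.exp(-2j * np.pi * F_CARRIER * n / FS)
    return np.abs(np.dot(mic_frame, phasor)) ** 2 / mic_frame.size

def is_near(mic_frame, baseline, margin=2.0):
    """Declare 'near' when the carrier power exceeds the free-space baseline
    by an (arbitrarily chosen) safety margin."""
    return carrier_power(mic_frame) > margin * baseline

# Toy usage: a stronger reflected tone should trip the detector.
t = np.arange(1024) / FS
far_frame = 0.01 * np.sin(2 * np.pi * F_CARRIER * t)    # weak reflection
near_frame = 0.05 * np.sin(2 * np.pi * F_CARRIER * t)   # strong reflection
baseline = carrier_power(far_frame)
print(is_near(far_frame, baseline), is_near(near_frame, baseline))
```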
