Real-time acoustic event classification in urban environments using low-cost devices
In modern, ever-evolving society, noise has become a daily threat to a worrying share of the population. Overexposure to high noise levels may interfere with day-to-day activities and can therefore have severe health effects, such as annoyance, cognitive impairment in children, or cardiovascular disease. Some studies point out that it is not only the level of noise that matters but also the type of sound to which citizens are exposed. That is, not all acoustic events have the same impact on the population.
With the technologies that private and public administrations currently use to track noise levels, it is hard to automatically identify which sounds are most prevalent in the most polluted areas. In practice, to assess citizen complaints, technicians are typically sent to survey the area and evaluate whether the complaint is relevant. Due to the high number of complaints generated every day (especially in highly populated areas), the development of Wireless Acoustic Sensor Networks (WASNs) that automatically monitor the noise pollution of a given area has become a research trend. Currently, most of the networks deployed in cities measure only the equivalent noise level by means of expensive but highly accurate hardware, and cannot identify the noise sources present at each spot. Given the high price of these sensors, nodes are typically placed in specific locations and do not monitor wide areas.
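As a brief illustration of the "equivalent noise level" that such networks report, the following minimal sketch computes an unweighted equivalent continuous sound level from calibrated sound-pressure samples. It assumes that the samples are already expressed in pascals and that no frequency weighting (such as A-weighting) is applied; the function name `equivalent_level_db` and the test tone are hypothetical and are not taken from the thesis.

```python
import math

P_REF = 20e-6  # standard reference sound pressure in air: 20 micropascals


def equivalent_level_db(pressure_samples):
    """Return the unweighted equivalent level (dB re 20 uPa) of the samples.

    Assumes pressure_samples holds calibrated sound pressure in pascals.
    """
    if not pressure_samples:
        raise ValueError("need at least one sample")
    # Mean of the squared pressure over the measurement interval.
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (P_REF ** 2))


# Illustrative input: one second of a 1 kHz tone at 48 kHz with 1 Pa RMS.
fs = 48000
tone = [math.sqrt(2) * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
leq = equivalent_level_db(tone)
```

A single figure of this kind summarises the acoustic energy over the interval but says nothing about which source produced it, which is the limitation described above.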
The purpose of this thesis is to address an important challenge that remains open in this field: to acoustically monitor large-scale areas in real time and in a scalable and cost-efficient way. In this regard, the city centre of Barcelona has been selected as a reference use-case scenario to conduct this research. First, this dissertation presents a detailed analysis of a 6-hour annotated dataset of the soundscape of a specific area of the city (l’Eixample). Next, a scalable distributed architecture that uses low-cost computing devices to recognize acoustic events is presented. To validate the feasibility of this approach, a deep learning algorithm running on top of this architecture has been implemented to classify 10 different acoustic categories. As the sensing nodes of the proposed system are arranged so that physical redundancy can be exploited (that is, more than one node may hear the same acoustic event), data has been gathered at four spots in the city centre of Barcelona, respecting the sensor topology. Finally, as real-world events tend to occur simultaneously, the deep learning algorithm has been enhanced to support multilabel (i.e., polyphonic) classification. Results show that, with the proposed system architecture, it is possible to classify acoustic events in real time. Overall, the contributions of this research are the following: (1) the design of a low-cost, scalable WASN able to monitor large-scale areas and (2) the development of a real-time classification algorithm able to run on the designed sensing nodes.
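To make the combination of physical redundancy and multilabel classification more concrete, the sketch below shows one possible way of fusing the outputs of several nodes that hear the same event. It is an assumption for illustration only: the abstract does not state how the thesis combines node outputs, so simple averaging of per-class sigmoid probabilities followed by a fixed threshold is used here. The class names, the function `fuse_and_decide`, the threshold value and the example logits are all hypothetical.

```python
import math

# Placeholder names: the abstract mentions 10 categories but does not list them.
LABELS = ["class_%d" % i for i in range(10)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_and_decide(node_logits, threshold=0.5):
    """Average per-class probabilities across nodes, then threshold each class.

    node_logits: one list of raw per-class scores for each node.
    Returns (fused_probabilities, active_labels). Several labels may be
    active at once, which is what makes the decision multilabel.
    """
    if not node_logits:
        raise ValueError("need the output of at least one node")
    fused = []
    for c in range(len(LABELS)):
        probs = [sigmoid(logits[c]) for logits in node_logits]
        fused.append(sum(probs) / len(probs))
    active = [LABELS[c] for c, p in enumerate(fused) if p >= threshold]
    return fused, active


# Two hypothetical nodes hearing the same scene; they disagree on class_3.
node_a = [2.0, -3.0, -3.0, 1.0, -3.0, -3.0, -3.0, -3.0, -3.0, -3.0]
node_b = [1.5, -3.0, -3.0, -0.5, -3.0, -3.0, -3.0, -3.0, -3.0, -3.0]
fused, active = fuse_and_decide([node_a, node_b])
```

Because each class is thresholded independently rather than picked through a single softmax, two simultaneous events can both be reported, and the second node's weaker evidence for `class_3` is balanced against the first node's stronger evidence.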
Keywords: Acoustic Event Detection, Urban Noise, Wireless Acoustic Sensor Network, Real-time Classification, Polyphonic Event Classification.