Sonoscopy of a Panacoustic
2021 · Book · Podcast

This book does not provide direct access to its content (which exists as an audio file); instead, it visualizes how an automatic listening program analyzes sounds and deduces information about their nature. I recorded myself reading the book's content in various locations (at home, in the city, in a bar, in a cathedral…) using a microphone that captured both my voice and the surrounding environment. I then submitted this recording to a pre-trained, open-source neural network for audio pattern recognition, to experiment with the automatic identification of sound environments. This customized program transcribes every sound event, such as speech, a drawer opening and closing, or a meow, and attributes a recognition rate to each. The textual interpretation it produces of the recording is the sole content composing the pages of the book.
Upon analyzing the recognized sounds, I noticed that biases had crept in. For example, my voice is always labeled as a male voice: “Male speech,” because the program was trained on a sound database where male voices are predominant.
‘Metadata’, visible on the first line, groups the file's metadata, such as its name, block length, block size, sampling rate, and number of blocks, along with the parameters I pre-specified for the analysis.
‘Raw_datas’ is the block-by-block list of all the sounds successively recognized in the recording. Each block is formatted as follows: [['Speech', 0.7751136], ['Music', 0.21910243], ['Inside, small room', 0.13027504], ['Male speech, man speaking', 0.11066271]], enclosed in brackets.
‘Tags’ indicates the recurrence of each identified sound across the different blocks, together with its recognition rate. For example, ‘Walk, footsteps’: [0.15982443, 0, 0, 0, …] means that footsteps are recognized at 15.98% in block 1, but not in blocks 2 and 3, and so on.
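The two structures described above can be derived from one another: once a tagger has produced per-block label/score pairs, ‘Raw_datas’ keeps the top labels of each block, while ‘Tags’ pivots the same scores into one series per label. The sketch below is illustrative only, not the artist's actual program; the function name, the `top_k` parameter, and the dict-based input format are all assumptions.

```python
def build_structures(block_scores, top_k=4):
    """Illustrative sketch (not the original program).

    block_scores: one dict per analysis block, mapping a recognized
    label (e.g. 'Speech') to its recognition rate.
    Returns (raw_datas, tags) in the shapes described in the text.
    """
    # 'Raw_datas': for each block, the top-k labels by descending score,
    # as [label, score] pairs, mirroring the bracketed format in the book.
    raw_datas = [
        [[label, score]
         for label, score in sorted(scores.items(),
                                    key=lambda kv: kv[1],
                                    reverse=True)[:top_k]]
        for scores in block_scores
    ]

    # 'Tags': for every label seen in any block, its score in each
    # successive block, with 0 where the sound was not recognized.
    labels = sorted({label for scores in block_scores for label in scores})
    tags = {label: [scores.get(label, 0) for scores in block_scores]
            for label in labels}
    return raw_datas, tags


# Hypothetical example using the scores quoted in the text.
blocks = [
    {'Speech': 0.7751136, 'Music': 0.21910243, 'Walk, footsteps': 0.15982443},
    {'Speech': 0.81, 'Inside, small room': 0.12},
]
raw_datas, tags = build_structures(blocks)
# tags['Walk, footsteps'] is [0.15982443, 0]: recognized in block 1, absent in block 2.
```

This mirrors how a per-block classifier output (as produced by open-source audio taggers trained on large sound-event datasets) can be reshaped into both a chronological transcript and a per-sound recurrence table.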
Exhibited June 19–23, 2023, at Le 108 in Orléans.
Photo: La Labomedia
Photo: Paul de Lanzac