GestureSOM


This patch emerged from a broader interest in navigating sonic content in performance. Unlike approaches such as concatenative synthesis, which work with small “grains” of sound, this system searches for Sonic Gestures, which are defined by their spectro-temporal profile. This work builds upon past approaches to sonic gestural navigation, such as those found in the FILTER and GREIS systems, and is currently being explored as one means to performatively engage with what we might call “posthuman gestures” as part of a sonic performance practice.

More Info

Self-Organizing Maps (SOMs) are a simple type of neural network that organizes high-dimensional data into a lower-dimensional space – for example, organizing RGB colours (3-dimensional data) into a 2-dimensional space. The dispersion.gestureSOM Max patch takes sonic gestures from the Performable Gestural Database and organizes them using two parallel 2-dimensional SOMs, each of which can be explored independently.
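As a rough illustration of the idea (not code from the patch), the sketch below trains a small 2-dimensional SOM on random RGB colours in plain TypeScript; after training, similar colours end up in neighbouring cells. The grid size, learning rate, and iteration count are arbitrary choices.

```typescript
// Minimal 2-D SOM sketch: a grid of weight vectors pulled toward samples.
// All names and parameter values here are illustrative, not from the patch.
type Vec = number[];

const width = 10, height = 10, dims = 3;      // 10 x 10 grid of RGB weights
const iterations = 2000;
const initialLearningRate = 0.5;
const initialRadius = Math.max(width, height) / 2;

// Initialize each cell with random weights in [0, 1]
const grid: Vec[] = Array.from({ length: width * height }, () =>
  Array.from({ length: dims }, () => Math.random())
);

const dist2 = (a: Vec, b: Vec) =>
  a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);

// Index of the best-matching unit: the cell whose weights are closest to the sample
const bmu = (sample: Vec) =>
  grid.reduce((best, w, i) => (dist2(w, sample) < dist2(grid[best], sample) ? i : best), 0);

function train(samples: Vec[]) {
  const timeConstant = iterations / Math.log(initialRadius);
  for (let t = 0; t < iterations; t++) {
    const sample = samples[Math.floor(Math.random() * samples.length)];
    const radius = initialRadius * Math.exp(-t / timeConstant);   // shrinking neighbourhood
    const lr = initialLearningRate * Math.exp(-t / iterations);   // decaying learning rate
    const b = bmu(sample);
    const bx = b % width, by = Math.floor(b / width);
    // Pull the BMU and its grid neighbours toward the sample
    grid.forEach((w, i) => {
      const dx = (i % width) - bx, dy = Math.floor(i / width) - by;
      const d2 = dx * dx + dy * dy;
      if (d2 > radius * radius) return;
      const influence = Math.exp(-d2 / (2 * radius * radius));
      w.forEach((wi, k) => (w[k] = wi + lr * influence * (sample[k] - wi)));
    });
  }
}

// Train on random RGB colours; similar colours cluster in neighbouring cells
train(Array.from({ length: 500 }, () => [Math.random(), Math.random(), Math.random()]));
console.log("cell (0,0) weights:", grid[0].map(v => v.toFixed(2)));
```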

Sonic Gestures

Sonic gestures stored in the Performable Gestural Database contain gestural data, the audio that the gesture produced, and various analyses of these components. The GestureSOM patch organizes sonic gestures exported from the database according to user-selected audio descriptor sets. Several sets of descriptors are calculated using the MuBu library, including:

  • Fundamental frequency (Yin)
  • Signal
  • Spectral
  • Perceptual
  • Harmonic

These audio descriptors are calculated for frames across the entirety of each sonic gesture. To reduce each per-frame audio descriptor to single values per sonic gesture, statistical descriptors – mean, median, standard deviation, minimum, maximum, and range – are calculated over each descriptor's frames. The user can select which of these statistical descriptors to use for the organization of the sonic gestures.
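As a sketch of this reduction step (the input shape, a list of per-frame values for one descriptor, is an assumption and the numbers are made up):

```typescript
// Reduce one descriptor's per-frame values to the per-gesture statistics listed above.
interface DescriptorStats {
  mean: number; median: number; std: number; min: number; max: number; range: number;
}

function summarise(frames: number[]): DescriptorStats {
  const n = frames.length;
  const sorted = [...frames].sort((a, b) => a - b);
  const mean = frames.reduce((s, v) => s + v, 0) / n;
  const median = n % 2 ? sorted[(n - 1) / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
  const std = Math.sqrt(frames.reduce((s, v) => s + (v - mean) ** 2, 0) / n);
  const min = sorted[0], max = sorted[n - 1];
  return { mean, median, std, min, max, range: max - min };
}

// Example: per-frame spectral centroid values for one sonic gesture (made-up numbers).
// The user-selected statistics from each descriptor become that gesture's feature vector.
console.log(summarise([812.4, 901.7, 1203.5, 998.2, 874.9]));
```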

Parallel SOMs

The GestureSOM patch makes use of two parallel SOMs – one from the ml.* library and one from the ml-som Node package. Each SOM receives the same set of parameters including size, number of training iterations, learning rate, neighbourhood size, and plasticity. Due to their different implementations, each SOM produces a unique map, even with the same settings.
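The ml.* map is configured directly in Max; on the Node side, a minimal configuration of the ml-som package might look like the sketch below. The constructor and option names used here (new SOM(width, height, options) with fields, iterations, and learningRate) follow the package's documentation as best understood and should be read as assumptions, and the values are placeholders rather than the patch's actual settings.

```typescript
// Hypothetical ml-som setup; option names follow the package README and are
// assumptions, not the patch's exact configuration. ml-som ships without
// TypeScript types, so it is required loosely typed (assumes a Node environment).
const SOM = require('ml-som');

const options = {
  fields: 6,          // dimensionality of the per-gesture feature vectors
  iterations: 1000,   // training iterations (placeholder value)
  learningRate: 0.3,  // initial learning rate (placeholder value)
};

// One of the two parallel maps: a 12 x 12 grid trained on per-gesture
// feature vectors built from the selected statistical descriptors.
const som = new SOM(12, 12, options);

const trainingSet: number[][] = [
  [0.12, 0.80, 0.45, 0.10, 0.95, 0.30],  // gesture 1 (made-up values)
  [0.55, 0.33, 0.21, 0.77, 0.40, 0.62],  // gesture 2 (made-up values)
];
som.train(trainingSet);

// predict() maps a feature vector to its best-matching cell on the grid.
console.log(som.predict(trainingSet[0]));
```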

Once gestures are added to the SOMs, the user can graphically explore the resulting clusters. As the user navigates the selected SOM with the mouse, the weights of the selected SOM cell are used to find the closest matching gesture from among the training gestures. The recalled gestures can be listened to, allowing the user to compare different organizations of gestures based on which descriptor groups were selected.
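A rough sketch of that recall step, assuming the cell weights and the stored gestures share the same feature space built from the selected statistics (names and data shapes here are illustrative):

```typescript
// Recall: find the stored gesture whose feature vector is closest to the
// hovered cell's weight vector (squared Euclidean distance).
interface Gesture { id: string; features: number[] }  // user-selected descriptor stats

function closestGesture(cellWeights: number[], gestures: Gesture[]): Gesture {
  let best = gestures[0];
  let bestDist = Infinity;
  for (const g of gestures) {
    const d = g.features.reduce((s, v, i) => s + (v - cellWeights[i]) ** 2, 0);
    if (d < bestDist) { bestDist = d; best = g; }
  }
  return best;
}

// Example with made-up two-descriptor feature vectors
const stored: Gesture[] = [
  { id: "gesture-01", features: [0.12, 0.80] },
  { id: "gesture-02", features: [0.55, 0.33] },
];
console.log(closestGesture([0.5, 0.4], stored).id);  // -> "gesture-02"
```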