Intellistener Audio Annotation
Workshop by Dirk van Oosterbosch
Friday, June 10, 2005
De Waag, Amsterdam
The intellistener project revolves around annotated audio. It deals with creating annotations over audio tracks, maintaining a collection of rich audio files, creating a mesh of links between those files, and playing the audio back constrained by that mesh.
Markers in an audio recording, or side notes "in the margin", should be bound to a specific time in the audio, so that the actual spoken material can be accessed accurately. I see these markers or notes as sitting in their own specific layer on top of the audio timeline. So you might have a layer of textual side notes, a layer of keywords indexing the subjects spoken about, or a layer of markers referencing other audio files. These synchronized layers of metadata, relevant to the bare audio file, should be bundled with that audio file in an atomic annotated-audio wrapper. This allows the annotated audio files to be shared and edited independently. This structure of markers in layers, and layers on top of files, is similar to what the written medium (e.g. books, journals, newspapers) has had for a long time: title sheets, tables of contents, headings, footnotes, page numbers, a margin to scribble things in, indexes, and references.
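The wrapper described above can be pictured as a small data structure: time-bound markers, grouped into named layers, bundled with the path to the bare audio file. The sketch below is only an illustration of that idea; all class and field names (Marker, Layer, AnnotatedAudio, markers_before) are hypothetical and not part of the actual application.

```python
from dataclasses import dataclass, field

@dataclass
class Marker:
    """A note or keyword bound to a specific time (in seconds) in the audio."""
    time: float
    text: str

@dataclass
class Layer:
    """One synchronized layer of metadata, e.g. side notes or keywords."""
    name: str
    markers: list = field(default_factory=list)

@dataclass
class AnnotatedAudio:
    """The atomic wrapper: a bare audio file bundled with its metadata layers."""
    audio_path: str
    layers: list = field(default_factory=list)

    def markers_before(self, t: float):
        """Return (layer name, marker) pairs set at or before time t, across layers."""
        return [(layer.name, m)
                for layer in self.layers
                for m in layer.markers
                if m.time <= t]

# Example: a recording with a side-note layer and a keyword layer.
notes = Layer("side notes", [Marker(12.5, "speaker introduces hyperaudio")])
keywords = Layer("keywords", [Marker(12.5, "hyperaudio"), Marker(40.0, "Ted Nelson")])
wrapper = AnnotatedAudio("interview.aiff", [notes, keywords])
```

Because everything lives inside one object, the whole bundle can be shared or edited as a unit, which is the point of the atomic wrapper.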
With the use of computers, these kinds of structures could grow into much more complex systems. You could call that hyperaudio, analogous to hypertext, the system designed by Ted Nelson. The greater goals of this project should then also be seen along these lines: to create a system so well structured, and with so many layers of abstraction, context, semantics and reference, that it functions as a speaking collective knowledge repository, one that, as Doug Engelbart would say, augments human intellect. I envision that "dreamed" system as an artificially intelligent conversation partner. Together with the machine, the user should be able to flow seamlessly from topic to topic, from thread to thread. The ultimate goal would then be a system which, in symbiosis with the user, reaches a state of singularity, of convergence, and leaves the user, after some insightful paradigm shifts, with a feeling of euphoria about how well everything fits together: "Eureka, the world is finally understood!"
Hence the name "intellistener", a contraction of intelligent and listener. But also because the Latin intelligens actually stems from inter (in between) and lego, legere (to collect, to gather, to read, to read out loud or speak justice, and to overhear). So intel-ligent refers to the ability to read between the lines, to get the broader context from a collection. Intellistener therefore stands not only for a system that affords "smarter" listening, but also for one that weaves multiple threads of audio together, drawing significance from their common essence.
Currently I have an application running natively on Mac OS X. The application has two modes of operation: an editing mode and a playback mode. In the editing mode, markers are set and notes are written. In the playback mode, a graph of connected audio fragments is rendered and the user can traverse the different branches leading off the currently playing audio. By switching between editing and playback, the user can hear (and see) how his or her editing decisions turn out in the graph and judge the resulting narrative effect.
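The playback mode described above amounts to walking a directed graph: audio fragments are nodes, editing creates links between them, and the listener chooses which branch to follow from the currently playing fragment. A minimal sketch of that idea, with all names (PlaybackGraph, link, branches, follow) hypothetical and not taken from the actual application:

```python
class PlaybackGraph:
    """Directed graph of audio fragments; edges are the links made while editing."""

    def __init__(self):
        self.links = {}  # fragment id -> list of fragment ids it links to

    def link(self, src, dst):
        """Editing mode: create a link (branch) from one fragment to another."""
        self.links.setdefault(src, []).append(dst)

    def branches(self, current):
        """Playback mode: the branches leading off the currently playing fragment."""
        return self.links.get(current, [])

    def follow(self, current, choice):
        """Jump to a chosen branch; stay on the current fragment if unlinked."""
        return choice if choice in self.branches(current) else current

# Example: an intro fragment links to two possible continuations.
g = PlaybackGraph()
g.link("intro", "clip_a")
g.link("intro", "clip_b")
```

With this structure, switching between editing and playback is just a matter of adding links and then re-traversing the graph to judge the resulting narrative.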
To test the beta version of Intellistener, a series of workshops has been organized, one of which takes place at the final show. The workshops are an opportunity for the beta testers to discover the application, but also to discuss the process of annotating audio and to experiment and play with a shared collection of connected audio files. Which experiment exactly the workshop participants will engage in, or what feature, effect or narrative they will try out, is yet to be seen, or rather heard.