Archive for May, 2009

    Audio mosaicing with hidden Markov models initialized for chord detection

    Thursday, May 21st, 2009
     collage 8192 by makeyourselftransparent 

     

    by Adam Rokhsar
    This song was composed by me in Pro Tools and Max/MSP. I then performed chroma extraction in MATLAB to obtain a frame-by-frame pitch class profile for the entire song. Octave information is lost: the data contains only an energy value for each of the 12 chromatic pitch classes of Western music, which serves as a good template for identifying chords. The chroma features are treated as the observations of a hidden Markov model (HMM), and the hidden state is the chord responsible for the chroma distribution of that frame.
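    As a rough illustration of why octaves disappear in a chroma vector, the sketch below folds spectral peaks into 12 pitch-class bins. It is not the MATLAB code used for the piece; the function name `chroma_from_peaks` and the input format (a list of frequency/energy pairs) are assumptions made for the example.

```python
import math

def chroma_from_peaks(peaks):
    """Fold (frequency_hz, energy) pairs into a 12-bin pitch class profile.

    Bin 0 is C, bin 1 is C#, and so on up to bin 11 (B).
    """
    chroma = [0.0] * 12
    for freq, energy in peaks:
        # Standard MIDI note number, with A4 = 440 Hz = note 69.
        midi = round(69 + 12 * math.log2(freq / 440.0))
        # Taking the note number modulo 12 discards the octave.
        chroma[midi % 12] += energy
    return chroma

# C4 and C5 land in the same bin; A4 lands in bin 9.
frame = chroma_from_peaks([(261.63, 1.0), (523.25, 0.5), (440.0, 2.0)])
```

    Here the two C peaks, an octave apart, add up in a single bin, which is the sense in which the profile keeps only one energy value per chromatic note.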
    HMMs are a kind of finite state machine in which the probability of an observation depends on the current state of the system. The state is not directly observable, hence “hidden.” The model has three parameters: the initial state distribution (pi), the transition probabilities between states (A), and the emission probabilities for each state (B).
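    To make the three parameters concrete, here is a toy HMM with two hidden states and two discrete observation symbols. The numbers are invented for illustration and have nothing to do with the chord model; `joint_probability` is a hypothetical helper that multiplies out pi, A, and B along one path.

```python
# Toy parameters (made up for this example).
pi = [0.6, 0.4]              # initial state distribution
A = [[0.7, 0.3],             # A[i][j]: probability of moving from state i to j
     [0.4, 0.6]]
B = [[0.9, 0.1],             # B[i][k]: probability that state i emits symbol k
     [0.2, 0.8]]

def joint_probability(states, observations):
    """Probability of one particular state path together with its observations."""
    p = pi[states[0]] * B[states[0]][observations[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
    return p

p = joint_probability([0, 0], [0, 0])   # 0.6 * 0.9 * 0.7 * 0.9
```

    In practice only the observations are known, so algorithms such as Viterbi search over the possible state paths rather than evaluating a single one.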
    Using the method described in Bello & Pickens (2005), I initialized the parameters of a hidden Markov model to reflect a priori knowledge of music theory. Specifically, A is initialized to a doubly nested circle of fifths, since certain chord transitions are more likely than others (e.g., Cmaj to Fmaj, as opposed to Cmaj to F#maj), and B is initialized so that the mean and covariance of the chroma values for each state reflect the triad that makes up the chord. During training, I disallowed updates to B and allowed A and pi to be updated as normal. The Baum-Welch algorithm was used for training, and the Viterbi algorithm yielded the most likely chord sequence to explain the chroma features.
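    The sketch below shows one way such an initialization and decoding could look for 24 chords (12 major, 12 minor). It rests on several assumptions that are mine rather than the paper's or the original MATLAB code: the circle layout (C e G b D ...), transition weights that fall off linearly with circle distance, a diagonal variance in place of a full covariance, and all function names. Baum-Welch training is left out, so this only covers the initial A, a triad-shaped B, and Viterbi decoding.

```python
import math

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_names():
    # States 0-11 are major triads, states 12-23 are minor triads.
    return [n + "maj" for n in NOTES] + [n + "min" for n in NOTES]

def triad_template(state):
    # Mean chroma vector for a chord: 1 on the three chord tones, 0 elsewhere.
    root, third = state % 12, (3 if state >= 12 else 4)
    tones = {root, (root + third) % 12, (root + 7) % 12}
    return [1.0 if pc in tones else 0.0 for pc in range(12)]

def circle_positions():
    # Assumed layout: majors a fifth apart, each followed by the minor
    # triad built on its third (C, e, G, b, D, ...).
    pos = {}
    for k in range(12):
        major_root = (7 * k) % 12
        pos[major_root] = 2 * k
        pos[12 + (major_root + 4) % 12] = 2 * k + 1
    return pos

def transition_matrix(eps=0.1):
    # Assumed weighting: closer chords on the circle get more probability.
    pos = circle_positions()
    A = []
    for i in range(24):
        row = []
        for j in range(24):
            d = abs(pos[i] - pos[j])
            row.append(12 - min(d, 24 - d) + eps)
        total = sum(row)
        A.append([w / total for w in row])
    return A

def log_emission(state, chroma, var=0.2):
    # Gaussian log-likelihood up to a constant, with a diagonal variance.
    mu = triad_template(state)
    return -sum((x - m) ** 2 for x, m in zip(chroma, mu)) / (2 * var)

def viterbi(frames, A, pi):
    # Most likely state sequence for a list of 12-bin chroma frames.
    n = len(A)
    score = [math.log(pi[s]) + log_emission(s, frames[0]) for s in range(n)]
    back = []
    for chroma in frames[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + math.log(A[i][j]))
            ptr.append(best)
            score.append(prev[best] + math.log(A[best][j]) + log_emission(j, chroma))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

A = transition_matrix()
frames = [triad_template(0)] * 3 + [triad_template(5)] * 3
labels = [chord_names()[s] for s in viterbi(frames, A, [1 / 24] * 24)]
```

    With this layout, the Cmaj row of A gives Fmaj far more weight than F#maj, which sits on the opposite side of the circle. Freezing B during training would correspond to skipping the emission update step in Baum-Welch while still re-estimating A and pi.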
    Once the chord sequence was determined, I used audio mosaicing to randomly replace each state-labeled frame with another frame from the song carrying the same label (Hoffman, Cook, & Blei, 2008). In other words, each frame labeled Cmaj is replaced with a different frame also labeled Cmaj. collage 8192 is the result.
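    The replacement step can be sketched as a shuffle within each chord label. The function name `mosaic` and the choice to return source frame indices (rather than audio) are assumptions for illustration; a frame whose label occurs only once is left in place, since there is nothing else to swap it with.

```python
import random
from collections import defaultdict

def mosaic(labels, rng=None):
    """Return, for each frame, the index of a random other frame with the same label."""
    rng = rng or random.Random()
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    order = []
    for idx, label in enumerate(labels):
        # Prefer a different frame; fall back to the frame itself if it is unique.
        candidates = [i for i in by_label[label] if i != idx] or [idx]
        order.append(rng.choice(candidates))
    return order

labels = ["Cmaj", "Cmaj", "Fmaj", "Cmaj", "Fmaj", "Gmaj"]
order = mosaic(labels, random.Random(0))
```

    Playing the audio frames in the order given by `order` keeps the chord sequence intact while scrambling which moment of the song each frame comes from.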
    References
    Bello, J.P. and Pickens, J. A Robust Mid-level Representation for Harmonic Content in Music Signals. In Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR-05), London, UK, September 2005.
    Hoffman, M., Cook, P., and Blei, D. Data-driven recomposition using the hierarchical Dirichlet process hidden Markov model. In Proceedings of the 2008 International Computer Music Conference, Belfast, 2008.

    Computer Memory

    Monday, May 18th, 2009

    by Adam Rokhsar

    You are watching a home video my father shot on his camera in the mid-1980s of my sisters and me in the backyard.

    Using C and Max/MSP, I wrote a program that replaces each pixel of the video with the memory address where that pixel is stored inside the computer. The address, written in hexadecimal, is colored according to the value stored there. 
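    The piece itself was written in C and Max/MSP; the Python sketch below only illustrates the mapping from pixel to address. It assumes a packed, row-major RGB buffer with one byte per channel, and the function name `address_labels` is hypothetical. Each pixel becomes a pair: the hexadecimal address of its first byte and the color stored there.

```python
import ctypes

def address_labels(pixels, width, height):
    """Map each RGB pixel to (hex address of its first byte, (r, g, b))."""
    n = width * height * 3
    buf = (ctypes.c_ubyte * n)(*pixels)   # contiguous frame buffer
    base = ctypes.addressof(buf)          # address of the first byte
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            offset = 3 * (y * width + x)  # assumed packed RGB layout
            color = (buf[offset], buf[offset + 1], buf[offset + 2])
            row.append((hex(base + offset), color))
        rows.append(row)
    return rows

# A 2x2 frame: red, green, blue, and a dark gray-blue pixel.
frame = address_labels([255, 0, 0, 0, 255, 0, 0, 0, 255, 10, 20, 30], 2, 2)
```

    Rendering each hex string as text in its paired color would give the kind of image described above, although the actual addresses change every time the program runs.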

    Because computers store video color as a mix of red, green, and blue, rotating the video reveals the primary color planes that make up all videos.

    This piece is about computer memory and human memory.

    Gospel

    Thursday, May 7th, 2009

    by Adam Rokhsar

    In this piece, the audio generates the video: certain features of the sound (such as amplitude and zero-crossings) are extracted and used to synthesize the video content.
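    As an illustration of the two features named above, the sketch below computes RMS amplitude and a zero-crossing count for one frame of samples. It is not the patch used for the piece, the function name `frame_features` is an assumption, and how the values drive the video is not shown.

```python
import math

def frame_features(samples):
    """Return (rms_amplitude, zero_crossings) for one frame of audio samples."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return rms, crossings

rms, crossings = frame_features([1.0, -1.0, 1.0, -1.0])
```

    A louder frame raises the RMS value and a noisier or higher-pitched frame raises the crossing count, so each can be routed to a different visual parameter.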

    WAVE_WAVE

    Friday, May 1st, 2009

    by Adam Rokhsar

    This piece was displayed in the Jakopic Gallery as part of the multimedia project Senza Televisione.

    Composed in Max/MSP/Jitter. Edited by hand in a hex editor. What you see is generated by the audio.