Audio Content Analysis

We develop software to extract meaningful information from audio signals.

Music metadata extraction

Our algorithms automatically extract tonal (scale, melody, key, bass, chords), rhythmic (tempo, meter, beat pattern), timbral (loudness, sound quality, instrumentation, audience sounds), and structural (interlude, chorus, verse, phrase) information from music. These algorithms can operate at large scale, that is, process very large numbers of songs. Combinations of these features can lead to semantic descriptions of music, such as mood, genre, and danceability. Applications of our music-based descriptors range from enhancing existing music recommendation, personalization, and discovery experiences for consumers to developing music-based games.
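
To make one of the rhythmic descriptors concrete, here is a minimal sketch of tempo estimation by autocorrelation of an onset-strength envelope. It is a textbook illustration and not a description of our algorithms. The function name, the envelope input, the frame rate, and the 60 to 180 BPM search range are all assumptions chosen for the example.

```python
def estimate_tempo_bpm(envelope, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Estimate tempo (BPM) from an onset-strength envelope.

    Hypothetical helper for illustration: `envelope` is a list of
    non-negative onset strengths sampled at `frame_rate` frames per second.
    """
    n = len(envelope)
    # Convert the BPM search range into a range of lags (in frames).
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(n - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if n == 0 or min_lag > max_lag:
        raise ValueError("envelope is too short for the requested BPM range")

    # Remove the mean so that a constant offset does not dominate the score.
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]

    # Pick the lag with the highest autocorrelation; ties keep the shorter lag.
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    # One beat every `best_lag` frames, converted to beats per minute.
    return 60.0 * frame_rate / best_lag


# Synthetic envelope: one onset every 0.5 s at 100 frames per second.
envelope = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
tempo = estimate_tempo_bpm(envelope, frame_rate=100.0)
```

A real system would first compute the onset envelope from audio and would need to handle tempo changes and octave errors (for example, reporting half or double the true tempo), which this sketch ignores.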

Music similarity analysis

Do you have a large music database cluttered with duplicate audio files that needs cleaning up? Need to identify the tune stuck in your head? At a concert and want to identify the original version of the cover being played on stage? All of these use cases can be handled by systems that determine whether two pieces of music are similar. We develop algorithms that identify exact matches (duplicates) of music tracks, hummed or sung renditions of a song, and cover versions of original songs (provided that a human listener would also identify them as similar). Such algorithms can enhance existing music identification services.
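
As an illustration of how a hummed rendition might be compared with a reference melody, the sketch below uses dynamic time warping (DTW) over pitch sequences. It is a generic technique shown under stated assumptions, not our matching algorithm: pitches are assumed to be already extracted as MIDI note numbers, and the names dtw_distance and melody_distance are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute difference)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def melody_distance(query, reference):
    """Hypothetical similarity score: lower means more similar.

    Each sequence is centered on its own mean pitch so that singing in a
    different key is not penalized; DTW then absorbs timing differences.
    """
    if not query or not reference:
        raise ValueError("both melodies must be non-empty")
    q_mean = sum(query) / len(query)
    r_mean = sum(reference) / len(reference)
    q = [p - q_mean for p in query]
    r = [p - r_mean for p in reference]
    # Normalize by total length so long melodies are not penalized.
    return dtw_distance(q, r) / (len(q) + len(r))


reference = [60, 62, 64, 65, 67]        # reference melody
hummed = [63, 63, 65, 67, 67, 68, 70]   # same tune, higher key, uneven timing
other = [60, 60, 67, 67, 69, 69, 67]    # a different tune

closer = melody_distance(hummed, reference) < melody_distance(hummed, other)
```

Mean-centering is a simple choice for key invariance; it is slightly sensitive to how long each note is held, which is one reason practical systems use more robust representations.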

Speech analysis

We provide software algorithms for a host of speech applications. One example is spoken language learning. Current systems present the learner with the ideal utterance and expect the learner to learn by repetition, but do not provide real-time feedback on the correctness of the learner's utterance. Our algorithms can rate the learner's utterance against the benchmark ideal utterance in real time, thereby greatly enhancing the spoken language learning experience. Other speech applications that our algorithms can power include systems that identify gender from speech, systems that differentiate between speech, silence, and music, and limited-vocabulary speech recognition systems.
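
The simplest piece of the speech, silence, and music problem is separating silence from everything else. The sketch below labels fixed-length frames by their short-time energy. It is only an illustration under assumptions: samples are floats in the range -1 to 1, and the frame length, threshold, and function names are hypothetical values chosen for the example. Telling speech apart from music requires richer features than energy alone.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square level of each non-overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def label_frames(samples, frame_len=400, threshold=0.01):
    """Label each frame "silence" or "sound" by comparing RMS to a threshold.

    Hypothetical defaults: 400 samples is 25 ms at a 16 kHz sampling rate,
    and the threshold would need tuning for real recordings.
    """
    return [
        "silence" if level < threshold else "sound"
        for level in frame_rms(samples, frame_len)
    ]


# Synthetic signal: 50 ms of silence followed by 50 ms of a 200 Hz tone.
sample_rate = 16000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
labels = label_frames(silence + tone)
```

A fixed threshold works only for clean recordings at a known level; noisy input usually calls for an adaptive threshold or a trained classifier.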