We develop software to extract meaningful information from audio signals
Music metadata extraction
Our algorithms automatically extract tonal (scale, melody, key, bass, chords), rhythmic (tempo, meter, beat pattern), timbral (loudness, sound quality, instrumentation, audience sounds),
and structural (interlude, chorus, verse, phrase) information from music. These algorithms operate at large scale, i.e. they can process very large numbers of songs. Combinations of these features can lead
to semantic descriptions of music, such as mood, genre, and danceability. Applications of our music-based descriptors range from enhancing existing music recommendation, personalization, and discovery
experiences for consumers to developing music-based games.
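To make one of the rhythmic descriptors concrete, here is a minimal sketch of tempo estimation by autocorrelation. It is illustrative only and is not the algorithm described above. The function name `estimate_tempo`, the input `onset_envelope` (one note-onset strength value per analysis frame), and the BPM search range are all assumptions made for the example.

```python
import math


def estimate_tempo(onset_envelope, frames_per_second, min_bpm=60.0, max_bpm=200.0):
    """Return a tempo estimate in beats per minute, or None if no pulse is found.

    Hypothetical helper: picks the lag (in frames) at which the onset
    envelope correlates most strongly with a shifted copy of itself.
    """
    # Translate the BPM search range into a lag range measured in frames.
    min_lag = max(1, math.ceil(60.0 * frames_per_second / max_bpm))
    max_lag = min(len(onset_envelope) - 1,
                  math.floor(60.0 * frames_per_second / min_bpm))

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(onset_envelope[i] * onset_envelope[i + lag]
                    for i in range(len(onset_envelope) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    return 60.0 * frames_per_second / best_lag


# Synthetic envelope: one onset every 4 frames at 8 frames per second,
# i.e. one beat every 0.5 seconds.
pulse = [1.0, 0.0, 0.0, 0.0] * 8
tempo = estimate_tempo(pulse, frames_per_second=8)
```

A practical system would first compute the onset envelope from the audio itself and would have to resolve ambiguities such as half-tempo and double-tempo peaks, which this sketch does not attempt.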
Music similarity analysis
Do you have a large music database cluttered with duplicate audio files that needs cleaning up? Need to identify the tune stuck in your head? At a concert and want to identify the original
version of the cover being played on stage? All of these use cases can be handled by systems that determine whether two pieces of music are similar to each other. We develop algorithms that
identify exact matches (duplicates) of music tracks, hummed or sung renditions of a song, and cover versions of original songs (provided that a human listener would also judge them to be
similar). Such algorithms can enhance existing music identification services.
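As a rough illustration of the duplicate-detection case, the sketch below compares two tracks by the cosine similarity of their frame-wise energy envelopes. This is a deliberately crude stand-in for a real audio fingerprint and is not the method described above. The names `energy_envelope`, `cosine_similarity`, and `looks_like_duplicate`, the frame size, and the 0.99 threshold are all assumptions chosen for the example.

```python
import math


def energy_envelope(samples, frame_size=4):
    """Root-mean-square energy of consecutive, non-overlapping frames."""
    envelope = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        envelope.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return envelope


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, truncated to the shorter one."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def looks_like_duplicate(track_a, track_b, threshold=0.99):
    """Hypothetical decision rule: envelopes that point the same way match."""
    similarity = cosine_similarity(energy_envelope(track_a),
                                   energy_envelope(track_b))
    return similarity >= threshold


# Toy signals: `quieter` is the same track at half volume, `other` is different.
track = [0, 0, 0, 0, 1, -1, 1, -1, 0.5, -0.5, 0.5, -0.5, 0, 0, 0, 0]
quieter = [0.5 * s for s in track]
other = [1, -1, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 1, -1, 1, -1]
```

Because cosine similarity ignores overall scale, the half-volume copy still matches. Hummed renditions and cover versions need features that tolerate changes in key, tempo, and instrumentation, which an energy envelope does not capture.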
Speech analysis
We provide software algorithms for a host of speech applications. One example is spoken language learning. Current systems play the ideal utterance to the learner and expect the learner
to learn by repetition, but they do not provide real-time feedback on the correctness of the learner's utterance. Our algorithms can rate the learner's utterance against the benchmark ideal utterance
in real time, thereby greatly enhancing the spoken language learning experience. Other speech applications that our algorithms can power include systems that identify gender
from speech, systems that differentiate between speech, silence, and music, and limited-vocabulary speech recognition systems.
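One common way to compare a learner's utterance with a reference when the two are spoken at different speeds is dynamic time warping (DTW). The sketch below is illustrative only and is not the rating algorithm described above. It assumes each utterance has already been reduced to a sequence of one number per frame (for example a pitch or energy contour), and the names `dtw_distance` and `rate_utterance` as well as the 0-to-1 scoring formula are hypothetical.

```python
def dtw_distance(reference, attempt):
    """Classic dynamic time warping cost between two 1-D feature sequences."""
    n, m = len(reference), len(attempt)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of the first i reference frames
    # with the first j attempt frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - attempt[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # attempt frame reused
                                     cost[i][j - 1],      # reference frame reused
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def rate_utterance(reference, attempt):
    """Hypothetical score in (0, 1]; 1.0 means a perfect warped match."""
    distance = dtw_distance(reference, attempt)
    return 1.0 / (1.0 + distance / max(len(reference), len(attempt)))


# Toy contours: `slow` is the reference spoken at half speed,
# `off` follows a different contour altogether.
reference = [1.0, 2.0, 3.0, 2.0, 1.0]
slow = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 2.0, 2.0, 1.0, 1.0]
off = [3.0, 1.0, 3.0, 1.0, 3.0]
```

This version needs both complete sequences before it can score them, so it runs after the utterance ends. Real-time feedback would require an incremental or windowed variant.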