Abstract:
Every day, millions of people search across YouTube’s billions of videos. The search index is built upon
metadata like titles, descriptions and even the content of pages with inbound links. The videos’ image
content is also classified and labelled [2].
However, the videos’ audio content itself is not indexed, and thus not used in matching or ranking. Users
cannot perform text queries for audio content across videos, nor within a given video. In 2008,
YouTube’s parent company Google itself launched a similar project for a limited number of political
videos [1], but it is no longer available.
This work is an attempt to start solving this issue. The technology makes the audio content of YouTube
videos searchable by text, building on recent advances in speech recognition and continued advances in
network bandwidth, storage and processing power.