The ever-growing amount of media content published each day makes it extremely challenging for human editors to consistently segment and annotate it. However, segmentation and labelling of media content are necessary to make short-form content available. For example, if someone is looking for a particular piece of news from a programme aired one month ago, some pre-segmentation and/or annotation of the news story in the show would be a massive help!
So how can we segment and annotate media content without direct human effort? It is probably no surprise that the answer is artificial intelligence (AI). PhD student Iacopo Ghinassi has been working with BBC R&D on ways to solve this problem as part of our Data Science Research Partnership.
I have been working on a fascinating project that uses AI to segment and annotate TV and radio programmes automatically. The project is part of my PhD, which the BBC sponsors. The BBC has provided valuable data and continuous support that allowed me and my supervisors (Dr Huy Phan and Prof Matthew Purver) to investigate new ways of automatically understanding the content of media.
'Understanding' is, in fact, crucial to solving the problem of segmenting an otherwise undivided piece of content, such as a news show or a podcast. We aim to segment content by topic, meaning that an automatic system needs to 'understand' when the topic changes. To achieve this, we turn to the branch of AI that is concerned with understanding human language. We also explore sounds and acoustic elements, but understanding language is crucial if we want to isolate a self-contained section of the programme on one topic and, eventually, label the segment with the topic itself.
In a sense, this is not too different from what a search engine does when trying to return results relevant to your query. That's why our research takes a different direction from previous work on the topic, investigating models and techniques from AI that are closely connected to search. A general understanding of language like this could be a unique way to segment and label content, recognising different topics and the way they appear within the programme. If our algorithm has a good understanding of the content, we can then potentially adapt it for things like automatic summarisation at little or no cost!
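To make the idea of detecting a topic change concrete, here is a minimal sketch in Python. It is not the system described in this post: it swaps the language models used in the research for a simple word-overlap measure, and it starts a new segment whenever two neighbouring sentences share almost no vocabulary. The function names, the stopword list, the threshold and the example transcript are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; a real system would need far more than this.
STOPWORDS = {"the", "a", "in", "on", "and", "by", "was", "of", "to"}


def bag_of_words(sentence):
    """Count content words; a crude stand-in for a learned sentence representation."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment(sentences, threshold=0.1):
    """Start a new segment wherever adjacent sentences fall below the similarity threshold."""
    if not sentences:
        return []
    vectors = [bag_of_words(s) for s in sentences]
    segments = [[sentences[0]]]
    for prev, curr, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, curr) < threshold:
            segments.append([sentence])  # assumed topic change
        else:
            segments[-1].append(sentence)
    return segments


# Made-up transcript with two topics, for illustration only.
transcript = [
    "The government announced a new budget today.",
    "The budget raises spending on schools and hospitals.",
    "In football, the home team won the cup final.",
    "The final was decided by a late goal.",
]

for number, sentences in enumerate(segment(transcript), start=1):
    print(number, sentences)
```

Word overlap breaks down as soon as two sentences discuss the same story in different words, which is one reason to prefer models with a more general understanding of language.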
I presented this work at an important academic workshop about the broadcasting industry's use of data science, which led to a published paper. Another paper is on its way, documenting the latest system built with this approach, which correctly segmented a set of 270 news programmes from the BBC News Channel more than 90% of the time. This system has been adopted by R&D in a prototype news segmentation system called Yuzu, which will be used to explore potential applications for automatic segmentation.
Much more is yet to come, though! The potential that AI and data science have in helping shape processes and media consumption is, if not limitless, very far-reaching. I'm glad to have had an opportunity to lay a (small) tile on that path.