Streaming service develops AI to detect and filter explicit song lyrics in 40,000 new tracks uploaded every day
- The music streaming service Deezer is developing a new content screening AI
- The tool will analyze song lyrics and flag tracks for potentially explicit content
- Deezer says they receive as many as 40,000 new unlabeled tracks every day
The music streaming service Deezer is developing an AI tool to help determine whether new songs added to its library should be flagged as explicit.
The tool is still in development, but company executives hope it will help them sort through the up to 40,000 new tracks, most of them unlabeled, that Deezer receives from record companies every day.
Once finalized, the tool won’t automatically label tracks. Instead, it will flag them for review by one of the company’s executives.
The music streaming service Deezer is developing an AI tool to help the company identify explicit content in songs that come unlabeled from record companies.
‘When it comes to figuring out what explicit lyrics are, there is no general consensus,’ Deezer’s Manuel Moussallam wrote in a blog post outlining the new project. ‘It’s obviously a cultural issue, with lots of considerations about the intended audience and the listening context.’
‘As is the case with movies, the primary objective of tagging a piece as “explicit” is to provide guidance to determine how suitable it is for an intended audience.’
The current system relies on record companies to evaluate their own material before publishing it and decide whether it qualifies for a ‘Parental Advisory’ sticker based on guidelines from the Recording Industry Association of America (RIAA).
Because these are often judgement calls made by human executives, and depend heavily on the culture of the market the song or album is being released into, creating an AI to perform the same function has been a challenge.
‘None of the systems [we] considered reached levels of accuracy comparable to human ones,’ Moussallam told the BBC.
According to Deezer, it receives as many as 40,000 new tracks from record labels every day and the majority are unlabeled, making it easy for potentially explicit or offensive content to enter its system without any content warning.
Deezer’s AI tool relies on a previous program developed by the company, called Spleeter, which can automatically isolate the vocals from any song into a separate track.
The AI creates a written lyric sheet based on the isolated vocal track from Spleeter and cross-references it against a list of explicit words for a particular language.
Rather than simply flag tracks based on the presence of a certain word, the AI assigns an overall probability as to whether the words are meant to be explicit based on context.
Songs flagged as having a high probability of being explicit will then be forwarded to an executive for a final review and decision.
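The flag-for-review step described above can be illustrated with a short Python sketch. This is not Deezer's model: the word list, the `explicit_probability` and `needs_review` functions, and the `bias`, `weight` and `threshold` values are all hypothetical, and a simple word-density score stands in for the context-aware probability the company describes.

```python
import math
import re

# Hypothetical placeholder lexicon; a real list would be curated per language.
EXPLICIT_TERMS = {"en": {"damn", "hell"}}


def tokenize(lyrics):
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", lyrics.lower())


def explicit_probability(lyrics, language="en", bias=-3.0, weight=40.0):
    """Toy score: squash the share of listed words through a logistic curve.

    The bias and weight values are illustrative assumptions, not learned
    parameters, and the score ignores context entirely.
    """
    tokens = tokenize(lyrics)
    if not tokens:
        return 0.0
    terms = EXPLICIT_TERMS.get(language, set())
    hits = sum(1 for token in tokens if token in terms)
    density = hits / len(tokens)
    return 1.0 / (1.0 + math.exp(-(bias + weight * density)))


def needs_review(lyrics, language="en", threshold=0.5):
    """Flag a track for human review; the track is never labeled automatically."""
    return explicit_probability(lyrics, language) >= threshold
```

In this sketch a transcript with no listed words scores low and is passed over, while one dominated by listed words is sent to a human reviewer, who still makes the final call.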
Moussallam acknowledges there’s a potential for bias in the system, which he says his team is trying to avoid as much as possible.
As an example, he explains that because there may be a higher percentage of hip hop songs that contain explicit lyrics compared to country songs, an AI might end up training itself to simply identify hip hop songs over time, while missing explicit content in other genres and idioms.
To prevent this kind of bias, the team has exposed the AI to roughly the same percentage of explicit and non-explicit music across all genres.
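One common way to get that kind of balance is to downsample the training set so every genre contributes equal numbers of explicit and non-explicit tracks. The Python sketch below assumes that approach. The `balance_by_genre` function and the `genre` and `explicit` field names are hypothetical, and Deezer has not said how it balances its data.

```python
import random
from collections import defaultdict


def balance_by_genre(tracks, seed=0):
    """Return a subset with equal explicit and non-explicit counts per genre."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(lambda: {True: [], False: []})
    for track in tracks:
        groups[track["genre"]][bool(track["explicit"])].append(track)

    balanced = []
    for by_label in groups.values():
        # Keep as many tracks of each label as the rarer label provides.
        n = min(len(by_label[True]), len(by_label[False]))
        for label in (True, False):
            balanced.extend(rng.sample(by_label[label], n))
    return balanced
```

With the data balanced this way, genre alone tells a model nothing about whether a track is explicit, so it has to rely on the lyrics themselves.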
There’s still no final timeframe for when the tool will be released.