Bringing AI To Video

By Richard Griffiths

January 24, 2019


AI has been talked about in connection with video for a long time, but it’s only now that we are seeing some really intelligent video applications. There’s a lot of exaggeration under the general banner of AI. For instance, I’ve seen voice control called AI when it’s really just a speech-to-text converter. There’s nothing intelligent about “Set an alarm for 6 am”. If it were intelligent, it would ask me around 10 pm whether I want the alarm set and remind me that at my age I should go to bed and aim for eight hours of quality sleep.

What AI Isn’t

In other areas, what is called AI may simply be massive amounts of cloud computing applied to crunch data in an instant, rather than anything that’s actually intelligent. Making a production workflow more efficient isn’t AI. For me, AI has to learn and improve its output over time. Others may disagree and apply the term “AI” to applications that take your raw photos and video and produce a professional-quality video complete with mood music. I think this is neat, but I don’t think it’s AI, because the application will not produce a better video until somebody feeds it better instructions.

Other software scans your video’s dialogue so you can instantly locate specific scenes based on the words spoken. This could be AI if the software learns and improves its transcription of the dialogue over time. Then there is the well-known short film Sunspring, which was written by a computer in 2016 after being fed a variety of disparate sci-fi movie scripts. The result is on YouTube and it’s barely watchable.

Ups and Downs with AI

I usually think of the elevators in my apartment block when applying AI to a real-world situation. There are four of them and they all operate independently. This is unusual, because normal elevator systems are connected to a single controller that sends the nearest one to you. In my apartment block, however, if you want the nearest one then you have to press all four call buttons. So I bring my own intelligence to it and catch the nearest one just by pressing the button of the elevator that’s nearest and travelling in the correct direction.

If I had worked out the algorithm to summon the nearest lift, as most lift controllers do, would this be an example of AI? No. It’s simply an algorithm that applies a series of rules to determine the correct outcome. For a lift controller to be AI, it would need to learn how the occupants use the lifts over time and design its own algorithms or rules to improve its service by being at the right floor at the right time without me pressing a button. For example, it would work out that Floor 15 calls the lift at 7.10 am every morning and the lift would be there waiting for me. It would further work out that this only happens on weekdays, and so it wouldn’t apply the rule on Saturdays and Sundays.
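To make the Floor 15 example concrete, here is a minimal Python sketch of how a lift controller might learn call patterns rather than follow a fixed rule. Everything in it is my own assumption for illustration: the class name LiftCallLearner, the ten-minute time slots, the weekday/weekend split and the three-observation threshold are hypothetical and not taken from any real lift controller.

```python
from collections import Counter
from datetime import datetime


class LiftCallLearner:
    """Hypothetical sketch: learns when each floor tends to call the lift."""

    def __init__(self, min_observations=3):
        # Counts of calls keyed by (is_weekday, hour, ten_minute_slot, floor).
        self.calls = Counter()
        # Assumed threshold: ignore patterns seen fewer times than this.
        self.min_observations = min_observations

    @staticmethod
    def _slot(when):
        # Group times into ten-minute slots and keep weekdays apart from weekends.
        return (when.weekday() < 5, when.hour, when.minute // 10)

    def record_call(self, floor, when):
        """Store one observed call from a floor at a given time."""
        self.calls[self._slot(when) + (floor,)] += 1

    def floor_to_wait_at(self, when):
        """Return the floor to pre-position at, or None if no pattern is known."""
        slot = self._slot(when)
        candidates = {
            key[3]: count
            for key, count in self.calls.items()
            if key[:3] == slot and count >= self.min_observations
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


# Floor 15 calls the lift at 7.10 am on five consecutive weekdays.
learner = LiftCallLearner()
for day in range(21, 26):  # Monday 21 to Friday 25 January 2019
    learner.record_call(15, datetime(2019, 1, day, 7, 10))

monday_choice = learner.floor_to_wait_at(datetime(2019, 1, 28, 7, 12))
saturday_choice = learner.floor_to_wait_at(datetime(2019, 1, 26, 7, 12))
```

In this sketch the lift would wait at Floor 15 on the following Monday morning but would have no learned destination on the Saturday, because weekend calls are counted separately. Nobody wrote a "Floor 15 at 7.10 am" rule; it falls out of the recorded calls.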

Making Video Recommendations Intelligent

So, considering this, is what we used to call “smart search and recommendations” a good example of AI in video? Historically, no. Recommendations have traditionally been based on quite simple rules, written by humans, that link movies and shows in the video library through metadata such as genre, cast or crew. It’s crude. For instance, just because you watched Die Hard doesn’t mean you’ll like Bonfire of the Vanities, despite both starring Bruce Willis and coming from his golden era. And just because you found Kingsman: The Secret Service to be a surprising delight doesn’t mean that you’ll consider its sequel to be two hours of your life well spent, despite the same stars and director. Recommendations that are simple matches of cast and crew metadata aren’t intelligent and are more likely to annoy a viewer than delight them, particularly if you’ve paid good money for a promising new TVOD title recommended by your IPTV storefront only to discover that there’s a good reason it went straight to Netflix (for anyone over 40, this is what we used to call “straight to video”).

AI is based on analysis, which requires data, so our search for AI in video must begin with data sources that can be interpreted and combined to produce useful results. If you start with only a few data sources, such as cast and crew, then your recommendations will be crude rather than intelligent. We need to bring in more data sources to make our recommendations more intelligent. Going back to the elevator, its only data sources are the call buttons and the floor buttons in the elevators themselves. We can add timing information so it can work out that it should be ready for me at 7.10 am, but that’s about it from such limited data. What if we added a sensor in the corridor that detected footsteps? Or one in the electric lock of my front door? This would alert the lift that it’s statistically likely somebody will call it within 30 seconds, and it could anticipate the call. Over time it would work out exactly how likely it is that the footsteps are heading to the elevator button and how likely it is that they are heading to the bins, and act accordingly. That would be great.
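The footsteps idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name FootstepPredictor, the starting counts of one call and one miss, and the 0.6 threshold are all hypothetical. It simply keeps a running estimate of how often footsteps are followed by a lift call and sends the lift when that estimate is high enough.

```python
class FootstepPredictor:
    """Hypothetical sketch: estimates how often footsteps lead to a lift call."""

    def __init__(self, threshold=0.6, prior_calls=1, prior_misses=1):
        # Start from one imagined call and one imagined miss so the first
        # real observation does not swing the estimate to 0% or 100%.
        self.calls = prior_calls
        self.misses = prior_misses
        # Assumed cut-off above which the lift is sent in advance.
        self.threshold = threshold

    def observe(self, lift_called):
        """Record whether detected footsteps were followed by a lift call."""
        if lift_called:
            self.calls += 1
        else:
            self.misses += 1

    def probability_of_call(self):
        """Share of footstep events so far that ended in a lift call."""
        return self.calls / (self.calls + self.misses)

    def should_send_lift(self):
        return self.probability_of_call() >= self.threshold


# Eight sets of footsteps headed for the lift, two headed for the bins.
predictor = FootstepPredictor()
for went_to_lift in [True] * 8 + [False] * 2:
    predictor.observe(went_to_lift)

estimate = predictor.probability_of_call()  # (1 + 8) / (2 + 10) = 0.75
send_now = predictor.should_send_lift()
```

A real system would probably keep separate estimates per sensor and per time of day, but the principle is the same: the more footsteps it observes, the better it knows whether to bother moving the lift.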

So, to improve our video recommendations we need to add more data sources.

First, the analysis should not just consider the titles you’ve watched or recorded, but also the ones you’ve searched for or whose synopsis you’ve spent time reading. Second, we should only recommend titles within similar genres. For instance, if you’ve just had an all-night Schwarzeneggerthon watching Terminator 2, Total Recall, and Predator back-to-back then, apart from having had a great evening, you would want your IPTV platform to recommend True Lies and not Jingle All the Way. And third, we want to build trust and only recommend movies that are considered to be “good”, so let’s bring in some ratings data from IMDb or Rotten Tomatoes, or from your own database built up by getting people to rate movies as soon as they’ve watched them. You still have the promotional capabilities of your video platform to push the straight-to-video stuff that the studio obliged you to license in order to get the good stuff that you really want.
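Those three data sources can be combined in a short Python sketch. Treat it as an illustration only: the function name recommend, the engagement weights, the genre labels and the ratings are placeholders I have made up for the example, not real IMDb or Rotten Tomatoes data and not any platform's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    name: str
    genres: frozenset
    rating: float  # placeholder score out of 10, not a real rating


# Assumed weights: watching says more about taste than reading a synopsis.
ENGAGEMENT_WEIGHTS = {
    "watched": 3.0,
    "recorded": 2.0,
    "searched": 1.0,
    "read_synopsis": 0.5,
}


def recommend(catalogue, events, min_rating=7.0, limit=3):
    """Rank unwatched titles by genre interest, keeping only well-rated ones."""
    by_name = {title.name: title for title in catalogue}
    genre_interest = {}
    watched = set()
    # First data source: every kind of engagement, not just viewing.
    for name, kind in events:
        title = by_name.get(name)
        if title is None:
            continue
        if kind == "watched":
            watched.add(name)
        for genre in title.genres:
            weight = ENGAGEMENT_WEIGHTS.get(kind, 0.0)
            genre_interest[genre] = genre_interest.get(genre, 0.0) + weight
    scored = []
    for title in catalogue:
        # Third data source: drop anything below the ratings threshold.
        if title.name in watched or title.rating < min_rating:
            continue
        # Second data source: only genres the viewer has shown interest in.
        score = sum(genre_interest.get(genre, 0.0) for genre in title.genres)
        if score > 0:
            scored.append((score, title.rating, title.name))
    scored.sort(reverse=True)
    return [name for _, _, name in scored[:limit]]


catalogue = [
    Title("Terminator 2", frozenset({"action", "sci-fi"}), 8.5),
    Title("Total Recall", frozenset({"action", "sci-fi"}), 7.5),
    Title("Predator", frozenset({"action", "sci-fi"}), 7.8),
    Title("True Lies", frozenset({"action"}), 7.5),
    Title("Jingle All the Way", frozenset({"comedy", "family"}), 5.0),
]
events = [
    ("Terminator 2", "watched"),
    ("Total Recall", "watched"),
    ("Predator", "watched"),
]
picks = recommend(catalogue, events)
```

With these placeholder values the Schwarzeneggerthon produces True Lies and nothing else: Jingle All the Way shares the star but neither the genre nor the rating. The rules here are still written by a human, so this is the data plumbing rather than the intelligence.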
Finally, we need the AI engine itself to evaluate how often recommendations based on a particular set of algorithms result in a successful sale or view. Then it needs to create its own new rules that prioritize the successful algorithms and discard the poor ones. Easy.
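One simple way to sketch that feedback loop in Python is an epsilon-greedy selector, a standard technique that I am swapping in for illustration rather than anything this article prescribes. It tracks how often each recommendation strategy leads to a view, mostly uses the best one, occasionally tries the others, and drops the persistent failures. The class name StrategySelector, the strategy names and all thresholds are hypothetical. Note that it only prioritizes and discards existing strategies; inventing genuinely new rules is the harder part and is not shown.

```python
import random


class StrategySelector:
    """Hypothetical sketch: favours recommendation strategies that get views."""

    def __init__(self, strategies, epsilon=0.1, seed=None):
        self.shown = {name: 0 for name in strategies}
        self.successes = {name: 0 for name in strategies}
        # Assumed share of the time spent trying a random strategy.
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def success_rate(self, strategy):
        shown = self.shown[strategy]
        return self.successes[strategy] / shown if shown else 0.0

    def choose(self):
        """Usually pick the best strategy so far, sometimes explore another."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shown))
        return max(self.shown, key=self.success_rate)

    def record(self, strategy, led_to_view):
        """Log one recommendation and whether it led to a sale or view."""
        self.shown[strategy] += 1
        if led_to_view:
            self.successes[strategy] += 1

    def prune(self, min_shown=50, min_rate=0.02):
        """Discard well-tested strategies that rarely succeed, keeping one."""
        for strategy in list(self.shown):
            tested = self.shown[strategy] >= min_shown
            poor = self.success_rate(strategy) < min_rate
            if tested and poor and len(self.shown) > 1:
                del self.shown[strategy]
                del self.successes[strategy]


# epsilon=0.0 turns exploration off so this small example is deterministic.
selector = StrategySelector(["cast_match", "genre_and_rating"], epsilon=0.0)
for i in range(100):
    selector.record("cast_match", led_to_view=(i == 0))            # 1 in 100
    selector.record("genre_and_rating", led_to_view=(i % 5 == 0))  # 20 in 100

best = selector.choose()
selector.prune()
remaining = sorted(selector.shown)
```

In this made-up run the cast-matching strategy converts 1% of the time and is discarded, while the genre-and-rating strategy converts 20% of the time and becomes the default.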
So, now that we’ve understood the basics of AI in video recommendations, we can start to look at adding more data sources (subject to explicit customer approval and GDPR compliance) to build a better profile of the viewer or household. Then we can look at using that profile for more video applications such as targeted advertising and video marketing. Online video consistently scores highly for customer engagement compared with other media. As people are reading less, there are surveys reporting that 30% of time online is already spent watching video. However, video ads are considered highly irritating when they are not relevant. Extending the AI we use for recommendations to video advertising would help reduce that annoyance and may even result in a positive reaction from the viewer.
We now have the AI to do the analysis and interpretation. We have enough data sources to make it smart. And we have the platforms to deliver the video. Let’s go.

Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.

Published by

Richard Griffiths

VP, MSSD Consulting Office, Huawei. Richard's extensive ICT expertise covers the TV, video, and carrier network sectors, including 5G.
