Machine Learning as a Video Filter

Machine learning has possibilities in almost any kind of software going forward. You could use it to learn your preferences in apps, to train an AI to understand your voice, and there are plenty of clever things you can do with imagery.

I started looking into using machine learning as a video filtering tool.

There is a good rundown and sample code here: which is what I used for this session. The page says it requires Linux, but you can use it on either Windows or macOS. You need a CUDA-capable GPU, which most recent NVIDIA video cards are. If you’ve been experimenting with VR, you likely have a compatible card.
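Before committing to a long training run, it’s worth confirming the GPU is actually visible. This is a quick check assuming a PyTorch-based pix2pix build (the sample code may be the Torch or TensorFlow version instead, which have their own equivalents); the idea is the same either way.

```python
# Sanity check that CUDA is visible before kicking off training.
# Assumes a PyTorch-based pix2pix; the TensorFlow port has an
# equivalent via tf.config.list_physical_devices("GPU").
import torch

if torch.cuda.is_available():
    print(f"CUDA OK: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA device found - training will fall back to CPU and be very slow.")
```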

The basic premise of my use of pix2pix was to give the algorithm pairs of video frames to compare, have it build a “model” (which is like a list of rules), and then give it fresh images to run those rules on. That is: compare frame 1 of a video to what has changed in frame 2 of that same video, do the same for the rest of the frames (2>3, 3>4, and so on), and once they’ve all been looked at, apply what was learned from those changes to your new image.
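As a rough sketch of that frame-pairing step, here is how consecutive frames can be stitched side by side into the combined “A|B” images that pix2pix trains on. The directory names and the 256-pixel size are my own choices for illustration, not something from the sample code.

```python
# Build training pairs: frame N on the left (input), frame N+1 on the
# right (target), pasted into one image in pix2pix's paired format.
import os
from PIL import Image

FRAME_DIR = "frames"      # extracted video frames, e.g. frame_00001.png ...
PAIR_DIR = "pairs/train"  # where the combined A|B images go
SIZE = 256                # pix2pix's usual working resolution

os.makedirs(PAIR_DIR, exist_ok=True)
frames = sorted(f for f in os.listdir(FRAME_DIR) if f.endswith(".png"))

for i in range(len(frames) - 1):
    a = Image.open(os.path.join(FRAME_DIR, frames[i])).resize((SIZE, SIZE))
    b = Image.open(os.path.join(FRAME_DIR, frames[i + 1])).resize((SIZE, SIZE))
    combined = Image.new("RGB", (SIZE * 2, SIZE))
    combined.paste(a, (0, 0))      # input: frame N
    combined.paste(b, (SIZE, 0))   # target: frame N+1
    combined.save(os.path.join(PAIR_DIR, f"pair_{i:05d}.png"))
```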

All that appears to have happened across all of my attempts is that the new video has picked up some of the stripe artifacts from my import of this analog video, plus some glitching appropriate for a black metal music video. Subtle, and probably not too useful so far.

I need to look into this more. The sample code I was using has a lot of extra pieces attached to save out a web gallery of whatever you converted, which I don’t need, and it also does some image conversion to coerce arbitrary images into a format that shows better results. Since I know how to get my images into the format and size I want, and don’t need a website of my results, I’m working on stripping all of that out.
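For the frame handling I am keeping, something as plain as ffmpeg on either side of the model does the job: pull frames out of the source clip, run the model over them, then stitch the outputs back into video. The filenames, frame rate, and codec settings below are placeholders, not what the sample code ships with.

```python
# Bookend the model with ffmpeg: video -> frames, then processed frames -> video.
import os
import subprocess

def extract_frames(video_path, out_dir, fps=24):
    """Dump the source video to numbered PNG frames at the given rate."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run([
        "ffmpeg", "-i", video_path,
        "-vf", f"fps={fps}",
        f"{out_dir}/frame_%05d.png",
    ], check=True)

def assemble_video(frame_dir, out_path, fps=24):
    """Stitch numbered PNG frames back into an H.264 clip."""
    subprocess.run([
        "ffmpeg", "-framerate", str(fps),
        "-i", f"{frame_dir}/frame_%05d.png",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        out_path,
    ], check=True)

extract_frames("source.mov", "frames")
# ... run the pix2pix model over frames/ and write results to outputs/ ...
assemble_video("outputs", "filtered.mp4")
```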