Using freely available software and AI models, I have restored the audio from some calls and clipped it for use as soundbites.
The software in question is Ultimate Vocal Remover, available from its GitHub page: https://github.com/Anjok07/ultimatevocalremovergui – I highly recommend anyone interested in audio restoration check this out.
I then use Audacity to mark up the audio with labels, one per phrase or word, depending on the situation. Exporting by label writes each snippet to its own file in a folder, where I manipulate the audio further to fit the needs of the situation.
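For anyone who wants to script the cutting step instead of exporting from inside Audacity, here is a minimal sketch in Python. It is not what I actually run; it assumes you have saved the label track as a text file (tab-separated start time, end time, and label name, in seconds) and that the call is an uncompressed PCM WAV. The function names `parse_labels` and `export_snippets` are made up for illustration.

```python
import wave
from pathlib import Path


def parse_labels(text):
    """Parse Audacity label-track text into (start_sec, end_sec, name) tuples."""
    labels = []
    for line in text.splitlines():
        # Skip blank lines and the backslash-prefixed rows that Audacity
        # writes for labels carrying a frequency range.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        name = parts[2].strip() if len(parts) > 2 else ""
        labels.append((float(parts[0]), float(parts[1]), name))
    return labels


def export_snippets(wav_path, labels, out_dir):
    """Write one WAV file per label and return the list of paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(wav_path), "rb") as src:
        rate = src.getframerate()
        params = src.getparams()
        total = src.getnframes()
        for index, (start, end, name) in enumerate(labels, 1):
            # Convert label times to frame positions, clamped to the file.
            first = min(max(0, round(start * rate)), total)
            last = min(max(first, round(end * rate)), total)
            src.setpos(first)
            frames = src.readframes(last - first)
            # Keep file names safe: anything unusual becomes an underscore.
            safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in name)
            target = out_dir / f"{index:02d}_{safe or 'clip'}.wav"
            with wave.open(str(target), "wb") as dst:
                dst.setparams(params)
                dst.writeframes(frames)  # header length is fixed up on close
            written.append(target)
    return written
```

With a hypothetical `labels.txt` and `call.wav`, you would call `export_snippets("call.wav", parse_labels(Path("labels.txt").read_text()), "clips")` and get one numbered file per label in `clips`.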
In this case, N00N00 asked me to grab a soundbite from a single Carlito call, so I went about my normal business of grabbing the soundbite, but then decided to try to lift the voices out from the noise of the crummy telephone line. Using the above software, and some very specific settings, I was able to isolate the voices in their entirety, and then remove the excess echo that Carlito accidentally introduced at the start and end of the call.
In the end, the fidelity of the voices was increased, I had the soundbite N00N00 requested, and I now had the clearest version of the call available.
This is the methodology that has been adopted for the project thus far.
It works. I don’t think we’re about to trudge through every single episode, though; we’ll process one only when there’s a specific request for a call from it. There’s just too much data to process.
There are limits to how much processing I can do on my ancient hardware (a GTX 1080 Ti), and the software hasn’t been tested on cloud GPU hardware yet, except in Google Colab projects. Cloud hardware is the only avenue for me, and for this project, to move forward.
As time progresses, things get easier. If people request a call, and know what episode it’s from, we can do the work of finding, isolating, and then remastering the call from what these AI tools provide as outputs.
I’m providing such outputs from the call I mention above.
Just Carlito doing his thing.