In a collaboration with api.video (what a coincidence and no, it's not related to api.audio), developer evangelist Erikka Innes created this amazing blog post. Power to her and a big thank you to all of the folks at api.video for the collaboration.
You've prepared the perfect ad copy for your business, but now you're trying to figure out how you can personalize the ad for each location. One great way to do that would be to use a voice over tool that allows you to customize the content based on the location of your customers. Fortunately, this is available! By combining api.video with api.audio's voice over, this problem is easy to solve.
Imagine, if you will, that you own a pizza chain
To make this fun, let's imagine you are the owner of a popular pizza chain - Renzo's pizza chain. You have fifty different outlets and you've prepared the perfect ad:
This ad is great for one of your locations, but you want people near all of your restaurants to know they can eat the best pizza for the least dough. For each location, you'd like to run this ad, but have it play information that tells the listener the city, address, day of the week and the time of the offer being described.
No problem! We can do this in the tutorial today. Let's get started.
Prerequisites
For this project, you're going to need:
- An api.video account - Sign up here!
- An api.audio account - Sign up here!
Installation
We will use the api.video Python client and Aflorithmic.ai's apiaudio library.
Installation for api.video:
pip install api.video
Installation for api.audio:
pip install apiaudio
ffmpeg installation
If you want to run the project as-is, you'll also need to install ffmpeg. These instructions cover installation on a Mac. First, make sure you have Homebrew (brew) installed. Then it's very easy, you just install with:
brew install ffmpeg
You need to install ffmpeg before you install the next two items.
pydub and pyaudio installation
pydub and pyaudio can also be difficult to install, depending on your setup and what you've already tried to install. If you made the mistake of trying to install these before installing ffmpeg, first run:
Then, reinstall this like so:
After these steps, you should be able to install the modules you'll need. Here are the commands:
pip install pydub
and
pip install pyaudio
Project overview
Here's what we're going to do with the example script today:
- Prompt to collect the api.video API key and the api.audio API key.
- Select our video from the Videos folder (for your own tweaks later you can drop other videos here to use, or use a different folder system to organize your content).
- Select the script we want from the Script folder and offer the user a preview of the script to make sure it's the right one.
- Select the localization .csv file we want to use with our script.
- Create a sample audio file with the voice over, using the first line from the .csv file
- Give the user the sample file so they can make sure it sounds like what they want.
- Next we'll combine the first audio file with the video.
- We'll display that file for review by the user to make sure it looks right.
- If the user approves the file, one by one we'll create an audio file, combine it with the video, tag each video with the information from the csv, and upload it for storage and hosting on api.video.
- We'll return the user a list of video titles and a link to a playable copy of the final file.
Code sample
Here is the code sample. It's a wizard that will walk you through all the steps. The complete project is available on GitHub here: https://github.com/apivideo/python-api-client/blob/master/examples/video_audio/README.md
Walk through the important stuff
A bunch of the demo is set up to walk you through the process of combining information. We don't need to go over each while loop, but let's go over some details regarding API behavior and the tools used to create the demo.
The script
You can place your script into your code directly by using quotes. It could look like this (I'm saying could because there are a few different ways to set up your api.audio script):
You can see that the script is split into three sections. This demo doesn't really make use of the features available per section so I'm not going to go into detail for this part. Personalization is achieved by using {{ }} around the name of a column from your csv spreadsheet. They can be used in whatever order you like.
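If you'd like to preview how the {{ }} placeholders get filled before sending anything to api.audio, a small local substitution does the job. This is only a sketch: the ad text, the render_preview helper, and the sample row are made up for illustration, and api.audio does its own substitution on its side.

```python
import re

# Hypothetical ad text; the {{ }} names match the csv column headers.
SCRIPT_TEXT = (
    "Come to Renzo's in {{city}} at {{address}}. "
    "Every {{day_of_week}} at {{offer_time}}, "
    "get the best pizza for the least dough."
)

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_preview(script_text, row):
    """Fill each {{column}} with the matching value from one csv row."""
    def fill(match):
        name = match.group(1)
        # Leave unknown placeholders visible so typos are easy to spot.
        return str(row.get(name, match.group(0)))
    return PLACEHOLDER.sub(fill, script_text)


# Hypothetical row, shaped like one line of the localization csv.
row = {
    "city": "Austin",
    "address": "12 Main St",
    "day_of_week": "Tuesday",
    "offer_time": "5 pm",
}
print(render_preview(SCRIPT_TEXT, row))
```

Because unknown names are left untouched, a misspelled column header shows up in the preview instead of silently disappearing.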
You can also choose to read your script from a .txt file. If you choose to do this, make sure to take the quotes off the beginning and end or it will screw up the parser. A common error says there's a problem with creating a final file or working with a URL. It usually means something broke before that point, so keep that in mind when debugging.
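One way to guard against stray quotes is to strip them when you load the file. Here's a minimal sketch, assuming the script sits in a UTF-8 text file; the helper names are hypothetical and not part of the demo.

```python
from pathlib import Path


def strip_wrapping_quotes(text):
    """Drop one pair of matching quotes wrapped around the whole text."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return text[1:-1].strip()
    return text


def load_script(path):
    """Read a script file and return its text without wrapping quotes."""
    return strip_wrapping_quotes(Path(path).read_text(encoding="utf-8"))
```

Only a matching pair at the very start and end is removed, so quotes inside the ad copy are left alone.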
In your code, when you set up your script, you'll use a command that looks something like this:
In this snippet, you've already authenticated with your API key, so now you can work with the API's endpoints.
The only required field is scriptText, which will contain the text of your script. However, it's useful to provide the other names for reference. Be aware that you cannot delete projects or modules. After you set up your script this way, you'll be able to reference it in your code easily via scriptId. Like so:
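A rough sketch of that step is below. Only scriptText and scriptId come from the description above. Everything else is an assumption: that the SDK exposes Script.create, that it takes keyword fields, and that it returns a mapping containing scriptId. Check the api.audio reference before relying on any of it. The create_script helper itself is hypothetical.

```python
def create_script(sdk, script_text, **names):
    """Create a script and return its id.

    `sdk` is the imported apiaudio module, already given an API key.
    `names` holds the optional reference fields, such as a project name.
    """
    # ASSUMPTION: Script.create accepts keyword fields and returns a
    # mapping that contains "scriptId". Not confirmed by this post.
    result = sdk.Script.create(scriptText=script_text, **names)
    return result["scriptId"]
```

You would call it as create_script(apiaudio, text) and keep the returned id for the text-to-speech and mastering steps. Since projects and modules can't be deleted, it's worth settling on names before you run this for real.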
The audio file
For starters, when you create an audio file with api.audio, it names your file based on your project name and personalization parameters. In this demo, if you named your project "ad" then every audio file would begin with the word "ad." Next, it puts the headers for the csv file in alphabetical order, and adds the appropriate parameter next to each header. For example, our .csv file has the columns (in this order):
- city
- address
- day_of_week
- offer_time
After each column title comes the value for that column that was used in the audio file. To separate each column, two underscores are used. Here's a sample:
So something you'll possibly want to do is rename the files when they arrive. An easy way to handle this is with the built-in os module.
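Here's a minimal sketch of the renaming step with os. The renzos_ prefix and both helper names are made up for illustration, and the sketch doesn't parse the generated file name at all. It just builds a shorter one from the csv row.

```python
import os


def tidy_name(row, extension=".mp3"):
    """Build a short, predictable file name from one csv row."""
    city = row["city"].strip().lower().replace(" ", "_")
    return f"renzos_{city}{extension}"


def rename_download(downloaded_path, row):
    """Rename a downloaded audio file in place and return the new path."""
    folder = os.path.dirname(downloaded_path)
    new_path = os.path.join(folder, tidy_name(row))
    os.rename(downloaded_path, new_path)
    return new_path
```

If two outlets share a city, add the address or a counter to the name so one file doesn't overwrite the other.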
Creating an audio file takes two steps: text-to-speech and mastering the audio. For text-to-speech you can pick a voice, set how fast it will speak, and then give it a dictionary containing the personalization terms you want to insert into your script.
You can also choose a template, which will play some background music under your spoken audio. You can see in the demo I used "copacabana." To list voices or background music options, go to the api.audio API reference docs and use the endpoint for listing voices or the endpoint for listing music.
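The two steps can be sketched roughly as follows. Treat the keyword names (scriptId, voice, speed, audience, soundTemplate) and the Speech.create / Mastering.create calls as assumptions, not the confirmed SDK signature. The make_audio helper is hypothetical, and the SDK is passed in as an argument so the flow is easy to try with a stand-in.

```python
def make_audio(sdk, script_id, row, voice, speed, template="copacabana"):
    """Run text-to-speech, then mastering, for one csv row.

    `sdk` is the imported apiaudio module; `row` is one dictionary of
    personalization terms (csv header -> value).
    """
    # ASSUMPTION: these keyword names are illustrative guesses at the
    # SDK's parameters; confirm them in the api.audio API reference.
    sdk.Speech.create(
        scriptId=script_id, voice=voice, speed=speed, audience=[row]
    )
    # Mastering mixes the spoken audio with the background template.
    return sdk.Mastering.create(
        scriptId=script_id, soundTemplate=template, audience=[row]
    )
```

Text-to-speech has to run first, because mastering works on the speech that was just generated for the same script and row.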
The .csv
This demo reads information from a .csv file. If you want to read from something else, you can, as long as what you hand to the rest of the program is a list of dictionaries. .csv is pretty simple to use, so I went with that option.
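Getting a list of dictionaries out of a csv takes only a few lines with the built-in csv module. In this sketch the rows_from_csv helper and the sample data are made up; only the four column names come from the demo.

```python
import csv
import io


def rows_from_csv(handle):
    """Return the csv contents as a list of dictionaries keyed by header."""
    return [dict(row) for row in csv.DictReader(handle)]


# Hypothetical sample data using the four columns from this demo.
sample = io.StringIO(
    "city,address,day_of_week,offer_time\n"
    "Austin,12 Main St,Tuesday,5 pm\n"
    "Denver,400 Pine Ave,Friday,6 pm\n"
)
rows = rows_from_csv(sample)
print(rows[0]["city"])  # Austin
```

For a real file, pass in open("your_file.csv", newline="", encoding="utf-8") instead of the StringIO object.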
Playing a track from your application
You can play a track to check out the audio by using a variety of tools. For this one, I used pydub and pyaudio. These are fairly popular modules. In order to use them, however, audio must be converted to .wav. You will see in the code that two imports are made from pydub:
These allow us to play straight from the terminal or wherever we may be. The code to convert to .wav and play is very straightforward:
There are other choices available for converting, but api.audio returns .mp3 files, so we use the from_mp3 choice.
After we have the track, playing it is as simple as this:
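Put together, the imports, the conversion, and the playback might look like the sketch below. The two helper names are hypothetical. AudioSegment.from_mp3, export, and play are the pydub calls described above, and they need ffmpeg (and pyaudio or another playback backend) installed to work.

```python
from pathlib import Path


def wav_path_for(mp3_path):
    """Return the .wav path that sits next to the given .mp3 file."""
    return str(Path(mp3_path).with_suffix(".wav"))


def convert_and_play(mp3_path):
    """Convert an .mp3 to .wav with pydub, then play the result."""
    # Imported here so wav_path_for works even without pydub installed.
    from pydub import AudioSegment
    from pydub.playback import play

    sound = AudioSegment.from_mp3(mp3_path)
    wav_path = wav_path_for(mp3_path)
    sound.export(wav_path, format="wav")
    play(AudioSegment.from_wav(wav_path))
    return wav_path
```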
Merging audio and video
Prior to uploading your video to api.video, you will want to add the sound and video together. This can be accomplished with ffmpeg, which you need to install to use pydub and pyaudio anyway. The code for merging audio and video is:
This will produce an mp4. You can then upload it to api.video.
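One way to do the merge is to call the ffmpeg command line from Python. This is a stand-in sketch using subprocess, not necessarily how the demo does it, and the function names are hypothetical. It assumes ffmpeg is on your PATH.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build the ffmpeg arguments that put one audio track on one video."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", video_path,    # input 0: the video
        "-i", audio_path,    # input 1: the voice over
        "-map", "0:v:0",     # take the video stream from input 0
        "-map", "1:a:0",     # take the audio stream from input 1
        "-c:v", "copy",      # keep the video as-is, no re-encode
        "-c:a", "aac",       # encode the audio as AAC for the mp4
        "-shortest",         # stop when the shorter stream ends
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run ffmpeg and return the path of the merged mp4."""
    command = build_merge_command(video_path, audio_path, output_path)
    subprocess.run(command, check=True)
    return output_path
```

The -shortest flag trims the result to whichever of the two inputs is shorter, so drop it if you'd rather keep the full video when the voice over ends early.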
Upload to api.video
For details about uploading a video with api.video, you can check out the tutorial about it here: Upload a Video with the api.video Python Client
Something to note: to upload a file, it must be in the same folder as your application or it will not upload.
Once it's uploaded, you can retrieve the .mp4 from the response and play it right away in your browser using the built-in webbrowser module.
To retrieve the mp4 from the response, you do this:
And then you can open the link like this:
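A minimal sketch of both steps follows. It assumes the upload response can be read like a dictionary with the mp4 link under assets, which you should confirm against the api.video response you actually get. The helper names are hypothetical.

```python
import webbrowser


def mp4_link(upload_response):
    """Pull the playable .mp4 link out of an upload response."""
    # ASSUMPTION: the response exposes response["assets"]["mp4"].
    return upload_response["assets"]["mp4"]


def preview(upload_response):
    """Open the uploaded video in the default browser."""
    url = mp4_link(upload_response)
    webbrowser.open(url)
    return url
```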
This will let you make sure everything combined properly so that the audio matches with the video the way you want.
Create all the localized videos
After all the steps to make sure you're creating the right type of video, you can use the recipe from api.audio with a couple of tweaks to create all your new videos with personalized ads by location, then upload them for hosting to api.video.
This demo deletes every video you upload right after the upload happens so you don't end up sitting with fifty videos in a folder.
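The overall loop can be sketched like this. The three steps are passed in as functions, so the names make_audio, merge, and upload are placeholders for whatever you wire up, not calls from either API.

```python
import os


def localize_all(rows, make_audio, merge, upload):
    """Create, upload, and clean up one localized video per csv row.

    make_audio(row) -> path to the audio file for that row
    merge(audio_path, row) -> path to the combined .mp4
    upload(video_path, row) -> (title, link) for the hosted video
    """
    results = []
    for row in rows:
        audio_path = make_audio(row)
        video_path = merge(audio_path, row)
        results.append(upload(video_path, row))
        # Remove the local video once it has been uploaded.
        if os.path.exists(video_path):
            os.remove(video_path)
    return results
```

The returned list of titles and links is what gets shown to the user at the end of the wizard.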
Thanks for reading! Happy coding. :)
TLDNR? Watch the video tutorial:
About:
Aflorithmic is a London/Barcelona-based technology company. Its api.audio platform enables fully automated, scalable audio production by using synthetic media, voice cloning, and audio mastering, to then deliver it on any device, such as websites, mobile apps, or smart speakers.
With this Audio-As-A-Service, anybody can create beautiful sounding audio, starting from a simple text to including music and complex audio engineering without any previous experience required.
The team consists of highly skilled specialists in machine learning, software development, voice synthesizing, AI research, audio engineering, and product development.
API.video is just what it says, a video API built by developers, for developers.
Their mission is to connect people through their cameras and deliver valuable insights from the video-first world.
Video distribution inside traditional, online and mobile apps remains a challenge due to the complexity of managing heavy files and making them available on any screen, worldwide, in seconds. That's why they've built api.video, the new standard for managing online video streaming.