We are delighted to announce version 0.1.0 of API.audio.
API.audio 0.1.0 is the first general-availability release of our audio-as-a-service API, and our first release of self-service, developer-first tooling. It is the easiest way to add audio to your apps.
The mission of API.audio is to make audio simple. We do that by building tools that any platform developer — from indie hackers to some of the world’s largest companies — needs to build an elegant audio creation product. By integrating audio into their platforms, our customers enhance their products by saving money on content creation or adding personalisation.
So, what does API.audio do?
The core of the API.audio platform is our robust API. Our API abstracts away the complexities of creating audio: voice, mixing, versioning. This enables companies to easily build fully integrated audio products to offer to their customers.
The fastest way to experiment with audio and add it to your application
We built the fastest audio integration available. API.audio connects to Azure, AWS and Google through a single point of contact, and we also offer our own custom-built, in-house voices.
But we cover more than text-to-speech. In less than 2 minutes you can create your own customised audio track:
- You can change the text and the sound design
- You can iterate through 400+ voices and 50+ sound designs and effects
- You can create beautiful audio from text without having to speak it yourself or hire someone else to do it
Add personalisation to your audio
The flexibility of creating audio programmatically with API.audio opens up an entirely new universe of possibilities. For example, you can dynamically change or personalise an audio track by adding the user's name and other attributes.
Here is an example with 3 parameters - name, percent, and phone number. Our personalisation engine handles over 10k personalisation parameters (over 1k names) and more are being added daily.
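As a rough illustration of the idea, here is a minimal sketch of filling the three parameters (name, percent, phone number) into a script before it is turned into speech. It uses only the Python standard library and does not call the API.audio SDK. The double-brace placeholder syntax, the `personalise` helper and the sample values are assumptions made for this example.

```python
import re

# Hypothetical placeholder syntax: {{parameter}}. This is an assumption for
# illustration and may differ from the syntax API.audio actually uses.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def personalise(script_text, params):
    """Replace every {{parameter}} in the script with its value."""

    def substitute(match):
        key = match.group(1)
        if key not in params:
            # Fail loudly rather than render audio with a gap in it.
            raise KeyError(f"missing personalisation parameter: {key}")
        return str(params[key])

    return PLACEHOLDER.sub(substitute, script_text)


script = (
    "Hi {{name}}, you have completed {{percent}} percent of your plan. "
    "We will call you on {{phone}}."
)

# Made-up sample listeners, one personalised script per person.
audience = [
    {"name": "Ada", "percent": "80", "phone": "555 0100"},
    {"name": "Sam", "percent": "45", "phone": "555 0199"},
]

rendered = [personalise(script, person) for person in audience]
for line in rendered:
    print(line)
```

One script with placeholders yields one text per listener, and each text can then be sent on for speech and mastering.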
Add sound design to your audio experience
The power of API.audio starts where text-to-speech solutions stop. Working closely with developers, we built first-of-its-kind sound design, mixing and mastering functionality. You can select background music for your content from a growing list of professional sound designs ranging across several genres. Once you have selected one, our sound design engine handles the mixing and mastering of the speech and music, applying EQ, filters, fades, and effects to enhance the sound of the overall track. Sound engineering is usually a long and arduous manual process, but API.audio produces professional-sounding audio in a matter of seconds.
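To give a feel for what mixing speech with background music involves, here is a deliberately simplified sketch in plain Python. It is not how the API.audio engine is implemented. It only shows two of the steps named above, a fade on the music and a gain-reduced mix, on short lists of samples in the range -1.0 to 1.0. The function names and sample values are made up for the example, and real mastering (EQ, filters, effects) is far more involved.

```python
def fade_envelope(length, fade):
    """Linear fade-in over the first `fade` samples and fade-out over the last."""
    envelope = []
    for i in range(length):
        gain_in = min(1.0, i / fade) if fade else 1.0
        gain_out = min(1.0, (length - 1 - i) / fade) if fade else 1.0
        envelope.append(min(gain_in, gain_out))
    return envelope


def mix(speech, music, music_gain=0.3, fade=4):
    """Lay speech over quieter, faded background music and clip to [-1, 1]."""
    length = max(len(speech), len(music))
    envelope = fade_envelope(length, fade)
    mixed = []
    for i in range(length):
        s = speech[i] if i < len(speech) else 0.0  # pad the shorter track
        m = music[i] if i < len(music) else 0.0
        sample = s + music_gain * envelope[i] * m
        mixed.append(max(-1.0, min(1.0, sample)))  # hard clip to avoid overflow
    return mixed


# Toy signals: a short burst of "speech" over a constant "music" bed.
speech = [0.0, 0.5, -0.5, 0.5, -0.5, 0.0, 0.0, 0.0]
music = [1.0] * 8
track = mix(speech, music, music_gain=0.5, fade=2)
print(track)
```

Doing this by hand for every track, with proper filters and level matching, is the manual work that a mixing and mastering service takes over.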
Upcoming releases of API.audio will allow for even more sound design customisation: you will be able to select from assorted Sound Design Packages and place different sound elements, or even sound effects, throughout your speech content. This will let you produce full and complex sound design for your audio experience with just a few lines of code.
In fewer than 10 lines of code you can add sound design to your audio.
Choose from over 400 voices
You can choose from our library of 400+ voices across several TTS services, including Amazon, Google and Microsoft, and pick the right voice for your use case. On top of that, we have our own custom-built voices such as Digital Einstein.
- Voice selection - A catalogue of over 400 voices (more added weekly) from various voice providers including Messner, Azure, AWS and Google. Get voices from various providers with just a single integration in your application and radically reduce the amount of code you need to write!
- Speedy and frictionless - integrate end-to-end audio creation in less than 10 lines of code
- 50+ languages supported - including English, German, Spanish, French and many more
- Script creation and versioning - Import or create your own script and version it (e.g. for personalisation or to make it dynamic)
- Personalisation and dynamic parameters inside your script - deliver dynamic audio content such as name, speed, location. We support over 10k personalisation parameters
- Voice filtering to find the perfect voice for your use case. Filter by language, gender, accent, age, or by industry example
- Share your audio on Slack or Twitter - more integrations added monthly
- An ever-growing list of professional sound designs for your audio needs - whether you need a spooky nighttime track or some chill jazz tunes, with us you’ll have options
- End-to-end examples for Health & Fitness, MarTech and SalesOps use cases with Python and Node.js SDKs, with more added regularly
- Asynchronous text-to-speech engine, plus a synchronous text-to-speech engine for enterprise users
- SSML validator and support for multiple providers - upload scripts with your existing SSML formatting, no need to reformat
- State-of-the-art rendering engine - make up to 100 concurrent mastering requests
- Signed URLs - Increased security to prevent leeching and hotlinking of your audio content. API.audio uses signed JSON Web Tokens (JWT) to describe access restrictions
- And much more to come...
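For readers curious about the signed URLs item in the list above, here is a minimal sketch of the general technique: a short-lived HS256 JSON Web Token that restricts access to one path, built with the Python standard library only. It is not API.audio's implementation. The claim names (`path`, `exp`), the `token` query parameter, the example host and the shared secret are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_url(base_url, path, secret, ttl_seconds=300, now=None):
    """Return a URL carrying a JWT that expires after `ttl_seconds`."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"path": path, "exp": issued + ttl_seconds}  # hypothetical claims
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{base_url}{path}?token={signing_input}.{b64url(signature)}"


def verify_token(token, path, secret, now=None):
    """Check the signature, the path restriction and the expiry time."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{claims_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return False
        claims = json.loads(b64url_decode(claims_b64))
    except (ValueError, TypeError):
        return False  # malformed token
    current = time.time() if now is None else now
    # A production verifier would also validate the header's "alg" field.
    return claims.get("path") == path and current < claims.get("exp", 0)


secret = b"demo-secret"  # placeholder; keep real secrets out of source code
url = sign_url("https://cdn.example.com", "/audio/track.mp3", secret, ttl_seconds=60)
print(url)
```

Because the token names the path and an expiry time, a link copied to another site stops working once it expires, which is what makes hotlinking impractical.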
Try it here.
"It was straightforward to integrate Aflorithmic SDK, and I really appreciate the way you guys wrote the docs." - Hyperhuman
Aflorithmic is a London/Barcelona-based technology company. Its API.audio platform enables fully automated, scalable audio production by using synthetic media, voice cloning, and audio mastering, to then deliver it on any device, such as websites, mobile apps, or smart speakers.
With this audio-as-a-service platform, anybody can create beautiful-sounding audio, from simple text through to music and complex audio engineering, with no previous experience required.
The team consists of highly skilled specialists in machine learning, software development, voice synthesis, AI research, audio engineering, and product development.