A while ago, I wrote a short post with some tips for creating great prompts to be used in your phone system. Since then, I’ve been experimenting with some of the technical methods that can be used to process the recorded audio files. Here are a few tips with examples so you can hear the result of each method. A word of warning – what follows is a bit technical. 🙂
This is the original recording I began with (44.1 kHz 16 bit stereo).
I recorded it in my home office using my laptop and a decent USB condenser microphone, with Adobe Audition for the recording and editing. There is a bit of echo since I don’t have a proper studio, but it could be worse. I did a bit of noise reduction to take out the hum from my laptop fan and the furnace, then normalized the levels.
Now for a quick history lesson…
In the mid-20th century, the basic rules for TDM telephone circuits (https://en.wikipedia.org/wiki/Time-division_multiplexing) were invented. The short version is that we’re still generally using the same foundational principles for telephone calls. Yes, VoIP is introducing wide-band codecs like G.722 that have higher quality, but most phone calls will still have to traverse a legacy circuit at some point in their path between caller and called… which means that they are still limited to the original 64 kbps per channel.
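To put that 64 kbps figure in perspective, here is a small sketch of the arithmetic. It assumes the standard telephone channel of 8,000 samples per second at 8 bits per sample, and compares it with an uncompressed CD-quality recording. The function name is just for illustration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# One legacy telephone channel: 8 kHz, 8 bit, mono
telephone_bps = pcm_bit_rate(8000, 8, 1)

# A CD-quality recording: 44.1 kHz, 16 bit, stereo
cd_bps = pcm_bit_rate(44100, 16, 2)

print(telephone_bps)           # 64000
print(cd_bps)                  # 1411200
print(cd_bps / telephone_bps)  # roughly 22x more data than the phone line can carry
```

In other words, about 95% of the data in the original recording has to go somewhere before the caller hears it.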
In other words, regardless of the quality of the prompt that exists on the phone system, the quality of what the caller hears will be limited by the circuits through which the call has to travel. Many phone systems require audio files to be in the following format (or something similar): CCITT µ-Law 8 bit 8 kHz mono (often referred to simply as u-law; A-law is the closely related variant used in Europe and most of the rest of the world).
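For the curious, this is roughly what u-law does to each sample. The sketch below uses the continuous µ-law companding formula with µ = 255. Real G.711 encoders use a segmented approximation of this curve, so treat it as an illustration of the idea rather than a bit-exact encoder. The helper names and the simple `quantize` function are my own, not part of any standard library.

```python
import math

MU = 255  # companding constant for µ-law

def mulaw_compress(x):
    """Compress a sample in [-1, 1] so quiet values get more of the range."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def quantize(value, levels=256):
    """Snap a value in [-1, 1] to the middle of one of `levels` equal steps."""
    step = 2.0 / levels
    index = min(levels - 1, int((value + 1.0) / step))
    return -1.0 + (index + 0.5) * step

quiet = 0.01  # a quiet sample, e.g. the tail end of a word

# Plain 8-bit linear quantization
linear_error = abs(quantize(quiet) - quiet)

# 8-bit quantization applied in the companded domain
mulaw_error = abs(mulaw_expand(quantize(mulaw_compress(quiet))) - quiet)

print(linear_error, mulaw_error)  # the µ-law error is the smaller of the two
```

The point is that 8 bits go a lot further when the quiet parts of the signal get finer steps than the loud parts, which is why the format holds up as well as it does for speech.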
All this means that it is best to convert and edit the audio files yourself rather than leaving the conversion to whatever automatic algorithm happens to sit somewhere along the line. You’ll end up with MUCH better quality audio.
I tried a bunch of different methods to convert my original audio file to the required destination format, and got just as many results (click on the numbered examples below to hear the audio sample):
- Example 1 – First, I tried directly exporting my CD-quality original to u-law. The output file had quite a bit of background hiss. Technically usable, but not good enough.
- Example 2 – Next, I converted the sample type to 8 bit 8 kHz mono and then exported. Better, but it still adds hiss to the words.
- Example 3 – For this example, I opened a new project as 8 bit 8 kHz mono and set the project sample rate to 8 kHz in the audio hardware preferences. I then recorded a new example and exported. The result was similar. Not bad, but a little hiss added to the words.
- Example 4 – For the final example, I went back to the original file and applied hard limit and compressor filters, then followed the same steps as Example 2. The output file has more depth to it. Still not perfect, but really not too bad. I also found that this method works better for music.
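If you prefer scripts to menus, the steps in Example 4 can be sketched in plain Python. This is not what Audition does internally. It is a minimal illustration of the order of operations, assuming samples are floats in the range -1 to 1. The limiter ceiling, compressor threshold and ratio are made-up values, the compressor has no attack or release, and the resampler is a crude moving average plus linear interpolation. A real tool will do a much better job of each step.

```python
import math

def hard_limit(samples, ceiling=0.9):
    """Clip anything louder than the ceiling (illustrative value)."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

def compress(samples, threshold=0.5, ratio=4.0):
    """Simple static compressor: reduce the level above the threshold."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(math.copysign(mag, s))
    return out

def downmix_to_mono(left, right):
    """Average the two channels into one."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

def resample(samples, src_rate, dst_rate):
    """Crude resampler: moving-average smoothing, then linear interpolation."""
    width = max(1, round(src_rate / dst_rate))
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - width // 2)
        hi = min(len(samples), lo + width)
        window = samples[lo:hi]
        smoothed.append(sum(window) / len(window))
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate
        i = int(pos)
        frac = pos - i
        nxt = smoothed[min(i + 1, len(smoothed) - 1)]
        out.append(smoothed[i] * (1.0 - frac) + nxt * frac)
    return out

def prepare_prompt(left, right, src_rate=44100, dst_rate=8000):
    """Limit and compress first, then convert to 8 kHz mono, ready for export."""
    left = compress(hard_limit(left))
    right = compress(hard_limit(right))
    mono = downmix_to_mono(left, right)
    return resample(mono, src_rate, dst_rate)

# Usage: one second of a 440 Hz test tone standing in for a real recording
rate = 44100
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
prompt = prepare_prompt(tone, tone)
print(len(prompt))  # 8000 samples, i.e. one second at 8 kHz
```

The dynamics processing happens while the audio is still at full quality, and the conversion to 8 kHz mono happens before the final export to u-law.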
I’m sure there are more tricks that could be used to get an even better result (and if you know of any, please let me know). But the main takeaway from these tests is to avoid directly exporting to a compressed format. Convert, then export.
… and if you’d rather not go through all the trouble of recording and editing your own prompts, ask GM Voices for a quote. They’re great to work with.
If you have any questions about your IVR, please reach out. We’d be happy to help.