Adaptive Audio

Adaptive Audio technology allows a traditional interactive voice response (IVR) system to adapt in real time to the skills and preferences of individual telephone callers. When operating in this adaptive mode, the IVR dynamically adjusts the speaking rate (in words per minute), tone and content of audio messages based on exhibited caller behavior.

The software allows the IVR to emulate what humans do naturally and instinctively (and often unknowingly) to communicate more effectively with each other during normal conversation.

An adaptive IVR matters to call centers and to IT and telecommunications departments because it is intended to reduce operating expenses, improve IT efficiency, enhance the caller experience and improve customer satisfaction.

With Adaptive Audio software added to a traditional IVR system, the system adapts in real time to the needs and skills of each caller. Each caller’s Speech and DTMF (touch-tone) responses are continuously monitored for speed of entry and accuracy at every point in the telephone call.

This information is initially used to build a database reflecting the behavioral characteristics of callers as they interact with the IVR system. Once a behavioral database of sufficient size has been acquired, the Adaptive Audio software automatically enables the IVR to adjust the speaking rate, the tone and content of audio messages, and the time allowed to respond, based on exhibited caller behavior.

This provides a call experience that is tailored in real time to the skills and environment of the caller, and it reduces operating costs for businesses. The Adaptive Audio software also selects the best form of input (Speech or DTMF) for each point in the call and transfers callers who are having particular difficulty to a human operator. Audio message content with variations in tone, inflection, volume and other vocal characteristics can be used selectively by the software to tailor the call experience.
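The monitor-then-adjust loop described above might look like the following Python sketch. The class name `CallerTracker`, the 160 words-per-minute default and every threshold are assumptions chosen for illustration; the source does not publish the actual algorithm or its values.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, not taken from the product.
SLOW_ERROR_RATE = 0.3     # above this share of rejected inputs, slow down
SLOW_LATENCY_S = 4.0      # above this mean response time, slow down
FAST_LATENCY_S = 1.5      # below this (with no errors), speed up


@dataclass
class CallerTracker:
    """Running record of one caller's responses during a single call."""
    latencies: list = field(default_factory=list)  # seconds taken per response
    errors: int = 0                                 # rejected or invalid inputs

    def record(self, latency_s: float, accepted: bool) -> None:
        """Store the speed and accuracy of one Speech or DTMF response."""
        self.latencies.append(latency_s)
        if not accepted:
            self.errors += 1

    def speaking_rate_wpm(self, base_wpm: int = 160, min_responses: int = 3) -> int:
        """Pick a prompt speaking rate from the behavior observed so far."""
        n = len(self.latencies)
        if n < min_responses:
            return base_wpm  # too little evidence: keep the default rate
        error_rate = self.errors / n
        mean_latency = sum(self.latencies) / n
        if error_rate > SLOW_ERROR_RATE or mean_latency > SLOW_LATENCY_S:
            return round(base_wpm * 0.85)  # struggling caller: slow down
        if error_rate == 0 and mean_latency < FAST_LATENCY_S:
            return round(base_wpm * 1.10)  # fluent caller: speed up
        return base_wpm
```

Under these assumed values, a caller who answers three prompts quickly and without errors would hear slightly faster prompts, while a slow or error-prone caller would hear slower ones.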

The technology also provides detailed reports on how callers interact with the IVR system. Data is provided that shows where callers encounter difficulties, when Speech or DTMF is more effective as a means of input, how many callers reached unacceptable levels of poor interaction with the system and other important behavioral characteristics.


In the field of telecommunications, interactive voice response (IVR) is a technology that allows a computer to detect voice and touch tones over the public switched telephone network (PSTN). The IVR system responds to a caller's input with pre-recorded human audio or computer-generated text-to-speech audio to further direct callers on how to retrieve the information they seek.

Every caller to an IVR has his or her own set of aural, speech, hand-eye coordination (as used in DTMF keypad entry) and material comprehension skills. Add to this environmental variables such as background noise, poor mobile phone connections and caller distraction, and it becomes clear that each call to the IVR system is a unique, dynamic interaction. For IVRs to be effective tools for such human interaction, good practice dictates that they adapt to the dynamics of this caller interaction in real time as the call dialogue progresses.

This is one of the main reasons real human operators are so good at handling any type of telephone call – they can handle the dynamics of human conversation intuitively and with ease. Telephone callers know this all too well and will attempt to circumvent the system and reach a human as soon as the IVR fails to be productive for them.

Traditional IVRs are “static” and make no dynamic adjustments for the real-time behavior of individual callers. As a result, all callers are handled in the same way regardless of their knowledge, experience, navigation skills and willingness to use the IVR system. Traditional IVR systems do not listen for signs that the caller understands what is being said and is comfortable with the pace, message content and timing of the dialogue. Without “tuning in” to each caller's behavior during the call, real IT operational efficiencies are lost and the call experience is needlessly compromised.

While a well-designed telephone call script with optimal structure and content, intentional pauses, grammar tuning and context forming reflects excellent design principles, the system falls short if it does not consider the real-time behavior of the caller, just as a human would under the same circumstances.


Adaptive Audio includes several features intended to improve the performance of poorly designed voice applications and to help well-designed applications perform even better.

In addition to adjusting the pace and audio content of prompts in real time based on individual callers' skills, the software continually analyzes how callers are navigating the voice application and optimizes performance in several ways.

With the Dynamic Application Smoothing feature, for example, the software tracks all points in the telephone dialogue that require a response from the caller. As the application is used over time, Adaptive Audio determines which of these points present the most and the least difficulty for callers. The software then slows down the speaking rate for the most difficult areas and speeds it up slightly for the areas that callers find easiest.

If, for example, callers are having difficulty responding accurately to a long, complex or poorly worded voice prompt, the software will eventually learn this and slow down that part of the dialogue. If, on the other hand, there are parts of the call script that callers find very easy to respond to, such as short prompts with easy questions or simple content, the software will eventually speed these segments up by a small percentage. This reduces error rates and lets the call dialogue flow at a pace that matches aggregate caller skills. This process works in parallel with the Adaptive Audio speed adjustments made for individual callers based on their own navigation skills, so the net adjustment at each point is applied relative to the individual caller's adapted rate.
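One way to picture Dynamic Application Smoothing is as a per-prompt multiplier applied on top of the caller's own rate. The sketch below is only an illustration: the function names, the 10-point margin around the mean error rate and the 0.90 and 1.05 multipliers are assumptions, not values from the product.

```python
def smoothing_factors(error_rates, slow=0.90, fast=1.05, margin=0.10):
    """Map each prompt id to a speaking-rate multiplier.

    error_rates: dict of prompt id -> aggregate share of failed responses
    (0..1) collected from all callers over time.
    """
    if not error_rates:
        return {}
    mean = sum(error_rates.values()) / len(error_rates)
    factors = {}
    for prompt, rate in error_rates.items():
        if rate > mean + margin:
            factors[prompt] = slow   # hard prompt: speak more slowly
        elif rate < mean - margin:
            factors[prompt] = fast   # easy prompt: speak a little faster
        else:
            factors[prompt] = 1.0    # typical prompt: leave unchanged
    return factors


def prompt_rate_wpm(caller_wpm, factor):
    """Apply the aggregate factor relative to the caller's own adapted rate."""
    return round(caller_wpm * factor)
```

Because the factor multiplies the caller-specific rate, a fast caller still hears a difficult prompt faster than a slow caller does, only slower than the rest of that caller's own dialogue.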

The software also improves performance through its Best Modality Signaling feature. This feature informs the voice application whether Speech or DTMF input by callers has historically been more efficient or more successful, by a significant margin, at given points in the call script.

If, for example, entering a string of digits or choosing from a particular menu of allowable named options via Speech has proved to be error-prone in the application, Adaptive Audio will learn this and recommend the use of DTMF for these points. Conversely, if DTMF and Speech input have proved to be equally effective at particular points (meaning they have about the same proportion of input errors) but Speech has proved to be faster, the software will learn this and recommend the use of Speech at these points. This increases call automation rates, shortens the call and reduces user errors and retry attempts, resulting in less frustration for the caller.
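The decision rule described above (prefer the clearly more accurate input mode, and fall back to the faster one when accuracy is comparable) can be sketched as follows. `ModalityStats`, `recommend_modality` and the 5-point `error_margin` that stands in for "a significant margin" are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class ModalityStats:
    """Historical results for one input mode at one point in the call script."""
    attempts: int
    errors: int
    mean_seconds: float  # average time callers needed to complete the input

    @property
    def error_rate(self) -> float:
        return self.errors / self.attempts if self.attempts else 0.0


def recommend_modality(speech: ModalityStats, dtmf: ModalityStats,
                       error_margin: float = 0.05) -> str:
    """Return "SPEECH" or "DTMF" for this point in the call script."""
    diff = speech.error_rate - dtmf.error_rate
    if diff > error_margin:
        return "DTMF"      # Speech is clearly more error-prone here
    if diff < -error_margin:
        return "SPEECH"    # DTMF is clearly more error-prone here
    # Comparable accuracy: recommend whichever mode has been faster.
    return "SPEECH" if speech.mean_seconds < dtmf.mean_seconds else "DTMF"
```

For a digit-string prompt where Speech fails 30% of the time and DTMF 5%, this sketch recommends DTMF. Where both fail at roughly the same rate, it recommends the faster mode.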

The software also has an Adaptive Timeout Control feature. This allows the voice application to dynamically extend timeout periods for individual callers who have demonstrated difficulty navigating any area of the call script. Since Adaptive Audio is continuously aware of all caller behavior while the application is executing, callers who are experiencing particular difficulty are given a measured amount of additional time to respond. The amount of additional time is based on the caller's progress, skill level and the difficulty level of the next voice prompt. This increases call automation rates and reduces error rates and unwarranted call transfers to human operators by allowing more time for novice callers to respond without penalizing expert and skilled users.
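A minimal sketch of how the three inputs named above (caller progress, caller skill and the difficulty of the next prompt) could be combined into a timeout follows. The 0-to-1 scales, the equal weighting and the 6-second cap are assumptions made for illustration only.

```python
def _clamp01(x: float) -> float:
    """Keep a score inside the 0..1 range."""
    return max(0.0, min(1.0, x))


def response_timeout(base_s: float, skill: float, progress: float,
                     difficulty: float, max_extra_s: float = 6.0) -> float:
    """Return the time (seconds) a caller is given to answer the next prompt.

    skill:      0 = novice, 1 = expert (estimated from behavior so far)
    progress:   0 = no successful steps yet, 1 = navigating without trouble
    difficulty: 0 = easiest prompt, 1 = hardest prompt in the application
    """
    skill, progress, difficulty = map(_clamp01, (skill, progress, difficulty))
    # How much help this step calls for, ignoring the caller's skill.
    need = 0.5 * (1.0 - progress) + 0.5 * difficulty
    # Scaling by (1 - skill) means experts get no extra waiting time.
    return base_s + max_extra_s * (1.0 - skill) * need
```

With a 5-second base timeout, an expert keeps the 5 seconds, while a struggling novice facing the hardest prompt gets the full 11 seconds under these assumed values.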

The Preemptive Transfer Alert (PTA) feature of Adaptive Audio keeps a cumulative index of how well each individual caller is navigating the call script and identifies callers having excessive difficulty throughout the call. When such callers are identified, Adaptive Audio recommends preemptively transferring them to a human operator. Thresholds for PTA signaling are software programmable, and the signaling factors in the likelihood that a human operator is available, based on incoming call volume. This further reduces unnecessary callbacks and abandoned calls.
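The cumulative index and programmable threshold could be modeled roughly as below. The scoring weights, the `busy_penalty` parameter and the idea of passing operator availability as a 0-to-1 estimate derived from call volume are all assumptions. The source states only that such factors are considered.

```python
class TransferAlert:
    """Tracks one caller's difficulty and signals when to hand off to a human."""

    def __init__(self, threshold: float = 3.0, busy_penalty: float = 2.0):
        self.threshold = threshold        # index needed when operators are free
        self.busy_penalty = busy_penalty  # extra index required when none are
        self.index = 0.0                  # cumulative difficulty index

    def record(self, accepted: bool, retries: int = 0) -> None:
        """Update the index after each caller response."""
        if accepted:
            # A success lowers the index a little, but never below zero.
            self.index = max(0.0, self.index - 0.5)
        else:
            # A failure raises it, more so after repeated retries.
            self.index += 1.0 + 0.5 * retries

    def should_transfer(self, operator_availability: float) -> bool:
        """operator_availability: 0 (none likely free) .. 1 (certainly free)."""
        availability = max(0.0, min(1.0, operator_availability))
        limit = self.threshold + self.busy_penalty * (1.0 - availability)
        return self.index >= limit
```

Raising the limit when operators are unlikely to be free keeps the system from transferring borderline callers into a long queue during peak call volume.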

The Application Dependent Profile feature provides independent control over the audio playback rates, audio content and adaptive patterns used in multi-application environments. This allows the application developer to set custom values for each voice application supported by the IVR, improving performance and efficiency across a wide range of distinct applications.
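Per-application settings of this kind are often expressed as a small table of profiles with a shared default. The field names, application ids and numbers below are hypothetical and serve only to show the shape such a configuration might take.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Adaptive settings for one voice application (all values illustrative)."""
    base_wpm: int = 160          # default speaking rate
    min_wpm: int = 130           # slowest rate adaptation may choose
    max_wpm: int = 190           # fastest rate adaptation may choose
    base_timeout_s: float = 5.0  # default time allowed for a response


DEFAULT_PROFILE = AppProfile()

# One entry per voice application hosted on the same IVR.
PROFILES = {
    "card_activation": AppProfile(base_wpm=150, base_timeout_s=7.0),
    "order_status": AppProfile(base_wpm=170, max_wpm=200),
}


def profile_for(app_id: str) -> AppProfile:
    """Look up an application's profile, falling back to the default."""
    return PROFILES.get(app_id, DEFAULT_PROFILE)


def clamp_rate(profile: AppProfile, wpm: int) -> int:
    """Keep an adapted speaking rate inside the application's allowed range."""
    return max(profile.min_wpm, min(profile.max_wpm, wpm))
```

Keeping the limits in the profile lets the same adaptation logic run unchanged across applications, with each application bounding how far the rate may drift.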

The Caller Behavior Analytics Reporting feature provides real-time, comprehensive analysis and reporting on caller behavior, caller expertise and the willingness of callers to use the voice application. These periodic reports pinpoint application trouble spots and indicate where the application call flow can be improved. Also included are caller navigation difficulty ratings for each point in the call script and comparisons of adaptive versus non-adaptive performance.

Any IVR system that supports telephone inquiries of at least 50 seconds in duration and with at least three responses from the caller can benefit from Adaptive Audio technology. There needs to be enough interaction with the caller so that the system can adequately gauge their skill level and ability to navigate the system.

The types of IVR applications that benefit from the use of Adaptive Audio technology include:

*Credit Cards
*On-line shopping

Generally speaking, any business enterprise or call center that uses IVR systems to handle significant call volumes can reduce IT operational costs and improve customer satisfaction with Adaptive Audio technology. The higher the call volume and the longer the average automated call, the greater the benefits are likely to be.

In addition to traditional on-premises IVR systems, the technology can be deployed on hosted and open-standards-based solutions, such as VoiceXML platforms, and on VoIP networks.

See also

*Voice User Interface
*User interface
*Interactive voice response
*Human-computer interaction
*Digital audio


* The Effective Use of Adaptation in VUI Design
* This Time, It's Personal
* Interactive Digital: The Effective Use of Adaptation in VUI Design, Nancy Jamison, "Speech Technology", June 2007, p. 42.

Wikimedia Foundation. 2010.
