7 Steps to a Successful Recording in an eLearning Localization Project

Audio is one of the key components of an eLearning course. It creates the atmosphere and sets the tone for the entire learning experience. Knowing how to produce high-quality audio recordings in the target language that fully convey the meaning of the source language is crucial to successfully localizing any eLearning course.

At Boffin, we follow the 7 steps below to ensure a successful recording job.

1.    Prepare standard voice samples

Using the provided course or storyboard, we identify the scenes and characters involved in a recording, as well as the characters' features, such as gender, age, occupation, and position. Based on this information, we prepare several suitable voice samples and present them to our client to choose from. To save time and ensure that voice talents are available when the actual recording takes place, we also ask our client to choose backup talents in case the first-choice talent is not available at the time of recording.

2.    Create a pronunciation guide

Some words in a recording may be used with high frequency or have uncertain pronunciations. Other words may be pronounced differently in different languages. Examples of such words include company, product, or brand names, as well as abbreviations and figures in serial numbers.

Before recording, we work with the voice talents to identify these words and confirm their pronunciations with our client. Doing so prevents mispronunciations that could force us to re-record the entire audio, adding both time and cost to the project.

For instance, take a look at the terms in the chart below. As you can see, the treatment of such terms differs from language to language.

| Terms | French | Spanish (Latin) | Portuguese (Brazil) |
|---|---|---|---|
| Vocollect | "Vocowllect" | American Accent | say it in English |
| ROI | "Air-Ow-E" | ROI (Spanish) | retorno do investimento |
| Vocollect VoiceDirect® | "Vocowllect VoiceDirect" | American Accent | say it in English |
| RF | Air F | ere efe (Spanish spelling) | erre efe --> in Portuguese |
| T700 | T seven hundred | T seven oh oh | T-7-0-0 |
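A guide like this can also be kept in a machine-readable form so that script lines containing special words are flagged before recording. The following is only a minimal sketch: the dictionary layout, the language codes (`fr`, `es-419`, `pt-BR`), and the function name `find_guide_terms` are illustrative assumptions, not part of our actual tooling. The three sample entries are copied from the chart above.

```python
import re

# Hypothetical in-memory pronunciation guide; the language codes are assumed labels.
PRONUNCIATION_GUIDE = {
    "Vocollect": {
        "fr": '"Vocowllect"',
        "es-419": "American accent",
        "pt-BR": "say it in English",
    },
    "ROI": {
        "fr": '"Air-Ow-E"',
        "es-419": "ROI (Spanish)",
        "pt-BR": "retorno do investimento",
    },
    "T700": {
        "fr": "T seven hundred",
        "es-419": "T seven oh oh",
        "pt-BR": "T-7-0-0",
    },
}


def find_guide_terms(script_line, language, guide=PRONUNCIATION_GUIDE):
    """Return (term, instruction) pairs for guide terms that occur in script_line."""
    hits = []
    for term, per_language in guide.items():
        # Whole-word, case-sensitive match so "ROI" does not fire inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", script_line):
            hits.append((term, per_language[language]))
    return hits
```

Calling `find_guide_terms(line, "fr")` on each script line gives the talent a short list of the special words in that line together with the agreed treatment.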

3.    Record custom voice samples

To ensure that the style and rhythm of a recording match the course exactly and that the pronunciation of special words is correct, we ask the chosen talent to provide our client with a recorded sample of the content and special words. If the client is not satisfied with the sample, we ask the talent to adjust as requested. In certain cases, we may suggest that the client select another qualified talent to avoid having to re-record after the initial recording has been completed.

4.    Record the audio

After the above preparation steps, we launch the actual recording work of the project by sending files to the chosen talent(s). Files include:

a. The script, which contains the file name, text for recording, character, and talent. We use colored highlighting to indicate a change in character. In the example that follows, green highlighting tells us that the narrator speaks after the opening lines of the character "Boss".

| No. | File Name | Script | Character | Talent |
|---|---|---|---|---|
| 1 | M01_t01_p01_01.wav | Welcome to this course for implementation professionals who work with Vocollect Voice® solutions. I'm known by many names, but you can just call me 'The Boss'. | | |
| 2 | M01_t01_p01_02.wav | If your responsibilities include installing voice systems, training, managing changes, and coordinating the hand-off to technical support, you can consider yourself an implementation professional. | Narrator | Gary |

b. Pronunciation guidelines for special words.

c. The approved sample.

d. Style and format recording requirements, for instance:

File format: 44100 Hz, 16-bit, mono, normalized to -6 dB (or similar), one large *.wav file without breaths, reading errors, background noise, echo, reverb, or any similar sound effects. Please also leave about 0.5 seconds of silence at the beginning of every cell.

Style: Please use a clear, bright, neutral voice. Also, use a conversational tone as if you are talking with a friend. Please keep the same style as the attached approved sample as well.

e. If the recording is for a video, the length of the audio is strictly limited due to video synchronization requirements; we remind the talent to adhere to time restrictions noted in the script.
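The format and length requirements in items d and e can be checked automatically when files come back from the talent. Below is a minimal sketch using Python's standard `wave` module. The function name `check_wav_format` and the optional `max_seconds` parameter are assumptions made for illustration, and the check covers only sample rate, bit depth, channel count, and duration; it does not detect breaths, noise, or reverb.

```python
import wave


def check_wav_format(path, max_seconds=None):
    """Return a list of problems found in a WAV file; an empty list means it passes."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        if rate != 44100:
            problems.append(f"sample rate is {rate} Hz, expected 44100 Hz")
        # getsampwidth() returns bytes per sample, so 2 bytes means 16-bit.
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16-bit")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected mono")
        duration = wav.getnframes() / rate
    # Optional length limit, e.g. for audio that must stay in sync with a video.
    if max_seconds is not None and duration > max_seconds:
        problems.append(f"duration is {duration:.2f} s, limit is {max_seconds:.2f} s")
    return problems
```

An empty result means the file meets the basic technical requirements; listening checks for style and noise still have to be done by a person.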

5.    Perform quality assurance

We send both the recorded file and the script to a native speaker to check the accuracy of the content and pronunciation. If the native speaker identifies problems, he or she notes them in the script using tracked changes and returns the script to us.


6.    Identify and correct pickups

Pickups are problems identified during quality assurance, such as missing text, misreadings, duplicate readings, and mispronunciations. We send pickups to the talent for re-recording, emphasizing the need to maintain consistency in style and volume across all batches. Our goal is to avoid audible differences in style when adding re-recordings to the main recording. For instance, even if there is only one pickup in a sentence or paragraph, we ask the talent to re-record the entire sentence or paragraph so that it sounds consistent.

7.    Perform post-production work

We typically use audio-editing software such as Adobe Audition and Sony Sound Forge for post-production work, which includes:

a.  Noise reduction: The most common noises are background and breathing noises. We typically fix these and other unnecessary noise issues by removing or muting the unwanted sounds.

b.  Normalization: Sometimes the volume varies from batch to batch even though the same talent recorded all of them. We normalize the audio to achieve consistent volume across the recording.

c.   Splitting (cutting): A course generally consists of many individual modules and pages. We record the content of all modules in a single audio file and then split or cut it into smaller pieces according to the script. To ease synchronization, we keep an interval of approximately 0.5 seconds at the start of each split file. The interval between different sections within a file generally runs from 0.5 to 1 second.

d.  Format conversion: Most multimedia audio courses use a 44.1 kHz sample rate, a 16-bit sample resolution, a mono sound channel, and the MP3 format. We confirm the format requirements at the start of the project so that we can avoid, or prepare for, any batch conversion.
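The normalization and splitting steps above are done in an audio editor, but the arithmetic behind them is simple. The sketch below works on a plain list of 16-bit sample values and rests on several assumptions: it uses peak normalization to -6 dB (editors may offer loudness-based methods instead), it prepends exact silence rather than preserving an existing pause, and the function names and the `cut_points` parameter (frame indices where the recording is cut) are hypothetical.

```python
def peak_normalize(samples, target_db=-6.0, full_scale=32767):
    """Scale 16-bit samples so the loudest one sits at target_db relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # pure silence or empty input: nothing to scale
    # Convert decibels to a linear amplitude ratio: ratio = 10 ** (dB / 20).
    target_peak = full_scale * 10 ** (target_db / 20)
    gain = target_peak / peak
    # Clamp to the valid 16-bit range in case a positive target_db is requested.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


def split_with_lead_in(samples, cut_points, sample_rate=44100, lead_in_seconds=0.5):
    """Cut samples at the given frame indices and prepend silence to each piece."""
    silence = [0] * int(sample_rate * lead_in_seconds)
    bounds = [0, *cut_points, len(samples)]
    return [silence + list(samples[start:end]) for start, end in zip(bounds, bounds[1:])]
```

Applying `peak_normalize` to every batch with the same target brings their peaks to the same level, which is one simple way to even out volume differences between batches before the recording is split.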

