Add Audio to Your Etext

Pearson Education © October 2011

Adding Audio to an eText

This document describes:

· How audio can enhance eText

· Description of audio options and audio sync styles

· How to add clickable links to your eText

· Important info about vendors, costs, and timelines

· Frequently asked questions

How audio can enhance eText

Business units (BUs) can enhance the student’s learning experience by adding audio to their eTexts. Here are some ways it is used today:

· For all learners, as a way to bring a written book to life and enhance engagement and interest in reading and learning.

· For early readers, to enable students to follow along as text is read using word-by-word highlighting.

· For language learners, to provide proper pronunciation and tonal qualities to enhance language acquisition.

· For visually impaired learners, as a way to listen to an eText without an additional reading device.

Description of audio options

eText offers four ways you can add audio to your eTexts:

· Modal audio – Modal audio uses a pop-up audio player that stays in front of the user until it is closed. While the audio plays, the background page is dimmed and only the audio playback controls can be used. The modal audio player is used when the author wants the reader to stay focused on the audio track being played. Modal audio is supported by the Mac/Windows “live” view, the iPad view, and the “offline” PSN, OCAS and CD packages.

· Faceless modal audio – Faceless modal audio plays an audio file but doesn’t display the audio player. Because faceless modal audio doesn’t provide any controls for starting, stopping, pausing or resuming playback, it’s intended for short audio clips such as glossary terms, keywords, or relatively short phrases. While the audio clip is playing, the user cannot press any controls or click on the page without stopping the audio. Faceless modal audio is supported on the Mac/Windows “live” view, the iPad view, and the “offline” PSN, OCAS and CD packages.

· Non-modal audio – The non-modal player is activated by clicking an icon on the eText toolbar. The Elementary user interface (UI) or “skin” displays a hippopotamus, while the Standard UI displays a megaphone. When the non-modal player is active, a user can scroll up and down the page, access the table of contents and see all non-modal audio icons/highlights. All other hotspot links on the page are hidden. Non-modal audio is supported on the Mac/Windows “live” view and the “offline” PSN, OCAS and CD packages. Non-modal audio is available on Mac/Windows and iPad.

· Non-modal audio with synchronized highlighting – This option plays audio while highlighting and/or underlining the corresponding words, phrases or sentences in sync. It requires working with a third-party vendor, VitalSource Technologies Inc. (VST), formerly VPG Integrated Media, which creates the per-page XML files needed to synchronize the highlighting. Non-modal audio with synchronized highlighting is supported on the Mac/Windows “live” view, as well as the PSN, OCAS and CD “offline” packages. This solution is available for both Mac/Windows and iPad.

Synchronized highlighting options

Type / Description / What it looks like
Word/Sentence / Provides highlights by sentence in color and then word-by-word in a darker shade. /
Word/Paragraph / Highlights an entire paragraph in color and then word-by-word in a darker shade. /
Word/Line / Highlights the line (regardless of sentence breaks) in color and then word-by-word in a darker shade. /
Block/Multiple / Highlights a pre-determined block of text specified by the audio clips on the page. Most commonly used in School Math and Science books. /

Steps to implement

A. Modal and faceless audio:

§ The Business Unit (BU) creates a Custom Hotspot Spreadsheet. For details on this spreadsheet, see:
http://cmsna.pearson.com/groups/etext/wiki/c50b9/Production__Higher_Ed_School__Spreadsheet_Workflow.html.

§ The BU specifies whether the audio is to be accessed via:

o Highlighted text, artwork or other region: The user clicks on a specific region (a “hotspot”) on the page to play the audio.

o Icon: The user clicks on an icon (either the default eText audio icon or a BU-supplied custom audio icon) to play the audio.

§ The BU submits the hotspot spreadsheet to the eText Production Vendor.

§ The BU uploads audio files to the Pearson Media Server using the same path/file information supplied in the spreadsheet.

§ The Production vendor ingests the hotspot spreadsheet, and repositions and resizes hotspots on pages where necessary.

§ The BU checks to see that icons and/or regions have been correctly positioned and linked to the appropriate audio file. Highlights and icons are repositioned by the production vendor in the Authoring Tool or they may be adjusted by ingesting a modified spreadsheet.

§ Once the eText is final and has received BU approval, the BU submits a Promotion Request Form: (http://wpslive.pearsoncmg.com/cmg_forms_library/116/29779/7623431.cw/index.html) to make the eText title available for customer access.

B. Non-modal audio:

§ The Business Unit (BU) creates a Custom Audio Spreadsheet. For details on this spreadsheet refer to:
http://cmsna.pearson.com/groups/etext/wiki/c50b9/Production__Higher_Ed_School__Spreadsheet_Workflow.html. The BU can specify whether the audio is accessed via:

v Highlighted text: The user clicks on a specific region on the page to play the audio.

v Icon: The user clicks on the multimode eText audiotext synch icon to play the audio.

§ The BU submits an eText Conversion Request form.

For School: http://cmsna.pearson.com/groups/etext/wiki/89106/School_USCanadaERPI_eText_Conversion_Request.html

For Higher Ed:

http://cmsna.pearson.com/groups/etext/wiki/b3bca/HEINTL_eText_Conversion_Request.html

§ BU uploads all assets including the Custom Audio Spreadsheet.

§ The Production vendor ingests the audio spreadsheet.

§ The BU checks to see that icons and/or regions have been correctly positioned and linked to the appropriate audio file. Highlights and icons can only be repositioned by the production vendor through the ingestion of a modified spreadsheet.

§ Once the eText is final and has received BU approval, the BU submits the Promotion Request Form: http://wpslive.pearsoncmg.com/cmg_forms_library/116/29779/7623431.cw/index.html to make the eText title available for customer access.

C. Non-modal audio with synchronized highlighting:

§ The Business Unit (BU) contacts VST and agrees on the scope of work and timelines. VST will need the following:

v book title

v page count/estimated word count

v synch type (word/sentence, word/paragraph). See the examples included at the end of this document.

v audio recording information, i.e., whether VST will record the audio or the BU will deliver audio files

v dates needed for final delivery to Pearson

§ The BU submits an eText Conversion Request form:

For School: http://cmsna.pearson.com/groups/etext/wiki/89106/School_USCanadaERPI_eText_Conversion_Request.html

For Higher Ed:

http://cmsna.pearson.com/groups/etext/wiki/b3bca/HEINTL_eText_Conversion_Request.html

§ BU uploads all necessary components (cover pages, icons, pages, and spreadsheets) to the Production Vendor.

§ Upon the successful ingestion by the Production Vendor, the BU reviews the pages in Authoring Server and requests final changes to pages. Once the pages are final, the Production Vendor uses eText Content Manager to create a “package” which contains the following:

v PDFs: Low-res, cropped PDFs of the entire title.

v SWFs: The SWF files generated from the low-res PDFs by the Pearson eBook team.

v Manifest XML: Each book requires a file named cmmanifest-audio-text.xml, which is generated by the Pearson eBook team.

§ If the BU is providing prerecorded audio, the files must be uploaded to the VST FTP server as MP3 files. The sample rate must be 44.1 kHz to avoid a known bug with Flash Player 9.0.115.

§ If VST is recording the audio, choices of voice talent will be posted and a style guide will be created for the recording and syncing.

§ VST produces the agreed-upon scope of work. This may include recording, “chunking”, and creating the XML that provides the synchronized highlighting. VST works directly with the BU to preview and QA audio using VST tools.

§ Once completed, VST uploads the audiotext synch page XML files to the eText Content Manager and uploads the corresponding audio files to Media Server. VST then notifies the BU and the Production Vendor when this is completed.

§ The Production Vendor ingests the XML audiotext synch page content. If they run into any errors, the Production Vendor will consult with VST for resolution.

§ The Production Vendor alerts VST and the BU when the title is ready for final review.

§ The BU provides the Production Vendor with final approval via email notification or continues with additional production work as needed until the eText is complete and ready for promotion.

§ When final, the BU submits a Promotion Request form http://wpslive.pearsoncmg.com/cmg_forms_library/116/29779/7623431.cw/index.html to promote eText to PROD1 (HiEd/Int’l) or PROD2 (School).
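The MP3 sample-rate requirement in the steps above can be checked before files are uploaded to VST. The following is a minimal sketch in Python, not a Pearson or VST tool: the function names are hypothetical, and it simply reads the sample rate from the first plausible MPEG audio frame header after an optional ID3v2 tag. It does not validate the rest of the file.

```python
# Sample rates (Hz) by MPEG version bits and sample-rate index,
# as laid out in the MPEG audio frame header.
SAMPLE_RATES = {
    0b11: (44100, 48000, 32000),  # MPEG-1
    0b10: (22050, 24000, 16000),  # MPEG-2
    0b00: (11025, 12000, 8000),   # MPEG-2.5
}


def skip_id3v2(data: bytes) -> int:
    """Return the offset just past a leading ID3v2 tag, or 0 if there is none."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The tag size is stored as four 7-bit ("synchsafe") bytes.
        size = ((data[6] & 0x7F) << 21) | ((data[7] & 0x7F) << 14) \
            | ((data[8] & 0x7F) << 7) | (data[9] & 0x7F)
        return 10 + size
    return 0


def mp3_sample_rate(data: bytes):
    """Return the sample rate of the first plausible frame header, or None."""
    pos = skip_id3v2(data)
    while pos + 4 <= len(data):
        b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
        if b0 == 0xFF and (b1 & 0xE0) == 0xE0:  # 11-bit frame sync
            version = (b1 >> 3) & 0b11
            layer = (b1 >> 1) & 0b11
            bitrate_index = b2 >> 4
            rate_index = (b2 >> 2) & 0b11
            # Skip reserved values, which indicate a false sync match.
            if (version in SAMPLE_RATES and layer != 0
                    and bitrate_index != 0xF and rate_index != 0b11):
                return SAMPLE_RATES[version][rate_index]
        pos += 1
    return None


def meets_sample_rate_requirement(data: bytes) -> bool:
    """True if the audio appears to be sampled at 44.1 kHz (44,100 Hz)."""
    return mp3_sample_rate(data) == 44100
```

To check a file on disk, read it in binary mode and pass the bytes in, for example `meets_sample_rate_requirement(open("chapter01.mp3", "rb").read())`, where the file name is only a placeholder.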

Important info about vendors, costs, and timelines

Business Units (BUs) can work with their vendor of choice to record the audio files. VST is currently Pearson Education’s sole provider of audiotext synchronization services.

Please note that VST has a professional audio recording studio on their premises. If a BU hires VST for both audio recording and audiotext synch services, there is a price reduction for synching.

BUs must contact VST directly and negotiate specific terms of contract such as project scope, costs, and timeframes. However, there is a standard rate and schedule that usually applies (below).

Schedule

Please allow 4 weeks from the time VST receives final audio files and the eText information package from HOV. Please allow 1 week for QA, corrections, and deployment to Pearson servers.
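As a rough planning aid, the schedule above can be turned into dates. This is a minimal sketch in Python with hypothetical function and variable names. It assumes that the 4-week period starts once VST has received both the final audio files and the eText information package, and that the 1-week QA period follows the 4 weeks rather than overlapping them. Actual timelines depend on the terms each BU negotiates with VST.

```python
from datetime import date, timedelta

SYNC_LEAD_TIME = timedelta(weeks=4)  # from receipt of audio files and package
QA_LEAD_TIME = timedelta(weeks=1)    # QA, corrections, and deployment


def estimate_schedule(audio_received: date, package_received: date):
    """Return (sync_complete, deployed) dates under the assumptions above."""
    # Assumption: the clock starts when the later of the two deliveries arrives.
    start = max(audio_received, package_received)
    sync_complete = start + SYNC_LEAD_TIME
    deployed = sync_complete + QA_LEAD_TIME
    return sync_complete, deployed


sync_complete, deployed = estimate_schedule(date(2011, 10, 3), date(2011, 10, 10))
print(sync_complete, deployed)  # 2011-11-07 2011-11-14
```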

VST’s Quoted Pearson Rates – As of October 2011

Word-by-Word Sync

These rates apply regardless of background highlighting treatment.

VST recorded audio / English / $ 0.05 / Word
VST recorded audio / Spanish / $ 0.07 / Word
Non-VST recorded audio / English / $ 0.08 / Word
Non-VST recorded audio / Spanish / $ 0.10 / Word

Block Sync

VST recorded audio / English / $ 0.75 / Block
VST recorded audio / Spanish / $ 0.90 / Block
Non-VST recorded audio / English / $ 1.05 / Block
Non-VST recorded audio / Spanish / $ 1.20 / Block

Non-Synched Link on Page

VST recorded audio / English / $ 0.75 / Link
VST recorded audio / Spanish / $ 0.90 / Link
Non-VST recorded audio / English / $ 1.00 / Link
Non-VST recorded audio / Spanish / $ 1.25 / Link

In addition to the fees outlined above, there is a $100 per title deployment fee. A title is defined as a standalone file deliverable to the eText Production Vendor and includes demos, full product, and/or revisions. There is also a $50 HOV charge for ingestion of audio sync content.

Titles with low word counts are subject to the following minimum fees.

English: $250 per title
Spanish: $300 per title
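The quoted rates and fees can be combined into a rough per-title estimate. The sketch below is in Python and is only illustrative: the function and constant names are hypothetical, and it assumes that the minimum fee applies to the sync charge alone (before the deployment and ingestion fees) and to every sync type. Actual charges are set by each BU's contract with VST and the Production Vendor.

```python
from decimal import Decimal

# Rates quoted in this document (as of October 2011), in US dollars.
# Key: (sync type, audio recorded by VST?, language).
RATES = {
    ("word", True, "english"): Decimal("0.05"),
    ("word", True, "spanish"): Decimal("0.07"),
    ("word", False, "english"): Decimal("0.08"),
    ("word", False, "spanish"): Decimal("0.10"),
    ("block", True, "english"): Decimal("0.75"),
    ("block", True, "spanish"): Decimal("0.90"),
    ("block", False, "english"): Decimal("1.05"),
    ("block", False, "spanish"): Decimal("1.20"),
    ("link", True, "english"): Decimal("0.75"),
    ("link", True, "spanish"): Decimal("0.90"),
    ("link", False, "english"): Decimal("1.00"),
    ("link", False, "spanish"): Decimal("1.25"),
}
MINIMUM_FEE = {"english": Decimal("250"), "spanish": Decimal("300")}
DEPLOYMENT_FEE = Decimal("100")    # per title
HOV_INGESTION_FEE = Decimal("50")  # HOV charge for ingesting audio sync content


def estimate_title_cost(units, sync_type, vst_recorded, language,
                        include_hov_fee=True):
    """Estimate the cost for one title; units are words, blocks, or links."""
    sync_fee = RATES[(sync_type, vst_recorded, language)] * units
    # Assumption: the per-title minimum is applied to the sync fee only.
    sync_fee = max(sync_fee, MINIMUM_FEE[language])
    total = sync_fee + DEPLOYMENT_FEE
    if include_hov_fee:
        total += HOV_INGESTION_FEE
    return total


# 10,000 English words, VST-recorded: 500 + 100 + 50
print(estimate_title_cost(10000, "word", True, "english"))   # 650.00
# 2,000 Spanish words, non-VST audio: minimum of 300 applies, + 100 + 50
print(estimate_title_cost(2000, "word", False, "spanish"))   # 450
```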

Vendor Contact Information

VitalSource Technologies Inc. (formerly VPG Integrated Media)

200 Portland Street

Suite 201

Boston, MA 02114-1701

617-523-1770

www.vpg.com

Karen Greenleaf is the Pearson Business Development Manager. She can be reached at 617-523-1770 x295.

Frequently asked questions

1. What are the accepted audio formats? Audio must be stored in MP3 format.

2. How can I convert existing audio into MP3 format? For School, BUs can use the digital conversion team in Chandler. Higher Ed’s Video Production Services offers a similar capability.

3. Does the eText use a synthesized voice to “read” the text? No, eText audio is based on actor-recorded audio assets provided by the BU. eText does not provide OCR with a speech generator.

4. What are the additional costs? Total costs depend on the services required. In addition to the VST charges outlined above, HOV charges $50 to ingest the audio synch content from VST into eText. We recommend you check with HOV or your other Production Vendor on all fees and charges at the start of the project.

5. Who creates the audio content? This is a BU decision. Most BUs deliver actor-recorded audio files to VST. VST also offers audio recording services.

6. Where are the audio files stored? Audio files should be hosted on the Pearson Media Server.

7. Can I have audio with CD-ROM/OCAS/PSN package eTexts? Yes, packaged versions of eText can include audio and other media assets. When the Production Vendor creates a final package, all assets including audio are typically harvested from the Media Server and included in the CD

8. Are there different system requirements for eTexts with audio? No, the system requirements are the same.

9. How long does it take? This depends on the BU’s contract with VST and the BU’s ability to coordinate activities among the various teams involved, including eText, HOV, VST, and the BU.

10. Can files play automatically on page open? No, the user must click on an audio icon or an appropriate highlight to play a modal- or faceless-audio linked file. For non-modal audio, the user must first switch into non-modal audio mode by clicking the “hippo” or megaphone icon and then click an audiotext synch button or an appropriate highlight.

11. What are the differences in audio between the Elementary and Standard user interfaces? The only difference is the icon used to activate the audio player. For the Elementary UI or “skin”, the audio player icon is a purple hippo. For the Standard UI, the audio player icon is a megaphone. These default icons are part of the eText UI and cannot be replaced.

12. Can I use non-modal audio in Whiteboard mode? No, eText does not support the use of non-modal audio in Whiteboard mode. If a page has non-modal audio clips, the non-modal player icon will not appear.

13. Can I use non-modal audio on iPad? Yes. eText for iPad v1.1 added support for non-modal audio. On any page with associated non-modal audio, an audio icon appears in the upper right-hand corner, which the user can tap to open the player and play the audio.
