Text-To-Speech Services: Choosing ROKTalk over BrowseAloud

This post briefly describes ROKTalk and BrowseAloud, and discusses why ROKTalk was chosen over BrowseAloud for an upcoming Web Site for people with cognitive disabilities.

That Web site will be a report for The Massachusetts Department of Developmental Services.  It will be published by The Eunice Kennedy Shriver Center, for which I work.  The project staff and I received good Webinar-based demonstrations from company representatives.  We saw BrowseAloud and ROKTalk in action on production Web sites.  We also paid careful attention to their usage instructions provided for the benefit of site visitors.

Description

BrowseAloud and ROKTalk read Web pages aloud.  Web sites incorporate these text-to-speech (TTS) services for use by site visitors who find reading difficult.  Common reasons for this include intellectual disabilities, learning disabilities, mild visual impairment, and/or illiteracy.

Both services offer other features, such as increasing text-size, highlighting text as it is read, translation of text to other languages, and changing the background color.  These features can be activated from a floating toolbar, the appearance of which can be invoked by the site visitor.

An image of the BrowseAloud toolbar is displayed below.

Browse Aloud toolbar has 7 multi-colored buttons in a horizontal strip.

The ROKTalk Toolbar is pictured below.  The image is significantly smaller than the toolbar’s actual size.

ROK Talk toolbar has feature sections with icon-based buttons in each.

Choice

BrowseAloud requires Web site visitors to download and to install software.  For people who do so, an advantage is that BrowseAloud can be used for other purposes, such as word processing.  For Web developers, perhaps especially for inexperienced ones, an advantage is that no related scripting has to be incorporated into the Web site.

Neither of these advantages outweighed what the Shriver project staff and I believe to be the primary disadvantage.  Downloading and installing software would be too difficult for anticipated site visitors, some of whom are likely to have intellectual disabilities.  Likewise, we judged BrowseAloud’s usage instructions to be too complex.

For people who visit a Web site with ROKTalk, the TTS and other text-accessibility features can be used immediately.  No download or installation is required.  That said, it may be that the toolbar itself is too complicated.  We may try a customized, less fully-featured version that presumably would be easier to use.

Upcoming Web Site to Include Accessibility for People with Cognitive Disabilities

I am working on a Web site that will incorporate two significant features with which I have experimented: text-to-speech (TTS) and plain language. The site will have other accessibility features for people with cognitive disabilities, text enlargement and text highlighting among them.

The site will be a report for The Massachusetts Department of Developmental Services (DDS).  It will be published by The Eunice Kennedy Shriver Center, for which I work.  Because the constituency of The DDS is people with intellectual disabilities, Shriver project staff would like the report to be as accessible to them as can be afforded at this point.  I have thus been in discussions with representatives of Web-accessibility technology companies.

Web Accessibility Technologies

An accessible content management system (CMS) from WebCredible has been purchased for the Web site. WebCredible reports that, in addition to its purpose of creating accessible Web pages, the CMS includes two other features, ones that attracted me to it.

  • Its back-end, content-management interface is itself accessible; and
  • “… content editors are forced to … produce accessible and well-written page content …”.

I will begin using the WebCredible CMS next week.  Future blog posts will describe what I learn about it.

The Shriver project staff and I are considering two other products from The United Kingdom: BrowseAloud and ROKTalk.  Each provides TTS and text-accessibility features for Web sites. I have mentioned both products in previous blog posts, and reviewed the one from ROKTalk.  A future post will describe which we choose, and why.

Plain Language

The report to be published is long and contains complex information.  For the home page of each section, we plan to write a plain-language version of the section’s main points.  I am concerned about doing this well because, as I have said before, writing “easy” text is not so easy.  Our related efforts will also be the subject of future blog posts.

Note: No endorsement of the above-mentioned products is expressed or implied.

Text-To-Speech Experiment & Evaluation: Cognable Speeka

I created a test page for an experiment with Speeka text-to-speech (TTS), graciously provided by Simon Evans of Cognable. I plan to incorporate TTS into every page of the future Clear Helper Web site.

Background

Speeka, a free service, is a work in progress. It is not a polished, commercial product. It is one of many Mr. Evans is developing to improve accessibility for people with intellectual disabilities.  A brief description of each of his projects can be found on the Cognable home page.

I think Speeka’s initial implementation was on the Web site of Inclusive New Media Design. INMD is an organization that, like me, is working to develop best practices of Web accessibility for people with intellectual disabilities. When I first saw Speeka, I immediately liked its small form factor compared to that of ccPlayer, which I have been using.

Appearance & Placement

Speeka is embedded throughout the INMD site in the top, right of the content section. It appears as the image below.

Rectangular. 3 buttons: play, back, forward. A speaker symbol and the word 'listen'

On my test page, it appears as the following image.

3 buttons: play, back, forward. The words 'audio stopped' underneath

I too placed it in the top, right of the content section.  Of the Web sites I have visited that use a TTS feature, most embed it in a similar location.  Those that don’t place it on the bottom of their pages.

Configuration

Setting up Speeka in my test page was a simple affair.  I inserted the HTML code provided by Mr. Evans.  I needed only to change the referenced file name.  I made one addition: the application landmark role on Speeka’s container. This helps people whose screen readers support WAI-ARIA to identify it. Upon placing the test page on the Clear Helper Web site, I invoked a hyperlink Mr. Evans provided to inform Speeka of the page’s presence.

Voice-Narration

I configured Speeka so it reads only primary content.  It can be set up to read all the textual content of a page, including menus, but I suspect it would be tiring to listen to the same menu over and over.
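How Speeka itself selects primary content is not described here, so the following is only a minimal sketch of the idea. It assumes the page wraps its primary content in a <main> element and its menus in <nav> elements; the class and function names are hypothetical and are not part of Speeka.

```python
from html.parser import HTMLParser


class PrimaryTextExtractor(HTMLParser):
    """Collect text found inside <main>, skipping anything inside <nav>.

    Assumption: the page marks its primary content with <main>.
    This is an illustration, not Speeka's actual method.
    """

    def __init__(self):
        super().__init__()
        self.main_depth = 0  # how many <main> elements are currently open
        self.nav_depth = 0   # how many <nav> elements are currently open
        self.chunks = []     # text fragments to be narrated

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_depth += 1
        elif tag == "nav":
            self.nav_depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self.main_depth:
            self.main_depth -= 1
        elif tag == "nav" and self.nav_depth:
            self.nav_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only when inside <main> and outside <nav>.
        if self.main_depth and not self.nav_depth and data.strip():
            self.chunks.append(data.strip())


def primary_text(html):
    """Return the narratable primary text of an HTML page as one string."""
    parser = PrimaryTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = "<nav><a>Home</a></nav><main><h1>Title</h1><p>Body text.</p></main>"
print(primary_text(page))
```

With this approach the menu text never reaches the narration, so a listener does not hear the same menu repeated on every page.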

I chose to use a natural-sounding, British male voice. [Edit on 2010-01-31: The voice is now an American one.] The test page it is reading contains text written as simply as I could at the time. Its pronunciation of the words and the sentences is very good. It had no problem with my last name.  I will have to test it with more complex text and with unusual proper nouns.

It announces every heading with the word “heading”, prefaces each list item with the word “bullet”, and announces the beginning and the end of every list.  I was surprised. This is the first time I have experienced this feature in a TTS application.  It may be useful, but I think it would better serve as an option. [Edit on 2010-03-14: Announcement of list bullets, beginnings and ends is now an option. It is not active on the test page.]

General Navigation

The three-button interface is simple.  The audio narration can be played and paused with the same button. The forward button advances the narration by six seconds; the back button rewinds it by four.  Suggestions:

  • Perhaps it would be better if the forward and back buttons advanced and rewound to adjacent sentences.
  • An option to restart the narration from the beginning may be helpful.  The only way I could do it was by refreshing the page using the Web browser.
  • Audio and visible-text labels for the buttons are a necessary feature, I think. An example can be found in a BBC Flash Player designed for people with intellectual disabilities.  It can be seen on the BBC’s Us 5 site, by clicking the link “Launch Us5 videos in pop-up windows”, then by selecting an actor.
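The first two suggestions above could work roughly as in the sketch below. It assumes the player knows the start time, in seconds, of each sentence in the narration; that list and the function names are hypothetical, and nothing here reflects how Speeka is actually built.

```python
from bisect import bisect_right


def current_index(starts, position):
    """Index of the sentence being narrated at `position` (in seconds).

    `starts` is a sorted list of sentence start times (an assumed input).
    """
    return max(bisect_right(starts, position) - 1, 0)


def skip_forward(starts, position):
    """Jump to the start of the next sentence, or stay put on the last one."""
    i = current_index(starts, position) + 1
    return starts[i] if i < len(starts) else position


def skip_back(starts, position):
    """Jump to the start of the previous sentence (or the first one)."""
    if not starts:
        return 0.0
    i = current_index(starts, position) - 1
    return starts[max(i, 0)]


def restart(starts):
    """Return to the beginning of the narration."""
    return starts[0] if starts else 0.0


sentence_starts = [0.0, 3.2, 7.5, 12.0]
print(skip_forward(sentence_starts, 4.0))
print(skip_back(sentence_starts, 4.0))
```

Compared with fixed jumps of six and four seconds, sentence-based jumps always land the listener at the start of a complete thought.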

Keyboard Navigation

Pressing the Tab key cycles through the buttons. The Space Bar or the Enter key invokes them. I had no trouble with this navigation within Speeka, but I could not tab inside the Web page to get to it. I could use the Tab key with Speeka only after changing focus to it by clicking it with my mouse.  This is not unique to Speeka.  I experienced the same with ccPlayer.  Keyboard navigation is important because many people with intellectual disabilities also have physical ones.  Such disabilities often preclude the use of a mouse, and require keyboard use or a single-switch device.

Interface Text

When the play button is clicked, the “audio stopped” text changes to a countdown of time until the end of the audio narration.  I think being presented immediately with the “audio stopped” text is potentially confusing.  I also think both it and the countdown text may not be necessary.

Speeka-Service Functions

Speeka converts Web-page text to MP3 files.  When a Web site visitor clicks the play button, the MP3 is streamed to the visitor’s computer from a Cognable server.  This is advantageous for Web sites that have neither a streaming-media server nor the bandwidth to support one.

A great feature of Speeka is that it checks the text of each page on a regular basis.  When it detects a change, it updates the associated MP3 file.  Graphed statistics about this can be found on the Speeka home page.
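I do not know how Speeka detects that a page’s text has changed.  One common way to do this kind of check is to store a fingerprint (hash) of the text and compare it on each visit.  The sketch below illustrates that idea only; the function names are hypothetical.

```python
import hashlib


def text_fingerprint(text):
    """Return a SHA-256 fingerprint of the page text.

    Whitespace is collapsed first so that formatting-only changes
    do not count as a change to the narrated text.
    """
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_new_audio(page_text, stored_fingerprint):
    """True when the text differs from the version the MP3 was made from."""
    return text_fingerprint(page_text) != stored_fingerprint


# Fingerprint saved when the MP3 was last generated (illustrative data).
saved = text_fingerprint("Welcome to  the test page.")

print(needs_new_audio("Welcome to the test page.", saved))
print(needs_new_audio("Welcome to the new test page.", saved))
```

A periodic job could run such a check for each registered page and regenerate the MP3 only when the fingerprint differs.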

Conclusion

Speeka has many nice features.  I think its inclusion on a Web site designed for people with intellectual / cognitive disabilities would provide site visitors with a significant accessibility feature. With all of Mr. Evans’ projects, I don’t know if he has the time to consider some of the options I have mentioned, but I plan to discuss them with him.

Note: No endorsement of Speeka or Cognable is expressed or implied.

Ray Kurzweil’s Blio eReader: New, Free & Accessible to People with CD

Ray Kurzweil is a giant in the accessibility industry.  He has been inventing reading machines and devices used by people with visual and reading disabilities for 35 years.  His newest creation is the Blio eReader, digital-book-reading software.

Note: At the time of this writing, the Blio eReader is not yet available to the public.  However, in a CNET interview (video below), Ray Kurzweil says it will be within one month.

Blio eReader Feature Highlights

  • It combines full-color, digital content with Web content, video, and audio narration.
  • It runs on Windows computers, tablets and mobile devices such as the iPhone.
  • It is free, and has access to a million free books. (Presumably, there will be a store of books for sale.)
  • Its catalog includes “cookbooks, travel guides, how-to books, schoolbooks, art books, children’s stories, and magazines”.
  • Books can have interactive, multi-media content and quizzes.

Accessibility Features Good for People with Cognitive Disabilities

The Blio eReader:

  • reads books aloud via either an accompanying, human-read audio track or via a text-to-speech reader;
  • synchronizes its synthesized voices with “follow-along word highlighting”;
  • has adjustable reading speed and font size;
  • has a text-only mode good for minimizing distractions and also for displaying on small screens;
  • uses a “3D book view which includes realistic page turning”; and
  • can be connected to a personalized set of reference Web sites for “one-touch look-up of highlighted phrases”.

In the YouTube video below, CNET interviews Ray Kurzweil about the Blio eReader.  A demonstration of it begins at about 2 minutes, 23 seconds (point 2:23).  This video is not closed captioned.

Note: No endorsement of the Blio eReader is intended or implied.

Free Talking Firefox Extension for People with Cognitive Disabilities

CLiCk, Speak, created by Charles L. Chen, is an open-source, free extension for Firefox that reads Web pages out loud in a voice.  It is designed for sighted users with cognitive disabilities.

Interface

The image below shows CLiCk, Speak’s interface.  It is a toolbar with three simple buttons for “Speak Selection”, “Auto Reading” and “Stop Speaking”.  Each has an accompanying image that can be enlarged and uses contextually-relevant colors.

Firefox toolbar of 3 buttons, each with a text label and image

CLiCk, Speak is mouse-driven. Web page text can be selected by clicking and dragging, then read out loud via a click to the “Speak Selection” button.  The “Auto Reading” button starts narration from the top of the page.  Each sentence is highlighted as it is read.

It does not identify elements or announce events, which Mr. Chen says is “… very annoying to sighted users,” but which would be important for people with visual disabilities.  This means fewer distractions from the primary, textual content.

Compatibility

CLiCk, Speak is compatible with Windows, Macintosh, and Linux; and has multilingual support.  Design information and source code are available for developers.

At the time of this writing, CLiCk, Speak was last updated June 18, 2008.  It is compatible with Firefox 3.x.

Note: No endorsement of CLiCk, Speak is intended or implied.

Web Browser for People with Intellectual Disabilities

Web Trek is a Web browser designed specifically for people with intellectual disabilities. Based upon research (more info below), it is sold by AbleLink Technologies as part of two software suites for $199 and $399.  The following image shows screen shots of Web Trek in the background, and its associated “Visual Search Site” in the foreground.

screen shots of browser & search site, showing picture-based interfaces

Highlights Of Web Trek’s Features

  • built-in screen reader that narrates Web-page text aloud in a voice;
  • facility to use a picture from a Web page as an oversize favorites button on the user’s home screen;
  • a single-click interface for buttons on the home screen; and
  • access to the “Visual Search Site” (link to screen shots), a picture-based, Web search engine.

Web Trek’s Prototype Features

The prototype included the following features.  The AbleLink Technologies Web site does not mention them, so I do not know if they are present in the current product.  I hope they are.

  • an audio prompt-description of a button when the cursor hovers over it; (This was set up to be similar to balloon help.)
  • an audio prompt following a user-initiated event, such as a click, to guide the user through the next most-likely step in a task; (This was designed to minimize errors.)
  • a minimum of buttons displayed, and only when the current task requires them; (An attempt to reduce clutter / distractions.) and
  • the user’s name displayed on the start button and on the start page.  (Personalization is its goal.)

Grant & Pilot Study

The prototype was developed starting in 1999 with a grant from The U.S. Department of Education’s National Institute on Disability and Rehabilitation Research (NIDRR).  On the AbleLink Technologies site are a summary of the grant and the pilot study’s detailed description in an image-based, non-accessible PDF.

Great Text Accessibility Toolbar for People with Cognitive Disabilities

I recently discovered Talklets, a text accessibility toolbar for Web sites that could be of great help to people with cognitive disabilities.  It can be seen in action on the Web site of Rok Talk, the developer, and on the Web site of Regional Support Centre, Scotland North & East.  Take a look at it on the latter site.  To do so, click the button entitled “Click to Show Text Reader” on the right of the home page, near the top.  The toolbar then appears at the bottom of the page.  The main part of it looks like this.

strip of round, colored buttons with symbols for play, stop, record, etc.

Features

Via simple buttons, the toolbar enables Web site visitors to:

  • listen to the text of the entire page or just to the text to which a user points the cursor;
  • record the text to an MP3 file that can be easily downloaded;
  • enlarge, reduce or restore the text size;
  • highlight the text in different colors; and
  • see a help window that explains how to use each feature.

Extra features include enabling users to retrieve the definition of any word, change the pronunciation of a word, and highlight words as they are read.

The developer says the toolbar does not interfere with screen readers, and can be used by people who are blind (and don’t have access to a screen reader) via keyboard controls.

Follow-Up

I will be contacting Rok Talk to discuss its pricing structure and to determine if it would be willing to let me experiment with the toolbar on the future Clear Helper Web site.

Note: No endorsement is intended or implied for this product.

Screen Readers, Web Site TTS Plug-ins, Etc.

For people who are blind or who have difficulty reading, there are a variety of solutions for converting text to speech (TTS).  This post is a follow-up to my brief look at Accessible Rich Media Players and TTS for Web Sites.

Screen readers are software programs that read out loud in a voice the text that appears on the computer screen.

Screen Readers For All Purposes

JAWS (Job Access With Speech) is the most popular.  I have been using its professional version, since its inception many years ago, to test Web site accessibility. Window-Eyes, ZoomText and System Access are its closest commercial competitors.

Free alternatives include Thunder and NVDA, which is growing in popularity.  Others are built into Windows (see Microsoft assistive technologies) and OS X (see VoiceOver, the well regarded screen reader that is part of Apple assistive technologies).

Screen Readers For The Web

WebAnywhere and FireVox, which is a Firefox extension, work only for the Web.  Both are free.

  • WebAIM Screen Reader Simulation provides a way to experience what it is like to use a screen reader.
  • Fangs, a Firefox extension, is a screen reader emulator that recreates a Web page similar to how it would be read by a screen reader.

Text-To-Speech Plug-ins for Web Sites

Many Web sites offer their visitors TTS capability.  Visitors are required to download and install a software plug-in.  Once that is done, visitors are able to listen to the text on any Web site that uses the same TTS technology.  One popular example is BrowseAloud.  Its costs, which are for the Web site owner, are not listed on its Web site.  It does have a free trial.  Another example is Speaks For Itself. It appears to be free, but it seems it has not been updated recently.

Miscellaneous, But Related

ClaroRead, PenFriend and EasyTutor have screen-reader functionality, but are intended more for helping people read and write.

There are also screen magnification programs such as Magic (commercial) and Virtual Magnifying Glass (free).

Accessible Rich Media Players & TTS for Web Sites

At this point, I know of a few Web-based, rich media players that have various levels of accessibility, and a few possible text-to-speech (TTS) solutions.  This post is a follow up to my earlier one about Ideal Criteria for People with Intellectual Disabilities to Listen to Web Text.

Accessible Rich Media Players Embeddable in Web Sites

  • JW Player with the JW Controls Accessibility Plug-in. It plays MP3s and videos, including YouTube videos.  It is skinnable and re-sizable.   It has an intriguing Google Analytics plug-in that enables tracking of which videos are watched and for how long.  Its cost is low.  It may be the most promising for the Clear Helper Web site.
  • Nomensa Accessible Media Player. Its Web site’s description says it plays videos and audio, but not which formats.  It does say it plays YouTube videos.  There is no mention of skinning or resizing capability.  The site says its cost is low, but there is no pricing information on it.
  • CodePlex Accessible Media Player is based on Silverlight, Microsoft’s Flash competitor.  It is in its first release.
  • Easy YouTube is designed to play YouTube videos only.  It has an interface with easy-to-use controls and big buttons.  I am not sure if it has any built-in accessibility features.

Text-To-Speech For Web Sites

  • TextAloud converts text to MP3s and has natural-voice fonts.  TextAloud can be used on a Web Site and can generate text on-the-fly.  Its cost is low, but the cost of its compatible natural voices for the Web starts at $1500.
  • Cognable Speeka converts Web text to MP3s.  It uses open-source voices only.  It may have its own embeddable player, but there is no information about its accessibility.  It was developed specifically for people with intellectual disabilities.  No cost is listed on its Web site.
  • SpokenText converts Web text to MP3s, but does not appear to have an on-the-fly generation capability.  It has a variety of low, annual subscription costs.  SpokenText also has a Firefox extension.

Other TTS Applications

To manually convert Web page text to MP3s, there are non-Web TTS programs I could use, such as Alive Text To Speech or SpeakText.  Previous posts mentioned other such programs I have used, but I will likely try both TextAloud and Cognable Speeka.

I found a couple of Web-embeddable TTS applications with avatars (talking, lip-syncing characters).  I might experiment with CrazyTalk or Cognable Avatar TTS.

Ideal Criteria for People with ID to Listen to Web Text

Like many people, people with intellectual disabilities often have trouble reading.  For the Clear Helper Web site, I am considering setting up a way for people to listen to its textual content.  Here are my ideal criteria for this feature.

For Web site visitors, I would like:

  • No need for a screen reader.  They are complicated, are very expensive and, at least to my knowledge, are generally not used by people with intellectual disabilities.  (Of course, I will make the Clear Helper Web site compatible with screen readers.)
  • No need to install related software (e.g. a plug-in or a Web browser extension).  This too, in my opinion, would be too complicated.
  • An easy way to play the audio version of the text.  For instance, a click to a standard “play” button that appears and acts the same way on every page.

For the Clear Helper Web site, I would like:

  • A rich media player that:
  • Sound files that:
    • can be played both by an embedded rich media player and later by the user on any computer.  MP3s come closest because of their ubiquity and the wealth of software, free and commercial, that play them;
    • either are generated on-the-fly for dynamic text or are automatically updated any time static text is edited; and
    • use natural-sounding, royalty-free voices;
  • Video files that:
    • are embeddable into a Web page without the need for Flash (HTML 5 has that promise, but it’s probably a long way off.)

Am I asking too much?  Well, my research so far has revealed the solutions that come closest to these ideals.  They will be the subject of my next post.