How Our Technology Works

ROK TALK | Text to Speech and Accessibility Technology for Websites

ROKTalk from ROK Talk is a high-quality Software as a Service (SaaS) text-to-speech system for websites. It reads out web text on demand in a clear, lifelike voice.

The technical benefits of using ROK Talk's technology on a site include voice quality, speed, accessibility, interface and scalability, plus easy integration for website owners and no download needed by end users.

Because the service is delivered as SaaS, vocalisation is available on demand to every visitor to a website, with no application or special software to install.

For the website owner, the SaaS model means minimal integration effort and no installation or hosting overheads, as all speech processing is conducted remotely.

Advanced technology

ROKTalk uses AJAX and linked server-side processing to convert text elements on an HTML web page into clear sound files and return them to the user with minimal latency.

This allows web text to be ‘read out’ in what is, effectively, real time, which increases website accessibility and usefulness for the many visitors who might have difficulty reading because of vision impairment or challenges such as dyslexia.

Some visitors may simply prefer listening to text content, time-shifting it, or transferring it to a device such as an iPod for listening on the move.

ROKTalk can be deployed on any website in a variety of voices and languages and can operate with custom voice fonts with specialist lexicons to optimise the handling of specialist words, phrases and abbreviations.

Toolbar or Toolbox

The vocalisation system is highly configurable. It can be controlled by a variety of methods to suit the requirements of the website owner and to sit neatly with the design and configuration of the web page.

A standard control Toolbar can be deployed at the bottom of enabled web pages. Many users, however, prefer to access speech via flexible Toolbox elements: integrated control buttons that can be placed anywhere on the page, offering the opportunity to ‘hear this page’ or ‘save this page as MP3’.

Sites using the Toolbar method can offer a show/hide button, so that the vocalisation service, and with it the control Toolbar, is displayed only when a visitor switches it on.

Simple integration

Whatever method or design of vocalisation control is used, a small JavaScript snippet must be added to each page to be vocalised. This is a straightforward process, similar to the way services such as Google Analytics are added to a website.

Requests to convert on-screen text are sent to the ROKTalk server(s) in one of three ways: by ‘pointing’ with the mouse, by highlighting text and then clicking the ‘play/save’ control, or, uniquely for SaaS of this type, by using keyboard ‘hot keys’ instead of mouse controls. The hot keys make ROKTalk genuinely accessible even to those with little or no vision.

Remote processing

Speech requests are processed by the server-located voice engine and the associated voice conversion systems. Several layers of additional processing rules are applied at this stage to tune pronunciation and to select the correct language, voice and specialist dictionary. The processed text is converted into .WAV files and then into .MP3 files, which are streamed back to the machine from which the request originated. All of this happens in a fraction of a second.
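
As an illustration only, the specialist-dictionary step described above might look something like the following sketch. It is not ROKTalk's actual implementation: the `LEXICON` entries and the `apply_lexicon` function are hypothetical names invented for this example, and a production voice engine would apply far richer pronunciation rules. The sketch simply rewrites specialist words and abbreviations into a spoken form before the text is handed to a voice engine.

```python
import re
from typing import Mapping

# Hypothetical specialist lexicon: written form -> form to be spoken.
# These entries are invented for illustration.
LEXICON = {
    "SaaS": "sass",
    "MP3": "M P three",
    "e.g.": "for example",
}


def apply_lexicon(text: str, lexicon: Mapping[str, str]) -> str:
    """Replace lexicon entries that appear as whole tokens in the text."""
    if not lexicon:
        return text
    # Try longer entries first so a long phrase wins over a shorter prefix.
    keys = sorted(lexicon, key=len, reverse=True)
    # The lookarounds stop an entry matching inside a longer word
    # (for example "MP3" inside "MP3s").
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: lexicon[m.group(1)], text)


if __name__ == "__main__":
    print(apply_lexicon("Save the SaaS page as MP3, e.g. for later.", LEXICON))
```

In a sketch like this, a different lexicon could be passed in for each language, voice or customer, which is one simple way to keep specialist vocabulary separate from the voice engine itself.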

The AJAX control triggers playback of the sound file through a hidden Flash object, so the mechanism is invisible to the user. A mixture of real-time processing and cache retrieval provides the scalability and robustness needed to support numerous concurrent user sessions.
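
The combination of real-time processing and cache retrieval can be pictured with a minimal sketch such as the one below. It is an assumption-laden illustration rather than a description of the ROKTalk servers: `SpeechCache`, `fake_synthesise` and the in-memory dictionary are all hypothetical, and a real deployment would more likely use a shared, size-limited cache. The idea shown is only that audio already generated for a given text, voice and language is returned from the cache, while anything new is synthesised once and then stored.

```python
import hashlib
from typing import Callable, Dict


class SpeechCache:
    """Serve cached audio when available, otherwise synthesise and store it."""

    def __init__(self, synthesise: Callable[[str, str, str], bytes]) -> None:
        self._synthesise = synthesise        # stand-in for the voice engine
        self._store: Dict[str, bytes] = {}   # hypothetical in-memory cache
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(text: str, voice: str, language: str) -> str:
        # The key covers everything that changes the audio output.
        payload = "\x1f".join((language, voice, text)).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_audio(self, text: str, voice: str, language: str) -> bytes:
        k = self.key(text, voice, language)
        cached = self._store.get(k)
        if cached is not None:
            self.hits += 1                   # cache retrieval
            return cached
        self.misses += 1                     # real-time processing
        audio = self._synthesise(text, voice, language)
        self._store[k] = audio
        return audio


def fake_synthesise(text: str, voice: str, language: str) -> bytes:
    """Placeholder for a voice engine; returns bytes that merely label the request."""
    return f"{language}|{voice}|{text}".encode("utf-8")


if __name__ == "__main__":
    cache = SpeechCache(fake_synthesise)
    cache.get_audio("Welcome to our site", "voice-a", "en-GB")
    cache.get_audio("Welcome to our site", "voice-a", "en-GB")
    print(cache.hits, cache.misses)
```

Keying on text, voice and language together means that popular passages, such as a home page read in the default voice, are synthesised once and then served repeatedly, which is one plausible reason a cache helps with many concurrent sessions.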

Text can be listened to in real time or saved as an MP3 for later listening or transfer to a mobile device, such as an iPod or mobile phone.