AI Interview with Martin Reber, CEO of SVOX AG
Imagine a world where you could communicate with any device as if it were a person. Tell your car what music you want to hear in the morning and ask for it to be changed by evening. Do this after telling your car where you want to go while, at the same time, getting your car phone to dial your grocery store for the family’s daily needs. Much of this is already reality, and the rest could soon follow if Swiss company SVOX AG has its way. SVOX is the world leader in embedded speech solutions for the automotive and mobile device markets. It is deeply committed to a world where humans can talk to any device, with flexible and open speech dialog being an important component of a multi-modal user interface.
For the automotive industry, SVOX solutions are used in devices such as infotainment systems and hands-free car kits. In these devices, speech is used to enable turn-by-turn directions, command and control functionality, music track selection, hands-free dialing and similar actions. SVOX mobile solutions target handsets, smartphones, portable navigation devices and other mobile devices, where speech is used for device control, SMS and email reading, caller name announcement and so on.
SVOX is a university spin-off founded in 2000 – it was started by a group of researchers at the Federal Institute of Technology Zurich. Today, it is a fast-growing, privately held company headquartered in Switzerland, with offices in Germany and the USA. Its clientele includes names such as Audi, BMW, Asahi Kasei, Continental, Ferrari, HarmanBecker, Hyundai, Mercedes-Benz, Nissan, Nokia, Peugeot, Porsche, Renault, Volkswagen and Xanavi.
The company offers Speech Recognition (ASR), Speech Output (TTS) as well as complete Speech Dialog solutions. The company’s ASR product is tailored for embedded use and allows customers to enable reliable and accurate ASR in their automotive and mobile products. SVOX’ ASR products are offered in numerous language choices to enable a global reach for the products. “Available in a wide variety of footprints and configurations, SVOX Speech Recognition solutions can scale from simple mobile phones to powerful in-dash car infotainment systems – all using the same, uniform base technology. Porting to various embedded environments is straightforward thanks to a platform-independent, modular architecture,” says the company.
SVOX Automotive Text-to-Speech solutions are tailored for noisy car environments. The company says that its TTS systems are characterized by natural and clear sound as well as a unique polyglot capability – the same voice can speak multiple languages like a native speaker. Currently, SVOX’ TTS systems are offered in 24 languages and 35 voices. “Customers can use SVOX embedded TTS technology to enable speech in a variety of devices. Examples include mobile handsets, smartphones, personal navigation devices, hands-free kits and in-dash car infotainment, among others. Excellent portability and scalability are ensured due to a design that takes into account constraints inherent to embedded environments,” says SVOX.
Then there is SpeechCreate, SVOX’s speech prompt development tool. This product was created with input from car manufacturers and automotive Tier-1 suppliers. The result is an easy-to-use tool for creating recording-like prompts using TTS. The SpeechCreate software can be licensed so that users can create their own prompts.
Automotive Industries spoke to Martin Reber, CEO of SVOX AG.
AI: At the beginning of the year, SVOX acquired the speech processing unit of Siemens AG. Could you shed light on the rationale behind this acquisition?
Our goal was to make SVOX a one-stop destination for embedded speech. While the company’s roots are in speech output (TTS), the acquisition has enabled us to offer a complete range of speech solutions – speech recognition (ASR), speech output (TTS) and speech dialog. The ASR technology we have acquired is proven and mature; it has been deployed in several car products, and constant improvements are making the system even more successful.
AI: Recently, an SVOX speech solution became part of the Android mobile phone platform. Do you see Android and similar platforms becoming relevant for the automotive industry as well?
The SVOX Pico text-to-speech solution is indeed part of the newest release of the Android platform. This solution is unique in the market because it offers a very attractive trade-off between low footprint and highly intelligible speech output. At SVOX we observe that the trend toward open, and often open-source, software platforms is also spreading to the automotive industry, with initiatives like Continental’s AutoLinQ and GENIVI being good examples. We see it as part of a bigger “convergence” trend, where the line between mobile devices and in-car telematics is increasingly blurring. At SVOX, we are ready to work with all these new players to speech-enable their platforms.
AI: Tell us a little about Microsoft and SVOX’ speech technology solutions for the automotive industry.
SVOX’s focus is firmly on embedded speech, and the solutions we offer are highly portable and platform-independent. Since the Microsoft Auto platform is clearly a success, with Ford Sync shipments already exceeding 1 million units, SVOX made it a priority to optimize its offering for this platform. The SVOX Automotive suite, a benchmark for quality in embedded speech, is a complete (TTS and ASR) solution available to all OEMs and Tier Ones using the platform. The suite allows customers to differentiate themselves by making their Microsoft Auto-based in-car products more intuitive to use and thus more attractive to consumers.
AI: What are some of the new automotive solutions SVOX is working on?
At SVOX we are always looking at new things, striving to make speech solutions even more intuitive and easy to use. Vast experience in executing high-profile speech projects with our customers puts us in a unique position to innovate. The next “big thing” in speech user interfaces is flexible dialog – a solution that allows the user to communicate with the car as if it were a human being. For example, a user may ask “I would like to read something in the evening,” and the system will ask back “Would you like to download a book or visit a book store?” The dialog will then continue until the task is accomplished.
Plenty of challenges need to be overcome to make this vision a reality, and SVOX is at the forefront of these developments. We are collaborating with leading OEMs and Tier Ones to bring flexible dialog to the market in the near future.
AI: How global has SVOX’ reach been so far – and how do you hope to further expand your company’s reach?
SVOX’s business is global by nature – we have about 25 languages in our portfolio and our clients are major international OEMs and Tier Ones. We currently have offices in Switzerland, Germany and the USA, and the team is truly international: there are more than 15 nationalities among our 100 employees. For SVOX, the most obvious next step would be expanding into other major regions, for example Asia.