Although the domain is still dominated by the big tech companies, the growing market is far from exhausted. The potential applications range from chatbots to the automation of audio content in news media. One obvious example of a TTS application for news media is a Read Aloud feature that lets readers listen to a news article while commuting. In fact, this feature is already offered by some newspapers, though they mainly use external TTS services to power their applications.
As Europe’s largest digital publisher, we decided to go one step further by building our own technology in-house and creating a custom brand voice instead of relying on third-party providers. In this post, we give you a brief introduction to the mechanics of modern neural speech synthesis. We also share some insights about creating our own TTS technology called ForwardTacotron, a TTS solution that is specifically focused on robust and fast speech synthesis. In fact, we even open-sourced it. You can find the ForwardTacotron project on GitHub.
Amazon Echo, Google Home, Apple HomePod: there are many smart speakers these days, and these devices are an important channel for Axel Springer. That’s why we’re building a text-to-speech model that synthesizes German speech from text with a particular voice and emotion. This is useful for automatically reading out all our amazing articles from our media outlets such as WELT and BILD.
Schäfer, Christian; Tran, Dat (2020): Creating Robust Neural Speech Synthesis with ForwardTacotron. Available online at https://developer.nvidia.com/blog/creating-robust-neural-speech-synthesis-with-forwardtacotron/, last accessed 27 November 2020.
Axel Springer Ideas Engineering GmbH (2020): Portfolio. Text-to-Speech. Available online at https://www.ideas-engineering.io/portfolio/2020/2/text-to-speech, last accessed 28 November 2020.