C-LARA

An AI collaborates with humans to build a language learning app.


Improving C-LARA’s TTS

In her post two days ago, Cathy rightly said that the free Google TTS voice, which we’ve previously been using as our default, is not very good. I am in the middle of experimenting with connecting up various pieces of OpenAI software to C-LARA, so I thought I would make their TTS engine available too. You can now use it if you go to the ‘Audio Processing’ view (this was previously called ‘Human Audio Processing’) and select OpenAI and a voice. The one I like best is ‘Onyx’.


I tried this on the first two English texts from our ALTA exercise, “Bible” and “Children’s story”. They sound much better than they did with Google. I was curious to see how Onyx would do on a tougher assignment, and gave it Othello’s soliloquy from Act V. It is not perfect, and a human voice actor would have delivered it with more feeling, but it’s reading the segments in isolation, one at a time, and I think it’s done a pretty good job under the circumstances. By the way, I was unable to get an image when I used the direct route through C-LARA: the request was refused, reasonably enough, on the grounds that the content (murder, necrophilia) violated the content constraints. However, when I then went to the ChatGPT-4 online interface and explained the background, it was happy to oblige. Chat loves literature.
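For readers curious what "reading the segments in isolation" amounts to, here is a minimal sketch. It is not C-LARA's actual code: the function names are made up for illustration, and it assumes OpenAI's speech endpoint with the `tts-1` model and the `onyx` voice. Each segment becomes its own request, so the engine never sees the neighbouring text.

```python
import json
import urllib.request

# OpenAI's text-to-speech endpoint.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_requests(segments, voice="onyx", model="tts-1"):
    """Return one request payload per non-empty segment.

    Hypothetical helper: each segment is sent on its own, which is
    why the voice cannot carry intonation across segment boundaries.
    """
    payloads = []
    for segment in segments:
        text = segment.strip()
        if not text:
            continue  # skip blank segments rather than request silence
        payloads.append({"model": model, "voice": voice, "input": text})
    return payloads


def synthesize(payload, api_key):
    """Send one payload and return the audio bytes (MP3 by default).

    Hypothetical helper; needs a valid API key and network access.
    """
    request = urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Building the payloads separately from sending them makes it easy to inspect exactly what text each request contains before spending any API credit.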

Cathy is not yet satisfied, since OpenAI gives you no control over the voice beyond choosing one, and it always comes out as American-accented. But she’s been looking around: there are several other TTS engines available which do give you the requisite control, and seem as good or even better. We just need to select one.

TTS technology is clearly advancing at a prodigious rate, like the rest of AI. More about this soon.


As an even tougher test, I gave it the first page of the General Prologue from the Canterbury Tales; you can see the result here. I am not convinced by Onyx’s reading, but on the other hand I’m sure many humans would do no better. Also, I don’t really know how you’re supposed to pronounce Middle English, and it’s possible that the AI is acquitting itself better than I imagine.


I thought I should post a few proper examples where we use the OpenAI TTS in languages other than English. Here are links to Dinosaur Love Story (French), a revised version of Willkommen Finley (German) and Fondazione e impero (Italian).

In fact, the French sounds quite good to me, though I’m dubious about the German and Italian. What do other people think?



One response to “Improving C-LARA’s TTS”

  1. I have now made a post about other options.
