The main thing I’ve been doing this week is still the report, but I’ve also been improving processing for human-recorded audio. This will be important when Stéphanie and Anne-Laure visit Mar 10-17, since the plan is to create content based on the Iaai songs that Stéphanie has recorded and transcribed.
Report
The text continues to grow. We now have over 75 pages, though there are still a large number of holes to be filled.
We have decided to put back the deadline for releasing the report to Mar 18, partly so that we can include material about the work on Iaai.
“Paul und Emma”
We got good feedback on the “Paul und Emma” example from Christèle, a germanophone student of hers, and a germanophone friend of Pauline’s. It seems the text is almost completely correct (there is one clear error). We got some interesting comments on the images.
I’m wondering if we can use this as a model for evaluating the quality of the texts that C-LARA is producing. It would be good to discuss.
Using human-recorded audio
I have improved the handling of human-recorded audio so that it now optionally takes account of context: you can have two different pieces of segment audio that correspond to the same piece of text. This is essential for songs, where lines are often repeated but sung differently.
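To make the idea concrete, here is a minimal Python sketch of a context-sensitive lookup. It is only an illustration and not the actual C-LARA code: the class name `SegmentAudioStore`, its methods, and the choice to use the preceding line as the "context" are all assumptions made for the example.

```python
from typing import Optional


class SegmentAudioStore:
    """Hypothetical store mapping segment text (plus optional context) to audio files."""

    def __init__(self) -> None:
        # Recordings tied to a specific context, e.g. the preceding line of a song.
        self._by_text_and_context: dict[tuple[str, str], str] = {}
        # Context-free recordings, used as a fallback.
        self._by_text: dict[str, str] = {}

    def add(self, text: str, audio_file: str, context: Optional[str] = None) -> None:
        if context is None:
            self._by_text[text] = audio_file
        else:
            self._by_text_and_context[(text, context)] = audio_file

    def lookup(self, text: str, context: Optional[str] = None) -> Optional[str]:
        # Prefer a recording made for this exact context; otherwise fall back
        # to the context-free recording, if there is one.
        if context is not None and (text, context) in self._by_text_and_context:
            return self._by_text_and_context[(text, context)]
        return self._by_text.get(text)


store = SegmentAudioStore()
store.add("La la la", "chorus_plain.mp3")
store.add("La la la", "chorus_final.mp3", context="One more time")
```

With this setup, `store.lookup("La la la", context="One more time")` picks the context-specific recording, while a lookup with no context or an unknown one falls back to `chorus_plain.mp3`. Making the context optional keeps the old behaviour for texts where repeated lines do not matter.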
C-LARA offers you two ways to add segment audio. The simple one is the same as before: create the audio files and upload them individually. I’ve now also included a port of the more efficient method from our ComputEL-6 paper, where you use Audacity to mark the segment boundaries in the audio, and C-LARA automatically cuts up the mp3 for you. I am planning to add documentation tomorrow in one of the appendices to the report.
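Until the documentation is written, the following Python sketch shows roughly what the Audacity-based method involves. It is not the C-LARA implementation. It assumes the segment boundaries have been exported from Audacity as a label track text file (tab-separated start time, end time, and label, in seconds), and it only builds ffmpeg command lines rather than running them. The function names and the `segment_NNN.mp3` naming scheme are invented for the example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    label: str


def parse_audacity_labels(text: str) -> list[Segment]:
    """Parse the contents of an Audacity label export (start<TAB>end<TAB>label)."""
    segments = []
    for line in text.splitlines():
        # Skip blank lines and the backslash-prefixed frequency lines that
        # Audacity writes for spectral selections.
        if not line.strip() or line.startswith("\\"):
            continue
        fields = line.split("\t")
        start, end = float(fields[0]), float(fields[1])
        label = fields[2].strip() if len(fields) > 2 else ""
        segments.append(Segment(start, end, label))
    return segments


def ffmpeg_cut_commands(source: str, segments: list[Segment], out_dir: str) -> list[list[str]]:
    """Build one ffmpeg command per segment; nothing is executed here."""
    commands = []
    for index, seg in enumerate(segments, start=1):
        # Hypothetical output naming scheme: segment_001.mp3, segment_002.mp3, ...
        target = str(Path(out_dir) / f"segment_{index:03d}.mp3")
        commands.append([
            "ffmpeg", "-i", source,
            "-ss", f"{seg.start:.3f}", "-to", f"{seg.end:.3f}",
            target,
        ])
    return commands
```

Each command list could then be passed to `subprocess.run`, provided ffmpeg is installed. Keeping the parsing step separate from the cutting step makes it easy to check the boundaries before any audio is touched.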
You can see an initial example here, a C-LARA version of Nancy Sinatra’s classic hit “These Boots Are Made for Walkin’”. It’s a bit of a tangent, but I was rather impressed with GPT-4’s cleverness in annotating this text. It started by constructing an image which IMHO fits the lyrics perfectly:

The French glossing of the highly colloquial American English has also been done remarkably well. For example, the line
You | keep | samin’ | when | you | oughta | be | a’changin’
is glossed as
tu | continues | à faire la même chose | quand | tu | devrais | être | en train de changer
We now all take GPT-4 for granted, but just stop and think about that for a moment. I have left the glossing unedited for people who are interested in checking the details.
Next Zoom call
Thu Feb 29 2024, 20:00 Adelaide (= 09:30 Iceland = 09:30 Ireland/Faroe Islands = 10:30 Europe = 11:30 Israel = 13:00 Iran = 17:30 China = 20:30 Melbourne/New Caledonia)