C-LARA

An AI collaborates with humans to build a language learning app.


Weekly summary, Aug 29-Sep 4 2024

We are continuing with two large items from the priority list: adaptation of the reinforcement learning/Chain of Thought idea to C-LARA, and support for non-AI languages. The Palgrave Encyclopaedia article will be published in the EUROCALL proceedings and also as a ResearchGate preprint.

Priority list

Reinforcement learning and Chain of Thought for MWEs. We have made further progress on adapting the reinforcement learning/Chain of Thought method from the Tic-Tac-Toe paper to the task of annotating multi-word expressions:

  • We now have scripts to split Francis’s MWE-annotated Sherlock Holmes corpus into train/dev/test portions and convert the result into C-LARA compatible form.
  • We have implemented two different similarity metrics to find few-shot examples created from sentences similar to the current one. One is based on OpenAI embeddings, the other on n-grams of POS tags.
  • We have extended the core annotation code. We can now input a list of sentences, a set of few-shot examples, and a similarity metric, and output an MWE-annotated version of each sentence, using the few-shot examples that are closest to it according to the similarity metric.
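
To make the POS-tag variant concrete, here is a minimal sketch of how few-shot examples might be chosen. It is not the C-LARA code: the function names, the example record layout (`text`, `pos`, `mwes`) and the choice of a Jaccard-style overlap over POS bigrams are all assumptions made for illustration, and the POS tags are supplied by hand rather than by a tagger.

```python
from collections import Counter

def pos_ngrams(pos_tags, n=2):
    """Return a multiset (Counter) of n-grams over a sentence's POS tags."""
    return Counter(tuple(pos_tags[i:i + n]) for i in range(len(pos_tags) - n + 1))

def ngram_similarity(tags_a, tags_b, n=2):
    """Jaccard-style overlap of two POS n-gram multisets, in [0, 1] (assumed metric)."""
    a, b = pos_ngrams(tags_a, n), pos_ngrams(tags_b, n)
    union = sum((a | b).values())
    if union == 0:
        return 0.0
    return sum((a & b).values()) / union

def closest_examples(sentence_tags, examples, similarity, k=2):
    """Pick the k few-shot examples whose POS tags are most similar to the sentence."""
    ranked = sorted(
        examples,
        key=lambda ex: similarity(sentence_tags, ex["pos"]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical few-shot pool; the record layout is an assumption.
examples = [
    {"text": "He gave up .",
     "pos": ["PRON", "VERB", "ADP", "PUNCT"],
     "mwes": [["gave", "up"]]},
    {"text": "The old dog slept .",
     "pos": ["DET", "ADJ", "NOUN", "VERB", "PUNCT"],
     "mwes": []},
    {"text": "She looked it up .",
     "pos": ["PRON", "VERB", "PRON", "ADP", "PUNCT"],
     "mwes": [["looked", "up"]]},
]

# POS tags for a new sentence such as "She gave in ."
query = ["PRON", "VERB", "ADP", "PUNCT"]
selected = closest_examples(query, examples, ngram_similarity, k=2)
```

Because the metric is passed in as a function, an embedding-based similarity could be swapped in without changing the selection step, which is the point of taking the similarity metric as an input.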

We are now very close to being able to try an initial learning experiment.

Support for non-AI languages. I have continued to discuss support for non-AI languages with Sophie Rendina. We have implemented functionality that automatically corrects syntax errors in manual annotation, though it’s not yet checked in. The next steps are automatic correction of misaligned text versions and rearranging the image generation process so that the user is obliged to carry out the steps in the right order. These are fairly simple changes, but having watched Sophie try out the platform, I can see that they will have a large impact on usability.

Encyclopaedia article

We will publish the Palgrave Encyclopaedia article in two versions: a minimally modified preprint on ResearchGate, and a more substantially rewritten version for the EUROCALL proceedings. Branislav has written a first draft of the EUROCALL version.

Next Zoom call

The next call will be at:

Thu Sep 5 2024, 18:00 Adelaide (= 08:30 Iceland = 09:30 Ireland/Faroe Islands = 10:30 Europe = 11:30 Israel = 12:00 Iran = 16:30 China = 18:30 Melbourne = 19:30 New Caledonia)


