I was hoping to be able to report progress on both Multi-Word Expressions (MWEs) and images, but the MWEs have been more challenging than I had anticipated, and I ended up focussing on them this week. We are, however, making good progress.
Multi-Word Expressions
Working together with the AI, I started off by implementing a method for tagging segments with MWEs, closely following the recipe we used for glossing and lemma tagging: GPT-4 is passed a JSON structure containing the words in a segment, and asked to return a JSON structure containing a list of the MWEs the segment contains. We provide a set of few-shot examples illustrating typical input/output pairs. However, the results were disappointing.
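To make the recipe concrete, here is a minimal sketch of how such a prompt might be assembled. It is not the code we actually use: the function name, the instruction wording and the two few-shot examples are all invented for illustration, and the call to GPT-4 itself is left out.

```python
import json

# Hypothetical few-shot examples (assumption: the real ones are not shown in this post).
# Each pair is (words in the segment, list of MWEs, each MWE a list of words).
FEW_SHOT_EXAMPLES = [
    (["She", "gave", "up", "smoking"], [["gave", "up"]]),
    (["He", "read", "the", "book"], []),
]


def build_direct_prompt(words):
    """Build a prompt asking for the MWEs in a segment as a JSON list."""
    lines = [
        "Identify the multi-word expressions (MWEs) in the segment.",
        "Input is a JSON list of words; reply with a JSON list of MWEs,",
        "each MWE being a list of words from the segment.",
        "",
    ]
    # Show the model typical input/output pairs before the real input.
    for example_words, example_mwes in FEW_SHOT_EXAMPLES:
        lines.append("Input: " + json.dumps(example_words))
        lines.append("Output: " + json.dumps(example_mwes))
    lines.append("Input: " + json.dumps(words))
    lines.append("Output:")
    return "\n".join(lines)
```

The resulting string would be sent to the model as the request, and the reply parsed directly as JSON.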
Discussing further with the AI, I asked it whether it thought it might help to change the specification of the task so that GPT-4 was asked to ‘think aloud’: it would begin by writing an analysis of the segment, considering each phrase that could potentially be an MWE and listing reasons why it should or should not be counted as one. At the end, the phrases identified as MWEs would be collected together and returned as a JSON list. The AI thought this was promising, and when we tried using the strategy interactively in the ChatGPT-4 web interface we found that things did indeed improve. I now have an initial implementation running on my laptop, and it looks much better than the first version. I should have this installed on the server soon, so that people can experiment. If things hold up, it’s natural to wonder if the ‘think-aloud’ approach can also be used to improve performance on other annotation tasks.
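With the ‘think-aloud’ format, the reply is free text followed by a JSON list, so the list has to be pulled out of the surrounding analysis. The following is a rough sketch of one way to do that, assuming the list is the last piece of valid JSON starting with `[` in the reply; the function name is hypothetical and this is not necessarily how the running implementation works.

```python
import json


def extract_final_json_list(response_text):
    """Return the last parseable JSON list in a free-text response."""
    decoder = json.JSONDecoder()
    result = None
    pos = response_text.find("[")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # returns it together with the index where it ended.
            value, end = decoder.raw_decode(response_text, pos)
        except json.JSONDecodeError:
            # Not valid JSON here (e.g. a bracket in the prose); try the next one.
            pos = response_text.find("[", pos + 1)
            continue
        if isinstance(value, list):
            result = value
        # Continue searching after this list, skipping its nested brackets.
        pos = response_text.find("[", end)
    if result is None:
        raise ValueError("no JSON list found in response")
    return result
```

For example, given a reply that ends with `MWEs: [["gave", "up"]]`, the function returns the nested list and ignores the analysis text before it.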
This might make a nice paper for the ALTA 2024 conference (deadline sometime in September). It would be a natural continuation of last year’s paper, where we identified MWEs as the most important problem when doing GPT-4-based annotation.
Next Zoom call
The next call will be at:
Thu May 9 2024, 18:00 Adelaide (= 08:30 Iceland = 09:30 Ireland/Faroe Islands = 10:30 Europe = 11:30 Israel = 12:00 Iran = 16:30 China = 18:30 Melbourne = 19:30 New Caledonia)