It occurred to me today that we could make C-LARA a good deal faster. When we do annotation operations, we break the text up into chunks and perform a sequence of OpenAI calls, typically one per chunk. But in fact it doesn’t have to be a sequence, since those calls are in general independent of each other and they’re being executed in the cloud. We could instead create a bunch of asynchronous calls and execute them all in parallel.
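To give a sense of the idea (this is just a minimal sketch, not C-LARA's actual code): with the current async OpenAI Python client, the per-chunk calls can be launched together and collected with asyncio.gather. The function names, prompt, model name, and concurrency limit below are all placeholders for illustration.

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # picks up OPENAI_API_KEY from the environment

async def annotate_chunk(chunk: str, semaphore: asyncio.Semaphore) -> str:
    """Send one chunk to the model and return its annotated form (illustrative prompt)."""
    async with semaphore:  # cap the number of simultaneous requests
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "Annotate the following text chunk."},
                {"role": "user", "content": chunk},
            ],
        )
        return response.choices[0].message.content

async def annotate_all(chunks: list[str], max_concurrent: int = 10) -> list[str]:
    """Fire off one request per chunk and gather the results in the original order."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = [annotate_chunk(chunk, semaphore) for chunk in chunks]
    return await asyncio.gather(*tasks)

# annotated = asyncio.run(annotate_all(chunks))
```

The semaphore is there because in practice you would want some cap on concurrency to respect API rate limits, but the essential speed-up comes from gather: the chunks are processed in the time of the slowest call rather than the sum of all of them.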
I have just been discussing this with the AI, who knows asynchronous Python methods well – it has written all of that part of the codebase. It immediately confirmed that this should be fairly easy to do, and suggested some concrete approaches.
We should finish some of the tasks currently in progress before starting on this, but it would be a great thing to get back to a little later in the year.