The new C-LARA image-generation functionality is making excellent progress, and the latest thing I added is so cute I just have to share it with you all. A couple of weeks ago, Fabrice said it would be useful for picture dictionaries if we could mix AI-generated and uploaded images. The rationale is straightforward. For words that refer to concepts that are not culturally specific (generic nouns, colour adjectives, basic action verbs…), AI-generated images work well; but for words that are highly culture-specific, like local flora and fauna, the AI often struggles to create a good image, and an uploaded image is usually better.
It was easy to implement Fabrice’s suggestion, but then I started wondering if we could combine the two kinds of images in other ways. In particular, we have a control in the new image editing interface that tells the AI to create variants of an existing image. For AI-generated images, it’s clear what this should do: the AI takes the prompt used to create the image and runs it again. But what should “Create variants” do with an uploaded image? One answer seemed logical to me. The AI could analyse the uploaded image to create a text description, then combine that description with the existing descriptions of the style and of the recurring elements in the text. The combined text could then be used as the prompt to create a new image, based on the uploaded image but in a style consistent with the AI-generated images.
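For readers who like to see the idea as code, here is a minimal sketch of that pipeline. It is not the actual C-LARA implementation: the names `ImageContext`, `build_variant_prompt` and `create_variant_prompt` are hypothetical, the prompt layout is just one plausible choice, and the step that turns an image into a text description is passed in as a function, since in practice it would be a call to a vision-capable model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ImageContext:
    """Descriptions already available for the text (hypothetical structure)."""
    style_description: str
    element_descriptions: Dict[str, str] = field(default_factory=dict)


def build_variant_prompt(image_description: str, context: ImageContext) -> str:
    """Combine the style, recurring elements and image description into one prompt."""
    parts = [f"Style: {context.style_description}"]
    for name, description in context.element_descriptions.items():
        parts.append(f"Recurring element '{name}': {description}")
    parts.append(f"Scene to depict: {image_description}")
    return "\n".join(parts)


def create_variant_prompt(
    image_path: str,
    describe_image: Callable[[str], str],
    context: ImageContext,
) -> str:
    """Describe the uploaded image, then build the prompt for the new image."""
    # describe_image stands in for the AI's image-analysis step (an assumption).
    image_description = describe_image(image_path)
    return build_variant_prompt(image_description, context)


if __name__ == "__main__":
    context = ImageContext(
        style_description="manga/anime style",
        element_descriptions={"Boy": "a teenage boy", "Girl": "a teenage girl"},
    )
    # A stub describer; a real one would send the image to a vision model.
    stub_describer = lambda path: "a boy and a girl meeting for the first time"
    print(create_variant_prompt("uploaded.jpg", stub_describer, context))
```

The resulting prompt string would then be handed to the image generator in the same way as the prompt for any AI-generated image, which is what keeps the variant consistent with the rest of the illustrations.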
I just tried it on the toy text, “Boy Meets Girl”, that I’ve been using to test the functionality. I specified illustrations in manga/anime style, and the AI’s image illustrating that style looks like this:

The images illustrating the AI’s generated descriptions for the two characters ‘Boy’ and ‘Girl’ look like this:


Next, I uploaded one of the top images I found for “Boy Meets Girl” on Google Image Search,

and hit “Create Variants”. Here’s the result:

I don’t actually know if it’s useful yet, but it is rather agreeably magical. Curious to know what people think!