Make your own amazing AI Art using neural networks

As a machine learning practitioner/student/fan since 2013, I’m always on the lookout for novelties. At first I thought Generative Adversarial Networks were technically cool, but still a bit “procedural and predictable”. Then came the next generation of generative models, based on transformers. I was very impressed with GPT-3’s apparent creativity and used OpenAI Codex a lot through the essential GitHub Copilot (if you write code and don’t use Copilot yet, I strongly recommend that you give it a try!).

From there was born the great era of limitless visual possibilities and super weird, surprisingly beautiful images: text2image models! Large text-to-image models marked a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. It started with VQGAN+CLIP, then came DALL·E (OpenAI), and then (the wonderful) Stable Diffusion (Stability AI), Imagen (Google) and Midjourney.

After reading about those for a while, and trying out text-to-video, image-to-text, text-to-3D, and image-to-sound, I heard about DreamBooth and wanted to try it. So I fine-tuned Stable Diffusion 1.5 using a few pictures of mine and saved the weights for later use (I used code from this Python notebook).
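For readers who want to reuse saved weights like these, here is a minimal sketch of what the generation step could look like. It is not the code from the notebook mentioned above. It assumes the fine-tuned model was saved in the Hugging Face `diffusers` folder format, and the folder name `./my-dreambooth-weights`, the subject token `sks person`, and the helper names `build_prompts` and `generate_images` are all hypothetical placeholders.

```python
from pathlib import Path


def build_prompts(subject_token, styles):
    """Combine the subject token used during fine-tuning with style descriptions."""
    return [f"a portrait of {subject_token}, {style}" for style in styles]


def generate_images(weights_dir, prompts, out_dir="outputs", steps=50, guidance=7.5):
    """Load fine-tuned weights from a diffusers-format folder and save one image per prompt."""
    # Imported lazily so build_prompts works even without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Use half precision on a GPU when one is available, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(weights_dir, torch_dtype=dtype).to(device)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = []
    for i, prompt in enumerate(prompts):
        # The pipeline returns a list of PIL images; one prompt gives one image here.
        image = pipe(prompt, num_inference_steps=steps, guidance_scale=guidance).images[0]
        path = out / f"image_{i:03d}.png"
        image.save(path)
        paths.append(path)
    return paths


# "sks person" is a placeholder; use whatever token the model was fine-tuned with.
prompts = build_prompts("sks person", ["oil painting", "cyberpunk style"])
```

Calling `generate_images("./my-dreambooth-weights", prompts)` would then write one PNG per prompt into `outputs/`. Keeping the prompt list separate from the generation loop makes it easy to try many styles in one run.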

After that, I went to these two websites for inspiration:

Then I generated a few hundred pictures. And then I was like: WOW.
Here’s a quick selection of fun pics:

Amazing, isn’t it? Please note that Stable Diffusion 2.0 is on its way too! That means the text2image saga is not finished yet, and we’ll probably see further exciting improvements in the efficiency and quality of generated content.
