This is the second article in a series on being creative with VQGAN+CLIP, outlining some techniques and ideas that work for me when using art genre prompts. The models we're using are trained on pretty much the entirety of art history, so there's a vast array of movements, artists and styles hiding in there, just waiting to be used. All we need to do is find them.
I'm going to assume you know the basics of using VQGAN+CLIP with Google Colab, so please do read this guide first if you're just getting started. I’m also using Pytti 5 as it’s my preferred notebook, but the concepts I’ll be talking about will work with any VQGAN+CLIP notebook.
Artist modifiers work in a similar way to style modifiers – adding an artist’s name, art movement or genre to your prompt will cause your generated image to take on aspects of that particular artist or style. As with other modifiers, the more specific you are, the more coherent the result will be.
While the results are similar to another AI technique called Neural Style Transfer (NST), where a neural network takes the characteristics from one image and applies them to another, VQGAN+CLIP works differently. The models are trained on many millions of images, so the system is able to generate an image in pretty much any style.
It doesn’t stop there, though. The system also picks up the content of the artist’s work and, if the artist is well known, what they look like. If you specify a portrait painter, you’re going to get faces; if you specify Simon Stålenhag, you’re going to get sad robots; and if you specify Bob Ross, you’re going to get happy trees with big ginger hair, as the system can’t differentiate between Bob Ross himself and his artwork.
However, making an image in another artist's style might be fun but it's not particularly original. The real creativity comes from mixing art movements and artists to make something new. The artists and styles themselves become your colours and palette.
Selecting the right styles
When planning your prompt, think about the effects you want to achieve in your image and find artists or styles that reinforce them. These choices will affect the final image beyond just the superficial style, so adding an artist who is known for intricate work will result in a more detailed image, while a sculptor or 3D artist will give the image more dimensionality. The trick is to find artists that complement your subject as well as having the style you want.
One of my favourite techniques is to bring together different styles to see what happens - for example a brutalist building covered in flowing, tangled art nouveau vines, or ornate Rococo ornamentation on a computer control panel. Sometimes the results are a mess, but other times the results are startlingly good.
Here are some things to think about when selecting artists:
Colours – What colours do you want in your image, and are there any artists who use colour in a way that you like? For example Georgia O’Keeffe is known for large abstract colourful watercolours – what effect would adding her have?
Materials – What do you want your image to be ‘made’ from? Paint, stone, paper etc? Which artists work with those materials? Byzantine religious art uses a lot of gold leaf – what effect would that create?
Shapes – What forms do you want in your image? Art Nouveau creates long tangled intricate lines, while Henry Moore creates smooth, rounded abstract shapes.
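One way to explore these choices systematically is to pick a few candidates for each question and generate every combination as a separate prompt. The snippet below is only a sketch: the subject, the function name and the comma-separated prompt layout are my own illustrative assumptions, not something any particular notebook requires.

```python
from itertools import product

# Candidate artists/styles for each question (examples taken from the list above).
COLOURS = ["Georgia O'Keeffe"]
MATERIALS = ["Byzantine religious art"]
SHAPES = ["Art Nouveau", "Henry Moore"]


def style_mixes(subject, *style_groups):
    """Return one prompt per combination, taking one style from each group.

    The comma-separated layout is an assumption for illustration;
    adapt it to whatever phrasing works in your notebook.
    """
    return [", ".join((subject, *combo)) for combo in product(*style_groups)]


# "a lighthouse" is a hypothetical subject used only for this example.
mixes = style_mixes("a lighthouse", COLOURS, MATERIALS, SHAPES)
for prompt in mixes:
    print(prompt)
```

Each printed line can be pasted into the notebook as a separate run, which makes it easier to compare what each combination contributes.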
Discovering artists and styles
Since discovering VQGAN I’ve learned a huge amount of art history by researching artists and genres for my images. WikiArt is a good starting place for this as they have a large database of artists and it was used for training the models. If an artist is listed then they are likely to be present in the datasets.
Reducing unwanted effects
Sometimes the content of an artist's work, such as faces or writing, will manifest in your image in unwanted ways, and if the artist is well known their portrait may start appearing too.
You can mitigate these effects by using zero or negative prompt weighting, for example by adding 'face:0' or 'face:-1' to stop CLIP from creating faces, but it's not always successful. You might just end up with a weird fleshy nub where CLIP really wants to put a face. The best thing to do is stop generating before they manifest as they get progressively worse the longer you let it run for.
In the left image, CLIP has confused 'By William Morris' for a picture of William Morris himself. In the right image it was also trying to turn the flowers into Morris-heads, but I attempted to stop that from happening by specifying 'face:-1'. As you can see, it's not been wholly successful, leaving behind strange half-head-half-flowers with blank areas where the face should have been.
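If you find yourself adding the same negative weights again and again, a small helper can assemble the prompt string for you. This is a minimal sketch that assumes the 'text:weight' form used above and a '|' separator between prompts. The separator and the helper names are assumptions on my part, so check how your own notebook expects multiple prompts to be written.

```python
def format_prompt(text, weight=1):
    """Return 'text:weight', leaving the weight off when it is the default of 1."""
    if weight == 1:
        return text
    return f"{text}:{weight:g}"


def build_prompt(parts, separator=" | "):
    """Join (text, weight) pairs into one prompt string.

    The ' | ' separator is an assumption; some notebooks use a
    different character or a separate field for each prompt.
    """
    return separator.join(format_prompt(text, weight) for text, weight in parts)


# The subject wording here is illustrative; 'face:-1' is the weighting discussed above.
prompt = build_prompt([
    ("flowers by William Morris", 1),
    ("face", -1),
])
print(prompt)
```

Swapping the -1 for 0 gives the gentler 'face:0' version, which is worth trying first if the negative weight leaves too many blank patches.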
'By' vs 'in the style of'
There's no set way to add an artist to your prompt. 'By Georges Seurat' is the most direct, but 'In the style of Georges Seurat' works too, though it has a slightly different effect. It's worth experimenting with different ways to find which one works best for the artist. If 'By...' is not working or has too many unintended effects, then you could try another way.
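To compare phrasings side by side, you can generate one prompt per wording and run each with the same settings. The sketch below is illustrative only: the template list, function name and example subject are my own assumptions, and you can add any other wording you want to test.

```python
# Phrasings to compare; add your own templates to this list.
TEMPLATES = [
    "{subject} by {artist}",
    "{subject} in the style of {artist}",
]


def artist_variants(subject, artist, templates=TEMPLATES):
    """Return one prompt per template for the given subject and artist."""
    return [t.format(subject=subject, artist=artist) for t in templates]


# "a park on a Sunday afternoon" is a hypothetical subject for this example.
variants = artist_variants("a park on a Sunday afternoon", "Georges Seurat")
for prompt in variants:
    print(prompt)
```

Keeping the seed and other settings fixed between runs, where your notebook allows it, makes it easier to see what the wording alone is changing.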
As always, the fun of AI art is being able to experiment and try new things to see what happens and make something new. What will you create?
Next: Materials, colour and lighting.