Coarse is Better

When DALL-E came out, it took me a few weeks to pick my jaw up off the floor. I would go to sleep excited to wake up to a full quota, with plenty of prompts to try. It was magical, miraculous. Like discovering a new universe. I’ve compiled the best art in this post.

The other day a friend ran some of my old prompts through Nano Banana Pro (NBP), and put the old models’ outputs side by side with the new model’s. It’s interesting that after years of progress, models are much better at drawing, but very bad at making art.

Electron Shapes in the Style of Italian Futurism, oil on canvas, 1922, trending on ArtStation.

The old MidJourney v2 renders this:

Red and gold abstract shapes on a dark blue background.

NBP presents:

Muted, red and blue ellipses in front of mechanical background in golden frame.

Granted, MJ’s output doesn’t exactly look like Futurism. But it looks like something. It looks compelling. The colors are bright and vivid. NBP’s output is studiously in the style of Italian Futurism, but the colors are very muted and dull.

Maybe “trending on ArtStation” is a bit archaic and spoils the output. Let’s try again without it:

Red, gold, yellow circles intersect, thick impasto, oil on canvas, the word ELETTRONICO written in black across the frame.

Meh.

Painting of a street in the Kowloon Walled City, Eugene Boudin, 1895, trending on ArtStation.

MJ gave me this:

Impressionist painting of a city street with a canopy of trees overhead.

And it doesn’t look at all like the Kowloon Walled City. But it is beautiful. It’s coarse, impressionistic, vague, thought-provoking, contradictory. It is full of mystery. And indeed, it is in the style of Eugene Boudin.

In contrast, this is the NBP output:

A muted painting of a commercial street in a Chinese city.

Sigh. It looks like every modern movie: so desaturated you feel like you’re color blind. Let’s try to force it:

Painting of a street in the Kowloon Walled City, Eugène Boudin, 1895. Make it coarse, impressionistic, ambiguous, thought-provoking, contradictory, full of mystery.

A dark, muted painting of a commercial street in a Chinese city in the rain.

This is somewhat better, but why is it so dull and colorless? Is the machine trying to make me depressed?

Attar and Ferdowsi in the Garden of Dreams, Persian miniature, circa 1300, from the British Museum.

MidJourney v2:

Above a cobalt blue landscape, on a green floating island, a man wearing a green cloak and a younger man wearing a golden cloak.

It looks like nothing at all. But it is beautiful and thought-provoking. I like to imagine that there is a small patch of paint in the circle in the upper right.

NBP Output:

Photograph of a typical Persian miniature in a display case.

Well, it looks like a Persian miniature. What I meant by the phrase “from the British Museum” was that it should be interpreted suggestively rather than literally. The prompt refers to an imaginary thing, bringing it into existence. But NBP reads it like this: no, it’s a photo of a Persian miniature in the British Museum.

The Burning of Merv, 1896, by John William Waterhouse, from the British Museum.

MidJourney v2:

A woman in a dress, surrounded by flames, black water, and a crowd watching.

It looks like Waterhouse. This is open to debate semantically: it looks like a woman is being burned at the stake, not a city being destroyed. But from an aesthetic point of view: it’s gorgeous. The flames are gorgeous, the red colors of the dress are gorgeous. Notice the reeds and dark water in the background, which looks like tarnished silver or zinc. The faces of the crowd. Is that a minotaur on the bottom left, or a flower? What is she holding on her folded left arm? A cross, a dagger? You can find the entire universe in this image, in this 1024×1024 frame.

In contrast, this is the NBP output:

Photograph of a painting of warriors on horseback outside the burning city. The photo shows that the painting is in a museum's display room.

What can anyone say? It doesn’t look like Waterhouse. The horsemen wear Arab or Central Asian attire, but Merv was sacked by the Mongol Empire in the year 1221. And, again, the “British Museum” line is taken literally rather than suggestively.

Portrait of Ada Lovelace by Dante Gabriel Rossetti, 1859, auctioned by Christie’s.

MidJourney:

A portrait of Ada Lovelace in front of a dark green circle.

It is beautiful. It’s beautiful because the bold, impressionistic brushstrokes are more evocative than literal. And it really looks like a woman painted by Rossetti. And look at the green! Gorgeous green. The palette is very narrow, and the painting is very beautiful.

NBP Output:

A photograph, taken at an angle inside a gallery, of a 19th-century realist painting of a woman in a gilt frame, with a Christie's auction book on a table.

Pure literalism. “Auctioned by Christie’s”, again, is evocative: “This is the kind of painting that would be sold at auction”. But NBP makes it a photo of a painting in an auction house. Well, I guess I got what I asked for.

But the woman doesn’t look like Rossetti! This is absurd. How can a 2022 model get this right, while the SOTA image generation model gives us the usual oil painting slop?

Herat, a Persian miniature of the Cosmic Microwave background, circa 1600, trending on ArtStation

MidJourney v2:

A golden disc surrounded by concentric circles of Perso-Arabic letters on a dark blue background.

NBP:

Standard illustration of the CMB in the frame of a Persian miniature.

Again: what can anyone say?

Dream Story, 1961, blurry black and white photograph, yellow tint, from the Metropolitan Museum of Art.

These are some of my favorite DALL-E 2 outputs:

A photo of two trees in a dark forest illuminated by a sepia glow. Two people can be seen watching this scene at the lower right corner.

A sepia photograph, showing two girls on a bed and three men standing around them.

A very blurry sepia photograph of an unidentified man and woman.

Sepia photograph: Three shadowy, almost alien-looking figures that may be a sculpture or painting.

They remind me of The King in Yellow. I love these because of how scary and mysterious they are. You could come up with hundreds of horror stories from them.

It’s hard to believe how bad the NBP output is:

A black and white photograph of people walking in a section. At bottom left, a caption says: "Dream Story, 1961 – Metropolitan Museum of Art Archive".

What are we doing here? The old models were beautiful and compelling because imperfections, ambiguities, mistakes, and contradictions all create little gaps through which your imagination can bring the art to life. Images are not a fixed, static thing: they can be infinitely many things.

New models—do I even need to finish this sentence? They’re very precise and high-resolution, so they can’t make abstract, multi-faceted things, they can only make specific, concrete things.

We need to make AI art weird again.


