OpenAI’s new ChatGPT image generator makes faking photos easy

For most of photography’s nearly 200-year history, altering a photo required either a darkroom, some Photoshop expertise, or, at the very least, a steady hand with scissors and glue. On Tuesday, OpenAI released a tool that reduces the process to typing sentences.

OpenAI is not the first company to do this. While OpenAI has had a conversational image-editing model since GPT-4o in 2024, Google beat OpenAI to market with a public prototype in March, then refined it into a popular image model called Nano Banana (and Nano Banana Pro). The enthusiastic response to Google’s image-editing model in the AI community attracted OpenAI’s attention.

OpenAI’s new GPT Image 1.5 is an AI image synthesis model that reportedly generates images four times faster than its predecessor and costs about 20 percent less via the API. The model launched to all ChatGPT users on Tuesday and represents another step toward making photorealistic image manipulation a casual process that doesn’t require any special visual skills.

“Galactic Queen of the Universe” added to a photo of a room with a sofa using GPT Image 1.5 in ChatGPT.

GPT Image 1.5 is notable because it is a “native multimodal” image model, meaning that image generation occurs inside the same neural network that processes language prompts. (In contrast, DALL-E 3, an older OpenAI image generator previously built into ChatGPT, used a different technique called diffusion to generate images.)

This newer type of model, which we covered in more detail in March, treats images and text as the same kind of thing: pieces of data, called “tokens,” to be predicted, patterns to be completed. If you upload a photo of your father and type in “he wore a tuxedo to a wedding,” the model processes your words and the image pixels in a unified space, then outputs new pixels the same way it outputs the next word in a sentence.
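To make the “everything is a token” idea concrete, here is a toy sketch in Python. It is not OpenAI’s architecture or API: the tiny vocabulary, the image-patch codes, and the `toy_next_token` rule are all hypothetical stand-ins for a trained network. It only shows how text and image pieces can share one ID space and be generated one token at a time.

```python
# Toy illustration only: not OpenAI's model, tokenizer, or API.
# All names and sizes below are hypothetical.

TEXT_VOCAB = ["he", "wore", "a", "tuxedo", "to", "wedding"]
IMAGE_VOCAB_SIZE = 4  # assumed number of distinct image-patch codes

# One shared ID space: text tokens come first, image-patch tokens after them.
TEXT_IDS = {word: i for i, word in enumerate(TEXT_VOCAB)}
IMAGE_OFFSET = len(TEXT_VOCAB)


def encode_text(words):
    """Map words to token IDs in the shared space."""
    return [TEXT_IDS[w] for w in words]


def encode_image(patch_codes):
    """Map image-patch codes to token IDs in the same shared space."""
    return [IMAGE_OFFSET + code for code in patch_codes]


def is_image_token(token_id):
    return token_id >= IMAGE_OFFSET


def toy_next_token(sequence):
    """Stand-in for a trained network: a dummy rule that always
    returns some image-patch token based on the sequence so far."""
    return IMAGE_OFFSET + (sum(sequence) % IMAGE_VOCAB_SIZE)


def generate(prompt_tokens, n_new):
    """Append one predicted token at a time, as a language model
    would with words, and return only the newly generated tokens."""
    sequence = list(prompt_tokens)
    for _ in range(n_new):
        sequence.append(toy_next_token(sequence))
    return sequence[len(prompt_tokens):]


# An "uploaded photo" (four patch codes) followed by the typed request.
prompt = encode_image([2, 0, 3, 1]) + encode_text(
    ["he", "wore", "a", "tuxedo", "to", "a", "wedding"]
)
new_tokens = generate(prompt, n_new=4)
print(new_tokens)
```

In a real system, the predicted image tokens would be decoded back into pixels, and the prediction rule would be a large neural network rather than a one-line formula.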

Using this technology, GPT Image 1.5 can alter visual reality more easily than earlier AI image models, changing a person’s pose or position, or rendering a scene from a slightly different angle, with varying degrees of success. It can remove objects, change visual styles, adjust clothing, and retouch specific areas while preserving facial likeness across successive edits. You can have a conversation with the AI model about a photo, refining and revising it, just as you might workshop a draft of an email in ChatGPT.


