
What did I just do? The Sora app is powered by Sora 2, an AI model that is, honestly, quite breathtaking. It can create videos that range from mundane to jaw-droppingly realistic. It is also a black hole for energy and data, and a distributor of highly questionable content. Like many things these days, using Sora feels slightly naughty, even if you can’t say exactly why.
So if you’ve just made a Sora video, I have some bad news. By reading this, you’re asking to feel a little dirty and guilty, and your wish is my command.
Here’s how much electricity you just used
A Sora video uses about 90 watt-hours of energy, according to CNET. That number is an educated guess derived from a Hugging Face study of GPU energy usage.
OpenAI has not actually published the numbers needed for such a study, so Sora’s energy footprint has to be estimated from similar models. For what it’s worth, Sasha Luccioni, one of the Hugging Face researchers behind that work, is not happy with estimates like these. “We should stop trying to reverse-engineer numbers based on hearsay,” she told MIT Technology Review, adding that we should instead pressure companies like OpenAI to release accurate data.
At any rate, different journalists have arrived at different estimates based on the Hugging Face data. The Wall Street Journal, for example, estimated between 20 and 100 watt-hours.
CNET compared its estimate to running a 65-inch TV for 37 minutes. The Journal compared a Sora generation to cooking a steak from raw to rare on an electric outdoor grill (because such a thing apparently exists).
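The TV comparison checks out on the back of an envelope. Here’s a minimal sketch, assuming a 65-inch TV draws roughly 146 watts (the TV wattage is my assumption, not a figure from CNET):

```python
# Back-of-envelope check of CNET's 65-inch-TV comparison.
video_energy_wh = 90   # CNET's per-video estimate, in watt-hours
tv_power_w = 146       # assumed draw of a 65-inch TV, in watts

# Energy (Wh) divided by power (W) gives hours; convert to minutes.
minutes_of_tv = video_energy_wh / tv_power_w * 60
print(f"{minutes_of_tv:.0f} minutes of TV")  # ~37 minutes
```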
In the interest of making you feel even worse, it’s worth clarifying a few things about this energy usage. First, what I just outlined is the estimated cost of inference, i.e., running the model in response to a prompt. The actual training of the Sora model required some unknown, but certainly astronomical, amount of electricity. Training the GPT-4 LLM is estimated to have taken 50 gigawatt-hours, reportedly enough to power San Francisco for 72 hours. Sora, being a video model, presumably took more than that, but how much more is unknown.
Viewed a certain way, you assume a portion of that unknown cost every time you choose to use the model to create a video.
Second, separating inference from training matters in another way when you’re trying to figure out how much eco-guilt to feel (are you sorry you asked yet?). You might be tempted to write off the training energy costs as something that’s already happened, the way the cow in your burger died weeks ago, and you can’t un-kill it by ordering the Beyond Patty once you’ve already sat down at the restaurant. In that sense, running any cloud-based AI model is like ordering surf and turf. The “cow” of all that training may already be dead, but the “lobster” of your specific request stays alive until you send your prompt to the “kitchen”, which is the data center where the inference occurs.
Here’s how much water you just used
Sorry, more guesswork here. Data centers use large amounts of water for cooling, either in closed-loop systems or through evaporation. You don’t know which data center, or data centers, were involved in making the video of your friend as an American Idol contestant farting to “Camptown Races.”
But your video still probably used more water than you’d expect. OpenAI CEO Sam Altman claims that a single ChatGPT text query consumes “about one-fifteenth of a teaspoon” of water, and CNET estimates that a single video costs about 2,000 times as much energy as a text generation. Scale the water the same way, and the back-of-the-envelope answer is about 0.17 gallons, or roughly 22 fluid ounces, a little more than a plastic bottle of Coke.
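That arithmetic is easy to reproduce. A sketch, assuming (as above) that water use scales with energy use, which is a simplification rather than anything OpenAI has confirmed:

```python
# Scaling Altman's per-query water figure by CNET's 2,000x
# video-vs-text energy multiplier (assumed to apply to water too).
TSP_PER_GALLON = 768    # 1 US gallon = 768 teaspoons
FLOZ_PER_GALLON = 128   # 1 US gallon = 128 fluid ounces

chat_query_tsp = 1 / 15   # Altman's "one-fifteenth of a teaspoon"
video_multiplier = 2000   # CNET's video-vs-text ratio

video_gallons = chat_query_tsp * video_multiplier / TSP_PER_GALLON
print(f"{video_gallons:.2f} gallons")                  # ~0.17 gallons
print(f"{video_gallons * FLOZ_PER_GALLON:.0f} fl oz")  # ~22 fl oz
```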
And that’s if you take Altman at face value; it could easily be more. The same caveats about training costs versus inference costs apply here, too. In short, using Sora is not a water-wise choice.
There’s a slight chance that someone could create a really disgusting deepfake of you.
Sora’s Cameo privacy settings are strong, as long as you’re aware of them and take advantage of them. The settings under “Who can use it” keep your likeness from becoming a plaything for the masses, unless you choose the “Everyone” option, which lets anyone put you in their Sora videos.
Even if you are careless enough to make your Cameo publicly available, you have some additional controls in the “Cameo Preferences” tab, like the ability to describe in words how you should look in videos. You can write whatever you want here, like “lean, toned, and athletic”, or “always picking my nose”. You can also set rules about what you should never be shown doing. If you keep kosher, for example, you might specify that you should never be shown eating bacon.
And even if you never allow your Cameo to be used by anyone else, you can take some comfort in the ability to set these guardrails for the videos you make of yourself.
But Sora’s general content guardrails are not airtight. According to OpenAI’s own model card for Sora, if someone prompts hard enough, an incriminating video can slip through the cracks.
The card reports success rates for its various content filters in the 95%-98% range. Subtract those from 100%, though, and you get a 1.6% chance of a sexual deepfake slipping through, a 4.9% chance of a video containing violence and/or gore, a 4.48% chance of something called “violent political persuasion”, and a 3.18% chance of extremism or hatred. These figures were calculated from “thousands of adversarial prompts collected through targeted red-teaming”, in other words, deliberate attempts to breach the guardrails with rule-breaking prompts.
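The failure chances above are just the complements of the model card’s success rates. A trivial sketch (the success-rate values here are derived from the failure percentages quoted, not copied directly from the model card):

```python
# Filter slip-through rate = 100% minus the reported success rate.
success_rates = {
    "sexual deepfakes": 98.4,
    "violence and/or gore": 95.1,
    "violent political persuasion": 95.52,
    "extremism or hatred": 96.82,
}
for category, success in success_rates.items():
    print(f"{category}: {100 - success:.2f}% slip-through")
```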
So the chances that someone will create a sexual or violent deepfake of you are low, but OpenAI (perhaps wisely) never said they were zero.
Someone could make a video of you touching your poop.
In my tests, Sora’s content filters generally worked as advertised, and I was never able to confirm the failures the model card describes, though I didn’t painstakingly craft 100 different prompts trying to coax Sora into creating sexual content. If you prompt it for a nude Cameo of yourself, you get a “Content Violation” message in place of your video.
However, some potentially objectionable content is so lightly policed that it appears to go completely unfiltered. In particular, Sora seems unconcerned about scatological content, and will generate it without any pushback, as long as it doesn’t violate other content policies like those around sexuality and nudity.
So yes, in my tests, Sora produced Cameo videos of a person interacting with feces, including pulling turds out of the toilet with his bare hands. For obvious reasons, I’m not going to embed the video here as a demonstration, but you can test it yourself. It did not require any kind of cleverness or prompt engineering.
In my experience, previous AI image generators have taken measures to prevent this sort of thing, including the Bing version of OpenAI’s image generator, DALL-E, but that filter is missing in the Sora app. I don’t think it’s necessarily a scandal, but it’s gross!
Gizmodo asked OpenAI for comment, and will update if we hear back.
Your funny video could be someone else’s viral hoax.
Sora 2 has opened up a vast universe of potential hoaxes. You, a savvy, internet-native content consumer, would never believe that something like the viral video below could be real. It shows what appears to be candid footage shot from outside the White House. In the audio, which sounds like an overheard phone conversation, an AI-generated Donald Trump asks an unknown party not to release the Epstein files, yelling, “Just don’t let them out. If I go down, I’ll bring all of you down with me.”
Judging by the Instagram comments alone, more than a few people were convinced it was real.
The creator of the viral video never claimed it was real, telling Snopes, which confirmed it was made with Sora, that the video is “completely AI-generated” and was created “solely for artistic experimentation and social commentary.” A likely story. It was clearly created for impact and social media visibility.
But if you post videos publicly on Sora, other users can download them and do whatever they want with them, and that includes posting them to other social networks and pretending they’re real. OpenAI put a lot of thought into making Sora a place where users can scroll endlessly. Once you put a piece of content into a place like that, context falls away, and you have no control over what happens to it next.