Talking to Windows’ Copilot AI makes a computer feel incompetent

It’s not hard to understand the AI future Microsoft is betting billions on: a world where computers understand what you’re saying and do things for you. That’s exactly what the latest Copilot PC commercials depict, where people happily talk to their laptops and the laptops talk back, answering questions in natural language and even doing tasks for them. The tagline is simple: “The computer you can talk to.”

“You need to be able to talk to your PC, have it understand you, and then be able to do magic with it,” Microsoft’s Yusuf Mehdi told us in October. “The PC must be able to act on your behalf.”

And that’s to say nothing of Microsoft’s ultimate ambitions for AI, which are to completely rethink computing. In a recent interview on the Dwarkesh Podcast, Microsoft CEO Satya Nadella agreed with the host’s suggestion that these models will be able to use a computer as well as a human can, and went even further, describing an approach where Microsoft rebuilds all of its software as infrastructure for AI agents to use in entirely new ways.

It’s a bold approach and a huge bet. The problem is that, right now, talking to Copilot in Windows 11 is an exercise in pure frustration — a stark reminder that the reality of AI is nowhere near the hype.

I spent a week with Copilot, asking it the same questions Microsoft poses in its ads and trying to get help with tasks I actually found useful. Time and again, Copilot made mistakes, made things up, and talked to me like I was a child.

Copilot Vision scans what’s on your screen and tries to assist you with voice prompts. To invoke it, you have to press OK and share your screen as if you were on a Teams call. Every. Single. Time. Once it has your permission, it’s extremely slow to respond, and it addresses me by name whenever I ask it something. Like other AI assistants built on LLMs, it’s eager to please, even when it’s completely wrong.

Let’s start by examining what Microsoft’s ad shows. Several versions of the ad have been posted online, and it also airs on broadcast TV during NFL games. Surely it should be easy to replicate the specific actions Microsoft wants to expose millions of people to, especially if they’re the basis for how Microsoft is reorienting its entire business.

In the ad, Copilot Vision scans a YouTube video and correctly identifies the HyperX QuadCast 2S microphone when asked “Which mic is she using in this video?” In my tests, Copilot first gave me generic information about the benefits of dynamic microphones. Then, without any prompting, it started talking to me as if I were the person in the video (“I can see your setup right now, and I see you have… a big setup there!”), then told me the mic in question was actually a first-generation HyperX QuadCast. To be fair, HyperX makes a lot of similar-looking mics, though at one point it said, “Without seeing the exact lighting pattern or any distinguishing features, it’s hard to say with certainty which model it is,” despite the image being bathed in RGB light.

On two other occasions, it identified the mic as a Shure SM7B. And when I asked “Where can I find one nearby?” as in the ad, it gave me a dead Amazon link once, and another time a working link to the wrong mic at Best Buy.

The commercials also feature a man pointing to a PowerPoint presentation about the Saturn V rocket and asking how much thrust it produces. Contrary to the ad, Copilot could not identify the rocket from the image (or from the words “Saturn V” appearing on the screen). When I told Copilot it was a Saturn V, it explained that thrust is typically measured in newtons or kilonewtons, then gave me an estimated thrust of 7.5 million pounds. And unlike in the ad, asking Copilot to “run some simulations on burn time” got me told it couldn’t be done, along with a pointer to MATLAB.

At the end, the ad shows a man looking at a picture of a cave filled with water and asking, “How do I get there?” From context, this should be a frame from a video, but that video doesn’t exist. While the longer version of the ad correctly identifies the image as the Rio Secreto in Playa del Carmen, Mexico, the shorter version I first saw doesn’t answer the question at all. Without the answer already at hand, I used reverse image search and found a match for a photo of the cave on a cruise line’s site and a real estate site, both of which claimed it was a cave in Belize. Elsewhere, it’s listed as a cave on Grand Cayman.

I made the image full-screen and asked Copilot how to get there. The results were inconsistent, to put it mildly.

  • About a third of the time, it instructed me to find the photo in File Explorer. One of those times it even told me, “This is the third icon on the taskbar” (it was the fourth).
  • On two occasions it told me how to launch Google Chrome.
  • About four times it gave me general advice about booking flights to Belize and some basic ideas about what to do there. This cave is in Mexico.

I renamed the file to mention Grand Cayman, and it told me how to book a flight to the Cayman Islands. Once I confirmed that Copilot was just reading the file name, I decided to try to trick it. I renamed the image to “new-jersey-crystal-caves-limestone.jpg” and, sure enough, the AI assistant immediately told me about the famous Crystal Cave in Ogdensburg, New Jersey. At no point did it correctly identify the location of the image.

(To be fair to Copilot, if you don’t already know where the image is from, it’s not easy to figure out. After manually searching through TripAdvisor images, my editor found a match in a user review album that confirmed Microsoft’s ad was correct in pinpointing Rio Secreto. Since the video depicted in Microsoft’s ad doesn’t exist, it’s unclear what information Copilot was using to identify the cave.)

Beyond just looking at things and trying to recognize them, Microsoft also shows Copilot actually doing things. Specifically, it’s asked to “help me turn my portfolio into a bio,” a prompt that caused me a tremendous amount of psychic damage. In the ad, Copilot looks at an artist’s portfolio of images (which look suspiciously AI-generated), their portrait, and a photo of their cat, and produces a one-sentence bio claiming they’re inspired by their feline friend. Embarrassing.

I don’t have a portfolio website for my (real) photos, so I shared my Instagram instead. It generated such nonsense about me being a “visual storyteller” “capturing the essence of life one frame at a time” that I wanted to sink through the floorboards. I feel physically ill whenever I think about it. And it didn’t even mention my cats, whom I miss every day. How dare you, Copilot.

Beyond trying to replicate the prompts from the ads, I struggled to find a use for Copilot Vision. It’s certainly not writing for me, and it can’t perform simple tasks for you in Windows, not even toggling settings like dark mode. Microsoft spokesperson Blake Manfre tells The Verge: “Copilot Actions on Windows, which can take actions on local files, is not yet available. It is an opt-in, experimental feature coming soon to Windows Insiders in Copilot Labs, starting with a narrow set of use cases while we optimize model performance and learn. It is separate from Copilot Vision.”

In third-party apps, it can offer advice, like how to get a dreamy look in Adobe Lightroom Classic, but the tips are generic. And since it delivers everything via audio, it lurches from long-winded preamble to rattling off settings at speed, like narration from the worst possible YouTube tutorial.

I asked it to help me analyze a benchmark table in Google Sheets. It got some basic percentage calculations right, but it consistently misread scores that were plainly visible both in the spreadsheet and on screen. So how can you trust it?

In gaming, something Microsoft specifically advertises as a use for Copilot Vision, it provided the most basic and vague information. For Hollow Knight: Silksong, it gave me only cursory guidance, like a kid delivering a book report based on the cover alone. (Honestly, talking to Copilot in general feels like that, and it’s weird.) In Balatro, it couldn’t accurately identify the cards in my hand, but it did give me irrelevant information about the mechanics of other card games.

I tried to meet Copilot where it is, but it failed at nearly everything I asked it to do. Like most generative AI, it’s an incomplete solution to the problems at hand. There could be something useful here, especially for the accessibility community, if it could one day drive Windows completely. But talking to Copilot today makes a powerful computer feel incompetent. It’s hard to see how we get anywhere near Microsoft’s bold vision of an agentic AI future with what Microsoft is shipping to actual consumers today.
