Disclaimer: I’m definitely not a doctor (that’s the problem, really!) so please take anything I say with a grain of salt.
Some context (feel free to skip)
For a few weeks now, I have been experiencing some pain in my right shoulder. Although it seemed to be getting better, I decided to seek advice from an orthopedic surgeon. I won’t go into details, but he suggested I get an MRI, which was readily available at the clinic. I agreed, and the main finding was a “grade III (>50%-width) partial-thickness tear” at the top insertion of my subscapularis tendon. Of course, that meant nothing to me, but the course of treatment he suggested was comprehensive, and it started just a few minutes after my MRI. Coming out of the clinic, I felt as if they had pointed a gun at me.
Thankfully, before leaving, I asked him to send me a copy of the MRI results and a list of all the treatments he had performed, which he had suggested we repeat for a total of 3 times.
I sent everything over to GPT 5.5 Pro, and immediately it flagged two things:
- He performed shockwave therapy on my shoulder, even though a recent clinical practice guideline says that physicians should not use or recommend shockwave therapy for rotator-cuff tendinopathy without calcification; I was told during the ultrasound that there was no calcification.
- He injected me with Traumeel, which is registered in Germany as a homeopathic medicine “without any therapeutic indication”.
This did not boost my confidence, and it made me curious to analyze the MRI.
Setting up Opus to perform a first review of the MRI
The MRI package was a standard DICOM export containing a few hundred files with no extension, totaling about 266 MB.
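For anyone curious what such an export looks like on disk, here is a minimal sketch of how the extension-less files could be identified and sized. It relies on one fact about the format: DICOM Part 10 files start with a 128-byte preamble followed by the four bytes `DICM`. The function names (`is_dicom_part10`, `inventory`) are my own for illustration, not something Opus produced, and older exports that skip the preamble would not be detected this way.

```python
from pathlib import Path

DICOM_MAGIC = b"DICM"   # marker that follows the preamble in Part 10 files
PREAMBLE_LEN = 128      # the preamble is 128 bytes long


def is_dicom_part10(path: Path) -> bool:
    """Return True if the file has the 'DICM' marker at byte offset 128."""
    try:
        with path.open("rb") as fh:
            fh.seek(PREAMBLE_LEN)
            return fh.read(4) == DICOM_MAGIC
    except OSError:
        # Unreadable files are simply treated as non-DICOM in this sketch.
        return False


def inventory(root: Path) -> dict:
    """Count DICOM-looking files under root and sum their sizes in bytes."""
    files = [p for p in root.rglob("*") if p.is_file()]
    dicom = [p for p in files if is_dicom_part10(p)]
    return {
        "total_files": len(files),
        "dicom_files": len(dicom),
        "dicom_bytes": sum(p.stat().st_size for p in dicom),
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python inventory.py /path/to/mri_export
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    print(inventory(root))
```

Checking the marker rather than the file name is the point here, since the files in my export had no extension to go by.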
For the analysis, I decided to use Opus 4.8 (xhigh) within Claude Code so that it could run code and install packages. Before doing any work, I asked it to install any packages required for the analysis. Using Claude Code matters here because it lets the model do a significant amount of work on the task. This may seem obvious to coders, but the difference between Claude Code and the Claude.ai chat is huge, even though they both run on the same model.
Then it was time to start. Admitting that I knew nothing about MRIs, I asked Claude to think hard about a detailed plan and then execute it. The only context I gave was “right shoulder in pain for 2-3 weeks”, which I later realized was less than what the human doctor had received.

After about an hour, it came back with the report:
The striking problem with that report was that, while the doctor saw a grade III (greater than 50%) partial-thickness tear at the apical insertion, Opus 4.8 reported an intact tendon!
It was quite disappointing. I had expected it might grade the tear lower, but that conclusion was extreme.
Arbitrating between the two analyses
To settle it, I asked Claude to compare both reports. But this time, I gave it a little more context: in addition to the human report, I also provided a discussion I had had with ChatGPT 5.5 Pro, in which I described my activities and condition to help work out what my diagnosis was.

From the planning document, here’s the approach Opus took:

The approach was careful and systematic, using multiple sub-agents as a way to obtain fresh analyses that were not biased by the existing context.
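To make the idea of unbiased sub-agents concrete, here is a rough sketch of how the context could be split so that fresh readers never see the earlier reports while the arbitrator sees everything. This is purely my own illustration: the names (`CaseContext`, `blinded_brief`, `arbitrator_brief`) and the prompt wording are hypothetical, and I do not know how Claude Code actually builds the briefs for its sub-agents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseContext:
    clinical_history: str   # what the patient reported
    image_dir: str          # where the DICOM export lives
    prior_reports: tuple    # existing reads; these could bias a fresh reader


def blinded_brief(ctx: CaseContext, reader_id: int) -> str:
    """Build a brief for one independent reader, omitting every prior report."""
    return (
        f"Reader {reader_id}: review the MRI series in {ctx.image_dir}.\n"
        f"Clinical history: {ctx.clinical_history}\n"
        "Describe the subscapularis tendon and state your confidence."
    )


def arbitrator_brief(ctx: CaseContext, fresh_reads: list) -> str:
    """Only the arbitrator sees the prior reports alongside the fresh reads."""
    sections = [f"Prior report {i + 1}:\n{r}" for i, r in enumerate(ctx.prior_reports)]
    sections += [f"Fresh read {i + 1}:\n{r}" for i, r in enumerate(fresh_reads)]
    return "\n\n".join(sections)
```

The design choice worth noting is that `blinded_brief` never touches `prior_reports`, so a fresh reader cannot anchor on either earlier conclusion.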
Then, after about an hour, I got a new report:
Its conclusion was:
Arbitrator’s Decision: The evidence favors Reader A (medium to high confidence). Mild insertional tendinosis; no isolated partial- or full-thickness tears, including apical involvement, were identified.
Extract from PDF of arbitration
I can’t help but find it interesting that the two conclusions are so far apart. Looking further into the report, I can see that Opus was not afraid to say that there were some conflicts between the two reports that it could not resolve, and yet it did resolve this one, and very decisively.
Where does that leave me?
It is incredibly reassuring to be in the hands of an expert you trust. You don’t have to worry, and you can let them guide you through the process.
AI may completely shatter that feeling in an uncomfortable way: after receiving this AI-powered second opinion, the diagnosis and treatment plan seem premature and more intervention-heavy than the facts warrant… but I don’t know if I can completely trust AI. So I am left in a dilemma where I either try my luck with another doctor or wait and see if the rehab I am doing gets my shoulder better.
My hope is that in a few model generations, we will trust AI to review MRIs the same way we trust it to proofread our emails.
I am not naming the clinic or the doctor, as that is not the point of the article. It’s about sharing my technical curiosity about using AI to get second opinions. I could be wrong, or the AI could be wrong. I could also have misunderstood the doctor. So basically, none of this should be taken as medical advice 🙂