What AI Benchmarks Miss About Real-world Performance

Presented by F5

Enterprise AI teams have spent years solving for compute: securing GPU allocations, negotiating cloud capacity, and benchmarking training throughput. The underlying assumption in that work is that the path between storage and compute will hold up. In production, that assumption increasingly fails. Real traffic introduces latency spikes, network jitter, and node degradation that controlled benchmarks fail to capture, resulting in pipelines that perform well in the lab but stall in deployment. A growing response is AI data delivery: deploying an application delivery controller (ADC) or application delivery and security platform (ADSP) in front of storage as a flexible and secure control point.

"Provisioning addresses capacity but not delivery, and this is where the bottleneck now lies," says Hunter Smit, senior manager of product marketing at F5. "Enterprises buy enough GPUs and enough storage, then assume the path between them will hold up, but AI traffic is intense, highly concurrent, and random in its reads, which normal storage networking was never designed to absorb."

Production differences do not appear in the benchmarks

The standard benchmark methodology exacerbates the problem, says Paul Pindell, principal solutions architect for the technology alliance at F5.

"Benchmark testing is generally designed to produce the best possible performance or security results, not the most realistic results," he says. "With S3, latency is a known factor for performance degradation, so persistent latency would have to be introduced into the path for meaningful testing."

Most benchmark environments never do this, meaning that the performance numbers enterprises rely on for infrastructure decisions are derived from conditions that production systems will never be able to replicate. To test this, F5 and MinIO conducted throughput testing under degraded network conditions.

"What stood out was how quickly S3 throughput goes away once you introduce latency," Pindell says. "Even modest latency takes a real bite, and as latency moves toward longer distances, the degradation becomes severe."

Testing also revealed that latency matters far more than jitter as a driver of throughput loss, which ran counter to the team’s expectation. The consequence for enterprise architects is that S3 object storage deployments cannot be designed around clean-room assumptions; they must be engineered for the poor network conditions they will actually encounter.
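
Why latency bites so quickly can be illustrated with a first-order model. The sketch below is not the F5 and MinIO test methodology and does not reproduce their results; it assumes each object GET costs one round trip plus the transfer time on the link, and that a client keeps a fixed number of requests in flight. The function name, parameters, and sample values are all hypothetical.

```python
def effective_throughput(object_mb: float, rtt_ms: float,
                         link_mb_per_s: float, concurrency: int = 1) -> float:
    """Estimate aggregate throughput (MB/s) for a stream of object GETs.

    Assumed model (not from the article): every request pays one round
    trip of latency plus object_mb / link_mb_per_s of transfer time, and
    `concurrency` requests are kept in flight at once. The result is
    capped at the raw link rate.
    """
    if object_mb <= 0 or link_mb_per_s <= 0 or rtt_ms < 0 or concurrency < 1:
        raise ValueError("sizes and rates must be positive, rtt_ms non-negative")
    transfer_s = object_mb / link_mb_per_s
    per_request_s = rtt_ms / 1000.0 + transfer_s
    return min(concurrency * object_mb / per_request_s, link_mb_per_s)


if __name__ == "__main__":
    # Hypothetical workload: 1 MB objects over a 1000 MB/s link.
    for rtt in (0, 1, 10, 50):
        mbps = effective_throughput(object_mb=1, rtt_ms=rtt, link_mb_per_s=1000)
        print(f"rtt={rtt:>3} ms -> {mbps:8.1f} MB/s")
```

In this model, a single request stream of 1 MB objects on a 1000 MB/s link falls to roughly 91 MB/s at 10 ms of round-trip latency, because the wait for each response dominates the transfer itself. Raising `concurrency` recovers some of the loss until the link cap is reached, which is one reason a benchmark run with no added latency says little about a production path.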

Cost of fragile data paths

"In AI infrastructure, people naturally focus on GPUs because they are the most visible and expensive resources," says Tanu Mutreja, senior director of product management at F5. "But in a production environment, GPUs only generate as much value as the data path that feeds them."

That path passes through the storage, networking, database, security, and orchestration layers, often stitched together from multiple vendors. Customers don’t experience any of these seams; they experience the output of the entire system.

When the data path degrades, the effects compound. Low GPU utilization is the most immediate and visible symptom, but Mutreja points to a broader set of consequences: poor inference performance, poor-quality AI output, high egress costs from unnecessary data replication, and increasing operational complexity.

"At scale, data-path efficiency becomes a strategic business lever rather than a technical optimization," she says. "When data paths are well engineered, GPUs remain productive, AI applications remain responsive and reliable, operations become more efficient, and organizations maximize returns on their AI investments."

AI workloads are structurally more vulnerable to these failures than traditional enterprise applications. Databases, ERP systems, and web services absorb transient storage delays through caching and buffering. AI workloads running in massively parallel GPU clusters have no equivalent protection. As Mutreja notes, even minor latency spikes or bandwidth bottlenecks can propagate across large GPU clusters, impacting utilization, training efficiency, and customer experience.
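
One way to see how a minor spike propagates is a simple probability sketch. It rests on two assumptions the article does not state: the job is synchronous, so each step waits for the slowest worker, and slow reads hit workers independently with the same per-step probability. The function name and the sample numbers are hypothetical.

```python
def p_step_delayed(p_spike: float, n_workers: int) -> float:
    """Probability that at least one of n_workers sees a slow read in a step.

    Assumes independent workers and a synchronous step that cannot finish
    until every worker has its data (assumptions, not from the article).
    """
    if not 0.0 <= p_spike <= 1.0:
        raise ValueError("p_spike must be between 0 and 1")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return 1.0 - (1.0 - p_spike) ** n_workers


if __name__ == "__main__":
    # Hypothetical: each worker hits a slow read in 0.1% of steps.
    for n in (1, 8, 64, 512, 1000):
        print(f"{n:>5} workers -> {p_step_delayed(0.001, n):.1%} of steps delayed")
```

Under these assumptions, a spike that affects a single worker in only 0.1% of steps delays roughly 63% of steps once 1,000 workers must all finish together, which is why a delay that a buffered web service would absorb can show up as idle GPUs.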

Treating storage edge as control point

For decades, storage and intelligence operated as sequential concerns in enterprise architecture: data was stored first, then analyzed downstream. Mutreja argues that this model no longer fits the demands of AI.

"Competitive advantage is determined not only by the quantity of data, but also by the relevance, lineage, security, and executable delivery of the data," she says. "Across the industry, from NVIDIA and AWS to enterprise storage providers, the movement is toward embedding intelligence directly into the data infrastructure rather than stacking it on top."

F5’s integration with MinIO takes this approach to the layer where storage and compute actually interact. As part of the F5 ADSP, BIG-IP sits in the data path, continuously monitoring the health of MinIO’s distributed storage nodes and directing requests only to those that remain available.

The operational impact of that capability becomes apparent when nodes go down, which is expected in distributed storage clusters. Without intelligent routing, clients that land on an unhealthy node must retry and may land on another unhealthy node, causing overall performance degradation.
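
The idea of health-aware routing can be sketched in a few lines. This is not BIG-IP configuration and not F5's algorithm; it is a minimal illustration that assumes each node's health and in-flight request count are already known, and the `Node` class and `pick_node` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one storage node as seen by a proxy."""
    name: str
    healthy: bool = True
    active_requests: int = 0


def pick_node(nodes: list[Node]) -> Node:
    """Return the healthy node with the fewest in-flight requests.

    Unhealthy nodes are never candidates, so a client is not sent to a
    node that would force a retry.
    """
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        raise RuntimeError("no healthy storage nodes available")
    return min(candidates, key=lambda n: n.active_requests)


if __name__ == "__main__":
    cluster = [
        Node("node-a", healthy=False, active_requests=0),
        Node("node-b", healthy=True, active_requests=5),
        Node("node-c", healthy=True, active_requests=2),
    ]
    print(pick_node(cluster).name)
```

The idle but unhealthy node is skipped even though it has the fewest requests, so the health filter has to run before the load comparison rather than after it.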

"F5 ensures that traffic only goes to the healthy nodes, or even the least busy nodes, so S3 client traffic is always processed in the most efficient way," Pindell says.

Governance in distributed environments

The challenge increases at scale when AI pipelines span multiple locations, clouds, or edge environments.

"Once an AI pipeline crosses locations and clouds, the question ceases to be about performance and becomes about control," Smit says. "You’re operating under different rules in each jurisdiction, and digital sovereignty is now a design constraint. Where your data is allowed to live, who is allowed to touch it, and what boundaries it cannot cross shape the architecture now, before anyone talks about speed."

This pressure is driving a visible trend of enterprises moving AI workloads from the public cloud back to infrastructure they own and directly govern. The architecture described by Smit addresses this by separating applications from any single storage location and placing a unified control point between them that enforces consistent policy across all of them.

"If you manage one area at a time, sovereignty, flexibility and cost do not become a trade-off," he explains. "They become a capability that you run as a system."

Storage-to-compute path as a managed control point

Smit says that to solve these issues, enterprise teams need to stop treating the storage-to-compute path as a direct connection and start treating it as a managed control point. SecureIQLab’s independent validation of F5 BIG-IP in storage deployments has confirmed that the approach provides flexibility without sacrificing throughput.

"Insert a full-proxy ADC between the two, and the path becomes observable, programmable, and failure-aware, with health-based routing, quality of service, and security implemented inline," he explains. "That single step transforms data delivery from an assumption into an engineered discipline that keeps the GPU fed when conditions get rough."

Sponsored articles are content produced by a company that is either paying for the post or that has a business relationship with VentureBeat, and they are always clearly marked. For more information, contact sales@venturebeat.com.
