Platform Engineer - Benchmark Lead at ARC Prize Foundation

ARC Prize Foundation is looking for a senior engineer to own and develop the platform behind the ARC-AGI series of benchmarks. This person will act as the technical owner and architect of our benchmark infrastructure, from stabilizing the current system to laying the foundation for future versions. This is a remote, full-time role.

What you would do:

Stabilize and expand the V3 backend and infrastructure – own performance to keep the current benchmark platform reliable
Build validation and testing layers – systems for automated model runs, scoring, reproducible eval pipelines, and capturing and querying data exhaust so the team can perform in-depth model analysis
Support the initial ARC-AGI-4 implementation by building the necessary backend and platform pieces for new environments, human data collection, scoring, and deployment
Lay the initial technical groundwork for ARC-AGI-5

What we are looking for:

Strong backend engineering skills in Python, plus experience with distributed systems, SQL, cloud infrastructure, and production reliability
Experience building evaluation harnesses, testing pipelines, and experiment/data logging and analysis workflows – ideally for AI/ML systems or other high-volume technology platforms
Senior enough to act as the technical owner and architect of the benchmark platform (we are a high-agency team)
