One engineer built a production SaaS product in an hour: here's the governance system that made it possible

Every engineering leader watching the agentic coding wave will ultimately be faced with the same question: If AI can generate production-quality code faster than any team, what does governance look like when humans are no longer writing code?

Most teams don’t have a good answer yet. Treasure Data, the SoftBank-backed customer data platform serving more than 450 global brands, now has one, though they learned some parts of it the hard way.

The company today officially announced Treasure Code, a new AI-native command-line interface that lets data engineers and platform teams operate their full CDP through natural language, with Claude Code underneath handling generation and iteration. It was built by a single engineer.

The company says the coding took about 60 minutes. But that number is almost beside the point. The more important story is what had to be true before those 60 minutes were possible, and what broke afterward.

"From a planning perspective, we still had to plan to de-risk the business, and that took a few weeks," Rafa Flores, chief product officer at Treasure Data, told VentureBeat. "From an idea and implementation standpoint, this is where you just blend the two and you just go, go, go. And it’s not just prototyping, it’s getting things into production safely."

Build the governance layer first

Before a single line of code was written, Treasure Data had to answer a difficult question: What does the system need to be restricted from doing, and how do you enforce that at the platform level rather than expecting the code to respect it?

The guardrails Treasure Data built live upstream of the code. When any user connects to a CDP via Treasure Code, access control and permission management are inherited directly from the platform. Users can only access resources to which they already have permission. PII cannot be disclosed. API keys cannot be exposed. The system cannot make defamatory remarks about any brand or competitor.
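The key design decision here is that the permission check sits upstream of whatever the AI generates. A minimal sketch of that pattern, with purely illustrative names (this is not Treasure Data's actual API):

```python
# Sketch of an upstream guardrail: any action proposed by AI-generated
# code is checked against the user's existing platform permissions
# before it executes. All class and method names are hypothetical.

class PermissionDeniedError(Exception):
    """Raised when a proposed action exceeds the user's platform grants."""
    pass


class Guardrail:
    def __init__(self, platform_permissions):
        # platform_permissions: {user_id: set of resource ids already granted}
        self.permissions = platform_permissions

    def authorize(self, user_id, resource_id):
        # The check lives outside the generated code: if the platform has
        # not already granted access, the action never runs at all.
        if resource_id not in self.permissions.get(user_id, set()):
            raise PermissionDeniedError(
                f"{user_id} is not authorized for {resource_id}")
        return True


guardrail = Guardrail({"analyst_1": {"segments", "reports"}})
guardrail.authorize("analyst_1", "segments")  # already granted, so permitted
try:
    guardrail.authorize("analyst_1", "raw_pii_table")  # blocked upstream
except PermissionDeniedError:
    pass
```

Because the generated code never holds broader credentials than the user, a prompt that asks for out-of-scope data fails at the platform boundary rather than relying on the code to behave.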

"We had to involve the CISO. I was involved. Our CTO, our head of engineering, just to make sure this thing doesn’t just break down," Flores said.

This foundation made the next step possible: allowing AI to generate 100% of the codebase, with a three-tier quality pipeline enforcing production standards.

A three-tier pipeline for AI code generation

The first tier is an AI-based code reviewer, itself built on Claude Code. The reviewer sits at the pull request stage and runs a structured review checklist against each proposed merge, checking architectural alignment, security compliance, proper error handling, test coverage, and documentation quality. It can merge automatically when all criteria are met. When they aren't, it flags the change for human intervention.

The fact that Treasure Data built the code reviewer with Claude Code is not accidental. It means the tool validating AI-generated code was itself AI-generated, a proof point that the workflow is self-reinforcing rather than relying on a separate human-written quality layer.

The second tier is a standard CI/CD pipeline that runs automated unit, integration and end-to-end testing, static analysis, linting, and security checks against every change. The third is human review, required where automated systems flag risks or enterprise policy demands sign-off.
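The three tiers compose into a single gate: a change ships automatically only when every automated check passes, and everything else escalates to a human. A minimal sketch of that decision logic, with assumed check names rather than Treasure Data's actual pipeline:

```python
# Illustrative three-tier gate. Checklist items and pull-request fields
# are assumptions for the sketch, not a real pipeline's configuration.

REVIEW_CHECKLIST = [
    "architectural_alignment",
    "security_compliance",
    "error_handling",
    "test_coverage",
    "documentation",
]


def ai_review(pr):
    """Tier 1: structured checklist review; passes only if every item holds."""
    return all(pr.get(item, False) for item in REVIEW_CHECKLIST)


def ci_pipeline(pr):
    """Tier 2: automated tests, static analysis, linting, security scans."""
    return pr.get("tests_pass", False) and pr.get("lint_pass", False)


def gate(pr):
    """Decide the pipeline outcome for a proposed merge."""
    if ai_review(pr) and ci_pipeline(pr):
        return "auto-merge"
    # Tier 3: anything flagged goes to a person.
    return "human-review"  # AI writes code, but AI does not ship code
```

The point of the structure is that the human tier is a backstop, not a bottleneck: reviewers only see changes the automated tiers could not clear.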

The internal principle Treasure Data works under: AI writes code, but AI does not ship code.

Why not just point Cursor at the database?

The obvious question for any engineering team: why not point an existing tool like Cursor at their data platform, or expose it as an MCP server and let Claude Code query it directly?

Flores argued that the difference is the depth of governance. A simple connection gives you natural language access to the data, but doesn’t inherit any of the platform’s existing permissions structures, meaning each query runs with whatever access the API key allows.

Treasure Code inherits the full access control and permissions layer of Treasure Data, so what a user can do through natural language is limited by what they are already authorized to do in the platform.

The second difference is orchestration. Because Treasure Code connects directly to Treasure Data’s AI Agent Foundry, it can coordinate sub-agents and skills across the platform rather than treating them as separate tasks: the difference between asking an AI to run an analysis and orchestrating that analysis across omnichannel activation, segmentation, and reporting.

What broke anyway

Even with the governance structure in place, the launch didn’t happen cleanly, and Flores was clear about that.

Treasure Data initially made Treasure Code available to customers without a go-to-market plan. The assumption was that it would stay quiet until the team took the next steps. Customers found it anyway: through purely organic discovery, it picked up more than 100 customers and nearly 1,000 users within two weeks.

"We did not put any go-to-market proposition behind it. We didn’t think people would find it. Well, they did," Flores said. "We were struggling with how do we actually take the steps to go to market? Do we even do beta, since technically it’s live?"

Unplanned adoption also created a compliance gap. Treasure Data is still in the process of formally certifying Treasure Code under its Trust AI compliance program; that certification was not completed before the product reached customers.

The second problem arose when Treasure Data opened skills development to non-engineering teams. CSMs and account directors began creating and submitting skills without understanding what would be approved and merged, creating significant wasted effort and a backlog of submissions that could not be cleared under the repository’s access policies.

Enterprise validation, and what’s still missing

Thomson Reuters is among the early adopters. Flores said the company had been trying to build an in-house AI agent platform and was struggling to move fast enough. It connected to Treasure Data’s AI Agent Foundry to accelerate audience segmentation work, then extended into Treasure Code to adapt and iterate more quickly.

Feedback focused on scalability and flexibility, Flores said, and the fact that the purchase had already taken place removed a significant enterprise barrier to adoption.

The gap that Thomson Reuters identified, and Flores acknowledged the product doesn’t yet address, is guidance on AI maturity. Treasure Code does not tell users who should use it, what to do first, or how to accommodate different skill levels within an organization.

"AI that allows you to take advantage, but also tells you how to take advantage of it, I think is very different," Flores said. He sees this as the next meaningful layer to build.

What should engineering leaders learn from this?

Flores had time to reflect on what the experience had really taught him, and he was direct about what he would change. Next time, he said, the release will be internal first.

"We will release it internally only. I will not release this to anyone outside the organization," he said. "This will be a more controlled release so we know exactly what we are facing with less risk."

On skills development, the lesson was to establish clear criteria for what gets approved and merged before opening the process to teams outside engineering, not after.

The common thread in both lessons is the same one that shaped the governance architecture and the three-tier pipeline: speed is only an advantage if the structure around it holds up. For engineering leaders evaluating whether agentic coding is ready for production, Treasure Data’s experience translates into three practical takeaways.

  1. Governance infrastructure should precede the code, not follow it. Platform-level access controls and permission inheritance made it safe to let AI generate code independently. Without that foundation, the speed advantage disappears because every output requires detailed manual review.

  2. A quality gate that doesn’t rely entirely on humans is not optional at scale. AI can review every pull request continuously, without fatigue, and systematically check policy compliance across the entire codebase. Human review remains necessary, but as a final check rather than the primary quality mechanism.

  3. Plan for organic adoption. If the product works, people will find it before you’re ready. The compliance and go-to-market gap Treasure Data is still closing is a direct result of underestimating that.

"Yes, vibe coding can work if done safely and with proper guardrails in place," Flores said. "Approach it this way not to replace the good work you do, but to replace the hard work that you can possibly automate."


