Palona goes vertical, launching Vision, Workflow features: 4 key lessons for AI builders

Building an enterprise AI company on a foundation of shifting sand: that's the central challenge for founders today, according to Palona AI's leadership.

The Palo Alto-based startup, led by former Google and Meta engineering veterans, is making a decisive vertical push into the restaurant and hospitality sector with today's launch of Palona Vision and Palona Workflow.

The new offerings transform the company’s multimodal agent suite into a real-time operating system for restaurant operations – incorporating cameras, calls, conversations and coordinated task execution.

The news marks a strategic pivot from the company’s beginnings in early 2025, when it first emerged with $10 million in seed funding to build emotionally intelligent sales agents for broader direct-to-consumer enterprises.

Now, by narrowing its focus to a "multimodal native" approach for restaurants, Palona is providing a blueprint for AI builders on how to move beyond "thin wrappers" and build deep systems that solve high-stakes physical world problems.

“You’re building a company on top of a foundation that is sand – not quicksand, but shifting sand,” co-founder and CTO Tim Howes said, referring to the volatility of today’s LLM ecosystem. “So we built an orchestration layer that lets us change models based on performance, throughput, and cost.”

VentureBeat recently spoke in person with Howes and co-founder and CEO Maria Zhang at (where else?) a restaurant in NYC about the technical challenges and hard lessons learned from the company's launch, growth, and pivot.

New Offering: Vision and Workflow as a ‘Digital GM’

For the end user – the restaurant owner or operator – Palona's latest release is designed to act as an automated "best operations manager" that never sleeps.

Palona Vision uses in-store security cameras to analyze operational signals – such as queue length, table turnover, prep interruptions, and cleanliness – without the need for any new hardware.

It monitors front-of-house metrics such as queue length, table turns and cleanliness, and it identifies back-of-house issues such as preparation slowdowns or station setup errors.

Palona Workflow complements this by automating multi-step operational processes. These include managing catering orders, opening and closing checklists, and food prep fulfillment. By correlating video signals from Vision with point-of-sale (POS) data and staffing levels, Workflow ensures consistent execution across multiple locations.

“Palona Vision is like giving every location a digital GM,” Shaz Khan, founder of Tono’s Pizzeria + Cheesesteaks, said in a press release provided to VentureBeat. “It flags issues before they escalate and saves me hours every week.”

Going Vertical: Lessons in Domain Expertise

Palona’s journey began with a star-studded roster. CEO Zhang previously served as VP of Engineering at Google and CTO of Tinder, while co-founder Howes is the co-inventor of LDAP and a former Netscape CTO.

Despite this pedigree, the team’s first year was a lesson in the need for focus.

Initially, Palona served fashion and electronics brands, creating "magician" and "surfer dude" personalities to handle sales. However, the team quickly realized that the restaurant industry presented a unique, trillion-dollar opportunity that was "amazingly recession-proof" but "stunned" by operational inefficiencies.

"Advice to startup founders: Don’t go multi-industry," Zhang warned.

By verticalizing, Palona moved beyond being a "thin" chat layer to building a "multi-sensory information pipeline" that processes vision, voice and text simultaneously.

That clarity of focus opened up access to proprietary training data (such as prep playbooks and call transcripts) while avoiding generic data scraping.

1. Building on ‘shifting sand’

To accommodate the reality of enterprise AI deployments in 2025 – with new, improved models arriving on an almost weekly basis – Palona developed a patent-pending orchestration layer.

Instead of being "bundled" with a single provider like OpenAI or Google, Palona’s architecture allows the company to change models on a dime based on performance and cost.

They use a mix of proprietary and open-source models, including Gemini for computer vision benchmarks and specific language models for Spanish or Chinese fluency.

For builders, the message is clear: never let the core value of your product depend on a single vendor.
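To make the idea concrete, here is a minimal sketch of what a model-routing layer could look like. It is not Palona's patent-pending design, whose internals the company has not published; the model names, scores, and weights below are all hypothetical placeholders. The sketch only illustrates the principle of choosing a model per task from a weighted trade-off of performance, throughput, and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical, normalized stats for one candidate model (all values 0..1)."""
    name: str
    quality: float     # benchmark performance, higher is better
    throughput: float  # relative speed, higher is better
    cost: float        # relative price, lower is better


def pick_model(candidates, w_quality=0.5, w_throughput=0.2, w_cost=0.3):
    """Return the candidate with the best weighted score.

    The weights are illustrative assumptions; a real system would tune them
    per task and refresh the profiles as new models are released.
    """
    if not candidates:
        raise ValueError("no candidate models registered")

    def score(m):
        return w_quality * m.quality + w_throughput * m.throughput - w_cost * m.cost

    return max(candidates, key=score)


# Hypothetical registry: one candidate list per task, so swapping a vendor
# means editing data rather than rewriting application code.
registry = {
    "vision": [
        ModelProfile("vision-model-a", quality=0.90, throughput=0.60, cost=0.70),
        ModelProfile("vision-model-b", quality=0.80, throughput=0.90, cost=0.30),
    ],
}

balanced = pick_model(registry["vision"])
quality_only = pick_model(registry["vision"], w_quality=1.0, w_throughput=0.0, w_cost=0.0)
```

With the default weights the cheaper, faster candidate wins; weighting quality alone selects the other one. The application code calling `pick_model` does not change in either case, which is the point of keeping vendors behind a routing layer.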

2. From words to ‘world models’

The launch of Palona Vision represents a shift from understanding words to understanding the physical reality of the kitchen.

While many developers struggle to tie together disparate APIs, Palona’s new vision model turns existing in-store cameras into operational assistants.

The system recognizes "cause and effect" in real time, detecting whether a pizza is undercooked from its "light beige" color or alerting a manager if a display case is empty.

"In words, physics doesn’t matter," Zhang explained. "But in reality, I drop the phone, it always goes down… We really want to find out what’s going on in this restaurant world."

3. ‘Muffin’ Solution: Custom Memory Architecture

One of the most significant technical hurdles facing Palona was memory management. In a restaurant context, memory is the difference between a disappointing conversation and a "magical" one in which the agent remembers the diner’s "usual" order.

The team initially used an unspecified open-source tool, but found that it produced errors 30% of the time. "I think consultant developers always turn off memory [on consumer AI products], because that’s guaranteed to mess everything up," Zhang warned.

To solve this, Palona created Muffin, a proprietary memory management system named as a nod to web "cookies." Unlike standard vector-based approaches that struggle with structured data, Muffin is designed to handle four distinct layers:

  • Structured data: Static facts such as delivery addresses or allergy information.

  • Slow-changing dimensions: Loyalty preferences and favorite items.

  • Transient and seasonal memories: Adapting to changes such as preferring cold drinks in July versus hot cocoa in winter.

  • Regional context: Defaults such as time zone or language preferences.

Lesson for builders: If the best available tool isn’t good enough for your specific vertical, you should be prepared to build your own.
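Muffin's internals are proprietary, so the following is only a rough sketch of how a four-layer memory like the one described above might be organized. The class name, field names, and lookup order are assumptions made for illustration: explicit structured facts win, then the current month's seasonal preference, then long-term preferences, then regional defaults.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class GuestMemory:
    """Hypothetical four-layer memory; not Palona's actual Muffin schema."""
    structured: dict = field(default_factory=dict)     # static facts (address, allergies)
    slow_changing: dict = field(default_factory=dict)  # loyalty preferences, favorites
    seasonal: dict = field(default_factory=dict)       # month number -> preferences
    regional: dict = field(default_factory=dict)       # defaults (time zone, language)

    def resolve(self, key, today):
        """Return the first value found, checking the most specific layer first."""
        layers = (
            self.structured,
            self.seasonal.get(today.month, {}),
            self.slow_changing,
            self.regional,
        )
        for layer in layers:
            if key in layer:
                return layer[key]
        return None


guest = GuestMemory(
    structured={"allergy": "peanuts"},
    slow_changing={"drink": "iced tea"},
    seasonal={7: {"drink": "cold brew"}, 12: {"drink": "hot cocoa"}},
    regional={"language": "en", "drink": "water"},
)

july_drink = guest.resolve("drink", date(2025, 7, 4))
december_drink = guest.resolve("drink", date(2025, 12, 20))
march_drink = guest.resolve("drink", date(2025, 3, 1))
```

Keeping structured facts in plain key-value form, rather than only in a vector index, means an allergy lookup is an exact match instead of a similarity search, which is the kind of failure mode the article attributes to vector-only approaches.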

4. Reliability through ‘GRACE’

In the kitchen, an AI error isn’t just a typo; it’s a wasted order or a safety risk. A recent incident at Stefanina’s Pizzeria in Missouri, where an AI conjured up fake deals during the dinner rush, highlights how quickly brand trust can erode when safeguards are absent.

To prevent such chaos, Palona’s engineers follow the company’s internal GRACE framework:

  • Guardrails: Strict limits on an agent’s behavior to prevent unapproved promotions.

  • Red Teaming: Proactive efforts to "break" the AI and identify potential hallucination triggers.

  • AppSec: Lock down APIs and third-party integrations with TLS, tokenization, and attack prevention systems.

  • Compliance: Grounding every response in verified, vetted menu data to ensure accuracy.

  • Escalation: Escalating complex conversations to a human manager before a guest gets misinformed.
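Three of these steps (guardrails, grounding in menu data, and escalation) can be sketched as a single check that runs before a drafted reply reaches a guest. The menu, promo codes, and function names below are hypothetical and are not taken from Palona's implementation; the sketch only shows the general shape of such a gate.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical verified data; a real deployment would load this from the
# restaurant's menu and promotions systems.
MENU = {"margherita pizza": 14.00, "cheesesteak": 12.50}
APPROVED_PROMOS = {"LUNCH10"}


@dataclass
class Draft:
    """A reply the agent wants to send: one item, its quoted price, an optional promo."""
    item: str
    quoted_price: float
    promo: Optional[str] = None


def review(draft):
    """Return ("send", "ok") or ("escalate", reason) for a drafted reply."""
    # Guardrail: never offer a promotion that has not been approved.
    if draft.promo is not None and draft.promo not in APPROVED_PROMOS:
        return ("escalate", "unapproved promotion")
    # Grounding: the item and price must match verified menu data.
    price = MENU.get(draft.item.lower())
    if price is None:
        return ("escalate", "item not on verified menu")
    if abs(price - draft.quoted_price) > 0.005:
        return ("escalate", "price does not match menu")
    return ("send", "ok")
```

Anything that fails a check is routed to a human instead of being sent, so an invented discount surfaces as an escalation rather than as a promise to a guest.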

This reliability has been verified through large-scale simulations. "We simulated millions of ways to order pizza," Zhang said, using one AI to act as a customer and another to take orders, and measuring accuracy to eliminate hallucinations.
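The customer-versus-order-taker setup can be illustrated with a small harness. In the sketch below both "agents" are simple deterministic stand-ins rather than language models, and every name is hypothetical; the part that would carry over to a real evaluation is the loop that generates an order with known ground truth, runs the order-taker, and records mismatches.

```python
import random

SIZES = ["small", "medium", "large"]
TOPPINGS = ["pepperoni", "mushroom", "olive"]
TEMPLATES = [
    "I'd like a {size} {topping} pizza",
    "can I get one {topping} pizza, {size} please",
    "{size} pie with {topping}",
]


def simulated_customer(rng):
    """Stand-in for the customer-side model: returns (utterance, ground truth)."""
    size = rng.choice(SIZES)
    topping = rng.choice(TOPPINGS)
    text = rng.choice(TEMPLATES).format(size=size, topping=topping)
    return text, {"size": size, "topping": topping}


def order_taker(utterance):
    """Stand-in for the order-taking agent: naive keyword extraction."""
    words = utterance.lower().replace(",", " ").split()
    size = next((s for s in SIZES if s in words), None)
    topping = next((t for t in TOPPINGS if t in words), None)
    return {"size": size, "topping": topping}


def run_simulation(n, seed=0):
    """Run n simulated orders; return (accuracy, list of failed utterances)."""
    if n <= 0:
        raise ValueError("n must be positive")
    rng = random.Random(seed)
    failures = []
    for _ in range(n):
        utterance, truth = simulated_customer(rng)
        if order_taker(utterance) != truth:
            failures.append(utterance)
    return 1 - len(failures) / n, failures


accuracy, failures = run_simulation(1000)
```

In a real run the failed utterances, not the headline accuracy figure, are the useful output: each one is a concrete phrasing that the order-taker misread and that can be turned into a regression case.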

The Bottom Line

With the launch of Vision and Workflow, Palona is betting that the future of enterprise AI lies not in broad assistants, but in specialized "operating systems" that can see, hear and think within a specific domain.

Unlike general-purpose AI agents, Palona’s system is designed to execute restaurant workflows, not just answer questions. It can remember customers, hear them order their "usual," and monitor restaurant operations to ensure that food reaches that customer according to the restaurant's internal procedures and guidelines, flagging whenever something has gone wrong or, critically, is about to go wrong.

For Zhang, the goal is to let human operators focus on their craft: "If you’ve fallen in love with that delicious food… we’ll tell you what to do."


