
San Francisco-based AI lab Arcee made headlines last year as one of the only US companies to train large language models (LLMs) at scale and release them to the public under an open or partially open source license – enabling developers, solo entrepreneurs, and even medium to large enterprises to use powerful AI models for free and customize them as they wish.
Now Arcee is back this week with the release of its largest, highest-performing open language model to date: Trinity Large, a 400 billion parameter mixture-of-experts (MoE) model, now available in preview.
Alongside the flagship release, Arcee is shipping a raw checkpoint model, Trinity-Large-TrueBase, which allows researchers to study what a 400B sparse MoE learns from raw data alone, before instruction tuning and reinforcement learning are applied.
By providing a clean slate at the 10-trillion-token mark, Arcee enables AI builders in highly regulated industries to conduct authenticity audits and perform their own specialized alignment without inheriting the "black box" biases or formatting quirks of general-purpose chat models. This transparency allows a deeper understanding of the gap between a model's internal reasoning capabilities and the helpful behaviors instilled during the final post-training phase.
This launch comes as powerful Chinese open-source LLMs from the likes of Alibaba (Qwen), Z.AI (Zhipu), DeepSeek, Moonshot, and Baidu enter the market, effectively leading the category with high-efficiency architectures.
Trinity Large also arrives as Meta has significantly retreated from the frontier open-source landscape. That retreat followed the introduction of Llama 4 in April 2025, which received a mixed reception; former Meta AI researcher Yann LeCun later acknowledged that the company had used specially tuned versions of the model to boost scores on third-party benchmarks.
Amid this domestic void, only OpenAI – with its gpt-oss family released in the summer of 2025 – and Arcee are currently shipping new US-built open-source models trained entirely from scratch.
as sparse as they come
Trinity Large is notable for the extreme sparsity of its mixture-of-experts system. In a MoE architecture, "sparsity" refers to the model's ability to selectively activate only a small fraction of its total parameters for any given task.
While Trinity Large has 400B total parameters, only about 13B – routed through just 1.56% of its experts – are active at any time.
This architectural choice is important because it lets the model retain the knowledge capacity of a much larger system while running with the speed and operational efficiency of a smaller one – roughly 2-3 times faster than its peers on similar hardware.
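Top-k expert routing of this kind can be illustrated with a minimal sketch. This is not Arcee's implementation – the router logits here are random numbers standing in for a learned gating network – but it shows how each token ends up touching only a handful of experts:

```python
import numpy as np

def route_tokens(router_logits: np.ndarray, k: int = 4):
    """Pick the top-k experts per token and normalize their gate weights.

    router_logits: (num_tokens, num_experts) scores from the router.
    Returns (expert_indices, gate_weights), each shaped (num_tokens, k).
    """
    # Indices of the k highest-scoring experts for each token
    top_k = np.argsort(router_logits, axis=-1)[:, -k:]
    top_logits = np.take_along_axis(router_logits, top_k, axis=-1)
    # Softmax over only the selected experts, so their weights sum to 1
    exp = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    gates = exp / exp.sum(axis=-1, keepdims=True)
    return top_k, gates

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 256))        # 8 tokens, 256 experts
experts, gates = route_tokens(logits, k=4)  # each token uses only 4 of 256 experts
```

At a 4-of-256 ratio, the other 252 experts contribute zero compute for that token – which is where the large gap between total and active parameters comes from.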
sovereignty and the "TrueBase" checkpoint
The most significant contribution of this release to the research community is Trinity-Large-TrueBase—a raw, 10-trillion-token checkpoint.
Unlike almost every other "open" release, which arrives only after being shaped by instruction tuning and reinforcement learning, TrueBase provides a rare, untouched look at the model's fundamental intelligence.
In the race to make models helpful, most labs apply supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) before the weights are released. Although this makes the model a better conversationalist, it can obscure the underlying knowledge distribution.
TrueBase provides an "OG base model" that has not yet gone through learning-rate annealing or the stage-two and stage-three pre-training phases where instruction data is usually introduced.
For researchers and enterprises in highly regulated industries, starting with TrueBase allows authenticity audits and custom alignment. As Arcee CTO Lucas Atkins explained in a video call with VentureBeat: "What's interesting is that the checkpoint is already one of the best-performing base models in the world."
Technology: Engineering Through Constraints
The construction of Trinity Large was not the product of infinite resources but rather, says Atkins, of "engineering through constraints."
Trained for approximately $20 million in just 33 days, this model represents a masterclass in capital efficiency.
Arcee, a team of only 30 people, has operated on less than $50 million in total capital, making the $20 million training run a "bet-the-company" move.
"I have always believed that having a constraint, whether financial or personnel or anything else, is extremely important for creativity," Atkins explained. "When you have an unlimited budget, you naturally don't have to hack your way out of complex problems."
Architecture: 4 of 256 Sparsity and SMEBU
Trinity Large uses a 4-out-of-256 sparse MoE architecture, which means it activates only 4 of its 256 experts for each token.
This high degree of sparsity – the highest ever successfully trained – created significant stability challenges during pre-training.
To solve this, Arcee developed Soft-Clamped Momentum Expert Bias Updates (SMEBU). This mechanism ensures that experts specialize while load stays spread evenly across the general web corpus, preventing some experts from becoming runaway "winners" while others languish as untrained "dead weight."
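SMEBU's internals have not been published in detail, but load-balancing bias updates of this general family can be sketched as follows. Every name, formula, and constant here is a hypothetical illustration of the idea (a momentum-smoothed load error nudging per-expert router biases, softly clamped to stay bounded), not Arcee's actual mechanism:

```python
import numpy as np

def smebu_update(bias, load, target, momentum_buf,
                 beta=0.9, lr=0.01, clamp=1.0):
    """Hypothetical SMEBU-style bias update (illustrative only).

    bias:         per-expert bias added to router logits before top-k selection
    load:         fraction of tokens routed to each expert this step
    target:       desired uniform load, e.g. k / num_experts
    momentum_buf: running average of the load imbalance
    """
    # Momentum-smoothed load error: positive when an expert is over-used
    momentum_buf = beta * momentum_buf + (1 - beta) * (load - target)
    # Push over-used experts' bias down, under-used experts' bias up
    bias = bias - lr * momentum_buf
    # Soft clamp (tanh) keeps biases bounded so routing stays stable
    bias = clamp * np.tanh(bias / clamp)
    return bias, momentum_buf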
The speed of the training run was enabled by Arcee's early access to Nvidia's B300 (Blackwell) GPUs. These chips deliver nearly double the speed and a significant memory increase over the previous Hopper generation.
"Pre-training was 33 days," Atkins noted. "We could have done it on Hopper and it probably would have taken two to three months. And by that point, we're in a completely new model generation."
In partnership with DatologyAI, Arcee leveraged more than 8 trillion tokens of synthetic data. However, this was not the typical "imitation" synthetic data, where a small model learns to talk like a larger model.
Instead, the intent was to take raw web text – such as blogs or Wikipedia articles – and rewrite it to condense the information into a smaller number of total tokens. This process helped the model learn to reason over information rather than memorize exact token strings.
The architectural design also alternates local sliding-window and global attention layers in a 3:1 ratio. This hybrid approach keeps the model highly efficient in long-context scenarios. Trained at a 256k sequence length, Trinity Large natively supports 512k contexts, and evaluations show it remains performant even at a 1-million-token horizon.
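The local/global split can be pictured as two kinds of causal attention masks. This is a generic sliding-window sketch rather than Trinity's actual layer layout; the 16-token sequence and 4-token window are toy values chosen for illustration:

```python
import numpy as np

def attention_mask(seq_len: int, kind: str, window: int = 4) -> np.ndarray:
    """Causal attention mask for one layer: True where attention is allowed.

    'local' layers see only the last `window` tokens (sliding window);
    'global' layers see the full causal prefix.
    """
    q = np.arange(seq_len)[:, None]  # query positions (rows)
    k = np.arange(seq_len)[None, :]  # key positions (columns)
    causal = k <= q                  # no attending to future tokens
    if kind == "local":
        return causal & (q - k < window)
    return causal

# A 3:1 interleaving: three sliding-window layers per global layer
layers = ["local", "local", "local", "global"] * 2
masks = [attention_mask(16, kind, window=4) for kind in layers]
```

Because the local layers do O(seq_len x window) work instead of O(seq_len^2), interleaving them with occasional global layers keeps long-context cost low while still letting information propagate across the whole sequence.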
Technical Comparison: Trinity Large vs. GPT-OSS-120B
As an American alternative, Trinity Large can be compared to OpenAI’s gpt-oss-120b.
While both models use sparse architectures to achieve frontier-level performance under permissive licenses, they fill different operational roles.
While gpt-oss-120b currently holds the edge on certain reasoning and mathematics benchmarks, Trinity Large offers a significant advantage in inference capacity and raw parameter depth for complex, multi-step agentic workflows.
Sovereignty: filling the void
The release of Trinity Large is as much a geopolitical statement as a technological one. CEO Mark McQuade told VentureBeat in the same interview that the absence of frontier-level American open-source models forced a pivot in Arcee's strategy.
"This kind of changed when US-based or Western players stopped open sourcing these models," McQuade said. "We're relying on these models to go into organizations and take them forward… but Chinese labs have just started creating cutting-edge models and open sourcing them."
For McQuade, this created a dependency that American enterprises were becoming uncomfortable with. "Especially in our conversations with larger organizations, they were unable to use Chinese-based architectures," he explained. "We want to be that champion in America. [It] doesn’t really exist yet".
By releasing under the Apache 2.0 license, Arcee provides the gold-standard permissive framework that allows companies to fully "own" the model layer. This is important for industries like finance and defense, where using a model hosted by a third party or a restrictive cloud provider is a non-starter.
balancing intelligence with utility
Arcee is currently focused on an upcoming reasoning model, converting Trinity Large from a simple instruct model into a full reasoning model. The team is wrestling with the balance of "intelligence vs. utility" – attempting to build a model that excels on benchmarks without becoming "yappy" or inefficient in real production applications.
"We built Trinity so that you could own it," the team says, signaling a return to the core values of the American open-source movement. As the industry moves toward agentic workflows and large-scale context requirements, Trinity Large positions itself not as a "wrapper," but as a sovereign infrastructure layer that developers can ultimately control.