For makers and artists building autonomous systems or robotics, Decart’s latest release, Oasis 3, marks a significant shift: it is now possible to generate hours of photorealistic driving footage in real time via API. This capability allows creators to stress-test algorithms against rare, edge-case scenarios at a scale previously impossible, moving beyond static images into dynamic, interactive worlds.
The developer-first strategy
While the immediate target is the autonomous vehicle sector, Decart is positioning Oasis 3 as a foundational tool for a broader ecosystem. By opening API access from day one, the company aims to replicate the rapid expansion seen with language models, fostering a community of builders rather than just serving existing enterprise clients.
Dean Leitersdorf, co-founder and CEO of Decart, emphasised this vision: “It’s going to be the first usable world model that people can actually program on top of. I think there’s going to be an entire developer community that emerges on top of this.”
The new model builds on Lucy, Decart’s existing real-time video model, which already powers products for over 100,000 developers in e-commerce and live streaming. Priced at $0.02 per second for standard access, with custom enterprise rates available, Oasis 3 represents Decart’s formal entry into physical AI.
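To put the standard rate in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes billing is a flat $0.02 for each second of generated footage on a single stream; the article does not say how multi-camera output or enterprise plans are billed, and the function name is purely illustrative.

```python
PRICE_PER_SECOND_USD = 0.02  # standard rate quoted in the article


def generation_cost_usd(seconds: float,
                        price_per_second: float = PRICE_PER_SECOND_USD) -> float:
    """Estimate the cost of generating `seconds` of footage at a flat rate.

    Assumption: one stream, billed per second of output, no minimums or discounts.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds * price_per_second


print(f"1 minute: ${generation_cost_usd(60):.2f}")    # 1 minute: $1.20
print(f"1 hour:   ${generation_cost_usd(3600):.2f}")  # 1 hour:   $72.00
```

At that rate, an hour of generated driving footage would come to about $72 under these assumptions.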
Competing in a crowded arena
Decart is entering a field already populated by major players. Last year, Google released Genie 3 in research preview, while Fei-Fei Li’s World Labs launched Marble for commercial use. Meanwhile, video generation specialists like Luma and Runway are adapting their physics-aware models into world models.
The timing also coincides with Decart’s massive recent funding. Two years ago, the startup raised $300 million, boosting its valuation to nearly $4 billion. That round included strategic investors such as Toyota, Adobe, eBay, and Nvidia. Leitersdorf notes that these entities are potential customers for the technology.
Efficiency and cost
Oasis 3’s primary advantages are its photorealism and its ability to keep generating footage indefinitely, both driven by the Decart Optimization Stack (DOS). This software allows models to run efficiently across Nvidia, Amazon, and Google hardware, which Decart says significantly lowers operational costs compared to rivals.
“This is built on top of our entire real-time stack, which we optimize all the way down to the hardware,” Leitersdorf explained. “By being so vertically integrated, we’re able to be more than an order of magnitude cheaper than anyone else in the industry in order to run these models.”
The efficiency is pronounced enough that the company reports having spent far less than $100 million over its entire lifetime.
Testing the limits
In practical testing, Oasis 3 produced multi-camera environments—one front-facing and two side-facing—designed for rigorous system training. Compared to alternatives like Genie 3 or Marble, the output from a single text prompt was notably more photorealistic.
However, the model’s inability to sustain coherence over time is its most significant weakness. While the initial scene generation is strong, thematic integrity degrades rapidly as the simulation progresses. A prompt for a specific New York City street in the morning might start beautifully but quickly morph into a generic Western urban setting.
Navigation issues compound this effect. Attempting to return to a previously visited intersection often results in the environment vanishing and being replaced by a new, unrelated scene. The controls are also unresponsive, and the vehicle frequently becomes uncontrollable. The overall experience feels less like a coherent simulation and more like a disjointed, dream-like stream of consciousness that quickly becomes nonsensical.
Physics and architecture
Another persistent issue is the lack of physical interaction; vehicles in the simulation often drive through other cars. Leitersdorf describes this as a “major research problem” currently being addressed, attributing it to the fact that training data contains far more examples of good driving than accidents.
The root of the consistency challenge lies in the model’s architecture. Oasis 3 is auto-regressive, generating one frame at a time while referencing previous outputs. This process is compute-intensive, and the frames it references fill the model’s context window very quickly.
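A toy Python sketch can illustrate why a bounded context might cause earlier scenery to be forgotten. This is not Decart’s implementation: the fixed-size window, the eviction of the oldest frames, and the `rollout` function are all assumptions made for illustration.

```python
from collections import deque


def rollout(num_frames: int, max_context_frames: int) -> list[int]:
    """Return the ids of the frames still in context after generating num_frames.

    Assumption: the context holds a fixed number of frames and silently drops
    the oldest one when full. A real model would condition on the context to
    predict each new frame; here we only track which frames remain visible.
    """
    context: deque[int] = deque(maxlen=max_context_frames)
    for frame_id in range(num_frames):
        context.append(frame_id)  # oldest frame is evicted once the window is full
    return list(context)


print(rollout(num_frames=10, max_context_frames=4))  # [6, 7, 8, 9]
```

In this simplified picture, frames 0 through 5 are no longer available to the model, so a scene that appeared only in those frames could not be reproduced from memory.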
“Every frame we generate is roughly 8,000 tokens,” Leitersdorf noted. “Generating this at tens of frames per second — that’s hundreds of thousands of tokens per second. The context window fills up very quickly. We’re researching how to do longer context to store millions more tokens, and how to compress the memory into fewer tokens.”
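The arithmetic behind that quote can be sketched in a few lines of Python. Only the 8,000 tokens per frame comes from the article; the frame rate of 24 and the one-million-token context window are hypothetical values chosen to make the calculation concrete.

```python
TOKENS_PER_FRAME = 8_000            # figure quoted by Leitersdorf
ASSUMED_FPS = 24                    # assumption: the article only says "tens of frames per second"
ASSUMED_CONTEXT_TOKENS = 1_000_000  # hypothetical window size, not stated in the article


def tokens_per_second(tokens_per_frame: int, fps: int) -> int:
    """Tokens produced each second at a given frame rate."""
    return tokens_per_frame * fps


def seconds_until_full(context_tokens: int, tokens_per_frame: int, fps: int) -> float:
    """How long a context window of `context_tokens` lasts before it is full."""
    return context_tokens / tokens_per_second(tokens_per_frame, fps)


rate = tokens_per_second(TOKENS_PER_FRAME, ASSUMED_FPS)
fill = seconds_until_full(ASSUMED_CONTEXT_TOKENS, TOKENS_PER_FRAME, ASSUMED_FPS)
print(f"{rate:,} tokens per second")       # 192,000 tokens per second
print(f"window full after {fill:.1f} s")   # window full after 5.2 s
```

Under these assumed numbers, even a million-token window would hold only about five seconds of footage, which is consistent with the "hundreds of thousands of tokens per second" figure in the quote.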
Looking ahead, the next version may allow users to seed worlds with video rather than images, potentially improving consistency. Leitersdorf acknowledges the field is still in its infancy but remains focused on the future potential unleashed by developer adoption.
“It takes me back to the early days of LLMs, when OpenAI invented the API for models,” he said, referencing the explosion of new use cases driven by a developer community. “When we talk again in three months, we’ll be like, ‘Here’s 100 developers that all built 100 different applications with Oasis that surprised all of us.’”
Key takeaways
- Oasis 3 offers real-time, API-accessible generation of hours of driving footage, priced at $0.02 per second, targeting autonomous vehicle and robotics developers.
- Decart claims superior cost efficiency through a vertically integrated software stack that it optimizes down to the hardware, and reports spending less than $100 million in its lifetime despite high compute demands.
- Current limitations include rapid degradation of thematic integrity over time, poor physics simulation where vehicles pass through each other, and navigation inconsistencies.
- The company is betting on an ecosystem play similar to OpenAI’s early API strategy, with Leitersdorf predicting around 100 developer-built applications within three months.