NVIDIA researchers have built ENPIRE, a system that allows physical robots to learn through autonomous experimentation loops similar to those used by software coding agents.
How ENPIRE works
The framework creates a closed loop for physical robots to test, fail, and refine their own strategies. It operates through four core modules. An Environment module resets the scene automatically when a trial fails. A Policy Improvement module launches new refinements based on results. A Rollout module evaluates policies using single or multiple robots working in parallel. Finally, an Evolution module allows coding agents to analyse logs, review literature, and update the training infrastructure.
The system removes the need for constant human oversight. An automatic evaluation tool scores outcomes without human judgement, while the reset system returns the environment to a fresh initial state. The authors note that the complexity of tasks the system can handle is defined by how well these automated evaluation and reset functions work.
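The loop described above can be sketched in a few lines of Python. This is a toy illustration, not ENPIRE's actual code: the class and function names, the single-number "policy", and the scoring rule are all assumptions made for the example. The Evolution module, in which coding agents read logs and rewrite the infrastructure, is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical stand-in for the Environment module and automatic evaluator."""
    target: float = 0.8  # assumed "ideal" policy parameter for this toy task
    resets: int = 0

    def reset(self) -> None:
        # A real system would return the scene to a fresh initial state.
        # Here we only count how often a reset was requested.
        self.resets += 1

    def evaluate(self, param: float, rng: random.Random) -> bool:
        # Automatic scoring without human judgement: a trial is more likely
        # to succeed the closer the parameter is to the target.
        p_success = max(0.0, 1.0 - abs(param - self.target))
        return rng.random() < p_success


def rollout(env: Environment, param: float, trials: int, rng: random.Random) -> float:
    """Rollout step: reset before every trial and return the success rate."""
    successes = 0
    for _ in range(trials):
        env.reset()
        successes += env.evaluate(param, rng)
    return successes / trials


def improve(param: float, rng: random.Random) -> float:
    """Policy improvement step: propose a slightly perturbed candidate."""
    return min(1.0, max(0.0, param + rng.uniform(-0.1, 0.1)))


def run_loop(iterations: int = 50, trials: int = 40, seed: int = 0):
    """Test, fail, refine: keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    env = Environment()
    best_param = 0.2
    best_score = rollout(env, best_param, trials, rng)
    history = [best_score]
    for _ in range(iterations):
        candidate = improve(best_param, rng)
        score = rollout(env, candidate, trials, rng)
        if score > best_score:
            best_param, best_score = candidate, score
        history.append(best_score)
    return best_param, best_score, history, env.resets


if __name__ == "__main__":
    param, score, history, resets = run_loop()
    print(f"best parameter {param:.2f}, success rate {score:.0%}, resets {resets}")
```

Even in this toy version, every trial depends on a working reset and a working scorer, which is why the authors tie task complexity to the quality of those two functions.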
Each robot station pairs two YAM arms from I2RT in a fixed bimanual setup with cameras and a single NVIDIA RTX 5090 workstation running a FastAPI server.
Performance results
Frontier coding agents developed policies with a 99% success rate on dexterous manipulation tasks. These included pushing objects, organising pins into a box, and cutting zip ties. The system also successfully tested inserting GPUs into a motherboard.
Performance depends on the underlying AI models. Within the tested group, GPT-5.5 in Codex and Opus 4.7 in Claude Code offered the best performance, while Kimi-2.6 lagged behind. Scaling the number of agents improved results, with eight agents finding higher-scoring solutions faster than fewer agents. Multi-agent setups sometimes achieved higher absolute scores than single-agent setups by exploring a wider solution space.
Infrastructure limits
Scaling robot fleets introduces new challenges. Coding agents leave robots underused while they read logs, write code, or wait for the language-model backbone. As the number of robots increases, the robots sit idle for much of the time even as active GPU utilisation rises. Because the workload does not parallelise naturally, expanding the system creates infrastructure hurdles.
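A back-of-the-envelope calculation shows why idle time grows with fleet size. The model below is an assumption made for illustration, not taken from the paper: one agent alternates between a "thinking" phase (reading logs, writing code, waiting on the language model) and a rollout phase that runs on all of its robots in parallel. The timings in the usage lines are invented.

```python
def robot_utilisation(think_s: float, rollout_s: float) -> float:
    """Fraction of each cycle in which a robot is actually executing rollouts."""
    if think_s < 0 or rollout_s <= 0:
        raise ValueError("think_s must be >= 0 and rollout_s must be > 0")
    return rollout_s / (think_s + rollout_s)


def idle_robot_seconds(n_robots: int, think_s: float) -> float:
    """Robot-seconds wasted per cycle while the agent is thinking."""
    if n_robots < 0 or think_s < 0:
        raise ValueError("n_robots and think_s must be non-negative")
    return n_robots * think_s


if __name__ == "__main__":
    # Invented timings: 60 s of agent thinking, 60 s of rollouts per cycle.
    print(robot_utilisation(think_s=60, rollout_s=60))  # 0.5
    print(idle_robot_seconds(n_robots=8, think_s=60))   # 480
```

Under this simplified model, per-robot utilisation stays flat as robots are added, while the total idle robot time per cycle grows linearly with the fleet.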
Humans struggle to predict technology
Matthew Tokson, Associate Dean for Research at the University of Utah S.J. Quinney College of Law, argues that predicting the future of technology is extremely difficult. His SSRN paper notes that experts often underestimate novel innovations or overestimate the social benefits of new tools.
History shows many experts were wrong about major technologies. Albert Einstein, Niels Bohr, and Robert Oppenheimer did not initially believe nuclear fission was achievable. Paul Krugman once stated the internet would have no greater impact than the fax machine. Technologists expected the internet to promote democracy rather than strengthen autocracies. Many scientists also rejected or underestimated human-caused climate change despite decades of evidence.
Tokson writes that both sceptics who fear economic disruption and optimists who expect universal benefits are likely to be wrong, and that “History does not support complacency about the future impacts of AI.” Optimists have often erred about social ramifications, while sceptics have underestimated the likelihood of innovation.
Tencent debugs 10,000-GPU clusters
Tencent has released details on ARGUS, a software system used to generate telemetry and debug errors across large sets of chips.
ARGUS is a low-overhead, fine-grained, always-on tracing and real-time analysis system. It consists of three software layers. The Python layer handles scheduling and data preparation. The framework layer manages phase orchestration. The GPU runtime layer controls kernel execution.
The company deployed ARGUS on a production cluster of over 10,000 GPUs for more than six months. Five real-world case studies demonstrated its effectiveness. The problems it diagnosed included compute stragglers, communication link degradation, pipeline bubble amplification, and JIT compilation blocking.
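To make the idea of a compute straggler concrete, here is a minimal sketch of how one could be flagged from per-rank step timings. It is not ARGUS's method, which the summary above does not describe: the function name, the input layout, and the 1.2x threshold are all assumptions chosen for the example.

```python
from statistics import median


def find_stragglers(step_times_ms: dict[int, list[float]], ratio: float = 1.2) -> list[int]:
    """Return ranks whose median step time exceeds `ratio` times the cluster median.

    step_times_ms maps a GPU rank to its recent per-step durations in milliseconds.
    """
    # Summarise each rank by its median so a single slow step is not flagged.
    per_rank = {rank: median(times) for rank, times in step_times_ms.items() if times}
    if not per_rank:
        return []
    cluster_median = median(per_rank.values())
    return sorted(rank for rank, t in per_rank.items() if t > ratio * cluster_median)


if __name__ == "__main__":
    # Invented timings: rank 2 is consistently about 50% slower than its peers.
    timings = {
        0: [100.0, 101.0, 99.0],
        1: [100.0, 102.0, 98.0],
        2: [150.0, 148.0, 155.0],
        3: [101.0, 100.0, 99.0],
    }
    print(find_stragglers(timings))  # [2]
```

In synchronous training every rank waits for the slowest one at each step, so a single slow GPU can hold back the whole job. That is what makes always-on, per-rank tracing useful at this scale.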
Specific training runs included a 4,096-GPU video language model job, likely HunyuanVideo. Another job used 512 GPUs for an audio model. A third run involved a 12,960-GPU MoE training job, likely a Hunyuan LLM.
The existence of tools like ARGUS signals mature, large-scale infrastructure. Software of this kind is not unique to Tencent, but running it on a cluster of more than 10,000 GPUs for over six months points to a stable training environment. The system plays a key role in rapid failure detection and performance optimisation.
Disempowerment as a possibility
Fernando Borretti, a science fiction writer, has published a critique titled “No-One Escapes the Permanent Underclass”. The piece acts as a requiem for the era when humanity chose its own destiny and confronts the possibility that machines will outsmart and disempower us.
Borretti argues that everyone made of flesh and blood will be replaced by machines. The essay describes a pyramid structure. At the base, AIs and robots perform all economic activity. At the top sits the state, holding a monopoly on violence and the power to enforce property rights. In the middle lies a thin layer of people holding shares in companies that have consumed the economy.
The author suggests that in an existential conflict threatening the state, the government would arrest the powerful rich and take their assets. In such a conflict, states where humans remove themselves from the decision loop gain an advantage. Decision-making moves to AI for the same reason a state with satellites outperforms one relying on bicycle messengers.
What it means
These reports highlight a shift from human-managed tasks to systems that manage themselves. NVIDIA’s work shows robots can now refine their own physical skills without constant human intervention. Tencent’s deployment suggests that running clusters of more than 10,000 GPUs calls for always-on tracing and monitoring software. Meanwhile, Borretti’s essay warns that as machines take over more decision-making, human control over economic and political outcomes may diminish further.




