NVIDIA BioNeMo Agent Toolkit Turns Biomolecular Models Into Callable Skills for AI Agents in Drug Discovery

NVIDIA has released the BioNeMo Agent Toolkit, a collection of skills that lets AI agents call specific biomolecular models as standard tools. The release addresses a core problem in scientific computing: general coding agents cannot reliably run specialised biological models or interpret complex biological data without a structured interface.

The gap between code and discovery

Current AI agents can read papers, write scripts, and call APIs. However, scientific discovery does not follow the same logic as software engineering. A hypothesis cannot be validated by a passing test suite. Research remains iterative, uncertain, and dependent on physical reality. Without a way to interact with specialised models, an agent is limited to generating text or code, not results.

What is the BioNeMo Agent Toolkit

The toolkit is an open-source collection of ‘skills’ that package NVIDIA biomolecular models for direct agent use. These skills cover protein folding, molecular docking, generative chemistry, genomics, and protein design. NVIDIA structures the platform into two layers. The first is an accelerated tool layer using NVIDIA NIM (NVIDIA Inference Microservices) and libraries like cuEquivariance and Parabricks. The second layer wraps these capabilities into agent-ready interfaces.

Each skill documents the model’s purpose, inputs, parameters, and expected outputs. Model Context Protocol (MCP) server wrappers expose open models that are not yet packaged as NIMs. This setup allows an agent to discover, select, and invoke models independently. The repository organises skills into nim-skills, open-models-skills, and library-skills. A workflows folder contains multi-step meta-skills. One example is generative_protein_binder_design, which chains RFdiffusion, ProteinMPNN, and OpenFold3.
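As a rough sketch of the discovery step, the snippet below walks a local checkout and lists every directory that holds a SKILL.md file. The four folder names come from the article; the assumption that each skill sits exactly one level below its group folder, and the function name list_skills, are illustrative only and may not match the real repository layout.

```python
from pathlib import Path

# Top-level folders named in the article. The exact on-disk layout
# (one directory per skill, each with a SKILL.md) is an assumption.
SKILL_GROUPS = ("nim-skills", "open-models-skills", "library-skills", "workflows")


def list_skills(repo_root: Path) -> dict[str, list[str]]:
    """Map each existing skill group to the sorted names of its skill directories."""
    found: dict[str, list[str]] = {}
    for group in SKILL_GROUPS:
        group_dir = repo_root / group
        if not group_dir.is_dir():
            continue  # group folder missing in this checkout
        # A directory counts as a skill only if it contains a SKILL.md file.
        found[group] = sorted(p.parent.name for p in group_dir.glob("*/SKILL.md"))
    return found
```

Calling list_skills(Path("path/to/checkout")) would give an agent harness a plain dictionary it can show to the model before any skill is invoked.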

How a BioNeMo skill works

Every skill is a directory containing a SKILL.md file with YAML frontmatter and instructions. An agent reads this documentation to understand how to act. The prompt pattern remains consistent across models. NVIDIA’s documentation uses OpenFold3 as a reference, but the structure applies to other NIMs for biology including Boltz-2, DiffDock, GenMol, ProteinMPNN, MSA Search, RFdiffusion, and Evo 2. You provide the skill name, input, and endpoint.
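To make the SKILL.md structure concrete, here is a minimal sketch of how an agent harness might separate the YAML frontmatter from the instructions. The field names (name, description) and the example file contents are assumptions invented for illustration, not copied from the toolkit, and the parser handles only flat key: value lines rather than full YAML.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md string into (frontmatter fields, instruction body).

    Only flat `key: value` lines are understood; real YAML needs a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # no closing delimiter: treat everything as body
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line or line.lstrip().startswith("#"):
            continue  # skip comments and lines that are not key: value
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("'\"")
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


# Hypothetical file contents, invented for illustration only.
EXAMPLE = """---
name: example-fold-skill
description: Predict a protein structure from a sequence.
---
# Instructions
Send the sequence to the configured endpoint and save the returned CIF file.
"""

fields, body = split_frontmatter(EXAMPLE)
print(fields["name"])
print(body.splitlines()[0])
```

The frontmatter gives the agent a short, machine-readable summary for choosing a skill, while the body carries the longer instructions it reads only after choosing.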

Installation pulls skills via the open-source skills CLI:

# Browse and pick a skill interactively
npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit

# Or install one skill for a specific agent
npx skills add NVIDIA-BioNeMo/bionemo-agent-toolkit --skill boltz2-nim --agent claude-code

There are two deployment options. Use hosted NIM endpoints for fast access without managing infrastructure. Move selected models to a local environment when you need lower warm latency, data locality, or repeated iteration.
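The hosted-versus-local trade-off can be written down as a small decision rule. This is only a sketch: the Needs fields mirror the three reasons the article gives for going local, while the function name, the fallback to hosted, and the error on a hard data-locality requirement are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Reasons the article lists for preferring a local deployment."""
    data_must_stay_local: bool = False
    low_warm_latency: bool = False
    repeated_iteration: bool = False


def choose_deployment(needs: Needs, local_available: bool) -> str:
    """Return "local" or "hosted" for one model (illustrative policy only)."""
    wants_local = (
        needs.data_must_stay_local
        or needs.low_warm_latency
        or needs.repeated_iteration
    )
    if wants_local and local_available:
        return "local"
    if needs.data_must_stay_local:
        # Assumed policy: never send data out when locality is a hard requirement.
        raise RuntimeError("data locality required but no local NIM is available")
    return "hosted"
```

Treating latency and iteration as soft preferences and data locality as a hard constraint is one possible design; a team could equally make all three hard requirements.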

Performance metrics

NVIDIA measured whether skills improved an agent’s workflow. All reported metrics came from Codex CLI running GPT-5.5 fast. The team compared the same agent with and without each skill.

Task completion was the first metric. Without skills, the agent completed 57.1% of required tasks on average. With access to NIM skills, completion reached 100%.

Efficiency was the second metric. NVIDIA counted passing assertions, the individual steps that compose a task. With skills, an agent produced 2x more passing assertions per 1,000 tokens. That gain held across all ten NIM skills tested.
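The two metrics reduce to simple ratios. The sketch below shows the arithmetic with invented run data; the task counts, assertion counts, and token counts are hypothetical and are not NVIDIA's measurements, which the article reports only as aggregates.

```python
def completion_rate(completed: int, required: int) -> float:
    """Percentage of required tasks the agent completed."""
    if required <= 0:
        raise ValueError("required must be positive")
    return 100.0 * completed / required


def assertions_per_1k_tokens(passing: int, tokens: int) -> float:
    """Passing assertions normalised to 1,000 tokens of agent output."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return 1000.0 * passing / tokens


# Hypothetical run data, invented for illustration only.
baseline_rate = completion_rate(completed=3, required=5)
baseline_eff = assertions_per_1k_tokens(passing=12, tokens=48_000)
skill_eff = assertions_per_1k_tokens(passing=20, tokens=40_000)

print(f"baseline completion: {baseline_rate:.1f}%")
print(f"efficiency ratio: {skill_eff / baseline_eff:.1f}x")
```

Normalising by tokens matters because an agent with a skill may both pass more assertions and spend fewer tokens doing so; the ratio captures both effects in one number.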

Use cases with examples

Protein structure prediction: An agent folds a peptide sequence with Boltz-2 or OpenFold3. It returns a CIF file for downstream inspection.
Multiple sequence alignment: An agent generates an MSA with MMseqs2 through the MSA Search skill. The artifact is an A3M file.
Generative chemistry: An agent generates candidate molecules with GenMol. Outputs arrive as SDF or SMILES for filtering.
Protein binder design: The generative_protein_binder_design workflow chains three models. RFdiffusion builds a backbone, ProteinMPNN designs the sequence, and OpenFold3 validates the fold.

Each loop follows the same shape: the agent selects a model, prepares inputs, runs it, inspects outputs, and explains results with caveats.
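The three-stage binder design chain described above can be sketched as a function that passes each stage's artifact to the next. The stage functions here are stand-in stubs, not calls to RFdiffusion, ProteinMPNN, or OpenFold3, and every name in the block (LoopResult, run_binder_design, the fake_* helpers) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

Stage = Callable[[str], str]  # takes one artifact, returns the next


@dataclass
class LoopResult:
    artifacts: dict[str, str] = field(default_factory=dict)
    caveats: list[str] = field(default_factory=list)


def run_binder_design(
    target: str,
    design_backbone: Stage,
    design_sequence: Stage,
    predict_structure: Stage,
) -> LoopResult:
    """Chain backbone -> sequence -> structure, keeping every intermediate artifact."""
    result = LoopResult()
    result.artifacts["backbone"] = design_backbone(target)
    result.artifacts["sequence"] = design_sequence(result.artifacts["backbone"])
    result.artifacts["structure"] = predict_structure(result.artifacts["sequence"])
    # The loop ends with an explanation that carries caveats, not just files.
    result.caveats.append("Computational prediction only; validate before relying on it.")
    return result


# Stub stages standing in for real model calls (illustration only).
def fake_backbone(target: str) -> str:
    return f"backbone({target})"


def fake_sequence(backbone: str) -> str:
    return f"sequence({backbone})"


def fake_structure(sequence: str) -> str:
    return f"structure({sequence})"


result = run_binder_design("target.pdb", fake_backbone, fake_sequence, fake_structure)
print(result.artifacts["structure"])
```

Passing the stages in as callables keeps the chain itself independent of how each model is reached, so a hosted endpoint and a local deployment could be swapped without touching the workflow.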

Comparison: agent with vs without skills

Dimension	General agent (no skills)	Agent + BioNeMo Skills
Task completion	57.1% average	100% average
Token efficiency	Baseline	2x passing assertions per 1k tokens
Model selection	Guesses tool, format, and inputs	Reads purpose, inputs, and artifacts
Deployment	Manual setup from source	Hosted or local NIM, documented
Failure handling	Unknown failure modes	Documented failure modes per skill
Workflows	Isolated single calls	Multi-step meta-skills (binder design)

Getting started

The prerequisites are minimal. You need an agent runtime such as Claude Code or Codex, and an NVIDIA API key for hosted BioNeMo NIM endpoints. A GPU node is optional and only needed for local NIM deployment.

Point the agent at the repository first. Let it enumerate the available capabilities before it acts. Then hand it a single skill to operate one model.

NVIDIA flags two cautions. The build.nvidia.com endpoints are for small-scale development and testing only. They are not production-grade inference. NVIDIA also stresses validation: check low-confidence structures and filter generated molecules before trusting them.
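The validation advice amounts to a gate between model output and anything downstream. Below is a minimal sketch, assuming each result carries a single confidence score between 0 and 1; the Prediction type, the score scale, and the 0.7 default threshold are all assumptions chosen for illustration, since the article does not specify how confidence is reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    name: str
    confidence: float  # hypothetical score in the range 0..1


def split_by_confidence(
    predictions: list[Prediction], threshold: float = 0.7
) -> tuple[list[Prediction], list[Prediction]]:
    """Return (trusted, flagged); flagged items need manual review."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    trusted = [p for p in predictions if p.confidence >= threshold]
    flagged = [p for p in predictions if p.confidence < threshold]
    return trusted, flagged


# Invented example scores, for illustration only.
runs = [Prediction("design_a", 0.91), Prediction("design_b", 0.42)]
trusted, flagged = split_by_confidence(runs)
print([p.name for p in trusted], [p.name for p in flagged])
```

The same split applies to generated molecules: replace the confidence score with whatever filter criterion the chemistry workflow uses and keep the flagged set for review rather than discarding it silently.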

What it means

The toolkit moves AI agents from generating ideas to executing steps. By defining clear inputs and outputs, it lets agents run tasks that require specific scientific models without guessing the correct tool or format. This reduces the manual setup in repetitive discovery loops, although the outputs still need the validation NVIDIA calls for.
