AI Maestro · Independent British AI Publication

The Signal

The AI stories that matter — and the tools worth your time

Edition 01
May 2026
Free
4 sections

This Week in AI — What Actually Matters

Model Race

Claude 4 Lands — and Agentic AI Finally Starts to Deliver

Anthropic shipped Claude 4 Sonnet and Opus with a dramatically improved agentic architecture. The real story isn’t benchmarks — it’s that multi-step tool use is now reliable enough that developers are deploying it in production. Claude Sonnet at $3/M input tokens is the most compelling value play in frontier AI right now. The 200K context window, the honest documentation, and writing quality noticeably ahead of GPT-4o on prose-heavy tasks — this is the one to watch.

Developer Tools

GitHub Copilot Goes Multi-Model — OpenAI Codex Returns as an Agent

Microsoft opened Copilot to Claude, Gemini, and o3 alongside GPT-4o. OpenAI’s new Codex runs tasks asynchronously — submit a task, come back to a pull request. Compelling for teams with repetitive engineering work.

Open Source

ByteDance’s Doubao: The Quiet Giant Most Europeans Are Ignoring

Doubao Pro and Lite consistently outperform their weight class on coding and reasoning. At $0.14/M tokens for Lite, cost-per-quality is unmatched — if your data policy permits a Chinese-hosted endpoint.

Watch Closely

Gemini 2.5 Pro — the Dark Horse in the Context Window Wars

1M tokens natively, and it actually performs across the full window. Researchers report results matching chunking-based RAG pipelines — without the complexity. API still in preview but worth getting on your radar.

Free Tools Worth Having Right Now

Free

Ollama

ollama.com

Run Llama 3, Mistral, Qwen, Gemma, and dozens more entirely on your own machine. Zero API cost, zero data leaving your hardware. If you have a modern GPU (8GB+ VRAM) or Apple Silicon, this is the most important tool you’re not yet using.

Free Plan

OpenRouter — Free Tier

openrouter.ai

200+ LLMs, one API key. A rotating list of completely free models includes Llama 3.1 405B, Gemma 3 27B, and several Mistral variants. One account, one key, genuinely capable models at no cost. Perfect for finding your model before going direct.

Trial

Google AI Studio

aistudio.google.com

Full access to Gemini 2.5 Pro with its 1M context window — free in the playground. Rate-limited but entirely sufficient for experimentation and document analysis. No credit card required.

Free

Claude.ai — Free Tier

claude.ai

Daily access to Claude 3.5 Sonnet with generous limits. Quality-per-message among the highest of any free AI product. Worth having as a second model even if you’re already paying for something else.

The Honest LLM API Guide — May 2026

Provider / Model	Best For	In $/M	Out $/M	Verdict
Claude 3.5 SonnetAnthropic	Agentic, writing, long context	$3.00	$15.00	Top Pick
Claude 3 HaikuAnthropic	Volume, classification, speed	$0.25	$1.25	Best Value
GPT-4o + CodexOpenAI / GitHub	Coding, IDE, PR automation	$5.00	$15.00	Dev First
Gemini 2.5 ProGoogle AI / Vertex	Long docs, multimodal, research	$1.25	$10.00	Context King
Doubao Pro / LiteByteDance (Volces)	Bulk, cost-critical workloads	$0.14	$0.28	Cheapest
Ollama (local)Your hardware	Privacy, offline, zero cost	£0	£0	Free Forever
OpenRouteropenrouter.ai	Model switching, free access	Varies	Varies	Most Flexible

Anthropic — Claude API

console.anthropic.com

Our preferred API for anything requiring actual thinking — long-form writing, complex instruction following, multi-step agentic work. Writing quality noticeably ahead of GPT-4o for prose-heavy tasks. The documentation is the most honest in the industry. Watch for: output pricing at $15/M adds up fast — use Haiku ($1.25/M) for anything that doesn’t need Sonnet quality.

→ Haiku for classification · Sonnet for generation · Opus only when nothing else will do

OpenAI — Codex Agent + GPT-4o

platform.openai.com · github.com/features/copilot

GitHub Copilot is the most deeply integrated coding assistant at £10/month — lives in VS Code, JetBrains, Neovim. The new Codex cloud agent runs tasks asynchronously: submit a task, come back to a pull request. Compelling for teams with repetitive engineering work. Watch for: $15/M output and per-task Codex billing at scale.

→ Best for teams already in the GitHub ecosystem

Google — Gemini 2.5 API

ai.google.dev · cloud.google.com/vertex-ai

The most capable model for ingesting large documents or codebases in one shot. 1M tokens that actually performs across the full window — more than can be said for earlier long-context attempts. Google AI Studio free tier lets you test properly before paying. Watch for: enterprise-heavy documentation creates higher onboarding friction than Anthropic or OpenAI.

→ Dominant choice for document analysis and research tasks

ByteDance — Doubao (Volces Engine)

volcengine.com/product/ark

Doubao Lite 32K at $0.14/M input is among the lowest-cost options that produces coherent output. OpenAI-compatible API — migration is straightforward. Watch for: data residency is in China. Hard blocker for EU user data or sensitive commercial IP. English documentation is patchy; customer support non-existent in Western time zones.

→ Viable for bulk non-sensitive workloads · Check your data policy first

Ollama — Self-Hosted Models

ollama.com · github.com/ollama/ollama

One command to download, one to run. OpenAI-compatible REST API locally — your existing tooling connects with a single URL change. Library includes Llama 3.1, Mistral Nemo, Qwen 2.5, DeepSeek, Gemma 3, Phi-4. Watch for: quality is hardware-dependent. 8B on GPU = good. 7B on CPU = frustrating. 70B needs 40GB VRAM — server territory.

→ Essential for privacy-first workflows · Perfect for prototyping without API costs

OpenRouter — One Key, 200+ Models

openrouter.ai

One API key, one billing account, access to models from Anthropic, OpenAI, Google, Meta, Mistral, and dozens more — including smaller labs you’d never find otherwise. Provider pricing plus a small markup. Free-tier models included. Watch for: a middleman adds latency and a point of failure. Go direct at production scale with a single model.

→ Start here to find your model · Go direct once you’ve decided

Guide · AI Coding Tools

The Signal — Edition 01

The Signal

This Week in AI — What Actually Matters

Claude 4 Lands — and Agentic AI Finally Starts to Deliver

GitHub Copilot Goes Multi-Model — OpenAI Codex Returns as an Agent

ByteDance’s Doubao: The Quiet Giant Most Europeans Are Ignoring

Gemini 2.5 Pro — the Dark Horse in the Context Window Wars

Free Tools Worth Having Right Now

The Honest LLM API Guide — May 2026

Claude Code vs GitHub Codex vs Cursor: The Honest 2026 Comparison

GPU Rental vs LLM API vs Cloud Hosting: Which Actually Makes Sense?

Ollama Cloud Review 2026: Is It Actually Worth It?

Empowering Businesses with AI — Smart Tools, Smarter Business Decisions.

follow us

Popular Tag

Popular Post

The EU doesn’t really…

Nobel laureate John Jumper…

OpenAI’s Codex can now…

This Week in AI — What Actually Matters

Claude 4 Lands — and Agentic AI Finally Starts to Deliver

GitHub Copilot Goes Multi-Model — OpenAI Codex Returns as an Agent

ByteDance’s Doubao: The Quiet Giant Most Europeans Are Ignoring

Gemini 2.5 Pro — the Dark Horse in the Context Window Wars

Free Tools Worth Having Right Now

The Honest LLM API Guide — May 2026

Also From AI Maestro — Related Reading

Claude Code vs GitHub Codex vs Cursor: The Honest 2026 Comparison

GPU Rental vs LLM API vs Cloud Hosting: Which Actually Makes Sense?

Ollama Cloud Review 2026: Is It Actually Worth It?

Empowering Businesses with AI — Smart Tools, Smarter Business Decisions.

follow us

Popular Tag

Popular Post

The EU doesn’t really…

Nobel laureate John Jumper…

OpenAI’s Codex can now…