Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over GDP forecasting


Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe.

Uh oh, there’s a scaling law for cyberattacks as well!

The smarter the system, the better the ability to cyberattack…

AI safety research organization Lyptus Research has looked at how well AI systems can perform a variety of cyberoffense tasks and found a clear trend. More advanced models are able to do more advanced forms of cyberattack.

Models studied, by release year:

2019: GPT-2
2020: GPT-3
2022: GPT-3.5
2024: Claude 3 Opus, GPT-4o
2025: o3, Opus 4, Gemini 2.5 Pro, DeepSeek V3.1, GPT-5.1 Codex Max, GPT-5.2 Codex
2026: Opus 4.6, GPT-5.3 Codex, GLM-5, Sonnet 4.6

The length of task that models can complete, measured by how long the task takes a human expert, doubles every 9.8 months for models released since 2019, and the pace steepens to a doubling every 5.7 months for those released since 2024. The most recent frontier models in the study, GPT-5.3 Codex and Opus 4.6, achieve 50% success on tasks that take human experts 3.1 hours and 3.2 hours respectively.
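To make the doubling-time arithmetic concrete, here is a minimal sketch that extrapolates a 50%-success time horizon under a constant doubling time. It assumes the trend simply continues unchanged, which is an assumption for illustration and not a claim made by the study; the helper names `project_horizon` and `months_until`, the 12-month lookahead, and the 8-hour target are all hypothetical choices.

```python
import math


def project_horizon(current_hours: float, doubling_months: float, months_ahead: float) -> float:
    """Extrapolate a time horizon assuming pure exponential growth (an assumption)."""
    return current_hours * 2 ** (months_ahead / doubling_months)


def months_until(current_hours: float, target_hours: float, doubling_months: float) -> float:
    """Months needed to grow from current_hours to target_hours at a fixed doubling time."""
    return doubling_months * math.log2(target_hours / current_hours)


# 3.2 hours is the best 50%-success horizon reported above;
# 9.8 and 5.7 months are the two reported doubling times.
for doubling in (9.8, 5.7):
    projected = project_horizon(3.2, doubling, 12)
    wait = months_until(3.2, 8.0, doubling)
    print(f"doubling every {doubling} months: "
          f"{projected:.1f} hours after 12 months, "
          f"{wait:.1f} months to reach an 8-hour task")
```

The point of the comparison is that the choice between the 2019 trend and the 2024 trend changes the projection substantially, so any forecast built this way is only as good as the assumed doubling time.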

They also created a new dataset consisting of 291 tasks with completion transcripts and time estimates calibrated by 10 offensive cybersecurity professionals.

Results: AI systems are getting good at hacking. The best current models achieve 50% success on tasks that take human experts 3.2 hours, roughly half a working day of professional offensive security work.

What benchmarks did they study?

CyBashBench
NL2Bash
InterCode CTF
NYUCTF
Cybench
CVEBench
CyberGym

Why this matters, everything is getting better, including the inconvenient stuff:

AI that can perform biology research can also perform biological weapon research.
AI that can help you learn about high-energy physics can also help you with high-energy physics for weapons development.
AI that is especially good at helping you find vulnerabilities in code for defensive purposes can easily be repurposed for offensive purposes.

The most challenging part of AI is that it is an ‘everything machine’: capabilities tend to expand across a broad area with each successive model generation, and the policy issues multiply along with them.

Read more:

Startups that adopt AI for internal use are more successful than those that don’t:

A business school study shows how startups can benefit from AI adoption…

Researchers with INSEAD and Harvard Business School have shown that startups which are taught how to integrate AI into their business perform meaningfully better than those which aren’t. The study is reasonably large scale and convincing: “Across 515 high-growth startups, we run a field experiment in which treated firms receive information about how other firms have reorganized production around AI, prompting them to search for use cases across a broader set of firm functions,” they write.

Treated firms discover more AI use cases, a 44% increase, concentrated in product development and strategy. These changes result in economically meaningful performance gains. Treated firms complete 12% more tasks, are 18% more likely to acquire paying customers, and generate 1.9x higher revenue.

Applications of AI:

Gamma: AI tools enable a single PM to continuously ship features that would previously have required an entire team.
Ryz Labs: A founder describes altering how they approach product development by using AI for multiple approaches at once.
FazeShift: An illustration of automating accounts receivable processes with AI.
Ranger: An example of bootstrapping a startup, improving margins, and raising money later when the business is more mature.

The effects were large. Treated firms discover 2.7 additional AI use cases (a 44% increase), which span a broader set of activities across the firm and are especially concentrated in product development and strategy-related domains. These changes in AI use lead to measurable gains: treated firms complete 12% more tasks, are 18% more likely to acquire paying customers, and ultimately generate 1.9x higher revenues than control firms.
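As a quick sanity check on how these figures fit together, the sketch below backs out the control-group baseline implied by the reported effect. It assumes the 44% increase is measured relative to the control-group mean, which the text above does not state explicitly; `implied_baseline` is a hypothetical helper written for this illustration.

```python
def implied_baseline(absolute_gain: float, relative_gain: float) -> float:
    """Baseline implied by an absolute gain that equals a given relative gain.

    Assumes relative_gain = absolute_gain / baseline.
    """
    return absolute_gain / relative_gain


# 2.7 additional use cases is reported as a 44% increase.
control = implied_baseline(2.7, 0.44)
treated = control + 2.7

print(f"implied control mean: {control:.1f} use cases")
print(f"implied treated mean: {treated:.1f} use cases")
```

Under that assumption, control firms would be using roughly 6.1 AI use cases and treated firms roughly 8.8.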

Why this matters, AI firms will out-compete non-AI firms:

The main takeaway here is that deep and sophisticated adoption of AI for internal acceleration creates early-stage companies which are more competitive than those which haven’t embedded AI at their core. This makes intuitive sense – in prior technology shifts, companies built around the new technology tended to out-compete those that weren’t (think the internet and Amazon versus Barnes and Noble, or client PCs instead of mainframes and Microsoft versus IBM).
At the same time, it surely implies that one of the first ways AI will show up in the economy is through the emergence of a new class of competitive firms that are more efficient with capital (in part by employing fewer people) than the firms they displace.

Read more:

Mapping AI into Production: A Field Experiment on Firm Performance (SSRN)

MIT: A rising tide of automation will deliver good-enough AI for most text-based tasks by 2029:

How do you revolutionize an economy? Gradually and consistently…

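One view holds that the rise of AI capabilities yields rapid, discontinuous changes that are disruptive to labor (crashing waves); the alternative is that capabilities improve gradually across many tasks at once (rising tides).
This study found substantial evidence that rising tides are the primary form of AI automation.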
For realistic and representative real-world labor-market tasks that are text-based, or partially text-based, AI capabilities are already substantial and poised to expand broadly. But, rather than arriving in crashing waves that transform a certain set of tasks at a time, progress typically resembles a rising tide, with widespread gains across many tasks simultaneously.

What they studied:

The researchers looked at 3,000 tasks based on the O*NET job families and paired that with 17,000 evaluations by workers who perform these tasks to try to figure out how the rise of AI is changing work.

Why this matters, gradual automation:

The MIT researchers find “that between 2024-Q2 and 2025-Q3, frontier models went from achieving a 50% success rate on 3- to 4-hour tasks to 1-week tasks, and achieved a 70% success rate on 1-minute tasks to 1-hour tasks.”
Across a large set of realistic and representative labor-market tasks addressable by LLMs, the downward slope between task success and task duration is surprisingly flat, i.e., more consistent with a rising tide than a crashing wave.
Automation within particular “job families” (e.g., management or community and social service) also follows the same rising-tide pattern in most cases.
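To see why a flat slope between success and duration matters, here is a minimal sketch of a success curve and the task duration at which it crosses a given success rate. The logistic shape in log2(duration) is an assumption chosen for illustration, not something the text confirms as the MIT researchers' method, and the 40-hour midpoint and the two slope values are made-up parameters.

```python
import math


def success_prob(duration_hours: float, midpoint_hours: float, slope: float) -> float:
    """Assumed logistic success curve in log2(duration); equals 0.5 at midpoint_hours."""
    x = math.log2(duration_hours / midpoint_hours)
    return 1.0 / (1.0 + math.exp(slope * x))


def horizon(p: float, midpoint_hours: float, slope: float) -> float:
    """Task duration at which the assumed curve gives success probability p."""
    logit = math.log(p / (1.0 - p))
    return midpoint_hours * 2 ** (-logit / slope)


# Compare a steep curve with a flat one (both slopes are hypothetical).
for label, slope in (("steep", 1.0), ("flat", 0.15)):
    h50 = horizon(0.5, 40.0, slope)
    h70 = horizon(0.7, 40.0, slope)
    print(f"{label} slope: 50% horizon {h50:.1f} h, 70% horizon {h70:.3f} h, "
          f"ratio {h50 / h70:.0f}x")
```

With a steep curve, the 50% and 70% horizons sit close together, so a model is either good at a task length or not. With a flat curve they are far apart, which means success erodes slowly over a very wide range of task durations, and an improvement in the model lifts many task lengths at once.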

Don’t let gradual fool you, AI is here, and it’s changing everything:

The MIT researchers find that AI assistance is spreading steadily across more and more tasks over time, rather than arriving as a sudden shift in which one task becomes fully automated overnight.
If this slow but steady progression continues, then by 2029 AI will be good enough at most text-based tasks to significantly affect a wide range of industries and labor markets.

***

Key Takeaways

AI systems are getting better at cyberattacks, with the most recent models achieving 50% success on tasks that take human experts roughly three hours.
Startups adopting AI for internal use see meaningful improvements in performance and revenue, suggesting that integrating AI into business processes can be highly beneficial.
The rise of good enough AI is gradual rather than sudden. Over time, more tasks will be completed with the assistance of AI, leading to significant changes in labor markets and industry dynamics by 2029.
