“`html

Import AI 448: AI R&D; Bytedance’s CUDA-writing agent; on-device satellite AI

Import AI

Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe.

AI progress is moving faster than even well-regarded forecasters can guess:

…Ajeya Cotra updates her timelines…
Ajeya Cotra, a longtime AI thinker, has updated her predictions for AI capabilities in 2026. She writes that AI systems are progressing even faster than she initially thought, based on recent results from METR (an organization that evaluates AI capabilities, including how long a software engineering task AI systems can complete). Specifically, Opus 4.6 was found to have a time horizon of just over 12 hours by January 2026, which Ajeya had previously predicted would take around 24 hours.

This rapid progress is concerning because it suggests that AI systems are likely to surpass human capabilities sooner than anticipated. Ajeya notes that by the end of the year, AI agents might have a time horizon of over 100 hours on tasks such as those in METR’s suite, and this could be even more challenging when considering multiple weeks of work.
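To get a feel for what moving from roughly 12 hours to roughly 100 hours within a year would imply, here is a rough back-of-the-envelope sketch. It assumes time horizons grow exponentially and that about 11 months separate the two figures; both are assumptions made for illustration, and the function names are hypothetical rather than anything from METR or Ajeya's post.

```python
import math


def implied_doubling_time(start_hours, end_hours, months_elapsed):
    """Months per doubling, assuming smooth exponential growth."""
    doublings = math.log2(end_hours / start_hours)
    return months_elapsed / doublings


def project_horizon(start_hours, doubling_months, months_ahead):
    """Extrapolate a time horizon forward under the same assumption."""
    return start_hours * 2 ** (months_ahead / doubling_months)


# Figures quoted in the text: ~12 hours in January, ~100 hours by year end.
# The 11-month gap is an assumption, not a number from the source.
doubling = implied_doubling_time(12, 100, 11)
print(f"Implied doubling time: {doubling:.1f} months")
print(f"Projected horizon after 6 months: {project_horizon(12, doubling, 6):.0f} hours")
```

Under these assumptions the implied doubling time comes out at about 3.6 months. The point is only to show how sensitive the year-end figure is to the growth rate, not to reproduce anyone's forecast.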

Want to measure AI R&D? Here are 14 ways to do it:

The most significant property of artificial intelligence could be when it starts building itself. Researchers have identified 14 distinct metrics that can help us understand how well companies are managing and overseeing AI R&D automation (AIRDA), which is the process of getting AI systems to build other AI systems.

Measure AI performance on AI R&D tasks
Compare AI R&D performance with human and human-AI teams
Evaluate oversight red teaming: How effectively can human teams supervise AI systems that are building themselves?
Analyze misalignment in AIRDA processes
Calculate the rate of efficiency improvements on AI R&D tasks
Survey staff to understand how they use AI and its impact on productivity
Investigate where AI is used in high-stakes decisions
Examine how researchers spend their time
Monitor the effectiveness of overseeing AI development (e.g., bug rates, undesired behaviors)
Evaluate if and when AI systems subvert human goals
Track the headcount of AI researchers and their performance
Analyze compute usage across different stages of AI R&D processes
Measure how much compute is used for AI R&D compared to total spending
Evaluate permissions granted to AI systems over time

Specifically, companies should:

Track the difference in progress between safety and capabilities research. Is one area outpacing the other?
Monitor how AI R&D affects oversight processes. Does it help or hinder human control?
Measure actual AI R&D activity within their organization.

Governments should:

Develop confidential reporting systems to aggregate data from multiple companies for better understanding of AI progress.

Third parties can:

Use public sources to estimate metrics related to AI R&D, such as compute usage and performance.
Create tools and surveys to gather more insights about AI R&D activities within organizations.

Governing AI R&D:

An actor has oversight over the AI R&D process if they understand it well enough to exercise informed control. Measuring these aspects is crucial for effective governance of AI.

Indian researchers use edge computing to prototype a citywide camera network:

The Indian Institute of Science in Bengaluru developed an AI-driven Intelligent Transportation System (AIITS) that uses lightweight edge GPUs (Jetson accelerators) to process video streams from traffic cameras. This system allows for real-time analytics without the need to transmit all data to a central hub, enabling sustainable city-scale monitoring.
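The bandwidth argument for processing at the edge can be sketched with simple arithmetic. Every number below (camera count, stream bitrate, size of the analytics summary) is a hypothetical placeholder chosen for illustration and does not come from the AIITS work.

```python
def uplink_mbps(num_cameras, per_camera_kbps):
    """Total uplink needed at the central hub, in megabits per second."""
    return num_cameras * per_camera_kbps / 1000.0


# Hypothetical figures, not taken from the source.
NUM_CAMERAS = 1000
RAW_VIDEO_KBPS = 2000   # assumed bitrate of one compressed video stream
SUMMARY_KBPS = 5        # assumed size of per-camera analytics (counts, classes)

central = uplink_mbps(NUM_CAMERAS, RAW_VIDEO_KBPS)  # ship all video to the hub
edge = uplink_mbps(NUM_CAMERAS, SUMMARY_KBPS)       # ship only edge-computed results

print(f"Centralized: {central:.0f} Mbps, edge: {edge:.0f} Mbps")
print(f"Reduction factor: {central / edge:.0f}x")
```

With these placeholder values, sending only the analytics cuts uplink traffic by a factor of a few hundred, which is the general reason edge processing makes a citywide camera network easier to sustain.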

Helping satellites run on-device AI for arctic monitoring:

The German Research Center for Artificial Intelligence created TinyIceNet, a tiny vision model designed to estimate sea ice thickness from synthetic aperture radar (SAR) data. This work demonstrates the feasibility of running sophisticated AI models in resource-constrained environments like satellite sensors.

“`
