I adapted the new δ-mem research to Apple Silicon using MLX and an OpenClaw integration: my findings

By AI Maestro May 16, 2026 2 min read
Improving Agent Responses with δ-mem on Apple Silicon

I’ve been exploring ways to improve memory management for my AI agents and came across a new paper titled “Improving Model Attention Direction Without Using Context”. The paper aims to improve model performance by adjusting attention direction without relying on context or LoRA, and the authors report about 20% better answers in their tests.

To test this approach, I implemented it with MLX, which is significantly faster than other methods. I ran the tests both with and without my OpenClaw session history to see whether the history made a difference. The integration works, and the code is available at this repository.
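The with/without-history comparison can be sketched as a small harness that scores the same probes under both conditions. Everything here is hypothetical and for illustration only: `Probe`, `pass_rate`, and the toy `toy_answer` function stand in for the real model call and the real OpenClaw history, and substring matching is an assumed scoring rule, not necessarily the one used in the repository.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Probe:
    question: str
    expected: str  # substring a correct answer should contain (assumed scoring rule)


# An answer function takes a question plus prior session lines and returns text.
AnswerFn = Callable[[str, Sequence[str]], str]


def pass_rate(answer: AnswerFn, probes: Sequence[Probe], history: Sequence[str]) -> float:
    """Fraction of probes whose answer contains the expected substring."""
    passed = sum(
        p.expected.lower() in answer(p.question, history).lower() for p in probes
    )
    return passed / len(probes)


def toy_answer(question: str, history: Sequence[str]) -> str:
    """Toy stand-in for a model: it can only answer from the history it is given."""
    words = [w.strip("?.,").lower() for w in question.split()]
    for line in history:
        if any(w in line.lower() for w in words if len(w) > 3):
            return line
    return "I don't know."


history = ["The deploy target is staging.", "The user's editor is Helix."]
probes = [
    Probe("Which editor does the user prefer?", "Helix"),
    Probe("What is the deploy target?", "staging"),
]

with_history = pass_rate(toy_answer, probes, history)
without_history = pass_rate(toy_answer, probes, [])
print(f"with history: {with_history:.2f}, without history: {without_history:.2f}")
```

Swapping `toy_answer` for a function that calls the plain model, and then for one that calls the δ-mem model, would give the four combinations that matter for this kind of test.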

I also created an adapter for this approach, which is available on Hugging Face: delta-mem-qwen3-4b-instruct-mlx-adapter. The results of my local tests are summarized in the following table:

| Test | Plain | δ-mem | Ratio |
|---|---|---|---|
| Synthetic paper-style | 0.5129 | 0.5129 | 1.00x |
| LoCoMo-10 mini | 0.0500 | 0.1833 | 3.67x |
| OpenClaw replay | 0.5701 | 0.6667 | 1.17x |
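The Ratio column is the δ-mem score divided by the plain score. A minimal sketch that recomputes it from the table values (the function and variable names are my own, for illustration):

```python
# Scores copied from the table above, as (plain, delta_mem) pairs.
scores = {
    "Synthetic paper-style": (0.5129, 0.5129),
    "LoCoMo-10 mini": (0.0500, 0.1833),
    "OpenClaw replay": (0.5701, 0.6667),
}


def improvement_ratio(plain: float, delta_mem: float) -> float:
    """Return the delta-mem score relative to the plain baseline."""
    if plain <= 0:
        raise ValueError("baseline score must be positive")
    return delta_mem / plain


ratios = {name: improvement_ratio(p, d) for name, (p, d) in scores.items()}
gains = {name: d - p for name, (p, d) in scores.items()}

for name in scores:
    print(f"{name}: {ratios[name]:.2f}x (absolute gain {gains[name]:+.4f})")
```

The absolute gain is worth reading next to the ratio: the LoCoMo-10 mini baseline is very low (0.0500), so a gain of about 0.13 shows up as a 3.67x ratio.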

The paper reports positive benchmark results, with δ-mem showing improvements of 1.31x and 1.20x on tests such as MemoryAgentBench and LoCoMo-10 mini. My local tests were more varied, and the latency figures were as follows:

| Test | Latency |
|---|---|
| Synthetic | 1.013x |
| LoCoMo-10 mini | 1.33x query / 1.50x total |
| OpenClaw replay | 1.30x |
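A latency ratio of this kind can be obtained by timing the same workload with and without δ-mem and dividing the two. This is a minimal sketch, assuming the ratios are δ-mem time divided by plain time (the post does not state this explicitly). `run_plain` and `run_delta_mem` are hypothetical stand-ins that only sleep; they are not the benchmark code from the repository.

```python
import time
from statistics import median
from typing import Callable


def median_latency(fn: Callable[[], object], repeats: int = 5) -> float:
    """Median wall-clock seconds over `repeats` calls of fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return median(timings)


def latency_ratio(
    baseline: Callable[[], object],
    candidate: Callable[[], object],
    repeats: int = 5,
) -> float:
    """Candidate latency relative to baseline; values above 1.0 mean slower."""
    return median_latency(candidate, repeats) / median_latency(baseline, repeats)


# Hypothetical workloads: replace these with real generation calls.
def run_plain() -> None:
    time.sleep(0.01)


def run_delta_mem() -> None:
    time.sleep(0.03)


ratio = latency_ratio(run_plain, run_delta_mem)
print(f"latency ratio: {ratio:.2f}x")
```

Using the median rather than the mean keeps a single slow run (for example, a first call that loads weights) from dominating the result.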

The synthetic tests were flat, but the LoCoMo-10 mini test showed a surprisingly large gain (0.0500 to 0.1833), which is encouraging. The OpenClaw-style replay showed a more practical improvement: 7 out of 8 probes passed with δ-mem, compared to 6 out of 8 without it.

While the paper's results were promising, my local results were lower, likely because Apple Silicon cannot run CUDA and the method had to be ported to MLX. I’m eager to test this approach on a more powerful model such as the Qwen3.6:27b I currently run on MLX, which would require an adapted model. However, as I am currently unemployed and cannot afford to run an experiment of that scale in the cloud, I would be grateful if someone with access to such resources could continue this work.

In summary, integrating δ-mem on Apple Silicon has shown potential improvements in agent responses, particularly when using session history. The results are promising, but more testing is needed to validate these findings across different hardware and models.

Key Takeaways

  • Integrating δ-mem improved response quality in two of the three local tests, without relying on context or LoRA.
  • Local results were mixed and lower than the paper's, likely because Apple Silicon cannot run CUDA and the method had to be ported to MLX.
  • Testing a more powerful model such as Qwen3.6:27b would require an adapted model, which is currently beyond my means as an unemployed individual.
