Where should durable memory live in a multi-agent setup? A small research scaffold


I ran into a recurring issue while running long-term projects with AI agents. The specialists performed their individual tasks well but did not keep a consistent project memory over time. By week 4, decisions from the first weeks had been forgotten, and previously rejected options were being reopened.

Learning from consulting practices

To address this problem, I examined how long-running projects are managed in consulting firms. They use a transformation office or project management office (PMO) to maintain cadence, document decisions, track risks, and keep a canonical current-state artifact. The key is the operating model rather than the specific roles.

Academic insights

The literature distinguishes project memory (the knowledge available for current work) from the project-memory system (how this knowledge is stored, retrieved, and used); see, for example, Kasvi et al. (2003).

More recently, Mariano and Awazu (2024) view project memory as an active practice rather than a static repository. This perspective aligns with recent work in the LLM space, such as Anthropic's multi-agent research system, the OpenAI Agents SDK handoff pattern, and modular or hierarchical memory management like LEGOMem and AgentSys.

My hypothesis

I propose that durable memory should be managed by a single entity: the project owner. This central authority keeps the record consistent across all agents working on different tasks. Task specialists receive only the minimal context they need for their specific roles, without access to the full historical data.

The core of this setup is the persistent “PM soul,” which serves as the focal point for framing ambiguous requests, decomposing work into manageable segments, and writing concise handoff briefs. This soul verifies the correctness of returned work and only writes evidence-backed facts into the project memory.
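As a minimal sketch of the "evidence-backed facts only" rule, the PM-owned memory could refuse any write that arrives without a supporting reference. The names `Fact`, `ProjectMemory`, and `record`, and the file path in the usage lines, are hypothetical and chosen for illustration; they are not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    statement: str
    evidence: tuple[str, ...]  # references backing the statement, e.g. file paths


@dataclass
class ProjectMemory:
    """Durable memory owned by the PM; specialists never write to it directly."""
    facts: list[Fact] = field(default_factory=list)

    def record(self, statement: str, evidence: list[str]) -> bool:
        """Store a fact only if at least one non-empty evidence reference is given."""
        refs = tuple(e.strip() for e in evidence if e.strip())
        if not refs:
            return False  # unverified claims stay out of durable memory
        self.facts.append(Fact(statement, refs))
        return True


memory = ProjectMemory()
# Hypothetical entries: one with a backing file, one with no evidence at all.
accepted = memory.record("Chose SQLite for the pilot", ["decisions/week1.md"])
rejected = memory.record("Specialist says the API is stable", [])
```

Here the first write is stored and the second is refused, so a specialist's unverified claim never becomes part of the canonical record. A real check would presumably also confirm that the evidence exists and supports the statement, which this sketch does not attempt.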

Implementation details

The repository I created is a proof-of-concept rather than a fully validated solution. It includes an agent contract, templates for memory files and handoff briefs, a consulting workflow map with sources, a case study, and an evaluation rubric with metrics such as repeated-context events, handoff brief length, decision closure time, and specialist rework loops.
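The rubric metrics named above could be aggregated from per-handoff records along these lines. This is a sketch under assumptions: the record fields, the units (words, days), and the choice of sums versus means are mine, not the repository's definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HandoffRecord:
    brief_words: int         # length of the handoff brief, in words (assumed unit)
    repeated_context: int    # times already-known context had to be restated
    rework_loops: int        # times the specialist's work was sent back
    days_to_decision: float  # time from raising a decision to closing it


def summarize(records: list[HandoffRecord]) -> dict[str, float]:
    """Aggregate the four rubric metrics over a pilot period."""
    if not records:
        raise ValueError("no handoff records to summarize")
    return {
        "repeated_context_events": sum(r.repeated_context for r in records),
        "mean_brief_words": mean(r.brief_words for r in records),
        "mean_decision_closure_days": mean(r.days_to_decision for r in records),
        "rework_loops": sum(r.rework_loops for r in records),
    }


# Made-up numbers for illustration only, not pilot results.
summary = summarize([
    HandoffRecord(brief_words=120, repeated_context=1, rework_loops=0, days_to_decision=2.0),
    HandoffRecord(brief_words=180, repeated_context=0, rework_loops=2, days_to_decision=4.0),
])
```

Tracking the same four numbers before and after introducing the PM role would give the one-week pilot a baseline to compare against.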

The next step is to conduct a one-week pilot on a live project. The goal is to gather feedback and refine the system before claiming it as a validated solution.

Memory boundary considerations

I am particularly concerned about how the memory boundary is defined. My current rule states that specialists should only see the handoff brief plus the files they need, without full access to the project's historical context.

My concern is whether this rule holds when a previous option was rejected: the specialist may then need more than the handoff brief. In such cases, what the specialist sees might have to grow to include the details that are crucial for understanding past decisions.
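One possible way to handle rejected options without opening the full history is to have the PM copy only the decisions tagged with the task's topic into the brief. The sketch below assumes a decision log with a `topic` tag and a `status` field; both, along with `build_brief` and the example entries, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    topic: str
    option: str
    status: str  # "accepted" or "rejected"
    reason: str


def build_brief(task: str, topic: str, files: list[str], log: list[Decision]) -> str:
    """Build a handoff brief that carries only the decisions for this topic."""
    lines = [f"Task: {task}", "Files: " + ", ".join(files)]
    for d in log:
        if d.topic != topic:
            continue  # unrelated history stays with the PM
        if d.status == "accepted":
            lines.append(f"Decided: {d.option} ({d.reason})")
        elif d.status == "rejected":
            lines.append(f"Do not reopen: {d.option} ({d.reason})")
    return "\n".join(lines)


# Hypothetical decision log for illustration.
log = [
    Decision("storage", "Postgres", "rejected", "too heavy for a one-week pilot"),
    Decision("storage", "SQLite", "accepted", "zero setup"),
    Decision("auth", "OAuth", "rejected", "out of scope"),
]
brief = build_brief("Write the schema migration", "storage", ["schema.sql"], log)
```

The storage specialist sees why Postgres was ruled out but nothing about the auth discussion, so the brief grows with the number of relevant decisions rather than with the age of the project. Whether topic tags are a fine enough filter is exactly the open question.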

I am seeking input from others who have implemented similar systems or encountered issues related to managing memory in multi-agent setups. How do you handle this boundary? Have you found specific solutions to deal with this issue?

Key Takeaways

The central authority (PM soul) is crucial for maintaining consistent project memories.
Specialists should receive minimal, scoped context rather than full historical data.
A persistent memory file managed by the PM soul ensures consistency and reliability across different task specialists.
Feedback from a live trial will help refine this hypothesis before claiming it as a validated solution.

I am open to discussion on how we can further improve this approach based on practical experience and feedback.
