A Coding Implementation to Build Agent-Native Memory Infrastructure with Memori for Persistent Multi-User and Multi-Session LLM Applications


By AI Maestro · May 11, 2026

Key Takeaways

  • The tutorial demonstrates how to wire Memori into the OpenAI Python clients to give LLM applications agent-native, persistent memory.
  • Attribution by entity_id keeps users such as Alice and Bob in separate memory scopes, while process_id gives the same user distinct personas, such as a fitness coach or a meal planner.
  • Session management groups related turns, so project decisions stay together and unrelated personal details don't bleed into a project summary.
  • The example also exercises streaming responses and asynchronous LLM calls to confirm that memory capture works with both patterns.
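
The complete script below walks through seven parts. It assumes the Memori SDK and the OpenAI Python client are installed (for example, pip install memori openai; check the Memori docs for the exact package name for your distribution) and that OPENAI_API_KEY is set in the environment. The short sleeps give Memori's background memory writes time to land before the next recall.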
import asyncio
import time
import uuid

from memori import Memori
from openai import OpenAI, AsyncOpenAI

client       = OpenAI()
async_client = AsyncOpenAI()

# Register both clients so Memori observes every completion they make.
mem = Memori()
mem.llm.register(client)
mem.llm.register(async_client)

MODEL       = "gpt-4o-mini"
WRITE_DELAY = 6  # seconds to let background memory writes land before recall

def ask(prompt, system=None):
    """Send a single chat turn and return the assistant's reply text."""
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    r = client.chat.completions.create(model=MODEL, messages=msgs)
    return r.choices[0].message.content

def banner(t):
    print("\n" + "=" * 78 + f"\n {t}\n" + "=" * 78)

banner("Part 1 — Basic memory: facts persist across turns")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
ask("My name is Alice. I love hiking, Italian food, and I'm allergic to peanuts.")
time.sleep(WRITE_DELAY)
print("[Alice]", ask("What do you know about me? Be specific."))
banner("Part 2 — Multi-tenant memory: Bob's facts don't leak into Alice's recall")
mem.attribution(entity_id="bo*@*****le.com", process_id="personal-assistant")
ask("I'm Bob. Vegetarian, write Rust for a living, live in Berlin.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
print("[Alice]", ask("What's my favorite cuisine and any dietary issues?"))
mem.attribution(entity_id="bo*@*****le.com", process_id="personal-assistant")
print("[Bob]  ", ask("Which programming language do I write professionally?"))

banner("Part 3 — Same user, different agent personas via process_id")
mem.attribution(entity_id="al***@*****le.com", process_id="fitness-coach")
ask("Goal: sub-25-minute 5K by June. Currently I run 30 minutes flat.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="meal-planner")
ask("Prefer low-carb dinners on weekdays.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="fitness-coach")
print("[fitness-coach]", ask("Remind me of my running goal."))
mem.attribution(entity_id="al***@*****le.com", process_id="meal-planner")
print("[meal-planner] ", ask("Suggest tonight's dinner."))

banner("Part 4 — Sessions group related turns")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
project_session = f"project-fastapi-{uuid.uuid4().hex[:8]}"
mem.set_session(project_session)
ask("Notes: building a FastAPI app called 'Lighthouse', Python 3.12, "
   "deploying to Fly.io.")
time.sleep(WRITE_DELAY)
ask("Decision: SQLAlchemy + Alembic for the data layer.")
time.sleep(WRITE_DELAY)
mem.new_session()
ask("Random aside: I just adopted a puppy named Mochi.")
time.sleep(WRITE_DELAY)
mem.set_session(project_session)
print("[project session]",
     ask("Summarize what we've decided about Lighthouse so far."))

banner("Part 5 — Streaming")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
stream = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user",
               "content": "In two sentences, what do you remember about me?"}],
    stream=True,
)
print("[stream] ", end="")
for chunk in stream:
    d = chunk.choices[0].delta.content
    if d:
        print(d, end="", flush=True)
print()
time.sleep(WRITE_DELAY)

banner("Part 6 — Async LLM calls")
async def async_demo():
    r = await async_client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": "What dietary restriction do I have? (asked async)"}],
    )
    return r.choices[0].message.content

print("[async]", asyncio.run(async_demo()))

banner("Part 7 — Mini support agent across multiple sessions")
def support(user_id, prompt):
    mem.attribution(entity_id=user_id, process_id="support-bot")
    return ask(prompt, system=(
        "You are a calm, helpful customer support agent. "
        "Use what you remember about the user. If you don't know, say so."
    ))

mem.new_session()
print("[support T1]", support("ch*****@*****le.com", 
   "Hi! I'm Charlie, on the Pro plan. Email: ch*****@*****le.com. "
   "Billing question for next month."))

mem.new_session()
print("[support T2]", support("ch*****@*****le.com",
   "Hey, me again. What plan am I on and what's my email of record?"))

banner("Done. Open https://app.memorilabs.ai to inspect memories, "
      "or use Memori BYODB to point at your own Postgres.")

Originally published at marktechpost.com. Curated by AI Maestro.
