A Coding Implementation to Build Agent-Native Memory Infrastructure with Memori for Persistent Multi-User and Multi-Session LLM Applications


By AI Maestro · May 11, 2026

Key Takeaways

  • The tutorial demonstrates how to wire Memori into the OpenAI Python clients to give LLM applications agent-native, persistent memory.
  • Attribution by entity_id keeps users such as Alice and Bob in separate memory scopes, while process_id gives the same user distinct personas, such as a fitness coach or a meal planner.
  • Session management groups related turns, so project decisions stay together and unrelated personal details don't bleed into a project summary.
  • The example also exercises streaming responses and asynchronous LLM calls to confirm that memory capture works with both patterns.
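
The complete script below walks through seven parts. It assumes the Memori SDK and the OpenAI Python client are installed (for example, pip install memori openai; check the Memori docs for the exact package name for your distribution) and that OPENAI_API_KEY is set in the environment. The short sleeps give Memori's background memory writes time to land before the next recall.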
import asyncio
import time
import uuid

from memori import Memori
from openai import OpenAI, AsyncOpenAI

client       = OpenAI()
async_client = AsyncOpenAI()

# Register both clients so Memori observes every completion they make.
mem = Memori()
mem.llm.register(client)
mem.llm.register(async_client)

MODEL       = "gpt-4o-mini"
WRITE_DELAY = 6  # seconds to let background memory writes land before recall

def ask(prompt, system=None):
    """Send a single chat turn and return the assistant's reply text."""
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    r = client.chat.completions.create(model=MODEL, messages=msgs)
    return r.choices[0].message.content

def banner(t):
    print("\n" + "=" * 78 + f"\n {t}\n" + "=" * 78)

banner("Part 1 — Basic memory: facts persist across turns")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
ask("My name is Alice. I love hiking, Italian food, and I'm allergic to peanuts.")
time.sleep(WRITE_DELAY)
print("[Alice]", ask("What do you know about me? Be specific."))
banner("Part 2 — Multi-tenant memory: Bob's facts don't leak into Alice's recall")
mem.attribution(entity_id="bo*@*****le.com", process_id="personal-assistant")
ask("I'm Bob. Vegetarian, write Rust for a living, live in Berlin.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
print("[Alice]", ask("What's my favorite cuisine and any dietary issues?"))
mem.attribution(entity_id="bo*@*****le.com", process_id="personal-assistant")
print("[Bob]  ", ask("Which programming language do I write professionally?"))

banner("Part 3 — Same user, different agent personas via process_id")
mem.attribution(entity_id="al***@*****le.com", process_id="fitness-coach")
ask("Goal: sub-25-minute 5K by June. Currently I run 30 minutes flat.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="meal-planner")
ask("Prefer low-carb dinners on weekdays.")
time.sleep(WRITE_DELAY)
mem.attribution(entity_id="al***@*****le.com", process_id="fitness-coach")
print("[fitness-coach]", ask("Remind me of my running goal."))
mem.attribution(entity_id="al***@*****le.com", process_id="meal-planner")
print("[meal-planner] ", ask("Suggest tonight's dinner."))

banner("Part 4 — Sessions group related turns")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
project_session = f"project-fastapi-{uuid.uuid4().hex[:8]}"
mem.set_session(project_session)
ask("Notes: building a FastAPI app called 'Lighthouse', Python 3.12, "
   "deploying to Fly.io.")
time.sleep(WRITE_DELAY)
ask("Decision: SQLAlchemy + Alembic for the data layer.")
time.sleep(WRITE_DELAY)
mem.new_session()
ask("Random aside: I just adopted a puppy named Mochi.")
time.sleep(WRITE_DELAY)
mem.set_session(project_session)
print("[project session]",
     ask("Summarize what we've decided about Lighthouse so far."))

banner("Part 5 — Streaming")
mem.attribution(entity_id="al***@*****le.com", process_id="personal-assistant")
stream = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user",
               "content": "In two sentences, what do you remember about me?"}],
    stream=True,
)
print("[stream] ", end="")
for chunk in stream:
    d = chunk.choices[0].delta.content
    if d:
        print(d, end="", flush=True)
print()
time.sleep(WRITE_DELAY)

banner("Part 6 — Async LLM calls")
async def async_demo():
    r = await async_client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": "What dietary restriction do I have? (asked async)"}],
    )
    return r.choices[0].message.content

print("[async]", asyncio.run(async_demo()))

banner("Part 7 — Mini support agent across multiple sessions")
def support(user_id, prompt):
    mem.attribution(entity_id=user_id, process_id="support-bot")
    return ask(prompt, system=(
        "You are a calm, helpful customer support agent. "
        "Use what you remember about the user. If you don't know, say so."
    ))

mem.new_session()
print("[support T1]", support("ch*****@*****le.com", 
   "Hi! I'm Charlie, on the Pro plan. Email: ch*****@*****le.com. "
   "Billing question for next month."))

mem.new_session()
print("[support T2]", support("ch*****@*****le.com",
   "Hey, me again. What plan am I on and what's my email of record?"))

banner("Done. Open https://app.memorilabs.ai to inspect memories, "
      "or use Memori BYODB to point at your own Postgres.")

Originally published at marktechpost.com. Curated by AI Maestro.
