SterriaR
Enterprise AI

Multi-Tenant AI Assistant Platform (Dify RAG / LLM Chatbot)

Designed and developed a multi-tenant AI assistant platform for a major enterprise where multiple departments maintain their own knowledge bases. Built an API layer bridging internal RAG and external LLMs.

Sep 2025 – Mar 2026 · Role: Programmer (5-14 person team)
TypeScript · Python · Next.js · FastAPI · GKE · Dify · Pinecone

Challenge

  • Multiple departments wanted to use AI assistants, but a shared platform must not commingle each department's confidential information.
  • The existing Dify implementation assumed single-tenant use and lacked sufficient permission and log isolation.
  • An abstraction layer was needed to flexibly switch between LLM providers (Claude / OpenAI / Gemini).

Solution

  • Embedded tenant IDs in JWTs and, on every Dify API call, validated them against the tenant given in both the URL and the request header.
  • Isolated the Pinecone vector space per tenant using namespaces and blocked cross-tenant search outright at the API layer.
  • Implemented an LLM provider abstraction in FastAPI; controlled model switching via environment variables and feature flags.
  • Split GKE Deployments by tenant group and applied resource limits at the tenant level.
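
As an illustration of the first point, the tenant check might look roughly like the sketch below. It is not the project's code: it assumes HS256-signed tokens, a hypothetical `tenant_id` claim, and that the API layer has already extracted the tenant from the URL path and from a request header. It uses only the Python standard library, and expiry and audience checks are left out for brevity.

```python
import base64
import hashlib
import hmac
import json


class TenantMismatch(Exception):
    """Raised when the token is invalid or the tenant values disagree."""


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build an HS256 token. Only used here to produce a demo input."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify the signature and return the claims (no expiry check here)."""
    try:
        header, payload, signature = token.split(".")
        if json.loads(_b64url_decode(header)).get("alg") != "HS256":
            raise TenantMismatch("unexpected algorithm")
        expected = hmac.new(
            secret, f"{header}.{payload}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            raise TenantMismatch("bad signature")
        return json.loads(_b64url_decode(payload))
    except ValueError:
        # Covers a wrong segment count, bad base64, and bad JSON.
        raise TenantMismatch("malformed token") from None


def authorize_tenant(
    token: str, secret: bytes, url_tenant: str, header_tenant: str
) -> str:
    """Return the tenant ID only if token, URL, and header all agree."""
    token_tenant = verify_hs256(token, secret).get("tenant_id")  # assumed claim name
    if not token_tenant or not (token_tenant == url_tenant == header_tenant):
        raise TenantMismatch("tenant mismatch")
    return token_tenant
```

A request for tenant `sales` passes only when all three sources say `sales`. A token issued for `sales` that is replayed against another department's URL raises `TenantMismatch` before anything is forwarded to Dify.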

Technology Decisions

Why Dify

OSS (auditable), excellent prompt-management UI, easy in-house deployment. Built UI/prompt-management features in 2 weeks that would have taken 1-2 person-months from scratch.

Why Pinecone for vector DB

Multi-tenant-capable vector DBs were limited at the time, and Pinecone's namespace feature enabled tenant isolation. A future migration to an OSS alternative such as Weaviate or Milvus is also under consideration.
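
One way to picture namespace-based isolation is a thin wrapper that pins the namespace when it is constructed and refuses any caller-supplied override. The sketch below is an assumption-laden illustration, not the production code: `TenantScopedIndex`, the `tenant-<id>` naming scheme, and the `RecordingIndex` stand-in are all hypothetical. The wrapped object only needs a `query(vector=..., top_k=..., namespace=...)` method, which matches the keyword arguments of the Pinecone Python client's `Index.query`.

```python
from dataclasses import dataclass
from typing import Any, Protocol, Sequence


class VectorIndex(Protocol):
    """Minimal shape of the index object the wrapper relies on."""

    def query(
        self, *, vector: Sequence[float], top_k: int, namespace: str, **kwargs: Any
    ) -> Any: ...


@dataclass(frozen=True)
class TenantScopedIndex:
    """Wraps an index so every search is pinned to one tenant's namespace."""

    index: VectorIndex
    tenant_id: str

    def __post_init__(self) -> None:
        # Reject empty or odd-looking IDs so a namespace cannot be spoofed.
        if not self.tenant_id or not self.tenant_id.replace("-", "").isalnum():
            raise ValueError(f"invalid tenant id: {self.tenant_id!r}")

    @property
    def namespace(self) -> str:
        return f"tenant-{self.tenant_id}"  # hypothetical naming scheme

    def search(self, vector: Sequence[float], top_k: int = 5, **kwargs: Any) -> Any:
        # The namespace is never taken from the caller.
        if "namespace" in kwargs:
            raise PermissionError("namespace is fixed by the tenant scope")
        return self.index.query(
            vector=list(vector), top_k=top_k, namespace=self.namespace, **kwargs
        )


class RecordingIndex:
    """In-memory stand-in for a real index; it only records the calls."""

    def __init__(self) -> None:
        self.calls: list = []

    def query(self, *, vector, top_k, namespace, **kwargs):
        self.calls.append({"namespace": namespace, "top_k": top_k})
        return {"matches": [], "namespace": namespace}
```

Handlers would receive a `TenantScopedIndex` built from the authenticated tenant rather than the raw index, so a cross-tenant query cannot be expressed through this interface at all.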

Outcomes

Tenants Served

5 → 20+ departments

Scalability proven; zero code changes needed for horizontal expansion

Inference Cost

30% reduction

Lighter tasks routed to lower-cost models via LLM abstraction
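
The routing behind this figure can be sketched as a small provider-agnostic router. Everything named here is hypothetical: the environment variables `LLM_HEAVY_ROUTE`, `LLM_LIGHT_ROUTE`, and `LLM_ROUTE_LIGHT_TASKS`, the `light` / `heavy` tiers, and the placeholder provider and model names. Each provider is modelled as a plain callable `(model, prompt) -> str`, standing in for whatever adapter wraps the real SDK.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

# A provider adapter takes (model, prompt) and returns the completion text.
ProviderCall = Callable[[str, str], str]


@dataclass(frozen=True)
class ModelRoute:
    provider: str
    model: str


def parse_route(value: str) -> ModelRoute:
    """Parse 'provider:model' as read from an environment variable."""
    provider, sep, model = value.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {value!r}")
    return ModelRoute(provider, model)


class LLMRouter:
    """Chooses a provider and model per task tier from env-style settings."""

    def __init__(
        self,
        providers: Mapping[str, ProviderCall],
        env: Optional[Mapping[str, str]] = None,
    ) -> None:
        env = os.environ if env is None else env
        self._providers = dict(providers)
        # Placeholder defaults; real deployments would set these explicitly.
        self._heavy = parse_route(env.get("LLM_HEAVY_ROUTE", "provider-a:large"))
        self._light = parse_route(env.get("LLM_LIGHT_ROUTE", "provider-b:small"))
        # Feature flag: when off, every task uses the heavy route.
        self._split_enabled = env.get("LLM_ROUTE_LIGHT_TASKS", "0") == "1"

    def route_for(self, tier: str) -> ModelRoute:
        if tier == "light" and self._split_enabled:
            return self._light
        return self._heavy

    def complete(self, prompt: str, tier: str = "heavy") -> str:
        route = self.route_for(tier)
        try:
            call = self._providers[route.provider]
        except KeyError:
            raise LookupError(
                f"no adapter registered for {route.provider!r}"
            ) from None
        return call(route.model, prompt)
```

Because callers only pass a prompt and a tier, switching the model behind a tier is a configuration change, not a code change. How a task is classified as light or heavy is outside this sketch.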

Prompt Iteration Cycle

2 weeks → 3 days

Dify UI enables non-engineer iteration

Project Duration

7 months

Sep 2025 – Mar 2026, end-to-end from requirements to production

Team Composition

5-14 programmers

Scaled by project phase

Team

1 of our engineers + prime contractor / partner team of 5-14
