Multi-Tenant AI Assistant Platform (Dify RAG / LLM Chatbot)
Designed and developed a multi-tenant AI assistant platform for a major enterprise where multiple departments maintain their own knowledge bases. Built an API layer bridging internal RAG and external LLMs.
Challenge
- Multiple departments wanted to use AI assistants, but a shared platform must not commingle their confidential information.
- The existing Dify implementation assumed single-tenant use and lacked sufficient permission and log isolation.
- An abstraction layer was needed to flexibly switch between LLM providers (Claude / OpenAI / Gemini).
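As an illustration of what such an abstraction layer can look like, here is a minimal Python sketch; it is not the project's actual code. The `LLM_PROVIDER` environment variable, the `tier` parameter, the stub adapters, and the model names are all hypothetical. A real adapter would wrap each vendor's SDK behind the same call signature.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping


@dataclass(frozen=True)
class Completion:
    provider: str
    model: str
    text: str


# An adapter takes (model, prompt) and returns a Completion.
Adapter = Callable[[str, str], Completion]


def _stub_adapter(provider: str) -> Adapter:
    """Build a placeholder adapter; a real one would call the vendor SDK."""
    def call(model: str, prompt: str) -> Completion:
        return Completion(provider, model, f"[{provider}/{model}] {prompt}")
    return call


ADAPTERS: Dict[str, Adapter] = {
    name: _stub_adapter(name) for name in ("claude", "openai", "gemini")
}

# Hypothetical model names, one cheap and one capable model per provider.
MODELS: Dict[str, Dict[str, str]] = {
    name: {"light": f"{name}-small", "heavy": f"{name}-large"}
    for name in ADAPTERS
}


def complete(
    prompt: str,
    tier: str = "heavy",
    env: Mapping[str, str] = os.environ,
) -> Completion:
    """Send the prompt to the provider selected by LLM_PROVIDER."""
    provider = env.get("LLM_PROVIDER", "claude")
    if provider not in ADAPTERS:
        raise ValueError(f"unknown provider: {provider}")
    if tier not in MODELS[provider]:
        raise ValueError(f"unknown tier: {tier}")
    return ADAPTERS[provider](MODELS[provider][tier], prompt)
```

Callers never name a vendor: switching providers is a configuration change, and passing `tier="light"` sends a simple task to the cheaper model.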
Solution
- Embedded tenant IDs in JWTs and validated them on every Dify API call, checking the tenant ID in both the URL and the request header.
- Isolated the Pinecone vector space per tenant using namespaces and blocked cross-tenant search at the API layer.
- Implemented an LLM provider abstraction in FastAPI; controlled model switching via environment variables and feature flags.
- Ran separate GKE Deployments per tenant group and applied resource limits at the tenant level.
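The tenant check in the first bullet can be sketched roughly as follows. This is a standard-library-only illustration under stated assumptions, not the production code: the `tenant_id` claim name, the HS256 shared secret, and the function names are hypothetical, and expiry (`exp`) checks are omitted for brevity. A real service would more likely rely on an established JWT library.

```python
import base64
import hashlib
import hmac
import json


class AuthError(Exception):
    """Raised when a token is invalid or the tenant IDs disagree."""


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _signature(header: str, payload: str, secret: bytes) -> bytes:
    return hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper that mints an HS256 token (normally the issuer's job)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    return f"{header}.{payload}.{_b64url_encode(_signature(header, payload, secret))}"


def verified_claims(token: str, secret: bytes) -> dict:
    """Verify the HS256 signature and return the payload claims."""
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        raise AuthError("malformed token") from None
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise AuthError("unexpected algorithm")
    if not hmac.compare_digest(_signature(header, payload, secret), _b64url_decode(sig)):
        raise AuthError("invalid signature")
    return json.loads(_b64url_decode(payload))


def authorize_tenant(token: str, secret: bytes, url_tenant: str, header_tenant: str) -> str:
    """Return the tenant ID only if token, URL, and header all agree."""
    token_tenant = verified_claims(token, secret).get("tenant_id")
    if not token_tenant or not (token_tenant == url_tenant == header_tenant):
        raise AuthError("tenant mismatch")
    return token_tenant
```

Requiring the URL and header values to match the signed claim means a caller cannot reach another tenant's resources by editing the request alone; the signed token is the source of truth.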
Technology Decisions
Why Dify
Dify is open source (and therefore auditable), has an excellent prompt-management UI, and is easy to deploy in-house. With it, UI and prompt-management features that would have taken 1-2 person-months to build from scratch were delivered in 2 weeks.
Why Pinecone for vector DB
Few vector databases supported multi-tenancy at the time, and Pinecone's namespace feature enabled tenant isolation. A future migration to an OSS alternative such as Weaviate or Milvus is also under consideration.
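To make the namespace idea concrete, the sketch below shows one way an API layer can pin every search to a single tenant's namespace. It is an assumption-laden illustration: `InMemoryIndex` is a toy stand-in rather than the Pinecone client, and the `tenant-<id>` naming scheme and class names are hypothetical. With the real service, the same pattern would pass the fixed namespace to the client's query call.

```python
from typing import Dict, List, Tuple

Vector = List[float]


class InMemoryIndex:
    """Toy stand-in for a namespaced vector index (not the Pinecone client)."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Tuple[str, Vector]]] = {}

    def upsert(self, namespace: str, doc_id: str, vector: Vector) -> None:
        self._data.setdefault(namespace, []).append((doc_id, vector))

    def query(self, namespace: str, vector: Vector, top_k: int) -> List[str]:
        # Rank by dot product, looking only inside the given namespace.
        def score(item: Tuple[str, Vector]) -> float:
            return sum(a * b for a, b in zip(item[1], vector))

        ranked = sorted(self._data.get(namespace, []), key=score, reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]


class TenantScopedSearch:
    """Wrapper whose namespace is fixed at construction from the verified tenant."""

    def __init__(self, index: InMemoryIndex, tenant_id: str) -> None:
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._index = index
        self._namespace = f"tenant-{tenant_id}"

    def upsert(self, doc_id: str, vector: Vector) -> None:
        self._index.upsert(self._namespace, doc_id, vector)

    def search(self, vector: Vector, top_k: int = 3) -> List[str]:
        # Callers cannot pass a namespace, so cross-tenant search is impossible here.
        return self._index.query(self._namespace, vector, top_k)
```

Because request handlers only ever receive a `TenantScopedSearch`, there is no code path through which a request can name another tenant's namespace.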
Outcomes
- Tenants Served: 5 → 20+ departments. Scalability proven; zero code changes needed for horizontal expansion.
- Inference Cost: 30% reduction. Lighter tasks routed to lower-cost models via the LLM abstraction.
- Prompt Iteration Cycle: 2 weeks → 3 days. Dify UI enables non-engineer iteration.
- Project Duration: 7 months (Sep 2025 – Mar 2026), end-to-end from requirements to production.
- Team Composition: 5-14 programmers, scaled by project phase.
- Team: 1 of our engineers + prime contractor / partner team of 5-14.