🎧 Self-hosted · open source · runs on Ollama

Resolvr — Self-Hosted, Open-Source AI Customer Support Agent (Runs on Local Ollama)

Resolvr is an open-source, self-hosted AI customer support agent that classifies a support ticket, retrieves knowledge-base articles via RAG, decides resolve-vs-escalate behind a safety gate, and drafts a reply — running on local Ollama (qwen2.5:14b) so customer data never leaves your server, at zero per-token cost.

By Rohit Raj · Founding Engineer · Last updated June 14, 2026
Screenshot: Resolvr running locally (React UI, FastAPI backend), with a ticket auto-resolved and a reply drafted by Ollama (qwen2.5:14b), grounded in the knowledge base.

What is Resolvr?

An agent-first support pipeline you own end-to-end — not an inbox with a chatbot bolted on.

Self-hosted & private

Inference runs on local Ollama. By default no ticket text is sent to any third-party API, so customer data physically stays on your box. That is the cleanest path to GDPR and HIPAA compliance and to air-gapped support.

Open source & hackable

The whole pipeline is on GitHub — FastAPI + React, MIT-style. Read it, fork it, swap the model, point it at your own knowledge base.

Zero per-token cost

No per-resolution or per-token fee. The only cost is the server you already run. An opt-in cloud-API fallback covers the hardest tickets when you want frontier quality.

Safety-gated, not reckless

Honest scope: Resolvr does autonomous triage, RAG answering, and safety-gated drafting + escalation — not end-to-end account changes or refunds. Tool/action execution is on the roadmap.

Stack: FastAPI (Python) + React/Vite · Ollama embeddings + cosine RAG · deterministic classify/decide · pytest eval-gated.

How does Resolvr work? The classify → RAG → safety-gate → draft pipeline

Four tools run in sequence. Classification, retrieval, and the resolve/escalate decision are deterministic; the LLM only writes the final reply.

1 · Classify

classify_ticket

A deterministic classifier tags the ticket with category (billing, technical, account, how-to, refund, security, legal, abuse), intent, sentiment, and priority. No LLM call — so the routing is reproducible and testable.
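
To make the "no LLM call" point concrete, here is a minimal sketch of a rule-based classifier. The keyword table, the `Classification` dataclass, and the priority rule are hypothetical illustrations, not Resolvr's actual rules. The intent tag is omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical keyword table (assumption): the project's real rules are not shown here.
# Dict order doubles as precedence, so sensitive categories are checked first.
CATEGORY_KEYWORDS = {
    "security": ("hacked", "breach", "phishing", "unauthorized"),
    "legal": ("lawsuit", "subpoena", "attorney"),
    "abuse": ("harassment", "threat", "spam"),
    "refund": ("refund", "money back", "chargeback"),
    "billing": ("invoice", "charge", "payment", "billing"),
    "account": ("password", "login", "email address"),
    "technical": ("error", "crash", "bug", "timeout"),
}
NEGATIVE_WORDS = ("angry", "terrible", "unacceptable", "worst")


@dataclass(frozen=True)
class Classification:
    category: str
    sentiment: str
    priority: str


def classify_ticket(text: str) -> Classification:
    """Tag a ticket using keyword rules only, so the same text always gets the same tags."""
    lowered = text.lower()
    # First category whose keywords appear wins; fall back to "how-to".
    category = next(
        (cat for cat, words in CATEGORY_KEYWORDS.items()
         if any(word in lowered for word in words)),
        "how-to",
    )
    sentiment = "negative" if any(w in lowered for w in NEGATIVE_WORDS) else "neutral"
    urgent = category in {"security", "legal", "abuse"} or sentiment == "negative"
    return Classification(category, sentiment, "high" if urgent else "normal")
```

Because the function is pure, each routing rule can be pinned down with an ordinary unit test.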

2 · Retrieve (RAG)

search_kb

Semantic retrieval over your help-center articles: the query is embedded with a local Ollama model and ranked by cosine similarity. Grounding the reply in real KB passages is what stops the agent from inventing policy.
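
The ranking step can be sketched in a few lines of plain Python. This assumes the query and each article have already been embedded into equal-length vectors. The function names and the `(title, vector)` layout are illustrative, not the project's actual data model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_kb(query_vec, kb, top_k=3):
    """Return the top_k (score, title) pairs, best match first.

    kb is a list of (title, vector) pairs; this layout is an assumption.
    """
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in kb]
    return sorted(scored, reverse=True)[:top_k]


# Toy 2-dimensional vectors standing in for real embeddings.
kb = [
    ("Reset password", [1.0, 0.0]),
    ("Export data", [0.0, 1.0]),
    ("Change email", [1.0, 1.0]),
]
results = search_kb([1.0, 0.1], kb, top_k=2)
```

The top score from a call like this is the retrieval confidence that the decide step compares against its threshold.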

3 · Decide

decide_action

A configurable confidence threshold decides resolve-vs-escalate. A hard safety gate forces escalation on security, legal, abuse, and refund tickets; low retrieval confidence escalates with full context attached.
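
A minimal sketch of that decision logic follows. The gated categories come from the description above. The `Decision` type, the reason strings, and the default threshold of 0.75 are assumptions for illustration only.

```python
from dataclasses import dataclass

# Categories that must always reach a human, per the description above.
HARD_GATE = frozenset({"security", "legal", "abuse", "refund"})


@dataclass(frozen=True)
class Decision:
    action: str  # "resolve" or "escalate"
    reason: str


def decide_action(category: str, top_score: float, threshold: float = 0.75) -> Decision:
    """Resolve only when the category is not gated and retrieval is confident.

    The 0.75 default is a placeholder, not Resolvr's shipped value.
    """
    # The hard gate is checked first, so a high score can never override it.
    if category in HARD_GATE:
        return Decision("escalate", f"hard gate: {category}")
    if top_score < threshold:
        return Decision("escalate", "low retrieval confidence")
    return Decision("resolve", "confident KB match")
```

Checking the hard gate before the score is the important ordering: a refund ticket with a near-perfect KB match still escalates.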

4 · Draft

draft_resolution

For resolvable tickets, local Ollama drafts a reply strictly from the retrieved KB. Escalations get a courteous holding reply plus an internal routing note for the human who picks it up.
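
One way to keep the draft grounded is to put only the retrieved passages in the prompt and tell the model to use nothing else. The sketch below builds such a request for Ollama's local `/api/generate` HTTP endpoint. The prompt wording and helper names are assumptions, not the project's actual prompt.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_prompt(ticket: str, passages: list) -> str:
    """Number the KB passages and instruct the model to answer from them only."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "You are a support agent. Answer using ONLY the knowledge-base passages below. "
        "If they do not cover the question, say a human will follow up.\n\n"
        f"Passages:\n{context}\n\nTicket:\n{ticket}\n\nReply:"
    )


def build_request(ticket: str, passages: list, model: str = "qwen2.5:14b") -> dict:
    """JSON body for a single non-streaming generation call."""
    return {"model": model, "prompt": build_prompt(ticket, passages), "stream": False}


def draft_resolution(ticket: str, passages: list) -> str:
    """Send the request to a locally running Ollama server and return the reply text."""
    body = json.dumps(build_request(ticket, passages)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `draft_resolution` requires an Ollama server running on the default port with the model already pulled. `build_request` can be inspected without one.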

Can it run fully self-hosted so customer data never leaves the box?

Yes. Because inference runs on local Ollama, no ticket text is sent to any external model by default — which is a materially stronger privacy story than “privacy-first” SaaS that PII-masks and then sends the rest to a cloud LLM.

  • No third-party processor — there's no external API call to cover with a DPA, which simplifies the GDPR and data-residency story for EU SaaS and regulated teams.
  • True local inference — data physically stays on your infrastructure, so it fits air-gapped and sovereignty requirements.
  • Opt-in fallback — the cloud-API tier is off by default, so the privacy guarantee holds unless you explicitly enable it.

What does it cost vs Intercom Fin or Zendesk AI?

Resolvr has zero per-token and zero per-resolution cost — the only cost is the server you already run. That's the wedge for teams hit by per-resolution SaaS pricing, where bills scale directly with ticket volume.

One honest caveat so this stays credible: below roughly a couple of million tokens a day, a cloud API can be cheaper than running dedicated GPU infrastructure. Resolvr's value is privacy and control plus no marginal cost at volume — and the opt-in cloud fallback covers low-volume or hardest-ticket cases without locking you in.
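
The break-even point behind that caveat is simple arithmetic: daily server cost divided by the cloud price per token. The sketch below shows the calculation. The dollar figures in the example are placeholders chosen for round numbers, not measured prices for any provider or server.

```python
def break_even_tokens_per_day(server_cost_per_day: float,
                              api_price_per_million_tokens: float) -> float:
    """Daily token volume at which a cloud API bill equals the fixed server cost."""
    if api_price_per_million_tokens <= 0:
        raise ValueError("API price must be positive")
    return server_cost_per_day / api_price_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_day: float,
                   server_cost_per_day: float,
                   api_price_per_million_tokens: float) -> str:
    """Compare the flat server cost with the usage-based API cost for one day."""
    api_cost = tokens_per_day / 1_000_000 * api_price_per_million_tokens
    return "cloud API" if api_cost < server_cost_per_day else "self-hosted"


# Placeholder inputs (assumptions): a $20/day server and $10 per million tokens.
threshold = break_even_tokens_per_day(20.0, 10.0)
```

With those placeholder inputs the threshold comes out to 2,000,000 tokens a day. Plug in your own hardware cost and API pricing to see where your volume falls.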

Resolvr vs Intercom Fin, Zendesk AI, Chatwoot & Zammad

As of 2026, based on public documentation. The dimensions that matter for a self-hosted, privacy-first deployment.

Product | Self-host | Local LLM | Data leaves box | Per-resolution fee | Built-in RAG | Resolve/escalate gate | License
Resolvr | Yes | Yes (Ollama) | No (local by default) | None | Built-in | Yes | Open source
Intercom Fin | No (SaaS) | No | Yes (cloud) | ~$0.99/resolution | Yes | Handoff only | Proprietary
Zendesk AI | No (SaaS) | No | Yes (cloud) | Per-resolution add-on | Yes | Handoff only | Proprietary
Chatwoot + Captain | Yes | Varies | Depends on model | None (self-host) | Add-on | Helpdesk handoff | Open source
Zammad | Yes | Varies | Depends on model | None (self-host) | Add-on | Helpdesk handoff | Open source

Chatwoot and Zammad are excellent self-hostable helpdesks with AI add-ons; Resolvr is the agent itself — the classify → RAG → resolve/escalate pipeline — rather than the inbox around it.

When should an AI support agent resolve vs escalate?

A safety gate is a rule layer that overrides the model: certain ticket categories must always reach a human, and anything the agent isn't confident about is escalated rather than guessed.

  • Hard gate — security, legal, abuse, and refund tickets always escalate, regardless of how confident retrieval is.
  • Confidence threshold — when the best KB match is weak, the ticket escalates with full context attached instead of getting a shaky answer.
  • Eval-gated — a pytest suite enforces 100% must-escalate recall on security/legal/abuse and ≥90% action accuracy before anything ships.
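
The two release criteria above can be expressed as a small pytest-style check. This is a sketch under assumptions: the labeled cases, the tuple layout, and the function names are invented for illustration and are not taken from the project's eval suite.

```python
# Categories whose tickets must always be escalated for the recall check.
MUST_ESCALATE = frozenset({"security", "legal", "abuse"})

# Hypothetical labeled cases: (category, expected_action, predicted_action).
CASES = [
    ("security", "escalate", "escalate"),
    ("legal", "escalate", "escalate"),
    ("abuse", "escalate", "escalate"),
    ("refund", "escalate", "escalate"),
    ("billing", "resolve", "resolve"),
    ("billing", "resolve", "escalate"),  # one over-cautious miss
    ("account", "resolve", "resolve"),
    ("technical", "resolve", "resolve"),
    ("how-to", "resolve", "resolve"),
    ("how-to", "resolve", "resolve"),
]


def must_escalate_recall(cases) -> float:
    """Share of must-escalate tickets that were actually escalated."""
    relevant = [c for c in cases if c[0] in MUST_ESCALATE]
    if not relevant:
        return 1.0
    caught = sum(1 for _, _, predicted in relevant if predicted == "escalate")
    return caught / len(relevant)


def action_accuracy(cases) -> float:
    """Share of tickets where the predicted action matches the expected one."""
    correct = sum(1 for _, expected, predicted in cases if expected == predicted)
    return correct / len(cases)


def test_release_gate():
    # Any missed must-escalate ticket fails the build, however good accuracy is.
    assert must_escalate_recall(CASES) == 1.0
    assert action_accuracy(CASES) >= 0.90
```

Keeping recall and accuracy as separate assertions means an over-cautious escalation only costs accuracy, while a missed security ticket fails the gate outright.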

That combination — RAG grounding plus a confidence-thresholded gate — is the direct answer to the “but won't it hallucinate?” objection: low-confidence cases go to a human instead of being sent.

Build it yourself: a self-hosted AI support agent on Ollama (FastAPI + React)

The whole thing is open source. Four steps from zero to a running agent.

  1. Pull local models — ollama pull qwen2.5:14b and ollama pull nomic-embed-text.
  2. Clone & install — clone the repo and run make setup to create the Python venv and install the frontend.
  3. Index your KB — drop your help-center articles in; Resolvr embeds them on first boot for cosine-similarity retrieval.
  4. Run it — make dev starts the FastAPI backend and React frontend; submit a ticket and watch classify → retrieve → decide → draft.
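
Step 3 can be pictured as a loop that embeds each article once and stores the vector next to its text. The sketch below assumes the articles are Markdown files in a single directory and uses Ollama's local `/api/embed` endpoint. The directory layout, function names, and index shape are assumptions, not the repo's actual code.

```python
import json
import urllib.request
from pathlib import Path

EMBED_URL = "http://localhost:11434/api/embed"  # Ollama's default local address


def ollama_embed(text: str, model: str = "nomic-embed-text") -> list:
    """Embed one string with a locally running Ollama server."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    req = urllib.request.Request(
        EMBED_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["embeddings"][0]


def index_kb(kb_dir, embed=ollama_embed) -> list:
    """Embed every *.md file in kb_dir (hypothetical layout) into an in-memory index."""
    index = []
    for path in sorted(Path(kb_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        index.append({"title": path.stem, "text": text, "vector": embed(text)})
    return index
```

Passing `embed` as a parameter keeps the indexing loop testable without a running Ollama server: any function that maps text to a vector can stand in for it.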

Full instructions, the eval suite, and a Docker Compose setup are in the GitHub README. Stars welcome — they help the project rank.

When Resolvr is not the right fit

  • You don't want to own a GPU/server — a fully managed SaaS like Fin or Zendesk AI will be less operational work.
  • You need frontier-model quality on the hardest, most ambiguous tickets — local models trail GPT-class models there (use the cloud fallback, or a managed tool).
  • You need turnkey Zendesk/Intercom inbox integration today — Resolvr ships as the agent + a demo UI, not a full helpdesk.

Who it is for: privacy- and data-residency-driven teams, regulated EU SaaS, teams frustrated by per-resolution pricing, and developers who want a self-hostable, hackable agent they fully control.

Frequently Asked Questions

What is Resolvr?

Resolvr is an open-source, self-hosted AI customer support agent that classifies a support ticket, retrieves knowledge-base articles via RAG, decides resolve-vs-escalate behind a safety gate, and drafts a reply. It runs on local Ollama models, so customer data never leaves your server, at zero per-token cost.

Can I run an AI support agent fully self-hosted so customer data never leaves my server?

Yes. Resolvr runs inference on local Ollama by default, so no ticket text is sent to any third-party API. This makes it suitable for GDPR, HIPAA, and air-gapped deployments. A cloud-API fallback exists but is opt-in and off by default.

Is there an open-source alternative to Intercom Fin or Zendesk AI that I can self-host?

Resolvr is an open-source, self-hostable alternative to Intercom Fin and Zendesk AI. Unlike those SaaS tools it runs on your own server with no per-resolution or per-token fee, and unlike helpdesks such as Chatwoot or Zammad it is an agent-first classify → RAG → resolve/escalate pipeline rather than an inbox with bolt-on AI.

How much does Resolvr cost per ticket?

Resolvr has zero per-token and zero per-resolution cost — the only cost is the server you already run. This contrasts with per-resolution SaaS pricing such as Intercom Fin's roughly $0.99 per resolution. Note that below a couple of million tokens a day, a cloud API can be cheaper than dedicated GPU infrastructure, which is exactly what the opt-in cloud fallback is for.

How does Resolvr decide whether to resolve a ticket or escalate to a human?

Resolvr makes a deterministic resolve-vs-escalate decision using a configurable confidence threshold and a hard safety gate. The hard gate always escalates security, legal, abuse, and refund tickets, and low retrieval confidence escalates with full context. Everything else is auto-resolved with a reply grounded in retrieved KB articles.

Do AI support agents hallucinate, and how does Resolvr prevent it?

Resolvr reduces hallucination two ways: the reply is grounded strictly in knowledge-base articles retrieved via RAG, and a confidence-thresholded safety gate hands low-confidence cases to a human instead of sending them. Its pytest eval suite enforces 100% must-escalate recall on security, legal, and abuse tickets.

Which open-source LLM does Resolvr run on Ollama?

Resolvr defaults to qwen2.5:14b on local Ollama and works with other Ollama models such as Llama 3.1 and Mistral. Smaller 7B–8B models run on modest GPUs; the hardest tickets can use the opt-in cloud-API fallback when needed.

Is self-hosted AI customer support GDPR compliant?

Self-hosting is the cleanest GDPR story because no customer data is sent to a third-party processor — there is no external API call to cover with a DPA. Because Resolvr runs true local inference (not PII-masking before a cloud call), ticket data physically stays on your infrastructure, supporting GDPR, data-residency, and air-gapped requirements.

What stack is Resolvr built on, and how do I deploy it?

Resolvr is built on FastAPI (Python) and React/Vite, with Ollama-based embeddings and cosine-similarity RAG, deterministic classification, and a pytest eval-gated safety layer. It is open source on GitHub and can be deployed end-to-end on your own server; the build guide includes copy-pasteable Ollama and FastAPI steps.
