Open WebUI
CTXone ships two Open WebUI plugins out of the box:
- Tool — function-calling tools the model can invoke explicitly
(
remember,recall,forget,list_pinned). Same capability as the MCP server, but with per-user Valves and automatic agent IDs. - Filter — an in-process hook that runs around every chat turn.
inletcallsrecallon the user’s message and injects the result as a system prompt;outletoptionally captures the assistant’s reply as a fact. This works with models that don’t support tool-calling at all.
Both live in the same file —
bindings/python/src/ctxone/integrations/openwebui.py — and share a
Hub client. You can use either or both.
Which one do I want?
Section titled “Which one do I want?”| Goal | Use |
|---|---|
| Let the model decide when to save/load facts | Tool |
| Automatically inject relevant memory on every turn | Filter |
| Make memory work with non-tool-calling models (Llama 2, old Mistral) | Filter |
| Capture the assistant’s answers as future facts | Filter (opt-in) |
| Give users per-account private memory branches | Either (both support UserValves.branch) |
Most users end up installing both. The Filter runs on every turn so the model sees relevant memory automatically; the Tool is there for when the model knows it wants to save something important, like “remember that we picked BSL-1.1”.
Install
Section titled “Install”You need a running CTXone Hub somewhere Open WebUI can reach. The
Quickstart covers that — easiest is
docker compose up from the repo root, which exposes the Hub on
http://localhost:3001.
Option 1: Paste into Open WebUI (fastest)
Section titled “Option 1: Paste into Open WebUI (fastest)”- In Open WebUI: Admin Panel → Functions → +
- Paste the entire contents of
bindings/python/src/ctxone/integrations/openwebui.py - Save. Open WebUI reads the
requirements: ctxone>=0.73.0line in the docstring frontmatter and auto-installs the client into its Python environment (make sureENABLE_PIP_INSTALL_FRONTMATTER_REQUIREMENTS=trueis set in your Open WebUI env — it usually is for self-hosted installs). - The file defines both
ToolsandFilterclasses. Open WebUI registers both; enable the ones you want from the Functions list.
Option 2: Install as a Python package
Section titled “Option 2: Install as a Python package”If you run Open WebUI from source or in a venv you control:
pip install "ctxone[openwebui]"Then in your own plugin file:
from ctxone.integrations.openwebui import Tools, Filterand save it as an Open WebUI function. This is cleaner if you want to pin the ctxone version yourself or ship it in an image.
Configuration (Valves)
Section titled “Configuration (Valves)”Both plugins expose two tiers of Pydantic config Open WebUI renders as forms:
Admin Valves (one per install)
Section titled “Admin Valves (one per install)”| Field | Default | What it does |
|---|---|---|
hub_url | http://localhost:3001 | Where to reach the Hub |
default_branch | main | Branch for reads and writes when users haven’t picked their own |
timeout_seconds | 15 (Tool) / 8 (Filter) | HTTP timeout. The Filter uses a shorter timeout because it’s in the hot path of every turn |
allow_writes | true (Tool only) | Set false for read-only demos |
recall_budget | 1500 | Default token budget passed to recall() |
min_query_length | 3 (Filter only) | Skip recall when the user message is shorter than this |
silent_on_error | true (Filter only) | If true, Hub errors are swallowed and the chat continues; if false, they fail the turn |
priority | -10 (Filter only) | Open WebUI filter priority. Lower runs earlier — memory injection should happen before most other filters |
UserValves (per-user overrides)
Section titled “UserValves (per-user overrides)”Each user in Open WebUI gets their own copy of these:
| Field | Default | What it does |
|---|---|---|
enabled | true (Filter only) | Master switch — turn off if you don’t want memory injected into your chats |
branch | "" (empty) | Your private branch. Empty means “use the admin default” |
remember_importance | medium (Tool only) | Importance level this user writes facts with |
capture_replies | false (Filter only) | If on, store the assistant’s reply as a fact on every turn |
capture_importance | low (Filter only) | Importance applied to auto-captured replies |
How attribution works
Section titled “How attribution works”Every commit CTXone writes needs an agent ID — that’s the “who”
shown in ctx blame output. The integration resolves this from
Open WebUI’s __user__ dict, in this order:
user.email(preferred — typically the login email)user.name(display name)user.id(UUID)"openwebui"(fallback if none of the above)
So a remember call from [email protected] shows up in
ctx blame /memory/facts/abc as [email protected] with the exact
timestamp, intent, and reasoning the Hub recorded. Your team gets
real provenance for free.
How the Filter feels in practice
Section titled “How the Filter feels in practice”The user types a question. Before it hits the model, the Filter runs:
-
Extract the last user message (handles multimodal content by joining the
textparts and ignoring images). -
Skip if shorter than
min_query_lengthor if the user has the filter disabled. -
Call
hub.recall(topic=last_message, budget=recall_budget)against the user’s private branch. -
If there are matches, render them as a system prompt block:
## Relevant memory from CTXoneRetrieved for topic: 'licensing'- [pinned] **Vision** — ship a BSL-1.1 product with MIT clients- [fact] CTXone uses BSL-1.1- [fact] Converts to Apache 2 after 4 years_(CTXone: this retrieval is 14.2× smaller than loading the fullmemory graph.)_ -
Prepend this as a new system message. It never overwrites whatever system prompt the model already has.
-
Return the mutated body. The model now sees the memory before generating.
If capture_replies is on, the Filter also watches the outlet. When
the full assistant response arrives, it calls hub.remember with the
reply as a low-importance fact under /memory/openwebui/<user>/.
This makes every conversation self-teaching — but it can also fill
your memory with junk, which is why it’s off by default.
Troubleshooting
Section titled “Troubleshooting””CTXone Hub is unreachable”
Section titled “”CTXone Hub is unreachable””The most common cause is Open WebUI running in Docker with the Hub
running on the host. localhost:3001 inside a container refers to
the container itself. Change hub_url to either:
http://host.docker.internal:3001(macOS, Windows, recent Linux)http://172.17.0.1:3001(Linux default Docker bridge)- Or run both in the same
docker composenetwork — seedocker-compose.ymlfor a working setup.
”The Filter takes too long, chat feels slow”
Section titled “”The Filter takes too long, chat feels slow””The Filter calls the Hub synchronously in inlet. On a fast local
Hub this is 5–20 ms, which is invisible. If it’s slow:
- Drop
timeout_secondsto 3 or 4. The Filter swallows timeouts whensilent_on_error=true, so a slow Hub just means no memory injection on that turn. - Lower
recall_budget— smaller budgets return faster because the Hub prunes earlier. - Check
RUST_LOG=infoon the Hub and look at therecalllines. Each one prints tokens sent and the savings ratio.
”I don’t want the Filter on every chat”
Section titled “”I don’t want the Filter on every chat””self.toggle = Truein the Filter’s__init__makes Open WebUI render a per-chat switch in the UI. Click it off and the Filter skips that conversation.- Or set
UserValves.enabled = falseto turn it off permanently for your account.
”How do I see what was auto-captured?”
Section titled “”How do I see what was auto-captured?””ctx ls /memory/openwebui/ctx --branch <your-branch> recall "<topic>"Or browse through CTXone Lens at http://localhost:5173/browse — the
blame panel shows which user wrote each fact via Open WebUI and when.
Related docs
Section titled “Related docs”- QUICKSTART.md — spinning up a Hub
- INTEGRATIONS.md — MCP wiring for Claude Code, Cursor, etc.
- VISION.md — why CTXone exists at all