Running a Private Personal AI with Clawdbot + DMR

This article presents a technique for building a private AI assistant with Clawdbot and Docker Model Runner (DMR). Clawdbot serves as the agent interface, integrating with messaging platforms such as Telegram and Signal, while DMR acts as the local inference engine, running LLMs as OCI artifacts. The setup addresses growing concerns around data privacy, data residency, and the high token costs of cloud AI. The author provides detailed steps for configuring Clawdbot to route requests to a local DMR server, managing model context windows, and deploying specific models such as gpt-oss or glm-4.7-flash. The workflow supports complex tasks such as automated email summarization on local hardware, ensuring sensitive data never leaves the user's infrastructure.




Personal AI assistants are transforming how we manage our daily lives—from handling emails and calendars to automating smart homes. However, as these assistants gain more access to our private data, concerns about privacy, data residency, and long-term costs are at an all-time high.

By combining Clawdbot with Docker Model Runner (DMR), you can build a high-performance, agentic personal assistant while keeping full control over your data, infrastructure, and spending.

This post walks through how to configure Clawdbot to utilize Docker Model Runner, enabling a privacy-first approach to personal intelligence.

[Figure 1: Clawdbot]

What Are Clawdbot and Docker Model Runner?

Clawdbot is a self-hosted AI assistant designed to live where you already are. Unlike browser-bound bots, Clawdbot integrates directly with messaging apps like Telegram, WhatsApp, Discord, and Signal. It acts as a proactive digital coworker capable of executing real-world actions across your devices and services.

Docker Model Runner (DMR) is Docker’s native solution for running and managing large language models (LLMs) as OCI artifacts. It exposes an OpenAI-compatible API, allowing it to serve as the private “brain” for any tool that supports standard AI endpoints.

Together, they create a unified assistant that can browse the web, manage your files, and respond to your messages without ever sending your sensitive data to a third-party cloud.
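Because DMR speaks the OpenAI chat-completions dialect, any standard client can talk to it. Below is a minimal sketch using only the Python standard library; the base URL matches DMR's default address used later in this post, and the model name is one of the examples from the configuration section:

```python
# Minimal client for DMR's OpenAI-compatible endpoint, stdlib only.
# Assumes DMR is listening at its default address (see the configuration
# section below); no API key is required for a local runner.
import json
import urllib.request

DMR_BASE_URL = "http://localhost:12434/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST the payload to the local DMR server and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{DMR_BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Swapping `DMR_BASE_URL` for a cloud endpoint is all it would take to leave your machine, which is exactly what this stack avoids.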

Benefits of the Clawdbot + DMR Stack

Privacy by Design

In a “Privacy-First” setup, your assistant’s memory, message history, and files stay on your hardware. Docker Model Runner isolates model inference, meaning:

  • No third-party training: Your personal emails and schedules aren’t used to train future commercial models.
  • Sandboxed execution: Models run in isolated environments, protecting your host system.
  • Data Sovereignty: You decide exactly which “Skills” (web browsing, file access) the assistant can use.

Cost Control and Scaling

Cloud-based agents often become expensive when they use “long-term memory” or “proactive searching,” which consume massive amounts of tokens. With Docker Model Runner, inference runs on your own GPU/CPU. Once a model is pulled, there are no per-token fees. You can let Clawdbot summarize thousands of unread emails or research complex topics for hours without worrying about a surprise API bill at the end of the month.

Configuring Clawdbot with Docker Model Runner

Modifying the Clawdbot Configuration

Clawdbot uses a flexible configuration system to define which models and providers drive its reasoning. While the onboarding wizard (clawdbot onboard) is the standard setup path, you can manually point Clawdbot to your private Docker infrastructure.

You can define your provider configuration in:

  • Global configuration: ~/.config/clawdbot/config.json
  • Workspace-specific configuration: clawdbot.json in your active workspace root.

Using Clawdbot with Docker Model Runner

To bridge the two, update your configuration to point to the DMR server, assuming Docker Model Runner is running at its default address, http://localhost:12434/v1.

Your config.json should be updated as follows:

{
  "models": {
    "providers": {
      "dmr": {
        "baseUrl": "http://localhost:12434/v1",
        "apiKey": "dmr-local",
        "api": "openai-completions",
        "models": [
          {
            "id": "gpt-oss:128K",
            "name": "gpt-oss (128K context window)",
            "contextWindow": 128000,
            "maxTokens": 128000
          },
          {
            "id": "glm-4.7-flash:128K",
            "name": "glm-4.7-flash (128K context window)",
            "contextWindow": 128000,
            "maxTokens": 128000
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "dmr/gpt-oss:128K"
      }
    }
  }
}

This configuration tells Clawdbot to bypass external APIs and route all “thinking” to your private models.
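A typo in this file (for example, a default model that doesn't match any declared model id) can silently fall back or fail at startup. A small sanity check, sketched here with the standard library, parses the config and confirms the default model resolves to a model declared under its provider:

```python
# Sanity-check a Clawdbot config before restarting the assistant:
# the "primary" model reference (e.g. "dmr/gpt-oss:128K") must name a
# provider and a model id that are both declared under "models.providers".
import json

def check_config(raw: str) -> str:
    """Return the default model reference, or raise if it doesn't resolve."""
    cfg = json.loads(raw)
    primary = cfg["agents"]["defaults"]["model"]["primary"]
    provider, model_id = primary.split("/", 1)
    declared = {m["id"] for m in cfg["models"]["providers"][provider]["models"]}
    if model_id not in declared:
        raise ValueError(
            f"default model {model_id!r} is not declared under provider {provider!r}"
        )
    return primary
```

Run it against `~/.config/clawdbot/config.json` (or your workspace `clawdbot.json`) after any hand edit.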

Note for Docker Desktop Users:

Ensure TCP access is enabled so Clawdbot can communicate with the runner. Run the following command in your terminal:

docker desktop enable model-runner --tcp
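Before wiring up Clawdbot, you can confirm the TCP endpoint actually answers. This stdlib-only sketch probes the /models route of DMR's OpenAI-compatible API at the default address assumed throughout this post:

```python
# Probe the local DMR server's OpenAI-compatible /models route.
# Returns False on connection errors or timeouts instead of raising,
# so it is safe to call in a startup script.
import urllib.error
import urllib.request

def is_dmr_up(base_url: str = "http://localhost:12434/v1",
              timeout: float = 2.0) -> bool:
    """Return True if the runner answers with HTTP 200 on /models."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns False after enabling TCP access, check that Docker Desktop has restarted the runner and that nothing else is bound to port 12434.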

While coding models focus on logic, personal assistant models need a balance of instruction-following, tool-use capability, and long-term memory.

| Model         | Best For                             | DMR Pull Command                  |
| ------------- | ------------------------------------ | --------------------------------- |
| gpt-oss       | Complex reasoning & scheduling       | docker model pull gpt-oss         |
| glm-4.7-flash | Fast coding assistance and debugging | docker model pull glm-4.7-flash   |
| qwen3-coder   | Agentic coding workflows             | docker model pull qwen3-coder     |

Pulling models from the ecosystem

DMR can pull models directly from Hugging Face and convert them into OCI artifacts automatically:

docker model pull huggingface.co/bartowski/Llama-3.3-70B-Instruct-GGUF

Context Length and “Soul”

For a personal assistant, context length is critical. Clawdbot relies on a SOUL.md file (which defines its personality) and a Memory Vault (which stores your preferences).

If a model’s default context is too small, it will “forget” your instructions mid-conversation. You can use DMR to repackage a model with a larger context window:

docker model package --from llama3.3 --context-size 128000 llama-personal:128k

Once packaged, reference llama-personal:128k in your Clawdbot config to ensure your assistant always remembers the full history of your requests.

Putting Clawdbot to Work: Running Scheduled Tasks 

With Clawdbot and DMR running, you can move beyond simple chat. Let’s set up a “Morning Briefing” task.

  1. Verify the Model: docker model ls (Ensure your model is active).
  2. Initialize the Soul: Run clawdbot init-soul to define how the assistant should talk to you.
  3. Assign a Task:


    “Clawdbot, every morning at 8:00 AM, check my unread emails, summarize the top 3 priorities, and message me the summary on Telegram.”

Because Clawdbot is connected to your private Docker Model Runner, it can parse those emails and reason about your schedule privately. No data leaves your machine; you simply receive a helpful notification on your phone via your chosen messaging app.

How You Can Get Involved

The Clawdbot and Docker Model Runner ecosystems are growing rapidly. Try the setup above, experiment with different models, and share what you build.
