e2hang/EzVibeR

Claude Agent f981fcf6c1 fix: Enlarge, Emotion

2026-06-12 17:24:21 +08:00

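EzVibeR+ Chat Capability Integration Plan

Status: pending execution
Scope: the EzVibeR+/ directory only
Goal: add a persistent chat panel to the desktop pet (Live2D window) and wire the backend to embedded local llama.cpp inference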

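0. Decisions confirmed by the user

Decision	Value	Notes
Chat panel	New persistent panel on the right, 340px wide	Inside the same live2d window
LLM backend	Embedded llama.cpp (Rust crate `llama-cpp-rs`)	Ollama is not used
Model placement	Placed manually by the user at `$HOME/.live2D/model/llm/`	Under the application directory
Model file name	`Qwen2.5-0.5B-Instruct-Q4_K_M.gguf`	Downloaded by the user
Streaming vs. whole reply	Whole reply	0.5B inference takes < 3s, so streaming is unnecessary
Model not found	The first message in the frontend chat panel shows a hint with the path and a copy button	No hard error
Load timing	Lazy: load when the first message is sent	Startup is 5-10s faster
Inference backend	CPU only (CUDA/Metal features disabled)	Optional feature flag later
Build toolchain	Requires cmake + gcc/clang (first build), +20MB	User has agreed
Context window	2048 tokens	`brain.rs::max_history=20` as a fallback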
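
The `max_history=20` fallback in the last row can be pictured as a simple cap on how many turns reach the prompt. The sketch below is a minimal illustration only: the `ChatMessage` shape and the `trim_history` helper are assumptions made for this example, and the real trimming in `brain.rs` may differ.

```rust
/// Simplified stand-in for the project's chat message type (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Keeps a leading system message (if present) plus the most recent
/// `max_history` remaining messages; older turns are dropped.
/// Hypothetical helper, not taken from brain.rs.
pub fn trim_history(messages: &[ChatMessage], max_history: usize) -> Vec<ChatMessage> {
    let (head, rest) = match messages.first() {
        Some(first) if first.role == "system" => (&messages[..1], &messages[1..]),
        _ => (&messages[..0], messages),
    };
    let start = rest.len().saturating_sub(max_history);
    let mut kept = head.to_vec();
    kept.extend_from_slice(&rest[start..]);
    kept
}
```

Capping by message count does not guarantee the prompt fits in 2048 tokens, which is presumably why the plan calls it a fallback rather than the primary limit.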

1. Architecture overview

1.1 Data flow

User types a message
    │
    ▼
[Frontend ChatPanel.vue]
    │ invoke('chat', { message })
    ▼
[Tauri commands::chat] ──→  brain.think(message, UserMessage)
    │                            │
    │                            ▼
    │                     [AgentBrain]  ← RAG retrieval / history append / Action parsing (unchanged)
    │                            │
    │                            ▼ provider.chat(messages)
    │                     [LlamaCppProvider]  ★ new
    │                            │
    │                            ├─ First call? →  load GGUF (lazy, OnceCell)
    │                            └─ Assemble model prompt → llama.cpp decode
    ▼
[BrainResponse { text, action, emotion_state, memory_id }]
    │
    ▼
[Frontend useChat.ts] append to messages state → render bubble

1.2 Module dependencies

main.rs
  └─> AgentBrain::new(provider, memory, config)
        └─> Arc<dyn LLMProvider>
              └─> LlamaCppProvider::new(model_path)  ★ new

frontend (live2d.html)
  └─> <Live2DShell>
        ├─ <Live2DCanvas>  (the original index.vue canvas + toolbar)
        └─ <ChatPanel>     ★ new
              └─ useChat()  ★ new
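
The backend half of this dependency tree works because `AgentBrain` only sees an `Arc<dyn LLMProvider>`, so swapping the backend is a one-line change at construction time. The sketch below illustrates that idea with a deliberately simplified, synchronous trait; the real trait is async and takes `&[ChatMessage]`, and `EchoProvider` is a hypothetical backend invented for this example.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for the plan's `LLMProvider` trait.
pub trait LLMProvider: Send + Sync {
    fn chat(&self, prompt: &str) -> Result<String, String>;
}

/// Placeholder backend that always returns an empty reply.
pub struct NoopLLMProvider;

impl LLMProvider for NoopLLMProvider {
    fn chat(&self, _prompt: &str) -> Result<String, String> {
        Ok(String::new())
    }
}

/// Hypothetical backend used only to show the swap.
pub struct EchoProvider;

impl LLMProvider for EchoProvider {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}

/// The brain depends on the trait object, never on a concrete backend.
pub struct AgentBrain {
    provider: Arc<dyn LLMProvider>,
}

impl AgentBrain {
    pub fn new(provider: Arc<dyn LLMProvider>) -> Self {
        Self { provider }
    }

    pub fn think(&self, message: &str) -> Result<String, String> {
        self.provider.chat(message)
    }
}
```

Constructing the brain with `Arc::new(NoopLLMProvider)` or `Arc::new(EchoProvider)` changes its replies without touching `AgentBrain` itself.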

2. File change list

2.1 Backend Rust (src-tauri/)

File	Action	Content
`Cargo.toml`	Modify	Add `llama-cpp-rs = "0.3"`, with CUDA/Metal off by default
`src/utils.rs`	Modify	Add `fn llm_model_dir() -> PathBuf` returning `$HOME/.live2D/model/llm/`
`src/app/config.rs`	Modify	Add field `llm_model_path: String` to `AppConf` (default `""`, meaning the default file under `llm_model_dir` is used)
`src/modules/llama_cpp.rs`	New	`LlamaCppProvider`, implementing the `LLMProvider` trait
`src/modules/mod.rs`	Modify	`pub mod llama_cpp;` and re-export `LlamaCppProvider`
`src/main.rs`	Modify	`NoopLLMProvider` → `LlamaCppProvider::new(default_llm_model_path())`

2.2 Frontend Vue (src/)

File	Action	Content
`src/live2d/index.vue`	Modify	Refactor into a left/right flex layout: canvas + toolbar on the left, `<ChatPanel />` on the right
`src/live2d/components/ChatPanel.vue`	New	Message list + input box + status indicator
`src/live2d/components/MessageBubble.vue`	New	Single message bubble component (distinct styles for user and pet)
`src/live2d/hooks/useChat.ts`	New	Message array state + invoke('chat') + error handling
`src/live2d/hooks/useModelStatus.ts`	New	Checks whether the model is loaded via invoke('get_model_status')

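2.3 Files left essentially unchanged

src-tauri/src/modules/brain.rs: no logic changes (it is already a clean trait abstraction); only the `provider()` accessor described in 4.4 is added
src-tauri/src/app/commands.rs: the existing chat command is unchanged; only `get_model_status` from 4.4 is added
src-tauri/src/main.rs: no changes beyond swapping the provider type (see 3.3)
src/App.vue, src/main.ts: untouched (live2d is a separate entry point)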

3. Backend implementation details

3.1 `src/modules/llama_cpp.rs`

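// The llama-cpp-rs items used below (LlamaModel, LlamaParams, GenerateParams,
// GgmlDType) are written as planned and still need to be checked against the
// crate's actual API before implementation.
use std::{future::Future, path::PathBuf, pin::Pin};
use once_cell::sync::OnceCell; // thread-safe cell that provides get_or_try_init

// Core structure
pub struct LlamaCppProvider {
    model_path: PathBuf,           // configured or default path
    model: OnceCell<LlamaModel>,   // lazy-load container
    n_ctx: u32,                    // 2048
    n_threads: u16,                // number of physical cores
}

impl LLMProvider for LlamaCppProvider {
    fn chat<'a>(&'a self, messages: &'a [ChatMessage])
        -> Pin<Box<dyn Future<Output = Result<String, BrainError>> + Send + 'a>>
    {
        Box::pin(async move {
            // 1. Lazy load: the model is loaded on the first call only
            let model = self.model.get_or_try_init(|| {
                LlamaModel::load_from_file(&self.model_path, GgmlDType::F16,
                    &LlamaParams::default().with_n_ctx(self.n_ctx))
                    .map_err(|e| BrainError::LLMError(format!("failed to load model: {}", e)))
            })?;

            // 2. Build the prompt (Qwen2.5 ChatML format)
            let prompt = build_qwen_chatml_prompt(messages);

            // 3. Inference (synchronous and blocking, a single generate call).
            //    This blocks the async executor thread for the whole generation;
            //    move it to a blocking task if that becomes a problem.
            let response = model.generate(prompt, GenerateParams::default()
                .with_max_tokens(512)
                .with_temperature(0.7)
                .with_stop(vec!["<|endoftext|>".into(), "<|im_end|>".into()]))
                .map_err(|e| BrainError::LLMError(format!("inference failed: {}", e)))?;

            // 4. Strip leftover ChatML markers
            Ok(strip_chatml_artifacts(&response))
        })
    }
}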
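
The lazy-load step can be tried out in isolation with the standard library alone. The sketch below is one possible shape, assuming a `Mutex<Option<Arc<T>>>` slot in place of `OnceCell`; `LazyModel` and its methods are hypothetical names, and the generic `T` stands in for the real model type. Unlike a plain once-cell, a failed load leaves the slot empty, so a user who adds the model file later can retry without restarting.

```rust
use std::sync::{Arc, Mutex};

/// Lazily builds a value on first use and caches it.
/// A failed attempt leaves the slot empty so the next call retries.
pub struct LazyModel<T> {
    slot: Mutex<Option<Arc<T>>>,
}

impl<T> LazyModel<T> {
    pub fn new() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Returns the cached value, or runs `load` once and caches its result.
    /// Panics if the mutex was poisoned by an earlier panic.
    pub fn get_or_try_load<E>(
        &self,
        load: impl FnOnce() -> Result<T, E>,
    ) -> Result<Arc<T>, E> {
        let mut slot = self.slot.lock().unwrap();
        if let Some(model) = slot.as_ref() {
            return Ok(Arc::clone(model));
        }
        let model = Arc::new(load()?);
        *slot = Some(Arc::clone(&model));
        Ok(model)
    }

    /// True once a load has succeeded.
    pub fn is_loaded(&self) -> bool {
        self.slot.lock().unwrap().is_some()
    }
}
```

Holding the lock during `load` serializes concurrent first calls, which is the desired behavior here: two simultaneous messages should not load the GGUF twice.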

Qwen2.5 ChatML prompt template:

<|im_start|>system
You are a desktop pet assistant named EzVibe. ...<|im_end|>
<|im_start|>user
[User says] {user_input}

[Relevant memories]
{...}<|im_end|>
<|im_start|>assistant
<|im_start|>assistant
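
The two helpers named in the provider code, `build_qwen_chatml_prompt` and `strip_chatml_artifacts`, are not spelled out in this plan. The following is a minimal sketch of what they might look like for the template above; the `Role` and `ChatMessage` types are simplified assumptions, and the project's own types may differ.

```rust
/// Assumed message roles; the project's own enum may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// Simplified stand-in for the project's chat message type.
pub struct ChatMessage {
    pub role: Role,
    pub content: String,
}

/// Wraps every message in <|im_start|>role ... <|im_end|> and leaves an
/// open assistant turn at the end for the model to complete.
pub fn build_qwen_chatml_prompt(messages: &[ChatMessage]) -> String {
    let mut prompt = String::new();
    for m in messages {
        let role = match m.role {
            Role::System => "system",
            Role::User => "user",
            Role::Assistant => "assistant",
        };
        prompt.push_str("<|im_start|>");
        prompt.push_str(role);
        prompt.push('\n');
        prompt.push_str(&m.content);
        prompt.push_str("<|im_end|>\n");
    }
    prompt.push_str("<|im_start|>assistant\n");
    prompt
}

/// Cuts the reply at the first <|im_end|> marker and trims whitespace.
pub fn strip_chatml_artifacts(raw: &str) -> String {
    let end = raw.find("<|im_end|>").unwrap_or(raw.len());
    raw[..end].trim().to_string()
}
```

Cutting at the first `<|im_end|>` also discards any extra turn the model may hallucinate after its own reply.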

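Error handling:

Model file missing → BrainError::LLMError("GGUF model not found.\n\nPlace the model file in:\n{path}\n\nThe file name should be: Qwen2.5-0.5B-Instruct-Q4_K_M.gguf")
Load failure (corrupt file / version mismatch) → BrainError::LLMError(...) carrying the specific error
Inference timeout → BrainError::LLMError("inference timed out (>60s)")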
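
The provider code in 3.1 does not yet show the missing-file check or the timeout. One way to sketch both with the standard library is given below; `ensure_model_exists` and `run_with_timeout` are hypothetical helper names, and `BrainError` is reduced to the single variant used in this plan.

```rust
use std::path::Path;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Reduced stand-in for the project's error type.
#[derive(Debug, PartialEq)]
pub enum BrainError {
    LLMError(String),
}

/// Returns a user-facing error with the expected location if the file is missing.
pub fn ensure_model_exists(path: &Path) -> Result<(), BrainError> {
    if path.is_file() {
        return Ok(());
    }
    let dir = path.parent().map(|p| p.display().to_string()).unwrap_or_default();
    let name = path
        .file_name()
        .map(|n| n.to_string_lossy().into_owned())
        .unwrap_or_default();
    Err(BrainError::LLMError(format!(
        "GGUF model not found.\n\nPlace the model file in:\n{}\n\nThe file name should be: {}",
        dir, name
    )))
}

/// Runs a blocking job on a worker thread and gives up waiting after `timeout`.
/// The worker is not cancelled on timeout; it finishes in the background.
pub fn run_with_timeout<F>(timeout: Duration, job: F) -> Result<String, BrainError>
where
    F: FnOnce() -> Result<String, BrainError> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(job());
    });
    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(mpsc::RecvTimeoutError::Timeout) => Err(BrainError::LLMError(format!(
            "inference timed out (>{}s)",
            timeout.as_secs()
        ))),
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(BrainError::LLMError(
            "inference thread exited without a result".to_string(),
        )),
    }
}
```

Because the worker keeps running after a timeout, a timed-out generation still occupies CPU until it completes; true cancellation would need support from the inference library.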

3.2 Path resolution

// Added to utils.rs
pub fn llm_model_dir() -> PathBuf {
    app_root().join("model").join("llm")
}

pub fn default_llm_model_path() -> PathBuf {
    llm_model_dir().join("Qwen2.5-0.5B-Instruct-Q4_K_M.gguf")
}

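If the directory does not exist at startup, it is created with fs::create_dir_all (the model is placed manually, but the directory is created automatically).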

3.3 main.rs changes

// Before
let llm_provider: Arc<dyn LLMProvider> = Arc::new(NoopLLMProvider);

// After
use crate::modules::LlamaCppProvider;
use crate::utils::default_llm_model_path;

let model_path = if !app_conf.llm_model_path.is_empty() {
    PathBuf::from(&app_conf.llm_model_path)
} else {
    default_llm_model_path()
};

// Make sure the directory exists
if let Some(parent) = model_path.parent() {
    let _ = std::fs::create_dir_all(parent);
    log::info!("📂 LLM model directory: {}", parent.display());
}
log::info!("🤖 LLM model path: {} (exists: {})", model_path.display(), model_path.exists());

let llm_provider: Arc<dyn LLMProvider> = Arc::new(LlamaCppProvider::new(model_path));
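
The path selection above can be pulled into a small pure function, which makes the "empty string means default" rule easy to unit-test. This is a sketch under assumptions: `resolve_model_path` is a hypothetical helper, and the application root is passed in as a parameter here instead of coming from `app_root()`.

```rust
use std::path::{Path, PathBuf};

/// Default model file name expected by the application.
pub const DEFAULT_MODEL_FILE: &str = "Qwen2.5-0.5B-Instruct-Q4_K_M.gguf";

/// Default location: <app_root>/model/llm/<DEFAULT_MODEL_FILE>.
pub fn default_llm_model_path(app_root: &Path) -> PathBuf {
    app_root.join("model").join("llm").join(DEFAULT_MODEL_FILE)
}

/// An empty configured path falls back to the default location;
/// anything else is used verbatim.
pub fn resolve_model_path(configured: &str, app_root: &Path) -> PathBuf {
    if configured.is_empty() {
        default_llm_model_path(app_root)
    } else {
        PathBuf::from(configured)
    }
}
```

For example, `resolve_model_path("", Path::new("/home/u/.live2D"))` yields the default file under `model/llm/`, while a non-empty setting such as `/opt/m.gguf` is returned unchanged.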

4. Frontend implementation details

4.1 Layout refactor (`live2d/index.vue`)

<template>
  <div class="live2d-view">
    <div class="left-pane">
      <canvas id="live2d" data-tauri-drag-region="true" />
      <div class="waifu-tool">...</div>  <!-- original toolbar -->
    </div>
    <ChatPanel class="right-pane" />
  </div>
</template>

<style>
.live2d-view {
  display: flex;
  flex-direction: row;
  width: 100%;
  height: 100%;
}
.left-pane {
  flex: 1 1 auto;
  position: relative;
  min-width: 0;
}
.right-pane {
  flex: 0 0 340px;
  border-left: 1px solid rgba(0,0,0,0.1);
  background: rgba(255,255,255,0.85);
  backdrop-filter: blur(12px);
}
</style>

Important: put data-tauri-drag-region="true" only on the canvas inside left-pane, never on right-pane. That way window dragging applies only to the Live2D area and is not captured by the chat panel.

4.2 `ChatPanel.vue` skeleton

<template>
  <div class="chat-panel">
    <!-- Top status bar -->
    <header class="chat-header">
      <span class="dot" :class="status" />  <!-- green / gray / red -->
      <span class="status-text">{{ statusText }}</span>
      <button @click="showHelp = !showHelp">?</button>
    </header>

    <!-- Help / first-run hint (when the model is not found) -->
    <div v-if="showHelp" class="help-banner">
      Model not loaded. The GGUF file should be placed at:
      <code>{{ modelPath }}</code>
      <button @click="copyPath">Copy path</button>
    </div>

    <!-- Message list -->
    <div class="messages" ref="messagesRef">
      <MessageBubble v-for="m in messages" :key="m.id" :msg="m" />
      <div v-if="loading" class="typing-indicator">The pet is thinking…</div>
    </div>

    <!-- Input box; .prevent stops Enter from also inserting a newline -->
    <div class="input-bar">
      <textarea
        v-model="input"
        :disabled="loading"
        @keydown.enter.exact.prevent="send"
        placeholder="Say something to the pet…"
        rows="2"
      />
      <button @click="send" :disabled="loading || !input.trim()">Send</button>
    </div>
  </div>
</template>

4.3 `useChat.ts`

import { ref, onMounted, nextTick } from 'vue';
// `invoke` comes from the Tauri JS API (the import path depends on the Tauri major version).
// `Message`, `BrainResponse`, `uuid` and `scrollToBottom` are assumed to be defined elsewhere in the project.

export function useChat() {
  const messages = ref<Message[]>([]);
  const input = ref('');  // bound to the textarea via v-model
  const loading = ref(false);
  const error = ref<string | null>(null);
  const modelReady = ref(false);
  const modelPath = ref('');
  
  // Initialization: check the model status
  onMounted(async () => {
    try {
      const status = await invoke<{ ready: boolean; path: string }>('get_model_status');
      modelReady.value = status.ready;
      modelPath.value = status.path;
    } catch (e) { /* best effort: real errors surface on the first send */ }
  });
  
  async function send() {
    const text = input.value.trim();
    if (!text || loading.value) return;
    
    // 1. Append the user message
    const userMsg: Message = { id: uuid(), role: 'user', text, ts: Date.now() };
    messages.value.push(userMsg);
    input.value = '';
    loading.value = true;
    error.value = null;
    
    // 2. Call the backend
    try {
      const resp = await invoke<BrainResponse>('chat', { message: text });
      const aiMsg: Message = { id: uuid(), role: 'assistant', text: resp.text, ts: Date.now() };
      messages.value.push(aiMsg);
      modelReady.value = true;  // mark ready after the first successful reply
    } catch (e: any) {
      error.value = String(e);
      messages.value.push({
        id: uuid(), role: 'system', text: `⚠️ ${e}`, ts: Date.now()
      });
    } finally {
      loading.value = false;
      nextTick(scrollToBottom);
    }
  }
  
  return { messages, loading, error, modelReady, modelPath, send, input };
}

4.4 New Tauri command

Add one command to commands.rs (everything else stays the same):

#[tauri::command]
pub fn get_model_status(state: State<'_, AppState>) -> Result<ModelStatus, String> {
    let provider = state.brain.provider();
    Ok(ModelStatus {
        ready: provider.is_loaded(),
        path: provider.path_string(),
    })
}

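AgentBrain needs to expose its provider (add pub fn provider(&self) -> &Arc<dyn LLMProvider> to brain.rs). Because the provider is held as Arc<dyn LLMProvider>, is_loaded() and path_string() also have to be declared on the LLMProvider trait itself (for example with default implementations), otherwise the calls above will not compile. The new command must also be registered in the app's Tauri invoke handler.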
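
Since the status command only holds a trait object, the two status methods have to live on the trait. The sketch below shows one way to do that with default implementations, so existing providers keep compiling unchanged. It is illustrative only: the trait is reduced to the two status methods, `FakeLocalProvider` is a hypothetical type, and the choice of `false` and an empty path as defaults is an assumption.

```rust
use std::sync::Arc;

/// Reduced version of the provider trait, showing only the status methods.
pub trait LLMProvider: Send + Sync {
    /// Whether a model is currently held in memory (assumed default: false).
    fn is_loaded(&self) -> bool {
        false
    }

    /// Path of the model file, or an empty string if the backend has none.
    fn path_string(&self) -> String {
        String::new()
    }
}

/// Existing placeholder provider: inherits both defaults.
pub struct NoopLLMProvider;
impl LLMProvider for NoopLLMProvider {}

/// Hypothetical local backend that overrides both methods.
pub struct FakeLocalProvider {
    pub path: String,
    pub loaded: bool,
}

impl LLMProvider for FakeLocalProvider {
    fn is_loaded(&self) -> bool {
        self.loaded
    }
    fn path_string(&self) -> String {
        self.path.clone()
    }
}

/// Payload returned to the frontend.
#[derive(Debug, PartialEq)]
pub struct ModelStatus {
    pub ready: bool,
    pub path: String,
}

/// Body of the status command, without the Tauri state plumbing.
pub fn model_status(provider: &Arc<dyn LLMProvider>) -> ModelStatus {
    ModelStatus {
        ready: provider.is_loaded(),
        path: provider.path_string(),
    }
}
```

In the real command, `ModelStatus` would additionally need to derive `serde::Serialize` so Tauri can return it to the frontend.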

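5. Dependency installation

5.1 Rust build requirements

llama-cpp-rs needs a C++ toolchain at build time:

OS	Tools
Linux	`gcc` / `clang` + `cmake` (`sudo dnf install gcc gcc-c++ cmake`)
macOS	Xcode Command Line Tools (`xcode-select --install`) + CMake
Windows	MSVC (Visual Studio Build Tools) + CMake

The first cargo build compiles llama.cpp from source and takes about 5-10 minutes; later incremental builds take < 30s.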

5.2 No new frontend dependencies

Plain Vue 3 + TypeScript; no new npm packages are introduced.

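6. Manual steps for the user

Once the changes are in place, the user needs to do the following:

Download the model (from any source):
- HuggingFace: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/tree/main
- File name: qwen2.5-0.5b-instruct-q4_k_m.gguf (note that the file name on HuggingFace is all lowercase)
- Size: about 400MB
Place the model in $HOME/.live2D/model/llm/, renaming it to the mixed-case name the app looks for by default (this matters on case-sensitive filesystems such as Linux):
```
mkdir -p ~/.live2D/model/llm
mv ~/Downloads/qwen2.5-0.5b-instruct-q4_k_m.gguf ~/.live2D/model/llm/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf
```
The full path should be: $HOME/.live2D/model/llm/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf

Windows users: %USERPROFILE%\.live2D\model\llm\
macOS users: /Users/<your username>/.live2D/model/llm/
Send a message after the first launch: llama.cpp needs 3-8 seconds to load the GGUF the first time (CPU-side initialization + memory mapping); after that, replies take 1-3 seconds each.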

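7. Verification steps

Run in order:

cd src-tauri && cargo check: confirm that it compiles
npm run build: the frontend build passes
npm run tauri:dev: start the app
The pet window appears with the chat panel on the right
Type "hello" and press Enter
The backend log shows: model load (about 5s the first time) → inference (about 2s) → reply bubble returned
Check that the $HOME/.live2D/model/llm/ path hint is correct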

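8. Known limitations (v1)

No streaming output: the reply appears only after the model has finished generating. This is acceptable for a 0.5B model; streaming is recommended for 3B+
No multi-GPU/Metal/CUDA support: the feature-flag hook is in place, but v1 enables CPU only
No hot model switching: changing the model requires restarting the app
No quantization tuning: the GGUF's built-in quantization (Q4_K_M) is used
memory/RAG is wired in: RAG injection in brain.rs is automatic and needs no extra configuration
RAG still uses DummyEmbedder (zero vectors), so search results are not relevant. Improving this comes later; getting the chat working comes first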
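
Why zero vectors make retrieval meaningless is easy to see numerically. The sketch below assumes the memory store ranks entries by cosine similarity, which this plan does not confirm; `cosine_similarity` is written for illustration and is not the project's scoring function.

```rust
/// Cosine similarity of two equally sized vectors.
/// Returns 0.0 when either vector has zero length, where the true
/// value is undefined (division by zero).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

With an embedder that returns all zeros, every query and every memory map to the same zero vector, so every candidate gets the same score and the ranking carries no information about relevance.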

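9. Acceptance checklist

cargo check passes
npm run build passes
The chat panel appears on the right after startup
Sending a message gets a normal reply
With no model in place, a clear error with the path is shown
The second message is clearly faster than the first (lazy loading works)
The existing live2d.conf.json stays compatible (old fields are not lost)
Dragging the pet window still works only in the Live2D area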
