---
feature_id: "AISVC"
iteration_id: "v0.8.0-intent-hybrid-routing"
title: "Intent Recognition Hybrid Routing Optimization - Technical Design"
status: "draft"
version: "0.8.0"
created_at: "2026-03-08"
inputs:
  - "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/requirements.md"
  - "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/scope.md"
---

# Intent Recognition Hybrid Routing Optimization - Design (v0.8.0)

## 1. Design Goals and Constraints

### 1.1 Design Goals

- Upgrade intent recognition from single-path rule matching to a three-way "rule + semantic + LLM" hybrid route
- Improve intent-recognition recall and precision
- Provide confidence scores and route-trace logging
- Minimal intrusion: insert the hybrid route only at Step 3; leave the main pipeline unchanged

### 1.2 Hard Constraints

- The existing rule engine stays usable and serves as one input path of the hybrid route
- The externally visible response semantics of `/ai/chat` remain unchanged
- All new logic must be isolated by tenantId
- Keep the `IntentRouter.match()` method backward compatible

---

## 2. Architecture Design

### 2.1 Minimal-Intrusion Architecture Diagram

```
                       Orchestrator 12-Step Pipeline

  Step 1: InputScanner → Step 2: FlowEngine → Step 3: IntentRouter [modified]
                                                    │
                                                    ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                      IntentRouter (Hybrid Routing)                       │
│                                                                          │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                      Parallel Matching Layer                       │  │
│  │                                                                    │  │
│  │  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │  │
│  │  │   RuleMatcher    │  │ SemanticMatcher  │  │     LlmJudge     │  │  │
│  │  │ (existing+score) │  │      (new)       │  │  (conditional)   │  │  │
│  │  │                  │  │                  │  │                  │  │  │
│  │  │    keywords      │  │    embedding     │  │    LLM call      │  │  │
│  │  │    regex         │  │    similarity    │  │    arbitration   │  │  │
│  │  │    score: 0|1    │  │    score: 0~1    │  │    score: 0~1    │  │  │
│  │  └────────┬─────────┘  └────────┬─────────┘  └────────┬─────────┘  │  │
│  │           │                     │                     │            │  │
│  │           └─────────────────────┼─────────────────────┘            │  │
│  │                                 ▼                                  │  │
│  └────────────────────────────────────────────────────────────────────┘  │
│                                    │                                     │
│                                    ▼                                     │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                         FusionPolicy (new)                         │  │
│  │                                                                    │  │
│  │  Input:   rule_result, semantic_result, llm_result                 │  │
│  │  Process: weighted fusion + conflict detection + threshold check   │  │
│  │  Output:  final_intent, final_confidence, decision_reason, trace   │  │
│  │                                                                    │  │
│  └────────────────────────────────────────────────────────────────────┘  │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
                     response_type routing (unchanged)
                       fixed / rag / flow / transfer
```

### 2.2 Insertion Points

| Insertion point | Location | Change |
|--------|------|----------|
| Step 3 entry | orchestrator.py:500 | Call `IntentRouter.match_hybrid()` instead of `match()` |
| IntentRule entity | entities.py:420-463 | Add `intent_vector` and `semantic_examples` fields |
| IntentRouter class | intent/router.py | Add `match_hybrid()`; keep `match()` for backward compatibility |
| New modules | intent/ | Add `semantic_matcher.py`, `llm_judge.py`, `fusion_policy.py`, `models.py` |

---

## 3. Core Interface Design

### 3.1 Data Models

```python
# Location: app/services/intent/models.py

from dataclasses import dataclass, field
import uuid


@dataclass
class RuleMatchResult:
    """Rule match result."""
    rule_id: uuid.UUID | None
    rule: "IntentRule | None"
    match_type: str | None      # "keyword" | "regex" | None
    matched_text: str | None
    score: float                # 1.0 or 0.0
    duration_ms: int


@dataclass
class SemanticCandidate:
    """Semantic match candidate."""
    rule: "IntentRule"
    score: float                # similarity, 0.0 ~ 1.0


@dataclass
class SemanticMatchResult:
    """Semantic match result."""
    candidates: list[SemanticCandidate]   # top-N candidates
    top_score: float
    duration_ms: int
    skipped: bool                         # whether matching was skipped
    skip_reason: str | None               # why it was skipped


@dataclass
class LlmJudgeInput:
    """LLM arbitration input."""
    message: str
    candidates: list[dict]   # candidate intents
    conflict_type: str       # "rule_semantic_conflict" | "gray_zone" | "multi_intent"


@dataclass
class LlmJudgeResult:
    """LLM arbitration result."""
    intent_id: str | None
    intent_name: str | None
    score: float             # 0.0 ~ 1.0
    reasoning: str | None    # the LLM's reasoning
    duration_ms: int
    tokens_used: int
    triggered: bool


@dataclass
class FusionConfig:
    """Fusion configuration."""
    w_rule: float = 0.5
    w_semantic: float = 0.3
    w_llm: float = 0.2
    semantic_threshold: float = 0.7
    conflict_threshold: float = 0.2
    gray_zone_threshold: float = 0.6
    min_trigger_threshold: float = 0.3
    clarify_threshold: float = 0.4
    multi_intent_threshold: float = 0.15
    llm_judge_enabled: bool = True
    semantic_matcher_enabled: bool = True
    # Timeouts referenced by SemanticMatcher / LlmJudge (values per §8.2)
    semantic_matcher_timeout_ms: int = 100
    llm_judge_timeout_ms: int = 2000


@dataclass
class RouteTrace:
    """Route trace log."""
    rule_match: dict = field(default_factory=dict)
    semantic_match: dict = field(default_factory=dict)
    llm_judge: dict = field(default_factory=dict)
    fusion: dict = field(default_factory=dict)


@dataclass
class FusionResult:
    """Fusion decision result."""
    final_intent: "IntentRule | None"
    final_confidence: float
    decision_reason: str
    need_clarify: bool
    clarify_candidates: list["IntentRule"] | None
    trace: RouteTrace
```

### 3.2 RuleMatcher (refactor of the existing IntentRouter)

```python
# Location: app/services/intent/router.py

import time


class RuleMatcher:
    """Rule matcher (based on the existing IntentRouter)."""

    def match(self, message: str, rules: list[IntentRule]) -> RuleMatchResult:
        """
        Keyword + regex matching.

        Algorithm:
        1. Iterate over rules in descending priority order
        2. For each rule, try keyword matching first
        3. If no keyword hits, try the regex patterns
        4. Return the first (i.e. highest-priority) match

        Args:
            message: user message
            rules: rule list (already sorted by priority, descending)

        Returns:
            RuleMatchResult: the match result
        """
        start_time = time.time()
        message_lower = message.lower()

        for rule in rules:
            if not rule.is_enabled:
                continue

            result = self._match_keywords(message, message_lower, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="keyword",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms,
                )

            result = self._match_patterns(message, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="regex",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms,
                )

        duration_ms = int((time.time() - start_time) * 1000)
        return RuleMatchResult(
            rule_id=None,
            rule=None,
            match_type=None,
            matched_text=None,
            score=0.0,
            duration_ms=duration_ms,
        )

    def _match_keywords(self, message: str, message_lower: str,
                        rule: IntentRule) -> IntentMatchResult | None:
        """Keyword matching (keeps the existing logic)."""
        pass

    def _match_patterns(self, message: str,
                        rule: IntentRule) -> IntentMatchResult | None:
        """Regex matching (keeps the existing logic)."""
        pass
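
# ----------------------------------------------------------------------
# Illustrative sketch only (assumption, not production code): the two
# stubs above retain the existing IntentRouter logic. The standalone
# helpers below show the assumed shape of that logic on plain
# keyword/pattern lists.
# ----------------------------------------------------------------------

def sketch_keyword_match(message_lower: str, keywords: list[str]) -> str | None:
    """Return the first keyword contained in the lowercased message, else None."""
    for keyword in keywords:
        if keyword.lower() in message_lower:
            return keyword
    return None


def sketch_pattern_match(message: str, patterns: list[str]) -> str | None:
    """Return the text matched by the first regex pattern that hits, else None."""
    import re
    for pattern in patterns:
        match = re.search(pattern, message)
        if match:
            return match.group(0)
    return None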
```

### 3.3 SemanticMatcher (new)

```python
# Location: app/services/intent/semantic_matcher.py

import asyncio
import time

import numpy as np


class SemanticMatcher:
    """Semantic matcher."""

    def __init__(
        self,
        embedding_provider: EmbeddingProvider,
        config: FusionConfig
    ):
        self._embedding_provider = embedding_provider
        self._config = config

    async def match(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        top_k: int = 3
    ) -> SemanticMatchResult:
        """
        Vector-based semantic matching.

        Matching modes:
        - Mode A: compute similarity directly against the rule's precomputed
          intent_vector
        - Mode B: compute similarities dynamically against the rule's
          semantic_examples and take the best example score

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            top_k: number of candidates to return

        Returns:
            SemanticMatchResult: the match result
        """
        start_time = time.time()

        if not self._config.semantic_matcher_enabled:
            return SemanticMatchResult(
                candidates=[], top_score=0.0, duration_ms=0,
                skipped=True, skip_reason="disabled",
            )

        rules_with_semantic = [r for r in rules if self._has_semantic_config(r)]
        if not rules_with_semantic:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[], top_score=0.0, duration_ms=duration_ms,
                skipped=True, skip_reason="no_semantic_config",
            )

        try:
            message_vector = await asyncio.wait_for(
                self._embedding_provider.embed(message),
                timeout=self._config.semantic_matcher_timeout_ms / 1000
            )
        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[], top_score=0.0, duration_ms=duration_ms,
                skipped=True, skip_reason="embedding_timeout",
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[], top_score=0.0, duration_ms=duration_ms,
                skipped=True, skip_reason=f"embedding_error: {str(e)}",
            )

        candidates = []
        for rule in rules_with_semantic:
            score = await self._calculate_similarity(message_vector, rule)
            if score > 0:
                candidates.append(SemanticCandidate(rule=rule, score=score))

        candidates.sort(key=lambda x: x.score, reverse=True)
        candidates = candidates[:top_k]

        duration_ms = int((time.time() - start_time) * 1000)
        return SemanticMatchResult(
            candidates=candidates,
            top_score=candidates[0].score if candidates else 0.0,
            duration_ms=duration_ms,
            skipped=False,
            skip_reason=None,
        )

    def _has_semantic_config(self, rule: IntentRule) -> bool:
        """Check whether the rule has any semantic configuration."""
        return bool(rule.intent_vector) or bool(rule.semantic_examples)

    async def _calculate_similarity(self, message_vector: list[float],
                                    rule: IntentRule) -> float:
        """Compute the similarity between the message and one rule."""
        if rule.intent_vector:
            return self._cosine_similarity(message_vector, rule.intent_vector)
        elif rule.semantic_examples:
            example_vectors = await self._embedding_provider.embed_batch(
                rule.semantic_examples
            )
            similarities = [
                self._cosine_similarity(message_vector, v)
                for v in example_vectors
            ]
            return max(similarities) if similarities else 0.0
        return 0.0

    def _cosine_similarity(self, v1: list[float], v2: list[float]) -> float:
        """Cosine similarity."""
        v1_arr = np.array(v1)
        v2_arr = np.array(v2)
        return float(
            np.dot(v1_arr, v2_arr)
            / (np.linalg.norm(v1_arr) * np.linalg.norm(v2_arr))
        )
```

### 3.4 LlmJudge (new)

```python
# Location: app/services/intent/llm_judge.py

import asyncio
import json
import re
import time


class LlmJudge:
    """LLM arbiter."""

    JUDGE_PROMPT = """You are an intent-recognition arbiter. Given the user message and the candidate intents, pick the best-matching intent.

User message: {message}

Candidate intents:
{candidates}

Return JSON in this format:
{{
    "intent_id": "ID of the best-matching intent",
    "intent_name": "intent name",
    "confidence": confidence between 0.0 and 1.0,
    "reasoning": "rationale for the decision"
}}
"""

    def __init__(
        self,
        llm_client: LLMClient,
        config: FusionConfig
    ):
        self._llm_client = llm_client
        self._config = config

    def should_trigger(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        config: FusionConfig
    ) -> tuple[bool, str]:
        """
        Decide whether to trigger the LLM judge.

        Trigger conditions:
        1. Conflict: RuleMatcher and SemanticMatcher hit different intents
        2. Gray zone: the highest confidence falls inside the gray-zone range
        3. Multi-intent: several candidate intents have near-equal confidence

        Args:
            rule_result: rule match result
            semantic_result: semantic match result
            config: fusion configuration

        Returns:
            (trigger?, trigger reason)
        """
        if not config.llm_judge_enabled:
            return False, "disabled"

        rule_score = rule_result.score
        semantic_score = semantic_result.top_score

        if rule_score > 0 and semantic_score > 0:
            if rule_result.rule_id != semantic_result.candidates[0].rule.id:
                if abs(rule_score - semantic_score) < config.conflict_threshold:
                    return True, "rule_semantic_conflict"

        max_score = max(rule_score, semantic_score)
        if config.min_trigger_threshold < max_score < config.gray_zone_threshold:
            return True, "gray_zone"

        if len(semantic_result.candidates) >= 2:
            top1_score = semantic_result.candidates[0].score
            top2_score = semantic_result.candidates[1].score
            if abs(top1_score - top2_score) < config.multi_intent_threshold:
                return True, "multi_intent"

        return False, ""

    async def judge(
        self,
        input: LlmJudgeInput,
        tenant_id: str
    ) -> LlmJudgeResult:
        """
        Run LLM arbitration.

        Args:
            input: arbitration input
            tenant_id: tenant ID

        Returns:
            LlmJudgeResult: the arbitration result
        """
        start_time = time.time()

        candidates_text = "\n".join([
            f"- ID: {c['id']}, name: {c['name']}, description: {c.get('description', 'N/A')}"
            for c in input.candidates
        ])
        prompt = self.JUDGE_PROMPT.format(
            message=input.message,
            candidates=candidates_text
        )

        try:
            response = await asyncio.wait_for(
                self._llm_client.generate(
                    messages=[{"role": "user", "content": prompt}],
                    max_tokens=200,
                    temperature=0
                ),
                timeout=self._config.llm_judge_timeout_ms / 1000
            )
            result = self._parse_response(response.content)
            duration_ms = int((time.time() - start_time) * 1000)
            if not result:
                # Align with the §5 failure table: unparseable responses
                # surface as "parse_failed" and score 0.
                return LlmJudgeResult(
                    intent_id=None,
                    intent_name=None,
                    score=0.0,
                    reasoning="parse_failed",
                    duration_ms=duration_ms,
                    tokens_used=response.total_tokens,
                    triggered=True,
                )
            return LlmJudgeResult(
                intent_id=result.get("intent_id"),
                intent_name=result.get("intent_name"),
                score=result.get("confidence", 0.5),  # neutral default if omitted
                reasoning=result.get("reasoning"),
                duration_ms=duration_ms,
                tokens_used=response.total_tokens,
                triggered=True,
            )
        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning="LLM timeout",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True,
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning=f"LLM error: {str(e)}",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True,
            )

    def _parse_response(self, content: str) -> dict:
        """Parse the LLM response; tolerate markdown fences around the JSON."""
        match = re.search(r"\{.*\}", content, re.DOTALL)
        if not match:
            return {}
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return {}
```

### 3.5 FusionPolicy (new)

```python
# Location: app/services/intent/fusion_policy.py


class FusionPolicy:
    """Fusion decision policy."""

    # Evaluated top to bottom. The literal 0.7 / 0.5 bounds mirror the
    # default semantic thresholds in FusionConfig.
    DECISION_PRIORITY = [
        ("rule_high_confidence",
         lambda r, s, l: r.score == 1.0 and r.rule is not None),
        ("llm_judge",
         lambda r, s, l: l is not None and l.triggered and l.intent_id is not None),
        ("semantic_override",
         lambda r, s, l: r.score == 0 and s.top_score > 0.7),
        ("rule_semantic_agree",
         lambda r, s, l: bool(s.candidates) and r.score > 0
                         and s.top_score > 0.5
                         and r.rule_id == s.candidates[0].rule.id),
        ("semantic_fallback", lambda r, s, l: s.top_score > 0.5),
        ("rule_fallback", lambda r, s, l: r.score > 0),
        ("no_match", lambda r, s, l: True),
    ]

    def __init__(self, config: FusionConfig):
        self._config = config

    def fuse(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> FusionResult:
        """
        Fusion decision.

        Args:
            rule_result: rule match result
            semantic_result: semantic match result
            llm_result: LLM arbitration result (may be None)

        Returns:
            FusionResult: the fused result
        """
        trace = RouteTrace(
            rule_match={
                "rule_id": str(rule_result.rule_id) if rule_result.rule_id else None,
                "match_type": rule_result.match_type,
                "matched_text": rule_result.matched_text,
                "score": rule_result.score,
                "duration_ms": rule_result.duration_ms
            },
            semantic_match={
                "top_candidates": [
                    {"rule_id": str(c.rule.id), "name": c.rule.name, "score": c.score}
                    for c in semantic_result.candidates
                ],
                "top_score": semantic_result.top_score,
                "duration_ms": semantic_result.duration_ms,
                "skipped": semantic_result.skipped,
                "skip_reason": semantic_result.skip_reason
            },
            llm_judge={
                "triggered": llm_result.triggered if llm_result else False,
                "intent_id": llm_result.intent_id if llm_result else None,
                "score": llm_result.score if llm_result else 0.0,
                "duration_ms": llm_result.duration_ms if llm_result else 0,
                "tokens_used": llm_result.tokens_used if llm_result else 0
            },
            fusion={}
        )

        final_intent = None
        final_confidence = 0.0
        decision_reason = "no_match"

        for reason, condition in self.DECISION_PRIORITY:
            if condition(rule_result, semantic_result, llm_result):
                decision_reason = reason
                break

        if decision_reason == "rule_high_confidence":
            final_intent = rule_result.rule
            final_confidence = 1.0
        elif decision_reason == "llm_judge" and llm_result:
            final_intent = self._find_rule_by_id(
                llm_result.intent_id, rule_result, semantic_result
            )
            final_confidence = llm_result.score
        elif decision_reason == "semantic_override":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_semantic_agree":
            final_intent = rule_result.rule
            final_confidence = self._calculate_weighted_confidence(
                rule_result, semantic_result, llm_result
            )
        elif decision_reason == "semantic_fallback":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_fallback":
            final_intent = rule_result.rule
            final_confidence = rule_result.score

        need_clarify = final_confidence < self._config.clarify_threshold
        clarify_candidates = None
        if need_clarify and len(semantic_result.candidates) > 1:
            clarify_candidates = [c.rule for c in semantic_result.candidates[:3]]

        trace.fusion = {
            "weights": {
                "w_rule": self._config.w_rule,
                "w_semantic": self._config.w_semantic,
                "w_llm": self._config.w_llm
            },
            "final_confidence": final_confidence,
            "decision_reason": decision_reason
        }

        return FusionResult(
            final_intent=final_intent,
            final_confidence=final_confidence,
            decision_reason=decision_reason,
            need_clarify=need_clarify,
            clarify_candidates=clarify_candidates,
            trace=trace
        )

    def _calculate_weighted_confidence(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> float:
        """Weighted confidence (see §4.1)."""
        rule_score = rule_result.score
        semantic_score = (semantic_result.top_score
                          if not semantic_result.skipped else 0.0)
        llm_score = (llm_result.score
                     if llm_result and llm_result.triggered else 0.0)

        total_weight = self._config.w_rule + self._config.w_semantic
        if llm_result and llm_result.triggered:
            total_weight += self._config.w_llm

        confidence = (
            self._config.w_rule * rule_score
            + self._config.w_semantic * semantic_score
            + self._config.w_llm * llm_score
        ) / total_weight
        return min(1.0, max(0.0, confidence))

    def _find_rule_by_id(
        self,
        intent_id: str | None,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> IntentRule | None:
        """Look up a rule by ID among the rule and semantic candidates."""
        if not intent_id:
            return None
        if rule_result.rule_id and str(rule_result.rule_id) == intent_id:
            return rule_result.rule
        for candidate in semantic_result.candidates:
            if str(candidate.rule.id) == intent_id:
                return candidate.rule
        return None
```

### 3.6 IntentRouter Upgrade

```python
# Location: app/services/intent/router.py

import asyncio


class IntentRouter:
    """Intent router (upgraded)."""

    def __init__(
        self,
        rule_matcher: RuleMatcher,
        semantic_matcher: SemanticMatcher,
        llm_judge: LlmJudge,
        fusion_policy: FusionPolicy,
        config: FusionConfig | None = None
    ):
        self._rule_matcher = rule_matcher
        self._semantic_matcher = semantic_matcher
        self._llm_judge = llm_judge
        self._fusion_policy = fusion_policy
        self._config = config or FusionConfig()

    async def match_hybrid(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        config: FusionConfig | None = None
    ) -> FusionResult:
        """
        Hybrid routing entry point.

        Flow:
        1. Run RuleMatcher and SemanticMatcher in parallel
        2. Decide whether to trigger LlmJudge
        3. Apply FusionPolicy
        4. Return the fused result

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            config: fusion config (optional; currently overrides the default
                for the trigger decision only)

        Returns:
            FusionResult: the fused result
        """
        effective_config = config or self._config

        rule_result, semantic_result = await asyncio.gather(
            asyncio.to_thread(self._rule_matcher.match, message, rules),
            self._semantic_matcher.match(message, rules, tenant_id)
        )

        llm_result = None
        should_trigger, trigger_reason = self._llm_judge.should_trigger(
            rule_result, semantic_result, effective_config
        )
        if should_trigger:
            candidates = self._build_llm_candidates(rule_result, semantic_result)
            llm_result = await self._llm_judge.judge(
                LlmJudgeInput(
                    message=message,
                    candidates=candidates,
                    conflict_type=trigger_reason
                ),
                tenant_id
            )

        return self._fusion_policy.fuse(rule_result, semantic_result, llm_result)

    def match(self, message: str, rules: list[IntentRule]) -> IntentMatchResult | None:
        """
        Original method, kept for backward compatibility.

        Args:
            message: user message
            rules: rule list

        Returns:
            IntentMatchResult | None: the match result
        """
        result = self._rule_matcher.match(message, rules)
        if result.rule:
            return IntentMatchResult(
                rule=result.rule,
                match_type=result.match_type,
                matched=result.matched_text
            )
        return None

    def _build_llm_candidates(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> list[dict]:
        """Build the candidate list handed to the LLM judge."""
        candidates = []
        if rule_result.rule:
            candidates.append({
                "id": str(rule_result.rule_id),
                "name": rule_result.rule.name,
                "description": (
                    f"match type: {rule_result.match_type}, "
                    f"matched text: {rule_result.matched_text}"
                )
            })
        for candidate in semantic_result.candidates[:3]:
            if not any(c["id"] == str(candidate.rule.id) for c in candidates):
                candidates.append({
                    "id": str(candidate.rule.id),
                    "name": candidate.rule.name,
                    "description": f"semantic similarity: {candidate.score:.2f}"
                })
        return candidates
```

---

## 4. Fusion Formula and Default Thresholds

### 4.1 Fusion Formula

```
final_confidence = (w_rule * rule_score
                    + w_semantic * semantic_score
                    + w_llm * llm_score) / total_weight
```

where:

- `total_weight = w_rule + w_semantic + (w_llm if llm_triggered else 0)`
- the result is clamped to `[0.0, 1.0]`
- example: with `rule_score=1.0`, `semantic_score=0.8`, and no LLM trigger,
  `final_confidence = (0.5*1.0 + 0.3*0.8) / 0.8 = 0.925`

### 4.2 Default Threshold Configuration

```python
DEFAULT_FUSION_CONFIG = FusionConfig(
    w_rule=0.5,                       # rule weight
    w_semantic=0.3,                   # semantic weight
    w_llm=0.2,                        # LLM weight
    semantic_threshold=0.7,           # high-confidence semantic threshold
    conflict_threshold=0.2,           # conflict threshold (confidence delta)
    gray_zone_threshold=0.6,          # gray-zone upper bound
    min_trigger_threshold=0.3,        # gray-zone lower bound
    clarify_threshold=0.4,            # clarification trigger threshold
    multi_intent_threshold=0.15,      # multi-intent threshold
    llm_judge_enabled=True,           # enable LLM arbitration
    semantic_matcher_enabled=True,    # enable semantic matching
    semantic_matcher_timeout_ms=100,  # SemanticMatcher timeout (see §8.2)
    llm_judge_timeout_ms=2000,        # LlmJudge timeout (see §8.2)
)
```

### 4.3 Decision Priority

| Priority | Decision reason | Condition |
|--------|----------|------|
| 1 | rule_high_confidence | RuleMatcher hit with score=1.0 |
| 2 | llm_judge | LlmJudge triggered and returned a valid intent |
| 3 | semantic_override | No rule hit, but SemanticMatcher is highly confident |
| 4 | rule_semantic_agree | Rule and semantics agree on the same intent |
| 5 | semantic_fallback | SemanticMatcher has medium confidence |
| 6 | rule_fallback | Only the rule path matched |
| 7 | no_match | All three paths have low confidence |

---

## 5. Exception Handling Strategy

| Failure scenario | Strategy | Observability |
|----------|----------|----------|
| Embedding call fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_error: ..."` |
| Embedding timeout | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_timeout"` |
| Qdrant retrieval fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="qdrant_error"` |
| LLM call fails | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM error: ..."` |
| LLM timeout | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM timeout"` |
| LLM response unparseable | Fall back to the SemanticMatcher result | `llm_judge.reasoning="parse_failed"` |
| Rule cache invalidated | Reload from the database | `rule_match.cache_miss=true` |
| Configuration missing | Use the default configuration | `fusion.using_default_config=true` |

---

## 6. Data Model Changes

### 6.1 IntentRule Entity Extension

```python
# Location: app/models/entities.py


class IntentRule(SQLModel, table=True):
    # ... existing fields ...
    intent_vector: list[float] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True,
                         comment="Intent vector (precomputed)")
    )
    semantic_examples: list[str] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True,
                         comment="Semantic example sentences")
    )
```

### 6.2 ChatMessage Entity Extension

```python
# Location: app/models/entities.py


class ChatMessage(SQLModel, table=True):
    # ... existing fields ...

    route_trace: dict[str, Any] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True,
                         comment="Intent-routing trace log")
    )
```

---

## 7. API Design

### 7.1 Fusion Configuration API

```
GET /admin/intent-rules/fusion-config
PUT /admin/intent-rules/fusion-config
```

### 7.2 Intent Vector Generation API

```
POST /admin/intent-rules/{id}/generate-vector
```

### 7.3 Monitoring API Extension

```
GET /admin/monitoring/conversations/{id}
```

The response gains a `route_trace` field.

---

## 8. Performance Optimization

### 8.1 Parallel Execution

RuleMatcher and SemanticMatcher run in parallel via `asyncio.gather`.

### 8.2 Timeouts

- SemanticMatcher timeout: 100 ms
- LlmJudge timeout: 2000 ms

### 8.3 Caching

- Rule cache: reuse the existing RuleCache (TTL = 60 s)
- Vector cache: optional; consider in a later iteration

---

## 9. Risks and Open Questions

### 9.1 Risks

| Risk | Level | Mitigation |
|------|------|----------|
| Embedding calls add latency | Medium | Enforce a timeout and skip on expiry |
| Frequent LlmJudge triggering raises cost | Medium | Tune the trigger thresholds conservatively |
| Semantic vector configuration is complex | Low | Provide the auto-generation API |

### 9.2 Open Questions

- Does the intent-vector index need a dedicated Qdrant collection?
- Should LlmJudge token consumption be metered and billed separately?
- Should the fusion configuration support per-tenant overrides?
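To sanity-check the fusion formula in §4.1, here is a minimal standalone sketch using the default weights. `weighted_confidence` is a hypothetical helper for illustration only, independent of the production `FusionPolicy`:

```python
# Standalone sketch of the §4.1 fusion formula (default weights).
W_RULE, W_SEMANTIC, W_LLM = 0.5, 0.3, 0.2

def weighted_confidence(rule_score: float, semantic_score: float,
                        llm_score: float, llm_triggered: bool) -> float:
    """(w_rule*r + w_semantic*s + w_llm*l) / total_weight, clamped to [0, 1]."""
    total_weight = W_RULE + W_SEMANTIC + (W_LLM if llm_triggered else 0.0)
    llm_term = W_LLM * llm_score if llm_triggered else 0.0
    score = (W_RULE * rule_score + W_SEMANTIC * semantic_score + llm_term) / total_weight
    return min(1.0, max(0.0, score))

# Rule and semantics agree, judge not triggered: (0.5*1.0 + 0.3*0.8) / 0.8
print(round(weighted_confidence(1.0, 0.8, 0.0, False), 3))  # → 0.925
# Judge triggered: (0.5*1.0 + 0.3*0.8 + 0.2*0.6) / 1.0
print(round(weighted_confidence(1.0, 0.8, 0.6, True), 3))   # → 0.86
```

Note that when the LLM judge is not triggered, its weight drops out of both the numerator and the denominator, so the two remaining scores are renormalized rather than silently penalized.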