---
feature_id: "AISVC"
iteration_id: "v0.8.0-intent-hybrid-routing"
title: "Intent Recognition Hybrid Routing Optimization - Technical Design"
status: "draft"
version: "0.8.0"
created_at: "2026-03-08"
inputs:
  - "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/requirements.md"
  - "spec/ai-service/iterations/v0.8.0-intent-hybrid-routing/scope.md"
---

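# Intent Recognition Hybrid Routing Optimization - Design (v0.8.0)

## 1. Design Goals and Constraints

### 1.1 Design Goals

- Upgrade intent recognition from single-path rule matching to three-way hybrid routing: rules + semantics + LLM
- Improve the recall and accuracy of intent recognition
- Provide confidence scores and a route trace log
- Minimal intrusion: insert hybrid routing only at Step 3 and leave the main pipeline unchanged

### 1.2 Hard Constraints

- The existing rule engine stays usable and serves as one input to hybrid routing
- The external response semantics of `/ai/chat` do not change
- All new logic must be isolated by tenantId
- Keep the `IntentRouter.match()` method for backward compatibility

---
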
## 2. Architecture Design

### 2.1 Minimal-Intrusion Architecture Diagram

```
┌───────────────────────────────────────────────────────────────────────────────┐
│                         Orchestrator 12-Step Pipeline                         │
│                                                                               │
│ Step 1: InputScanner → Step 2: FlowEngine → Step 3: IntentRouter [modified]   │
│                                                           │                   │
│                                                           ▼                   │
│   ┌───────────────────────────────────────────────────────────────────────┐   │
│   │                     IntentRouter (Hybrid Routing)                     │   │
│   │                                                                       │   │
│   │ ┌───────────────────────────────────────────────────────────────────┐ │   │
│   │ │                      Parallel Matching Layer                      │ │   │
│   │ │                                                                   │ │   │
│   │ │  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐  │ │   │
│   │ │  │ RuleMatcher     │   │ SemanticMatcher │   │ LlmJudge        │  │ │   │
│   │ │  │ (existing+score)│   │ (new)           │   │ (conditional)   │  │ │   │
│   │ │  │                 │   │                 │   │                 │  │ │   │
│   │ │  │ keywords        │   │ embedding       │   │ LLM call        │  │ │   │
│   │ │  │ regex           │   │ similarity      │   │ arbitration     │  │ │   │
│   │ │  │ score: 0|1      │   │ score: 0~1      │   │ score: 0~1      │  │ │   │
│   │ │  └────────┬────────┘   └────────┬────────┘   └────────┬────────┘  │ │   │
│   │ │           │                     │                     │           │ │   │
│   │ │           └─────────────────────┼─────────────────────┘           │ │   │
│   │ │                                 ▼                                 │ │   │
│   │ └───────────────────────────────────────────────────────────────────┘ │   │
│   │                                   │                                   │   │
│   │                                   ▼                                   │   │
│   │ ┌───────────────────────────────────────────────────────────────────┐ │   │
│   │ │                        FusionPolicy (new)                         │ │   │
│   │ │                                                                   │ │   │
│   │ │  Input:   rule_result, semantic_result, llm_result                │ │   │
│   │ │  Process: weighted fusion + conflict detection + threshold check  │ │   │
│   │ │  Output:  final_intent, final_confidence, decision_reason, trace  │ │   │
│   │ │                                                                   │ │   │
│   │ └───────────────────────────────────────────────────────────────────┘ │   │
│   │                                                                       │   │
│   └───────────────────────────────────────────────────────────────────────┘   │
│                                       │                                       │
│                                       ▼                                       │
│                       response_type routing (unchanged)                       │
│                         fixed / rag / flow / transfer                         │
└───────────────────────────────────────────────────────────────────────────────┘
```

### 2.2 Insertion Points

| Insertion point | Location | Change |
|-----------------|----------|--------|
| Step 3 entry | orchestrator.py:500 | Call `IntentRouter.match_hybrid()` instead of `match()` |
| IntentRule entity | entities.py:420-463 | Add the `intent_vector` and `semantic_examples` fields |
| IntentRouter class | intent/router.py | Add the `match_hybrid()` method; keep `match()` for backward compatibility |
| New modules | intent/ | Add `semantic_matcher.py`, `llm_judge.py`, `fusion_policy.py`, `models.py` |

---

## 3. Core Interface Design

### 3.1 Data Models

```python
# Location: app/services/intent/models.py

from dataclasses import dataclass, field
from typing import Any
import uuid


@dataclass
class RuleMatchResult:
    """Rule match result."""
    rule_id: uuid.UUID | None
    rule: "IntentRule | None"
    match_type: str | None  # "keyword" | "regex" | None
    matched_text: str | None
    score: float  # 1.0 or 0.0
    duration_ms: int


@dataclass
class SemanticCandidate:
    """Semantic match candidate."""
    rule: "IntentRule"
    score: float  # similarity, 0.0 ~ 1.0


@dataclass
class SemanticMatchResult:
    """Semantic match result."""
    candidates: list[SemanticCandidate]  # Top-N candidates
    top_score: float
    duration_ms: int
    skipped: bool  # whether matching was skipped (e.g. no semantic vector configured)
    skip_reason: str | None  # why it was skipped


@dataclass
class LlmJudgeInput:
    """LLM arbitration input."""
    message: str
    candidates: list[dict]  # candidate intents
    conflict_type: str  # "rule_semantic_conflict" | "gray_zone" | "multi_intent"


@dataclass
class LlmJudgeResult:
    """LLM arbitration result."""
    intent_id: str | None
    intent_name: str | None
    score: float  # 0.0 ~ 1.0
    reasoning: str | None  # the LLM's reasoning
    duration_ms: int
    tokens_used: int
    triggered: bool

    @classmethod
    def empty(cls) -> "LlmJudgeResult":
        """Placeholder used by FusionPolicy when the LLM judge was not triggered."""
        return cls(
            intent_id=None, intent_name=None, score=0.0, reasoning=None,
            duration_ms=0, tokens_used=0, triggered=False,
        )


@dataclass
class FusionConfig:
    """Fusion configuration."""
    w_rule: float = 0.5
    w_semantic: float = 0.3
    w_llm: float = 0.2
    semantic_threshold: float = 0.7
    conflict_threshold: float = 0.2
    gray_zone_threshold: float = 0.6
    min_trigger_threshold: float = 0.3
    clarify_threshold: float = 0.4
    multi_intent_threshold: float = 0.15
    llm_judge_enabled: bool = True
    semantic_matcher_enabled: bool = True
    semantic_matcher_timeout_ms: int = 100  # read by SemanticMatcher (see 8.2)
    llm_judge_timeout_ms: int = 2000  # read by LlmJudge (see 8.2)


@dataclass
class RouteTrace:
    """Route trace log."""
    rule_match: dict = field(default_factory=dict)
    semantic_match: dict = field(default_factory=dict)
    llm_judge: dict = field(default_factory=dict)
    fusion: dict = field(default_factory=dict)


@dataclass
class FusionResult:
    """Fusion decision result."""
    final_intent: "IntentRule | None"
    final_confidence: float
    decision_reason: str
    need_clarify: bool
    clarify_candidates: list["IntentRule"] | None
    trace: RouteTrace
```

### 3.2 RuleMatcher (refactored from the existing IntentRouter)

```python
# Location: app/services/intent/router.py

import time


class RuleMatcher:
    """Rule matcher (based on the existing IntentRouter)."""

    def match(self, message: str, rules: list[IntentRule]) -> RuleMatchResult:
        """
        Keyword + regex matching.

        Matching algorithm:
        1. Iterate over the rules in descending priority order
        2. For each rule, try keyword matching first
        3. If no keyword matches, try regex pattern matching
        4. Return the first match (highest priority)

        Args:
            message: user message
            rules: rule list (already sorted by priority, descending)

        Returns:
            RuleMatchResult: match result
        """
        start_time = time.time()
        message_lower = message.lower()

        for rule in rules:
            if not rule.is_enabled:
                continue

            result = self._match_keywords(message, message_lower, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="keyword",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms
                )

            result = self._match_patterns(message, rule)
            if result:
                duration_ms = int((time.time() - start_time) * 1000)
                return RuleMatchResult(
                    rule_id=rule.id,
                    rule=rule,
                    match_type="regex",
                    matched_text=result.matched,
                    score=1.0,
                    duration_ms=duration_ms
                )

        # No rule matched: return an explicit miss with score 0.0.
        duration_ms = int((time.time() - start_time) * 1000)
        return RuleMatchResult(
            rule_id=None,
            rule=None,
            match_type=None,
            matched_text=None,
            score=0.0,
            duration_ms=duration_ms
        )

    def _match_keywords(self, message: str, message_lower: str, rule: IntentRule) -> IntentMatchResult | None:
        """Keyword matching (existing logic is kept)."""
        pass

    def _match_patterns(self, message: str, rule: IntentRule) -> IntentMatchResult | None:
        """Regex matching (existing logic is kept)."""
        pass
```

### 3.3 SemanticMatcher (new)

```python
# Location: app/services/intent/semantic_matcher.py

import asyncio
import time
from typing import Any

import numpy as np


class SemanticMatcher:
    """Semantic matcher."""

    def __init__(
        self,
        embedding_provider: EmbeddingProvider,
        config: FusionConfig
    ):
        self._embedding_provider = embedding_provider
        self._config = config

    async def match(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        top_k: int = 3
    ) -> SemanticMatchResult:
        """
        Vector-based semantic matching.

        Matching modes:
        - Mode A: compute similarity directly against the rule's precomputed intent_vector
        - Mode B: embed the rule's semantic_examples on the fly and score against them
          (the code below takes the maximum similarity over the examples)

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            top_k: number of candidates to return

        Returns:
            SemanticMatchResult: match result
        """
        start_time = time.time()

        if not self._config.semantic_matcher_enabled:
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=0,
                skipped=True,
                skip_reason="disabled"
            )

        rules_with_semantic = [r for r in rules if self._has_semantic_config(r)]
        if not rules_with_semantic:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason="no_semantic_config"
            )

        try:
            message_vector = await asyncio.wait_for(
                self._embedding_provider.embed(message),
                timeout=self._config.semantic_matcher_timeout_ms / 1000
            )
        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason="embedding_timeout"
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return SemanticMatchResult(
                candidates=[],
                top_score=0.0,
                duration_ms=duration_ms,
                skipped=True,
                skip_reason=f"embedding_error: {e}"
            )

        candidates = []
        for rule in rules_with_semantic:
            score = await self._calculate_similarity(message_vector, rule)
            if score > 0:
                candidates.append(SemanticCandidate(rule=rule, score=score))

        candidates.sort(key=lambda x: x.score, reverse=True)
        candidates = candidates[:top_k]

        duration_ms = int((time.time() - start_time) * 1000)
        return SemanticMatchResult(
            candidates=candidates,
            top_score=candidates[0].score if candidates else 0.0,
            duration_ms=duration_ms,
            skipped=False,
            skip_reason=None
        )

    def _has_semantic_config(self, rule: IntentRule) -> bool:
        """Check whether the rule has any semantic configuration."""
        return bool(rule.intent_vector) or bool(rule.semantic_examples)

    async def _calculate_similarity(self, message_vector: list[float], rule: IntentRule) -> float:
        """Compute the similarity between the message and a rule."""
        if rule.intent_vector:
            return self._cosine_similarity(message_vector, rule.intent_vector)
        elif rule.semantic_examples:
            example_vectors = await self._embedding_provider.embed_batch(rule.semantic_examples)
            similarities = [
                self._cosine_similarity(message_vector, v)
                for v in example_vectors
            ]
            return max(similarities) if similarities else 0.0
        return 0.0

    def _cosine_similarity(self, v1: list[float], v2: list[float]) -> float:
        """Compute cosine similarity; returns 0.0 if either vector has zero norm."""
        v1_arr = np.array(v1)
        v2_arr = np.array(v2)
        denom = np.linalg.norm(v1_arr) * np.linalg.norm(v2_arr)
        if denom == 0:
            return 0.0
        return float(np.dot(v1_arr, v2_arr) / denom)
```

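The scoring and Top-K selection in SemanticMatcher can be tried in isolation. The following is a minimal, dependency-free sketch, assuming plain Python lists in place of NumPy arrays; `cosine_similarity`, `rank_top_k`, the rule names, and the 2-dimensional vectors are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

import math


def cosine_similarity(v1: list[float], v2: list[float]) -> float:
    """Cosine similarity that returns 0.0 instead of dividing by a zero norm."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


def rank_top_k(
    message_vector: list[float],
    rule_vectors: dict[str, list[float]],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Score every rule vector, drop non-positive scores, keep the best top_k."""
    scored = [
        (rule_id, cosine_similarity(message_vector, vec))
        for rule_id, vec in rule_vectors.items()
    ]
    scored = [item for item in scored if item[1] > 0]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical 2-dimensional vectors, for illustration only.
ranked = rank_top_k(
    [1.0, 0.0],
    {
        "refund": [1.0, 0.0],
        "greeting": [0.0, 1.0],
        "order": [1.0, 1.0],
        "empty": [0.0, 0.0],
    },
    top_k=2,
)
```

Here `ranked` keeps `refund` and `order`, in that order; the orthogonal `greeting` vector and the all-zero `empty` vector both score 0 and are filtered out, mirroring the `score > 0` check in `match()`.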
### 3.4 LlmJudge (new)

```python
# Location: app/services/intent/llm_judge.py

import asyncio
import json
import time


class LlmJudge:
    """LLM arbiter."""

    JUDGE_PROMPT = """You are an intent recognition arbiter. Given the user message and the candidate intents, pick the best-matching intent.

User message: {message}

Candidate intents:
{candidates}

Return JSON in this format:
{{
    "intent_id": "ID of the best-matching intent",
    "intent_name": "intent name",
    "confidence": a confidence between 0.0 and 1.0,
    "reasoning": "why this intent was chosen"
}}
"""

    def __init__(
        self,
        llm_client: LLMClient,
        config: FusionConfig
    ):
        self._llm_client = llm_client
        self._config = config

    def should_trigger(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        config: FusionConfig
    ) -> tuple[bool, str]:
        """
        Decide whether to trigger the LLM judge.

        Trigger conditions:
        1. Conflict: RuleMatcher and SemanticMatcher hit different intents
        2. Gray zone: the highest confidence falls inside the gray-zone range
        3. Multi-intent: several candidate intents have close confidence

        Args:
            rule_result: rule match result
            semantic_result: semantic match result
            config: fusion configuration

        Returns:
            (whether to trigger, trigger reason)
        """
        if not config.llm_judge_enabled:
            return False, "disabled"

        rule_score = rule_result.score
        semantic_score = semantic_result.top_score

        # 1. Conflict: both paths hit, on different intents, with close scores.
        if rule_score > 0 and semantic_score > 0:
            if rule_result.rule_id != semantic_result.candidates[0].rule.id:
                if abs(rule_score - semantic_score) < config.conflict_threshold:
                    return True, "rule_semantic_conflict"

        # 2. Gray zone: the best score is neither clearly low nor clearly high.
        max_score = max(rule_score, semantic_score)
        if config.min_trigger_threshold < max_score < config.gray_zone_threshold:
            return True, "gray_zone"

        # 3. Multi-intent: the top two semantic candidates are nearly tied.
        if len(semantic_result.candidates) >= 2:
            top1_score = semantic_result.candidates[0].score
            top2_score = semantic_result.candidates[1].score
            if abs(top1_score - top2_score) < config.multi_intent_threshold:
                return True, "multi_intent"

        return False, ""

    async def judge(
        self,
        judge_input: LlmJudgeInput,
        tenant_id: str
    ) -> LlmJudgeResult:
        """
        LLM arbitration.

        Args:
            judge_input: arbitration input
            tenant_id: tenant ID

        Returns:
            LlmJudgeResult: arbitration result
        """
        start_time = time.time()

        candidates_text = "\n".join([
            f"- ID: {c['id']}, name: {c['name']}, description: {c.get('description', 'N/A')}"
            for c in judge_input.candidates
        ])

        prompt = self.JUDGE_PROMPT.format(
            message=judge_input.message,
            candidates=candidates_text
        )

        try:
            response = await asyncio.wait_for(
                self._llm_client.generate(
                    messages=[{"role": "user", "content": prompt}],
                    max_tokens=200,
                    temperature=0
                ),
                timeout=self._config.llm_judge_timeout_ms / 1000
            )

            result = self._parse_response(response.content)
            duration_ms = int((time.time() - start_time) * 1000)

            if result is None:
                # Unparseable reply: report it as in section 5 instead of guessing a score.
                return LlmJudgeResult(
                    intent_id=None,
                    intent_name=None,
                    score=0.0,
                    reasoning="parse_failed",
                    duration_ms=duration_ms,
                    tokens_used=response.total_tokens,
                    triggered=True
                )

            return LlmJudgeResult(
                intent_id=result.get("intent_id"),
                intent_name=result.get("intent_name"),
                score=result.get("confidence", 0.5),
                reasoning=result.get("reasoning"),
                duration_ms=duration_ms,
                tokens_used=response.total_tokens,
                triggered=True
            )

        except asyncio.TimeoutError:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning="LLM timeout",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True
            )
        except Exception as e:
            duration_ms = int((time.time() - start_time) * 1000)
            return LlmJudgeResult(
                intent_id=None,
                intent_name=None,
                score=0.0,
                reasoning=f"LLM error: {e}",
                duration_ms=duration_ms,
                tokens_used=0,
                triggered=True
            )

    def _parse_response(self, content: str) -> dict | None:
        """Parse the LLM response; returns None when it is not a JSON object."""
        try:
            parsed = json.loads(content)
        except json.JSONDecodeError:
            return None
        return parsed if isinstance(parsed, dict) else None
```

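LLM replies often wrap the JSON object in a Markdown fence or surround it with prose, which a bare `json.loads` rejects. The sketch below shows one tolerant way to parse such replies, assuming that lenient extraction is acceptable for the judge; `parse_judge_reply` is a hypothetical helper, not part of the design above, and the clamping of `confidence` into `[0.0, 1.0]` is likewise an assumption.

```python
from __future__ import annotations

import json
import re

# Build the fence marker from single backticks so this snippet can itself
# live inside a fenced Markdown block.
_FENCE = "`" * 3
_FENCE_RE = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL)


def parse_judge_reply(content: str) -> dict | None:
    """Return the parsed reply, or None when no valid JSON object is found."""
    match = _FENCE_RE.search(content)
    candidate = match.group(1) if match else content

    # Narrow to the outermost {...} span in case prose surrounds the object.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        return None

    try:
        data = json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None

    # Clamp confidence into [0.0, 1.0]; a missing or non-numeric value becomes 0.0.
    try:
        data["confidence"] = min(1.0, max(0.0, float(data.get("confidence", 0.0))))
    except (TypeError, ValueError):
        data["confidence"] = 0.0
    return data
```

A `None` return maps naturally onto the `parse_failed` case in the exception-handling table, so the caller can fall back without inventing a score.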
### 3.5 FusionPolicy (new)

```python
# Location: app/services/intent/fusion_policy.py

class FusionPolicy:
    """Fusion decision policy."""

    # Ordered (reason, predicate) pairs; the first predicate that holds wins.
    # r = rule result, s = semantic result, l = LLM judge result.
    DECISION_PRIORITY = [
        ("rule_high_confidence", lambda r, s, l: r.score == 1.0 and r.rule is not None),
        ("llm_judge", lambda r, s, l: l.triggered and l.intent_id is not None),
        ("semantic_override", lambda r, s, l: r.score == 0 and s.top_score > 0.7),
        ("rule_semantic_agree", lambda r, s, l: (
            r.score > 0 and s.top_score > 0.5 and r.rule_id == s.candidates[0].rule.id
        ) if s.candidates else False),
        ("semantic_fallback", lambda r, s, l: s.top_score > 0.5),
        ("rule_fallback", lambda r, s, l: r.score > 0),
        ("no_match", lambda r, s, l: True),
    ]

    def __init__(self, config: FusionConfig):
        self._config = config

    def fuse(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> FusionResult:
        """
        Fusion decision.

        Args:
            rule_result: rule match result
            semantic_result: semantic match result
            llm_result: LLM arbitration result (may be None)

        Returns:
            FusionResult: fusion result
        """
        trace = RouteTrace(
            rule_match={
                "rule_id": str(rule_result.rule_id) if rule_result.rule_id else None,
                "match_type": rule_result.match_type,
                "matched_text": rule_result.matched_text,
                "score": rule_result.score,
                "duration_ms": rule_result.duration_ms
            },
            semantic_match={
                "top_candidates": [
                    {"rule_id": str(c.rule.id), "name": c.rule.name, "score": c.score}
                    for c in semantic_result.candidates
                ],
                "top_score": semantic_result.top_score,
                "duration_ms": semantic_result.duration_ms,
                "skipped": semantic_result.skipped,
                "skip_reason": semantic_result.skip_reason
            },
            llm_judge={
                "triggered": llm_result.triggered if llm_result else False,
                "intent_id": llm_result.intent_id if llm_result else None,
                "score": llm_result.score if llm_result else 0.0,
                "duration_ms": llm_result.duration_ms if llm_result else 0,
                "tokens_used": llm_result.tokens_used if llm_result else 0
            },
            fusion={}
        )

        final_intent = None
        final_confidence = 0.0
        decision_reason = "no_match"

        for reason, condition in self.DECISION_PRIORITY:
            if condition(rule_result, semantic_result, llm_result or LlmJudgeResult.empty()):
                decision_reason = reason
                break

        if decision_reason == "rule_high_confidence":
            final_intent = rule_result.rule
            final_confidence = 1.0
        elif decision_reason == "llm_judge" and llm_result:
            final_intent = self._find_rule_by_id(llm_result.intent_id, rule_result, semantic_result)
            final_confidence = llm_result.score
        elif decision_reason == "semantic_override":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_semantic_agree":
            final_intent = rule_result.rule
            final_confidence = self._calculate_weighted_confidence(rule_result, semantic_result, llm_result)
        elif decision_reason == "semantic_fallback":
            final_intent = semantic_result.candidates[0].rule
            final_confidence = semantic_result.top_score
        elif decision_reason == "rule_fallback":
            final_intent = rule_result.rule
            final_confidence = rule_result.score

        # Low confidence: ask the user to clarify, offering the top semantic candidates.
        need_clarify = final_confidence < self._config.clarify_threshold
        clarify_candidates = None
        if need_clarify and len(semantic_result.candidates) > 1:
            clarify_candidates = [c.rule for c in semantic_result.candidates[:3]]

        trace.fusion = {
            "weights": {
                "w_rule": self._config.w_rule,
                "w_semantic": self._config.w_semantic,
                "w_llm": self._config.w_llm
            },
            "final_confidence": final_confidence,
            "decision_reason": decision_reason
        }

        return FusionResult(
            final_intent=final_intent,
            final_confidence=final_confidence,
            decision_reason=decision_reason,
            need_clarify=need_clarify,
            clarify_candidates=clarify_candidates,
            trace=trace
        )

    def _calculate_weighted_confidence(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult,
        llm_result: LlmJudgeResult | None
    ) -> float:
        """Compute the weighted confidence."""
        rule_score = rule_result.score
        semantic_score = semantic_result.top_score if not semantic_result.skipped else 0.0
        llm_score = llm_result.score if llm_result and llm_result.triggered else 0.0

        # The LLM weight only counts toward the total when the judge was triggered.
        total_weight = self._config.w_rule + self._config.w_semantic
        if llm_result and llm_result.triggered:
            total_weight += self._config.w_llm

        confidence = (
            self._config.w_rule * rule_score +
            self._config.w_semantic * semantic_score +
            self._config.w_llm * llm_score
        ) / total_weight

        return min(1.0, max(0.0, confidence))

    def _find_rule_by_id(
        self,
        intent_id: str | None,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> IntentRule | None:
        """Look up a rule by ID among the rule hit and the semantic candidates."""
        if not intent_id:
            return None

        if rule_result.rule_id and str(rule_result.rule_id) == intent_id:
            return rule_result.rule

        for candidate in semantic_result.candidates:
            if str(candidate.rule.id) == intent_id:
                return candidate.rule

        return None
```

### 3.6 IntentRouter Upgrade

```python
# Location: app/services/intent/router.py

import asyncio


class IntentRouter:
    """Intent router (upgraded)."""

    def __init__(
        self,
        rule_matcher: RuleMatcher,
        semantic_matcher: SemanticMatcher,
        llm_judge: LlmJudge,
        fusion_policy: FusionPolicy,
        config: FusionConfig | None = None
    ):
        self._rule_matcher = rule_matcher
        self._semantic_matcher = semantic_matcher
        self._llm_judge = llm_judge
        self._fusion_policy = fusion_policy
        self._config = config or FusionConfig()

    async def match_hybrid(
        self,
        message: str,
        rules: list[IntentRule],
        tenant_id: str,
        config: FusionConfig | None = None
    ) -> FusionResult:
        """
        Hybrid routing entry point.

        Flow:
        1. Run RuleMatcher + SemanticMatcher in parallel
        2. Decide whether to trigger LlmJudge
        3. Run FusionPolicy
        4. Return the fusion result

        Args:
            message: user message
            rules: rule list
            tenant_id: tenant ID
            config: fusion configuration (optional; overrides the default)

        Returns:
            FusionResult: fusion result
        """
        effective_config = config or self._config

        # RuleMatcher is synchronous, so it runs in a worker thread next to the async matcher.
        rule_result, semantic_result = await asyncio.gather(
            asyncio.to_thread(self._rule_matcher.match, message, rules),
            self._semantic_matcher.match(message, rules, tenant_id)
        )

        llm_result = None
        should_trigger, trigger_reason = self._llm_judge.should_trigger(
            rule_result, semantic_result, effective_config
        )

        if should_trigger:
            candidates = self._build_llm_candidates(rule_result, semantic_result)
            llm_result = await self._llm_judge.judge(
                LlmJudgeInput(
                    message=message,
                    candidates=candidates,
                    conflict_type=trigger_reason
                ),
                tenant_id
            )

        fusion_result = self._fusion_policy.fuse(
            rule_result, semantic_result, llm_result
        )

        return fusion_result

    def match(self, message: str, rules: list[IntentRule]) -> IntentMatchResult | None:
        """
        Original method, kept for backward compatibility.

        Args:
            message: user message
            rules: rule list

        Returns:
            IntentMatchResult | None: match result
        """
        result = self._rule_matcher.match(message, rules)
        if result.rule:
            return IntentMatchResult(
                rule=result.rule,
                match_type=result.match_type,
                matched=result.matched_text
            )
        return None

    def _build_llm_candidates(
        self,
        rule_result: RuleMatchResult,
        semantic_result: SemanticMatchResult
    ) -> list[dict]:
        """Build the candidate list for the LLM judge (rule hit first, then semantic Top-3, deduplicated)."""
        candidates = []

        if rule_result.rule:
            candidates.append({
                "id": str(rule_result.rule_id),
                "name": rule_result.rule.name,
                "description": f"match type: {rule_result.match_type}, matched text: {rule_result.matched_text}"
            })

        for candidate in semantic_result.candidates[:3]:
            if not any(c["id"] == str(candidate.rule.id) for c in candidates):
                candidates.append({
                    "id": str(candidate.rule.id),
                    "name": candidate.rule.name,
                    "description": f"semantic similarity: {candidate.score:.2f}"
                })

        return candidates
```

---

## 4. Fusion Formula and Default Thresholds

### 4.1 Fusion Formula

```
final_confidence = (w_rule * rule_score + w_semantic * semantic_score + w_llm * llm_score) / total_weight
```

Where:

- `total_weight = w_rule + w_semantic + (w_llm if llm_triggered else 0)`
- The result is clamped to the range `[0.0, 1.0]`

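As a worked example of the formula above, the sketch below evaluates it with the default weights (0.5 / 0.3 / 0.2) for a rule hit with a semantic score of 0.8, once without and once with an LLM score of 0.9. The function name `fuse_confidence` and the input scores are hypothetical; passing `llm_score=None` stands for "LLM judge not triggered".

```python
def fuse_confidence(rule_score, semantic_score, llm_score=None,
                    w_rule=0.5, w_semantic=0.3, w_llm=0.2):
    """Weighted average from section 4.1; llm_score=None means the judge was not triggered."""
    total_weight = w_rule + w_semantic
    weighted = w_rule * rule_score + w_semantic * semantic_score
    if llm_score is not None:
        # The LLM weight joins both numerator and denominator only when triggered.
        total_weight += w_llm
        weighted += w_llm * llm_score
    return min(1.0, max(0.0, weighted / total_weight))


# (0.5 * 1.0 + 0.3 * 0.8) / 0.8
without_llm = fuse_confidence(1.0, 0.8)

# (0.5 * 1.0 + 0.3 * 0.8 + 0.2 * 0.9) / 1.0
with_llm = fuse_confidence(1.0, 0.8, llm_score=0.9)
```

Up to floating-point rounding, `without_llm` is 0.925 and `with_llm` is 0.92: because the denominator grows with the LLM weight, a triggered judge can lower the fused confidence even when its own score is high.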
### 4.2 Default Threshold Configuration

```python
DEFAULT_FUSION_CONFIG = FusionConfig(
    w_rule=0.5,                     # rule weight
    w_semantic=0.3,                 # semantic weight
    w_llm=0.2,                      # LLM weight
    semantic_threshold=0.7,         # high-confidence threshold for semantic matching
    conflict_threshold=0.2,         # conflict threshold (confidence difference)
    gray_zone_threshold=0.6,        # gray-zone upper bound
    min_trigger_threshold=0.3,      # gray-zone lower bound
    clarify_threshold=0.4,          # clarification trigger threshold
    multi_intent_threshold=0.15,    # multi-intent threshold
    llm_judge_enabled=True,         # enable LLM arbitration
    semantic_matcher_enabled=True,  # enable semantic matching
)
```

### 4.3 Decision Priority

| Priority | Decision reason | Condition |
|----------|-----------------|-----------|
| 1 | rule_high_confidence | RuleMatcher hit with score=1.0 |
| 2 | llm_judge | LlmJudge was triggered and returned a valid intent |
| 3 | semantic_override | RuleMatcher missed but SemanticMatcher is high-confidence |
| 4 | rule_semantic_agree | Rule and semantic matching agree on the same intent |
| 5 | semantic_fallback | SemanticMatcher has medium confidence |
| 6 | rule_fallback | Rule match only |
| 7 | no_match | All three paths are low-confidence |

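The priority table can be read as an ordered list of predicates in which the first one that holds decides the outcome. The sketch below illustrates that first-match evaluation with a simplified, hypothetical `Scores` type standing in for the three matcher results; the thresholds 0.7 and 0.5 are the ones hard-coded in `DECISION_PRIORITY` in section 3.5.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Simplified stand-in for the three matcher results (hypothetical type)."""
    rule_score: float          # 1.0 on a rule hit, else 0.0
    semantic_score: float      # top semantic similarity
    same_intent: bool = False  # rule hit and top semantic candidate agree
    llm_valid: bool = False    # LLM judge was triggered and returned an intent


# Ordered (reason, predicate) pairs; the first predicate that holds wins.
DECISION_PRIORITY = [
    ("rule_high_confidence", lambda s: s.rule_score == 1.0),
    ("llm_judge", lambda s: s.llm_valid),
    ("semantic_override", lambda s: s.rule_score == 0 and s.semantic_score > 0.7),
    ("rule_semantic_agree",
     lambda s: s.rule_score > 0 and s.semantic_score > 0.5 and s.same_intent),
    ("semantic_fallback", lambda s: s.semantic_score > 0.5),
    ("rule_fallback", lambda s: s.rule_score > 0),
    ("no_match", lambda s: True),
]


def decide(scores: Scores) -> str:
    """Return the first decision reason whose predicate is satisfied."""
    return next(reason for reason, predicate in DECISION_PRIORITY if predicate(scores))


rule_hit = decide(Scores(rule_score=1.0, semantic_score=0.9, llm_valid=True))
semantic_only = decide(Scores(rule_score=0.0, semantic_score=0.8))
gray_zone = decide(Scores(rule_score=0.0, semantic_score=0.45, llm_valid=True))
```

One consequence worth noting: since the rule score from section 3.2 is either 0.0 or 1.0, every rule hit satisfies the first predicate, so in this sketch `rule_semantic_agree` and `rule_fallback` would only become reachable if rule scores were graded.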
---
## 5. Exception Handling Strategy

| Failure scenario | Handling | Observability |
|------------------|----------|---------------|
| Embedding call fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_error: ..."` |
| Embedding times out | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="embedding_timeout"` |
| Qdrant retrieval fails | Skip SemanticMatcher; use RuleMatcher only | `semantic_match.skipped=true, skip_reason="qdrant_error"` |
| LLM call fails | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM error: ..."` |
| LLM times out | Fall back to the SemanticMatcher result | `llm_judge.reasoning="LLM timeout"` |
| LLM response cannot be parsed | Fall back to the SemanticMatcher result | `llm_judge.reasoning="parse_failed"` |
| Rule cache is invalidated | Reload from the database | `rule_match.cache_miss=true` |
| Configuration is missing | Use the default configuration | `fusion.using_default_config=true` |

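The embedding rows of the table share one pattern: bound the call with a timeout, convert any failure into a skip reason, and let routing continue on the rule path. A minimal sketch of that pattern follows, assuming an async `embed` callable; `embed_with_fallback` and the three stand-in providers are hypothetical names used only for illustration.

```python
import asyncio


async def embed_with_fallback(embed, message: str, timeout_ms: int = 100):
    """Return (vector, skip_reason); vector is None when the matcher should be skipped."""
    try:
        vector = await asyncio.wait_for(embed(message), timeout=timeout_ms / 1000)
        return vector, None
    except asyncio.TimeoutError:
        return None, "embedding_timeout"
    except Exception as exc:  # any provider failure degrades to rule-only routing
        return None, f"embedding_error: {exc}"


async def _slow_embed(message: str):
    await asyncio.sleep(0.2)  # hypothetical provider that exceeds the timeout
    return [0.1, 0.2]


async def _failing_embed(message: str):
    raise RuntimeError("provider down")  # hypothetical provider failure


async def _fast_embed(message: str):
    return [0.1, 0.2]


slow_result = asyncio.run(embed_with_fallback(_slow_embed, "hi", timeout_ms=20))
failed_result = asyncio.run(embed_with_fallback(_failing_embed, "hi"))
ok_result = asyncio.run(embed_with_fallback(_fast_embed, "hi"))
```

The timeout handler is listed before the generic one so that a timeout is reported with its own reason rather than being folded into `embedding_error`.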
---
## 6. Data Model Changes

### 6.1 IntentRule Entity Extension

```python
# Location: app/models/entities.py

class IntentRule(SQLModel, table=True):
    # ... existing fields ...

    intent_vector: list[float] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="Intent vector (precomputed)")
    )

    semantic_examples: list[str] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="List of semantic example sentences")
    )
```

### 6.2 ChatMessage Entity Extension

```python
# Location: app/models/entities.py

class ChatMessage(SQLModel, table=True):
    # ... existing fields ...

    route_trace: dict[str, Any] | None = Field(
        default=None,
        sa_column=Column(JSONB, nullable=True, comment="Intent route trace log")
    )
```

---

## 7. API Design

### 7.1 Fusion Configuration API

```
GET /admin/intent-rules/fusion-config
PUT /admin/intent-rules/fusion-config
```

### 7.2 Intent Vector Generation API

```
POST /admin/intent-rules/{id}/generate-vector
```

### 7.3 Monitoring API Extension

```
GET /admin/monitoring/conversations/{id}
```

The response gains a new `route_trace` field.

---

## 8. Performance Optimization

### 8.1 Parallel Execution

RuleMatcher and SemanticMatcher run concurrently via `asyncio.gather`; the synchronous RuleMatcher is dispatched to a worker thread with `asyncio.to_thread`.

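The following is a small, self-contained sketch of that concurrency pattern, assuming one blocking matcher and one coroutine-based matcher. `rule_match`, `semantic_match`, and `match_parallel` are hypothetical stand-ins, and the sleeps merely simulate work.

```python
import asyncio
import time


def rule_match(message: str) -> str:
    """Stand-in for the synchronous rule matcher (hypothetical)."""
    time.sleep(0.05)  # simulated blocking work
    return "rule:" + message


async def semantic_match(message: str) -> str:
    """Stand-in for the asynchronous semantic matcher (hypothetical)."""
    await asyncio.sleep(0.05)  # simulated embedding call
    return "semantic:" + message


async def match_parallel(message: str):
    # to_thread moves the blocking call off the event loop, so the coroutine
    # can make progress while the thread is busy.
    return await asyncio.gather(
        asyncio.to_thread(rule_match, message),
        semantic_match(message),
    )


rule_result, semantic_result = asyncio.run(match_parallel("refund"))
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why the two values can be unpacked positionally regardless of which matcher finishes first.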
### 8.2 Timeout Control

- SemanticMatcher timeout: 100 ms
- LlmJudge timeout: 2000 ms

### 8.3 Caching Strategy

- Rule cache: reuse the existing RuleCache (TTL=60s)
- Vector cache: optional; to be considered in a later iteration

---

## 9. Risks and Open Questions

### 9.1 Risks

| Risk | Level | Mitigation |
|------|-------|------------|
| Embedding calls add latency | Medium | Set a timeout; skip on timeout |
| Frequent LLM Judge triggering increases cost | Medium | Configure sensible trigger thresholds |
| Semantic vector configuration is complex | Low | Provide an automatic generation API |

### 9.2 Open Questions

- Does the intent vector index need its own Qdrant Collection?
- Does LlmJudge token consumption need separate billing statistics?
- Does the fusion configuration need to support per-tenant differences?