docs: add fix-double-finalize-and-bindings-api implementation plan
@@ -0,0 +1,241 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AuditCtx: Structure and Usage Quick Reference</title>
<style>
  :root {
    --c-bg: #fafaf8;
    --c-panel: #ffffff;
    --c-text: #1a1a1a;
    --c-muted: #666;
    --c-subtle: #999;
    --c-border: #e5e5e0;
    --c-accent: #2a4d6e;
    --c-accent-soft: #eaf0f6;
    --c-code-bg: #2d2d2a;
    --c-code-text: #f0f0e8;
    --c-inline-bg: #f0ede5;
    --c-inline-text: #5a3a1a;
    --font-cn: "PingFang SC", "Microsoft YaHei", system-ui, sans-serif;
    --font-mono: "JetBrains Mono", "SF Mono", Menlo, monospace;
  }

  * { box-sizing: border-box; }
  html, body {
    margin: 0; padding: 0;
    background: var(--c-bg); color: var(--c-text);
    font-family: var(--font-cn); line-height: 1.7; font-size: 15px;
  }

  .layout {
    max-width: 880px;
    margin: 0 auto;
    padding: 48px 56px 80px;
  }

  h1 {
    font-size: 28px; font-weight: 600; margin: 0 0 6px;
    border-bottom: 3px solid var(--c-accent); padding-bottom: 14px;
  }
  h2 {
    font-size: 20px; font-weight: 600;
    margin: 40px 0 14px;
    padding-left: 12px;
    border-left: 4px solid var(--c-accent);
  }
  .subtitle { color: var(--c-muted); font-size: 14px; margin-bottom: 16px; }

  .lead {
    background: var(--c-accent-soft);
    border-left: 4px solid var(--c-accent);
    padding: 14px 20px;
    margin: 20px 0 28px;
    border-radius: 0 6px 6px 0;
  }
  .lead strong { color: var(--c-accent); }

  p { margin: 10px 0; }
  ul { padding-left: 22px; margin: 8px 0; }
  ul li { margin: 5px 0; }
  ul li::marker { color: var(--c-accent); }

  code {
    font-family: var(--font-mono);
    background: var(--c-inline-bg);
    color: var(--c-inline-text);
    padding: 1px 6px; border-radius: 3px;
    font-size: 13px;
  }
  pre {
    background: var(--c-code-bg);
    color: var(--c-code-text);
    padding: 16px 20px;
    border-radius: 6px;
    overflow-x: auto;
    margin: 12px 0;
    font-family: var(--font-mono);
    font-size: 13px;
    line-height: 1.55;
  }
  pre code {
    background: transparent; color: inherit; padding: 0; font-size: inherit;
  }
  .cmt { color: #8a9a7a; }

  table {
    width: 100%; border-collapse: collapse; margin: 12px 0;
    background: var(--c-panel);
    border: 1px solid var(--c-border);
    border-radius: 6px; overflow: hidden;
    font-size: 14px;
  }
  th, td {
    padding: 9px 13px; text-align: left; vertical-align: top;
    border-bottom: 1px solid var(--c-border);
  }
  th {
    background: #f5f3ec; font-weight: 600;
    color: var(--c-muted); font-size: 13px;
  }
  tr:last-child td { border-bottom: none; }
  td code { font-size: 12px; }

  .fileref {
    font-family: var(--font-mono); font-size: 12px;
    background: #efece4; color: #5a3a1a;
    padding: 1px 6px; border-radius: 3px;
    border: 1px solid #d8d2c0;
  }

  .note {
    background: var(--c-accent-soft);
    border-left: 4px solid var(--c-accent);
    padding: 10px 16px; margin: 14px 0;
    border-radius: 0 6px 6px 0;
    font-size: 14px;
  }
  .note .tag { font-weight: 700; color: var(--c-accent); margin-right: 6px; }

  .field-cat-身份 { color: #2a5a8a; font-weight: 600; }
  .field-cat-装配 { color: #80560a; font-weight: 600; }
  .field-cat-产物 { color: #4a7050; font-weight: 600; }
  .field-cat-诊断 { color: #888; font-weight: 600; }
</style>
</head>
<body>

<div class="layout">

<h1>AuditCtx: Structure and Usage Quick Reference</h1>
<div class="subtitle">Immutable context for the LeAudit service orchestration layer · 2026-04-27</div>

<div class="lead">
<strong>What it is:</strong>
<code>AuditCtx</code> (<span class="fileref">src/leaudit/services/audit_ctx.py</span>) is the <strong>immutable skeleton</strong> of a single audit run. It is declared with <code>@dataclass(frozen=True)</code>, constructed once, and passed from service to service; each stage derives a new ctx through <code>with_xxx()</code> and leaves the old ctx unchanged.
</div>

<h2>Structure</h2>

<table>
<thead><tr><th style="width:24%">Field</th><th style="width:14%">Category</th><th style="width:32%">Type</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>document_id</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>str</code></td><td>ID of this run</td></tr>
<tr><td><code>rules_file</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>RulesFile | None</code></td><td>Audit rules; when None, the classifier decides</td></tr>
<tr><td><code>file_path</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>str | None</code></td><td>Local file path</td></tr>
<tr><td><code>page_range</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>tuple[int,...] | None</code></td><td>Page range of the sub-document</td></tr>
<tr><td><code>services</code></td><td><span class="field-cat-装配">Wiring</span></td><td><code>AuditServices</code></td><td>Container for all services and clients</td></tr>
<tr><td><code>config</code></td><td><span class="field-cat-装配">Wiring</span></td><td><code>AuditConfig</code></td><td>Runtime knobs</td></tr>
<tr><td><code>normalized_doc</code></td><td><span class="field-cat-产物">Output</span></td><td><code>NormalizedDocument | None</code></td><td>OCR + classification + segmentation</td></tr>
<tr><td><code>extraction</code></td><td><span class="field-cat-产物">Output</span></td><td><code>ExtractionBundle | None</code></td><td>Extracted fields / multi-entity / derived values</td></tr>
<tr><td><code>phase</code></td><td><span class="field-cat-产物">Output</span></td><td><code>str | None</code></td><td><code>draft</code> or <code>executed</code></td></tr>
<tr><td><code>evaluation</code></td><td><span class="field-cat-产物">Output</span></td><td><code>EvaluationResult | None</code></td><td>Audit verdict for each rule</td></tr>
<tr><td><code>fallback_tasks</code></td><td><span class="field-cat-产物">Output</span></td><td><code>tuple[RescueTask,...]</code></td><td>Rescue tasks for failed rules</td></tr>
<tr><td><code>extraction_errors</code></td><td><span class="field-cat-诊断">Diagnostics</span></td><td><code>tuple[str,...]</code></td><td>Extraction error log</td></tr>
<tr><td><code>timing</code></td><td><span class="field-cat-诊断">Diagnostics</span></td><td><code>Mapping[str,float]</code></td><td>Elapsed time per stage</td></tr>
</tbody>
</table>

<div class="note"><span class="tag">Immutability:</span><code>frozen=True</code> prevents fields from being rebound; <code>__post_init__</code> coerces list/dict values into <code>tuple</code> / <code>MappingProxyType</code>, which blocks write-through mutation such as <code>ctx.timing["x"] = 1</code>.</div>

<h2>Usage</h2>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">1. Construction</h3>

<pre><code>from leaudit.services.audit_ctx import AuditCtx
from leaudit.services.audit_services import AuditServices
from leaudit.config.audit_config import AuditConfig

services = AuditServices(
    llm_client=llm, vlm_client=vlm, ocr_client=ocr,
    normalization=norm_svc, extraction=ext_svc, evaluation=eval_svc,
)

ctx = AuditCtx(
    document_id="my_doc",
    rules_file=rules,
    services=services,
    file_path="/path/to/doc.pdf",
    config=AuditConfig(group_size=8),
)</code></pre>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">2. Evolution (once per stage)</h3>

<pre><code>ctx = ctx.with_normalized_doc(ocr_result)   <span class="cmt"># Stage 1</span>
ctx = ctx.with_extraction(extraction)       <span class="cmt"># Stage 3</span>
ctx = ctx.with_phase("executed")            <span class="cmt"># Stage 4</span>
ctx = ctx.with_evaluation(evaluation)       <span class="cmt"># Stage 5</span>
ctx = ctx.with_fallback_task(task)          <span class="cmt"># Stage 6 (once per failed rule)</span>
ctx = ctx.with_timing(ocr=1.2, total=8.5)   <span class="cmt"># accumulate timings</span></code></pre>

<p>Every <code>with_xxx()</code> is a thin wrapper around <code>dataclasses.replace()</code>: it <strong>returns a new object</strong> and leaves the old ctx unchanged. Always capture the return value with <code>ctx = ctx.with_...()</code>.</p>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">3. Reading</h3>

<pre><code>ocr_result = ctx.normalized_doc
fields     = ctx.extraction.fields
phase      = ctx.phase
score      = ctx.evaluation.total_score
ocr_secs   = ctx.timing.get("ocr", 0.0)
text       = ctx.source_text   <span class="cmt"># derived property = ctx.extraction.source_text</span></code></pre>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">4. Two Bridge APIs</h3>

<table>
<thead><tr><th style="width:30%">API</th><th>Purpose</th></tr></thead>
<tbody>
<tr>
<td><code>ctx.effective_config</code></td>
<td>Merges <code>ctx.config</code> with <code>Settings</code>. <strong>Business code reads configuration only through this property</strong>; calling <code>get_settings()</code> / <code>os.getenv</code> directly is forbidden (the architecture guard lives in <span class="fileref">tests/architecture/</span>)</td>
</tr>
<tr>
<td><code>ctx.as_rescue_inputs()</code></td>
<td>Converts the ctx into the engine's legacy <code>RescueInputs</code> shape, bridging to <code>engine._try_ai_rescue</code> without changing the engine</td>
</tr>
</tbody>
</table>

<h2>Typical Call: AuditService.audit(ctx)</h2>

<p>All stages are already orchestrated inside <code>AuditService.audit()</code> (<span class="fileref">src/leaudit/services/audit_service.py:152</span>). Typical usage is to construct a ctx and call it:</p>

<pre><code>ctx = AuditCtx(document_id="...", rules_file=rules, services=services,
               file_path="...", config=cfg)
ctx = await audit_service.audit(ctx)
<span class="cmt"># after the run, ctx is populated with normalized_doc / extraction / phase / evaluation / fallback_tasks / timing</span></code></pre>

<p>The seven stages run in this order: <code>Normalize</code> → <code>Resolve rules</code> → <code>Extract</code> → <code>Detect phase</code> → <code>Evaluate</code> → <code>Rescue</code> → <code>Finalize</code>.</p>

<h2>Three Rules</h2>

<ul>
<li>Adding an output field → add the field in <code>audit_ctx.py</code> plus one <code>with_xxx</code> helper</li>
<li>Reading runtime configuration → use <code>ctx.effective_config</code>; <strong>do not</strong> call <code>get_settings()</code> directly</li>
<li>Changing stage behavior → <strong>must</strong> return a new ctx; <strong>do not</strong> mutate fields in place</li>
</ul>

</div>

</body>
</html>
@@ -0,0 +1,100 @@
# LeAudit Design Document Index

This directory contains the core architecture design documents for the `leaudit-platform` project.

## Document Index

| Document | Contents | Status |
|------|------|------|
| [document_schema_design.md](document_schema_design.md) | Document-domain table schema design: full description of the 17 `leaudit_*` tables | ✅ Implemented |
| [dsl_rule_schema_design.md](dsl_rule_schema_design.md) | DSL rule-domain table schema: rule set / version / binding management | ✅ Core tables implemented |
| [processing_logic.md](processing_logic.md) | Description of the 7-stage LeAudit processing pipeline | 📖 Reference |
| [bridge_directory_design.md](bridge_directory_design.md) | Directory layout and responsibilities of the Bridge layer | ✅ Implemented |
| [infrastructure_redesign.md](infrastructure_redesign.md) | Infrastructure redesign: OSS / queue / cache / region isolation | 📋 Design blueprint |

## Overview

```
leaudit-platform core flow:

User uploads a document
  → leaudit_bridge pipeline (OCR→Extract→Evaluate→Rescue)
  → StorageAdapter writes to the leaudit_* tables
  → Controller→Service reads the results and returns them to the frontend

Data storage:
  PostgreSQL: 17 leaudit_* tables (metadata + result index)
  MinIO OSS:  bdocs/ + artifacts/ (source of truth for files)
  Redis:      queue + cache + concurrency control
```

---

## Migration Rules

When migrating code from the old `docauditai` project to the new `leaudit-platform` project, follow the rules below strictly.

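### 1. Routing

| Rule | Description |
|------|------|
| **Route paths stay unchanged** | The frontend is not modified at all; `POST /auth/login`, `POST /upload`, and the rest keep their original paths |
| **Code follows the new conventions** | Do not write logic directly inside `@router.post()` handlers as the old project did; use the Controller → Service (interface + implementation) → Model layering |
| **Unified response format** | Old `{success, data}` → new `Result[T]` format `{code, message, data}` |
| **Drop V1/V2** | The multi-version routes under the old project's `app/routes/v2/` are all retired; keep only the routes that are actually used |
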
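The unified response envelope from the routing rules can be pictured as a small generic type. This is a minimal sketch and not the project's actual `Result[T]`: the helper names `ok`, `fail`, `to_dict`, and `from_legacy`, and the convention that `code == 0` means success, are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Result(Generic[T]):
    """Envelope with the three keys named in the rule: code, message, data."""

    code: int
    message: str
    data: Optional[T] = None

    @classmethod
    def ok(cls, data: T, message: str = "ok") -> "Result[T]":
        # Assumption: 0 is used as the success code.
        return cls(code=0, message=message, data=data)

    @classmethod
    def fail(cls, code: int, message: str) -> "Result[T]":
        return cls(code=code, message=message, data=None)

    def to_dict(self) -> dict:
        return asdict(self)


def from_legacy(payload: dict) -> Result:
    """Map the old {success, data} shape onto the new envelope."""
    if payload.get("success"):
        return Result.ok(payload.get("data"))
    # Assumption: 1 stands in for a generic failure code.
    return Result.fail(1, "request failed")
```

A controller would then return `Result.ok(...).to_dict()`, so every route shares one shape while its path stays unchanged.
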
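### 2. Service Layer

| Rule | Description |
|------|------|
| **Business logic stays untouched** | Core decision logic, priorities, and algorithms all remain identical to the old project |
| **Change only the data access method** | `asyncpg.connect()` → SQLAlchemy `GetAsyncSession()` + `text()` |
| **Separate interface and implementation** | Each module defines an abstract `IXxxService` interface plus an `XxxServiceImpl` implementation |
| **No opportunistic optimization** | Do not make "while I'm here" improvements to verified business logic, to avoid introducing new bugs |
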
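The interface/implementation split can be sketched with the standard `abc` module. The names `IDocumentService` and `DocumentServiceImpl` and the in-memory dictionary are hypothetical; a real implementation would receive a database session instead.

```python
from abc import ABC, abstractmethod


class IDocumentService(ABC):
    """Abstract interface; controllers depend on this type only."""

    @abstractmethod
    def get_status(self, document_id: int) -> str:
        ...


class DocumentServiceImpl(IDocumentService):
    """Concrete implementation; only the data access is swapped out."""

    def __init__(self, status_by_id: dict[int, str]) -> None:
        # Stand-in for a real session or repository (hypothetical).
        self._status_by_id = status_by_id

    def get_status(self, document_id: int) -> str:
        return self._status_by_id.get(document_id, "waiting")
```

Because the controller only sees `IDocumentService`, the data access behind it can change from `asyncpg` to SQLAlchemy without touching callers.
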
### 3. Database

| Rule | Description |
|------|------|
| **All tables use the `leaudit_*` prefix** | Fully isolated from the tables in the old `docauditai` database; never mixed |
| **Separate new database** | `leaudit_platform@nas.7bm.co:54302`; the old database schema is not modified |
| **Every table must have** | The three timestamps `create_time` + `update_time` + `delete_time` |
| **Every column must have a Chinese comment** | `COMMENT ON COLUMN xxx IS '...'` |
| **Soft delete** | Business tables do not allow physical deletion; filter uniformly with `delete_time IS NULL` |

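The soft-delete rule can be illustrated with a self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere; the table `leaudit_demo` and the helper names are hypothetical, and SQLite has no `COMMENT ON COLUMN`, so column comments are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE leaudit_demo (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        create_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        update_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        delete_time TEXT
    )
    """
)
conn.executemany("INSERT INTO leaudit_demo (name) VALUES (?)", [("a",), ("b",)])


def soft_delete(row_id: int) -> None:
    """Mark a row as deleted instead of issuing DELETE."""
    conn.execute(
        "UPDATE leaudit_demo "
        "SET delete_time = CURRENT_TIMESTAMP, update_time = CURRENT_TIMESTAMP "
        "WHERE id = ? AND delete_time IS NULL",
        (row_id,),
    )


def list_active() -> list[str]:
    """Business queries always filter on delete_time IS NULL."""
    rows = conn.execute(
        "SELECT name FROM leaudit_demo WHERE delete_time IS NULL ORDER BY id"
    )
    return [name for (name,) in rows]


soft_delete(1)
```

After `soft_delete(1)` the row is still physically present, but `list_active()` no longer returns it.
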
### 4. Configuration

| Rule | Description |
|------|------|
| **TOML → os.environ → Pydantic Settings → module export** | Config loading chain: `app.toml` → environment variables → `_settings.py` → `from fastapi_admin.config import XXX` |
| **No hard-coding** | Never hard-code host/port/key values in code; always go through the config module |
| **app.toml stays out of git** | It contains database passwords and API keys; `.gitignore` already excludes it |

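The loading chain can be sketched without third-party packages. Everything here is an assumption for illustration: a plain dataclass stands in for the Pydantic Settings class, the `SECTION_KEY` naming of environment variables is invented, and so is the choice that a variable already present in the environment wins over `app.toml`. The environment is passed in as a dictionary so the sketch does not touch the real `os.environ`.

```python
from dataclasses import dataclass
from typing import Mapping, MutableMapping


def export_to_environ(
    toml_data: Mapping[str, Mapping[str, object]],
    environ: MutableMapping[str, str],
) -> None:
    """Flatten `[section] key = value` into SECTION_KEY variables."""
    for section, values in toml_data.items():
        for key, value in values.items():
            # Assumption: a variable that is already set is not overwritten.
            environ.setdefault(f"{section}_{key}".upper(), str(value))


@dataclass(frozen=True)
class Settings:
    """Plain-dataclass stand-in for the Pydantic Settings class."""

    database_host: str
    database_port: int

    @classmethod
    def from_environ(cls, environ: Mapping[str, str]) -> "Settings":
        return cls(
            database_host=environ["DATABASE_HOST"],
            database_port=int(environ["DATABASE_PORT"]),
        )


# Parsed content of a hypothetical app.toml.
parsed_toml = {"database": {"host": "db.internal", "port": 5432}}
env: dict[str, str] = {"DATABASE_PORT": "6543"}  # pre-existing override
export_to_environ(parsed_toml, env)
settings = Settings.from_environ(env)
```

Business code would then import the exported values from the config module rather than reading the file or the environment itself.
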
### 5. Bridge Imports

| Old import | New import |
|-------|-------|
| `from core.config import XXX` | `from fastapi_admin.config import XXX` |
| `from core.postgrest.client import ...` | `from fastapi_common...database import GetAsyncSession` + `text()` |
| `from core.logger import log` | `from fastapi_common.fastapi_common_logger import logger` |
| `from core.celery_app_limited import celery_app` | To be integrated in phase P2 |
| `from core.utils.instance_context import ...` | Deprecated (explicit parameters replace switching through os.environ) |

### 6. File Paths

| Old path | New path |
|-------|-------|
| `app/routes/v2/` | `fastapi_modules/fastapi_leaudit/controllers/` |
| `app/rbac/` | `fastapi_modules/fastapi_leaudit/services/impl/` |
| `services/leaudit_bridge/` | `fastapi_modules/fastapi_leaudit/leaudit_bridge/` |
| `core/config.py` | `fastapi_admin/config/` |
| `core/storage/` | `fastapi_common/` (generalized) |

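### 7. Things That Are Not Allowed

- ❌ Modifying code directly on top of the old `docauditai`
- ❌ Referencing any `docauditai` module from the new project
- ❌ Importing the `leaudit` kernel directly (`import leaudit`) in the new project; access must go through the bridge
- ❌ Skipping the Controller→Service→Model layering and writing route logic directly
- ❌ Creating database tables without Chinese comments
- ❌ Hard-coding configuration values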
@@ -0,0 +1,128 @@
# LeAudit Bridge Directory Design

## 1. Goals

`fastapi_modules/fastapi_leaudit/leaudit_bridge/` is the only official bridge layer between `leaudit-platform` and the `leaudit` kernel.

Design goals:

- The business layer never calls `leaudit` kernel modules directly
- All document audits enter through the bridge
- The bridge handles input mapping, context construction, result adaptation, and result persistence
- When `leaudit` is later replaced or upgraded, the impact on the business layer is kept to a minimum

## 2. Directory Structure

```text
fastapi_modules/fastapi_leaudit/leaudit_bridge/
├── __init__.py                  # Top-level entry: create_pipeline / is_leaudit_mode
├── pipeline.py                  # Pipeline entry point: OCR → Extract → Evaluate → Persist
├── ctx_builder.py               # Builds the leaudit execution context
├── rules_loader.py              # Rules file loading and caching
├── client_factory.py            # OCR/LLM/VLM client factory
├── result_adapter.py            # leaudit results → unified format
├── storage_adapter.py           # Writes results to the leaudit_* tables (SQLAlchemy)
├── tasks.py                     # Celery async task entry
├── ocr_bridge.py                # OCR/VLM bridge post-processing
└── case_number_extractor.py     # Case number extraction
```

> The path changed from the old project's `services/leaudit_bridge/` to `fastapi_modules/fastapi_leaudit/leaudit_bridge/`.

## 3. File Responsibilities

### `__init__.py`

- `is_leaudit_mode()`: always returns True on the new platform
- `create_pipeline(rules_path)`: creates a complete LauditPipeline instance
  - Wraps the OCR client with `DocNormalizationAdapter`
  - Builds a `RulesFileRegistry` for content classification

### `pipeline.py`

Core entry point `LauditPipeline.run()`:

```
document_id + file_path + rules_file
  → OCR (including classification / segmentation / seal enhancement)
  → Case number extraction
  → Extraction (dispatch_extract)
  → Coordinate resolution (resolve_bundle_positions)
  → Phase detection (draft/executed)
  → Evaluation (evaluate_extraction)
  → StorageAdapter persistence
```

### `ctx_builder.py`

Maps a leaudit-platform document object onto an executable leaudit context.

### `rules_loader.py`

- Local YAML loading (`leaudit.dsl.loader.load_rules_file`)
- Future extension: OSS download + caching

### `client_factory.py`

Creates, in one place, the dependency objects that leaudit needs at run time:

- `create_ocr_client()` → `ChandraOCRClient`
- `create_llm_client()` → `OpenAICompatibleClient`
- `create_vlm_client()` → `QwenVLMClient`

Configuration source: `fastapi_admin.config` (loaded from `app.toml`)

### `result_adapter.py`

Converts leaudit's raw results into a unified structure for consumers, shielding them from changes in kernel object details.

### `storage_adapter.py`

Writes the adapted results to the `leaudit_*` tables:

- `update_document_status()` → `leaudit_documents.processing_status`
- `save_ocr_result()` → `leaudit_artifacts`
- `save_extraction_result()` → `leaudit_field_results`
- `save_evaluation_results()` → `leaudit_rule_results` + `leaudit_audit_runs`

Queries use SQLAlchemy `GetAsyncSession()` + `text()`.

### `tasks.py`

Celery async task entry. Celery integration is completed in phase P2.

- `leaudit_process_document()`: main processing function
- `dispatch_leaudit_task()`: task dispatch (switches to `.apply_async()` in P2)
- `_resolve_rules_path()`: rules path resolution (config → DB → type_id mapping)

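The resolution order noted for `_resolve_rules_path()` can be read as a fallback chain. The sketch below is one plausible reading and not the actual function: the signature, the `db_lookup` callback, and the sample mapping value are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical values; the real mapping lives in leaudit_bridge/tasks.py.
TYPE_ID_RULES_MAP: dict[int, str] = {1: "rules/contract/rules.yaml"}


def resolve_rules_path(
    type_id: int,
    configured_path: Optional[str],
    db_lookup: Callable[[int], Optional[str]],
) -> Optional[str]:
    """Return the first non-empty source: config, then DB, then the static map."""
    if configured_path:
        return configured_path
    from_db = db_lookup(type_id)
    if from_db:
        return from_db
    return TYPE_ID_RULES_MAP.get(type_id)
```

Passing the database lookup in as a callable keeps the precedence logic testable without a live connection.
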
## 4. Call Boundary Constraints

- The Controller layer never calls `leaudit` kernel modules directly
- The Service layer never calls `leaudit` kernel modules directly
- Only `leaudit_bridge/` may be aware of `leaudit` kernel types and objects
- All external calls go through `pipeline.py` or `tasks.py`

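A boundary like this can be checked mechanically. The following is a minimal sketch of such a guard built on the standard `ast` module; the function names are hypothetical and the project's own architecture tests may work differently.

```python
import ast


def find_kernel_imports(source: str) -> list[str]:
    """Return the `leaudit` kernel modules imported by `source`."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        # Match `leaudit` and `leaudit.*`, but not `leaudit_bridge`.
        found.extend(n for n in names if n == "leaudit" or n.startswith("leaudit."))
    return found


def violates_boundary(module_path: str, source: str) -> bool:
    """Only files under leaudit_bridge/ may import the kernel."""
    inside_bridge = "/leaudit_bridge/" in module_path
    return (not inside_bridge) and bool(find_kernel_imports(source))
```

Run over every controller and service file, a check of this kind would flag a direct kernel import outside the bridge.
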
## 5. Data Flow

```text
Controller / Service
  → leaudit_bridge.tasks.dispatch_leaudit_task
  → leaudit_bridge.pipeline.LauditPipeline.run
  → ctx_builder / rules_loader / client_factory
  → leaudit engine execution
  → result_adapter
  → storage_adapter
  → leaudit_* result tables
```

## 6. Import Path Migration

During migration from the old project, every `from core.*` import has been updated as follows:

| Old import | New import |
|-------|-------|
| `from core.config import ...` | `from fastapi_admin.config import ...` |
| `from core.postgrest.client import ...` | SQLAlchemy `GetAsyncSession()` + `text()` |
| `from core.logger import log` | `from fastapi_common.fastapi_common_logger import logger` |
| `from core.celery_app_limited import celery_app` | To be integrated in phase P2 |
| `from core.utils.instance_context import ...` | Removed (explicit parameters replace switching through environment variables) |
@@ -0,0 +1,220 @@
# LeAudit Document-Domain Table Schema Design

> **Status**: Implemented in the `leaudit_platform` database (17 tables, all with Chinese comments)
> **Design baseline**: source analysis of `/home/wren-dev/Porject/leaudit/src/leaudit`

## 1. Design Basis

This design is based on a complete analysis of the `leaudit` source code and aligns with its actual processing pipeline, data model, and output types.

### 1.1 LeAudit Processing Pipeline (7 Stages)

```text
File input
  → Stage 1: Normalize (parse → OCR → classify → case-file segmentation → seal enhancement → markdown rendering)
  → Stage 2: Rules Resolve (load RulesFile)
  → Stage 3: Extract (field extraction → hydrate → multi-entity → derived → grounding)
  → Stage 4: Detect Phase (draft / executed decision)
  → Stage 5: Evaluate (rule engine → 16 check types → pass/fail/skip)
  → Stage 6: Rescue (remediation of failed rules: review → agent → pass/fail flip)
  → Stage 7: Finalize → Persist
```

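The seven stages above run strictly in sequence, and their per-stage durations are what `leaudit_run_metrics` later records. A minimal sketch of such a sequential runner follows; the stage functions are placeholders that only record their own name, and none of this is the LeAudit implementation.

```python
import time
from typing import Callable

Stage = Callable[[dict], dict]


def run_stages(
    stages: list[tuple[str, Stage]], state: dict
) -> tuple[dict, dict[str, float]]:
    """Run stages in order; each takes the state and returns a new one."""
    timing: dict[str, float] = {}
    for name, stage in stages:
        started = time.perf_counter()
        state = stage(state)
        timing[name] = time.perf_counter() - started
    return state, timing


def make_stage(name: str) -> Stage:
    """Placeholder stage that appends its name to a `done` tuple."""
    return lambda state: {**state, "done": state.get("done", ()) + (name,)}


STAGE_NAMES = ["normalize", "resolve_rules", "extract", "detect_phase",
               "evaluate", "rescue", "finalize"]
final_state, timing = run_stages([(n, make_stage(n)) for n in STAGE_NAMES], {})
```

Each stage returns a new state rather than mutating its input, which keeps a failed stage from leaving half-written results behind.
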
### 1.2 LeAudit's Own Storage Model (for reference only; not the source of truth for leaudit-platform)

LeAudit has its own set of SQLAlchemy ORM tables (`storage/models/`). **leaudit-platform does not use this LeAudit storage.** The bridge layer (`storage_adapter.py`) bypasses it and writes results directly into leaudit-platform's `leaudit_*` tables.

---

## 2. Design Principles

### 2.1 Fully Independent of Business Tables

All LeAudit-domain tables share the `leaudit_*` prefix and are fully isolated from the old `docauditai` tables such as `documents` / `evaluation_points`.

### 2.2 The Database Stores the Index, OSS Stores the Content

- OCR JSON, normalized JSON, markdown, page images, and cropped images → OSS files
- The database stores only metadata, status, and OSS addresses

### 2.3 One Run per Execution

Each time a document enters the LeAudit processing chain, one `leaudit_audit_runs` record is created. This supports re-runs, version replay, and model comparison.

### 2.4 Results Decoupled from Artifacts

- Result tables (rule_results / field_results) → "what this run produced"
- Artifact table (artifacts) → "which files were generated while this run executed"

---

## 3. Complete Table List

| # | Table | Purpose | Status |
|---|------|------|------|
| 1 | `leaudit_documents` | LeAudit-domain document mirror, linked to the business document | ✅ Created |
| 2 | `leaudit_document_files` | Document file version management | ✅ Created |
| 3 | `leaudit_audit_runs` | Main index record for each processing run | ✅ Created |
| 4 | `leaudit_artifacts` | Index of file artifacts such as OCR / normalize / manifest / markdown / images | ✅ Created |
| 5 | `leaudit_rule_results` | Rule-level audit results (one row per rule) | ✅ Created |
| 6 | `leaudit_field_results` | Field-level extraction results | ✅ Created |
| 7 | `leaudit_run_metrics` | Per-stage timing and count statistics | ✅ Created |
| 8 | `leaudit_run_errors` | Per-stage errors and diagnostics | ✅ Created |
| 9 | `leaudit_rescue_outcomes` | Rescue remediation results | ✅ Created |
| 10 | `leaudit_rule_sets` | Rule set master table | ✅ Created |
| 11 | `leaudit_rule_versions` | Rule version table | ✅ Created |
| 12 | `leaudit_rule_type_bindings` | Binding between document types and rule sets | ✅ Created |
| 13 | `jwt_tokens` | JWT token management | ✅ Created |
| 14 | `leaudit_document_types` | Document type definitions | ✅ Created |
| 15 | `leaudit_entry_modules` | Entry modules / navigation menu | ✅ Created |
| 16 | `leaudit_evaluation_point_groups` | Evaluation point rule groups (PID tree) | ✅ Created |
| 17 | `leaudit_evaluation_points` | Rule point / evaluation point metadata | ✅ Created |

> All tables carry the three timestamps `create_time` / `update_time` / `delete_time` plus Chinese column comments.
> For the detailed DDL, see `scripts/schema_v2_add_evaluation_tables.sql`.

---

## 4. Core Table Structures (Document Execution Domain)

### 4.1 `leaudit_documents`

LeAudit-domain document mirror table. It links to the legacy system's `documents.id` through `biz_document_id`.

| Field | Type | Description |
| --- | --- | --- |
| `id` | bigint PK | Primary key, auto-increment |
| `biz_document_id` | bigint UNIQUE | Links to the legacy business `documents.id` |
| `type_id` | bigint | Document type ID → `leaudit_document_types.id` |
| `processing_status` | varchar(64) | waiting / running / completed / failed |
| `current_run_id` | bigint | Latest valid `leaudit_audit_runs.id` |
| `create_time` / `update_time` / `delete_time` | timestamptz | Standard timestamps |

### 4.2 `leaudit_document_files`

Document file table. One document can have several physical file versions.

| Field | Type | Description |
| --- | --- | --- |
| `id` | bigint PK | Primary key |
| `document_id` | bigint FK | Links to `leaudit_documents.id` |
| `file_role` | varchar(64) | primary / attachment / scan / ocr_result |
| `file_name` / `file_ext` / `mime_type` / `file_size` | — | Basic file information |
| `sha256` | varchar(64) | SHA256 of the file |
| `oss_url` | varchar(2048) | OSS address (source of truth) |
| `local_path` | varchar(1024) | Temporary local path |
| `is_active` | boolean | Whether this is the currently active file |
| `created_by` | bigint | Uploader |

### 4.3 `leaudit_audit_runs`

Core tracking table. One record is created each time a document enters the main LeAudit chain.

| Field group | Fields | Description |
|---------|------|------|
| **Identity** | `id` / `document_id` / `document_file_id` / `run_no` | Primary key + links + sequence number |
| **Trigger** | `trigger_source` / `trigger_user_id` / `task_id` | Trigger source / user / Celery ID |
| **Status** | `status` / `phase` | pending→running→completed/failed |
| **Rule provenance** | `rule_set_id` / `rule_version_id` / `rule_type_id` | Rule version used |
| | `rule_source_oss_url` / `rule_source_sha256` / `rule_local_cache_path` | Location of the rules file |
| **Model snapshot** | `engine_version` / `llm_provider` / `llm_model` / `vlm_provider` / `vlm_model` / `ocr_provider` / `ocr_model` | Makes sources of difference traceable |
| **Rescue** | `rescue_mode` / `rescue_applied` | Rescue configuration and execution flag |
| **Result summary** | `total_score` / `passed_count` / `failed_count` / `skipped_count` / `result_status` | Aggregate statistics |
| **Time** | `started_at` / `finished_at` / `create_time` / `update_time` | Time tracking |

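The `status` lifecycle listed above (pending → running → completed/failed) can be enforced with a small transition table. This is a sketch only: treating `completed` and `failed` as terminal is an assumption, chosen because a re-run creates a new `leaudit_audit_runs` row rather than reviving an old one.

```python
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    # Assumption: finished runs never change status again.
    "completed": set(),
    "failed": set(),
}


def advance(current: str, target: str) -> str:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal run status transition: {current} -> {target}")
    return target
```

A storage adapter could call `advance()` before each status update so that an out-of-order write fails loudly instead of corrupting the run record.
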
### 4.4 `leaudit_artifacts`

Unified artifact table. It records every file-type intermediate artifact produced during a run.

**artifact_type enum (20 values)**: `original_doc` / `normalized_doc` / `render_png` / `render_pdf` / `ocr_json` / `extract_json` / `evaluate_json` / `rescue_json` / `quality_report` / `diff_report` / `cross_review` / `merged_result` / `final_report` / `vlm_render` / `vlm_vis_page` / `vlm_vis_subdoc` / `vlm_vis_field` / `vlm_debug` / `rescue_debug` / `pipeline_log`

### 4.5 `leaudit_rule_results`

Rule-level result table. It aligns with the actual fields of LeAudit's `EvaluationResult.rules`.

| Field group | Key fields | Description |
|---------|---------|------|
| **Ownership** | `run_id` / `rule_version_id` / `document_id` | Three-way locator |
| **Rule identity** | `rule_id` / `rule_name` / `risk` / `score` | Basic rule information |
| **Result** | `passed` / `status` / `skip_reason` / `confidence` | Verdict |
| **Messages** | `pass_message` / `fail_message` | Pass / fail text |
| **Details** | `stages` (jsonb) / `extracted_fields` (jsonb) / `field_positions` (jsonb) / `rule_meta` (jsonb) | Structured details |
| **Remediation** | `remediation` (jsonb) / `rescue_applied` / `rescue_passed` | Rescue information |
| **Fallback** | `result_payload` (jsonb) | Full RuleResult JSON backup |

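The summary columns on `leaudit_audit_runs` can be derived from these rule-level rows. The sketch below shows one way to do that; the `RuleRow` type is hypothetical, and defining `total_score` as the sum of the scores of passed rules is an assumption, since the scoring formula is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleRow:
    rule_id: str
    status: str  # "pass" / "fail" / "skip"
    score: float


def summarize(rows: list[RuleRow]) -> dict:
    """Aggregate rule-level rows into run-level counters."""
    passed = [r for r in rows if r.status == "pass"]
    return {
        "passed_count": len(passed),
        "failed_count": sum(r.status == "fail" for r in rows),
        "skipped_count": sum(r.status == "skip" for r in rows),
        # Assumption: total_score is the sum of scores of passed rules.
        "total_score": sum(r.score for r in passed),
    }
```

Keeping the aggregation in one function makes it easy to recompute the summary when a rescue flips a rule from fail to pass.
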
### 4.6-4.9 Auxiliary Tables

- **`leaudit_field_results`**: field-level extraction results (value_text / confidence / grounding_score / hard_failed, and so on)
- **`leaudit_run_metrics`**: per-stage timings (ocr/normalize/extract/evaluate/rescue_seconds) + count statistics
- **`leaudit_run_errors`**: per-stage errors (stage / level / error_code / message / detail_json)
- **`leaudit_rescue_outcomes`**: Rescue remediation results (diagnosis / final_status / llm_calls / vlm_calls)

---

## 5. Rule Management Domain Tables

See [DSL rule-domain table schema design](dsl_rule_schema_design.md) for details.

| Table | Purpose |
|----|------|
| `leaudit_rule_sets` | Rule set master table (rule_type / rule_name / current_version_id) |
| `leaudit_rule_versions` | Rule versions (oss_url / file_sha256 / metadata_* snapshot) |
| `leaudit_rule_type_bindings` | Document type ↔ rule set binding (binding_mode / priority) |

---

## 6. Evaluation Point Management Domain Tables

| Table | Purpose |
|----|------|
| `leaudit_evaluation_point_groups` | Evaluation point rule groups (PID tree hierarchy) |
| `leaudit_evaluation_points` | Rule points / evaluation points (code / score / scoring_config / references_laws) |
| `leaudit_document_types` | Document type definitions (code / classification_keywords / extraction_mode) |
| `leaudit_entry_modules` | Entry modules (name / path / areas / icon_path) |

---

## 7. Table Relationship Diagram

```text
leaudit_entry_modules
  └── leaudit_document_types
        ├── leaudit_documents ── biz_document_id → legacy system documents
        │     ├── leaudit_document_files
        │     └── leaudit_audit_runs
        │           ├── leaudit_artifacts (N)
        │           ├── leaudit_run_metrics (1)
        │           ├── leaudit_run_errors (N)
        │           ├── leaudit_rule_results (N)
        │           ├── leaudit_field_results (N)
        │           └── leaudit_rescue_outcomes (N)
        └── leaudit_rule_type_bindings
              └── leaudit_rule_sets
                    ├── leaudit_rule_versions
                    └── leaudit_evaluation_point_groups
                          └── leaudit_evaluation_points
```

---

## 8. OSS Path Conventions

See section 1 of [Infrastructure redesign](infrastructure_redesign.md) for details.

```text
# Business documents
bdocs/{region}/{type_code}/{doc_id}/{version}/{file_role}.{ext}

# Audit artifacts
artifacts/{region}/{run_id}/{artifact_type}/{detail}.{ext}
```

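The two key templates above translate directly into small builder functions. The function names and the sample values in the usage note are hypothetical; only the path layout comes from the convention itself.

```python
def document_key(region: str, type_code: str, doc_id: int,
                 version: str, file_role: str, ext: str) -> str:
    """bdocs/{region}/{type_code}/{doc_id}/{version}/{file_role}.{ext}"""
    return (
        f"bdocs/{region}/{type_code}/{doc_id}/{version}/"
        f"{file_role}.{ext.lstrip('.')}"
    )


def artifact_key(region: str, run_id: int, artifact_type: str,
                 detail: str, ext: str) -> str:
    """artifacts/{region}/{run_id}/{artifact_type}/{detail}.{ext}"""
    return (
        f"artifacts/{region}/{run_id}/{artifact_type}/"
        f"{detail}.{ext.lstrip('.')}"
    )
```

For example, `artifact_key("cn-east", 7, "ocr_json", "page_001", "json")` yields a key under `artifacts/cn-east/7/ocr_json/`. Building keys in one place keeps the stored `oss_url` values consistent with the layout.
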
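---

## 9. Conclusions

- All tables use the `leaudit_*` prefix and are fully isolated from the legacy system
- `leaudit_audit_runs` is the single tracking unit for each processing run
- `leaudit_artifacts` manages all file artifacts in one place; the database stores only the index
- `leaudit_rule_results` is granular down to the individual rule and is aligned with LeAudit's `RuleResult`
- All tables have been created in the database, with Chinese comments and the three timestamps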
@@ -0,0 +1,241 @@
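# LeAudit DSL Rule-Domain Table Schema Design

> **Status**: Core tables implemented in the `leaudit_platform` database
> **Related**: [Document-domain table design](document_schema_design.md) | [Bridge directory design](bridge_directory_design.md)

## 1. Design Goals

- At run time, `LeAudit` still executes from YAML rules files
- The persistent source of truth for rules files no longer depends on a fixed local directory
- The database stores rule metadata, versions, status, and OSS addresses
- The YAML body is stored in OSS as a file artifact
- At run time, the currently active version recorded in the database is downloaded to a local temporary file and then loaded with `LeAudit`'s original logic
- Rule editing, validation, publishing, rollback, and auditing are supported
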
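## 2. Core Principles

### 2.1 The Source of Truth for Rules Is "OSS File + Database Index"

- Executable rule content: YAML file
- Persistent location: OSS
- Database: records metadata, versions, OSS addresses, and activation status

### 2.2 The Database Must Not Become a Rule Execution Interpreter

The database does not replace the YAML/DSL itself, and rules are not broken up into a "row-based rule engine configuration".

Reasons:

- It would diverge from `LeAudit`'s original logic
- It would raise the cost of editing and publishing rules
- It would hurt the readability and compatibility of the DSL

### 2.3 Rule Editing and Rule Execution Are Separate

- Editing phase: edit the YAML text, validate it, upload it to OSS, write a version record
- Execution phase: look up DB metadata, download the OSS file, load it from a local temporary file, and execute
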
## 3. Implemented Core Tables

| # | Table | Purpose | Status |
|---|------|------|------|
| 1 | `leaudit_rule_sets` | Rule set master table | ✅ |
| 2 | `leaudit_rule_versions` | Rule version table | ✅ |
| 3 | `leaudit_rule_type_bindings` | Binding between document types and rule sets | ✅ |
| 4 | `leaudit_evaluation_point_groups` | Evaluation point rule groups (PID tree) | ✅ |
| 5 | `leaudit_evaluation_points` | Rule point / evaluation point metadata | ✅ |

To be added (later phases):

| # | Table | Purpose |
|---|------|------|
| 6 | `leaudit_rule_file_artifacts` | Rule file artifacts (validation reports, export packages, and so on) |
| 7 | `leaudit_rule_publish_logs` | Audit log of rule publishes and rollbacks |
| 8 | `leaudit_rule_validation_logs` | Rule validation log |

## 4. Core Table Structures

### 4.1 `leaudit_rule_sets`

Rule set master table. Each row describes one stable business rule set entity.

| Field | Type | Description |
| --- | --- | --- |
| `id` | bigint PK | Primary key |
| `rule_type` | varchar(128) | Business rule type code; corresponds to the DSL `metadata.type_id` |
| `rule_name` | varchar(512) | Rule set name |
| `domain_type` | varchar(64) | contract / admin_license / legal_doc |
| `description` | text | Rule set description |
| `entry_module` | varchar(64) | Identifier of the corresponding business entry module |
| `current_version_id` | bigint | Currently active version → `leaudit_rule_versions.id` |
| `status` | varchar(32) | draft / active / deprecated / archived |
| `is_builtin` | boolean | Whether this is a built-in rule set (built-in sets cannot be deleted) |
| `owner_user_id` | bigint | Owner |
| `create_time` / `update_time` / `delete_time` | timestamptz | Standard timestamps |

### 4.2 `leaudit_rule_versions`

Rule version table. Every edit or publish creates a new version row.

| Field | Type | Description |
| --- | --- | --- |
| `id` | bigint PK | Primary key |
| `rule_set_id` | bigint FK | → `leaudit_rule_sets.id` |
| `version_no` | varchar(64) | Semantic version number |
| `version_seq` | int | Sequence number (increasing within a rule_set) |
| `status` | varchar(32) | draft / published / deprecated / rollback |
| `source_type` | varchar(32) | oss_yaml / local_yaml / db_snippet |
| `dsl_format` | varchar(32) | yaml / json |
| `oss_url` | varchar(2048) | OSS address of the YAML file |
| `file_sha256` / `file_size` | — | File integrity |
| `metadata_type_id` / `metadata_name` / `metadata_version` | — | Snapshot of the DSL metadata |
| `change_note` | text | Change description |
| `editor_user_id` / `publisher_user_id` / `published_at` | — | Edit / publish information |
| `create_time` / `update_time` | timestamptz | Standard timestamps |

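The `file_sha256` / `file_size` pair exists so that a downloaded rules file can be checked against the version row. A minimal sketch with the standard `hashlib` module follows; the function names are hypothetical.

```python
import hashlib


def file_fingerprint(content: bytes) -> tuple[str, int]:
    """Return (file_sha256, file_size) for a rules file body."""
    return hashlib.sha256(content).hexdigest(), len(content)


def verify_download(content: bytes, expected_sha256: str, expected_size: int) -> bool:
    """Check a downloaded YAML against the values stored on the version row."""
    return file_fingerprint(content) == (expected_sha256, expected_size)
```

The hex digest is 64 characters long, which fits the `varchar(64)` used for SHA256 columns elsewhere in these schemas.
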
### 4.3 `leaudit_rule_type_bindings`

Binding table between document types and rule sets.

| Field | Type | Description |
| --- | --- | --- |
| `id` | bigint PK | Primary key |
| `doc_type_id` | bigint | → `leaudit_document_types.id` |
| `doc_type_code` | varchar(128) | Document type code (denormalized for fast matching) |
| `rule_set_id` | bigint FK | → `leaudit_rule_sets.id` |
| `binding_mode` | varchar(32) | explicit / wildcard / fallback |
| `priority` | int | Priority (larger means higher) |
| `is_active` | boolean | Whether the binding is active |
| `note` | text | Remarks |
| `create_time` / `update_time` | timestamptz | Standard timestamps |

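Choosing a binding for a document type can be sketched from the documented columns alone: keep active rows whose `doc_type_code` matches, then take the highest `priority`. How `binding_mode` (explicit / wildcard / fallback) interacts with priority is not specified here, so this sketch deliberately ignores it; the `Binding` type and `pick_binding` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    doc_type_code: str
    rule_set_id: int
    binding_mode: str  # explicit / wildcard / fallback (not interpreted here)
    priority: int
    is_active: bool = True


def pick_binding(bindings: list[Binding], doc_type_code: str) -> Optional[Binding]:
    """Return the active binding with the largest priority, or None."""
    candidates = [
        b for b in bindings
        if b.is_active and b.doc_type_code == doc_type_code
    ]
    return max(candidates, key=lambda b: b.priority, default=None)
```

In SQL the same selection would be a filtered query ordered by `priority DESC` with a limit of one row.
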
## 5. Rule Execution Sequence

### 5.1 Runtime Loading (Current Flow)

```text
Document type determined
  → look up the binding in leaudit_rule_type_bindings
  → leaudit_rule_sets.current_version_id
  → leaudit_rule_versions.oss_url
  → download the YAML to a local temporary file
  → leaudit.dsl.loader.load_rules_file()
  → enter the execution chain
```

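The last steps of the runtime flow, downloading the YAML to a temporary file and handing the path to the loader, can be sketched as follows. The downloader and loader are injected as callables because the real OSS client is not described here; `load_active_rules` is a hypothetical name, and in the platform the loader argument would be `leaudit.dsl.loader.load_rules_file`.

```python
import os
import tempfile
from typing import Callable


def load_active_rules(
    oss_url: str,
    download: Callable[[str], bytes],
    load_rules_file: Callable[[str], object],
) -> object:
    """Download the YAML to a temp file, load it, then remove the temp file."""
    fd, path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(download(oss_url))
        return load_rules_file(path)
    finally:
        # The temp file is only a transport; OSS remains the source of truth.
        os.remove(path)
```

Cleaning up in `finally` means a failed parse does not leave stale rule files on local disk.
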
### 5.2 Edit and Save (to Be Implemented)

```text
Frontend submits YAML text
  → YAML syntax validation
  → LeAudit DSL semantic validation
  → upload the YAML to OSS
  → write leaudit_rule_versions
  → write leaudit_rule_validation_logs (optional)
```

### 5.3 Publish and Activate (to Be Implemented)

```text
Select a version to publish
  → update leaudit_rule_sets.current_version_id
  → write leaudit_rule_publish_logs (optional)
  → clear / invalidate the local rules cache
```

## 6. Working with OSS

### 6.1 Suggested paths

Rule files must be versioned; never overwrite a single fixed path:

```text
oss://leaudit/rules/{rule_type}/{version_no}/rules.yaml
oss://leaudit/rules/{rule_type}/{version_no}/validation_report.json
```

### 6.2 Why a fixed path is not enough

If the same `rules.yaml` is always overwritten:

- Rollback is hard
- Historical results are hard to trace back to the rule version that produced them
- Local caches easily go stale
- Result replay and auditing become impossible

## 7. Seed Data: From Local `rules/` to OSS + DB

The project's `rules/` directory already holds 20+ type directories, each with a usable `rules.yaml`. They need a one-time import.

### 7.1 Initialization script logic

```python
# Pseudocode: import rules/ into OSS + DB
for each rules_dir in rules/:
    yaml_text = read(rules_dir / "rules.yaml")
    rules_file = parse_rules_yaml_text(yaml_text)

    version_no = rules_file.metadata.version
    oss_url = f"oss://leaudit/rules/{rules_dir.name}/{version_no}/rules.yaml"
    upload_to_oss(oss_url, yaml_text)

    rule_set = upsert_rule_set(
        rule_type=rules_dir.name,
        rule_name=rules_file.metadata.name,
        domain_type=classify_domain(rules_file),
        status="active",
        is_builtin=True,
    )

    version = insert_rule_version(
        rule_set_id=rule_set.id,
        version_no=version_no,
        oss_url=oss_url,
        ...
    )

    update_rule_set(rule_set.id, current_version_id=version.id)

    # Bind the document type (when it is known)
    doc_type_id = resolve_doc_type_id(rules_dir.name)
    if doc_type_id:
        insert_type_binding(...)
```

### 7.2 What to do with the local `rules/` after initialization

**Recommended approach**: keep it as a read-only emergency fallback and mark it as closed to edits. If OSS is unavailable or the stored rules are lost entirely, a usable copy of the rules still exists locally.

### 7.3 Transition for the hard-coded `_TYPE_ID_RULES_MAP`

The hard-coded mapping in `leaudit_bridge/tasks.py`:

```python
_TYPE_ID_RULES_MAP: dict[int, str] = {
    3: "行政处罚",  # rules directory name ("administrative penalty")
}
```

Transition strategy:

1. **Phase A**: keep the hard-coded map as a fallback while `leaudit_rule_type_bindings` is already populated
2. **Phase B**: `_resolve_rules_path()` queries `leaudit_rule_type_bindings` first and falls back to the hard-coded map on a miss
3. **Phase C**: once all bindings are in the database, delete `_TYPE_ID_RULES_MAP`

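Phase B of the transition can be illustrated as a lookup that prefers the bindings table and only then consults the hard-coded map. This is a hedged sketch: `resolve_rule_type` and the `lookup_binding` callable are hypothetical stand-ins, since the real signature of `_resolve_rules_path()` is not shown in this document.

```python
from typing import Callable, Optional

# Hard-coded fallback, copied from the mapping shown above.
_TYPE_ID_RULES_MAP: dict[int, str] = {3: "行政处罚"}


def resolve_rule_type(
    type_id: int,
    lookup_binding: Callable[[int], Optional[str]],
) -> Optional[str]:
    """Phase B: ask the bindings table first, then the hard-coded map.

    `lookup_binding` is an assumed hook that returns the rule type bound
    to `type_id` in leaudit_rule_type_bindings, or None on a miss.
    """
    rule_type = lookup_binding(type_id)
    if rule_type is not None:
        return rule_type
    return _TYPE_ID_RULES_MAP.get(type_id)
```

In Phase C the final `return` would shrink to `return None`, which is what makes the map safe to delete.
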
## 8. Recommended API Capabilities (to be implemented)

- `GET /api/leaudit/rule-sets`
- `GET /api/leaudit/rule-sets/{rule_type}`
- `GET /api/leaudit/rule-sets/{rule_type}/versions`
- `GET /api/leaudit/rule-versions/{version_id}/content`
- `POST /api/leaudit/rule-sets/{rule_type}/validate`
- `POST /api/leaudit/rule-sets/{rule_type}/versions`
- `POST /api/leaudit/rule-sets/{rule_type}/publish`
- `POST /api/leaudit/rule-sets/{rule_type}/rollback`

## 9. Conclusions

- Rule content stays in YAML form
- The persistent source of truth moves from fixed local files to OSS files
- The database stores rule metadata, versions, status, and OSS addresses
- At run time, the chain "DB → OSS → local temporary YAML → LeAudit loader" keeps the original `LeAudit` logic unchanged
@@ -0,0 +1,459 @@
|
||||
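# LeAudit Platform: Infrastructure Redesign Proposal

> Based on an in-depth reverse-engineering analysis of the legacy project docauditai, replanned against the new platform leaudit-platform.

---

## 1. File Storage: OSS Path Design

### 1.1 Legacy path pattern

```
documents/{instance_name}/{doc_type_name}/{year}/{chinese_date}/{doc_dir}/{filename}

Example: documents/mz/行政许可卷宗/2026/04月27日/采购合同_14时30分25秒/采购合同.pdf
```

**Core problems in the legacy project:**

- Chinese path segments (region name, document type name, date) are unreadable once URL-encoded and hard to parse programmatically
- `instance_name` uses region abbreviations (mz/yf/jy/cz/sj) and is coupled to the `INSTANCE_NAME` environment variable
- Versions are distinguished by timestamp only, with no semantic version number; finding a historical version relies entirely on DB lookups
- Business documents and audit artifacts share one path namespace with no type separation
- There is no file-level permission metadata: anyone holding a presigned URL can access the file
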
### 1.2 New platform path design

#### Two-tier path scheme

```
┌── Business Documents ── original files uploaded by users
│   bdocs/{region}/{type_code}/{doc_id}/{version}/{file_role}.{ext}
│   bdocs/gd-mz/contract.entrust/10042/v1/primary.pdf
│   bdocs/gd-mz/contract.entrust/10042/v1/attachment_a.docx
│
└── Audit Artifacts ── intermediate/final files produced by the engine
    artifacts/{region}/{run_id}/{artifact_type}/{detail}.{ext}
    artifacts/gd-mz/5801/ocr_result/ocr.json
    artifacts/gd-mz/5801/render_png/page_003.png
    artifacts/gd-mz/5801/final_report/report.pdf
```

#### Path segment specification

| Segment | Meaning | Format | Example |
|---|---|---|---|
| `bdocs` / `artifacts` | Top-level namespace | Fixed | `bdocs` = business documents, `artifacts` = audit artifacts |
| `{region}` | Region code | `{province}-{city}` | `gd-mz` (Guangdong, Meizhou), `gd-yf` (Yunfu), `gd-jy` (Jieyang), `gd-cz` (Chaozhou), `gd-sj` (provincial level) |
| `{type_code}` | Document type code | DSL type_id | `contract.entrust`, `admin_license.new` |
| `{doc_id}` | Document ID | DB primary key | `10042` |
| `{version}` | Version number | `v{n}` | `v1`, `v2`, `v3` |
| `{file_role}` | File role | Enum | `primary` / `attachment_a` / `scan` / `ocr_text` |
| `{run_id}` | Audit run ID | DB primary key | `5801` |
| `{artifact_type}` | Artifact type | Enum (20 values) | `ocr_result`, `extract_json`, `evaluate_json`, `final_report` |
| `{detail}` | Artifact detail | Free-form | `page_003.png`, `rule_R001.json` |

#### Key design decisions

1. **All-English paths**: no URL-encoding issues; paths are readable directly in logs and while debugging
2. **Regions use `{province}-{city}` codes**: clearer than the legacy `mz`/`yf`, and unambiguous when expanding to other provinces
3. **`doc_id` in the path**: the path is self-describing, so file ownership is known without a DB lookup
4. **Explicit `{version}` segment**: the version is visible in the path, and v1/v2/v3 can coexist
5. **Artifacts organized by `run_id`**: all artifacts of one audit run live in one directory, which simplifies cleanup and archiving

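The path specification lends itself to two small key builders. The sketch below is illustrative only: `bdoc_key` and `artifact_key` are hypothetical names (the planned `oss_path_utils.py` is not shown in this document), the region pattern is an assumption inferred from the `gd-mz` style examples, and `detail` is taken to include the file extension, as in the table's examples.

```python
import re

# Assumed shape of a region code: lowercase letters, a dash, lowercase letters.
_REGION_RE = re.compile(r"[a-z]+-[a-z]+")


def _check_region(region: str) -> None:
    if not _REGION_RE.fullmatch(region):
        raise ValueError(f"invalid region code: {region!r}")


def bdoc_key(region: str, type_code: str, doc_id: int,
             version: int, file_role: str, ext: str) -> str:
    """Build the object key of a business document file."""
    _check_region(region)
    if version < 1:
        raise ValueError("version numbers start at 1")
    return f"bdocs/{region}/{type_code}/{doc_id}/v{version}/{file_role}.{ext}"


def artifact_key(region: str, run_id: int,
                 artifact_type: str, detail: str) -> str:
    """Build the object key of an audit artifact; `detail` carries the extension."""
    _check_region(region)
    return f"artifacts/{region}/{run_id}/{artifact_type}/{detail}"
```

Rejecting non-ASCII region codes at build time is one way to keep the all-English guarantee from eroding silently.
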
---

## 2. Multiple Versions of the Same File

### 2.1 Legacy approach

- Each upload creates a new timestamped directory → physical isolation
- In the DB, rows are grouped by `(name, type_id)` and the latest is taken by `create_time DESC`
- Version numbers are computed at query time and are not stored
- Old versions are kept forever, with no cleanup policy

### 2.2 New platform version design

#### Version storage model

```
bdocs/gd-mz/contract.entrust/10042/
├── v1/primary.pdf   ← first upload
├── v2/primary.pdf   ← second upload (corrected)
└── v3/primary.pdf   ← third upload (final)
```

#### Version metadata

Recorded in the `leaudit_document_files` table:

```python
class LeauditDocumentFile(Base):
    document_id: int            # document ID
    version_no: int             # version number (1, 2, 3...)
    version_seq: str            # version label: "v1", "v2"
    file_role: str              # primary / attachment / ...
    oss_url: str                # full OSS path
    sha256: str                 # file hash
    is_current: bool            # whether this is the current active version
    replaced_by_id: int | None  # newer version that superseded this one (version chain); None while current
    upload_user_id: int         # uploader
    change_note: str            # change description
```

#### Version lifecycle

```
upload v1 → v1.is_current = True
upload v2 → v1.is_current = False, v1.replaced_by_id = v2.id
            v2.is_current = True
upload v3 → v2.is_current = False, v2.replaced_by_id = v3.id
            v3.is_current = True
```

- All old version files stay in OSS (no physical deletion)
- The version chain can be shown in the frontend as a "version history" list
- Rollback = mark the chosen older version `is_current = True` and clear the flag on the version that was current (no file copy needed)

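The lifecycle above can be traced with a small in-memory model. This is a sketch under assumptions: `FileVersion` and `VersionChain` are hypothetical names, IDs are assigned sequentially for illustration, and rollback leaves `replaced_by_id` untouched because the source does not say whether the chain should be rewritten.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileVersion:
    id: int
    version_no: int
    is_current: bool = False
    replaced_by_id: Optional[int] = None


class VersionChain:
    """In-memory stand-in for the rows of one document in leaudit_document_files."""

    def __init__(self) -> None:
        self.versions: list[FileVersion] = []

    def upload(self) -> FileVersion:
        """Add a new version and demote the previous current one."""
        number = len(self.versions) + 1
        new = FileVersion(id=number, version_no=number, is_current=True)
        for v in self.versions:
            if v.is_current:
                v.is_current = False
                v.replaced_by_id = new.id
        self.versions.append(new)
        return new

    def rollback(self, version_no: int) -> FileVersion:
        """Make an older version current again without copying any file."""
        target = next(v for v in self.versions if v.version_no == version_no)
        for v in self.versions:
            v.is_current = False
        target.is_current = True
        return target
```

Both operations keep exactly one row flagged `is_current`, which is the invariant a database implementation would also need to enforce (for example inside one transaction).
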
#### Link to audit runs

Each `leaudit_audit_runs` record pins the version it used:

```
audit_runs.document_file_id → the leaudit_document_files.id of that specific version
```

So even after the document moves on to v3, a historical audit record still points exactly at the v1 file it used.

---

## 3. Multi-Region File Access Permissions and Region Isolation

### 3.1 Legacy approach

- Single bucket; the `{instance_name}` path prefix separates regions
- No permission check on file access (a presigned URL is enough)
- Isolation depends on the `INSTANCE_NAME` environment variable, so it is effective only at the API layer

### 3.2 New platform permission model

#### Three layers of access control

```
Layer 1: Region Isolation
  └── The user's region determines which documents are visible
      Provincial (gd-sj) users can see all regions
      City-level (gd-mz) users can see only their own region

Layer 2: Document-Level permissions
  └── RBAC-based document access control
      document:read:own   → documents the user uploaded
      document:read:all   → all documents in the user's region
      document:read:cross → cross-region viewing

Layer 3: Artifact-Level permissions
  └── Audit artifacts are isolated by run_id
      Artifacts inherit the permission policy of their document
      Temporary artifacts (rescue debug log) are readable only by internal systems
```

#### Permission check flow

```
Request: GET /api/v2/documents/10042/files/v1/primary.pdf

1. Parse the JWT → get user_id, user_role, user_region
2. Region check: user_region == 'gd-sj' OR user_region == the document's region
3. Permission check: CheckPermission(user_id, "document:read:own")
   or a match via GRANT/DENY wildcards
4. Pass → generate a presigned URL (TTL 10 minutes)
5. Deny → return 403
```

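Steps 2 and 3 of the check can be expressed as one pure function, which keeps the rule easy to unit-test. The following is a sketch, not the platform's `CheckPermission`: `can_read_document` is a hypothetical name, permissions are modeled as a plain set of strings, GRANT/DENY wildcard matching is left out, and letting `document:read:cross` pass the region gate is an assumption drawn from the Layer 2 description.

```python
PROVINCIAL_REGION = "gd-sj"


def can_read_document(user_id: int, user_region: str, permissions: set,
                      doc_region: str, doc_owner_id: int) -> bool:
    """Return True when the region gate and the permission gate both pass."""
    # Region gate: provincial users, same-region users, or (assumed)
    # holders of the cross-region permission.
    region_ok = (
        user_region == PROVINCIAL_REGION
        or user_region == doc_region
        or "document:read:cross" in permissions
    )
    if not region_ok:
        return False
    # Permission gate: broad grants first, then ownership.
    if "document:read:all" in permissions or "document:read:cross" in permissions:
        return True
    return "document:read:own" in permissions and user_id == doc_owner_id
```

A `True` result would lead to issuing the short-lived presigned URL; `False` maps to the 403 response.
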
#### Cross-region access

```text
# A provincial user starts a cross-region review
POST /api/v2/documents/cross-review
{
  "document_id": 10042,          # a gd-mz document
  "reviewer_region": "gd-yf",    # let Yunfu reviewers view it
  "permission": "document:read:cross"
}
→ the system creates a temporary access grant for reviewers in the gd-yf region
→ the grant is recorded in leaudit_cross_access_logs
→ the grant expires automatically once the review is complete
```

---

## 4. Queue Mechanism Redesign

### 4.1 Legacy project analysis

**Architecture:**
```
┌─────────────┐     ┌────────────────┐     ┌───────────────────────────┐
│ API Server  │────▶│ Redis Queue    │────▶│ Celery Worker             │
│ (8000-8873) │     │ (single queue) │     │ (8 threads, 4 concurrent) │
└─────────────┘     └────────────────┘     └───────────────────────────┘
       │                                                │
       └── source_port ────────────────────────────────▶│ os.environ switching
                                                        │ thread-level isolation
```

**Key mechanisms:**
- All regions share one Redis queue
- The `source_port` parameter makes the worker switch environment variables at task execution time
- A Redis semaphore caps global concurrency at 4
- Thread pool (8 threads): 4 actually concurrent + 4 waiting on I/O

**Problems:**
- Switching environment variables is fragile state management (thread-safety issues, patched over with threading.local)
- A single queue has no priority separation (urgent tasks and batch jobs share it)
- Semaphore repair depends on a scheduled task (permits can leak in the window between runs)

### 4.2 New platform queue design

#### Multi-queue architecture

```
┌────────────────────────────────────────────────────┐
│                       Redis                        │
│                                                    │
│  leaudit:queue:high        (high priority)         │
│  leaudit:queue:default     (normal)                │
│  leaudit:queue:batch       (batch / low priority)  │
│  leaudit:queue:system      (system maintenance)    │
│                                                    │
│  leaudit:semaphore:global  (concurrency control)   │
│  leaudit:semaphore:vlm     (VLM concurrency)       │
└────────────────────────────────────────────────────┘
```

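#### Task routing (no more source_port)

```python
# New approach: pass the context with the task at submission time
# instead of switching environment variables at run time.
# `celery_app` is the project's Celery application, defined elsewhere.
@celery_app.task(
    bind=True,
    queue="leaudit:queue:default",
    time_limit=1800,
    soft_time_limit=1500,
    max_retries=3,
    default_retry_delay=60,
)
def leaudit_process_document(
    self,
    document_id: int,
    run_id: int,
    region: str,                # gd-mz, gd-yf... (replaces source_port)
    config: dict,               # runtime config snapshot
    user_id: int | None = None,
):
    """Document audit task: context arrives as parameters, not environment variables."""
    # Celery calls task functions synchronously, so this is a plain `def`;
    # async pipeline code can be driven from here, e.g. with asyncio.run().
    ...
```

**Improvements:**
1. **Explicit parameters replace environment variables**: `region` and `config` are passed directly, which is thread-safe and testable
2. **Priority queues**: user-triggered runs go to high, API-triggered runs to default, batch imports to batch
3. **source_port removed**: the fragile `set_instance_environment` / `restore_instance_environment` pattern is no longer needed
4. **Config snapshot**: the full configuration (LLM model, OCR endpoint, etc.) is captured when the task is created, so later configuration changes do not affect tasks already submitted
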
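Queue selection and the config snapshot can both happen in one helper at submission time. This is a minimal sketch: `build_submission` and the trigger labels (`manual`, `api`, `batch_import`) are assumptions introduced for illustration; only the queue names come from the multi-queue layout above.

```python
import copy

# Queue names follow the multi-queue layout; the trigger labels are assumed.
QUEUE_BY_TRIGGER = {
    "manual": "leaudit:queue:high",
    "api": "leaudit:queue:default",
    "batch_import": "leaudit:queue:batch",
}


def build_submission(document_id, run_id, region, config, trigger):
    """Return (queue, kwargs) for one task submission."""
    queue = QUEUE_BY_TRIGGER.get(trigger, "leaudit:queue:default")
    kwargs = {
        "document_id": document_id,
        "run_id": run_id,
        "region": region,
        # Deep copy so later edits to the live config cannot leak into the task.
        "config": copy.deepcopy(config),
    }
    return queue, kwargs
```

With Celery, the pair could then be handed to `apply_async(kwargs=kwargs, queue=queue)` on the task, which overrides the default queue declared in the decorator.
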
#### Task types and routing

| Task | Queue | Priority | Concurrency limit | Timeout |
|---|---|---|---|---|
| Manual audit (user-triggered) | `high` | 8 | Global 4 | 30 min |
| Automatic audit (API-triggered) | `default` | 5 | Global 4 | 30 min |
| Batch import | `batch` | 3 | Global 2 | 60 min |
| Cross-review | `default` | 5 | Global 4 | 30 min |
| Semaphore repair | `system` | 10 | Unlimited | 10 s |
| Health check | `system` | 10 | Unlimited | 5 s |
| Statistics update | `batch` | 1 | Global 1 | 5 min |

#### Concurrency control improvements

```python
from contextlib import asynccontextmanager


# New approach: context manager + automatic release
class TaskConcurrencyLimiter:
    """Redis-based concurrency limiter; Lua scripts make each operation atomic."""

    async def acquire(self, semaphore_key: str, max_concurrency: int, timeout: float) -> str:
        """Atomically acquire a permit → return permit_id."""
        ...

    async def release(self, semaphore_key: str, permit_id: str) -> None:
        """Atomically release a permit."""
        ...

    @asynccontextmanager
    async def limit(self, semaphore_key: str, max_concurrency: int):
        permit_id = await self.acquire(semaphore_key, max_concurrency, timeout=300)
        try:
            yield
        finally:
            await self.release(semaphore_key, permit_id)
```

- Redis Lua scripts replace WATCH/MULTI/EXEC (fewer optimistic-lock conflicts)
- A context manager replaces manual acquire/release (prevents leaks)
- No scheduled repair task on the normal path: Lua atomic operations keep the permit set consistent. A permit held by a worker that dies before `finally` runs still has to expire through a permit TTL.

---

## 5. Cache Mechanism Redesign

### 5.1 Legacy project analysis

**Three layers of Redis usage:**

| Layer | Purpose | Data | TTL |
|---|---|---|---|
| Permission cache | Faster RBAC authorization | user:permissions:*, rbac:routes:* | 5-30 min |
| Token blacklist | JWT revocation | token:revoked:{jti} | up to 24 h |
| Concurrency control | Semaphore | semaphore:* | 1800 s permit TTL |

**Problems:**
- 5+ independent Redis connections (no shared connection pool)
- `fastapi_cache2` is initialized but never used
- Region isolation relies on switching Redis DB numbers (complex configuration, error-prone)
- No API response caching (every request hits the DB)
- The statistics cache TTL is 20 min, inconsistent with the permission TTL

### 5.2 New platform cache architecture

#### Unified connection pool

```python
# fastapi_common/fastapi_common_cache/redis_pool.py
import redis
import redis.asyncio as aioredis  # assumes redis-py >= 4.2, which bundles the former aioredis


class RedisPool:
    """Global Redis connection pool shared by all modules."""

    _instance: "RedisPool | None" = None

    def __init__(self, config: dict):
        # Async connection pool (primary); host/port/db come from `config` (elided here)
        self.async_pool = aioredis.ConnectionPool(
            max_connections=50,
            socket_connect_timeout=5,
            socket_keepalive=True,
            retry_on_timeout=True,
        )
        # Sync client (used by Celery workers)
        self.sync_client = redis.Redis(...)
```

**Key decision: Redis DB numbers are no longer used for region isolation.** Key prefixes are used instead:
```
# Old: REDIS_DB=12 (gd-mz), REDIS_DB=13 (gd-yf)
# New: all regions share DB 0 and are separated by prefix
leaudit:gd-mz:permission:user:12345
leaudit:gd-yf:permission:user:12345
leaudit:gd-sj:cache:stats:homepage
```

Benefits: unified monitoring, cross-region data reachable with one `SCAN` match pattern, and simple configuration (a single DB number).

#### Cache layering strategy

```
L1: In-process memory cache (Python dict / lru_cache)
  ├── Configuration data (rule-set list, engine version): loaded at startup, refreshed manually
  └── DSL rule file content: local file cache + mtime check

L2: Redis cache (TTL-driven)
  ├── Permission data: 5 min TTL (changes infrequently)
  ├── Token blacklist: computed from the JWT exp, at most 24 h
  ├── Statistics aggregates: 10-30 min TTL (expensive to compute)
  └── API responses: 1-5 min TTL (hot endpoints)

L3: Distributed locks
  └── Concurrency-control semaphores (Lua atomic operations)
```

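The "local file cache + mtime check" entry in L1 can be sketched with the standard library alone. `MtimeFileCache` is a hypothetical helper, and `loader` stands in for whatever parses a rules file (for instance the DSL loader); the sketch reloads a file only when its modification time changes.

```python
import os


class MtimeFileCache:
    """Cache parsed file content in process memory, keyed by path and mtime."""

    def __init__(self, loader):
        self._loader = loader   # callable: path -> parsed value
        self._entries = {}      # path -> (mtime_ns, value)

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]    # file unchanged since last load
        value = self._loader(path)
        self._entries[path] = (mtime, value)
        return value
```

Because rule files are versioned in OSS, a cache keyed by the versioned local path would rarely see an mtime change at all; the check mainly guards against a temp file being rewritten in place.
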
#### Cache key naming convention

```
leaudit:{region}:{domain}:{entity_type}:{entity_id}
        └── use `global` for keys shared by all regions

# Permissions
leaudit:gd-mz:perm:user:12345              # permission set of a user
leaudit:gd-mz:perm:user:12345:doc:read     # single permission check

# Token
leaudit:global:token:revoked:{jti}         # shared globally

# Statistics
leaudit:gd-mz:stats:homepage:{user_id}     # homepage statistics
leaudit:gd-sj:stats:daily:2026-04-27       # daily report

# Concurrency
leaudit:global:sem:task:permits            # task concurrency semaphore
leaudit:global:sem:vlm:permits             # VLM concurrency semaphore

# API response cache
leaudit:gd-mz:api:documents:list:{hash}    # document list
```

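A single helper that assembles keys keeps the convention from drifting between modules. The sketch below assumes a hypothetical `cache_key` function; it falls back to the `global` scope when no region is given, matching the token and semaphore examples.

```python
def cache_key(domain, *parts, region=None):
    """Build a key of the form leaudit:{region}:{domain}:{part}:{part}..."""
    scope = region or "global"   # keys without a region are shared globally
    return ":".join(["leaudit", scope, domain, *map(str, parts)])
```

For example, `cache_key("perm", "user", 12345, region="gd-mz")` yields the per-user permission key, and `cache_key("sem", "task", "permits")` yields the global task semaphore key.
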
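#### Cache invalidation strategy

| Trigger | Cleanup action | Scope |
|---|---|---|
| User role change | `SCAN` with `MATCH leaudit:*:perm:user:{uid}` → `DEL` | Exact |
| Rule set publishes a new version | `SCAN` with `MATCH leaudit:*:perm:role:*` → `DEL` (all roles) | Global |
| Token revocation | `SETEX leaudit:global:token:revoked:{jti} {ttl} 1` | Exact |
| Document status change | `SCAN` with `MATCH leaudit:{region}:stats:*` → `DEL` | Region |
| Config change | `SCAN` with `MATCH leaudit:global:config:*` → `DEL` | Global |

Redis `DEL` takes literal key names, not glob patterns, so every pattern-based cleanup goes through `SCAN` first.
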
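Pattern-based invalidation needs a scan-then-delete loop, because Redis `DEL` only accepts concrete key names. The following sketch assumes a client that exposes `scan_iter(match=..., count=...)` and `delete(*names)`, as redis-py clients do; `delete_by_pattern` is a hypothetical helper, and `FakeRedis` is a tiny stand-in included only so the example runs without a server.

```python
import fnmatch


def delete_by_pattern(client, pattern, batch_size=500):
    """Delete every key matching a glob pattern; return how many were removed."""
    deleted = 0
    batch = []
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    if batch:
        deleted += client.delete(*batch)
    return deleted


class FakeRedis:
    """In-memory stand-in with the two methods the helper relies on."""

    def __init__(self, keys):
        self.keys = set(keys)

    def scan_iter(self, match=None, count=None):
        # Return a list so deleting during iteration is safe in this fake.
        return [k for k in sorted(self.keys) if fnmatch.fnmatchcase(k, match)]

    def delete(self, *names):
        hit = self.keys & set(names)
        self.keys -= hit
        return len(hit)
```

Deleting in batches keeps each `DEL` call bounded; on a large keyspace, `UNLINK` may be preferable to avoid blocking the server while memory is reclaimed.
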
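#### Data that is not cached

- **Audit result details**: every run produces different results, so caching has no value
- **Document file content**: lives in OSS, not in the cache
- **Real-time queue state**: read directly from the Redis List, with no extra cache
- **Intermediate results of DSL rule execution**: one-off data that does not need caching

---
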
## 6. Overall Infrastructure Topology

```
                    ┌──────────────────────────┐
                    │ PostgreSQL               │
                    │ nas.7bm.co:54302         │
                    │ leaudit_platform         │
                    │ (17 tables)              │
                    └──────────┬───────────────┘
                               │
┌──────────────┐    ┌──────────┴───────────────┐    ┌──────────────┐
│ MinIO OSS    │    │ FastAPI App              │    │ Redis        │
│              │◀───│ (port 8000-8873)         │───▶│              │
│ bdocs/       │    │                          │    │ Queue        │
│ artifacts/   │    │ Controller→Service→Model │    │ Cache        │
│              │    │ Bridge→LeAudit Engine    │    │ Semaphore    │
└──────────────┘    └──────────────────────────┘    └──────────────┘
                               │
                    ┌──────────┴───────────────┐
                    │ Celery Worker(s)         │
                    │ (thread pool)            │
                    │ Pipeline run             │
                    └──────────────────────────┘
```

---

## 7. Implementation Priorities

| Priority | Task | Depends on |
|---|---|---|
| **P0** | Fill in the config file: real values for Redis/OSS/LLM/VLM/OCR | app.toml |
| **P0** | OSS path utility `oss_path_utils.py` | Path specification above |
| **P0** | Unified Redis connection pool `redis_pool.py` | Redis config |
| **P1** | Adapt the file upload flow to the new OSS paths | oss_path_utils |
| **P1** | Rework Celery task routing (parameters replace source_port) | Redis connection pool |
| **P1** | Lua script implementation of the concurrency limiter | Redis |
| **P1** | Hook the permission cache into PermissionService | Redis cache layer |
| **P2** | API response caching (hot endpoints) | Redis cache layer |
| **P2** | Multi-version file management | File upload flow |
| **P3** | Cross-region access audit log | Permission system |
@@ -0,0 +1,153 @@
|
||||
# LeAudit Processing Logic

> Describes the full pipeline stages of the leaudit engine; applies to both the legacy and the new project.

## 1. Main Chain Overview

The full `leaudit` processing chain is not a single OCR flow but a complete audit pipeline:

```text
File input
  → normalize
  → rules resolve
  → extract
  → determine phase
  → evaluate
  → rescue
  → persist / adapt result
```

## 2. Stage Details

### 2.1 normalize

Responsibilities:

- Document parsing
- OCR or text extraction
- Document classification
- Case-file segmentation
- Seal/signature enhancement
- Assembling a unified normalized result object

Outputs:

- `normalized_doc`
- Raw text
- Classification result
- Visual manifest (VisualManifest)
- Sub-document structure

### 2.2 rules resolve

Responsibilities:

- Locate the DSL rules file from the input document type or the classification result
- Parse it into a `RulesFile`

How leaudit-platform does it:

- The bridge layer performs the rule mapping explicitly (`rules_loader.py`)
- Rules are downloaded from OSS to a local temporary file and then loaded
- Supports looking up the binding in `leaudit_rule_type_bindings` → downloading from `leaudit_rule_versions.oss_url`

### 2.3 extract

Responsibilities:

- Extract fields in groups according to the schema
- Run null-field retry / deep retry
- Hydrate values into typed fields
- Expand multi-entity fields
- Compute derived fields
- Post-hoc grounding
- Supplementary seal/signature extraction

Outputs:

- Structured fields
- Multi-entity fields
- Derived fields
- List of extraction errors

### 2.4 determine phase

Responsibilities:

- Identify whether the document is `draft` or `executed`
- Decide which rules take part in the subsequent evaluation

Notes:

- `leaudit` first runs a full extraction as `executed`, then infers the phase backwards from the result
- When integrating, do not add a second, separate phase-detection step in the outer layer

### 2.5 evaluate

Responsibilities:

- Execute the DSL rules
- Apply the phase filter, activate_if, and the confidence gate
- Emit pass/fail/skip for each rule
- Compute scores, failure reasons, and remediation suggestions

Outputs:

- Document-level audit result
- Rule-level result list

### 2.6 rescue

Responsibilities:

- Re-judge failed rules
- Possibly flip a failure to a pass

Notes:

- The aggregate result after rescue is the final business result
- Persist the final result; do not mix pre-rescue and post-rescue states

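The rule that only post-rescue state counts can be made concrete with a small aggregation helper. This is a sketch with assumed shapes: `RuleOutcome`, its `rescued` flag, and `summarize` are hypothetical, and the real engine's result objects may differ; the count names mirror the document-level fields listed for the bridge output.

```python
from dataclasses import dataclass


@dataclass
class RuleOutcome:
    rule_id: str
    status: str            # "pass" / "fail" / "skip" as emitted by evaluate
    rescued: bool = False  # True when rescue flipped a failure


def final_status(outcome: RuleOutcome) -> str:
    """Status after rescue: a rescued failure counts as a pass."""
    if outcome.status == "fail" and outcome.rescued:
        return "pass"
    return outcome.status


def summarize(outcomes):
    """Aggregate counts from final (post-rescue) statuses only."""
    finals = [final_status(o) for o in outcomes]
    return {
        "passed_count": finals.count("pass"),
        "failed_count": finals.count("fail"),
        "skipped_count": finals.count("skip"),
    }
```

Deriving every count from `final_status` in one place is what prevents a summary row from mixing pre-rescue failures with post-rescue passes.
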
## 3. Inputs Required by the Bridge Layer

The bridge layer (`fastapi_modules/fastapi_leaudit/leaudit_bridge/`) must prepare at least:

- `document_id`
- A readable local file path
- The matching `RulesFile`
- OCR/LLM/VLM clients
- A result persistence adapter (`storage_adapter.py`)

## 4. Outputs of the Bridge Layer

It produces, in one unified shape:

- Document-level summary: `total_score` / `passed_count` / `failed_count` / `skipped_count` / `timing`
- Rule-level results: `rule_id` / `rule_name` / `passed` / `status` / `score` / `risk` / `message` / `remediation`
- Extraction results: `extraction_fields` / `derived_fields` / `multi_entity`

## 5. Key Integration Constraints

### File paths

Much of the `leaudit` logic assumes a local file path, so the bridge layer must ensure:

- The document is available to it as a local file
- A file stored in object storage is downloaded to a temporary path first

### Unified configuration

Do not let `leaudit` become a second, independent source of truth for configuration inside the project.

The project injects, from one place:

- OCR config (`app.toml` → `fastapi_admin.config.OCR_*`)
- LLM config (`app.toml` → `fastapi_admin.config.LLM_*`)
- VLM config (`app.toml` → `fastapi_admin.config.VLM_*`)
- Concurrency-control config

### Storage isolation

- `leaudit` computes internally
- The bridge adapts the output and writes it to the unified `leaudit_*` result tables
- leaudit's own default `documents/extractions/evaluations` tables are not used
@@ -0,0 +1,344 @@
|
||||
# Complete Deduplicated Endpoint List

- Snapshot date: 2026-04-27
- Current branch: `Wren-Development-V5`
- Dedup key: "HTTP method + normalized path + endpoint"
- Result after dedup: `201` HTTP endpoints
- Raw runtime route objects: `347`
- Prefixes merged during normalization: `/admin`, `/api/v2`, `/api/v3`, `/v3`

## [ai-suggestions] 4
|
||||
|
||||
```text
|
||||
POST /ai-suggestions/batch-generate
|
||||
POST /ai-suggestions/generate
|
||||
GET /ai-suggestions/task/{task_id}
|
||||
GET /ai-suggestions/{evaluation_result_id}
|
||||
```
|
||||
|
||||
## [api] 4
|
||||
|
||||
```text
|
||||
DELETE,GET,PATCH,POST,PUT /api/dataset/{path:path}
|
||||
POST /api/postgrest/proxy
|
||||
DELETE,GET,PATCH,POST,PUT /api/postgrest/proxy/{table_path:path}
|
||||
DELETE,GET,PATCH,POST,PUT /api/{path:path}
|
||||
```
|
||||
|
||||
## [areas] 2
|
||||
|
||||
```text
|
||||
GET /areas
|
||||
POST /areas/reload
|
||||
```
|
||||
|
||||
## [auth] 6
|
||||
|
||||
```text
|
||||
GET /auth/admin-only
|
||||
GET /auth/check-permission
|
||||
POST /auth/login
|
||||
GET /auth/me
|
||||
POST /auth/password_login
|
||||
DELETE,GET,PATCH,POST /auth/sso_users
|
||||
```
|
||||
|
||||
## [awareness-configs] 5
|
||||
|
||||
```text
|
||||
GET /awareness-configs
|
||||
POST /awareness-configs
|
||||
DELETE /awareness-configs/{doc_type_code}
|
||||
GET /awareness-configs/{doc_type_code}
|
||||
PUT /awareness-configs/{doc_type_code}
|
||||
```
|
||||
|
||||
## [awareness-templates] 8
|
||||
|
||||
```text
|
||||
GET /awareness-templates
|
||||
POST /awareness-templates
|
||||
GET /awareness-templates/code/{template_code}
|
||||
GET /awareness-templates/types
|
||||
DELETE /awareness-templates/{template_id}
|
||||
GET /awareness-templates/{template_id}
|
||||
PUT /awareness-templates/{template_id}
|
||||
POST /awareness-templates/{template_id}/duplicate
|
||||
```
|
||||
|
||||
## [cross_review] 14
|
||||
|
||||
```text
|
||||
POST /cross_review/proposals
|
||||
POST /cross_review/proposals/details
|
||||
POST /cross_review/proposals/document
|
||||
POST /cross_review/proposals/document/check_pending_votes
|
||||
DELETE /cross_review/proposals/{proposal_id}
|
||||
POST /cross_review/proposals/{proposal_id}/votes
|
||||
POST /cross_review/tasks/user_tasks
|
||||
GET /cross_review/tasks/{task_id}/can-confirm
|
||||
GET /cross_review/tasks/{task_id}/documents
|
||||
POST /cross_review/tasks/{task_id}/documents
|
||||
POST /cross_review/tasks/{task_id}/documents/{document_id}/append_attachments
|
||||
POST /cross_review/tasks/{task_id}/documents/{document_id}/complete
|
||||
GET /cross_review/tasks/{task_id}/progress
|
||||
POST /cross_review/tasks/{task_id}/upload_documents
|
||||
```
|
||||
|
||||
## [debug] 2
|
||||
|
||||
```text
|
||||
GET /debug/dify-config
|
||||
GET /debug/test-dify
|
||||
```
|
||||
|
||||
## [dify] 10
|
||||
|
||||
```text
|
||||
GET /dify/area-datasets
|
||||
POST /dify/area-datasets
|
||||
GET /dify/area-datasets/areas
|
||||
GET /dify/area-datasets/check/{dataset_id}
|
||||
GET /dify/area-datasets/my
|
||||
DELETE /dify/area-datasets/{dataset_bind_id}
|
||||
GET /dify/area-datasets/{dataset_bind_id}
|
||||
PUT /dify/area-datasets/{dataset_bind_id}
|
||||
GET /dify/chat-apps
|
||||
GET /dify/chat-apps/default
|
||||
```
|
||||
|
||||
## [dify_chat] 1
|
||||
|
||||
```text
|
||||
DELETE,GET,PATCH,POST,PUT /dify_chat/{path:path}
|
||||
```
|
||||
|
||||
## [dify_dataset] 1
|
||||
|
||||
```text
|
||||
DELETE,GET,PATCH,POST,PUT /dify_dataset/{path:path}
|
||||
```
|
||||
|
||||
## [document-types] 7
|
||||
|
||||
```text
|
||||
GET /document-types
|
||||
POST /document-types
|
||||
GET /document-types/options/entry-modules
|
||||
GET /document-types/options/prompt-templates
|
||||
DELETE /document-types/{type_id}
|
||||
GET /document-types/{type_id}
|
||||
PUT /document-types/{type_id}
|
||||
```
|
||||
|
||||
## [documents] 7
|
||||
|
||||
```text
|
||||
GET /documents/check-duplicate
|
||||
POST /documents/contract_templates/{comparison_id}/append_attachments
|
||||
POST /documents/contracts/{document_id}/append_attachments
|
||||
POST /documents/cross_review/documents/upload_and_assign
|
||||
GET /documents/list
|
||||
POST /documents/upload
|
||||
POST /documents/upload_contract_template
|
||||
```
|
||||
|
||||
## [entry-modules] 6
|
||||
|
||||
```text
|
||||
GET /entry-modules
|
||||
POST /entry-modules
|
||||
DELETE /entry-modules/{module_id}
|
||||
GET /entry-modules/{module_id}
|
||||
PUT /entry-modules/{module_id}
|
||||
POST /entry-modules/{module_id}/image
|
||||
```
|
||||
|
||||
## [leaudit-review-points] 1
|
||||
|
||||
```text
|
||||
GET /leaudit-review-points/{document_id}
|
||||
```
|
||||
|
||||
## [prompt-templates] 8
|
||||
|
||||
```text
|
||||
GET /prompt-templates
|
||||
POST /prompt-templates
|
||||
GET /prompt-templates/code/{template_code}
|
||||
GET /prompt-templates/types
|
||||
DELETE /prompt-templates/{template_id}
|
||||
GET /prompt-templates/{template_id}
|
||||
PUT /prompt-templates/{template_id}
|
||||
POST /prompt-templates/{template_id}/duplicate
|
||||
```
|
||||
|
||||
## [qichacha] 5
|
||||
|
||||
```text
|
||||
POST /qichacha/batch
|
||||
POST /qichacha/company
|
||||
POST /qichacha/dishonesty
|
||||
POST /qichacha/enterprise
|
||||
GET /qichacha/status
|
||||
```
|
||||
|
||||
## [rbac] 25
|
||||
|
||||
```text
|
||||
GET /rbac/check-route
|
||||
POST /rbac/clear-routes-cache
|
||||
GET /rbac/permissions
|
||||
POST /rbac/permissions
|
||||
DELETE /rbac/permissions/{permission_id}
|
||||
GET /rbac/permissions/{permission_id}
|
||||
PUT /rbac/permissions/{permission_id}
|
||||
DELETE /rbac/role-permissions
|
||||
GET /rbac/role-permissions
|
||||
POST /rbac/role-permissions
|
||||
PUT /rbac/role-permissions
|
||||
GET /rbac/roles
|
||||
POST /rbac/roles
|
||||
DELETE /rbac/roles/{role_id}
|
||||
GET /rbac/roles/{role_id}
|
||||
PUT /rbac/roles/{role_id}
|
||||
GET /rbac/roles/{role_id}/all-routes
|
||||
GET /rbac/roles/{role_id}/routes
|
||||
PUT /rbac/roles/{role_id}/routes
|
||||
GET /rbac/roles/{role_id}/users
|
||||
GET /rbac/user/routes
|
||||
GET /rbac/users
|
||||
GET /rbac/users/{user_id}/roles
|
||||
POST /rbac/users/{user_id}/roles
|
||||
DELETE /rbac/users/{user_id}/roles/{role_id}
|
||||
```
|
||||
|
||||
## [routes] 3
|
||||
|
||||
```text
|
||||
GET /routes
|
||||
GET /routes/{route_id}
|
||||
GET /routes/{route_id}/permissions
|
||||
```
|
||||
|
||||
## [rpc] 1
|
||||
|
||||
```text
|
||||
DELETE,GET,PATCH,POST,PUT /rpc/{rpc_function:path}
|
||||
```
|
||||
|
||||
## [statistics] 3
|
||||
|
||||
```text
|
||||
GET /statistics/home-data
|
||||
GET /statistics/top-error-points
|
||||
GET /statistics/top-risk-users
|
||||
```
|
||||
|
||||
## [storage] 11
|
||||
|
||||
```text
|
||||
DELETE /storage/buckets
|
||||
GET /storage/buckets
|
||||
POST /storage/buckets
|
||||
DELETE /storage/files
|
||||
GET /storage/files
|
||||
POST /storage/files/batch-delete
|
||||
POST /storage/files/copy
|
||||
GET /storage/files/download
|
||||
GET /storage/files/metadata
|
||||
POST /storage/files/move
|
||||
GET /storage/files/presigned-url
|
||||
```
|
||||
|
||||
## [system] 5
|
||||
|
||||
```text
|
||||
GET /system/ai-cloud/status
|
||||
POST /system/ai-cloud/switch
|
||||
GET /system/queue/details
|
||||
GET /system/queue/position/{document_id}
|
||||
GET /system/queue/status
|
||||
```
|
||||
|
||||
## [user] 2
|
||||
|
||||
```text
|
||||
GET /user/routes
|
||||
GET /user/routes/flat
|
||||
```
|
||||
|
||||
## [users] 5
|
||||
|
||||
```text
|
||||
GET /users
|
||||
GET /users/organizations
|
||||
GET /users/organizations/flat
|
||||
GET /users/organizations/tree
|
||||
GET /users/organizations/{ou_id}/users
|
||||
```
|
||||
|
||||
## [v2 admin alias] 51
|
||||
|
||||
```text
|
||||
POST /v2/ai-suggestions/batch-generate
|
||||
POST /v2/ai-suggestions/generate
|
||||
GET /v2/ai-suggestions/task/{task_id}
|
||||
GET /v2/ai-suggestions/{evaluation_result_id}
|
||||
POST /v2/cross_review/proposals
|
||||
POST /v2/cross_review/proposals/details
|
||||
POST /v2/cross_review/proposals/document
|
||||
POST /v2/cross_review/proposals/document/check_pending_votes
|
||||
DELETE /v2/cross_review/proposals/{proposal_id}
|
||||
POST /v2/cross_review/proposals/{proposal_id}/votes
|
||||
POST /v2/cross_review/tasks/user_tasks
|
||||
GET /v2/cross_review/tasks/{task_id}/can-confirm
|
||||
GET /v2/cross_review/tasks/{task_id}/documents
|
||||
POST /v2/cross_review/tasks/{task_id}/documents
|
||||
POST /v2/cross_review/tasks/{task_id}/documents/{document_id}/append_attachments
|
||||
POST /v2/cross_review/tasks/{task_id}/documents/{document_id}/complete
|
||||
GET /v2/cross_review/tasks/{task_id}/progress
|
||||
POST /v2/cross_review/tasks/{task_id}/upload_documents
|
||||
GET /v2/documents/check-duplicate
|
||||
POST /v2/documents/contract_templates/{comparison_id}/append_attachments
|
||||
POST /v2/documents/contracts/{document_id}/append_attachments
|
||||
POST /v2/documents/cross_review/documents/upload_and_assign
|
||||
GET /v2/documents/list
|
||||
POST /v2/documents/upload
|
||||
POST /v2/documents/upload_contract_template
|
||||
POST /v2/qichacha/batch
|
||||
POST /v2/qichacha/company
|
||||
POST /v2/qichacha/dishonesty
|
||||
POST /v2/qichacha/enterprise
|
||||
GET /v2/qichacha/status
|
||||
DELETE /v2/storage/buckets
|
||||
GET /v2/storage/buckets
|
||||
POST /v2/storage/buckets
|
||||
DELETE /v2/storage/files
|
||||
GET /v2/storage/files
|
||||
POST /v2/storage/files/batch-delete
|
||||
POST /v2/storage/files/copy
|
||||
GET /v2/storage/files/download
|
||||
GET /v2/storage/files/metadata
|
||||
POST /v2/storage/files/move
|
||||
GET /v2/storage/files/presigned-url
|
||||
GET /v2/system/ai-cloud/status
|
||||
POST /v2/system/ai-cloud/switch
|
||||
GET /v2/system/queue/details
|
||||
GET /v2/system/queue/position/{document_id}
|
||||
GET /v2/system/queue/status
|
||||
GET /v2/users
|
||||
GET /v2/users/organizations
|
||||
GET /v2/users/organizations/flat
|
||||
GET /v2/users/organizations/tree
|
||||
GET /v2/users/organizations/{ou_id}/users
|
||||
```
|
||||
|
||||
## [versions] 4
|
||||
|
||||
```text
|
||||
POST /versions/compare
|
||||
GET /versions/documents-list
|
||||
GET /versions/statistics
|
||||
GET /versions/{entity_id}
|
||||
```
|
||||
@@ -0,0 +1,683 @@
|
||||
# Fix Double Finalize + Rule Type Bindings API Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Fix two blocking issues: (1) eliminate the duplicate `result_status` / `finished_at` write in `save_evaluation_results`, and (2) add full CRUD API for the `leaudit_rule_type_bindings` table.

**Architecture:** Fix 1 is a small edit in `storage_adapter.py`: strip `result_status` and `finished_at` from the run-summary UPDATE in `save_evaluation_results`, so that `finalize_run` is the single source of truth for terminal state. The score columns stay in that UPDATE. Fix 2 follows the existing RuleController → IRuleService → RuleServiceImpl layered pattern, adding DTO/VO types and 4 endpoints for binding management.

**Tech Stack:** Python, FastAPI, SQLAlchemy async, PostgreSQL

---

## File Map

| File | Action | Responsibility |
|---|---|---|
| `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py` | Modify | Remove the premature `result_status` / `finished_at` write from `save_evaluation_results` |
| `fastapi_modules/fastapi_leaudit/domian/Dto/ruleBindingDto.py` | Create | `RuleBindingCreateDTO`, `RuleBindingUpdateDTO` |
| `fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py` | Modify | Add `RuleBindingVO` |
| `fastapi_modules/fastapi_leaudit/services/ruleService.py` | Modify | Add 4 abstract methods |
| `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py` | Modify | Add 4 method implementations |
| `fastapi_modules/fastapi_leaudit/controllers/ruleController.py` | Modify | Add 4 endpoints |

---

### Task 1: Fix Double Finalize (Strip Premature UPDATE from save_evaluation_results)

**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py:167-181`

- [ ] **Step 1: Remove result_status and finished_at from the UPDATE clause**

Replace lines 167-181 of `storage_adapter.py`:

```python
# Update audit_runs summary
await session.execute(
    text("""UPDATE leaudit_audit_runs SET
        total_score = :ts, passed_count = :pc, failed_count = :fc,
        skipped_count = :sc, result_status = :rs, finished_at = now(), update_time = now()
        WHERE id = :rid"""),
    {
        "ts": evaluation.total_score,
        "pc": evaluation.passed_count,
        "fc": evaluation.failed_count,
        "sc": evaluation.skipped_count,
        "rs": "pass" if evaluation.failed_count == 0 else "fail",
        "rid": resolved_run_id,
    },
)
```

With:

```python
# Update audit_runs summary (scores only; terminal state is set by finalize_run)
await session.execute(
    text("""UPDATE leaudit_audit_runs SET
        total_score = :ts, passed_count = :pc, failed_count = :fc,
        skipped_count = :sc, update_time = now()
        WHERE id = :rid"""),
    {
        "ts": evaluation.total_score,
        "pc": evaluation.passed_count,
        "fc": evaluation.failed_count,
        "sc": evaluation.skipped_count,
        "rid": resolved_run_id,
    },
)
```
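
The effect of the change can be checked in isolation. The sketch below is an illustration only, under stated assumptions: an in-memory SQLite table stands in for `leaudit_audit_runs` (the real table lives in PostgreSQL and has more columns), `CURRENT_TIMESTAMP` replaces `now()`, and `'running'` is a hypothetical pre-terminal status value. It shows that the scores-only UPDATE leaves `result_status` and `finished_at` untouched.

```python
import sqlite3

# Stand-in schema: only the columns touched by the UPDATE. Assumed, not the real DDL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE leaudit_audit_runs (
        id INTEGER PRIMARY KEY,
        total_score REAL, passed_count INTEGER, failed_count INTEGER,
        skipped_count INTEGER, result_status TEXT, finished_at TEXT, update_time TEXT
    )"""
)
# 'running' is a hypothetical status used only for this illustration.
conn.execute("INSERT INTO leaudit_audit_runs (id, result_status) VALUES (1, 'running')")


def save_scores(conn, run_id, total_score, passed, failed, skipped):
    """Scores-only summary update, mirroring the replacement statement above."""
    conn.execute(
        """UPDATE leaudit_audit_runs SET
               total_score = :ts, passed_count = :pc, failed_count = :fc,
               skipped_count = :sc, update_time = CURRENT_TIMESTAMP
           WHERE id = :rid""",
        {"ts": total_score, "pc": passed, "fc": failed, "sc": skipped, "rid": run_id},
    )


save_scores(conn, 1, 87.5, 7, 1, 0)
row = conn.execute(
    "SELECT total_score, failed_count, result_status, finished_at "
    "FROM leaudit_audit_runs WHERE id = 1"
).fetchone()
print(row)
```

The status column keeps whatever value it had before the call, which is the property `finalize_run` relies on to write the terminal state exactly once.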

- [ ] **Step 2: Verify finalize_run is still the last writer in persist_result**

Read `nativeRunner.py:149-157` to confirm `finalize_run` runs after all other persist steps, including `save_evaluation_results`. The order is:

```
save_ocr_result → save_extraction_result → save_evaluation_results → save_run_errors → save_rescue_outcomes → save_run_metrics → finalize_run
```

Confirmed: `finalize_run` is the LAST call in `persist_result()`, so it will always set the definitive terminal state.

- [ ] **Step 3: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py
```

Expected: Compile successful, no errors.

- [ ] **Step 4: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py
git commit -m "fix: remove premature result_status/finished_at from save_evaluation_results

finalize_run() is the single source of truth for terminal run state.
Previously save_evaluation_results wrote a binary pass/fail status and
finished_at BEFORE rescue outcomes/metrics were saved, then finalize_run
overwrote it. Now scores only are written here; terminal state is set
once by finalize_run after all sub-results are persisted."
```

---

### Task 2: Create RuleBinding DTOs

**Files:**
- Create: `fastapi_modules/fastapi_leaudit/domian/Dto/ruleBindingDto.py`

- [ ] **Step 1: Create the DTO file**

```python
"""规则类型绑定 DTO。"""

from pydantic import BaseModel, Field


class RuleBindingCreateDTO(BaseModel):
    """创建规则类型绑定请求。"""

    docTypeId: int = Field(..., description="文档类型ID → leaudit_document_types.id")
    docTypeCode: str | None = Field(None, description="文档类型编码(冗余快速匹配)")
    ruleSetId: int = Field(..., description="规则集ID → leaudit_rule_sets.id")
    bindingMode: str = Field("explicit", description="绑定模式: explicit / wildcard / fallback")
    priority: int = Field(0, description="优先级(数值越大优先级越高)")
    note: str | None = Field(None, description="备注说明")


class RuleBindingUpdateDTO(BaseModel):
    """更新规则类型绑定请求。"""

    isActive: bool | None = Field(None, description="是否激活")
    priority: int | None = Field(None, description="优先级")
    bindingMode: str | None = Field(None, description="绑定模式")
    note: str | None = Field(None, description="备注说明")
```

- [ ] **Step 2: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/domian/Dto/ruleBindingDto.py
```

Expected: Compile successful.

- [ ] **Step 3: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/domian/Dto/ruleBindingDto.py
git commit -m "feat: add RuleBindingCreateDTO and RuleBindingUpdateDTO"
```

---

### Task 3: Add RuleBindingVO to ruleVo.py

**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py`

- [ ] **Step 1: Append RuleBindingVO class at end of file**

Add after line 49 (after the `RuleValidationVO` class):

```python


class RuleBindingVO(BaseModel):
    """规则类型绑定响应。"""

    id: int = Field(..., description="绑定ID")
    docTypeId: int = Field(..., description="文档类型ID")
    docTypeCode: str | None = Field(None, description="文档类型编码")
    ruleSetId: int = Field(..., description="规则集ID")
    ruleType: str | None = Field(None, description="规则类型编码(来自关联查询)")
    ruleName: str | None = Field(None, description="规则集名称(来自关联查询)")
    bindingMode: str = Field(..., description="绑定模式: explicit / wildcard / fallback")
    priority: int = Field(0, description="优先级")
    isActive: bool = Field(True, description="是否激活")
    note: str | None = Field(None, description="备注说明")
```

- [ ] **Step 2: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py
```

Expected: Compile successful.

- [ ] **Step 3: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py
git commit -m "feat: add RuleBindingVO for rule type bindings response"
```

---

### Task 4: Add Binding Methods to IRuleService Interface

**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/services/ruleService.py`

- [ ] **Step 1: Add import for RuleBindingVO at top of file**

Add `RuleBindingVO` to the existing import block (lines 5-10):

```python
from fastapi_modules.fastapi_leaudit.domian.vo.ruleVo import (
    RuleBindingVO,
    RuleContentVO,
    RuleSetVO,
    RuleValidationVO,
    RuleVersionVO,
)
```

- [ ] **Step 2: Add 4 abstract methods before the closing of the class**

Add after the `Rollback` method (before the last blank line of the class):

```python
    @abstractmethod
    async def ListBindings(self, RuleType: str | None = None) -> list[RuleBindingVO]:
        """列出规则类型绑定。可按规则类型过滤。"""
        ...

    @abstractmethod
    async def CreateBinding(
        self,
        DocTypeId: int,
        RuleSetId: int,
        BindingMode: str = "explicit",
        Priority: int = 0,
        DocTypeCode: str | None = None,
        Note: str | None = None,
    ) -> RuleBindingVO:
        """创建规则类型绑定。"""
        ...

    @abstractmethod
    async def UpdateBinding(
        self,
        BindingId: int,
        IsActive: bool | None = None,
        Priority: int | None = None,
        BindingMode: str | None = None,
        Note: str | None = None,
    ) -> RuleBindingVO:
        """更新规则类型绑定。"""
        ...

    @abstractmethod
    async def DeleteBinding(self, BindingId: int) -> None:
        """删除规则类型绑定。"""
        ...
```

- [ ] **Step 3: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/services/ruleService.py
```

Expected: Compile successful.

- [ ] **Step 4: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/services/ruleService.py
git commit -m "feat: add binding CRUD methods to IRuleService interface"
```

---

### Task 5: Implement Binding Methods in RuleServiceImpl

**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`

- [ ] **Step 1: Add RuleBindingVO import**

Add `RuleBindingVO` to the import from `ruleVo` (lines 12-17):

```python
from fastapi_modules.fastapi_leaudit.domian.vo.ruleVo import (
    RuleBindingVO,
    RuleContentVO,
    RuleSetVO,
    RuleValidationVO,
    RuleVersionVO,
)
```

- [ ] **Step 2: Add 4 method implementations after Rollback method (before _SwitchVersion)**

Insert before the `_SwitchVersion` method (before line 342):

```python
    async def ListBindings(self, RuleType: str | None = None) -> list[RuleBindingVO]:
        """列出规则类型绑定,可按规则类型过滤。"""
        async with GetAsyncSession() as Session:
            if RuleType:
                Result = await Session.execute(
                    text(
                        """
                        SELECT
                            b.id,
                            b.doc_type_id,
                            b.doc_type_code,
                            b.rule_set_id,
                            b.binding_mode,
                            b.priority,
                            b.is_active,
                            b.note,
                            rs.rule_type,
                            rs.rule_name
                        FROM leaudit_rule_type_bindings b
                        JOIN leaudit_rule_sets rs ON rs.id = b.rule_set_id
                        WHERE rs.rule_type = :rule_type
                          AND rs.delete_time IS NULL
                        ORDER BY b.priority DESC, b.id DESC
                        """
                    ),
                    {"rule_type": RuleType},
                )
            else:
                Result = await Session.execute(
                    text(
                        """
                        SELECT
                            b.id,
                            b.doc_type_id,
                            b.doc_type_code,
                            b.rule_set_id,
                            b.binding_mode,
                            b.priority,
                            b.is_active,
                            b.note,
                            rs.rule_type,
                            rs.rule_name
                        FROM leaudit_rule_type_bindings b
                        JOIN leaudit_rule_sets rs ON rs.id = b.rule_set_id
                        WHERE rs.delete_time IS NULL
                        ORDER BY rs.rule_type, b.priority DESC, b.id DESC
                        """
                    ),
                )
            return [
                RuleBindingVO(
                    id=int(Row["id"]),
                    docTypeId=int(Row["doc_type_id"]),
                    docTypeCode=Row["doc_type_code"],
                    ruleSetId=int(Row["rule_set_id"]),
                    ruleType=Row["rule_type"],
                    ruleName=Row["rule_name"],
                    bindingMode=Row["binding_mode"],
                    priority=int(Row["priority"]),
                    isActive=bool(Row["is_active"]),
                    note=Row["note"],
                )
                for Row in Result.mappings().all()
            ]

    async def CreateBinding(
        self,
        DocTypeId: int,
        RuleSetId: int,
        BindingMode: str = "explicit",
        Priority: int = 0,
        DocTypeCode: str | None = None,
        Note: str | None = None,
    ) -> RuleBindingVO:
        """创建规则类型绑定。"""
        async with GetAsyncSession() as Session:
            RuleSet = await Session.execute(
                text("SELECT id, rule_type, rule_name FROM leaudit_rule_sets WHERE id = :rid AND delete_time IS NULL LIMIT 1"),
                {"rid": RuleSetId},
            )
            # Read the rule set row once and reuse it: first() closes the result,
            # so a second RuleSet.mappings().first() later on would fail.
            RsRow = RuleSet.mappings().first()
            if not RsRow:
                raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "规则集不存在")

            Existing = await Session.execute(
                text(
                    """
                    SELECT id FROM leaudit_rule_type_bindings
                    WHERE doc_type_id = :dtid AND rule_set_id = :rsid
                    LIMIT 1
                    """
                ),
                {"dtid": DocTypeId, "rsid": RuleSetId},
            )
            if Existing.mappings().first():
                raise LeauditException(StatusCodeEnum.HTTP_409_CONFLICT, "该文档类型已绑定此规则集")

            Result = await Session.execute(
                text(
                    """
                    INSERT INTO leaudit_rule_type_bindings (
                        doc_type_id,
                        doc_type_code,
                        rule_set_id,
                        binding_mode,
                        priority,
                        is_active,
                        note
                    ) VALUES (
                        :doc_type_id,
                        :doc_type_code,
                        :rule_set_id,
                        :binding_mode,
                        :priority,
                        true,
                        :note
                    )
                    RETURNING id, doc_type_id, doc_type_code, rule_set_id,
                              binding_mode, priority, is_active, note
                    """
                ),
                {
                    "doc_type_id": DocTypeId,
                    "doc_type_code": DocTypeCode,
                    "rule_set_id": RuleSetId,
                    "binding_mode": BindingMode,
                    "priority": Priority,
                    "note": Note,
                },
            )
            # Read the RETURNING row before committing.
            Row = Result.mappings().first()
            await Session.commit()
            return RuleBindingVO(
                id=int(Row["id"]),
                docTypeId=int(Row["doc_type_id"]),
                docTypeCode=Row["doc_type_code"],
                ruleSetId=int(Row["rule_set_id"]),
                ruleType=RsRow["rule_type"],
                ruleName=RsRow["rule_name"],
                bindingMode=Row["binding_mode"],
                priority=int(Row["priority"]),
                isActive=bool(Row["is_active"]),
                note=Row["note"],
            )

    async def UpdateBinding(
        self,
        BindingId: int,
        IsActive: bool | None = None,
        Priority: int | None = None,
        BindingMode: str | None = None,
        Note: str | None = None,
    ) -> RuleBindingVO:
        """更新规则类型绑定。"""
        async with GetAsyncSession() as Session:
            Existing = await Session.execute(
                text(
                    """
                    SELECT
                        b.id, b.doc_type_id, b.doc_type_code, b.rule_set_id,
                        b.binding_mode, b.priority, b.is_active, b.note,
                        rs.rule_type, rs.rule_name
                    FROM leaudit_rule_type_bindings b
                    JOIN leaudit_rule_sets rs ON rs.id = b.rule_set_id
                    WHERE b.id = :bid
                    LIMIT 1
                    """
                ),
                {"bid": BindingId},
            )
            Row = Existing.mappings().first()
            if not Row:
                raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "绑定记录不存在")

            # Only fields that were actually supplied (not None) are updated.
            SetClauses: list[str] = []
            Params: dict[str, object] = {"bid": BindingId}

            if IsActive is not None:
                SetClauses.append("is_active = :is_active")
                Params["is_active"] = IsActive
            if Priority is not None:
                SetClauses.append("priority = :priority")
                Params["priority"] = Priority
            if BindingMode is not None:
                SetClauses.append("binding_mode = :binding_mode")
                Params["binding_mode"] = BindingMode
            if Note is not None:
                SetClauses.append("note = :note")
                Params["note"] = Note

            if SetClauses:
                SetClauses.append("update_time = now()")
                await Session.execute(
                    text(f"UPDATE leaudit_rule_type_bindings SET {', '.join(SetClauses)} WHERE id = :bid"),
                    Params,
                )
                await Session.commit()

            # Re-read the row so the response reflects the stored values.
            Result = await Session.execute(
                text(
                    """
                    SELECT
                        b.id, b.doc_type_id, b.doc_type_code, b.rule_set_id,
                        b.binding_mode, b.priority, b.is_active, b.note,
                        rs.rule_type, rs.rule_name
                    FROM leaudit_rule_type_bindings b
                    JOIN leaudit_rule_sets rs ON rs.id = b.rule_set_id
                    WHERE b.id = :bid
                    LIMIT 1
                    """
                ),
                {"bid": BindingId},
            )
            Row = Result.mappings().first()
            return RuleBindingVO(
                id=int(Row["id"]),
                docTypeId=int(Row["doc_type_id"]),
                docTypeCode=Row["doc_type_code"],
                ruleSetId=int(Row["rule_set_id"]),
                ruleType=Row["rule_type"],
                ruleName=Row["rule_name"],
                bindingMode=Row["binding_mode"],
                priority=int(Row["priority"]),
                isActive=bool(Row["is_active"]),
                note=Row["note"],
            )

    async def DeleteBinding(self, BindingId: int) -> None:
        """删除规则类型绑定。"""
        async with GetAsyncSession() as Session:
            Result = await Session.execute(
                text("DELETE FROM leaudit_rule_type_bindings WHERE id = :bid"),
                {"bid": BindingId},
            )
            await Session.commit()
            if Result.rowcount == 0:
                raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "绑定记录不存在")
```
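
The partial-update logic in `UpdateBinding` can be exercised on its own. The following is a minimal sketch under stated assumptions, not the service code: `build_update` and the `bindings` table are hypothetical names, and SQLite stands in for PostgreSQL. It shows the same idea of emitting a SET clause only for supplied fields, with column names taken from a fixed whitelist and values always bound as parameters.

```python
import sqlite3

# Whitelist: keyword name -> column name. Column names never come from user input.
UPDATABLE = {
    "is_active": "is_active",
    "priority": "priority",
    "binding_mode": "binding_mode",
    "note": "note",
}


def build_update(binding_id, **fields):
    """Return (sql, params) for a partial update, or None when nothing was supplied."""
    params = {"bid": binding_id}
    clauses = []
    for name, value in fields.items():
        if name not in UPDATABLE:
            raise ValueError(f"unknown field: {name}")
        if value is not None:  # None means "not provided", as in UpdateBinding
            clauses.append(f"{UPDATABLE[name]} = :{name}")
            params[name] = value
    if not clauses:
        return None
    return f"UPDATE bindings SET {', '.join(clauses)} WHERE id = :bid", params


# Hypothetical stand-in table with one seeded row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bindings (id INTEGER PRIMARY KEY, is_active INTEGER, "
    "priority INTEGER, binding_mode TEXT, note TEXT)"
)
conn.execute("INSERT INTO bindings VALUES (1, 1, 0, 'explicit', 'seed')")

stmt = build_update(1, priority=5, note=None)
if stmt:
    conn.execute(*stmt)
updated = conn.execute("SELECT priority, note FROM bindings WHERE id = 1").fetchone()
print(stmt[0], updated)
```

One consequence of treating `None` as "not provided" is that a caller cannot reset `note` to NULL through this path; if clearing a note is needed, the request model would have to distinguish "absent" from "explicitly null".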

- [ ] **Step 3: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py
```

Expected: Compile successful.

- [ ] **Step 4: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py
git commit -m "feat: implement binding CRUD in RuleServiceImpl"
```

---

### Task 6: Add Binding Endpoints to RuleController

**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/controllers/ruleController.py`

- [ ] **Step 1: Add imports for new types**

Update the imports (lines 3-16) to include binding DTOs and VO:

```python
"""规则管理控制器。"""

from fastapi_common.fastapi_common_web.controller import BaseController
from fastapi_common.fastapi_common_web.domain.responses import Result

from fastapi_modules.fastapi_leaudit.domian.Dto.ruleBindingDto import (
    RuleBindingCreateDTO,
    RuleBindingUpdateDTO,
)
from fastapi_modules.fastapi_leaudit.domian.Dto.rulePublishDto import RulePublishDTO
from fastapi_modules.fastapi_leaudit.domian.Dto.ruleValidateDto import RuleValidateDTO
from fastapi_modules.fastapi_leaudit.domian.Dto.ruleVersionCreateDto import RuleVersionCreateDTO
from fastapi_modules.fastapi_leaudit.domian.vo.ruleVo import (
    RuleBindingVO,
    RuleContentVO,
    RuleSetVO,
    RuleValidationVO,
    RuleVersionVO,
)
from fastapi_modules.fastapi_leaudit.services import IRuleService
from fastapi_modules.fastapi_leaudit.services.impl.ruleServiceImpl import RuleServiceImpl
```

- [ ] **Step 2: Add 4 endpoint definitions inside __init__**

Add after the `RollbackRuleVersion` endpoint (after line 82, before the closing of `__init__`):

```python
        # ── Rule Type Bindings ──────────────────────────────────────

        @self.router.get("/bindings", response_model=Result[list[RuleBindingVO]])
        async def ListBindings(ruleType: str | None = None):
            """列出规则类型绑定。可按规则类型过滤。"""
            Data = await self.RuleService.ListBindings(RuleType=ruleType)
            return Result.success(data=Data)

        @self.router.post("/{RuleType}/bindings", response_model=Result[RuleBindingVO])
        async def CreateBinding(RuleType: str, body: RuleBindingCreateDTO):
            """创建规则类型绑定。"""
            Data = await self.RuleService.CreateBinding(
                DocTypeId=body.docTypeId,
                RuleSetId=body.ruleSetId,
                BindingMode=body.bindingMode,
                Priority=body.priority,
                DocTypeCode=body.docTypeCode,
                Note=body.note,
            )
            return Result.success(data=Data)

        @self.router.put("/bindings/{BindingId}", response_model=Result[RuleBindingVO])
        async def UpdateBinding(BindingId: int, body: RuleBindingUpdateDTO):
            """更新规则类型绑定。"""
            Data = await self.RuleService.UpdateBinding(
                BindingId=BindingId,
                IsActive=body.isActive,
                Priority=body.priority,
                BindingMode=body.bindingMode,
                Note=body.note,
            )
            return Result.success(data=Data)

        @self.router.delete("/bindings/{BindingId}", response_model=Result[None])
        async def DeleteBinding(BindingId: int):
            """删除规则类型绑定。"""
            await self.RuleService.DeleteBinding(BindingId=BindingId)
            return Result.success()
```
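
One thing worth checking when adding these routes after the existing ones: FastAPI matches path operations in registration order, so a static path such as `/bindings` can be shadowed by an earlier parameterised route with the same method and segment count. Whether `ruleController.py` actually has such a route is not known from this plan; the `GET /{RuleType}` route below is purely hypothetical, and the matcher is a simplified stand-in for the framework's router, not its real implementation.

```python
import re


def compile_path(path):
    """Turn '/{RuleType}/x' into a regex; each '{name}' matches one path segment."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def first_match(routes, method, url):
    """Return the first registered route that matches, mimicking first-match routing."""
    for route_method, path in routes:
        if route_method == method and compile_path(path).match(url):
            return path
    return None


# Hypothetical pre-existing route registered before the new binding route.
routes_existing_first = [("GET", "/{RuleType}"), ("GET", "/bindings")]
# Same routes, with the static path registered first.
routes_bindings_first = [("GET", "/bindings"), ("GET", "/{RuleType}")]

shadowed = first_match(routes_existing_first, "GET", "/bindings")
direct = first_match(routes_bindings_first, "GET", "/bindings")
print(shadowed, direct)
```

If the existing controller does contain a colliding route, registering the `/bindings` endpoints before it (or giving them a distinct prefix) avoids the shadowing; if it does not, the placement described in Step 2 is fine as written.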

- [ ] **Step 3: Syntax check**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -m compileall fastapi_modules/fastapi_leaudit/controllers/ruleController.py
```

Expected: Compile successful.

- [ ] **Step 4: Commit**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add fastapi_modules/fastapi_leaudit/controllers/ruleController.py
git commit -m "feat: add rule type binding CRUD endpoints to RuleController"
```

---

### Task 7: Verification (Cross-Module Import Check)

**Files:** None (verification only)

- [ ] **Step 1: Verify all modified modules import together**

```bash
cd /home/wren-dev/Porject/leaudit-platform && python -c "
from fastapi_modules.fastapi_leaudit.domian.Dto.ruleBindingDto import RuleBindingCreateDTO, RuleBindingUpdateDTO
from fastapi_modules.fastapi_leaudit.domian.vo.ruleVo import RuleBindingVO
from fastapi_modules.fastapi_leaudit.services.ruleService import IRuleService
from fastapi_modules.fastapi_leaudit.services.impl.ruleServiceImpl import RuleServiceImpl
from fastapi_modules.fastapi_leaudit.leaudit_bridge.storage_adapter import StorageAdapter
print('All imports OK')
"
```

Expected: `All imports OK`

- [ ] **Step 2: Verify the double finalize fix by confirming finalize_run is the only terminal state writer**

```bash
cd /home/wren-dev/Porject/leaudit-platform && grep -n "result_status\|finished_at" fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py
```

Expected output should show `result_status` and `finished_at` ONLY in `finalize_run` and `fail_run`, NOT in `save_evaluation_results`.

- [ ] **Step 3: Commit final verification**

```bash
cd /home/wren-dev/Porject/leaudit-platform
git add -A
git diff --cached --stat
# Commit only if something is staged; Tasks 1-6 already committed their own files,
# and a plain `git commit` exits with an error when there is nothing to commit.
git diff --cached --quiet || git commit -m "chore: verify cross-module imports and finalize consistency"
```
@@ -0,0 +1,306 @@
|
||||
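# LeAudit YAML Rule Online Editing Design

## 1. Background

The `LeAudit` core currently runs its audit pipeline with local YAML/DSL files as the rule input. At this stage the **`leaudit` core is not modified**, but the platform side needs to gradually gain the following capabilities:

- View YAML rule content online
- Edit YAML rules online
- Validate rule syntax and DSL semantics
- Save rules with versioning
- Publish, roll back, and audit rules

`leaudit-platform` therefore needs a "**source of truth for rule management**", so that the platform can support a rule admin backend while the runtime stays compatible with `leaudit`'s existing behavior of reading only local YAML.

---
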
## 2. Conclusion

**The `leaudit` core can stay unchanged while the DSL YAML is stored in OSS and metadata such as path, version, and hash is stored in the database.**

At runtime the bridge layer performs a one-time "remote rule → local temporary file" conversion:

```text
Read rule version info from the database
  → get the oss_url of rules.yaml
  → download from OSS to a local temporary file
  → call leaudit.dsl.loader.load_rules_file(local_tmp_path)
  → hand off to the native leaudit execution chain
```

In other words:

- **Source of truth for rules**: OSS + database
- **Execution carrier**: local temporary YAML file
- **LeAudit input interface**: unchanged

Under the current constraints this is the most robust option and the easiest one to evolve.

---

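## 3. Why the local `rules/` directory cannot remain the official source of truth

If a YAML online editing UI is opened up in the future, the local-directory approach quickly runs into problems:

### 3.1 Poor fit for online editing

- Content edited in the frontend still has to be written back to the server directory by hand
- With multi-instance deployment, it has to be synced to several machines
- With containerized deployment, local files may not be a stable persistence layer

### 3.2 Poor fit for version management

- It is hard to record exactly which rule version a given audit run used
- Once `rules.yaml` at the same path is overwritten, historical runs are hard to trace
- Rollback usually degrades into "replacing files by hand"

### 3.3 Poor fit for auditing and permissions

- Who changed a rule, when, and why it was published are hard to turn into a formal audit trail
- It cannot naturally carry an "edit / review / publish / roll back" permission workflow

### 3.4 Poor fit for multi-instance consistency

- API node A and Worker node B may read different versions of the local files
- After scaling out, every node has to sync the rules directory, which is operationally expensive

The local `rules/` directory is therefore better kept as:

- A seed source for importing rules
- An emergency rollback backup
- A local debugging resource in development environments

It should no longer serve as the official source of truth for rules.

---
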
## 4. Why the "OSS + DB + local temporary file" model

This approach serves two goals at once: **leaving the core untouched** and **platform-level management**.

### 4.1 Zero intrusion into `leaudit`

`leaudit` still reads local YAML files; its parser, executor, and DSL loading logic need no changes.

### 4.2 Supports an online editing UI

After the frontend submits YAML text, the platform can run a standard flow:

```text
Edit
  → save draft
  → syntax validation
  → DSL semantic validation
  → upload to OSS
  → write to the rule version table
  → publish / roll back
```

This turns rules into "platform-managed assets" rather than "files on a server disk".

### 4.3 Traceable rule versions

Every audit run can record:

- `rule_set_id`
- `rule_version_id`
- `rule_source_oss_url`
- `rule_source_sha256`

This makes it possible to answer precisely:

- Which rule version produced this result?
- Has the rule file been tampered with?
- Can the run be replayed against a historical version?

### 4.4 Simple publish and rollback

- Publish: switch `leaudit_rule_sets.current_version_id`
- Roll back: switch back to the old version ID

There is no need to log in to a server and replace files in a directory, and no application release is required.

### 4.5 Multi-instance consistency

All API / Worker nodes fetch rules from the same DB + OSS source of truth and no longer depend on whether local directories are in sync.

---

## 5. Suggested system layering

### 5.1 Source of truth

- **OSS**: stores the `rules.yaml` file body
- **Database**: stores metadata such as rule sets, rule versions, bindings, publish status, hashes, editors, and publish times

### 5.2 Execution layer

- The bridge layer downloads the OSS file to a local temporary path
- The temporary path is passed to `leaudit.dsl.loader.load_rules_file()`

### 5.3 Fallback layer

- The local `rules/` directory is kept as a fallback or emergency backup
- It can be used temporarily when OSS is unavailable or when some historical rules have not been migrated yet

---

## 6. Online editing feature design

### 6.1 Target capabilities

The platform should gradually provide the following features:

- List rule sets
- View rule version history
- View the YAML content of a version
- Edit YAML online
- Save draft versions
- Validate YAML syntax
- Validate LeAudit DSL semantics
- Publish a specific version
- Roll back to a historical version
- View publish logs and validation logs

### 6.2 Recommended flows

#### Edit and save

```text
Frontend submits YAML text
  → backend validates YAML syntax
  → backend validates LeAudit DSL semantics
  → generate a new version number / version_seq
  → upload rules.yaml to OSS
  → write leaudit_rule_versions
  → return version info
```

#### Publish

```text
Select a version to publish
  → update leaudit_rule_sets.current_version_id
  → record a publish log entry
  → clear the rule cache
  → subsequent new runs automatically use the new version
```

#### Roll back

```text
Select an old version
  → switch current_version_id to the old version
  → write a rollback log entry
  → clear the rule cache
```
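
Publish and rollback are the same operation: moving a pointer and logging the move. The sketch below illustrates that idea under stated assumptions: SQLite stands in for the real database, the tables carry only the columns needed here, and `publish_log` and `switch_version` are hypothetical names. Cache clearing is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE leaudit_rule_sets (id INTEGER PRIMARY KEY, current_version_id INTEGER);
    CREATE TABLE leaudit_rule_versions (id INTEGER PRIMARY KEY, rule_set_id INTEGER, version_seq INTEGER);
    -- publish_log is a hypothetical table name used only in this sketch
    CREATE TABLE publish_log (rule_set_id INTEGER, from_version INTEGER, to_version INTEGER, action TEXT);
    INSERT INTO leaudit_rule_sets VALUES (1, NULL);
    INSERT INTO leaudit_rule_versions VALUES (10, 1, 1), (11, 1, 2);
    """
)


def switch_version(conn, rule_set_id, version_id, action):
    """Point the rule set at version_id and log the switch in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        owner = conn.execute(
            "SELECT rule_set_id FROM leaudit_rule_versions WHERE id = ?", (version_id,)
        ).fetchone()
        if owner is None or owner[0] != rule_set_id:
            raise ValueError("version does not belong to this rule set")
        (previous,) = conn.execute(
            "SELECT current_version_id FROM leaudit_rule_sets WHERE id = ?", (rule_set_id,)
        ).fetchone()
        conn.execute(
            "UPDATE leaudit_rule_sets SET current_version_id = ? WHERE id = ?",
            (version_id, rule_set_id),
        )
        conn.execute(
            "INSERT INTO publish_log VALUES (?, ?, ?, ?)",
            (rule_set_id, previous, version_id, action),
        )


switch_version(conn, 1, 10, "publish")
switch_version(conn, 1, 11, "publish")
switch_version(conn, 1, 10, "rollback")

current = conn.execute(
    "SELECT current_version_id FROM leaudit_rule_sets WHERE id = 1"
).fetchone()[0]
log = conn.execute(
    "SELECT from_version, to_version, action FROM publish_log ORDER BY rowid"
).fetchall()
print(current, log)
```

Because no rule file is rewritten, old versions stay intact in OSS and a rollback is just another entry in the log.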

---

## 7. Runtime loading design

### 7.1 Core principle

At runtime `leaudit` neither reads from OSS directly nor reads rule text from the database directly; the bridge adapts both in one place.

### 7.2 Recommended loading chain

```text
document/type determined
  → look up the rule set in leaudit_rule_type_bindings
  → leaudit_rule_sets.current_version_id
  → leaudit_rule_versions.oss_url
  → download the OSS file to a local temporary directory
  → verify sha256
  → load_rules_file(local_path)
  → run the leaudit pipeline
```
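
The "verify sha256, then hand over a local path" part of the chain can be sketched with the standard library alone. This is an illustration under stated assumptions: `materialize_rules` is a hypothetical helper, the byte string stands in for what an OSS download would return, and the real bridge would pass the resulting path to `load_rules_file` instead of reading it back.

```python
import hashlib
import tempfile
from pathlib import Path


def materialize_rules(content: bytes, expected_sha256: str, workdir: Path) -> Path:
    """Verify downloaded bytes against the recorded hash, then write rules.yaml."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256:
        # Refuse to write anything the engine could pick up by mistake.
        raise ValueError(f"rules.yaml hash mismatch: expected {expected_sha256}, got {actual}")
    local_path = workdir / "rules.yaml"
    local_path.write_bytes(content)
    return local_path


# Stand-in for the bytes an OSS download would return.
payload = b"rules: []\n"
recorded_hash = hashlib.sha256(payload).hexdigest()  # would come from the version row

with tempfile.TemporaryDirectory() as tmp:
    path = materialize_rules(payload, recorded_hash, Path(tmp))
    loaded = path.read_bytes()  # the real bridge would call load_rules_file(path) here
    try:
        materialize_rules(b"tampered", recorded_hash, Path(tmp))
        rejected = False
    except ValueError:
        rejected = True

print(loaded, rejected)
```

Checking the hash before the file is written means a corrupted or tampered download never reaches the path that the engine loads.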

#### 7.2.1 Current implementation status

The current bridge has started to follow this route, split into two symmetric chains:

- Document file chain:
  - `leaudit_document_files.local_path / oss_url`
  - Download or read the file, then materialize it as a local temporary file
  - Pass it to the native `AuditCtx.file_path`
- Rule file chain:
  - `leaudit_rule_type_bindings`
  - `leaudit_rule_sets.current_version_id`
  - `leaudit_rule_versions.oss_url`
  - Download to a local temporary `rules.yaml`
  - `RulesLoader.load(local_path)`
  - `NativeRunner -> AuditService.audit(ctx)`

This means that when the YAML online editing UI is opened up later, the `leaudit` core does not need to change; it is enough to keep maintaining the "OSS + DB + local temporary file" bridge chain.

### 7.3 Why the "local temporary file" must be kept

Because the current constraints are:

- The `leaudit` core is not modified
- `leaudit` still takes a local path as its DSL loading input

So the local temporary file is not a step backwards but a necessary compatibility layer.

---

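## 8. Consistency with existing documents

This approach is consistent with the design direction of the documents currently under `docs/leaudit`:

- `docs/leaudit/dsl_rule_schema_design.md`
  - Already proposes "source of truth for rules = OSS file + database index"
  - Already proposes "runtime DB → OSS → local temporary YAML → LeAudit loader"
- `docs/leaudit/bridge_directory_design.md`
  - Already specifies that the bridge is responsible for rule loading and caching
- `docs/leaudit/processing_logic.md`
  - Already specifies that rules resolve is a bridge-layer responsibility

The focus of this document is to explain, on its own, "**why this design is required in order to open up a YAML editing UI in the future**".

---
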
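## 9. Recommendations for the current project

### 9.1 Short term

- Keep the local `rules/` directory usable so that the existing flow keeps running
- Treat it as a fallback, not as the long-term official source of truth

### 9.2 Medium term

- Add endpoints for viewing / editing / saving / publishing rule content
- Complete OSS file upload and version switching for `leaudit_rule_versions`
- Add a unified OSS client with presign / upload / version publish capabilities

### 9.3 Long term

- Provide a full online YAML editor in the admin backend
- Support drafts, publishing, rollback, and auditing
- Remove hard-coded local rule mappings and go through the rule binding table everywhere

---
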
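## 10. Final conclusion

If a YAML editing UI is to be opened up in the future, the most suitable rule architecture for the current project is not "keep relying on the local directory" but:

- **OSS stores the rule file body**
- **The database stores path, version, hash, status, and publish information**
- **At runtime the file is downloaded to a local temporary file and handed to `leaudit` for execution**

This guarantees that:

- `leaudit` is not modified
- The existing DSL loading mechanism stays compatible

And also that:

- Online editing is convenient
- Version management is clear
- Publish and rollback are simple
- Multiple instances stay consistent
- Run results are auditable and traceable

This is the most reasonable approach as the project evolves toward a "rule operations platform".
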
@@ -0,0 +1,296 @@
|
||||
# Why a Bridge Adapter Layer Is Still Needed

## 1. Conclusion first

Even though the project has already confirmed that:

- The native `leaudit` `AuditCtx` should be used going forward
- The platform should no longer hand-write the main pipeline orchestration
- The official execution entry point should converge on `AuditService.audit(ctx)`

**the Bridge / adapter layer still has to be kept.**

The reason is not that the native CTX goes unused. On the contrary:

> **Using the native CTX correctly is exactly why it should be encapsulated in the Bridge.**

In other words, the correct architecture is not:

```text
Platform layer -> calls leaudit AuditCtx / AuditService directly
```

but:

```text
Platform layer -> Bridge adapter layer -> leaudit AuditCtx / AuditService
```

---

## 2. Clearing up a misconception

A common misconception goes like this:

- `leaudit` already has a native `AuditCtx`
- So the platform can just call it directly
- The Bridge seems unnecessary

This looks like a structural simplification, but it actually couples the platform deeply to `leaudit` and raises the long-term maintenance cost.

**What should be removed:**

- The platform re-implementing the 7-stage orchestration

**What should not be removed:**

- The formal boundary layer between the platform and `leaudit`

So:

- **Do not orchestrate the pipeline yourself**
- **But keep the adapter layer**

---

## 3. What exactly the adapter layer adapts

In essence, the Bridge translates the "platform world" into the "engine world", and then translates "engine results" back into the "platform world".

### 3.1 Platform world

The objects the platform actually cares about are:

- `document_id`
- `document_file_id`
- `rule_set_id`
- `rule_version_id`
- `oss_url`
- `run_id`
- User trigger information
- Permission information
- Database records
- Frontend DTO / VO

### 3.2 Engine world

The objects that native `leaudit` execution cares about are:

- `file_path`
- `rules_file`
- `AuditServices`
- `AuditConfig`
- `AuditCtx`
- `AuditService.audit(ctx)`

These two conceptual systems are not the same, so a translation layer is inherently needed.

---

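## 4. Why the platform layer must not touch native leaudit objects directly

### 4.1 It erodes the architectural boundary

If the Controller / Service / Model layers construct `AuditCtx` directly, then:

- Platform business code starts depending on `leaudit` directly
- `leaudit`'s internal concepts leak into the whole project
- Any later change to a native field spreads into the platform layer

This breaks the boundary principles the project has always emphasized:

- The platform layer does not see `leaudit` kernel details directly
- `leaudit_bridge/` is the only official bridging layer

---

### 4.2 It mixes platform logic with engine logic

The platform side still has to handle the following:

- Where document files come from
- How OSS files are downloaded
- Where rule versions are looked up
- How a run is created
- How a run's status is updated
- How results are written back to the `leaudit_*` tables
- How the frontend queries results

None of these are responsibilities of the native `leaudit` CTX.

If the platform layer touches `AuditCtx` directly, these platform and engine responsibilities end up in the same service, and the structure gets messier over time.

---

### 4.3 It increases the risk of future upgrades

If `leaudit` is upgraded later:

- `AuditCtx` fields change
- The `AuditService` signature changes
- The way `AuditServices` is assembled changes
- `AuditConfig` gains configuration options

then:

- If only the bridge knows these objects, the change is small
- If many places in the platform layer depend on them directly, the change spreads across the whole project

The value of the adapter layer is therefore:

> Locking `leaudit` changes inside the boundary layer.

---
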
## 5. The correct responsibilities of the Bridge

Once the native CTX is adopted, the Bridge's responsibilities should be redefined as follows.

## 5.1 Input adaptation

- Find the document to process from `document_id`
- Find the file's source of truth from `document_file_id`
- If needed, download the document from OSS to a local temporary path
- Find the rule version for this audit from `type_id` / `rule_type_binding`
- Download the rule YAML from OSS
- Parse it into a `RulesFile`

## 5.2 Run assembly

- Create `AuditServices`
- Create `AuditConfig`
- Create the native `AuditCtx`
- Call `AuditService.audit(ctx)`

## 5.3 Output adaptation

- Read from the final `ctx`:
  - `normalized_doc`
  - `extraction`
  - `phase`
  - `evaluation`
  - `fallback_tasks`
  - `timing`
- Write to:
  - `leaudit_audit_runs`
  - `leaudit_rule_results`
  - `leaudit_field_results`
  - `leaudit_artifacts`
  - `leaudit_run_metrics`
  - `leaudit_run_errors`

## 5.4 Boundary protection

- Platform Controllers / Services do not import `leaudit.services.*` directly
- Only `leaudit_bridge/` knows about the native `AuditCtx`, `AuditService`, and `AuditServices`
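
The import rule in 5.4 can be checked mechanically. The following is a minimal sketch, not part of the project: it parses Python source with the standard `ast` module and reports imports of `leaudit.services`. The function name `find_forbidden_imports` and both sample sources are hypothetical, and the check deliberately ignores the `from leaudit import services` form to stay short.

```python
import ast

FORBIDDEN_PREFIX = "leaudit.services"  # the package named in the boundary rule


def find_forbidden_imports(source: str) -> list[str]:
    """Return imported module names in `source` that cross the boundary."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            if name == FORBIDDEN_PREFIX or name.startswith(FORBIDDEN_PREFIX + "."):
                hits.append(name)
    return hits


# Hypothetical platform-layer source that violates the rule.
bad_source = "from leaudit.services.audit_ctx import AuditCtx\n"
# Hypothetical platform-layer source that goes through the bridge instead.
good_source = "from fastapi_modules.fastapi_leaudit.leaudit_bridge import tasks\n"
```

A check like this could run in CI over every file outside `leaudit_bridge/`, so that a violation fails the build instead of relying on code review.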
|
||||
|
||||
---
|
||||
|
||||
## 6. Three-layer structure diagram

The recommended structure has three layers, not two:

```text
┌────────────────────────────────────────────┐
│               Platform layer               │
│  Controller / Service / Model / API / DB   │
│  docs, rules, permissions, tasks, results  │
└──────────────────┬─────────────────────────┘
                   │
                   ▼
┌────────────────────────────────────────────┐
│            Bridge adapter layer            │
│  file resolve / rules resolve / ctx build  │
│  AuditServices / AuditConfig / AuditCtx    │
│  persist ctx outputs to leaudit_*          │
└──────────────────┬─────────────────────────┘
                   │
                   ▼
┌────────────────────────────────────────────┐
│        leaudit native kernel layer         │
│  AuditCtx / AuditService / Evaluation /    │
│  Extraction / Rescue / DSL loader          │
└────────────────────────────────────────────┘
```

The key points of this structure are:

- The platform layer does not touch `leaudit` details
- `leaudit` is not aware of the platform database or OSS
- All translation work is concentrated in the Bridge
|
||||
|
||||
---
|
||||
|
||||
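## 7. Why this is more robust than having the platform call the native CTX directly

### Advantage 1: clear boundaries

- The platform handles business logic only
- `leaudit` handles auditing only
- The Bridge handles adaptation only

### Advantage 2: controlled change

- When `leaudit` is upgraded, change the Bridge
- The platform structure stays largely untouched

### Advantage 3: easy to replace

If the audit engine changes in the future:

- The platform does not need a full rework
- Only the adapter layer implementation is replaced

### Advantage 4: easier testing

The Bridge can be tested on its own:

- Is `AuditCtx` assembled correctly
- Is `AuditService.audit(ctx)` called correctly
- Are results written back to the platform result tables correctly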
|
||||
|
||||
---
|
||||
|
||||
## 8. Direct requirements for the current project

It has been confirmed that:

- The native `AuditCtx` must be used
- The platform must stop orchestrating the main flow itself

The current project should therefore be corrected as follows.

### Directions no longer recommended

- Continuing to hand-write the 7-stage flow in `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py`
- Continuing to chain the following directly on the platform side:
  - OCR
  - Extract
  - Phase
  - Evaluate
  - Rescue

### Recommended directions

- Turn `pipeline.py` into a thin wrapper
- Add `audit_ctx_builder.py`
- Add `audit_service_factory.py`
- Let the bridge handle, in one place:
  - build ctx
  - run audit
  - persist result
|
||||
|
||||
---
|
||||
|
||||
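## 9. Final conclusion

Adopting the native `AuditCtx` does not make the Bridge useless. On the contrary, it shows that:

> **The Bridge should not rewrite orchestration, but it must adapt the platform to the native orchestration.**

The principle to hold on to is therefore:

- **Main flow execution: left to the native `AuditService.audit(ctx)` in `leaudit`**
- **Platform boundary control: left to `leaudit_bridge/`**

This is the safest and most maintainable long-term approach for the current project.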
|
||||
@@ -0,0 +1,610 @@
|
||||
# Refactoring plan for native AuditCtx integration

## 1. Goal

After checking the source under `/home/wren-dev/Porject/leaudit/src`, the official execution model of `leaudit` should be regarded as:

- `AuditCtx`
- `AuditServices`
- `AuditConfig`
- `AuditService.audit(ctx)`

`leaudit-platform` should therefore stop hand-writing the main flow orchestration on the platform side and be refactored into:

```text
Platform layer
  → Bridge adapter layer
    → build AuditServices
    → build AuditConfig
    → build AuditCtx
    → call AuditService.audit(ctx)
    → persist ctx outputs
  → return run / result
```

This document explains:

- Why the refactor is needed
- The target architecture after the refactor
- How existing files migrate
- Which new files are suggested
- How to implement in phases, without overturning the existing code in one go
|
||||
|
||||
---
|
||||
|
||||
## 2. Key conclusions

## 2.1 The native CTX must be used

This can now be stated firmly:

- Use `leaudit.services.audit_ctx.AuditCtx` from now on
- Run the main flow through `leaudit.services.audit_service.AuditService.audit(ctx)`
- The platform should stop chaining the following by hand:
  - OCR
  - Extract
  - Phase
  - Evaluate
  - Rescue
  - Persist

## 2.2 But the Bridge cannot be removed

The Bridge remains a required, official boundary layer.

Its responsibilities become:

- Platform objects -> native `AuditCtx`
- Native `ctx` -> platform database results

In other words:

- **Do not orchestrate yourself**
- **But do adapt yourself**
|
||||
|
||||
---
|
||||
|
||||
## 3. Differences between the current and target implementations

## 3.1 Current implementation (transitional)

Today `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py` acts as a platform-side orchestrator:

```text
file_path + rules_file
  → OCR
  → Extraction
  → Phase detection
  → Evaluation
  → Save results
```

Characteristics:

- Calls low-level engine / extraction / evaluation functions directly
- Part of the stage logic is maintained by the platform
- Rescue / finalize semantics are not fully aligned with the native service layer
- Result writing is tightly coupled with run orchestration

## 3.2 Target implementation (official)

The target should be:

```text
document + file + rule version
  → build AuditServices
  → build AuditConfig
  → build AuditCtx
  → await AuditService.audit(ctx)
  → persist final ctx
```

Characteristics:

- The main flow is handed to the native `leaudit` orchestrator
- The Bridge only adapts and persists
- The platform no longer duplicates `leaudit` stage orchestration semantics
- Later upgrades carry less risk
|
||||
|
||||
---
|
||||
|
||||
## 4. Target architecture diagram

```text
┌────────────────────────────────────────────┐
│               Platform layer               │
│  Controller / Service / ORM / API / DB     │
│  upload, rule mgmt, task trigger, results  │
└──────────────────┬─────────────────────────┘
                   │
                   ▼
┌────────────────────────────────────────────┐
│            Bridge adapter layer            │
│                                            │
│  1. resolve document file                  │
│  2. resolve rule version                   │
│  3. download doc / yaml to local temp      │
│  4. build AuditServices / AuditConfig      │
│  5. build AuditCtx                         │
│  6. call AuditService.audit(ctx)           │
│  7. persist ctx outputs to leaudit_*       │
└──────────────────┬─────────────────────────┘
                   │
                   ▼
┌────────────────────────────────────────────┐
│        leaudit native service layer        │
│  AuditCtx / AuditServices / AuditService   │
│  normalization / extraction / evaluation   │
│  rescue / finalize                         │
└────────────────────────────────────────────┘
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Bridge responsibility boundaries after the refactor

## 5.1 What the Bridge is responsible for

- Finding the document and file version for this run in the platform database
- Finding the currently effective rule version in the platform rule tables
- Downloading the document file and rule file from OSS to local temporary paths
- Constructing `AuditServices`
- Constructing `AuditConfig`
- Constructing the native `AuditCtx`
- Calling `AuditService.audit(ctx)`
- Persisting the results in the final `ctx` to `leaudit_*`

### 5.1.1 The chain already implemented in code

The project has already completed the first batch of skeleton integration:

- Document file:
  - `auditServiceImpl.py` resolves the file source from `LeauditDocumentFile`
  - `fileSourceResolver.py` already supports `localPath` and `ossUrl`
  - `tasks.py` always writes the document to a local temporary file before execution
- Rule file:
  - `auditServiceImpl.py` locks `ruleVersionId` / `ruleSourceOssUrl` when it creates a `LeauditAuditRun`
  - `ruleVersionResolver.py` resolves the rule version source by `run_id`
  - `tasks.py` already supports `OSS URL -> local temporary YAML -> RulesLoader -> NativeRunner`

So the bridge is no longer just a "theoretical adapter layer": at runtime it is what converts platform-side DB/OSS semantics into the local file inputs that native `leaudit` can consume.

## 5.2 What the Bridge is no longer responsible for

- Hand-defining the order of the 7 stages
- Owning the business semantics of phase detection and rescue scheduling
- Maintaining a second orchestration that duplicates the native service layer

## 5.3 What the platform layer should not do directly

- Construct `AuditCtx`
- Call `AuditService.audit(ctx)`
- Import `leaudit.services.*`
|
||||
|
||||
---
|
||||
|
||||
## 6. Suggested new core files

## 6.1 `audit_ctx_builder.py`

Suggested new file:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_ctx_builder.py`

Responsibility:

- Assemble platform run objects into a native `AuditCtx`

Suggested inputs:

- `run_id`
- `document_id`
- `document_file_id`
- `rule_version_id`
- `local_file_path`
- `rules_file`
- `services`
- `audit_config`

Suggested output:

- `AuditCtx`

Suggested breakdown:

- `build_services(...)`
- `build_config(...)`
- `build_ctx(...)`
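
A minimal sketch of what `build_ctx(...)` might look like. The real `AuditCtx` constructor has not been reproduced here, so `StandInAuditCtx` and its field names (`file_path`, `services`, `config`) are assumptions made for illustration; only `rules_file` is named elsewhere in these documents.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StandInAuditCtx:
    """Stand-in for leaudit's AuditCtx; the real fields may differ."""
    file_path: str
    rules_file: Any
    services: Any
    config: Any


def build_ctx(local_file_path, rules_file, services, audit_config):
    """Assemble a ctx from inputs that are already resolved to local objects."""
    if not local_file_path:
        # The builder expects the resolver to have produced a local path first.
        raise ValueError("local_file_path must be resolved before building ctx")
    return StandInAuditCtx(
        file_path=local_file_path,
        rules_file=rules_file,
        services=services,
        config=audit_config,
    )
```

Platform identifiers such as `run_id` are intentionally not passed into the ctx in this sketch; they would travel next to it in a bridge-side object.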
|
||||
|
||||
---
|
||||
|
||||
## 6.2 `audit_service_factory.py`

Suggested new file:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_service_factory.py`

Responsibility:

- Construct the native `AuditService` and the services it depends on

Suggested internal scope:

- `DocNormalizationService`
- `ExtractionService`
- `EvaluationService`
- `RescueService`
- `AuditServices`
- `AuditService`

The goal is that the platform never sees how `leaudit` services are assembled.
|
||||
|
||||
---
|
||||
|
||||
## 6.3 `file_source_resolver.py`

Suggested new file:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/file_source_resolver.py`

Responsibilities:

- Find the file's source of truth from `document_file_id`
- Download from OSS to a local temporary path when necessary

Notes:

The current `ctx_builder.py` covers part of this, but the responsibilities should be split more clearly:

- Document file resolution
- Rule file resolution
- Native ctx assembly

Do not mix them into one do-everything builder.
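
A minimal sketch of the resolution step, assuming the two source kinds mentioned in this plan (`localPath` and `ossUrl`). The function name `resolve_to_local_path` and the injected `download` callable are hypothetical; a real resolver would call the project's OSS client instead.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def resolve_to_local_path(
    local_path: Optional[str],
    oss_url: Optional[str],
    download: Callable[[str], bytes],
    suffix: str = "",
) -> Path:
    """Return a local path for the file, downloading from OSS if needed."""
    if local_path:
        return Path(local_path)
    if not oss_url:
        raise ValueError("document file has neither localPath nor ossUrl")
    data = download(oss_url)
    # delete=False: the caller is responsible for removing the temp file
    # once the audit run has finished.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
    return Path(tmp.name)
```

Injecting `download` keeps the resolver testable without network access, and leaves the choice of OSS client to the bridge.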
|
||||
|
||||
---
|
||||
|
||||
## 6.4 `rule_version_resolver.py`

Suggested new file:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/rule_version_resolver.py`

Responsibilities:

- Find the currently effective rule version from `document.type_id` / `binding`
- Download `rules.yaml` from OSS
- Parse it into a `RulesFile`

Notes:

The current `rules_loader.py` is mostly a local-path loader. Keep it as the low-level YAML parser, and move "version resolution + rule location" up into the resolver.
|
||||
|
||||
---
|
||||
|
||||
## 7. Refactoring strategy for existing files

## 7.1 `pipeline.py`

File:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py`

### Current problems

- It is currently a platform-side orchestrator
- It chains low-level stages directly

### Target role

After the refactor it should shrink to a thin wrapper.

Suggested final shape:

```python
async def run(audit_service, **ctx_inputs):
    # Build the native ctx, hand it to the native orchestrator, return the final ctx.
    ctx = build_audit_ctx(**ctx_inputs)
    ctx = await audit_service.audit(ctx)
    return ctx
```

In other words:

- Keeping the `pipeline.py` file name is fine
- But it no longer plays the role of a custom main orchestrator

### Suggested handling

- Phase 1: keep the existing `LauditPipeline`, and add a native `AuditCtxPipeline` next to it
- Phase 2: switch callers to the new implementation
- Phase 3: delete or demote the old orchestration logic
|
||||
|
||||
---
|
||||
|
||||
## 7.2 `tasks.py`

File:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`

### Current problems

- It dispatches tasks, resolves rule paths, and triggers the old pipeline directly, all in one place
- It still depends on `LEAUDIT_RULES_DIR` and `_TYPE_ID_RULES_MAP`

### Refactoring goal

Turn it into a real task entry layer that:

- Creates the execution context
- Calls the bridge execution service
- Updates the run status

### Suggested changes

- `dispatch_leaudit_task()` only dispatches tasks
- `leaudit_process_document()` only does:
  - resolve run inputs
  - call bridge runner
  - update run status

Do not keep piling rule resolution and pipeline internals into this file.
|
||||
|
||||
---
|
||||
|
||||
## 7.3 `ctx_builder.py`

File:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ctx_builder.py`

### Current problems

- The current `ExecutionContext` is not the native `AuditCtx`
- It is closer to a lightweight, platform-internal execution input object

### Refactoring goal

There are two options.

#### Option A: keep and rename

- Keep the file, but turn it into a "platform pre-context builder"
- It only collects document / file / rule / local path information

#### Option B: split

Split it into:

- `file_source_resolver.py`
- `rule_version_resolver.py`
- `audit_ctx_builder.py`

Option B is the recommended one because the responsibilities are clearer.
|
||||
|
||||
---
|
||||
|
||||
## 7.4 `rules_loader.py`

File:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/rules_loader.py`

### Current problems

- It is mostly a local YAML loader
- It does not yet act as a rule version resolver

### Refactoring goal

Make it do exactly one thing:

- Input: a local path / YAML text
- Output: a `RulesFile`

The following responsibilities move to the resolver:

- Looking up the binding table
- Looking up the rule version
- Looking up the OSS path
- Downloading to a local temporary file
|
||||
|
||||
---
|
||||
|
||||
## 7.5 `storage_adapter.py`

File:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`

### Current problems

- Result writing still relies on "look up the latest run by `document_id`"
- It is deeply coupled with the current platform-side orchestration

### Refactoring goal

Turn it into the persister of the final ctx.

Suggested interface style:

- `persist_run_start(...)`
- `persist_ctx_outputs(run_id, ctx, meta)`
- `persist_failure(run_id, err)`

`persist_ctx_outputs()` reads the following from the native `ctx`:

- `normalized_doc`
- `extraction`
- `phase`
- `evaluation`
- `fallback_tasks`
- `timing`

and writes them to the tables in one place.
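
A minimal sketch of the `persist_ctx_outputs(run_id, ctx, meta)` shape listed above. `InMemoryStore` is a hypothetical stand-in for the `leaudit_*` tables, and the extra leading `store` parameter is an assumption made so the example is self-contained; the six field names come from the list above.

```python
from types import SimpleNamespace

CTX_OUTPUT_FIELDS = (
    "normalized_doc", "extraction", "phase",
    "evaluation", "fallback_tasks", "timing",
)


class InMemoryStore:
    """Hypothetical stand-in for the leaudit_* tables."""

    def __init__(self):
        self.rows = {}

    def save(self, run_id, outputs):
        self.rows[run_id] = outputs


def persist_ctx_outputs(store, run_id, ctx, meta=None):
    """Collect the final ctx outputs and write them under an explicit run_id."""
    # Missing attributes become None so that a partially filled ctx still persists.
    outputs = {name: getattr(ctx, name, None) for name in CTX_OUTPUT_FIELDS}
    outputs["meta"] = dict(meta or {})
    store.save(run_id, outputs)
    return outputs


# Usage with a fake ctx object:
store = InMemoryStore()
fake_ctx = SimpleNamespace(phase="evaluated", evaluation={"R1": "pass"})
persist_ctx_outputs(store, 7, fake_ctx, {"trigger_user_id": 1})
```

Because the function reads everything from the final ctx in one pass, the writes can later be wrapped in a single database transaction per run.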
|
||||
|
||||
---
|
||||
|
||||
## 8. Target execution chain

The ideal execution chain after the refactor:

```text
AuditServiceImpl.Run()
  → create leaudit_audit_runs
  → dispatch_leaudit_task(run_id, ...)
    → bridge runner
      → resolve document file
      → resolve rule version
      → build AuditServices
      → build AuditConfig
      → build AuditCtx
      → await AuditService.audit(ctx)
      → persist ctx outputs
  → return run_id
```

The key points are:

- The platform still manages things around the run
- The engine still manages things around the ctx
- The bridge connects the two
|
||||
|
||||
---
|
||||
|
||||
## 9. Recommended implementation steps

## Phase 1: introduce the native CTX route in parallel

Goals:

- Do not delete the old `pipeline.py` right away
- Build the native integration chain first

Suggested actions:

- Add `audit_ctx_builder.py`
- Add `audit_service_factory.py`
- Add a new bridge runner

For example:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/native_runner.py`

Responsibility:

- Run a native CTX execution end to end
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: switch the task entry to the native runner

Goals:

- `tasks.py` no longer calls the old self-orchestrated pipeline
- It calls the native ctx route instead

Suggested actions:

- Change `dispatch_leaudit_task()`
- Change `leaudit_process_document()`
- Keep the old pipeline as a fallback for a short period
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: converge result persistence on the final ctx

Goal:

- `storage_adapter.py` moves from "writing mid-stage" to "one aggregated write from the final ctx"

Suggested actions:

- Pass `run_id` explicitly
- Extract artifacts and summaries from the ctx in one place
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: delete the old self-orchestrated main chain

Goal:

- Make sure two sets of main flow semantics never coexist

Suggested actions:

- Delete the core orchestration in the old `LauditPipeline.run()`
- Or keep the file as a proxy wrapper only
|
||||
|
||||
---
|
||||
|
||||
## 10. Risks and caveats

## 10.1 Do not rewrite every file at once

Use a dual-track transition:

- Keep the old pipeline at first
- Introduce the new native runner in parallel
- Switch over only after the results are verified to match
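
The "verify the results match" step can be sketched as a small comparison. This assumes, for illustration only, that each track's rule-level results can be reduced to a `{rule_id: verdict}` mapping; the function name and the sample data are hypothetical.

```python
def diff_rule_results(legacy, native):
    """Compare {rule_id: verdict} mappings produced by the two tracks."""
    diffs = {}
    for rule_id in sorted(set(legacy) | set(native)):
        old, new = legacy.get(rule_id), native.get(rule_id)
        if old != new:
            # None means the rule is absent from that track's output.
            diffs[rule_id] = (old, new)
    return diffs


# Hypothetical outputs for the same document from both tracks.
legacy_out = {"R1": "pass", "R2": "fail"}
native_out = {"R1": "pass", "R2": "pass", "R3": "fail"}
mismatches = diff_rule_results(legacy_out, native_out)
```

An empty result over a representative document set would be the signal to switch callers; a non-empty one lists exactly which rules need investigation.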
|
||||
|
||||
## 10.2 Do not stuff platform fields into the native CTX

For example:

- `run_id`
- `trigger_user_id`
- `biz_document_id`

These are platform fields and should not pollute `AuditCtx` itself.

Put them in:

- A bridge-local metadata object
- Or a persistence context object
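
A minimal sketch of a bridge-local metadata object. The class names `BridgeRunMeta` and `BridgeRunResult` are hypothetical; the three fields are the ones listed above, and their types are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class BridgeRunMeta:
    """Platform-side identifiers that travel next to the ctx, not inside it."""
    run_id: int
    trigger_user_id: Optional[int] = None
    biz_document_id: Optional[str] = None


@dataclass
class BridgeRunResult:
    """What a bridge runner could hand to the persistence step."""
    meta: BridgeRunMeta
    ctx: Any  # the native ctx, left untouched by the bridge


meta = BridgeRunMeta(run_id=42, trigger_user_id=7)
result = BridgeRunResult(meta=meta, ctx={"phase": "evaluated"})
```

Keeping the metadata frozen makes it harder for later steps to rebind a run by accident.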
|
||||
|
||||
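## 10.3 Do not let platform business code import `leaudit.services.*` directly

This boundary must hold.

Even if the native `AuditCtx` is used 100% in the future, only `leaudit_bridge/` should be allowed to know about these types.

## 10.4 Rules and files must both become local paths first

The native `AuditCtx` is still driven by file paths:

- The document file must be materialized as a local path
- The rule file must also be materialized as a local YAML path

Do not try to make `leaudit` understand OSS or the DB directly.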
|
||||
|
||||
---
|
||||
|
||||
## 11. Relationship to existing documents

This plan further narrows down the following documents:

- `docs/规则编辑/yaml规则在线编辑设计.md`
- `docs/规则编辑/跑通全流程所需准备项.md`
- `docs/规则编辑/开发任务拆解清单.md`
- `docs/规则编辑/为什么仍然需要Bridge适配层.md`

What it adds:

- It no longer stops at the principle of "keep the Bridge, retire self-orchestration"
- It spells out how to migrate to the native AuditCtx model
|
||||
|
||||
---
|
||||
|
||||
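## 12. Final conclusion

The project's Bridge should formally move to the following route:

- **The platform owns run / OSS / DB / permissions / API**
- **The Bridge owns AuditCtx integration and adaptation**
- **The native `leaudit` service layer owns audit execution**

So the end direction of the refactor is not "keep improving the old pipeline", but:

> **Replace the platform's self-orchestrated main chain with the native `AuditCtx + AuditService.audit(ctx)`, and reshape the Bridge into an adapter layer.**

This is the most stable and maintainable integration approach for the project, and the one that best follows the direction `leaudit` is evolving in.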
|
||||
@@ -0,0 +1,711 @@
|
||||
# LeAudit development task breakdown

## 1. Goal

Based on the two documents below, break the work down into an actionable development task list, as precise as possible about which files to change:

- `docs/规则编辑/yaml规则在线编辑设计.md`
- `docs/规则编辑/跑通全流程所需准备项.md`

The list does not cover rule editing alone. It covers the full business chain:

```text
Upload document
  → Get the file's source of truth
  → OCR
  → Extraction
  → Audit
  → Persist results
  → Query run status / results
  → Then extend to YAML online editing / publishing / rollback
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Summary of the current code state

Before starting on the tasks, pin down the real state of the code.

> Addendum, 2026-04-27: after checking the source under `/home/wren-dev/Porject/leaudit/src`,
> the official execution entry of `leaudit` should be regarded as
> `AuditCtx` + `AuditService.audit(ctx)`.
> All later tasks in this list therefore assume: "keep the Bridge, but the platform must not rewrite the main flow orchestration".

### 2.1 Existing skeleton

- Audit controller:
  - `fastapi_modules/fastapi_leaudit/controllers/auditController.py`
- Audit service interface / implementation:
  - `fastapi_modules/fastapi_leaudit/services/auditService.py`
  - `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
- Bridge layer:
  - `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py`
  - `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
  - `fastapi_modules/fastapi_leaudit/leaudit_bridge/rules_loader.py`
  - `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
  - `fastapi_modules/fastapi_leaudit/leaudit_bridge/ctx_builder.py`
- Models:
  - `fastapi_modules/fastapi_leaudit/models/leauditDocument.py`
  - `fastapi_modules/fastapi_leaudit/models/leauditDocumentFile.py`
  - `fastapi_modules/fastapi_leaudit/models/leauditAuditRun.py`
- Rule service interface skeleton:
  - `fastapi_modules/fastapi_leaudit/services/ruleService.py`
  - `fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py`

### 2.2 Current status and main gaps

- `AuditServiceImpl.Run()` can already create a run and trigger the NativeRunner task
- `GetResult()` can already query `leaudit_rule_results`
- The rule file chain has started to support `run -> rule_version -> oss_url -> local temporary YAML`
- `tasks.py` still keeps `LEAUDIT_RULES_DIR` and `_TYPE_ID_RULES_MAP` as a fallback
- There is no rule editing controller or `RuleServiceImpl` yet
- There is no unified OSS file service yet
- Result writing still uses the simplified "find the latest run by document_id" logic

The tasks should therefore be split into two tiers:

- **P0: first get upload → OCR → extraction → audit → query working end to end**
- **P1/P2: then move rules to OSS, add versioning, and add online editing**
|
||||
|
||||
---
|
||||
|
||||
## 3. Phased development task list

## P0: get the minimal audit loop working first

Goal:

```text
Upload document
  → create document / file / run
  → bridge runs OCR / extraction / audit
  → persist
  → query run status and results
```
|
||||
|
||||
---
|
||||
|
||||
### P0-1: complete the audit service entry point

#### Task description

Change `POST /api/audit/run` from "only raises an exception" into a real, executable audit entry point.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/services/auditService.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/auditDto.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/auditVo.py`

#### Work items

- In `Run()`:
  - Check that the document exists
  - Look up the current executable file version
  - Compute `run_no`
  - Create the `leaudit_audit_runs` record
  - Call `dispatch_leaudit_task()`
  - Return `AuditRunVO`
- Adjust the `IAuditService.Run()` interface definition so it matches the implementation's parameters
- If needed, add optional fields to `AuditRunDTO`:
  - `documentFileId`
  - `force`
  - `ruleType`
  - `ruleVersionId` (optional, to rerun against a specific version)

#### Expected outcome

- Calling `/api/audit/run` no longer reports “Celery 任务集成待实现” (Celery task integration not yet implemented)
- At minimum, a run is created and bridge-layer execution is triggered
|
||||
|
||||
---
|
||||
|
||||
### P0-2: complete run creation and status updates

#### Task description

Make the run the central object of the whole chain, so every execution can be traced unambiguously.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/models/leauditAuditRun.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`

#### Work items

- Create the `leaudit_audit_runs` record before triggering the audit
- Define the run's initial fields:
  - `status=pending`
  - `phase=normalize` or empty
  - `startedAt`
  - `documentFileId`
  - `ruleSetId`
  - `ruleVersionId`
- Pass `run_id` explicitly through the execution chain
- Change every persistence method in `storage_adapter.py` so that it:
  - No longer looks up the latest run by `document_id`
  - Always uses an explicit `run_id`

#### Expected outcome

- Every result write is strictly bound to a unique `run_id`
- Results are not misfiled across reruns or concurrent runs
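
The misfiling risk can be shown with a tiny in-memory model. This is an illustrative sketch only: the list and dict below are hypothetical stand-ins for `leaudit_audit_runs` and `leaudit_rule_results`, and the function names are made up.

```python
runs = []       # stand-in for leaudit_audit_runs: (run_id, document_id)
results = {}    # stand-in for leaudit_rule_results, keyed by run_id


def create_run(document_id):
    run_id = len(runs) + 1
    runs.append((run_id, document_id))
    return run_id


def save_by_latest_run(document_id, payload):
    """Old style: guess the run from the document."""
    latest = max(rid for rid, doc in runs if doc == document_id)
    results.setdefault(latest, []).append(payload)


def save_by_run_id(run_id, payload):
    """New style: the caller states which run the payload belongs to."""
    results.setdefault(run_id, []).append(payload)


first = create_run(document_id=10)
second = create_run(document_id=10)  # a rerun starts before the first finishes

save_by_latest_run(10, "output of first run")  # lands on the second run
misfiled = dict(results)
results.clear()
save_by_run_id(first, "output of first run")   # lands where it belongs
```

After this runs, `misfiled` holds the first run's output under run 2, while the explicit version stores it under run 1.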
|
||||
|
||||
---
|
||||
|
||||
### P0-3: complete the file input chain

#### Task description

Before OCR runs, establish exactly which file this audit uses.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/models/leauditDocument.py`
- `fastapi_modules/fastapi_leaudit/models/leauditDocumentFile.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ctx_builder.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/services/documentService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/documentServiceImpl.py`

or

- `fastapi_modules/fastapi_leaudit/services/fileService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/fileServiceImpl.py`

#### Work items

- Add a service method that returns the currently effective file
- Find the active `leaudit_document_files` record from `document_id`
- If the file is in OSS, download it to a local temporary path first
- Give the pipeline a stable `file_path`

#### Expected outcome

- The audit entry point does not depend on the caller passing raw bytes
- The execution input can be reconstructed independently from the database plus the file's source of truth
|
||||
|
||||
---
|
||||
|
||||
### P0-4: wire up the bridge task entry

#### Task description

Make `dispatch_leaudit_task()` the real audit execution entry, not a demo-style synchronous wrapper.

Note that "execution entry" here does not mean extending the platform's self-orchestrated pipeline any further. It means moving step by step to:

```text
build AuditCtx
  → call AuditService.audit(ctx)
  → persist ctx outputs
```

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/__init__.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py`

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_ctx_builder.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_service_factory.py`

#### Work items

- Unify the parameters of `dispatch_leaudit_task()`:
  - `run_id`
  - `document_id`
  - `document_file_id`
  - `rules_path` or `rule_version_id`
  - optional `trigger_user_id`
- Gradually stop relying on `source_port` as the main context dependency
- Allow synchronous execution first, and switch to Celery later
- Gradually shrink `pipeline.py` into a thin wrapper instead of a 7-stage orchestrator
- Inside the bridge, handle in one place:
  - Constructing `AuditServices`
  - Constructing `AuditConfig`
  - Constructing the native `AuditCtx`
  - Calling `AuditService.audit(ctx)`

#### Expected outcome

- The business layer calls a single stable entry point
- The bridge layer controls the actual execution context
- Main flow orchestration returns to the native `leaudit` service layer
|
||||
|
||||
---
|
||||
|
||||
### P0-5: complete the result query API

#### Task description

Running is not enough; the results must be visible too.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/controllers/auditController.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/auditVo.py`

#### Work items

- `GetRunStatus()` returns the real run status
- `GetResult()` queries across the following tables:
  - `leaudit_audit_runs`
  - `leaudit_rule_results`
  - optionally `leaudit_field_results`
- Replace `rules=[]` with real rule-level results
- If needed, add to `AuditResultVO`:
  - `timing`
  - `fields`
  - `errors`

#### Expected outcome

- `/api/audit/result/{RunId}` returns real audit results
|
||||
|
||||
---
|
||||
|
||||
### P0-6: complete the result persistence structure

#### Task description

`StorageAdapter` exists in rough form but still needs hardening.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/result_adapter.py`

#### Work items

- Every save method takes `run_id` explicitly
- Add `run_metrics` writes
- Add `run_errors` writes
- Settle the minimal required fields for `artifacts` and `field_results`
- Keep `rule_results` structurally consistent with `AuditResultVO`

#### Expected outcome

- Result writes no longer depend on guessing the "latest run"
- Later frontend queries are more stable
|
||||
|
||||
---
|
||||
|
||||
## P1: move the rule execution chain to OSS + DB

Goal:

```text
Document type
  → binding table
  → rule set
  → current version
  → OSS YAML
  → download to a local temporary file
  → leaudit execution
```
|
||||
|
||||
---
|
||||
|
||||
### P1-1: add the rule reading service

#### Task description

Upgrade rule loading from "local directory path" to "DB + OSS + temporary file".

#### Current status

The first phase of this item has already landed:

- `auditServiceImpl.py` locks `ruleVersionId` and `ruleSourceOssUrl` when it creates the run
- `tasks.py` resolves the rule source by `run_id` first
- `ruleVersionResolver.py` downloads the OSS YAML to a local temporary file
- `RulesLoader.load(local_path)` is wired into the execution chain

The remaining work has shifted from "can it execute at all" to "how to complete upload / publish / cache / cleanup".

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/nativeRunner.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleVersionResolver.py`

#### Work items

- Resolve the rule source from the `rule_version_id` / `rule_source_oss_url` locked on the run
- Download the YAML to a local temporary file
- Verify `rule_source_sha256`
- Call `load_rules_file(local_path)`
- Keep the local `rules/` directory as a fallback

#### Expected outcome

- Execution uses the rule version bound to the run first, rather than `_TYPE_ID_RULES_MAP`
- The native `AuditCtx.rules_file` is injected officially by the bridge, rather than by the platform bypassing the service orchestration layer
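
The `rule_source_sha256` check in the work items above can be sketched with the standard library. The function names are hypothetical, and the comparison assumes the stored digest is a hex string.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large rule files are not read in one go."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_rule_file(path, expected_sha256):
    """Raise if the downloaded YAML does not match the digest locked on the run."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"rule file checksum mismatch: got {actual}")
    return actual


# Usage with a throwaway file standing in for the downloaded YAML:
sample = b"rules: []\n"
with tempfile.NamedTemporaryFile(suffix=".yaml", delete=False) as tmp:
    tmp.write(sample)
sample_path = Path(tmp.name)
checked = verify_rule_file(sample_path, hashlib.sha256(sample).hexdigest())
```

Running the check before the YAML is parsed means a corrupted or swapped download is rejected without ever reaching the rules loader.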
|
||||
|
||||
---
|
||||
|
||||
### P1-2: add a unified OSS file service

#### Task description

The project has OSS configuration but no unified file service.

#### Files to modify

- `fastapi_admin/config/_settings.py` (only if the configuration is insufficient)

#### Suggested new files

- `fastapi_common/fastapi_common_storage/__init__.py`
- `fastapi_common/fastapi_common_storage/oss_client.py`
- `fastapi_common/fastapi_common_storage/oss_path_utils.py`

If you would rather not put them under `fastapi_common/`, they can start out in:

- `fastapi_modules/fastapi_leaudit/services/ossService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ossServiceImpl.py`

#### Work items

- Provide unified methods to:
  - Upload a file to OSS
  - Download to a local temporary file
  - Compute / verify sha256
  - Delete temporary files
- Support all of:
  - Rule files
  - Original documents
  - Audit artifacts

#### Expected outcome

- Document files and rule files share one object storage service
|
||||
|
||||
---
|
||||
|
||||
### P1-3: add query models for rule versions and bindings

#### Task description

The code has no ORM / query objects for the rule tables yet, which will make later queries painful.

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/models/leauditRuleSet.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleVersion.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleTypeBinding.py`
- Update `fastapi_modules/fastapi_leaudit/models/__init__.py`

#### Work items

- Create ORM models for the rule set, rule version, and binding tables
- So that the service layer does not have to hand-write SQL everywhere

#### Expected outcome

- Both the rule management service and the rule resolution service have a clear model to work with
|
||||
|
||||
---
|
||||
|
||||
## P2: open up YAML online editing / publishing / rollback

Goal:

Make rules an asset managed from the admin backend, not bare files on a server.
|
||||
|
||||
---
|
||||
|
||||
### P2-1: add the rule controller and service implementation

#### Task description

Only the `IRuleService` interface exists today; the rule backend needs a real implementation.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/services/ruleService.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py`

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/controllers/ruleController.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleDto.py`

#### Work items

- Implement the endpoints:
  - List
  - Version history
  - View content
  - Save version
  - Validate
  - Publish
  - Roll back
- Add routes to the controller

#### Expected outcome

- Rules can actually be managed from the backend
|
||||
|
||||
---
|
||||
|
||||
### P2-2: add rule content view / save endpoints

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleContentDto.py`

#### Work items

- Define:
  - A request DTO for saving YAML text
  - A response VO for rule validation
  - A response VO for rule content
- Read content back from OSS to show in the frontend
- When saving a new version, write to OSS first, then to the DB

#### Expected outcome

- The frontend can fetch the YAML text and save a new version
|
||||
|
||||
---
|
||||
|
||||
### P2-3: add YAML syntax validation + DSL semantic validation

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/services/ruleValidationService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleValidationServiceImpl.py`

#### Work items

- YAML parse validation
- `leaudit` DSL schema validation
- Extract a metadata snapshot
- Produce a standard error list

#### Expected outcome

- Bad rules are blocked before publishing
|
||||
|
||||
---
|
||||
|
||||
### P2-4: add publishing, rollback, and audit trail

#### Files to modify

- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/models/leauditRulePublishLog.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleValidationLog.py`

If the ORM models are not built yet, the minimum needed is:

- The corresponding SQL migration / table creation script

#### Work items

- Update `current_version_id` on publish
- Write a publish log
- Write a rollback log on rollback
- Record the operator and the time

#### Expected outcome

- Rule changes are auditable
|
||||
|
||||
---
|
||||
|
||||
## P3: add platform-level engineering capabilities

Goal:

Take the system from "it runs" to "it can run sustainably".
|
||||
|
||||
---
|
||||
|
||||
### P3-1: formal Celery / Redis integration

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/tasks/__init__.py`
- `fastapi_admin/config/_settings.py`

#### Suggested new files

- `fastapi_common/fastapi_common_cache/redis_pool.py`
- `fastapi_modules/fastapi_leaudit/tasks/celery_app.py`

#### Work items

- Change synchronous tasks to asynchronous dispatch
- Configure task timeouts / retries / queues
|
||||
|
||||
---
|
||||
|
||||
### P3-2: rule caching and invalidation on publish

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/rules_loader.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`

#### Work items

- Change the rule cache key to `rule_version_id` or `oss_url + sha256`
- Clear the cache after publishing
- Design a unified invalidation strategy for multiple workers
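
A minimal single-process sketch of the cache keying described above. `RulesCache` and its methods are hypothetical names, and the injected `loader` stands in for whatever parses a local YAML file into a `RulesFile`. Cross-worker invalidation is out of scope here.

```python
class RulesCache:
    """In-process cache of parsed rules, keyed by version or by content."""

    def __init__(self, loader):
        self._loader = loader      # callable: local_path -> parsed rules
        self._entries = {}

    @staticmethod
    def make_key(rule_version_id=None, oss_url=None, sha256=None):
        if rule_version_id is not None:
            return ("version", rule_version_id)
        if oss_url and sha256:
            return ("content", oss_url, sha256)
        raise ValueError("need rule_version_id or oss_url + sha256")

    def get(self, key, local_path):
        # Parse only on a miss; later calls with the same key reuse the entry.
        if key not in self._entries:
            self._entries[key] = self._loader(local_path)
        return self._entries[key]

    def invalidate(self, key):
        """Called after publishing so the next run reloads the rules."""
        self._entries.pop(key, None)


# Usage with a fake loader that records how often it is called:
load_calls = []
cache = RulesCache(lambda path: load_calls.append(path) or {"path": path})
key = RulesCache.make_key(rule_version_id=3)
cache.get(key, "/tmp/rules_v3.yaml")
cache.get(key, "/tmp/rules_v3.yaml")   # served from the cache
calls_before_publish = len(load_calls)
cache.invalidate(key)
cache.get(key, "/tmp/rules_v3.yaml")   # reloaded after invalidation
```

Keying by version or by content digest means a published change always produces a new key, so stale entries can only waste memory, not serve old rules.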
|
||||
|
||||
---
|
||||
|
||||
### P3-3: richer results and diagnostics

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

#### Work items

- Persist `run_metrics` in full
- Persist `run_errors` in full
- Add `rescue_outcomes`
- Let the frontend view error details and per-stage timing
|
||||
|
||||
---
|
||||
|
||||
## 4. Summary of suggested new files

The files below are the ones to consider adding first, grouped by module so tasks can be created per module.

### Rule management

- `fastapi_modules/fastapi_leaudit/controllers/ruleController.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/services/ruleValidationService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleValidationServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/services/ruleResolverService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleResolverServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleDto.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleContentDto.py`

### Rule ORM

- `fastapi_modules/fastapi_leaudit/models/leauditRuleSet.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleVersion.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleTypeBinding.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRulePublishLog.py`
- `fastapi_modules/fastapi_leaudit/models/leauditRuleValidationLog.py`

### Files and storage

- `fastapi_common/fastapi_common_storage/oss_client.py`
- `fastapi_common/fastapi_common_storage/oss_path_utils.py`

### Native AuditCtx integration

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_ctx_builder.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/audit_service_factory.py`

### Document execution input

- `fastapi_modules/fastapi_leaudit/services/documentService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/documentServiceImpl.py`

### Async tasks / cache

- `fastapi_modules/fastapi_leaudit/tasks/celery_app.py`
- `fastapi_common/fastapi_common_cache/redis_pool.py`
|
||||
|
||||
---
|
||||
|
||||
## 5. Summary of files to modify first

From the angle of "get the main chain working first", the files to change first are:

1. `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
2. `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
3. `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
4. `fastapi_modules/fastapi_leaudit/leaudit_bridge/rules_loader.py`
5. `fastapi_modules/fastapi_leaudit/leaudit_bridge/ctx_builder.py`
6. `fastapi_modules/fastapi_leaudit/domian/vo/auditVo.py`
7. `fastapi_modules/fastapi_leaudit/services/ruleService.py`
8. `fastapi_modules/fastapi_leaudit/domian/vo/ruleVo.py`
9. `fastapi_modules/fastapi_leaudit/models/__init__.py`
10. `fastapi_admin/config/_settings.py`
|
||||
|
||||

---

## 6. Recommended execution order

The safest way to proceed is the following:

### Phase 1: close the main chain only

- Modify `AuditServiceImpl`
- Modify `tasks.py`
- Add `audit_ctx_builder.py`
- Add `audit_service_factory.py`
- Modify `storage_adapter.py`
- Modify `GetResult()`

Goal: get post-upload audit and result query working first.

### Phase 2: switch rules to OSS + DB

- Add the rule ORM models
- Modify `rules_loader.py`
- Add an OSS file service

Goal: audit execution actually uses the rule version published in the database.

### Phase 3: open up the rule admin backend

- Add `ruleController.py`
- Add `RuleServiceImpl`
- Add the validation service
- Add publish/rollback logs

Goal: the frontend can edit, publish, and roll back YAML.

### Phase 4: engineering hardening

- Celery / Redis
- Cache invalidation
- Audit trail
- metrics / errors / rescue

Goal: move from "it runs" to "it can be operated".

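
---

## 7. One-sentence conclusion

If what you need is "a breakdown of the feature into developable tasks, based on the current codebase", the real main line is not to build the editor first, but:

1. First connect `auditServiceImpl + bridge + storageAdapter` end to end
2. Then switch `rules_loader` from the local directory to `OSS + DB`
3. Only then build `ruleController + RuleServiceImpl + validation/publish/rollback`

In other words:

- **The first priority is the main audit chain**
- **The second priority is the rule execution chain**
- **Only the third priority is the rule editing backend**

This keeps development cost lowest and the verification path clearest.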
@@ -0,0 +1,721 @@
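
## Background

`leaudit-platform` has completed phase one of the native execution chain integration:

- Document files have begun to support `ossUrl -> local temp file -> NativeRunner`
- Rule files have begun to support `run -> rule_source_oss_url -> local temp YAML -> NativeRunner`
- The execution core has switched to the native `AuditCtx` / `AuditService`
- The `leaudit` core is not modified; the platform continues to adapt through the bridge

Three key gaps still keep the loop from closing:

1. There is no unified OSS service yet; document and rule downloads are still ad hoc implementations
2. There is no complete backend for managing rule YAML
3. The full chain "upload → OCR → extraction → audit → result query" has not been formally integration-tested end to end

A unified implementation plan is therefore needed to complete all three in dependency order.
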
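
---

## Overall goals

This round of work has three goals:

### Goal 1: a unified OSS service

Consolidate the download logic currently scattered across the bridge into a single OSS / MinIO capability layer, reused by three kinds of objects: document files, rule YAML, and run artifacts.

### Goal 2: complete the rule management backend

Support the following for rule YAML:

- Content validation
- Uploading a new version
- Version persistence
- Publishing the current version
- Rolling back to a historical version
- Reading a version's content

This also gives the future online editing UI a stable backend.

### Goal 3: connect the full audit flow

Run the minimal complete business loop:

```text
Upload document
  -> persist the document record
  -> store the file in OSS / or have an accessible source of truth
  -> create run
  -> download the document to a local temp file
  -> download the rule YAML to a local temp file
  -> execute native AuditCtx
  -> OCR / extraction / audit
  -> write results back to leaudit_*
  -> query run status and results
```
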

---

## Current baseline

### Done

- `AuditServiceImpl.Run()` can create `leaudit_audit_runs` rows and trigger execution
- `AuditServiceImpl.GetResult()` can read `leaudit_rule_results`
- The document file execution chain is connected:
  - `LeauditDocumentFile.localPath`
  - `LeauditDocumentFile.ossUrl`
- The rule file execution chain is connected:
  - `LeauditAuditRun.ruleVersionId`
  - `LeauditAuditRun.ruleSourceOssUrl`
  - `run -> oss_url -> local temp YAML -> RulesLoader -> NativeRunner`
- The bridge layer has gained:
  - `auditCtxBuilder.py`
  - `auditServiceFactory.py`
  - `nativeRunner.py`
  - `fileSourceResolver.py`
  - `ruleVersionResolver.py`

### Not yet done

- The unified OSS client / path utils have not landed
- The rule upload / publish / rollback / read-content endpoints have not landed
- YAML syntax validation and DSL semantic validation are not yet a formal service
- Persistence of `run_metrics` / `run_errors` / `rescue_outcomes` is incomplete
- The full E2E flow has not been verified against real services

### Implementation progress (2026-04-27)

- `M1` has started:
  - Added config options such as `OSS_USE_SSL` / `OSS_PRESIGN_EXPIRE_SECONDS`
  - Added the unified `OssClient` and `OssPathUtils`
  - `fileSourceResolver.py` / `ruleVersionResolver.py` have started using the unified OSS service
- `M2` has started:
  - Added `ruleValidator.py`
  - Filled in `ruleServiceImpl.py`
  - Added `ruleController.py`
  - Added endpoint skeletons for rule version create / validate / publish / rollback / read content
- `M3` has started:
  - Reuses native `AuditCtx` semantics as described in `docs/AuditCtx深度解读-2026-04-27.html`
  - Started persisting `ctx.timing` / `ctx.fallback_tasks` / extraction errors to run-level tables
- `M4` is still pending

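
---

## Implementation phases

Four milestones are suggested:

- `M1`: unified OSS infrastructure
- `M2`: rule management backend
- `M3`: formalize the execution chain and complete result persistence
- `M4`: end-to-end integration and acceptance

The order must follow the dependencies; skipping steps is not recommended.
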

---

## M1: Unified OSS infrastructure

### M1-1 Standardize the configuration model

#### Goal

Unify how OSS / MinIO configuration is read, providing the base capability for the later document and rule services.

#### Work items

- Add or confirm the OSS config options
- Ensure config flows from `app.toml` / environment variables into Settings
- Complete the `__init__.pyi` type declarations

#### Suggested config options

- `OSS_ENDPOINT`
- `OSS_BUCKET`
- `OSS_ACCESS_KEY`
- `OSS_SECRET_KEY`
- `OSS_REGION`
- `OSS_USE_SSL`
- `OSS_PUBLIC_BASE_URL`
- `OSS_PRESIGN_EXPIRE_SECONDS`

#### Files to modify

- `fastapi_admin/config/_settings.py`
- `fastapi_admin/config/__init__.pyi`
- Environment config files (if any)

#### Completion criteria

- Business code imports OSS config from one place
- IDE type hints resolve
- Development / test / production environments can be configured separately
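
As a rough illustration of the configuration model above, the sketch below reads the suggested `OSS_*` options from an environment-style mapping into a frozen dataclass. `OssSettings`, `load_oss_settings`, the choice of required keys, and the default values are all assumptions made for this example; the project's real Settings object in `_settings.py` may be structured quite differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these four options are treated as mandatory in this sketch.
_REQUIRED = ("OSS_ENDPOINT", "OSS_BUCKET", "OSS_ACCESS_KEY", "OSS_SECRET_KEY")


@dataclass(frozen=True)
class OssSettings:
    """Hypothetical typed container for the OSS_* options listed above."""
    endpoint: str
    bucket: str
    access_key: str
    secret_key: str
    region: str = ""
    use_ssl: bool = True
    public_base_url: str = ""
    presign_expire_seconds: int = 3600  # assumed default, not from the plan


def load_oss_settings(env: Optional[Mapping[str, str]] = None) -> OssSettings:
    """Build OssSettings from a mapping; falls back to os.environ."""
    env = os.environ if env is None else env
    missing = [key for key in _REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("missing OSS settings: " + ", ".join(missing))
    return OssSettings(
        endpoint=env["OSS_ENDPOINT"],
        bucket=env["OSS_BUCKET"],
        access_key=env["OSS_ACCESS_KEY"],
        secret_key=env["OSS_SECRET_KEY"],
        region=env.get("OSS_REGION", ""),
        use_ssl=env.get("OSS_USE_SSL", "true").strip().lower() in ("1", "true", "yes"),
        public_base_url=env.get("OSS_PUBLIC_BASE_URL", ""),
        presign_expire_seconds=int(env.get("OSS_PRESIGN_EXPIRE_SECONDS", "3600")),
    )
```

Passing the mapping explicitly keeps the loader testable per environment; values parsed from `app.toml` could be merged into the same mapping before the call.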

### M1-2 Unified OSS client

#### Goal

Provide platform-wide upload, download, presign, and object-existence checks.

#### Suggested new files

- `fastapi_common/fastapi_common_storage/oss_client.py`
- `fastapi_common/fastapi_common_storage/oss_path_utils.py`

#### Minimum capabilities

- Upload bytes
- Upload text
- Upload a local file
- Download bytes
- Download to a local temp file
- Check whether an object exists
- Generate a presigned URL
- Return the object key / URL in a uniform way

#### Completion criteria

- Both the rule chain and the document chain reuse the same OSS interface
- Bridge code no longer calls `urlopen` directly
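
One way to keep the bridge independent of a specific storage SDK is to code against a small interface. The sketch below defines a hypothetical `ObjectStore` protocol and an in-memory implementation useful for tests; the class and method names are illustrative, not the actual `OssClient` API, and presigned URLs are left out because they depend on the real OSS / MinIO SDK.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface the bridge could depend on."""

    def put_bytes(self, key: str, data: bytes) -> str: ...
    def get_bytes(self, key: str) -> bytes: ...
    def exists(self, key: str) -> bool: ...


class InMemoryObjectStore:
    """Test double that keeps objects in a dict instead of a bucket."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put_bytes(self, key: str, data: bytes) -> str:
        self._objects[key] = data
        return key  # the object key is returned in a uniform way

    def put_text(self, key: str, text: str, encoding: str = "utf-8") -> str:
        return self.put_bytes(key, text.encode(encoding))

    def get_bytes(self, key: str) -> bytes:
        if key not in self._objects:
            raise KeyError(f"object not found: {key}")
        return self._objects[key]

    def exists(self, key: str) -> bool:
        return key in self._objects

    def download_to_temp(self, key: str, suffix: str = "") -> Path:
        """Write the object to a new temp file; the caller deletes it."""
        data = self.get_bytes(key)
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as handle:
            handle.write(data)
        return Path(handle.name)
```

A real client would implement the same methods on top of the SDK, so the document and rule resolvers can be exercised against the in-memory version without network access.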

### M1-3 Unified OSS path builder

#### Goal

Turn the path conventions agreed in the docs into a code utility, so the business layer does not scatter hand-written path strings.

#### Path conventions

```text
bdocs/{region}/{type_code}/{doc_id}/{version}/{file_role}.{ext}
artifacts/{region}/{run_id}/{artifact_type}/{detail}.{ext}
rules/{rule_type}/{version_no}/rules.yaml
rules/{rule_type}/{version_no}/validation_report.json
```

#### Methods to implement

- `BuildBusinessDocKey(...)`
- `BuildArtifactKey(...)`
- `BuildRuleYamlKey(...)`
- `BuildRuleValidationReportKey(...)`

#### Completion criteria

- Keys are generated uniformly when uploading documents, uploading rules, and recording artifacts
- Path conventions are no longer scattered across business code
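
The four path conventions above map directly onto small pure functions. The following is a minimal sketch: the function names come from the plan, while the parameter lists and the `_segment` validation helper are assumptions added for illustration.

```python
def _segment(value: object) -> str:
    """Reject empty segments and anything that could change the key depth."""
    text = str(value).strip()
    if not text or "/" in text or text in {".", ".."}:
        raise ValueError(f"invalid key segment: {value!r}")
    return text


def BuildBusinessDocKey(region, type_code, doc_id, version, file_role, ext) -> str:
    # bdocs/{region}/{type_code}/{doc_id}/{version}/{file_role}.{ext}
    parts = [_segment(p) for p in (region, type_code, doc_id, version)]
    filename = f"{_segment(file_role)}.{_segment(ext).lstrip('.')}"
    return "/".join(["bdocs", *parts, filename])


def BuildArtifactKey(region, run_id, artifact_type, detail, ext) -> str:
    # artifacts/{region}/{run_id}/{artifact_type}/{detail}.{ext}
    parts = [_segment(p) for p in (region, run_id, artifact_type)]
    filename = f"{_segment(detail)}.{_segment(ext).lstrip('.')}"
    return "/".join(["artifacts", *parts, filename])


def BuildRuleYamlKey(rule_type, version_no) -> str:
    # rules/{rule_type}/{version_no}/rules.yaml
    return f"rules/{_segment(rule_type)}/{_segment(version_no)}/rules.yaml"


def BuildRuleValidationReportKey(rule_type, version_no) -> str:
    # rules/{rule_type}/{version_no}/validation_report.json
    return f"rules/{_segment(rule_type)}/{_segment(version_no)}/validation_report.json"
```

Rejecting `/` inside a segment is a design choice of this sketch: it keeps every key at the depth the convention promises, which simplifies prefix-based listing and cleanup.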

### M1-4 Take over the current bridge download logic

#### Goal

Route the bridge's current ad hoc download logic through the unified OSS service.

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/fileSourceResolver.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleVersionResolver.py`

#### Completion criteria

- Document downloads go through the OSS client
- Rule downloads go through the OSS client
- Private buckets are supported, rather than relying only on public HTTP/HTTPS


---

## M2: Rule management backend

### M2-1 Fill in the rule service interface

#### Goal

Extend the rule service from its current skeleton into a full rule lifecycle service.

#### Suggested capabilities

- List rule sets
- List rule versions
- Read the content of a given version
- Validate YAML
- Create a new rule version
- Publish a rule version
- Roll back a rule version

#### Files to modify or add

- `fastapi_modules/fastapi_leaudit/services/ruleService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`

#### Completion criteria

- The service-layer interface is stable
- Controllers later only unpack DTOs and return results

### M2-2 Add DTO / VO / BO

#### Goal

Provide well-defined request and response models for the rule management endpoints.

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleVersionCreateDto.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleValidateDto.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/rulePublishDto.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/ruleVersionVo.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/ruleContentVo.py`
- `fastapi_modules/fastapi_leaudit/domian/bo/ruleVersionCreateBo.py`

#### Completion criteria

- Fields uniformly use lowerCamelCase
- Controllers accept only DTOs
- Services return VO / BO

### M2-3 Implement the rule validation service

#### Goal

Provide two-layer validation, run before save and before publish, for online editing and publishing.

#### Validation layers

##### Layer 1: YAML syntax validation

- Indentation
- Basic structure
- Whether it parses

##### Layer 2: LeAudit DSL semantic validation

- Whether `metadata` is complete
- Whether `type_id` / `name` / `version` are valid
- Whether rule / extract / stage structures are legal
- Whether field references are consistent
- Whether phase / risk / score / activate_if satisfy the DSL constraints

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleValidator.py`

#### Completion criteria

- Invalid YAML cannot be published
- Validation errors can be returned to the frontend for display
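
To make the second layer more concrete, here is a hedged sketch of a semantic check that runs on an already-parsed document (layer 1 is assumed to have turned the YAML text into a Python mapping using a YAML library). Only the `metadata.type_id` / `name` / `version` checks come from the plan; the top-level `rules` key and the report shape are hypothetical, and the real LeAudit DSL constraints are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping

REQUIRED_METADATA = ("type_id", "name", "version")


@dataclass
class ValidationReport:
    """Hypothetical result shape that a controller could return as-is."""
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_rule_document(doc: Any) -> ValidationReport:
    """Semantic checks on a parsed rule document (not the full DSL)."""
    report = ValidationReport()
    if not isinstance(doc, Mapping):
        report.errors.append("document: top level must be a mapping")
        return report

    metadata = doc.get("metadata")
    if not isinstance(metadata, Mapping):
        report.errors.append("metadata: missing or not a mapping")
    else:
        for key in REQUIRED_METADATA:
            value = metadata.get(key)
            if value is None or str(value).strip() == "":
                report.errors.append(f"metadata.{key}: required")
            else:
                report.metadata[key] = value

    # Assumption: rules live under a top-level "rules" list.
    rules = doc.get("rules")
    if rules is None:
        report.warnings.append("rules: no rules defined")
    elif not isinstance(rules, list):
        report.errors.append("rules: must be a list")
    return report
```

Collecting every error instead of stopping at the first one lets the editor show all problems in a single pass, and `report.ok` gives the publish step a simple gate.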

### M2-4 Implement rule version upload and persistence

#### Goal

Store rule versions as "content in OSS, index in DB".

#### Steps

```text
Receive YAML text
  -> syntax validation
  -> DSL validation
  -> compute sha256 / file_size
  -> build rules/{rule_type}/{version_no}/rules.yaml
  -> upload to OSS
  -> write leaudit_rule_versions
  -> write validation_report.json when needed
```

#### Files to modify

- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`
- `fastapi_common/fastapi_common_storage/oss_path_utils.py`

#### Completion criteria

- `leaudit_rule_versions` fully records:
  - `oss_url`
  - `file_sha256`
  - `file_size`
  - `metadata_type_id`
  - `metadata_name`
  - `metadata_version`
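
The hashing and upload steps of the chain above can be sketched as follows. `store_rule_version` and `RuleVersionRecord` are hypothetical names, the `upload` callable stands in for whatever the unified OSS client exposes, and validation is assumed to have already passed; the three fields filled in mirror `oss_url`, `file_sha256`, and `file_size`.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RuleVersionRecord:
    """Subset of the leaudit_rule_versions columns computed at upload time."""
    oss_url: str
    file_sha256: str
    file_size: int


def store_rule_version(
    rule_type: str,
    version_no: int,
    yaml_text: str,
    upload: Callable[[str, bytes], str],
) -> RuleVersionRecord:
    """Hash and upload exactly the same bytes, then return the index fields."""
    data = yaml_text.encode("utf-8")
    key = f"rules/{rule_type}/{version_no}/rules.yaml"
    oss_url = upload(key, data)  # assumed to return the stored location
    return RuleVersionRecord(
        oss_url=oss_url,
        file_sha256=hashlib.sha256(data).hexdigest(),
        file_size=len(data),
    )
```

Encoding once and reusing the same `data` for both the digest and the upload avoids a checksum that silently disagrees with the stored object, for example because of newline or encoding differences.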
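
### M2-5 Implement publish and rollback

#### Goal

Manage the currently effective version by switching `leaudit_rule_sets.current_version_id`.

#### Requirements

- Publishing updates the currently effective version
- Rollback switches back to a historical version
- Audit information is retained
- Historical runs can still be traced to the old version they used

#### Completion criteria

- New runs automatically use the new version
- Old runs keep their historical rule source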
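
The two completion criteria follow from one design choice: a run copies the version id when it is created instead of reading `current_version_id` later. The in-memory sketch below illustrates that behaviour; `RuleSetState`, `create_run`, and the history tuple shape are hypothetical stand-ins for the real ORM rows and publish log.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class RuleSetState:
    """In-memory stand-in for one leaudit_rule_sets row and its versions."""
    known_version_ids: Set[int]
    current_version_id: Optional[int] = None
    # (action, previous version, new version, operator): assumed log shape
    history: List[Tuple[str, Optional[int], int, str]] = field(default_factory=list)

    def _switch(self, action: str, version_id: int, operator: str) -> None:
        if version_id not in self.known_version_ids:
            raise ValueError(f"unknown rule version: {version_id}")
        self.history.append((action, self.current_version_id, version_id, operator))
        self.current_version_id = version_id

    def publish(self, version_id: int, operator: str) -> None:
        self._switch("publish", version_id, operator)

    def rollback(self, version_id: int, operator: str) -> None:
        self._switch("rollback", version_id, operator)


def create_run(rule_set: RuleSetState) -> Dict[str, Optional[int]]:
    """Pin the version that is effective right now onto the new run."""
    return {"rule_version_id": rule_set.current_version_id}
```

Because publish and rollback only move a pointer and append a log entry, no historical version is overwritten and older runs stay traceable.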

### M2-6 Expose the rule management controller

#### Goal

Provide an API for the future rule editing page.

#### Suggested new files

- `fastapi_modules/fastapi_leaudit/controllers/ruleController.py`

#### Suggested endpoints

- `GET /api/leaudit/rule-sets`
- `GET /api/leaudit/rule-sets/{ruleType}/versions`
- `GET /api/leaudit/rule-versions/{versionId}/content`
- `POST /api/leaudit/rule-sets/{ruleType}/validate`
- `POST /api/leaudit/rule-sets/{ruleType}/versions`
- `POST /api/leaudit/rule-sets/{ruleType}/publish`
- `POST /api/leaudit/rule-sets/{ruleType}/rollback`

#### Completion criteria

- The frontend can view, edit, publish, and roll back directly on top of these endpoints
- Responses uniformly use `Result`


---

## M3: Formalize the execution chain and complete persistence

### M3-1 Connect document source resolution to the OSS service

#### Goal

Upgrade `fileSourceResolver.py` from ad hoc URL download to formal storage access.

#### Resolution priority

1. Prefer `localPath`
2. Otherwise use `ossUrl` / object key
3. Hand a local temp file or bytes to the execution chain

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/fileSourceResolver.py`

#### Completion criteria

- Document source resolution no longer depends directly on `urllib`
- Swapping the OSS implementation later requires no business-layer changes
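
The resolution priority above can be expressed as a small function that receives the downloader as a parameter, which is what keeps the business layer unaware of the storage implementation. `resolve_document_path` is a hypothetical name, and falling through to `ossUrl` when `localPath` points at a missing file is an assumption of this sketch rather than documented behaviour.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_document_path(
    local_path: Optional[str],
    oss_url: Optional[str],
    download: Callable[[str], Path],
) -> Path:
    """Return a local file path for the execution chain to read."""
    # 1. Prefer localPath when it points at an existing file.
    if local_path:
        candidate = Path(local_path)
        if candidate.is_file():
            return candidate
    # 2. Otherwise fetch from OSS through the injected downloader,
    #    which is expected to write a temp file and return its path.
    if oss_url:
        return download(oss_url)
    raise FileNotFoundError("document has neither a readable localPath nor an ossUrl")
```

In production the `download` argument would be a method of the unified OSS client; in tests it can be a lambda that returns a fixture path.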

### M3-2 Connect rule source resolution to the OSS service

#### Goal

Upgrade `ruleVersionResolver.py` into the formal rule version source resolver.

#### Standard chain

```text
run_id
  -> leaudit_audit_runs.rule_version_id
  -> leaudit_audit_runs.rule_source_oss_url
  -> download rule YAML
  -> sha256 check
  -> write local temp YAML
  -> RulesLoader.load(local_path)
```

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleVersionResolver.py`

#### Completion criteria

- Execution binds to a specific rule version first
- Execution no longer relies primarily on `_TYPE_ID_RULES_MAP`
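
The "sha256 check" and "write local temp YAML" steps of the standard chain might look like the sketch below. `materialize_rule_yaml` and `RuleChecksumError` are hypothetical names; the expected digest is assumed to come from the stored rule version record, and the returned path is what would then be passed to `RulesLoader.load(local_path)`.

```python
import hashlib
import tempfile
from pathlib import Path


class RuleChecksumError(ValueError):
    """Raised when downloaded rule bytes do not match the recorded digest."""


def materialize_rule_yaml(data: bytes, expected_sha256: str) -> Path:
    """Verify the downloaded bytes, then write them to a temp .yaml file."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.strip().lower():
        raise RuleChecksumError(
            f"rule YAML checksum mismatch: expected {expected_sha256}, got {actual}"
        )
    # delete=False: the file must outlive this function so the loader can
    # open it by path; the caller removes it once the run has finished.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".yaml") as handle:
        handle.write(data)
    return Path(handle.name)
```

Checking the digest before writing anything means a corrupted or tampered download never reaches the rule loader.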

### M3-3 Complete run result persistence

#### Goal

Persist the data that NativeRunner does not yet write to the database.

#### What to add

- `leaudit_run_metrics`
- `leaudit_run_errors`
- `rescue_outcomes`
- `finishedAt`
- `resultStatus`
- Artifact indexes as needed

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/nativeRunner.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`

#### Completion criteria

- The key results and errors of a run are all traceable
- Persistence no longer stops at `ocr/extract/evaluate/rule_results`

### M3-4 Define the run state machine

#### Goal

Unify the logic that updates a run's status and phase.

#### Suggested states

- `pending`
- `running`
- `completed`
- `failed`

#### Files to modify

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

#### Completion criteria

- Frontend polling sees real run progress
- On failure, status and error information are recorded accurately
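
A small transition table is one way to keep status updates in `tasks.py` and `auditServiceImpl.py` consistent. The four states come from the list above; the allowed transitions, the `RunStatus` enum, and the `transition` helper are assumptions made for this sketch.

```python
from enum import Enum
from typing import Dict, Set, Union


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed policy: terminal states never change, and a run may fail before
# it starts (for example when the rule file cannot be downloaded).
_ALLOWED: Dict[RunStatus, Set[RunStatus]] = {
    RunStatus.PENDING: {RunStatus.RUNNING, RunStatus.FAILED},
    RunStatus.RUNNING: {RunStatus.COMPLETED, RunStatus.FAILED},
    RunStatus.COMPLETED: set(),
    RunStatus.FAILED: set(),
}


def transition(current: Union[str, RunStatus], target: Union[str, RunStatus]) -> RunStatus:
    """Return the new status, or raise if the move is not allowed."""
    current_status = RunStatus(current)
    target_status = RunStatus(target)
    if target_status not in _ALLOWED[current_status]:
        raise ValueError(
            f"illegal run transition: {current_status.value} -> {target_status.value}"
        )
    return target_status
```

Routing every status write through one function like this makes it harder for a late worker callback to flip a `failed` run back to `running`.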

---

## M4: End-to-end integration and acceptance

### M4-1 Map the upload entry points

#### Goal

Confirm which controllers / services the existing upload path actually goes through, and whether an upload already produces:

- `leaudit_documents`
- `leaudit_document_files`
- A file source record
- A current active version marker

#### Places to check

- `fastapi_modules/fastapi_leaudit/controllers/`
- `fastapi_modules/fastapi_leaudit/services/`
- Document-upload-related services / controllers

#### Completion criteria

- It is clear how an uploaded document enters the audit system

### M4-2 Trigger audit after upload

#### Goal

Establish a stable user path.

#### Minimum acceptable paths

##### Path A: two steps

```text
Upload document
  -> manually click to trigger audit
```

##### Path B: one step, automatic

```text
Upload document
  -> automatically create a run and trigger audit
```

#### Completion criteria

- At least one path runs reliably

### M4-3 Add the fields needed for result display

#### Goal

Ensure the frontend gets complete data for displaying results.

#### Must be queryable

- Basic run status
- Total score / passed / failed / skipped counts
- Rule details
- Failure reasons
- Artifact URLs (if any)

#### Key files to modify

- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`
- The corresponding controller / VO

#### Completion criteria

- The frontend can render results given a `runId`

### M4-4 Prepare integration samples

#### Goal

Prepare a reusable set of integration data.

#### Minimum

- 1 real document sample
- 1 valid rule binding
- 1 accessible rule YAML
- 1 runnable configuration

#### Completion criteria

- The success path can be reproduced reliably
- The failure path can be reproduced reliably

### M4-5 Complete one real E2E verification

#### Target chain

```text
Upload document
  -> document / file persisted
  -> file can be downloaded
  -> rule YAML can be downloaded
  -> native AuditCtx executes
  -> OCR / extraction / audit complete
  -> results written back to the DB
  -> result query succeeds
```

#### Completion criteria

- At least 1 real sample runs successfully
- At least 1 failure sample is verified
- An acceptance record document is produced


---

## Recommended implementation order

This round should follow the order below strictly:

### Phase 1: M1 first

- M1-1 Configuration model
- M1-2 OSS client
- M1-3 Path utilities
- M1-4 Take over the existing bridge download logic

Reason:

- It is the shared foundation for rule management and end-to-end integration

### Phase 2: M2

- M2-1 Rule service interface
- M2-2 DTO / VO / BO
- M2-3 Rule validation
- M2-4 New version upload
- M2-5 Publish / rollback
- M2-6 Controller endpoints

Reason:

- Make the rule backend solid first, then expose endpoints to the frontend

### Phase 3: M3

- M3-1 Formal document source integration
- M3-2 Formal rule source integration
- M3-3 Complete persistence
- M3-4 Run state machine

Reason:

- This upgrades "it runs" to "maintainable and observable"

### Phase 4: M4

- M4-1 Map the upload entry points
- M4-2 Trigger after upload
- M4-3 Complete result query
- M4-4 Prepare samples
- M4-5 E2E integration

Reason:

- Doing overall acceptance last avoids repeated rework midway


---

## High-priority file list

### Files that will definitely change

- `fastapi_admin/config/_settings.py`
- `fastapi_admin/config/__init__.pyi`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/fileSourceResolver.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleVersionResolver.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/tasks.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/nativeRunner.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
- `fastapi_modules/fastapi_leaudit/services/ruleService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

### Files likely to be added

- `fastapi_common/fastapi_common_storage/oss_client.py`
- `fastapi_common/fastapi_common_storage/oss_path_utils.py`
- `fastapi_modules/fastapi_leaudit/services/ossService.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ossServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/services/impl/ruleServiceImpl.py`
- `fastapi_modules/fastapi_leaudit/controllers/ruleController.py`
- `fastapi_modules/fastapi_leaudit/leaudit_bridge/ruleValidator.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleVersionCreateDto.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/ruleValidateDto.py`
- `fastapi_modules/fastapi_leaudit/domian/Dto/rulePublishDto.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/ruleVersionVo.py`
- `fastapi_modules/fastapi_leaudit/domian/vo/ruleContentVo.py`
- `fastapi_modules/fastapi_leaudit/domian/bo/ruleVersionCreateBo.py`

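
---

## Acceptance criteria

### M1 acceptance

- Documents and rules are both accessed through the unified OSS service
- No ad hoc download logic remains scattered in the bridge

### M2 acceptance

- Rule versions can be created, read, published, and rolled back
- YAML is formally stored in OSS, metadata in the DB

### M3 acceptance

- The binding between run and rule version is stable
- Run result persistence is complete
- Run status and phase can be queried

### M4 acceptance

- A real upload runs through to the end of the audit
- Result query succeeds
- Integration / acceptance records exist

---

## Implementation conclusion

This round stops spreading hand-written audit logic across the platform and holds to these principles:

- The `leaudit` core is not changed
- The platform owns DB / OSS / API / permissions / task entry points
- The bridge converts platform semantics into native `AuditCtx` input
- Documents and rules both follow the pattern "OSS as source of truth + database index + local temp file for execution"

Following this plan keeps the current native execution chain stable and lays a formal foundation for the future YAML online editing UI and the complete audit business loop.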
@@ -0,0 +1,520 @@
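
# What LeAudit needs in order to run the full flow

## 1. Scope

This document is not only about "moving YAML rules to OSS"; it records the capabilities that must be filled in to **run the entire LeAudit business chain**.

Here "running the full flow" means specifically:

```text
Upload document
  → obtain the file source of truth
  → OCR
  → rule resolution
  → field extraction
  → audit
  → (optional) Rescue
  → persist results
  → frontend queries run status and results
```

In other words, the goal is not just "rule editing"; the following chain must actually execute inside the current project:

- A user uploads a document
- The platform finds the rule version for that document
- The bridge calls the `leaudit` engine to run OCR / Extract / Evaluate
- Results are written to the `leaudit_*` tables
- The frontend can query run status and audit results
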
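
---

## 2. Current status assessment

The project's design direction is right, but there are clear gaps before the whole chain can actually run.

> Addendum, 2026-04-27: after checking the source under `/home/wren-dev/Porject/leaudit/src`, `leaudit`
> already has a formal service orchestration layer: `AuditCtx` +
> `AuditService.audit(ctx)`. This project should therefore not keep the platform hand-rolling
> OCR / Extract / Evaluate / Rescue orchestration in the long term. It should keep the bridge,
> but turn it into an adapter layer: platform objects -> native AuditCtx -> native AuditService -> platform persistence.

### 2.1 Foundations already in place

- A full set of design docs under `docs/leaudit/`
- `leaudit_*` table designs and some models
- A bridge directory skeleton:
  - `pipeline.py`
  - `tasks.py`
  - `rules_loader.py`
  - `storage_adapter.py`
- Rule set / rule version / binding table designs
- A YAML online editing design doc: `docs/规则编辑/yaml规则在线编辑设计.md`

### 2.2 Key problems still open

- The audit service entry point does not yet trigger an executable task
- Rule loading still relies mainly on the local directory / hard-coded transitional approach
- The OSS rule file upload / download / verification chain is incomplete
- The rule admin control plane has not landed
- Run results are not bound strictly enough to `run_id`
- The input chain "upload → file source of truth → local temp file → pipeline" is not fully consolidated

So a more accurate description is:

- **The architecture blueprint has taken shape**
- **Some code skeletons exist**
- **But the full flow is not yet truly connected**
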
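
---

## 3. Eight capability blocks required to run the whole flow

## 3.1 Upload chain and document source of truth

To run OCR / extraction / audit, an uploaded document must first become a stable "executable input" in the system.

### Capabilities needed

- The upload endpoint accepts a main document / attachments
- After upload, a row is written to `leaudit_documents`
- File metadata is written to `leaudit_document_files`
- The original file is uploaded to OSS or a stable local source of truth
- Each run locks a `document_file_id`
- When needed, the document can be downloaded from OSS to a local temp path for `leaudit` to read

### Why this is a prerequisite

`leaudit` depends on a local file path at execution time, so even when the business source of truth is in OSS, the execution stage still needs:

```text
document_file.oss_url
  → download to a local temp file
  → pipeline.run(file_path=local_tmp_path)
```

If this layer is unstable, OCR, extraction, and audit cannot even start.
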

---

## 3.2 Audit run management

The whole chain must be organized around `leaudit_audit_runs`; otherwise run results lose traceability.

### Capabilities needed

- Create a `leaudit_audit_runs` row first when an audit is triggered
- Record:
  - `document_id`
  - `document_file_id`
  - `run_no`
  - `trigger_source`
  - `status`
  - `rule_set_id`
  - `rule_version_id`
  - `rule_source_oss_url`
  - `rule_source_sha256`
- Update progressively during the run:
  - `status`
  - `phase`
  - `started_at`
  - `finished_at`
  - Summary statistics fields

### Current gap

`AuditServiceImpl.Run()` does not yet create and dispatch a run; it simply raises "Celery task integration pending":

- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

So the "trigger audit" step is not actually closed yet.


---

## 3.3 Rule management control plane

If YAML online editing is to be built, rules cannot remain just a local `rules/` directory; they must become platform-managed objects.

### Capabilities needed

- Rule set list
- Rule version list
- View the YAML content of a version
- Save a draft version
- Publish a given version
- Roll back to a historical version
- Access control for rule editing / publishing
- Publish / rollback audit trail

### Suggested minimum endpoints

- `GET /rule-sets`
- `GET /rule-sets/{rule_type}/versions`
- `GET /rule-versions/{id}/content`
- `POST /rule-sets/{rule_type}/versions`
- `POST /rule-sets/{rule_type}/validate`
- `POST /rule-sets/{rule_type}/publish`
- `POST /rule-sets/{rule_type}/rollback`

### Current gap

Only a small part of the rule service interface skeleton exists:

- `fastapi_modules/fastapi_leaudit/services/ruleService.py`

A complete rule admin capability does not exist yet.


---

## 3.4 Rule file storage chain (OSS + DB)

This is the core bridge between "online editing" and "run execution".

### Capabilities needed

- Upload YAML text to OSS
- Write the OSS path to `leaudit_rule_versions.oss_url`
- Also store:
  - `file_sha256`
  - `file_size`
  - `metadata_type_id`
  - `metadata_name`
  - `metadata_version`
- Download the rule file to a local temp directory
- Verify sha256 after download
- After publishing, find the effective version via `current_version_id`

### Target runtime chain

```text
leaudit_rule_type_bindings
  → leaudit_rule_sets.current_version_id
  → leaudit_rule_versions.oss_url
  → download to a local temp file
  → load_rules_file(local_path)
  → execute pipeline
```

### Current status

The project has started implementing two source chains along the "native AuditCtx + bridge adapter" direction:

- Document files:
  - `leaudit_document_files.local_path` is supported
  - `leaudit_document_files.oss_url` is also supported
  - Before execution, the worker always materializes a local temp file
- Rule files:
  - `leaudit_audit_runs.rule_source_oss_url` is supported
  - At run time the YAML is downloaded to a local temp file via `run -> rule_version -> oss_url`
  - It is then handed to `RulesLoader.load(local_path)` and `NativeRunner` for execution

Fallbacks are still kept:

- `LEAUDIT_RULES_DIR`
- `_TYPE_ID_RULES_MAP`
- The local `rules/` directory

In other words, the formal main chain "DB + OSS -> local temp YAML" is connected, and the old local-directory logic remains only as a compatibility fallback.
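
The lookup order described in this section (binding, then rule set, then version, with the local map as a fallback) can be sketched with plain dictionaries standing in for the three tables. `resolve_rule_source`, `RuleSource`, and the dictionary shapes are hypothetical; a real implementation would query the ORM models instead.

```python
from typing import Any, Mapping, NamedTuple, Optional


class RuleSource(NamedTuple):
    origin: str                 # "db" for the formal chain, "fallback" otherwise
    location: str               # OSS URL / key, or a local path for the fallback
    version_id: Optional[int]


def resolve_rule_source(
    type_id: str,
    bindings: Mapping[str, int],               # type_id -> rule_set_id
    rule_sets: Mapping[int, Optional[int]],    # rule_set_id -> current_version_id
    versions: Mapping[int, Mapping[str, Any]], # version_id -> row with "oss_url"
    fallback_map: Mapping[str, str],           # stand-in for _TYPE_ID_RULES_MAP
) -> RuleSource:
    """Prefer the DB + OSS chain; use the local map only when it yields nothing."""
    rule_set_id = bindings.get(type_id)
    version_id = rule_sets.get(rule_set_id) if rule_set_id is not None else None
    version = versions.get(version_id) if version_id is not None else None
    if version and version.get("oss_url"):
        return RuleSource("db", version["oss_url"], version_id)
    if type_id in fallback_map:
        return RuleSource("fallback", fallback_map[type_id], None)
    raise LookupError(f"no rule source found for type_id={type_id!r}")
```

Returning the origin alongside the location lets the caller log when a run used the compatibility fallback, which helps when deciding that the fallback can finally be removed.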

---

## 3.5 Rule validation chain

Once rule editing is opened up, validation before save and before publish is mandatory; otherwise broken YAML can easily reach production.

### At least two validation layers

#### 1) YAML syntax validation

- Whether indentation is correct
- Whether the structure parses
- Whether basic fields exist

#### 2) LeAudit DSL semantic validation

- Whether `metadata` is complete
- Whether `type_id` / `version` / `name` are recognizable
- Whether rule / stage / extract structures satisfy the `leaudit` DSL constraints
- Whether the fields referenced by rules exist
- Whether phase / activate_if / score / risk settings are reasonable

### Result format needed

- Whether validation passed
- Error list
- Warning list
- Optional: a snapshot of the extracted metadata


---

## 3.6 Execution chain: OCR → extraction → audit → Rescue

This is the system's real "core business pipeline".

### Capabilities needed

- The OCR client can be called normally
- Document classification / rules resolve runs normally
- `dispatch_extract()` completes field extraction
- `evaluate_extraction()` completes the rule audit
- If the platform defines the final result as including Rescue, the rescue stage is filled in
- Every stage of the chain records errors and elapsed time

### Current state

`pipeline.py` already has the main chain skeleton:

- OCR
- Extraction
- Coordinate resolution
- Phase detection
- Evaluate

See:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/pipeline.py`

But these problems remain:

- Result storage relies on the simplification "look up the latest run by document_id"
- Rescue does not yet form a complete loop
- The task context still carries the transitional `source_port` parameter
- The current `pipeline.py` is a platform-side self-orchestrator, not an adapter wrapper that calls `leaudit`'s native `AuditService.audit(ctx)`

So the execution chain cannot yet be considered fully production-ready.

### Latest architecture correction

After checking the `leaudit` source, the formal recommendation is:

```text
Platform documents / rules / config
  → bridge resolves inputs
  → build AuditServices
  → build AuditConfig
  → build native AuditCtx
  → call AuditService.audit(ctx)
  → extract artifacts from the final ctx and persist them
```

In other words:

- The bridge stays
- But the bridge no longer rewrites the 7-stage orchestration itself
- The bridge is responsible for "adaptation" and "persistence"
- `leaudit`'s native `AuditService.audit(ctx)` is responsible for "execution"


---

## 3.7 Result persistence and query chain

Running the full flow means more than the engine succeeding; results must also be written and read back.

### Capabilities needed

- OCR artifacts are written to `leaudit_artifacts`
- Field extraction results are written to `leaudit_field_results`
- Rule audit results are written to `leaudit_rule_results`
- Run metrics are written to `leaudit_run_metrics`
- Errors are written to `leaudit_run_errors`
- Rescue results are written to `leaudit_rescue_outcomes`
- The summary fields of `leaudit_audit_runs` are updated
- The frontend can query:
  - Run status
  - Rule-level results
  - Extracted fields
  - Summary statistics

### Current gap

`StorageAdapter` already has some write logic, but clear engineering gaps remain:

- Several places rely on "find the latest run by document_id"
- `GetResult()` still does not actually read rule-level results from `leaudit_rule_results`

See:

- `fastapi_modules/fastapi_leaudit/leaudit_bridge/storage_adapter.py`
- `fastapi_modules/fastapi_leaudit/services/impl/auditServiceImpl.py`

This shows that the "result query loop" is not yet closed.

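
---

## 3.8 Async tasks, caching, idempotency, and audit trail

For a local synchronous demo these can be simplified; to run as a real platform, the basic engineering capabilities must be filled in.

### Capabilities needed

#### Task scheduling

- Celery / Redis async tasks
- Task timeouts
- Retry on failure
- Queue priorities

#### Idempotency and concurrency

- How to handle repeated clicks on "audit" for the same document
- Whether to retry the same `run_id` or create a new run
- Preventing results from being written to the wrong run

#### Cache invalidation

- How old caches are invalidated after a new rule is published
- How rule caches stay in sync across multiple workers

#### Audit trail

- Who uploaded a rule
- Who published a rule
- Who triggered an audit
- Which rule version that audit used

### Current gap

- Celery is not yet wired into the main business chain
- The rule cache invalidation mechanism after publishing is not clearly implemented
- The audit log tables and the log persistence chain do not yet form a closed loop
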

---

## 4. The full dependency chain from "rule editing" to "the full flow runs"

To make the feature clear, the system can be split into this dependency chain:

```text
[A] Upload document
  → write leaudit_documents / leaudit_document_files
  → file goes to OSS

[B] Edit rules
  → save YAML text
  → syntax / semantic validation
  → upload rules.yaml to OSS
  → write leaudit_rule_versions
  → publish by switching current_version_id

[C] Trigger audit
  → create leaudit_audit_runs
  → lock document_file_id + rule_version_id
  → dispatch the task

[D] Bridge execution
  → download the document to a local temp file
  → download the rules to a local temp YAML
  → OCR
  → Extract
  → Evaluate
  → Rescue (if enabled)

[E] Write results back
  → artifacts / field_results / rule_results / metrics / errors
  → update the audit_runs summary

[F] Frontend query
  → query run status
  → query rule results
  → query field results
  → show the final audit report
```

Only when A through F are all closed can the whole flow be called "running".


---

## 5. Suggested implementation priorities

## P0: get the minimal loop running first

Goal: make "upload document -> trigger audit -> OCR/extraction/audit -> result query" minimally usable.

### P0 must deliver

- Uploaded files land in a source of truth
- `AuditServiceImpl.Run()` actually triggers execution
- `leaudit_audit_runs` rows are created
- `pipeline.run()` actually executes
- `StorageAdapter` writes results explicitly by `run_id`
- `GetRunStatus()` / `GetResult()` return real data

> Note: this phase can even keep temporary compatibility with the local `rules/` directory; the point is to connect the main business chain first.

---

## P1: switch the rule source of truth to OSS + DB

Goal: rules no longer depend on the local directory as the formal source.

### P1 must deliver

- Rule versions are uploaded to OSS
- `leaudit_rule_versions` is fully populated
- `leaudit_rule_type_bindings` actually takes effect
- `tasks.py` / `rules_loader.py` follow `DB -> OSS -> local temp YAML`
- `_TYPE_ID_RULES_MAP` is demoted to a fallback

---

## P2: open up YAML online editing

Goal: make rules an asset that can be managed from the admin backend.

### P2 must deliver

- Rule list / version history / YAML content view
- Edit and save
- YAML syntax validation
- DSL semantic validation
- Publish / rollback
- Permissions and audit trail

---

## P3: fill in platform-level engineering

Goal: upgrade the system from "it runs" to "stable and operable".

### P3 must deliver

- Celery multi-queue
- Redis caching and cache invalidation
- Idempotency control
- Retry on failure
- Rule cache refresh
- Cross-region permissions / audit trail
- Full persistence of run_metrics / run_errors / rescue_outcomes

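
---

## 6. One-sentence conclusion

If the goal is to "run the whole flow: upload, OCR, extraction, audit", then beyond "putting rule YAML in OSS and its path in the database", the following must be filled in as well:

- The uploaded file source-of-truth chain
- The audit run main line
- The rule control plane
- The rule file upload / download / verification chain
- The formal execution chain
- The result persistence and query chain
- Infrastructure capabilities such as tasks, caching, idempotency, and audit trail

So this is not a single-point feature but the closing of a complete platform chain.

---

## 7. Core recommendation for the current project

In practical order, the project should proceed as follows:

1. First connect the minimal chain "upload -> trigger audit -> run -> result query"
2. Then switch rule resolution from the local directory to `OSS + DB`
3. Then build YAML online editing, publishing, and rollback
4. Finally add caching, audit trail, concurrency, retries, and other platform-level capabilities

The benefits are:

- The main business chain can be verified as truly usable as early as possible
- Complexity is not piled on before the rule admin backend has landed
- The stage goals "it runs" and "it can be operated" are clearly separated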