docs: add fix-double-finalize-and-bindings-api implementation plan
@@ -0,0 +1,241 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AuditCtx: Structure and Usage Quick Reference</title>
<style>
:root {
  --c-bg: #fafaf8;
  --c-panel: #ffffff;
  --c-text: #1a1a1a;
  --c-muted: #666;
  --c-subtle: #999;
  --c-border: #e5e5e0;
  --c-accent: #2a4d6e;
  --c-accent-soft: #eaf0f6;
  --c-code-bg: #2d2d2a;
  --c-code-text: #f0f0e8;
  --c-inline-bg: #f0ede5;
  --c-inline-text: #5a3a1a;
  --font-cn: "PingFang SC", "Microsoft YaHei", system-ui, sans-serif;
  --font-mono: "JetBrains Mono", "SF Mono", Menlo, monospace;
}

* { box-sizing: border-box; }
html, body {
  margin: 0; padding: 0;
  background: var(--c-bg); color: var(--c-text);
  font-family: var(--font-cn); line-height: 1.7; font-size: 15px;
}

.layout {
  max-width: 880px;
  margin: 0 auto;
  padding: 48px 56px 80px;
}

h1 {
  font-size: 28px; font-weight: 600; margin: 0 0 6px;
  border-bottom: 3px solid var(--c-accent); padding-bottom: 14px;
}
h2 {
  font-size: 20px; font-weight: 600;
  margin: 40px 0 14px;
  padding-left: 12px;
  border-left: 4px solid var(--c-accent);
}
.subtitle { color: var(--c-muted); font-size: 14px; margin-bottom: 16px; }

.lead {
  background: var(--c-accent-soft);
  border-left: 4px solid var(--c-accent);
  padding: 14px 20px;
  margin: 20px 0 28px;
  border-radius: 0 6px 6px 0;
}
.lead strong { color: var(--c-accent); }

p { margin: 10px 0; }
ul { padding-left: 22px; margin: 8px 0; }
ul li { margin: 5px 0; }
ul li::marker { color: var(--c-accent); }

code {
  font-family: var(--font-mono);
  background: var(--c-inline-bg);
  color: var(--c-inline-text);
  padding: 1px 6px; border-radius: 3px;
  font-size: 13px;
}
pre {
  background: var(--c-code-bg);
  color: var(--c-code-text);
  padding: 16px 20px;
  border-radius: 6px;
  overflow-x: auto;
  margin: 12px 0;
  font-family: var(--font-mono);
  font-size: 13px;
  line-height: 1.55;
}
pre code {
  background: transparent; color: inherit; padding: 0; font-size: inherit;
}
.cmt { color: #8a9a7a; }

table {
  width: 100%; border-collapse: collapse; margin: 12px 0;
  background: var(--c-panel);
  border: 1px solid var(--c-border);
  border-radius: 6px; overflow: hidden;
  font-size: 14px;
}
th, td {
  padding: 9px 13px; text-align: left; vertical-align: top;
  border-bottom: 1px solid var(--c-border);
}
th {
  background: #f5f3ec; font-weight: 600;
  color: var(--c-muted); font-size: 13px;
}
tr:last-child td { border-bottom: none; }
td code { font-size: 12px; }

.fileref {
  font-family: var(--font-mono); font-size: 12px;
  background: #efece4; color: #5a3a1a;
  padding: 1px 6px; border-radius: 3px;
  border: 1px solid #d8d2c0;
}

.note {
  background: var(--c-accent-soft);
  border-left: 4px solid var(--c-accent);
  padding: 10px 16px; margin: 14px 0;
  border-radius: 0 6px 6px 0;
  font-size: 14px;
}
.note .tag { font-weight: 700; color: var(--c-accent); margin-right: 6px; }

.field-cat-身份 { color: #2a5a8a; font-weight: 600; }
.field-cat-装配 { color: #80560a; font-weight: 600; }
.field-cat-产物 { color: #4a7050; font-weight: 600; }
.field-cat-诊断 { color: #888; font-weight: 600; }
</style>
</head>
<body>

<div class="layout">

<h1>AuditCtx: Structure and Usage Quick Reference</h1>
<div class="subtitle">Immutable context for the LeAudit service orchestration layer · 2026-04-27</div>

<div class="lead">
<strong>What it is:</strong>
<code>AuditCtx</code> (<span class="fileref">src/leaudit/services/audit_ctx.py</span>) is the <strong>immutable skeleton</strong> of a single audit run. It is a <code>@dataclass(frozen=True)</code>: constructed once and passed from service to service. Each stage derives a new ctx with <code>with_xxx()</code>, and the old ctx stays unchanged.
</div>

<h2>Structure</h2>

<table>
<thead><tr><th style="width:24%">Field</th><th style="width:14%">Category</th><th style="width:32%">Type</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>document_id</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>str</code></td><td>ID of this run</td></tr>
<tr><td><code>rules_file</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>RulesFile | None</code></td><td>Audit rules; when None, the classifier decides</td></tr>
<tr><td><code>file_path</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>str | None</code></td><td>Local file path</td></tr>
<tr><td><code>page_range</code></td><td><span class="field-cat-身份">Identity</span></td><td><code>tuple[int,...] | None</code></td><td>Page range of the sub-document</td></tr>
<tr><td><code>services</code></td><td><span class="field-cat-装配">Wiring</span></td><td><code>AuditServices</code></td><td>Container for all services / clients</td></tr>
<tr><td><code>config</code></td><td><span class="field-cat-装配">Wiring</span></td><td><code>AuditConfig</code></td><td>Runtime knobs</td></tr>
<tr><td><code>normalized_doc</code></td><td><span class="field-cat-产物">Artifact</span></td><td><code>NormalizedDocument | None</code></td><td>OCR + classification + segmentation</td></tr>
<tr><td><code>extraction</code></td><td><span class="field-cat-产物">Artifact</span></td><td><code>ExtractionBundle | None</code></td><td>Extracted fields / multi-entity / derived values</td></tr>
<tr><td><code>phase</code></td><td><span class="field-cat-产物">Artifact</span></td><td><code>str | None</code></td><td><code>draft</code> or <code>executed</code></td></tr>
<tr><td><code>evaluation</code></td><td><span class="field-cat-产物">Artifact</span></td><td><code>EvaluationResult | None</code></td><td>Audit verdict for each rule</td></tr>
<tr><td><code>fallback_tasks</code></td><td><span class="field-cat-产物">Artifact</span></td><td><code>tuple[RescueTask,...]</code></td><td>Rescue tasks for failed rules</td></tr>
<tr><td><code>extraction_errors</code></td><td><span class="field-cat-诊断">Diagnostics</span></td><td><code>tuple[str,...]</code></td><td>Extraction error log</td></tr>
<tr><td><code>timing</code></td><td><span class="field-cat-诊断">Diagnostics</span></td><td><code>Mapping[str,float]</code></td><td>Elapsed time per stage</td></tr>
</tbody>
</table>

<div class="note"><span class="tag">Immutability:</span><code>frozen=True</code> prevents field rebinding; <code>__post_init__</code> coerces list/dict values into <code>tuple</code> / <code>MappingProxyType</code>, which blocks pass-through mutations such as <code>ctx.timing["x"] = 1</code>.</div>
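<p>The two layers of protection described in the note can be sketched as follows. This is a minimal illustration, not the real class: <code>MiniCtx</code> is a hypothetical stand-in with only three fields, and the actual <code>__post_init__</code> in <code>audit_ctx.py</code> may differ in detail.</p>

```python
from dataclasses import dataclass, field, FrozenInstanceError
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class MiniCtx:
    """Hypothetical stand-in for AuditCtx with two container fields."""
    document_id: str
    extraction_errors: tuple[str, ...] = ()
    timing: Mapping[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # frozen=True blocks normal assignment, so go through object.__setattr__.
        object.__setattr__(self, "extraction_errors", tuple(self.extraction_errors))
        # Copy into a fresh dict so later changes to the caller's dict do not leak in,
        # then expose it through a read-only view.
        object.__setattr__(self, "timing", MappingProxyType(dict(self.timing)))


ctx = MiniCtx("my_doc", extraction_errors=["bad field"], timing={"ocr": 1.2})

rebinding_blocked = False
mutation_blocked = False

try:
    ctx.document_id = "other"   # field rebinding, stopped by frozen=True
except FrozenInstanceError:
    rebinding_blocked = True

try:
    ctx.timing["x"] = 1         # mutation through the container, stopped by the proxy
except TypeError:
    mutation_blocked = True
```

<p>Without the coercion step, the second assignment would succeed, because <code>frozen=True</code> only guards attribute assignment on the dataclass itself.</p>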

<h2>Usage</h2>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">1. Construction</h3>

<pre><code>from leaudit.services.audit_ctx import AuditCtx
from leaudit.services.audit_services import AuditServices
from leaudit.config.audit_config import AuditConfig

services = AuditServices(
    llm_client=llm, vlm_client=vlm, ocr_client=ocr,
    normalization=norm_svc, extraction=ext_svc, evaluation=eval_svc,
)

ctx = AuditCtx(
    document_id="my_doc",
    rules_file=rules,
    services=services,
    file_path="/path/to/doc.pdf",
    config=AuditConfig(group_size=8),
)</code></pre>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">2. Evolution (once per stage)</h3>

<pre><code>ctx = ctx.with_normalized_doc(ocr_result)    <span class="cmt"># Stage 1</span>
ctx = ctx.with_extraction(extraction)        <span class="cmt"># Stage 3</span>
ctx = ctx.with_phase("executed")             <span class="cmt"># Stage 4</span>
ctx = ctx.with_evaluation(evaluation)        <span class="cmt"># Stage 5</span>
ctx = ctx.with_fallback_task(task)           <span class="cmt"># Stage 6 (once per failed rule)</span>
ctx = ctx.with_timing(ocr=1.2, total=8.5)    <span class="cmt"># accumulate timings</span></code></pre>

<p>Every <code>with_xxx()</code> is a thin wrapper around <code>dataclasses.replace()</code>: it <strong>returns a new object</strong> and leaves the old ctx unchanged. Capture the return value with <code>ctx = ctx.with_...()</code>.</p>
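<p>A thin wrapper of this kind might look like the sketch below. <code>MiniCtx</code> is again a hypothetical stand-in, and the merge behaviour of <code>with_timing</code> (new entries laid over existing ones) is an assumption; the real helper may sum values instead.</p>

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class MiniCtx:
    """Hypothetical stand-in for AuditCtx; only three fields are modelled."""
    document_id: str
    phase: str | None = None
    timing: Mapping[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        object.__setattr__(self, "timing", MappingProxyType(dict(self.timing)))

    def with_phase(self, phase: str) -> MiniCtx:
        # replace() builds a new instance and re-runs __init__ / __post_init__.
        return dataclasses.replace(self, phase=phase)

    def with_timing(self, **seconds: float) -> MiniCtx:
        # Assumed semantics: merge the new entries over the existing ones.
        return dataclasses.replace(self, timing={**self.timing, **seconds})


old = MiniCtx("my_doc")
new = old.with_phase("executed").with_timing(ocr=1.2, total=8.5)
```

<p>Because each helper returns a fresh instance, calls can be chained, and <code>old</code> remains usable as a snapshot of the earlier state.</p>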

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">3. Reading</h3>

<pre><code>ocr_result = ctx.normalized_doc
fields     = ctx.extraction.fields
phase      = ctx.phase
score      = ctx.evaluation.total_score
ocr_secs   = ctx.timing.get("ocr", 0.0)
text       = ctx.source_text    <span class="cmt"># derived property = ctx.extraction.source_text</span></code></pre>

<h3 style="font-size:15px; color:var(--c-accent); margin:20px 0 8px;">4. Two bridge APIs</h3>

<table>
<thead><tr><th style="width:30%">API</th><th>Purpose</th></tr></thead>
<tbody>
<tr>
<td><code>ctx.effective_config</code></td>
<td>Merges <code>ctx.config</code> + <code>Settings</code>. <strong>Business code reads configuration only through this</strong>; calling <code>get_settings()</code> / <code>os.getenv</code> is forbidden (the architecture guard lives in <span class="fileref">tests/architecture/</span>)</td>
</tr>
<tr>
<td><code>ctx.as_rescue_inputs()</code></td>
<td>Converts the ctx into the engine's legacy <code>RescueInputs</code> shape to bridge <code>engine._try_ai_rescue</code>, so the engine itself needs no changes</td>
</tr>
</tbody>
</table>
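<p>One way a merged view like <code>effective_config</code> could be built is sketched here. Everything beyond the names <code>AuditConfig</code>, <code>Settings</code> and <code>group_size</code> is hypothetical: the <code>max_retries</code> field, the <code>EffectiveConfig</code> type, the <code>merge_config</code> function, and the precedence rule (a per-run value wins over the global default) are assumptions made for illustration.</p>

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    """Hypothetical process-wide defaults."""
    group_size: int = 4
    max_retries: int = 2


@dataclass(frozen=True)
class AuditConfig:
    """Hypothetical per-run overrides; None means "not set for this run"."""
    group_size: int | None = None
    max_retries: int | None = None


@dataclass(frozen=True)
class EffectiveConfig:
    group_size: int
    max_retries: int


def merge_config(config: AuditConfig, settings: Settings) -> EffectiveConfig:
    # Assumed precedence: a per-run value wins; otherwise fall back to Settings.
    merged = {}
    for f in fields(EffectiveConfig):
        override = getattr(config, f.name)
        merged[f.name] = getattr(settings, f.name) if override is None else override
    return EffectiveConfig(**merged)


effective = merge_config(AuditConfig(group_size=8), Settings())
```

<p>Funnelling every read through one merged object is what lets an architecture test ban direct <code>get_settings()</code> calls without losing access to global defaults.</p>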

<h2>Typical call: AuditService.audit(ctx)</h2>

<p>All stages are already orchestrated inside <code>AuditService.audit()</code> (<span class="fileref">src/leaudit/services/audit_service.py:152</span>). Typical usage only needs to construct a ctx and call it:</p>

<pre><code>ctx = AuditCtx(document_id="...", rules_file=rules, services=services,
               file_path="...", config=cfg)
ctx = await audit_service.audit(ctx)
<span class="cmt"># after the run, ctx is populated with normalized_doc / extraction / phase / evaluation / fallback_tasks / timing</span></code></pre>

<p>The seven stages run in this order: <code>Normalize</code> → <code>Resolve rules</code> → <code>Extract</code> → <code>Phase detection</code> → <code>Evaluate</code> → <code>Rescue</code> → <code>Finalize</code>.</p>
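<p>The way a ctx is threaded through consecutive stages can be sketched as below. Only two of the seven stages are modelled, and the names <code>MiniCtx</code>, <code>trace</code>, <code>normalize</code>, <code>decide_phase</code> and the free function <code>audit</code> are hypothetical; the real orchestration in <code>audit_service.py</code> is not shown in this document.</p>

```python
import asyncio
import dataclasses
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass(frozen=True)
class MiniCtx:
    """Hypothetical stand-in for AuditCtx."""
    document_id: str
    normalized_doc: Optional[str] = None
    phase: Optional[str] = None
    trace: tuple[str, ...] = ()   # illustration only: records which stages ran


Stage = Callable[[MiniCtx], Awaitable[MiniCtx]]


async def normalize(ctx: MiniCtx) -> MiniCtx:
    return dataclasses.replace(
        ctx, normalized_doc="ocr text", trace=ctx.trace + ("Normalize",)
    )


async def decide_phase(ctx: MiniCtx) -> MiniCtx:
    return dataclasses.replace(
        ctx, phase="executed", trace=ctx.trace + ("Phase",)
    )


async def audit(ctx: MiniCtx, stages: list[Stage]) -> MiniCtx:
    # Each stage receives the current ctx and hands back a new one.
    for stage in stages:
        ctx = await stage(ctx)
    return ctx


start = MiniCtx("my_doc")
final = asyncio.run(audit(start, [normalize, decide_phase]))
```

<p>Since no stage mutates its input, <code>start</code> is still empty after the run, which makes intermediate states easy to log or compare in tests.</p>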

<h2>Three rules</h2>

<ul>
<li>Adding an artifact field → add the field to <code>audit_ctx.py</code> plus one <code>with_xxx</code> helper</li>
<li>Reading runtime config → use <code>ctx.effective_config</code>; <strong>do not</strong> call <code>get_settings()</code> directly</li>
<li>Changing stage behavior → you <strong>must</strong> return a new ctx; <strong>do not</strong> mutate fields in place</li>
</ul>

</div>

</body>
</html>