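# Document Upload and Audit API
This document describes the endpoints that are currently implemented: document upload, document list, automatic audit, manual audit, run status query, and result query. Paths are written relative to the API root; the curl examples call them under `http://127.0.0.1:8096/api`.

The endpoints are designed around the following business semantics:
- Every upload from the frontend produces one internal document instance on the platform
- Documents with the same name are tentatively grouped into the same version group
- Same name and identical content:
  - No new version is created
  - `duplicateUpload=true`
  - If `autoRun=true`, the audit flow can still be run again
- Same name but changed content:
  - A new version is created
  - This yields `v2 / v3 / ...`
- Audit tasks run asynchronously on a worker
- There are only two queue tiers:
  - `urgent`
  - `normal`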
---
## 1. Upload endpoint
### Path
```http
POST /upload
```
### Content-Type
```http
multipart/form-data
```
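### Purpose
- Upload a document
- Create a new document version or match an existing one
- Create the `leaudit_documents / leaudit_document_files` records
- Optionally trigger an audit automatically
### Request parameters
| Parameter | Type | Required | Description |
|---|---|---:|---|
| `file` | file | Yes | The file to upload |
| `typeId` | int | No | Document type ID; at least one of `typeId` and `typeCode` must be provided |
| `typeCode` | string | No | Document type code, for example `contract.sale` |
| `region` | string | No | Region, defaults to `default` |
| `fileRole` | string | No | File role, defaults to `primary` |
| `createdBy` | int | No | ID of the uploading user |
| `autoRun` | bool | No | Whether to trigger an audit automatically after upload, defaults to `false` |
| `speed` | string | No | Execution speed tier: `normal` / `urgent`, defaults to `normal` |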
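
The parameter rules in the table can be checked on the client before the request is sent. The following is a minimal sketch, not the backend's own validation: `build_upload_fields` is a hypothetical helper, and raising `ValueError` on invalid input is an assumption made for illustration. It only assembles the non-file form fields; the `file` part still has to be attached by whatever HTTP client is used.

```python
from typing import Optional

ALLOWED_SPEEDS = ("normal", "urgent")


def build_upload_fields(
    type_id: Optional[int] = None,
    type_code: Optional[str] = None,
    region: str = "default",
    file_role: str = "primary",
    created_by: Optional[int] = None,
    auto_run: bool = False,
    speed: str = "normal",
) -> dict:
    """Assemble the non-file form fields for POST /upload (hypothetical helper)."""
    # The table requires at least one of typeId / typeCode.
    if type_id is None and type_code is None:
        raise ValueError("pass at least one of typeId or typeCode")
    # ASSUMPTION: unknown speed tiers are rejected client-side.
    if speed not in ALLOWED_SPEEDS:
        raise ValueError(f"speed must be one of {ALLOWED_SPEEDS}")

    fields = {
        "region": region,
        "fileRole": file_role,
        # Form values are strings; booleans are written as in the curl example.
        "autoRun": "true" if auto_run else "false",
        "speed": speed,
    }
    # Optional identifiers are only included when provided.
    if type_id is not None:
        fields["typeId"] = str(type_id)
    if type_code is not None:
        fields["typeCode"] = type_code
    if created_by is not None:
        fields["createdBy"] = str(created_by)
    return fields
```

Calling `build_upload_fields(type_code="contract.sale")` yields the same field values as the plain-upload curl request, apart from the file itself.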
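### Version matching logic
On upload, the backend first looks for a version candidate:
1. Normalize the file name to obtain `normalized_name`
2. Look up the latest-version candidate that satisfies all of the following:
   - Same `type_id`
   - Same `region`
   - Same `normalized_name`
   - `is_latest_version = true`
   - Primary file with `file_role = 'primary'`
3. Compare against the `sha256` of the latest version's primary file

There are three possible outcomes:
- No candidate found
  - A new version group is created
  - The current version is `v1`
- Candidate found and `sha256` is identical
  - Treated as a duplicate upload
  - No new version is created
  - `duplicateUpload=true`
- Candidate found but `sha256` differs
  - A new version is created
  - The current version is `v2 / v3 / ...`
  - The old version gets `is_latest_version=false`
  - The new version gets `is_latest_version=true`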
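
The three outcomes can be sketched as a small decision function. This is an illustration under stated assumptions, not the backend code: `LatestVersion`, `normalize_name`, and `decide_version` are hypothetical names, the candidate lookup is assumed to have happened already, and normalization is reduced to dropping the file extension, which matches the sample data but may not be the full rule.

```python
import hashlib
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class LatestVersion:
    """Latest version found for (type_id, region, normalized_name)."""
    document_id: int
    version_no: int
    sha256: str


def normalize_name(file_name: str) -> str:
    # ASSUMPTION: normalization only drops the extension; the real rule may do more.
    return PurePosixPath(file_name).stem


def decide_version(content: bytes, candidate: Optional[LatestVersion]) -> dict:
    """Return the version decision for an upload, following the three cases."""
    digest = hashlib.sha256(content).hexdigest()
    if candidate is None:
        # No candidate: start a new version group at v1.
        return {"action": "new_group", "versionNo": 1, "duplicateUpload": False}
    if candidate.sha256 == digest:
        # Same content: duplicate upload, the version number is not bumped.
        return {
            "action": "duplicate",
            "versionNo": candidate.version_no,
            "duplicateUpload": True,
        }
    # Different content: next version; the caller would flip is_latest_version.
    return {
        "action": "new_version",
        "versionNo": candidate.version_no + 1,
        "duplicateUpload": False,
        "previousVersionId": candidate.document_id,
    }
```

Keeping the decision separate from the database lookup makes each branch easy to exercise with plain byte strings.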
### Queue routing logic
- `speed=urgent` -> dispatched to `leaudit.urgent`
- `speed=normal` -> dispatched to `leaudit.normal`
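
As a sketch, the routing amounts to a two-entry lookup. `route_queue` is a hypothetical name, and rejecting unknown tiers is an assumption: the source does not say how the backend treats other values.

```python
QUEUE_BY_SPEED = {
    "urgent": "leaudit.urgent",
    "normal": "leaudit.normal",
}


def route_queue(speed: str = "normal") -> str:
    """Map a speed tier to the queue name it is dispatched to."""
    try:
        return QUEUE_BY_SPEED[speed]
    except KeyError:
        # ASSUMPTION: unknown tiers raise instead of falling back to "normal".
        raise ValueError(f"unknown speed: {speed!r}") from None
```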
### Request example: plain upload without automatic audit
```bash
curl -X POST 'http://127.0.0.1:8096/api/upload' \
-F 'file=@/path/to/合同.docx' \
-F 'typeCode=contract.sale' \
-F 'region=default' \
-F 'fileRole=primary' \
-F 'autoRun=false' \
-F 'speed=normal'
```
### Response example: first upload, resolved as `v1`
```json
{
"code": 200,
"message": "ok",
"data": {
"documentId": 11,
"internalDocumentNo": 1777426812904262854,
"versionGroupKey": "4e02e455aa504cb9b75a254727f1bb4c",
"versionNo": 1,
"previousVersionId": null,
"rootVersionId": 11,
"duplicateUpload": false,
"fileId": 12,
"typeId": 9,
"typeCode": "contract.sale",
"region": "default",
"fileName": "版本归档验证合同.docx",
"ossUrl": "bdocs/default/contract.sale/2026/04/11/v1/primary__版本归档验证合同.docx",
"speed": "normal",
"processingStatus": "waiting",
"autoRunTriggered": false,
"run": null
}
}
```
### Response example: duplicate upload, version not bumped
```json
{
"code": 200,
"message": "ok",
"data": {
"documentId": 11,
"internalDocumentNo": 1777426812904262854,
"versionGroupKey": "4e02e455aa504cb9b75a254727f1bb4c",
"versionNo": 1,
"previousVersionId": null,
"rootVersionId": 11,
"duplicateUpload": true,
"fileId": 12,
"typeId": 9,
"typeCode": "contract.sale",
"region": "default",
"fileName": "版本归档验证合同.docx",
"ossUrl": "bdocs/default/contract.sale/2026/04/11/v1/primary__版本归档验证合同.docx",
"speed": "normal",
"processingStatus": "waiting",
"autoRunTriggered": false,
"run": null
}
}
```
### Response example: same name but changed content, `v2` created automatically
```json
{
"code": 200,
"message": "ok",
"data": {
"documentId": 12,
"internalDocumentNo": 1777426813574315361,
"versionGroupKey": "4e02e455aa504cb9b75a254727f1bb4c",
"versionNo": 2,
"previousVersionId": 11,
"rootVersionId": 11,
"duplicateUpload": false,
"fileId": 13,
"typeId": 9,
"typeCode": "contract.sale",
"region": "default",
"fileName": "版本归档验证合同.docx",
"ossUrl": "bdocs/default/contract.sale/2026/04/12/v2/primary__版本归档验证合同.docx",
"speed": "normal",
"processingStatus": "waiting",
"autoRunTriggered": false,
"run": null
}
}
```
### Response example: duplicate upload with automatic re-audit
```json
{
"code": 200,
"message": "ok",
"data": {
"documentId": 13,
"internalDocumentNo": 1777427235286905027,
"versionGroupKey": "4e02e455aa504cb9b75a254727f1bb4c",
"versionNo": 3,
"previousVersionId": 12,
"rootVersionId": 11,
"duplicateUpload": true,
"fileId": 14,
"typeId": 9,
"typeCode": "contract.sale",
"region": "default",
"fileName": "版本归档验证合同.docx",
"ossUrl": "bdocs/default/contract.sale/2026/04/13/v3/primary__版本归档验证合同.docx",
"speed": "normal",
"processingStatus": "queued",
"autoRunTriggered": true,
"run": {
"runId": 13,
"documentId": 13,
"runNo": 2,
"documentFileId": 14,
"status": "queued",
"phase": "dispatch",
"resultStatus": null,
"ruleSetId": 29,
"ruleVersionId": 9,
"ruleTypeId": "contract.sale",
"rescueApplied": false,
"totalScore": null,
"passedCount": null,
"failedCount": null,
"skippedCount": null,
"startedAt": null,
"finishedAt": null
}
}
}
```
---
## 2. Document list endpoint
### Path
```http
GET /documents/list
```
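### Purpose
- Return the main document list
- Return only the latest version of each version group
- Attach a history summary to each record so the frontend can implement "expand history versions" directly
### Query parameters
| Parameter | Type | Required | Description |
|---|---|---:|---|
| `page` | int | No | Page number, starting at `1`, defaults to `1` |
| `pageSize` | int | No | Items per page, defaults to `20`, maximum `100` |
| `keyword` | string | No | Fuzzy search on file name / normalized name |
| `typeCode` | string | No | Document type code, for example `contract.sale` |
| `region` | string | No | Region filter |
| `processingStatus` | string | No | Filter by document processing status |
| `resultStatus` | string | No | Filter by the result status of the latest run |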
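
A client can assemble the query string from these parameters with the standard library. The sketch below is illustrative: `build_list_url` is a hypothetical helper, and validating `page` and `pageSize` locally (rather than letting the server clamp or reject them) is an assumption, since the source only states the default and the maximum.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_PAGE_SIZE = 100  # upper bound from the parameter table
ALLOWED_FILTERS = {"keyword", "typeCode", "region", "processingStatus", "resultStatus"}


def build_list_url(base_url: str, page: int = 1, page_size: int = 20,
                   **filters: Optional[str]) -> str:
    """Build a GET /documents/list URL; filters use the API's camelCase names."""
    if page < 1:
        raise ValueError("page starts at 1")
    # ASSUMPTION: out-of-range page sizes are rejected before the request is sent.
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"pageSize must be between 1 and {MAX_PAGE_SIZE}")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    params = {"page": page, "pageSize": page_size}
    # Filters left as None are skipped so they are not sent as empty strings.
    params.update({k: v for k, v in filters.items() if v is not None})
    # urlencode percent-encodes non-ASCII keywords as UTF-8.
    return f"{base_url}/documents/list?{urlencode(params)}"
```

For example, `build_list_url("http://127.0.0.1:8096/api", page=1, page_size=5)` produces the URL used in the basic curl request.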
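### Query logic
- The main query only considers rows with `leaudit_documents.is_latest_version = true`
- Only the primary file is joined:
  - `leaudit_document_files.is_active = true`
  - `leaudit_document_files.file_role = 'primary'`
- The current audit status comes from `leaudit_audit_runs`
- History versions are fetched with a second query by `version_group_key` and attached as `historyVersions`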
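
The shape of this query can be modelled in memory to make the grouping explicit. This is a sketch only: the real implementation runs against the database tables named above, `build_document_list` is a hypothetical function, and the input rows carry an `isLatestVersion` key that is invented here to stand in for the `is_latest_version` column. The file join and run status are left out.

```python
from collections import defaultdict
from typing import Dict, List


def build_document_list(documents: List[Dict]) -> List[Dict]:
    """Return one entry per version group: the latest version plus its history."""
    by_group = defaultdict(list)
    for doc in documents:
        by_group[doc["versionGroupKey"]].append(doc)

    result = []
    for versions in by_group.values():
        latest = next((d for d in versions if d["isLatestVersion"]), None)
        if latest is None:
            # Without a latest version the group would not appear in the main list.
            continue
        history = sorted(
            (d for d in versions if d is not latest),
            key=lambda d: d["versionNo"],
            reverse=True,  # newest history version first
        )
        entry = {k: v for k, v in latest.items() if k != "isLatestVersion"}
        entry["hasHistory"] = bool(history)
        entry["totalVersions"] = len(versions)
        entry["historyVersions"] = [
            {"documentId": d["documentId"], "versionNo": d["versionNo"]}
            for d in history
        ]
        result.append(entry)
    return result
```

Fed the three versions from the sample data, this returns a single entry for version 3 with versions 2 and 1 attached as history.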
### Request example
```bash
curl 'http://127.0.0.1:8096/api/documents/list?page=1&pageSize=5'
```
### Request example with filters
```bash
curl 'http://127.0.0.1:8096/api/documents/list?page=1&pageSize=2&keyword=版本归档&typeCode=contract.sale&region=default'
```
### Response example
```json
{
"code": 200,
"message": "ok",
"data": {
"total": 1,
"page": 1,
"pageSize": 2,
"totalPages": 1,
"documents": [
{
"documentId": 13,
"internalDocumentNo": 1777427235286905027,
"versionGroupKey": "4e02e455aa504cb9b75a254727f1bb4c",
"versionNo": 3,
"rootVersionId": 11,
"previousVersionId": 12,
"typeId": 9,
"typeCode": "contract.sale",
"region": "default",
"normalizedName": "版本归档验证合同",
"fileId": 14,
"fileName": "版本归档验证合同.docx",
"fileExt": "docx",
"mimeType": "application/octet-stream",
"fileSize": 587279,
"ossUrl": "bdocs/default/contract.sale/2026/04/13/v3/primary__版本归档验证合同.docx",
"processingStatus": "completed",
"currentRunId": 13,
"runStatus": "completed",
"resultStatus": "review",
"totalScore": 92.0,
"passedCount": 25,
"failedCount": 3,
"skippedCount": 0,
"updatedAt": "2026-04-29T01:50:05.241397+00:00",
"hasHistory": true,
"totalVersions": 3,
"historyVersions": [
{
"documentId": 12,
"fileId": 13,
"versionNo": 2,
"fileName": "版本归档验证合同.docx",
"fileExt": "docx",
"processingStatus": "waiting",
"runStatus": null,
"resultStatus": null,
"updatedAt": "2026-04-29T01:47:15.250697+00:00"
},
{
"documentId": 11,
"fileId": 12,
"versionNo": 1,
"fileName": "版本归档验证合同.docx",
"fileExt": "docx",
"processingStatus": "waiting",
"runStatus": null,
"resultStatus": null,
"updatedAt": "2026-04-29T01:40:13.538839+00:00"
}
]
}
]
}
}
```
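### Response fields
| Field | Description |
|---|---|
| `documents[]` | Main list, latest versions only |
| `versionGroupKey` | Archive group key shared by one version chain |
| `versionNo` | Current version number |
| `rootVersionId` | Document ID of the root of the version chain |
| `previousVersionId` | Document ID of the previous version |
| `hasHistory` | Whether history versions exist |
| `totalVersions` | Total number of versions in the version group |
| `historyVersions[]` | History version summaries, sorted by `versionNo DESC` |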
---
## 3. Trigger an audit manually
### Path
```http
POST /audit/run
```
### Purpose
- Manually trigger a new audit run for the given `documentId`
- Does not change the document version
- Only adds a row to `leaudit_audit_runs`
### Request body
```json
{
"documentId": 13,
"ruleType": null,
"force": false,
"speed": "normal"
}
```
### Parameters
| Field | Type | Required | Description |
|---|---|---:|---|
| `documentId` | int | Yes | Document ID |
| `ruleType` | string/null | No | Rule type code to use |
| `force` | bool | No | Whether to force a rerun |
| `speed` | string | No | `normal` / `urgent` |
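
A request body that respects these fields can be serialized as follows. The helper name `build_run_body` is hypothetical. Because the table does not state defaults for the optional fields, the sketch leaves them out when unset rather than guessing values; that choice is an assumption.

```python
import json
from typing import Optional


def build_run_body(document_id: int, rule_type: Optional[str] = None,
                   force: Optional[bool] = None,
                   speed: Optional[str] = None) -> str:
    """Serialize the JSON body for POST /audit/run (hypothetical helper)."""
    if speed is not None and speed not in ("normal", "urgent"):
        raise ValueError("speed must be 'normal' or 'urgent'")
    body = {"documentId": document_id}
    # Optional fields are omitted when unset so the backend applies its own defaults.
    if rule_type is not None:
        body["ruleType"] = rule_type
    if force is not None:
        body["force"] = force
    if speed is not None:
        body["speed"] = speed
    return json.dumps(body, ensure_ascii=False)
```

`build_run_body(13, force=False, speed="urgent")` produces a body equivalent to the one sent by the curl request in this section.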
### Request example
```bash
curl -X POST 'http://127.0.0.1:8096/api/audit/run' \
-H 'Content-Type: application/json' \
-d '{
"documentId": 13,
"force": false,
"speed": "urgent"
}'
```
### Response example
```json
{
"code": 200,
"message": "ok",
"data": {
"runId": 15,
"documentId": 13,
"runNo": 3,
"documentFileId": 14,
"status": "queued",
"phase": "dispatch",
"resultStatus": null,
"ruleSetId": 29,
"ruleVersionId": 9,
"ruleTypeId": "contract.sale",
"rescueApplied": false,
"totalScore": null,
"passedCount": null,
"failedCount": null,
"skippedCount": null,
"startedAt": null,
"finishedAt": null
}
}
```
---
## 4. Query run status
### Path
```http
GET /audit/run/{runId}
```
### Purpose
- Query the current status of a run
- Suitable for frontend polling
### Status values
Common statuses:
- `queued`
- `running`
- `completed`
- `failed`

Common phases:
- `dispatch`
- `prepare`
- `ocr`
- `extract`
- `evaluate`
- `rescue`
- `persist`
- `executed`
### Request example
```bash
curl 'http://127.0.0.1:8096/api/audit/run/11'
```
### Response example
```json
{
"code": 200,
"message": "ok",
"data": {
"runId": 11,
"documentId": 10,
"runNo": 1,
"documentFileId": 11,
"status": "completed",
"phase": "executed",
"resultStatus": "review",
"ruleSetId": 29,
"ruleVersionId": 9,
"ruleTypeId": "contract.sale",
"rescueApplied": true,
"totalScore": 89.0,
"passedCount": 24,
"failedCount": 4,
"skippedCount": 0,
"startedAt": "2026-04-28T19:01:01.766352+08:00",
"finishedAt": "2026-04-28T19:03:11.044894+08:00"
}
}
```
---
## 5. Query audit result
### Path
```http
GET /audit/result/{runId}
```
### Purpose
- Query the full result of a run
- Includes:
  - Rule results
  - Extracted fields
  - Runtime errors
  - Rescue results
  - metrics
  - artifacts
### Request example
```bash
curl 'http://127.0.0.1:8096/api/audit/result/11'
```
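### Response structure
Top-level fields:

| Field | Description |
|---|---|
| `runId` | Run ID |
| `documentId` | Document ID |
| `documentFileId` | ID of the file locked for this run |
| `status` | Run status |
| `totalScore` | Total score |
| `passedCount` | Passed count |
| `failedCount` | Failed count |
| `skippedCount` | Skipped count |
| `phase` | Current phase |
| `resultStatus` | Overall result |
| `rescueApplied` | Whether rescue was executed |
| `ruleSetId` | Rule set ID |
| `ruleVersionId` | Rule version ID |
| `startedAt` / `finishedAt` | Start and finish times |
| `rules` | List of rule results |
| `fields` | List of extracted fields |
| `errors` | List of errors |
| `rescueOutcomes` | List of rescue outcomes |
| `metrics` | Per-phase metrics |
| `artifacts` | List of artifacts |
### Response example (excerpt)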
```json
{
"code": 200,
"message": "ok",
"data": {
"runId": 11,
"documentId": 10,
"documentFileId": 11,
"status": "completed",
"totalScore": 89.0,
"passedCount": 24,
"failedCount": 4,
"skippedCount": 0,
"phase": "executed",
"resultStatus": "review",
"rescueApplied": true,
"ruleSetId": 29,
"ruleVersionId": 9,
"startedAt": "2026-04-28T19:01:01.766352+08:00",
"finishedAt": "2026-04-28T19:03:11.044894+08:00",
"rules": [
{
"ruleId": "MM-SALE-012",
"ruleName": "甲方信用代码校验",
"passed": false,
"status": "executed",
"risk": "medium",
"score": 3.0,
"failMessage": "甲方统一社会信用代码校验位错误"
}
],
"fields": [
{
"fieldName": "合同名称",
"valueText": "智慧法务平台建设采购项目合同",
"confidence": 0.9991
}
],
"rescueOutcomes": [
{
"ruleId": "MM-SALE-012",
"status": "final_fail",
"finalStatus": "review",
"requiresHumanReview": true,
"failureReason": "Agent (4 iter, requires_human): token_budget_exhausted"
}
],
"metrics": {
"ocrSeconds": 79.06,
"extractSeconds": 11.87,
"evaluateSeconds": 9.9,
"totalSeconds": 100.83,
"pageCount": 2,
"fieldCount": 35,
"ruleCount": 28,
"rescueRuleCount": 5,
"artifactCount": 8
},
"artifacts": [
{
"artifactType": "ocr_json",
"fileName": "ocr_result.json",
"fileExt": "json",
"mimeType": "application/json"
}
]
}
}
```
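
A consumer of this payload will often reduce it to the rules that need attention. The sketch below is one possible reduction, not part of the API: `summarize_result` is a hypothetical helper, it takes the `data` object of the response, and it relies only on keys that appear in the excerpt.

```python
def summarize_result(data: dict) -> dict:
    """Collect failed rules and rules flagged for human review from a result payload."""
    failed = [
        rule["ruleId"]
        for rule in data.get("rules", [])
        if rule.get("passed") is False  # a missing flag is not counted as a failure
    ]
    needs_review = [
        outcome["ruleId"]
        for outcome in data.get("rescueOutcomes", [])
        if outcome.get("requiresHumanReview")
    ]
    return {
        "runId": data["runId"],
        "resultStatus": data.get("resultStatus"),
        "totalScore": data.get("totalScore"),
        "failedRuleIds": failed,
        "humanReviewRuleIds": needs_review,
    }
```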
---
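## 6. Reading the worker logs
The key worker logs have been made human-readable. Focus on the following two kinds. The messages are emitted in Chinese: `已投递到 worker 队列` means "dispatched to the worker queue" and `worker开始执行` means "worker started executing".
### Dispatch log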
```text
run_id=13 已投递到 worker 队列: queue=leaudit.normal, speed=normal, task_id=...
```
### Execution log
```text
run_id=13 worker开始执行: queue=leaudit.normal, speed=normal, filename=版本归档验证合同.docx
```
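Combined with the status query endpoint, these logs make it quick to determine:
- Whether the run was dispatched successfully
- Whether the worker has picked it up
- Whether it is running on `urgent` or `normal`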
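
When many runs are in flight, the two log lines can be parsed mechanically to answer the same questions. This is a sketch that assumes the log format is exactly as in the two samples; `parse_worker_log` and the `dispatched` / `started` labels are hypothetical names introduced here.

```python
import re
from typing import Optional

LOG_PATTERN = re.compile(
    r"run_id=(?P<run_id>\d+) (?P<message>.+?): "
    r"queue=(?P<queue>[\w.]+), speed=(?P<speed>\w+)"
)

# Message text as it appears in the two sample log lines.
EVENT_BY_MESSAGE = {
    "已投递到 worker 队列": "dispatched",
    "worker开始执行": "started",
}


def parse_worker_log(line: str) -> Optional[dict]:
    """Extract run ID, event, queue and speed from a worker log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "runId": int(match.group("run_id")),
        "event": EVENT_BY_MESSAGE.get(match.group("message"), "unknown"),
        "queue": match.group("queue"),
        "speed": match.group("speed"),
    }
```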
---
## 7. Suggested frontend integration
### Document upload page
1. Call `POST /upload`
2. Read from the response:
   - `documentId`
   - `versionGroupKey`
   - `versionNo`
   - `duplicateUpload`
   - `run`
### Automatic audit scenario
If `autoRun=true` and the response has `run != null`:
1. Take `run.runId`
2. Poll `GET /audit/run/{runId}`
3. Stop polling once `status=completed/failed`
4. Then call `GET /audit/result/{runId}`
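
The polling steps can be sketched as a loop with an injectable fetch function, which keeps the example independent of any HTTP client. Everything here is illustrative: `poll_run` is a hypothetical helper, `fetch_status` is assumed to return the `data` object of `GET /audit/run/{runId}`, and the interval and attempt limit are arbitrary values, not recommendations from the source.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_run(fetch_status: Callable[[int], dict], run_id: int,
             interval_seconds: float = 2.0, max_attempts: int = 150,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the run reaches a terminal status and return the last payload."""
    for attempt in range(max_attempts):
        data = fetch_status(run_id)
        if data["status"] in TERMINAL_STATUSES:
            return data
        # No need to wait after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_attempts} polls")
```

Once `poll_run` returns, the caller can check `status` and, for a completed run, request `GET /audit/result/{runId}`.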
### List page
By default the list page should show only documents with `is_latest_version = true`. When a record is clicked, expand its history versions by `versionGroupKey`.