feat: complete M1-M3 infrastructure — OSS client, native execution chain, rule lifecycle API, system docs
- M1: unified OSS client (upload/download/presign) + path utils + config
- M2: rule service with validate/create/publish/rollback + binding CRUD endpoints
- M3: native AuditCtx runner, file/rule resolvers, storage adapter with full persistence
- docs: SYSTEM_OVERVIEW.md as comprehensive architecture reference
- fix: double finalize; terminal state now written once by finalize_run
@@ -0,0 +1,193 @@
# LeAudit Platform: System Status Overview

> Last updated: 2026-04-28

## 1. Target Architecture

```
┌─ API ────────────────────────────────────────────────────────────────────────┐
│ AuditController (/audit)       RuleController (/rule-sets)                   │
│ POST /run         trigger      GET    /                      list rule sets  │
│ GET  /run/:id     status       GET    /{type}/versions       list versions   │
│ GET  /result/:id  result       GET    /versions/:id/content  get content     │
│                                POST   /{type}/validate       validate        │
│                                POST   /{type}/versions       create          │
│                                POST   /{type}/publish        publish         │
│                                POST   /{type}/rollback       roll back       │
│                                GET    /bindings              list bindings   │
│                                POST   /{type}/bindings       create binding  │
│                                PUT    /bindings/{id}         update binding  │
│                                DELETE /bindings/{id}         delete binding  │
├─ Service ────────────────────────────────────────────────────────────────────┤
│ AuditServiceImpl               RuleServiceImpl + OssServiceImpl              │
├─ Bridge ─────────────────────────────────────────────────────────────────────┤
│ FileSourceResolver        RuleVersionResolver       RulesLoader              │
│ AuditCtxBuilder           AuditServiceFactory       NativeRunner             │
│ StorageAdapter            ResultAdapter             RuleValidator            │
│ tasks.py (dispatch_leaudit_task)                                             │
├─ leaudit (unchanged) ────────────────────────────────────────────────────────┤
│ AuditCtx → AuditService.audit() → OCR/Normalize/Extract/                     │
│ Evaluate/Rescue → populates ctx.normalized_doc/extraction/                   │
│ evaluation/fallback_tasks/timing/extraction_errors                           │
├─ Storage ────────────────────────────────────────────────────────────────────┤
│ OSS (MinIO)                            PostgreSQL                            │
│ bdocs/{region}/{type}/{doc_id}/        leaudit_documents                     │
│ rules/{rule_type}/{version_no}/        leaudit_document_files                │
│ artifacts/{region}/{run_id}/           leaudit_audit_runs                    │
│                                        leaudit_rule_sets                     │
│                                        leaudit_rule_versions                 │
│                                        leaudit_rule_type_bindings            │
│                                        leaudit_rule_results                  │
│                                        leaudit_field_results                 │
│                                        leaudit_run_metrics                   │
│                                        leaudit_run_errors                    │
│                                        leaudit_rescue_outcomes               │
│                                        leaudit_artifacts                     │
└──────────────────────────────────────────────────────────────────────────────┘
```

## 2. Two Core Data Flows

### Flow A: Rule Lifecycle

```
Edit YAML
  → POST /rule-sets/{type}/validate  → RuleValidator (YAML syntax + DSL semantics)
  → POST /rule-sets/{type}/versions  → upload to OSS + INSERT leaudit_rule_versions
  → POST /rule-sets/{type}/publish   → UPDATE leaudit_rule_sets.current_version_id
  → POST /rule-sets/{type}/bindings  → INSERT leaudit_rule_type_bindings
  → new runs automatically bind the new version for the matching document type
```
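
The lifecycle above boils down to "versions are immutable, publishing moves a pointer". A minimal in-memory sketch of that idea follows; the `RuleSet` class, its method names, and the empty-text check are all hypothetical stand-ins (the real service uploads to OSS, writes `leaudit_rule_versions`, and validates through `RuleValidator`), and treating rollback as re-pointing to an existing version is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RuleSet:
    """Hypothetical stand-in for one leaudit_rule_sets row plus its versions."""

    rule_type: str
    versions: Dict[int, str] = field(default_factory=dict)  # version_id -> YAML text
    current_version_id: Optional[int] = None

    def create_version(self, yaml_text: str) -> int:
        # Placeholder check only; real validation covers YAML syntax and DSL semantics.
        if not yaml_text.strip():
            raise ValueError("empty rule YAML")
        version_id = len(self.versions) + 1
        self.versions[version_id] = yaml_text
        return version_id

    def publish(self, version_id: int) -> None:
        # Publishing only moves the pointer; stored versions are never edited.
        if version_id not in self.versions:
            raise KeyError(version_id)
        self.current_version_id = version_id

    def rollback(self, version_id: int) -> None:
        # Assumption: rollback re-points to an already stored version.
        if version_id == self.current_version_id:
            raise ValueError("version is already current")
        self.publish(version_id)


rule_set = RuleSet("contract")
v1 = rule_set.create_version("rules: []")
v2 = rule_set.create_version("rules: [a]")
rule_set.publish(v1)
rule_set.publish(v2)
rule_set.rollback(v1)
```

Because neither `publish` nor `rollback` touches `versions`, every version that was ever created stays retrievable by id.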

### Flow B: Audit Execution

```
POST /audit/run { documentId }
  → AuditServiceImpl.Run()
      1. Query leaudit_documents + leaudit_document_files
      2. Query leaudit_rule_type_bindings → rule_set_id + rule_version_id
      3. INSERT leaudit_audit_runs (rule_version_id, rule_source_oss_url, sha256)
      4. FileSourceResolver → download document bytes
      5. dispatch_leaudit_task()
           ┌─ RuleVersionResolver → download rule YAML from OSS → SHA256 check
           ├─ RulesLoader → RulesFile
           ├─ NativeRunner.run() → AuditCtx → AuditService.audit()
           └─ NativeRunner.persist_result()
                ├─ save_ocr_result()          → leaudit_artifacts
                ├─ save_extraction_result()   → leaudit_field_results
                ├─ save_evaluation_results()  → leaudit_rule_results + scores
                ├─ save_run_errors()          → leaudit_run_errors
                ├─ save_rescue_outcomes()     → leaudit_rescue_outcomes
                ├─ save_run_metrics()         → leaudit_run_metrics
                └─ finalize_run()             → terminal state (result_status/finished_at)
```
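
The "SHA256 check" step in the flow above can be sketched with the standard library. The function name `verify_rule_yaml` and the choice to raise `ValueError` are assumptions for illustration; the source only states that `RuleVersionResolver` downloads the YAML and checks its SHA256.

```python
import hashlib


def verify_rule_yaml(content: bytes, expected_sha256: str) -> bytes:
    """Return the content unchanged if its SHA-256 hex digest matches."""
    actual = hashlib.sha256(content).hexdigest()
    # hexdigest() is lowercase, so normalise the stored value before comparing.
    if actual != expected_sha256.lower():
        raise ValueError(
            f"rule YAML digest mismatch: expected {expected_sha256}, got {actual}"
        )
    return content


yaml_bytes = b"rules: []\n"
stored_digest = hashlib.sha256(yaml_bytes).hexdigest()
checked = verify_rule_yaml(yaml_bytes, stored_digest)
```

Comparing against the digest recorded at run creation time, rather than only the one on the version row, would also catch an object that was overwritten in OSS after the run was queued.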

## 3. Module Completion

### M1: OSS Infrastructure (100%)

| File | Status | Notes |
|---|---|---|
| `fastapi_common/fastapi_common_storage/oss_client.py` | Done | Upload/download/Presign/ObjectExists |
| `fastapi_common/fastapi_common_storage/oss_path_utils.py` | Done | BuildBusinessDocKey/ArtifactKey/RuleYamlKey |
| `fastapi_admin/config/_settings.py` | Done | OSS_ENDPOINT/ACCESS_KEY/SECRET_KEY/BUCKET/REGION/USE_SSL/PRESIGN_EXPIRE |
| `fastapi_admin/config/__init__.pyi` | Done | Type declarations |

### M2: Rule Management Backend (100%)

| File | Status | Notes |
|---|---|---|
| `controllers/ruleController.py` | Done | 11 endpoints (including binding management) |
| `services/ruleService.py` | Done | IRuleService interface |
| `services/impl/ruleServiceImpl.py` | Done | Rule CRUD + binding CRUD |
| `leaudit_bridge/ruleValidator.py` | Done | YAML syntax + DSL semantic validation |
| `domian/Dto/ruleVersionCreateDto.py` | Done | Version creation DTO |
| `domian/Dto/ruleValidateDto.py` | Done | Validation DTO |
| `domian/Dto/rulePublishDto.py` | Done | Publish/rollback DTO |
| `domian/Dto/ruleBindingDto.py` | Done | Binding DTO |
| `domian/vo/ruleVo.py` | Done | RuleSetVO/VersionVO/ContentVO/ValidationVO/BindingVO |

### M3: Execution Chain and Persistence (~95%)

| File | Status | Notes |
|---|---|---|
| `leaudit_bridge/auditCtxBuilder.py` | Done | Builds the native AuditCtx |
| `leaudit_bridge/auditServiceFactory.py` | Done | Creates AuditService |
| `leaudit_bridge/nativeRunner.py` | Done | Native execution entry point + persist_result |
| `leaudit_bridge/fileSourceResolver.py` | Done | Document source resolution (backed by OSS) |
| `leaudit_bridge/ruleVersionResolver.py` | Done | Rule version resolution (with SHA256) |
| `leaudit_bridge/storage_adapter.py` | Done | All persistence methods (double finalize fixed) |
| `leaudit_bridge/tasks.py` | Done | Task entry point + failure handling |
| `leaudit_bridge/pipeline.py` | To be retired | Legacy pipeline (replaced by nativeRunner) |
| `leaudit_bridge/client_factory.py` | 90% | OCR/LLM/VLM client factory |

### M4: End-to-End Integration (0%)

| Item | Status |
|---|---|
| Review of upload entry points | To do |
| Automatic audit trigger after upload | To do |
| Result query and display | To do |
| Integration sample preparation | To do |
| E2E verification | To do |

## 4. Persistence Order and Terminal State

`persist_result()` runs strictly in the order below. `finalize_run` comes last and is the only writer of the terminal state:

```
save_ocr_result()
save_extraction_result()
save_evaluation_results()    ← writes scores/counts only, never finished_at
save_run_errors()
save_rescue_outcomes()
save_run_metrics()
finalize_run()               ← sole writer of result_status / finished_at / rescue_applied / phase
```
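
A small sketch of this ordering follows. Only the step names and their order come from the list above; the in-memory `run_row` and `child_tables`, the sample values, and the `"completed"` status string are hypothetical stand-ins for the real tables and for whatever `StorageAdapter` actually writes.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical stand-ins for the run row and the child tables.
run_row: Dict[str, object] = {"result_status": None, "finished_at": None}
child_tables: Dict[str, List[dict]] = {}


def save_evaluation_results() -> None:
    # Writes scores and counts only; the terminal fields stay untouched.
    child_tables.setdefault("leaudit_rule_results", []).append({"rule": "r1", "passed": True})
    run_row["total_score"] = 92


def save_run_metrics() -> None:
    child_tables.setdefault("leaudit_run_metrics", []).append({"ocr_ms": 1200})


def finalize_run() -> None:
    # The only function allowed to write the terminal fields.
    run_row["result_status"] = "completed"
    run_row["finished_at"] = datetime.now(timezone.utc)


def persist_result() -> List[str]:
    # Child-table writers first, the single terminal writer last.
    steps = [save_evaluation_results, save_run_metrics, finalize_run]
    for step in steps:
        step()
    return [step.__name__ for step in steps]


executed = persist_result()
```

With a single terminal writer, a reader that sees `finished_at` set can assume every child-table write before it was at least attempted.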

**Terminal-state sources (read directly from AuditCtx; no parallel state is stored):**

| Platform table | Source |
|---|---|
| `leaudit_run_metrics` | `ctx.timing` |
| `leaudit_run_errors` | `ctx.extraction_errors` + `ctx.extraction.all_errors` |
| `leaudit_rescue_outcomes` | `ctx.fallback_tasks` |
| `leaudit_audit_runs.result_status` | Derived from `ctx.fallback_tasks` + `ctx.evaluation` |
| `leaudit_audit_runs.finished_at` | `finalize_run` writes now() |
| `leaudit_audit_runs.rescue_applied` | `bool(ctx.fallback_tasks)` |

## 5. Database Table Relationships

```
leaudit_document_types
  └── leaudit_rule_type_bindings (doc_type_id → rule_set_id)
        └── leaudit_rule_sets (current_version_id)
              └── leaudit_rule_versions (oss_url, sha256)
                    └── leaudit_audit_runs (rule_version_id, rule_source_oss_url)
                          ├── leaudit_rule_results
                          ├── leaudit_field_results
                          ├── leaudit_run_metrics
                          ├── leaudit_run_errors
                          ├── leaudit_rescue_outcomes
                          └── leaudit_artifacts
```

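## 6. Key Design Principles

1. **leaudit core stays untouched**: all customization lives in the bridge layer, and leaudit remains a pure Python package
2. **Read-only AuditCtx**: after execution the platform only reads from ctx and never simulates stages itself
3. **Rule version traceability**: every run is bound to a specific `rule_version_id`, so old runs are unaffected by new publications
4. **OSS as source of truth + DB as index**: YAML bodies are stored in OSS, metadata indexes in PostgreSQL
5. **Single-point terminal write**: `finalize_run` is the only writer of the terminal state on the run table
6. **Independent session commits**: each `save_*` uses its own session; writing child tables before the main table provides eventual consistency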
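
Principle 3 (version pinning) can be sketched in a few lines. The `AuditRun` and `RuleSetPointer` classes and the `start_run` helper are hypothetical; the point is only that a run copies `rule_version_id` at creation time and does not follow the rule set's current pointer afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRun:
    """Hypothetical run record; the version id is fixed once the run exists."""

    run_id: int
    rule_version_id: int


class RuleSetPointer:
    """Hypothetical mutable pointer, like leaudit_rule_sets.current_version_id."""

    def __init__(self, current_version_id: int) -> None:
        self.current_version_id = current_version_id


def start_run(run_id: int, rule_set: RuleSetPointer) -> AuditRun:
    # Copy the id by value so later publications cannot change this run.
    return AuditRun(run_id=run_id, rule_version_id=rule_set.current_version_id)


pointer = RuleSetPointer(current_version_id=7)
old_run = start_run(1, pointer)
pointer.current_version_id = 8  # a new version is published
new_run = start_run(2, pointer)
```

Here `old_run` still reports version 7 after the publication, while `new_run` picks up version 8.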

## 7. Known Gaps

| Gap | Priority |
|---|---|
| `run_metrics.llm_call_count` / `vlm_call_count` are still empty | P1 (can be aggregated from RescueTask) |
| `ctx.timing` lacks separate normalize/rescue keys | P2 (waiting for native leaudit support) |
| Document status updates are spread across two places | P1 (should converge into StorageAdapter) |
| Celery async execution | P2 (synchronous mode works today; production needs async) |
| Hard-coded `_TYPE_ID_RULES_MAP` fallback | P2 (remove once bindings cover every type) |
| M4 E2E integration | P0 (next milestone) |
@@ -27,10 +27,13 @@ REDIS_PASSWORD: str
 
 # OSS
 OSS_ENDPOINT: str
+OSS_BASE_URL: str
 OSS_ACCESS_KEY: str
 OSS_SECRET_KEY: str
 OSS_BUCKET: str
 OSS_REGION: str
+OSS_USE_SSL: bool
+OSS_PRESIGN_EXPIRE_SECONDS: int
 
 # LLM
 LLM_BASE_URL: str
@@ -52,15 +52,19 @@ class RedisSettings(_Base):
     REDIS_PORT: int = 6379
     REDIS_DB: int = 0
     REDIS_PASSWORD: str = ""
+    REDIS_KEY_PREFIX: str = "leaudit"
 
 
 class OssSettings(_Base):
     """OSS object storage settings [OSS]."""
     OSS_ENDPOINT: str = ""
+    OSS_BASE_URL: str = ""
     OSS_ACCESS_KEY: str = ""
     OSS_SECRET_KEY: str = ""
     OSS_BUCKET: str = "leaudit"
     OSS_REGION: str = ""
+    OSS_USE_SSL: bool = True
+    OSS_PRESIGN_EXPIRE_SECONDS: int = 3600
 
 
 class LlmSettings(_Base):
@@ -74,6 +78,7 @@ class VlmSettings(_Base):
     """VLM settings [VLM]."""
     VLM_BASE_URL: str = ""
     VLM_MODEL: str = ""
+    VLM_API_KEY: str = ""
 
 
 class OcrSettings(_Base):
@@ -0,0 +1,6 @@
"""Public exports for the shared object-storage utilities."""

from fastapi_common.fastapi_common_storage.oss_client import OssClient, OssObjectRef
from fastapi_common.fastapi_common_storage.oss_path_utils import OssPathUtils

__all__ = ["OssClient", "OssObjectRef", "OssPathUtils"]
@@ -0,0 +1,232 @@
"""Unified OSS / MinIO client."""

from __future__ import annotations

import tempfile
from dataclasses import dataclass
from datetime import timedelta
from io import BytesIO
from pathlib import Path
from urllib.parse import urlparse

import httpx

from fastapi_admin.config import (
    OSS_ACCESS_KEY,
    OSS_BASE_URL,
    OSS_BUCKET,
    OSS_ENDPOINT,
    OSS_PRESIGN_EXPIRE_SECONDS,
    OSS_SECRET_KEY,
    OSS_USE_SSL,
)


@dataclass(frozen=True)
class OssObjectRef:
    """Reference to an object in object storage."""

    bucket: str
    objectKey: str
    source: str
    isDirectUrl: bool = False


class OssClient:
    """Single wrapper for OSS access to documents, rules, and artifacts."""

    def __init__(
        self,
        *,
        endpoint: str | None = None,
        accessKey: str | None = None,
        secretKey: str | None = None,
        bucket: str | None = None,
        useSsl: bool | None = None,
        baseUrl: str | None = None,
        presignExpireSeconds: int | None = None,
    ) -> None:
        # Explicit arguments win; anything left unset falls back to the config values.
        self.endpoint = endpoint or OSS_ENDPOINT
        self.accessKey = accessKey or OSS_ACCESS_KEY
        self.secretKey = secretKey or OSS_SECRET_KEY
        self.bucket = bucket or OSS_BUCKET
        self.useSsl = OSS_USE_SSL if useSsl is None else useSsl
        self.baseUrl = (baseUrl or OSS_BASE_URL).rstrip("/")
        self.presignExpireSeconds = presignExpireSeconds or OSS_PRESIGN_EXPIRE_SECONDS
        # The MinIO client is created lazily on first use.
        self._minioClient = None

    def UploadBytes(
        self,
        ObjectKey: str,
        Content: bytes,
        ContentType: str = "application/octet-stream",
        Bucket: str | None = None,
    ) -> str:
        """Upload binary content and return the object's URL."""
        Client = self._GetMinioClient()
        TargetBucket = Bucket or self.bucket
        Data = BytesIO(Content)
        Client.put_object(
            TargetBucket,
            ObjectKey,
            Data,
            length=len(Content),
            content_type=ContentType,
        )
        return self.BuildObjectUrl(ObjectKey=ObjectKey, Bucket=TargetBucket)

    def UploadText(
        self,
        ObjectKey: str,
        Content: str,
        ContentType: str = "text/plain; charset=utf-8",
        Bucket: str | None = None,
    ) -> str:
        """Upload text content (encoded as UTF-8) and return the object's URL."""
        return self.UploadBytes(
            ObjectKey=ObjectKey,
            Content=Content.encode("utf-8"),
            ContentType=ContentType,
            Bucket=Bucket,
        )

    def UploadFile(
        self,
        ObjectKey: str,
        LocalPath: str,
        ContentType: str = "application/octet-stream",
        Bucket: str | None = None,
    ) -> str:
        """Upload a local file and return the object's URL."""
        FilePath = Path(LocalPath)
        # The whole file is read into memory before uploading.
        return self.UploadBytes(
            ObjectKey=ObjectKey,
            Content=FilePath.read_bytes(),
            ContentType=ContentType,
            Bucket=Bucket,
        )

    def DownloadBytes(self, Source: str, Bucket: str | None = None) -> bytes:
        """Download an object's content."""
        Ref = self.ResolveObjectRef(Source=Source, Bucket=Bucket)
        if Ref.isDirectUrl:
            return self._DownloadBytesFromUrl(Ref.source)

        Client = self._GetMinioClient()
        Response = Client.get_object(Ref.bucket, Ref.objectKey)
        try:
            return Response.read()
        finally:
            # Always hand the connection back to the pool.
            Response.close()
            Response.release_conn()

    def DownloadToTempFile(
        self,
        Source: str,
        *,
        Suffix: str = "",
        Prefix: str = "oss-",
        Bucket: str | None = None,
    ) -> str:
        """Download an object to a local temporary file and return its path."""
        Content = self.DownloadBytes(Source=Source, Bucket=Bucket)
        return self.WriteTempBytes(Content=Content, Suffix=Suffix, Prefix=Prefix)

    def WriteTempBytes(
        self,
        *,
        Content: bytes,
        Suffix: str = "",
        Prefix: str = "oss-",
    ) -> str:
        """Write in-memory object content to a local temporary file."""
        # delete=False keeps the file after the handle closes; the caller must remove it.
        with tempfile.NamedTemporaryFile(mode="wb", suffix=Suffix, prefix=Prefix, delete=False) as TempFile:
            TempFile.write(Content)
            return TempFile.name

    def ObjectExists(self, Source: str, Bucket: str | None = None) -> bool:
        """Return whether the object exists."""
        Ref = self.ResolveObjectRef(Source=Source, Bucket=Bucket)
        if Ref.isDirectUrl:
            try:
                Response = httpx.head(Ref.source, timeout=30.0, follow_redirects=True)
                return Response.status_code < 400
            except Exception:
                # Any failure, including a network error, is reported as "missing".
                return False

        Client = self._GetMinioClient()
        try:
            Client.stat_object(Ref.bucket, Ref.objectKey)
            return True
        except Exception:
            return False

    def BuildObjectUrl(self, ObjectKey: str, Bucket: str | None = None) -> str:
        """Build the address used to reference an object."""
        TargetBucket = Bucket or self.bucket
        if self.baseUrl:
            return f"{self.baseUrl}/{TargetBucket}/{ObjectKey.lstrip('/')}"
        return f"oss://{TargetBucket}/{ObjectKey.lstrip('/')}"

    def PresignGetUrl(self, Source: str, Bucket: str | None = None) -> str:
        """Generate a presigned download URL for an object."""
        Ref = self.ResolveObjectRef(Source=Source, Bucket=Bucket)
        if Ref.isDirectUrl:
            # Direct URLs are returned as-is; there is nothing to sign.
            return Ref.source

        Client = self._GetMinioClient()
        return Client.presigned_get_object(
            Ref.bucket,
            Ref.objectKey,
            expires=timedelta(seconds=self.presignExpireSeconds),
        )

    def ResolveObjectRef(self, Source: str, Bucket: str | None = None) -> OssObjectRef:
        """Normalize a URL, an oss:// URI, or a bare object key into an object reference."""
        Parsed = urlparse(Source)
        if Parsed.scheme in {"http", "https"}:
            return OssObjectRef(
                bucket=Bucket or self.bucket,
                objectKey="",
                source=Source,
                isDirectUrl=True,
            )

        if Parsed.scheme == "oss":
            # oss://<bucket>/<key>: the bucket comes from the URI when present.
            BucketName = Parsed.netloc or Bucket or self.bucket
            ObjectKey = Parsed.path.lstrip("/")
            return OssObjectRef(
                bucket=BucketName,
                objectKey=ObjectKey,
                source=Source,
            )

        return OssObjectRef(
            bucket=Bucket or self.bucket,
            objectKey=Source.lstrip("/"),
            source=Source,
        )

    def _DownloadBytesFromUrl(self, Url: str) -> bytes:
        """Download an object from a direct URL."""
        with httpx.Client(timeout=60.0, follow_redirects=True) as Client:
            Response = Client.get(Url)
            Response.raise_for_status()
            return Response.content

    def _GetMinioClient(self):
        """Return the underlying MinIO client, creating it on first use."""
        if self._minioClient is None:
            from minio import Minio

            Endpoint = self.endpoint
            HasScheme = "://" in Endpoint
            Parsed = urlparse(Endpoint if HasScheme else f"http://{Endpoint}")
            Host = Parsed.netloc or Parsed.path
            # An explicit http/https scheme decides TLS. Without one, fall back to
            # useSsl; checking the parsed scheme alone would always see the "http"
            # placeholder added above and silently ignore OSS_USE_SSL.
            if HasScheme and Parsed.scheme in {"http", "https"}:
                Secure = Parsed.scheme == "https"
            else:
                Secure = self.useSsl
            self._minioClient = Minio(
                Host,
                access_key=self.accessKey,
                secret_key=self.secretKey,
                secure=Secure,
            )
        return self._minioClient
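
The three source forms that `ResolveObjectRef` accepts can be tried out in isolation. The sketch below mirrors its branching with only the standard library; the function name `classify_source`, the kind labels, and the sample URLs are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def classify_source(source: str, default_bucket: str = "leaudit") -> Tuple[str, str, str]:
    """Return (kind, bucket, object_key) for a URL, an oss:// URI, or a bare key."""
    parsed = urlparse(source)
    if parsed.scheme in {"http", "https"}:
        # Direct URLs carry no object key; they are fetched over plain HTTP.
        return ("direct-url", default_bucket, "")
    if parsed.scheme == "oss":
        return ("oss-uri", parsed.netloc or default_bucket, parsed.path.lstrip("/"))
    return ("object-key", default_bucket, source.lstrip("/"))


direct = classify_source("https://cdn.example.com/leaudit/a.pdf")
oss_uri = classify_source("oss://rules-bucket/rules/contract/v3/rules.yaml")
bare_key = classify_source("/bdocs/cn-east/contract/42/v1/main.pdf")
```

One consequence visible in the client code: when `OSS_BASE_URL` is set, `UploadBytes` returns an http(s) URL, so passing that value back to `DownloadBytes` takes the direct-URL branch (an `httpx` GET) rather than going through the MinIO client and its credentials.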
@@ -0,0 +1,42 @@
"""OSS path utilities."""

from __future__ import annotations

from pathlib import Path


class OssPathUtils:
    """Single place that generates the OSS object keys used by LeAudit."""

    @staticmethod
    def BuildBusinessDocKey(
        Region: str,
        TypeCode: str,
        DocumentId: int,
        Version: str,
        FileRole: str,
        FileName: str,
    ) -> str:
        """Build the object key for a business document."""
        # Only the extension of the original file name is kept.
        Ext = Path(FileName).suffix or ""
        return f"bdocs/{Region}/{TypeCode}/{DocumentId}/{Version}/{FileRole}{Ext}"

    @staticmethod
    def BuildArtifactKey(
        Region: str,
        RunId: int,
        ArtifactType: str,
        Detail: str,
    ) -> str:
        """Build the object key for an audit artifact."""
        return f"artifacts/{Region}/{RunId}/{ArtifactType}/{Detail}"

    @staticmethod
    def BuildRuleYamlKey(RuleType: str, VersionNo: str) -> str:
        """Build the object key for a rule YAML file."""
        return f"rules/{RuleType}/{VersionNo}/rules.yaml"

    @staticmethod
    def BuildRuleValidationReportKey(RuleType: str, VersionNo: str) -> str:
        """Build the object key for a rule validation report."""
        return f"rules/{RuleType}/{VersionNo}/validation_report.json"
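
To see what the business-document key looks like in practice, here is a standalone mirror of `BuildBusinessDocKey`. The function name `build_business_doc_key` and all sample values are hypothetical; the format string is the one used above.

```python
from pathlib import Path


def build_business_doc_key(
    region: str,
    type_code: str,
    document_id: int,
    version: str,
    file_role: str,
    file_name: str,
) -> str:
    """Standalone mirror of the bdocs/ key layout, for illustration only."""
    # Path.suffix returns only the last extension, or "" when there is none.
    ext = Path(file_name).suffix
    return f"bdocs/{region}/{type_code}/{document_id}/{version}/{file_role}{ext}"


pdf_key = build_business_doc_key("cn-east", "contract", 42, "v1", "main", "scan.PDF")
archive_key = build_business_doc_key("cn-east", "contract", 42, "v1", "attachment", "bundle.tar.gz")
bare_key = build_business_doc_key("cn-east", "contract", 42, "v1", "main", "README")
```

The original file name does not survive in the key: only its final extension is kept (with its original casing), so `bundle.tar.gz` is stored under `attachment.gz`. If the full name matters, it would need to be recorded elsewhere, for example in the database row.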
@@ -0,0 +1,10 @@
"""Rule publish DTO."""

from pydantic import BaseModel, Field


class RulePublishDTO(BaseModel):
    """Request to publish or roll back a rule version."""

    versionId: int = Field(..., description="Rule version ID")
    operatorUserId: int | None = Field(None, description="Operator user ID")
@@ -0,0 +1,9 @@
"""Rule validation DTO."""

from pydantic import BaseModel, Field


class RuleValidateDTO(BaseModel):
    """Request to validate rule YAML."""

    yamlText: str = Field(..., description="Rule YAML body")
@@ -0,0 +1,11 @@
"""Rule version creation DTO."""

from pydantic import BaseModel, Field


class RuleVersionCreateDTO(BaseModel):
    """Request to create a rule version."""

    yamlText: str = Field(..., description="Rule YAML body")
    changeNote: str | None = Field(None, description="Version change note")
    editorUserId: int | None = Field(None, description="Editor user ID")
@@ -1,44 +1,30 @@
-"""leaudit bridge — use leaudit's full pipeline with docauditai's database storage.
+"""LeAudit Bridge module.
 
-Directly calls leaudit's OCR → extraction → evaluation pipeline
-and persists results into docauditai's PostgreSQL via PostgREST.
-
-Configuration switch (in env.{port}):
-    PIPELINE_MODE=leaudit → use leaudit pipeline
+Exposes a single bridge entry point to the platform while the internals migrate
+step by step from the old hand-written pipeline to the native ``AuditCtx`` + ``AuditService`` route.
 """
 
-from leaudit_bridge.client_factory import (
+from fastapi_modules.fastapi_leaudit.leaudit_bridge.client_factory import (
     create_ocr_client,
     create_llm_client,
     create_vlm_client,
 )
-from leaudit_bridge.ocr_bridge import BridgeOCRClient
-from leaudit_bridge.pipeline import LauditPipeline, PipelineResult
-from leaudit_bridge.rules_loader import RulesLoader
-from leaudit_bridge.storage_adapter import StorageAdapter
+from fastapi_modules.fastapi_leaudit.leaudit_bridge.ocr_bridge import BridgeOCRClient
+from fastapi_modules.fastapi_leaudit.leaudit_bridge.pipeline import LauditPipeline, PipelineResult
+from fastapi_modules.fastapi_leaudit.leaudit_bridge.rules_loader import RulesLoader
+from fastapi_modules.fastapi_leaudit.leaudit_bridge.storage_adapter import StorageAdapter
 
 
 def is_leaudit_mode() -> bool:
-    """Check if the system is configured to use the leaudit pipeline."""
-    from core.config import PIPELINE_MODE
-    return PIPELINE_MODE == "leaudit"
+    """The new platform always uses the leaudit pipeline."""
+    return True
 
 
 def create_pipeline(rules_path: str | None = None) -> LauditPipeline:
-    """Create a fully configured LauditPipeline from current config.
+    """Create the legacy-compatible LauditPipeline.
 
-    Wraps the raw OCR client with DocNormalizationAdapter so that a single
-    ``.ocr()`` call produces a fully enriched OcrResult with:
-      - Document classification (type_id + rules_file_path)
-      - Dossier segmentation (sub-document page mapping)
-      - Seal/signature enrichment (text, seal_id, party_id)
-      - Normalized markdown (seal blocks + page separators)
-
-    Args:
-        rules_path: If provided, forces the adapter to use this rules file
-            for classification and segmentation. When None, the adapter
-            uses the RulesFileRegistry to classify from document content,
-            enabling auto-detection of sub-types (e.g. 行政许可 variants).
+    This entry point is kept for legacy callers; the production execution chain
+    should gradually switch to ``NativeRunner``.
     """
     from pathlib import Path
     from leaudit.doc_normalization.adapter import DocNormalizationAdapter
@@ -51,7 +37,7 @@ def create_pipeline(rules_path: str | None = None) -> LauditPipeline:
     # Build registry from rules/ directory for content-based classification
     registry = None
     if rules_path is None:
-        rules_dir = Path(__file__).resolve().parents[1] / "rules"
+        rules_dir = Path(__file__).resolve().parents[3] / "rules"
         if rules_dir.is_dir():
             registry = RulesFileRegistry.from_directory(rules_dir)
 
@@ -0,0 +1,75 @@
"""Build native leaudit ``AuditCtx`` instances from platform-side inputs."""

from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

from leaudit.config.audit_config import AuditConfig
from leaudit.services.audit_ctx import AuditCtx
from leaudit.services.audit_services import AuditServices


@dataclass(frozen=True)
class NativeAuditMetadata:
    """Platform-side metadata kept outside the native ``AuditCtx`` model."""

    run_id: int
    document_id: int
    document_file_id: int | None = None
    rule_set_id: int | None = None
    rule_version_id: int | None = None
    trigger_user_id: int | None = None
    extras: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class NativeAuditBuildInput:
    """Everything the bridge knows before constructing a native ``AuditCtx``."""

    metadata: NativeAuditMetadata
    file_path: str
    services: AuditServices
    rules_file: Any | None = None
    page_range: tuple[int, ...] | None = None
    rule_source_path: str | None = None
    force_rules_path: str | None = None
    config_overrides: dict[str, Any] = field(default_factory=dict)


class AuditCtxBuilder:
    """Translate platform-side run inputs into leaudit's native ``AuditCtx``."""

    def build(self, payload: NativeAuditBuildInput) -> AuditCtx:
        """Create a native ``AuditCtx`` ready for ``AuditService.audit``."""
        config = self.build_config(
            force_rules_path=payload.force_rules_path or payload.rule_source_path,
            overrides=payload.config_overrides,
        )
        return AuditCtx(
            document_id=str(payload.metadata.document_id),
            rules_file=payload.rules_file,
            services=payload.services,
            file_path=payload.file_path,
            page_range=payload.page_range,
            config=config,
        )

    def build_config(
        self,
        *,
        force_rules_path: str | None = None,
        overrides: dict[str, Any] | None = None,
    ) -> AuditConfig:
        """Build native ``AuditConfig`` from platform-side overrides."""
        raw = dict(overrides or {})
        # An explicit override wins; the forced path only fills the gap.
        if force_rules_path and "force_rules_path" not in raw:
            raw["force_rules_path"] = force_rules_path
        return AuditConfig(**raw)


__all__ = [
    "AuditCtxBuilder",
    "NativeAuditBuildInput",
    "NativeAuditMetadata",
]
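
The precedence rule inside `build_config` is easy to check on plain dictionaries. The helper below is a hypothetical extraction of just the merge step (the name `merge_config` and the sample paths are made up); it leaves out the final `AuditConfig(**raw)` call so it runs without leaudit installed.

```python
from typing import Any, Dict, Optional


def merge_config(
    force_rules_path: Optional[str],
    overrides: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Mirror of the merge step: explicit overrides beat the forced path."""
    raw = dict(overrides or {})  # copy, so the caller's dict is never mutated
    if force_rules_path and "force_rules_path" not in raw:
        raw["force_rules_path"] = force_rules_path
    return raw


caller_overrides = {"force_rules_path": "/rules/b.yaml"}
filled = merge_config("/rules/a.yaml", None)
kept = merge_config("/rules/a.yaml", caller_overrides)
empty = merge_config(None, {})
```

Since `build` passes `payload.force_rules_path or payload.rule_source_path`, the effective order is: a `force_rules_path` key in `config_overrides`, then `force_rules_path`, then `rule_source_path`.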
@@ -0,0 +1,132 @@
"""Factory helpers for leaudit's native service-layer orchestration.

This module is the bridge-side assembly point for native leaudit services:

- ``AuditServices``
- ``DocNormalizationService``
- ``ExtractionService``
- ``EvaluationService``
- ``RescueService``
- ``AuditService``

The platform should not construct these objects in controllers/services
directly. Keep all leaudit-native wiring inside ``leaudit_bridge/``.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

from leaudit.services.audit_service import AuditService
from leaudit.services.audit_services import AuditServices
from leaudit.services.doc_normalization_service import DocNormalizationService
from leaudit.services.evaluation_service import EvaluationService
from leaudit.services.extraction_service import ExtractionService
from leaudit.services.rescue_service import RescueService

from fastapi_modules.fastapi_leaudit.leaudit_bridge.client_factory import (
    create_llm_client,
    create_ocr_client,
    create_vlm_client,
)
from fastapi_modules.fastapi_leaudit.leaudit_bridge.ocr_bridge import BridgeOCRClient


@dataclass(frozen=True)
class NativeServiceBundle:
    """Fully assembled native leaudit service bundle."""

    audit_service: AuditService
    audit_services: AuditServices
    normalization_service: DocNormalizationService
    extraction_service: ExtractionService
    evaluation_service: EvaluationService
    rescue_service: RescueService | None


class AuditServiceFactory:
    """Build native leaudit services for one platform-side run."""

    def create_bundle(self, rules_path: str | None = None) -> NativeServiceBundle:
        """Create a fully wired native leaudit service bundle.

        ``rules_path`` is only used to force the normalization adapter's
        classification path when the caller wants a fixed rules file.
        """
        normalization_service, audit_services = self._create_normalization_services(
            rules_path=rules_path
        )
        extraction_service = ExtractionService(
            session=None,
            llm_client=audit_services.llm_client,
        )
        evaluation_service = EvaluationService(session=None)
        rescue_service = RescueService(
            session=None,
            llm_client=audit_services.llm_client,
            vlm_client=audit_services.vlm_client,
            extraction_service=extraction_service,
        )

        audit_service = AuditService(
            document_service=None,
            normalization_service=normalization_service,
            extraction_service=extraction_service,
            evaluation_service=evaluation_service,
            rescue_service=rescue_service,
            services=AuditServices(
                llm_client=audit_services.llm_client,
                vlm_client=audit_services.vlm_client,
                ocr_client=audit_services.ocr_client,
                normalization=normalization_service,
                extraction=extraction_service,
                evaluation=evaluation_service,
            ),
        )

        return NativeServiceBundle(
            audit_service=audit_service,
            audit_services=audit_service.services,
            normalization_service=normalization_service,
            extraction_service=extraction_service,
            evaluation_service=evaluation_service,
            rescue_service=rescue_service,
        )

    def _create_normalization_services(
        self, rules_path: str | None = None
    ) -> tuple[DocNormalizationService, AuditServices]:
        """Create normalization service plus low-level shared clients."""
        from leaudit.doc_normalization.adapter import DocNormalizationAdapter
        from leaudit.doc_normalization.doc_classifier import RulesFileRegistry

        raw_ocr = create_ocr_client()
        llm_client = create_llm_client()
        vlm_client = create_vlm_client()

        registry = None
        if rules_path is None:
            rules_dir = Path(__file__).resolve().parents[3] / "rules"
            if rules_dir.is_dir():
                registry = RulesFileRegistry.from_directory(rules_dir)

        adapter = DocNormalizationAdapter(
            ocr_client=raw_ocr,
            registry=registry,
            llm_client=llm_client,
            vlm_client=vlm_client,
            force_rules_path=rules_path,
        )
        ocr_client = BridgeOCRClient(adapter, vlm_client=vlm_client)
        normalization_service = DocNormalizationService(ocr_client)
        audit_services = AuditServices(
            llm_client=llm_client,
            vlm_client=vlm_client,
            ocr_client=raw_ocr,
            normalization=normalization_service,
        )
        return normalization_service, audit_services


__all__ = ["AuditServiceFactory", "NativeServiceBundle"]
@@ -5,13 +5,15 @@ from __future__ import annotations
|
|||||||
import logging
|
import logging
|
||||||
from typing import TYPE_CHECKING
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
from core.config import (
|
from fastapi_admin.config import (
|
||||||
OCR_CONFIG,
|
OCR_BASE_URL,
|
||||||
DEFAULT_BASE_URL,
|
OCR_TIMEOUT,
|
||||||
DEFAULT_LLM_MODEL,
|
LLM_BASE_URL,
|
||||||
DEFAULT_API_KEY,
|
LLM_MODEL,
|
||||||
DEFAULT_VLM_BASE_URL,
|
LLM_API_KEY,
|
||||||
DEFAULT_VLM_MODEL,
|
VLM_BASE_URL,
|
||||||
|
VLM_MODEL,
|
||||||
|
VLM_API_KEY,
|
||||||
)
|
)
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
@@ -29,8 +31,8 @@ def create_ocr_client() -> BaseOCRClient:
|
|||||||
|
|
||||||
base_url = os.getenv("LEAUDIT_OCR_URL", "").rstrip("/")
|
base_url = os.getenv("LEAUDIT_OCR_URL", "").rstrip("/")
|
||||||
if not base_url:
|
if not base_url:
|
||||||
base_url = OCR_CONFIG["API_URL"].rsplit("/api/v1/ocr", 1)[0]
|
base_url = OCR_BASE_URL.rstrip("/")
|
||||||
timeout = float(OCR_CONFIG["TIMEOUT"])
|
timeout = float(OCR_TIMEOUT)
|
||||||
|
|
||||||
client = ChandraOCRClient(
|
client = ChandraOCRClient(
|
||||||
base_url=base_url,
|
base_url=base_url,
|
||||||
@@ -45,9 +47,9 @@ def create_llm_client() -> BaseLLMClient:
|
|||||||
"""Create a leaudit OpenAICompatibleClient from docauditai's LLM config."""
|
"""Create a leaudit OpenAICompatibleClient from docauditai's LLM config."""
|
||||||
from leaudit.llm.openai_client import OpenAICompatibleClient
|
from leaudit.llm.openai_client import OpenAICompatibleClient
|
||||||
|
|
||||||
base_url = DEFAULT_BASE_URL
|
base_url = LLM_BASE_URL
|
||||||
model = DEFAULT_LLM_MODEL
|
model = LLM_MODEL
|
||||||
api_key = DEFAULT_API_KEY or "no-key"
|
api_key = LLM_API_KEY or "no-key"
|
||||||
|
|
||||||
client = OpenAICompatibleClient(
|
client = OpenAICompatibleClient(
|
||||||
api_key=api_key,
|
api_key=api_key,
|
||||||
@@ -63,9 +65,9 @@ def create_vlm_client() -> BaseVLMClient | None:
|
|||||||
"""Create a leaudit QwenVLMClient from docauditai's VLM config."""
|
"""Create a leaudit QwenVLMClient from docauditai's VLM config."""
|
||||||
from leaudit.llm.qwen_vlm_client import QwenVLMClient
|
from leaudit.llm.qwen_vlm_client import QwenVLMClient
|
||||||
|
|
||||||
base_url = DEFAULT_VLM_BASE_URL
|
base_url = VLM_BASE_URL
|
||||||
model = DEFAULT_VLM_MODEL
|
model = VLM_MODEL
|
||||||
api_key = DEFAULT_API_KEY or "no-key"
|
api_key = VLM_API_KEY or LLM_API_KEY or "no-key"
|
||||||
|
|
||||||
if not base_url or not model:
|
if not base_url or not model:
|
||||||
log.info("leaudit VLM client skipped: no VLM config")
|
log.info("leaudit VLM client skipped: no VLM config")
|
||||||
|
|||||||
@@ -0,0 +1,63 @@
|
|||||||
|
"""Document file source resolver."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from fastapi_common.fastapi_common_logger import logger
|
||||||
|
from fastapi_common.fastapi_common_storage.oss_client import OssClient
|
||||||
|
|
||||||
|
from fastapi_modules.fastapi_leaudit.models.leauditDocumentFile import LeauditDocumentFile
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class FileSourcePayload:
|
||||||
|
"""File payload that the execution chain can consume."""
|
||||||
|
|
||||||
|
fileName: str
|
||||||
|
fileContent: bytes
|
||||||
|
sourceType: str
|
||||||
|
sourcePath: str | None = None
|
||||||
|
|
||||||
|
|
||||||
|
class FileSourceResolver:
|
||||||
|
"""Resolve where a document file comes from."""
|
||||||
|
|
||||||
|
def __init__(self, Oss: OssClient | None = None) -> None:
|
||||||
|
self.Oss = Oss or OssClient()
|
||||||
|
|
||||||
|
async def ResolvePayload(self, DocumentFile: LeauditDocumentFile) -> FileSourcePayload:
|
||||||
|
"""Resolve a document file into a byte payload the task entry point can consume directly."""
|
||||||
|
if DocumentFile.localPath:
|
||||||
|
LocalPath = Path(DocumentFile.localPath)
|
||||||
|
if LocalPath.is_file():
|
||||||
|
return FileSourcePayload(
|
||||||
|
fileName=DocumentFile.fileName,
|
||||||
|
fileContent=LocalPath.read_bytes(),
|
||||||
|
sourceType="local",
|
||||||
|
sourcePath=str(LocalPath),
|
||||||
|
)
|
||||||
|
|
||||||
|
if DocumentFile.ossUrl:
|
||||||
|
return await self._DownloadFromUrl(
|
||||||
|
FileName=DocumentFile.fileName,
|
||||||
|
Url=DocumentFile.ossUrl,
|
||||||
|
)
|
||||||
|
|
||||||
|
raise ValueError("当前文档文件既无可用 localPath,也无可用 ossUrl")
|
||||||
|
|
||||||
|
async def _DownloadFromUrl(self, FileName: str, Url: str) -> FileSourcePayload:
|
||||||
|
"""Download file content from OSS or a URL."""
|
||||||
|
try:
|
||||||
|
Content = self.Oss.DownloadBytes(Url)
|
||||||
|
except Exception as Error:
|
||||||
|
logger.error(f"下载 OSS 文件失败: url={Url}, error={Error}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
return FileSourcePayload(
|
||||||
|
fileName=FileName,
|
||||||
|
fileContent=Content,
|
||||||
|
sourceType="oss",
|
||||||
|
sourcePath=Url,
|
||||||
|
)
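
The local-first, remote-fallback order used by `FileSourceResolver.ResolvePayload` can be sketched in isolation. This is a minimal illustration, not the commit's code: `resolve_bytes` and the `download` callable are hypothetical names standing in for the resolver and for `OssClient.DownloadBytes`.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def resolve_bytes(
    local_path: Optional[str],
    remote_url: Optional[str],
    download: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Return (content, source_type), preferring a readable local file.

    `download` is a hypothetical stand-in for an OSS client call.
    """
    if local_path:
        candidate = Path(local_path)
        # A stale or missing local path falls through to the remote source.
        if candidate.is_file():
            return candidate.read_bytes(), "local"
    if remote_url:
        return download(remote_url), "oss"
    raise ValueError("no usable local path or remote URL")
```

A missing local file is treated as a cache miss rather than an error, so a record that still carries an old `localPath` can be served from OSS.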
|
||||||
@@ -0,0 +1,185 @@
|
|||||||
|
"""Native AuditCtx runner for the platform bridge.
|
||||||
|
|
||||||
|
This module is the target execution path after deprecating the old
|
||||||
|
platform-side hand-written pipeline orchestration.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from leaudit.services.audit_ctx import AuditCtx
|
||||||
|
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.auditCtxBuilder import (
|
||||||
|
AuditCtxBuilder,
|
||||||
|
NativeAuditBuildInput,
|
||||||
|
NativeAuditMetadata,
|
||||||
|
)
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.auditServiceFactory import (
|
||||||
|
AuditServiceFactory,
|
||||||
|
NativeServiceBundle,
|
||||||
|
)
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.storage_adapter import StorageAdapter
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class NativeRunRequest:
|
||||||
|
"""Platform-side request payload for one native leaudit run."""
|
||||||
|
|
||||||
|
metadata: NativeAuditMetadata
|
||||||
|
local_file_path: str
|
||||||
|
rules_file: Any | None = None
|
||||||
|
rule_source_path: str | None = None
|
||||||
|
rules_path_override: str | None = None
|
||||||
|
page_range: tuple[int, ...] | None = None
|
||||||
|
config_overrides: dict[str, Any] = field(default_factory=dict)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class NativeRunResult:
|
||||||
|
"""Result of one native ``AuditService.audit(ctx)`` execution."""
|
||||||
|
|
||||||
|
ctx: AuditCtx
|
||||||
|
service_bundle: NativeServiceBundle
|
||||||
|
metadata: NativeAuditMetadata
|
||||||
|
|
||||||
|
|
||||||
|
class NativeRunner:
|
||||||
|
"""Bridge-side runner that delegates orchestration to native leaudit."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
service_factory: AuditServiceFactory | None = None,
|
||||||
|
ctx_builder: AuditCtxBuilder | None = None,
|
||||||
|
storage_adapter: StorageAdapter | None = None,
|
||||||
|
) -> None:
|
||||||
|
self.service_factory = service_factory or AuditServiceFactory()
|
||||||
|
self.ctx_builder = ctx_builder or AuditCtxBuilder()
|
||||||
|
self.storage = storage_adapter or StorageAdapter()
|
||||||
|
|
||||||
|
async def run(self, request: NativeRunRequest) -> NativeRunResult:
|
||||||
|
"""Execute one native leaudit run and return the populated ctx.
|
||||||
|
|
||||||
|
Persistence is intentionally not mixed into the orchestration step.
|
||||||
|
The caller can choose when to persist the final ctx to platform tables.
|
||||||
|
"""
|
||||||
|
bundle = self.service_factory.create_bundle(
|
||||||
|
rules_path=request.rules_path_override or request.rule_source_path
|
||||||
|
)
|
||||||
|
ctx = self.ctx_builder.build(
|
||||||
|
NativeAuditBuildInput(
|
||||||
|
metadata=request.metadata,
|
||||||
|
file_path=request.local_file_path,
|
||||||
|
services=bundle.audit_services,
|
||||||
|
rules_file=request.rules_file,
|
||||||
|
page_range=request.page_range,
|
||||||
|
rule_source_path=request.rule_source_path,
|
||||||
|
force_rules_path=request.rules_path_override,
|
||||||
|
config_overrides=request.config_overrides,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
ctx = await bundle.audit_service.audit(ctx)
|
||||||
|
return NativeRunResult(
|
||||||
|
ctx=ctx,
|
||||||
|
service_bundle=bundle,
|
||||||
|
metadata=request.metadata,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def persist_result(self, result: NativeRunResult) -> None:
|
||||||
|
"""Persist a native run into platform-owned ``leaudit_*`` tables.
|
||||||
|
"""
|
||||||
|
document_id = result.metadata.document_id
|
||||||
|
run_id = result.metadata.run_id
|
||||||
|
ctx = result.ctx
|
||||||
|
extraction_errors = list(ctx.extraction_errors)
|
||||||
|
if not extraction_errors and ctx.extraction is not None:
|
||||||
|
extraction_errors = list(ctx.extraction.all_errors)
|
||||||
|
|
||||||
|
if ctx.normalized_doc is not None:
|
||||||
|
await self.storage.save_ocr_result(
|
||||||
|
document_id,
|
||||||
|
ctx.normalized_doc,
|
||||||
|
run_id=run_id,
|
||||||
|
)
|
||||||
|
if ctx.extraction is not None:
|
||||||
|
await self.storage.save_extraction_result(
|
||||||
|
document_id,
|
||||||
|
ctx.extraction,
|
||||||
|
run_id=run_id,
|
||||||
|
)
|
||||||
|
if ctx.evaluation is not None and ctx.rules_file is not None and ctx.extraction is not None:
|
||||||
|
await self.storage.save_evaluation_results(
|
||||||
|
document_id,
|
||||||
|
ctx.rules_file,
|
||||||
|
ctx.evaluation,
|
||||||
|
ctx.extraction,
|
||||||
|
run_id=run_id,
|
||||||
|
rule_version_id=result.metadata.rule_version_id,
|
||||||
|
)
|
||||||
|
if extraction_errors:
|
||||||
|
await self.storage.save_run_errors(
|
||||||
|
document_id,
|
||||||
|
run_id=run_id,
|
||||||
|
stage=ctx.phase or "extract",
|
||||||
|
messages=extraction_errors,
|
||||||
|
level="warning",
|
||||||
|
error_code="EXTRACTION_WARNING",
|
||||||
|
)
|
||||||
|
if ctx.fallback_tasks:
|
||||||
|
await self.storage.save_rescue_outcomes(
|
||||||
|
document_id,
|
||||||
|
run_id=run_id,
|
||||||
|
tasks=ctx.fallback_tasks,
|
||||||
|
)
|
||||||
|
|
||||||
|
await self.storage.save_run_metrics(
|
||||||
|
document_id,
|
||||||
|
run_id=run_id,
|
||||||
|
timing=dict(ctx.timing),
|
||||||
|
page_count=len(ctx.normalized_doc.pages) if ctx.normalized_doc is not None else None,
|
||||||
|
sub_document_count=len(ctx.extraction.sub_documents) if ctx.extraction is not None and getattr(ctx.extraction, "sub_documents", None) else 0,
|
||||||
|
field_count=len(ctx.extraction.fields) if ctx.extraction is not None else 0,
|
||||||
|
rule_count=len(ctx.evaluation.rules) if ctx.evaluation is not None else (len(ctx.rules_file.flat_rules) if ctx.rules_file is not None else 0),
|
||||||
|
rescue_rule_count=len(ctx.fallback_tasks),
|
||||||
|
artifact_count=self._estimate_artifact_count(ctx),
|
||||||
|
)
|
||||||
|
|
||||||
|
result_status = "review" if any(task.requires_human_review for task in ctx.fallback_tasks) else self._resolve_result_status(ctx)
|
||||||
|
await self.storage.finalize_run(
|
||||||
|
document_id,
|
||||||
|
run_id=run_id,
|
||||||
|
result_status=result_status,
|
||||||
|
rescue_applied=bool(ctx.fallback_tasks),
|
||||||
|
phase=ctx.phase,
|
||||||
|
finished=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
def _estimate_artifact_count(self, ctx: AuditCtx) -> int:
|
||||||
|
"""Roughly estimate how many platform artifacts this run has produced."""
|
||||||
|
count = 0
|
||||||
|
if ctx.normalized_doc is not None:
|
||||||
|
count += 1
|
||||||
|
if ctx.extraction is not None:
|
||||||
|
count += 1
|
||||||
|
if ctx.evaluation is not None:
|
||||||
|
count += 1
|
||||||
|
if ctx.fallback_tasks:
|
||||||
|
count += len(ctx.fallback_tasks)
|
||||||
|
return count
|
||||||
|
|
||||||
|
def _resolve_result_status(self, ctx: AuditCtx) -> str:
|
||||||
|
"""Derive the run status from the native AuditCtx result."""
|
||||||
|
if ctx.evaluation is None:
|
||||||
|
return "error"
|
||||||
|
if ctx.evaluation.errors:
|
||||||
|
return "error"
|
||||||
|
if ctx.evaluation.failed_count == 0 and ctx.evaluation.skipped_count == 0:
|
||||||
|
return "pass"
|
||||||
|
if ctx.evaluation.failed_count > 0:
|
||||||
|
return "fail"
|
||||||
|
return "partial"
|
||||||
|
|
||||||
|
|
||||||
|
__all__ = ["NativeRunRequest", "NativeRunResult", "NativeRunner"]
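
The status precedence applied in `persist_result` and `_resolve_result_status` (review, then error, then fail, pass, partial) can be checked with a small standalone sketch. `EvalSummary` and `resolve_status` are hypothetical stand-ins; the real evaluation object lives in leaudit and is only assumed here to expose `failed_count`, `skipped_count` and `errors`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EvalSummary:
    """Hypothetical stand-in for the evaluation object on AuditCtx."""

    failed_count: int = 0
    skipped_count: int = 0
    errors: List[str] = field(default_factory=list)


def resolve_status(evaluation: Optional[EvalSummary], needs_review: bool = False) -> str:
    """Mirror the precedence used by the runner, in one place."""
    if needs_review:
        # Any rescue task flagged for human review wins over everything else.
        return "review"
    if evaluation is None or evaluation.errors:
        return "error"
    if evaluation.failed_count > 0:
        return "fail"
    if evaluation.skipped_count == 0:
        return "pass"
    # No failures, but some rules were skipped.
    return "partial"
```

Under these assumptions, `resolve_status(EvalSummary(skipped_count=1))` yields `"partial"`, the one case the original reaches only by falling through every other branch.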
|
||||||
@@ -23,7 +23,7 @@ from leaudit.llm.base import BaseLLMClient
|
|||||||
from leaudit.ocr.base import BaseOCRClient
|
from leaudit.ocr.base import BaseOCRClient
|
||||||
from leaudit.ocr.models import OcrResult
|
from leaudit.ocr.models import OcrResult
|
||||||
|
|
||||||
from leaudit_bridge.storage_adapter import StorageAdapter
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.storage_adapter import StorageAdapter
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -229,7 +229,7 @@ class LauditPipeline:
|
|||||||
self, document_id: int, ocr_result: OcrResult,
|
self, document_id: int, ocr_result: OcrResult,
|
||||||
) -> None:
|
) -> None:
|
||||||
"""Extract case number from OCR and write to database."""
|
"""Extract case number from OCR and write to database."""
|
||||||
from leaudit_bridge.case_number_extractor import (
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.case_number_extractor import (
|
||||||
extract_case_number_with_llm,
|
extract_case_number_with_llm,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,86 @@
|
|||||||
|
"""Rule YAML validator."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import importlib
|
||||||
|
import sys
|
||||||
|
from dataclasses import dataclass
|
||||||
|
|
||||||
|
import yaml
|
||||||
|
from pydantic import ValidationError
|
||||||
|
|
||||||
|
from leaudit.dsl.loader import parse_rules_yaml_text
|
||||||
|
from leaudit.dsl.validator import DSLValidationError, validate as validate_rules
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class RuleValidationPayload:
|
||||||
|
"""Rule validation result."""
|
||||||
|
|
||||||
|
valid: bool
|
||||||
|
ruleType: str | None = None
|
||||||
|
ruleName: str | None = None
|
||||||
|
versionNo: str | None = None
|
||||||
|
ruleCount: int = 0
|
||||||
|
extractCount: int = 0
|
||||||
|
errors: list[str] | None = None
|
||||||
|
|
||||||
|
|
||||||
|
class RuleValidator:
|
||||||
|
"""Validates rule YAML syntax and DSL semantics."""
|
||||||
|
|
||||||
|
_CHECK_MODULES = (
|
||||||
|
"leaudit.engine.checks.required",
|
||||||
|
"leaudit.engine.checks.compare",
|
||||||
|
"leaudit.engine.checks.format_check",
|
||||||
|
"leaudit.engine.checks.text",
|
||||||
|
"leaudit.engine.checks.multi_entity",
|
||||||
|
"leaudit.engine.checks.visual",
|
||||||
|
"leaudit.engine.checks.external",
|
||||||
|
"leaudit.engine.checks.assert_check",
|
||||||
|
"leaudit.engine.checks.code_check",
|
||||||
|
"leaudit.engine.checks.ai_check",
|
||||||
|
)
|
||||||
|
|
||||||
|
def ValidateYaml(self, YamlText: str) -> RuleValidationPayload:
|
||||||
|
"""Validate the YAML and return a summary result."""
|
||||||
|
try:
|
||||||
|
RulesFile = parse_rules_yaml_text(YamlText)
|
||||||
|
self._EnsureChecksImported()
|
||||||
|
validate_rules(RulesFile, registered_primitives=None)
|
||||||
|
except yaml.YAMLError as Error:
|
||||||
|
return RuleValidationPayload(valid=False, errors=[f"YAML 语法错误: {Error}"])
|
||||||
|
except ValidationError as Error:
|
||||||
|
return RuleValidationPayload(valid=False, errors=[f"Schema 校验失败: {Error}"])
|
||||||
|
except DSLValidationError as Error:
|
||||||
|
return RuleValidationPayload(valid=False, errors=[f"DSL 校验失败: {Error}"])
|
||||||
|
except Exception as Error:
|
||||||
|
return RuleValidationPayload(valid=False, errors=[f"规则校验失败: {Error}"])
|
||||||
|
|
||||||
|
return RuleValidationPayload(
|
||||||
|
valid=True,
|
||||||
|
ruleType=RulesFile.metadata.type_id,
|
||||||
|
ruleName=RulesFile.metadata.name,
|
||||||
|
versionNo=RulesFile.metadata.version,
|
||||||
|
ruleCount=len(RulesFile.flat_rules),
|
||||||
|
extractCount=len(RulesFile.flat_extract),
|
||||||
|
errors=[],
|
||||||
|
)
|
||||||
|
|
||||||
|
def ParseValidated(self, YamlText: str):
|
||||||
|
"""Parse and return a RulesFile that has passed full validation."""
|
||||||
|
Validation = self.ValidateYaml(YamlText)
|
||||||
|
if not Validation.valid:
|
||||||
|
raise ValueError("; ".join(Validation.errors or ["规则校验失败"]))
|
||||||
|
return parse_rules_yaml_text(YamlText)
|
||||||
|
|
||||||
|
def _EnsureChecksImported(self) -> None:
|
||||||
|
"""Ensure all check modules are registered."""
|
||||||
|
for ModuleName in self._CHECK_MODULES:
|
||||||
|
try:
|
||||||
|
if ModuleName in sys.modules:
|
||||||
|
importlib.reload(sys.modules[ModuleName])
|
||||||
|
else:
|
||||||
|
importlib.import_module(ModuleName)
|
||||||
|
except Exception:
|
||||||
|
continue
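
`_EnsureChecksImported` swallows every import failure silently. One possible variation, shown only as a sketch and not as the commit's behavior, imports each module once and returns the names that failed so the caller can log them. `ensure_imported` is a hypothetical helper; unlike the original, it does not reload modules that are already loaded.

```python
import importlib
from typing import Iterable, List


def ensure_imported(module_names: Iterable[str]) -> List[str]:
    """Import each module by name and return the names that could not be imported."""
    failed: List[str] = []
    for name in module_names:
        try:
            # import_module is a no-op for modules already in sys.modules.
            importlib.import_module(name)
        except ImportError:
            failed.append(name)
    return failed
```

Returning the failed names keeps the best-effort semantics while making a missing check module visible in validation logs.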
|
||||||
@@ -0,0 +1,125 @@
|
|||||||
|
"""Rule version source resolver."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import hashlib
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from fastapi_common.fastapi_common_logger import logger
|
||||||
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from fastapi_common.fastapi_common_storage.oss_client import OssClient
|
||||||
|
from sqlalchemy import text
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class RuleVersionPayload:
|
||||||
|
"""Rule file resolution result."""
|
||||||
|
|
||||||
|
localPath: str
|
||||||
|
sourceType: str
|
||||||
|
sourcePath: str
|
||||||
|
ruleVersionId: int | None = None
|
||||||
|
ruleTypeId: str | None = None
|
||||||
|
fileSha256: str | None = None
|
||||||
|
|
||||||
|
|
||||||
|
class RuleVersionResolver:
|
||||||
|
"""Resolve the rule YAML file source from a run record."""
|
||||||
|
|
||||||
|
def __init__(self, Oss: OssClient | None = None) -> None:
|
||||||
|
self.Oss = Oss or OssClient()
|
||||||
|
|
||||||
|
async def ResolveForRun(self, RunId: int) -> RuleVersionPayload | None:
|
||||||
|
"""Resolve the rule file source from the run record."""
|
||||||
|
RunInfo = await self._LoadRunInfo(RunId)
|
||||||
|
if not RunInfo:
|
||||||
|
return None
|
||||||
|
|
||||||
|
LocalCachePath = RunInfo["rule_local_cache_path"]
|
||||||
|
if LocalCachePath:
|
||||||
|
CachePath = Path(LocalCachePath)
|
||||||
|
if CachePath.is_file():
|
||||||
|
return RuleVersionPayload(
|
||||||
|
localPath=str(CachePath),
|
||||||
|
sourceType="local_cache",
|
||||||
|
sourcePath=str(CachePath),
|
||||||
|
ruleVersionId=RunInfo["rule_version_id"],
|
||||||
|
ruleTypeId=RunInfo["rule_type_id"],
|
||||||
|
fileSha256=RunInfo["rule_source_sha256"],
|
||||||
|
)
|
||||||
|
|
||||||
|
SourceUrl = RunInfo["rule_source_oss_url"]
|
||||||
|
if not SourceUrl:
|
||||||
|
return None
|
||||||
|
|
||||||
|
return await self._DownloadFromUrl(
|
||||||
|
Url=SourceUrl,
|
||||||
|
RuleVersionId=RunInfo["rule_version_id"],
|
||||||
|
RuleTypeId=RunInfo["rule_type_id"],
|
||||||
|
ExpectedSha256=RunInfo["rule_source_sha256"],
|
||||||
|
)
|
||||||
|
|
||||||
|
async def _LoadRunInfo(self, RunId: int) -> dict[str, object] | None:
|
||||||
|
"""Read the rule source information stored on the run record."""
|
||||||
|
async with GetAsyncSession() as Session:
|
||||||
|
Result = await Session.execute(
|
||||||
|
text(
|
||||||
|
"""
|
||||||
|
SELECT
|
||||||
|
rule_version_id,
|
||||||
|
rule_type_id,
|
||||||
|
rule_source_oss_url,
|
||||||
|
rule_source_sha256,
|
||||||
|
rule_local_cache_path
|
||||||
|
FROM leaudit_audit_runs
|
||||||
|
WHERE id = :run_id
|
||||||
|
LIMIT 1
|
||||||
|
"""
|
||||||
|
),
|
||||||
|
{"run_id": RunId},
|
||||||
|
)
|
||||||
|
Row = Result.mappings().first()
|
||||||
|
return dict(Row) if Row else None
|
||||||
|
|
||||||
|
async def _DownloadFromUrl(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
Url: str,
|
||||||
|
RuleVersionId: int | None,
|
||||||
|
RuleTypeId: str | None,
|
||||||
|
ExpectedSha256: str | None,
|
||||||
|
) -> RuleVersionPayload:
|
||||||
|
"""Download the rule YAML from OSS to a local temporary file."""
|
||||||
|
try:
|
||||||
|
Content = self.Oss.DownloadBytes(Url)
|
||||||
|
except Exception as Error:
|
||||||
|
logger.error(f"下载规则 YAML 失败: url={Url}, error={Error}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
ActualSha256 = hashlib.sha256(Content).hexdigest()
|
||||||
|
if ExpectedSha256 and ActualSha256.lower() != ExpectedSha256.lower():
|
||||||
|
raise ValueError(
|
||||||
|
"规则 YAML SHA256 校验失败: "
|
||||||
|
f"expected={ExpectedSha256}, actual={ActualSha256}"
|
||||||
|
)
|
||||||
|
|
||||||
|
FilePrefix = "leaudit-rule-"
|
||||||
|
if RuleTypeId:
|
||||||
|
SafeTypeId = RuleTypeId.replace("/", "_").replace(".", "_")
|
||||||
|
FilePrefix = f"{FilePrefix}{SafeTypeId}-"
|
||||||
|
|
||||||
|
LocalPath = self.Oss.WriteTempBytes(
|
||||||
|
Content=Content,
|
||||||
|
Suffix=".yaml",
|
||||||
|
Prefix=FilePrefix,
|
||||||
|
)
|
||||||
|
|
||||||
|
return RuleVersionPayload(
|
||||||
|
localPath=LocalPath,
|
||||||
|
sourceType="oss",
|
||||||
|
sourcePath=Url,
|
||||||
|
ruleVersionId=RuleVersionId,
|
||||||
|
ruleTypeId=RuleTypeId,
|
||||||
|
fileSha256=ActualSha256,
|
||||||
|
)
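
The integrity check in `_DownloadFromUrl` (hash the bytes, compare case-insensitively, then write a temp file) can be reproduced with the standard library alone. This is a sketch under assumptions: `verify_and_cache` is a hypothetical name, and `tempfile.mkstemp` stands in for `OssClient.WriteTempBytes`, whose implementation is not shown in this diff.

```python
import hashlib
import os
import tempfile
from typing import Optional, Tuple


def verify_and_cache(
    content: bytes,
    expected_sha256: Optional[str] = None,
    prefix: str = "leaudit-rule-",
) -> Tuple[str, str]:
    """Verify the digest, then write the bytes to a temp file.

    Returns (local_path, actual_sha256). The caller owns the temp file.
    """
    actual = hashlib.sha256(content).hexdigest()
    # Compare case-insensitively: stored digests may be upper-case hex.
    if expected_sha256 and actual.lower() != expected_sha256.lower():
        raise ValueError(
            f"SHA256 mismatch: expected={expected_sha256}, actual={actual}"
        )
    fd, path = tempfile.mkstemp(suffix=".yaml", prefix=prefix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(content)
    return path, actual
```

Verifying before writing means a corrupted download never leaves a temp file behind, which matches the order used in the resolver.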
|
||||||
@@ -1,28 +1,38 @@
|
|||||||
"""Celery task for leaudit pipeline processing.
|
"""LeAudit task entry point."""
|
||||||
|
|
||||||
Activated when PIPELINE_MODE=leaudit in env.{port} config.
|
|
||||||
Replaces the legacy OCR → extraction → evaluation pipeline with
|
|
||||||
leaudit's YAML-rules-driven approach.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
import os
|
import os
|
||||||
|
from pathlib import Path
|
||||||
import tempfile
|
import tempfile
|
||||||
import time
|
import time
|
||||||
from typing import Any, Dict, Optional
|
from typing import Any, Dict, Optional
|
||||||
|
|
||||||
from core.celery_app_limited import celery_app
|
from fastapi_common.fastapi_common_logger import logger
|
||||||
from core.postgrest.client import get_postgrest_client
|
|
||||||
from core.logger import log
|
|
||||||
|
|
||||||
from leaudit_bridge import create_pipeline, RulesLoader
|
from fastapi_admin.config import LEAUDIT_RULES_DIR
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.nativeRunner import (
|
||||||
|
NativeRunRequest,
|
||||||
|
NativeRunner,
|
||||||
|
)
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.auditCtxBuilder import (
|
||||||
|
NativeAuditMetadata,
|
||||||
|
)
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.ruleVersionResolver import (
|
||||||
|
RuleVersionResolver,
|
||||||
|
)
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.rules_loader import RulesLoader
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.storage_adapter import StorageAdapter
|
||||||
|
|
||||||
|
# Celery integration is planned for phase P2; a synchronous placeholder is used for now
|
||||||
|
# from core.celery_app_limited import celery_app
|
||||||
|
|
||||||
|
log = logger
|
||||||
|
|
||||||
|
|
||||||
@celery_app.task(bind=True, name="leaudit.process_document")
|
# P2: enable the @celery_app.task decorator once Celery is integrated
|
||||||
def leaudit_process_document(
|
def leaudit_process_document(
|
||||||
self,
|
|
||||||
document_id: int,
|
document_id: int,
|
||||||
file_content: bytes,
|
file_content: bytes,
|
||||||
filename: str,
|
filename: str,
|
||||||
@@ -30,43 +40,41 @@ def leaudit_process_document(
|
|||||||
source_port: Optional[int] = None,
|
source_port: Optional[int] = None,
|
||||||
rules_path: Optional[str] = None,
|
rules_path: Optional[str] = None,
|
||||||
):
|
):
|
||||||
"""Process a document using leaudit's full pipeline.
|
"""LeAudit task that processes a single document."""
|
||||||
|
task_id = os.urandom(8).hex()
|
||||||
Steps: OCR → Extraction → Evaluation → Store in docauditai DB.
|
log.info(f"[任务ID: {task_id}] leaudit管线开始处理: {filename}")
|
||||||
"""
|
|
||||||
task_id = self.request.id
|
|
||||||
log.task.info(f"[任务ID: {task_id}] leaudit管线开始处理: {filename}")
|
|
||||||
|
|
||||||
|
# New platform: region is passed as a parameter and no longer switched via os.environ
|
||||||
if source_port:
|
if source_port:
|
||||||
from core.utils.instance_context import set_instance_environment
|
log.info(f"[任务ID: {task_id}] 来源端口: {source_port}")
|
||||||
instance_name = set_instance_environment(source_port)
|
|
||||||
log.task.info(
|
|
||||||
f"[任务ID: {task_id}] 实例环境: {instance_name} (端口: {source_port})"
|
|
||||||
)
|
|
||||||
|
|
||||||
if upload_info is None:
|
if upload_info is None:
|
||||||
upload_info = {}
|
upload_info = {}
|
||||||
|
|
||||||
loop = asyncio.new_event_loop()
|
loop = asyncio.new_event_loop()
|
||||||
asyncio.set_event_loop(loop)
|
asyncio.set_event_loop(loop)
|
||||||
|
temp_paths: list[str] = []
|
||||||
|
storage = StorageAdapter()
|
||||||
|
|
||||||
try:
|
try:
|
||||||
rules_path_resolved = rules_path or _resolve_rules_path(document_id, loop)
|
run_id = _resolve_run_id(document_id, upload_info, loop)
|
||||||
|
rules_resolution = _resolve_rules_runtime(document_id, run_id, rules_path, loop)
|
||||||
|
loop.run_until_complete(_update_run_status_safe(run_id, "running"))
|
||||||
|
rules_path_resolved = rules_resolution["rules_path"]
|
||||||
|
|
||||||
# For types with a known mapping (e.g. 行政处罚), pre-load rules_file.
|
|
||||||
# For types that need content classification (e.g. 行政许可 sub-types),
|
|
||||||
# rules_path will be None → adapter classifies after OCR → pipeline
|
|
||||||
# loads rules from ocr_result.rules_file_path.
|
|
||||||
rules_file = None
|
rules_file = None
|
||||||
if rules_path_resolved:
|
if rules_path_resolved:
|
||||||
loader = RulesLoader()
|
loader = RulesLoader()
|
||||||
rules_file = loader.load(rules_path_resolved)
|
rules_file = loader.load(rules_path_resolved)
|
||||||
log.task.info(
|
temp_rule_path = rules_resolution.get("temp_rule_path")
|
||||||
|
if isinstance(temp_rule_path, str):
|
||||||
|
temp_paths.append(temp_rule_path)
|
||||||
|
log.info(
|
||||||
f"[任务ID: {task_id}] RulesFile pre-loaded: {rules_path_resolved} "
|
f"[任务ID: {task_id}] RulesFile pre-loaded: {rules_path_resolved} "
|
||||||
f"({len(rules_file.flat_rules)} rules, {len(rules_file.flat_extract)} fields)"
|
f"({len(rules_file.flat_rules)} rules, {len(rules_file.flat_extract)} fields)"
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
log.task.info(
|
log.info(
|
||||||
f"[任务ID: {task_id}] No fixed rules_path — "
|
f"[任务ID: {task_id}] No fixed rules_path — "
|
||||||
"will classify from document content after OCR"
|
"will classify from document content after OCR"
|
||||||
)
|
)
|
||||||
@@ -75,47 +83,79 @@ def leaudit_process_document(
|
|||||||
with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp:
|
with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp:
|
||||||
temp.write(file_content)
|
temp.write(file_content)
|
||||||
temp_path = temp.name
|
temp_path = temp.name
|
||||||
|
temp_paths.append(temp_path)
|
||||||
|
|
||||||
pipeline = create_pipeline(rules_path=rules_path_resolved)
|
runner = NativeRunner()
|
||||||
|
|
||||||
t0 = time.time()
|
t0 = time.time()
|
||||||
result = loop.run_until_complete(
|
native_result = loop.run_until_complete(
|
||||||
pipeline.run(
|
runner.run(
|
||||||
document_id=document_id,
|
NativeRunRequest(
|
||||||
file_path=temp_path,
|
metadata=NativeAuditMetadata(
|
||||||
rules_file=rules_file,
|
run_id=run_id,
|
||||||
source_port=source_port or int(os.getenv("APP_PORT", "8000")),
|
document_id=document_id,
|
||||||
|
rule_version_id=_optional_int(upload_info, "rule_version_id", "ruleVersionId"),
|
||||||
|
extras={"taskId": task_id},
|
||||||
|
),
|
||||||
|
local_file_path=temp_path,
|
||||||
|
rules_file=rules_file,
|
||||||
|
rule_source_path=rules_path_resolved,
|
||||||
|
)
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
loop.run_until_complete(runner.persist_result(native_result))
|
||||||
elapsed = round(time.time() - t0, 2)
|
elapsed = round(time.time() - t0, 2)
|
||||||
|
|
||||||
try:
|
ctx = native_result.ctx
|
||||||
os.remove(temp_path)
|
loop.run_until_complete(_update_run_phase_safe(run_id, ctx.phase))
|
||||||
except OSError:
|
loop.run_until_complete(_update_run_status_safe(run_id, "completed"))
|
||||||
pass
|
loop.run_until_complete(_update_status_safe(document_id, "completed"))
|
||||||
|
log.info(
|
||||||
log.task.info(
|
f"[任务ID: {task_id}] leaudit管线完成: phase={ctx.phase}, "
|
||||||
f"[任务ID: {task_id}] leaudit管线完成: phase={result.detected_phase}, "
|
f"timing={dict(ctx.timing)}, 总耗时={elapsed:.1f}s"
|
||||||
f"timing={result.timing}, 总耗时={elapsed:.1f}s"
|
|
||||||
)
|
)
|
||||||
|
|
||||||
return {
|
return {
|
||||||
"status": "success",
|
"status": "success",
|
||||||
"document_id": document_id,
|
"document_id": document_id,
|
||||||
"phase": result.detected_phase,
|
"run_id": run_id,
|
||||||
"timing": result.timing,
|
"phase": ctx.phase,
|
||||||
"errors": result.errors,
|
"timing": dict(ctx.timing),
|
||||||
|
"errors": list(ctx.extraction.all_errors) if ctx.extraction is not None else list(ctx.extraction_errors),
|
||||||
}
|
}
|
||||||
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
log.task.error(f"[任务ID: {task_id}] leaudit管线失败: {e}", exc_info=True)
|
log.error(f"[任务ID: {task_id}] leaudit管线失败: {e}", exc_info=True)
|
||||||
try:
|
try:
|
||||||
loop.run_until_complete(_update_status_safe(document_id, "Failed"))
|
loop.run_until_complete(_update_status_safe(document_id, "failed"))
|
||||||
|
if 'run_id' in locals():
|
||||||
|
failed_phase = "persist"
|
||||||
|
if "native_result" in locals():
|
||||||
|
failed_phase = native_result.ctx.phase or failed_phase
|
||||||
|
loop.run_until_complete(
|
||||||
|
storage.fail_run(
|
||||||
|
document_id,
|
||||||
|
run_id=run_id,
|
||||||
|
phase=failed_phase,
|
||||||
|
message=str(e),
|
||||||
|
detail_json={
|
||||||
|
"taskId": task_id,
|
||||||
|
"filename": filename,
|
||||||
|
"errorType": type(e).__name__,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
raise
|
raise
|
||||||
|
|
||||||
finally:
|
finally:
|
||||||
|
for temp_path in temp_paths:
|
||||||
|
try:
|
||||||
|
if Path(temp_path).exists():
|
||||||
|
os.remove(temp_path)
|
||||||
|
except OSError:
|
||||||
|
pass
|
||||||
loop.close()
|
loop.close()
|
||||||
|
|
||||||
|
|
||||||
@@ -129,49 +169,158 @@ _TYPE_ID_RULES_MAP: dict[int, str] = {
|
|||||||
|
|
||||||
def _resolve_rules_path(document_id: int, loop: asyncio.AbstractEventLoop) -> str | None:
|
def _resolve_rules_path(document_id: int, loop: asyncio.AbstractEventLoop) -> str | None:
|
||||||
"""Resolve rules_path: config override → document metadata → type_id mapping."""
|
"""Resolve rules_path: config override → document metadata → type_id mapping."""
|
||||||
from core.config import LEAUDIT_CONFIG
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from sqlalchemy import text as sa_text
|
||||||
|
|
||||||
# 1. Config override (when explicitly set)
|
# 1. Config override (when explicitly set in app.toml)
|
||||||
config_path = LEAUDIT_CONFIG.get("RULES_PATH", "")
|
if LEAUDIT_RULES_DIR:
|
||||||
if config_path:
|
return LEAUDIT_RULES_DIR
|
||||||
return config_path
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
client = get_postgrest_client()
|
async def _fetch():
|
||||||
doc = loop.run_until_complete(
|
async with GetAsyncSession() as session:
|
||||||
client.select(
|
result = await session.execute(
|
||||||
table="documents",
|
sa_text("SELECT type_id FROM leaudit_documents WHERE id = :did"),
|
||||||
filters={"id": f"eq.{document_id}"},
|
{"did": document_id},
|
||||||
single=True,
|
)
|
||||||
)
|
row = result.fetchone()
|
||||||
)
|
if row and row[0] and row[0] in _TYPE_ID_RULES_MAP:
|
||||||
if not doc:
|
return f"{_TYPE_ID_RULES_MAP[row[0]]}/rules.yaml"
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# 2. Document-level override
|
return loop.run_until_complete(_fetch())
|
||||||
rfp = doc.get("rules_file_path")
|
|
||||||
if rfp:
|
|
||||||
return rfp
|
|
||||||
|
|
||||||
# 3. type_id mapping
|
|
||||||
type_id = doc.get("type_id")
|
|
||||||
if type_id and type_id in _TYPE_ID_RULES_MAP:
|
|
||||||
return f"{_TYPE_ID_RULES_MAP[type_id]}/rules.yaml"
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
log.task.warning(f"Failed to resolve rules_path from document: {e}")
|
log.warning(f"Failed to resolve rules_path from document: {e}")
|
||||||
|
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
async def _update_status_safe(document_id: int, status: str) -> None:
|
def _resolve_rules_runtime(
|
||||||
"""Safely update document status, ignoring errors."""
|
document_id: int,
|
||||||
|
run_id: int,
|
||||||
|
explicit_rules_path: str | None,
|
||||||
|
loop: asyncio.AbstractEventLoop,
|
||||||
|
) -> dict[str, str | None]:
|
||||||
|
"""Resolve the rule source used for this run."""
|
||||||
|
if explicit_rules_path:
|
||||||
|
return {
|
||||||
|
"rules_path": explicit_rules_path,
|
||||||
|
"temp_rule_path": None,
|
||||||
|
"source_type": "explicit",
|
||||||
|
"source_path": explicit_rules_path,
|
||||||
|
}
|
||||||
|
|
||||||
|
resolver = RuleVersionResolver()
|
||||||
try:
|
try:
|
||||||
client = get_postgrest_client()
|
payload = loop.run_until_complete(resolver.ResolveForRun(run_id))
|
||||||
await client.update(
|
if payload:
|
||||||
table="documents",
|
log.info(
|
||||||
filters={"id": f"eq.{document_id}"},
|
f"run_id={run_id} rule source resolved: sourceType={payload.sourceType}, "
|
||||||
data={"status": status},
|
f"sourcePath={payload.sourcePath}, localPath={payload.localPath}"
|
||||||
)
|
)
|
||||||
|
return {
|
||||||
|
"rules_path": payload.localPath,
|
||||||
|
"temp_rule_path": payload.localPath if payload.sourceType == "oss" else None,
|
||||||
|
"source_type": payload.sourceType,
|
||||||
|
"source_path": payload.sourcePath,
|
||||||
|
}
|
||||||
|
except Exception as e:
|
||||||
|
log.warning(f"Failed to resolve rule version from run: run_id={run_id}, error={e}")
|
||||||
|
|
||||||
|
fallback_rules_path = _resolve_rules_path(document_id, loop)
|
||||||
|
return {
|
||||||
|
"rules_path": fallback_rules_path,
|
||||||
|
"temp_rule_path": None,
|
||||||
|
"source_type": "legacy_fallback",
|
||||||
|
"source_path": fallback_rules_path,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_run_id(
|
||||||
|
document_id: int,
|
||||||
|
upload_info: Dict[str, Any] | None,
|
||||||
|
loop: asyncio.AbstractEventLoop,
|
||||||
|
) -> int:
|
||||||
|
"""Resolve the run ID that corresponds to this task."""
|
||||||
|
if upload_info:
|
||||||
|
for key in ("run_id", "runId"):
|
||||||
|
value = upload_info.get(key)
|
||||||
|
if isinstance(value, int):
|
||||||
|
return value
|
||||||
|
|
||||||
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from sqlalchemy import text as sa_text
|
||||||
|
|
||||||
|
async def _fetch() -> int:
|
||||||
|
async with GetAsyncSession() as session:
|
||||||
|
result = await session.execute(
|
||||||
|
sa_text("SELECT id FROM leaudit_audit_runs WHERE document_id = :did ORDER BY id DESC LIMIT 1"),
|
||||||
|
{"did": document_id},
|
||||||
|
)
|
||||||
|
row = result.fetchone()
|
||||||
|
if not row:
|
||||||
|
raise ValueError(f"No run record found for document_id={document_id}")
|
||||||
|
return int(row[0])
|
||||||
|
|
||||||
|
return loop.run_until_complete(_fetch())
|
||||||
|
|
||||||
|
|
||||||
|
def _optional_int(payload: Dict[str, Any] | None, *keys: str) -> int | None:
|
||||||
|
"""Return the first integer value found in the dict under the given keys, in order."""
|
||||||
|
if not payload:
|
||||||
|
return None
|
||||||
|
|
||||||
|
for key in keys:
|
||||||
|
value = payload.get(key)
|
||||||
|
if isinstance(value, int):
|
||||||
|
return value
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
async def _update_status_safe(document_id: int, status: str) -> None:
|
||||||
|
"""Safely update document status via SQLAlchemy, ignoring errors."""
|
||||||
|
try:
|
||||||
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from sqlalchemy import text as sa_text
|
||||||
|
|
||||||
|
async with GetAsyncSession() as session:
|
||||||
|
await session.execute(
|
||||||
|
sa_text("UPDATE leaudit_documents SET processing_status = :s, update_time = now() WHERE id = :did"),
|
||||||
|
{"s": status, "did": document_id},
|
||||||
|
)
|
||||||
|
await session.commit()
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
async def _update_run_status_safe(run_id: int, status: str) -> None:
|
||||||
|
"""Safely update the run status, ignoring errors."""
|
||||||
|
try:
|
||||||
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from sqlalchemy import text as sa_text
|
||||||
|
|
||||||
|
async with GetAsyncSession() as session:
|
||||||
|
await session.execute(
|
||||||
|
sa_text("UPDATE leaudit_audit_runs SET status = :s, update_time = now() WHERE id = :rid"),
|
||||||
|
{"s": status, "rid": run_id},
|
||||||
|
)
|
||||||
|
await session.commit()
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
async def _update_run_phase_safe(run_id: int, phase: str | None) -> None:
|
||||||
|
"""Safely update the run phase, ignoring errors."""
|
||||||
|
try:
|
||||||
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
|
from sqlalchemy import text as sa_text
|
||||||
|
|
||||||
|
async with GetAsyncSession() as session:
|
||||||
|
await session.execute(
|
||||||
|
sa_text("UPDATE leaudit_audit_runs SET phase = :p, update_time = now() WHERE id = :rid"),
|
||||||
|
{"p": phase, "rid": run_id},
|
||||||
|
)
|
||||||
|
await session.commit()
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
|
|
||||||
@@ -190,12 +339,16 @@ def dispatch_leaudit_task(
|
|||||||
source_port: Optional[int] = None,
|
source_port: Optional[int] = None,
|
||||||
rules_path: Optional[str] = None,
|
rules_path: Optional[str] = None,
|
||||||
):
|
):
|
||||||
"""Dispatch a leaudit processing task."""
|
"""Dispatch a leaudit processing task.
|
||||||
return leaudit_process_document.apply_async(
|
|
||||||
args=[document_id, file_content, filename],
|
P2: after Celery integration, switch to leaudit_process_document.apply_async(...)
|
||||||
kwargs={
|
At this stage the task is called directly and synchronously.
|
||||||
"upload_info": upload_info,
|
"""
|
||||||
"source_port": source_port or int(os.getenv("APP_PORT", "8000")),
|
return leaudit_process_document(
|
||||||
"rules_path": rules_path,
|
document_id=document_id,
|
||||||
},
|
file_content=file_content,
|
||||||
|
filename=filename,
|
||||||
|
upload_info=upload_info,
|
||||||
|
source_port=source_port or int(os.getenv("APP_PORT", "8000")),
|
||||||
|
rules_path=rules_path,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -2,7 +2,8 @@
|
|||||||
|
|
||||||
from fastapi_modules.fastapi_leaudit.services.auditService import IAuditService
|
from fastapi_modules.fastapi_leaudit.services.auditService import IAuditService
|
||||||
from fastapi_modules.fastapi_leaudit.services.authService import IAuthService
|
from fastapi_modules.fastapi_leaudit.services.authService import IAuthService
|
||||||
|
from fastapi_modules.fastapi_leaudit.services.ossService import IOssService
|
||||||
from fastapi_modules.fastapi_leaudit.services.permissionService import IPermissionService
|
from fastapi_modules.fastapi_leaudit.services.permissionService import IPermissionService
|
||||||
from fastapi_modules.fastapi_leaudit.services.ruleService import IRuleService
|
from fastapi_modules.fastapi_leaudit.services.ruleService import IRuleService
|
||||||
|
|
||||||
__all__ = ["IAuditService", "IAuthService", "IPermissionService", "IRuleService"]
|
__all__ = ["IAuditService", "IAuthService", "IOssService", "IPermissionService", "IRuleService"]
|
||||||
|
|||||||
@@ -9,7 +9,7 @@ class IAuditService(ABC):
|
|||||||
"""Audit service interface."""
|
"""Audit service interface."""
|
||||||
|
|
||||||
@abstractmethod
|
@abstractmethod
|
||||||
async def Run(self, DocumentId: int) -> AuditRunVO:
|
async def Run(self, DocumentId: int, RuleType: str | None = None, Force: bool = False) -> AuditRunVO:
|
||||||
"""Trigger a document audit."""
|
"""Trigger a document audit."""
|
||||||
...
|
...
|
||||||
|
|
||||||
|
|||||||
@@ -4,13 +4,22 @@
|
|||||||
Document → OCR → Extract → Evaluate → Rescue → Persist
|
Document → OCR → Extract → Evaluate → Rescue → Persist
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
from fastapi_common.fastapi_common_logger import logger
|
from fastapi_common.fastapi_common_logger import logger
|
||||||
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
|
||||||
from fastapi_common.fastapi_common_web.domain.responses import StatusCodeEnum
|
from fastapi_common.fastapi_common_web.domain.responses import StatusCodeEnum
|
||||||
from fastapi_common.fastapi_common_web.exception.LeauditException import LeauditException
|
from fastapi_common.fastapi_common_web.exception.LeauditException import LeauditException
|
||||||
|
from sqlalchemy import select, text
|
||||||
|
|
||||||
from fastapi_modules.fastapi_leaudit.domian.vo.auditVo import AuditRunVO, AuditResultVO
|
from fastapi_modules.fastapi_leaudit.domian.vo.auditVo import AuditRunVO, AuditResultVO
|
||||||
from fastapi_modules.fastapi_leaudit.models import LeauditAuditRun
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.fileSourceResolver import FileSourceResolver
|
||||||
|
from fastapi_modules.fastapi_leaudit.leaudit_bridge.tasks import dispatch_leaudit_task
|
||||||
|
from fastapi_modules.fastapi_leaudit.models import (
|
||||||
|
LeauditAuditRun,
|
||||||
|
LeauditDocument,
|
||||||
|
LeauditDocumentFile,
|
||||||
|
)
|
||||||
from fastapi_modules.fastapi_leaudit.services import IAuditService
|
from fastapi_modules.fastapi_leaudit.services import IAuditService
|
||||||
|
|
||||||
|
|
||||||
@@ -20,12 +29,116 @@ class AuditServiceImpl(IAuditService):
|
|||||||
async def Run(self, DocumentId: int, RuleType: str | None = None, Force: bool = False) -> AuditRunVO:
|
async def Run(self, DocumentId: int, RuleType: str | None = None, Force: bool = False) -> AuditRunVO:
|
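||||||
"""Trigger a document audit.
|
"""Trigger a document audit.
|
||||||
|
|
||||||
The actual execution is handled asynchronously by a Celery task.
|
For now the bridge execution chain is triggered synchronously; it will switch to Celery async dispatch later.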
|
||||||
"""
|
"""
|
||||||
async with GetAsyncSession() as session:
|
async with GetAsyncSession() as session:
|
||||||
# TODO: obtain the pipeline from the bridge layer and submit a Celery task
|
|
||||||
logger.info(f"Triggering audit: documentId={DocumentId}, ruleType={RuleType}")
|
logger.info(f"Triggering audit: documentId={DocumentId}, ruleType={RuleType}")
|
||||||
raise LeauditException(StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR, "Celery task integration is not implemented yet")
|
document = await session.get(LeauditDocument, DocumentId)
|
||||||
|
if not document:
|
||||||
|
raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "Audit document not found")
|
||||||
|
|
||||||
|
fileResult = await session.execute(
|
||||||
|
select(LeauditDocumentFile)
|
||||||
|
.where(
|
||||||
|
LeauditDocumentFile.documentId == DocumentId,
|
||||||
|
LeauditDocumentFile.isActive.is_(True),
|
||||||
|
)
|
||||||
|
.order_by(LeauditDocumentFile.Id.desc())
|
||||||
|
.limit(1)
|
||||||
|
)
|
||||||
|
documentFile = fileResult.scalar_one_or_none()
|
||||||
|
if not documentFile:
|
||||||
|
raise LeauditException(StatusCodeEnum.HTTP_400_BAD_REQUEST, "The document has no active file version to audit")
|
||||||
|
|
||||||
|
runNoResult = await session.execute(
|
||||||
|
select(LeauditAuditRun.runNo)
|
||||||
|
.where(LeauditAuditRun.documentId == DocumentId)
|
||||||
|
.order_by(LeauditAuditRun.runNo.desc())
|
||||||
|
.limit(1)
|
||||||
|
)
|
||||||
|
latestRunNo = runNoResult.scalar_one_or_none() or 0
|
||||||
|
|
||||||
|
bindingResult = await session.execute(
|
||||||
|
text(
|
||||||
|
"""
|
||||||
|
SELECT
|
||||||
|
rs.id AS rule_set_id,
|
||||||
|
rs.current_version_id AS rule_version_id,
|
||||||
|
rv.oss_url AS rule_source_oss_url,
|
||||||
|
rv.file_sha256 AS rule_source_sha256,
|
||||||
|
rv.metadata_type_id AS rule_type_id
|
||||||
|
FROM leaudit_rule_type_bindings b
|
||||||
|
JOIN leaudit_rule_sets rs ON rs.id = b.rule_set_id
|
||||||
|
LEFT JOIN leaudit_rule_versions rv ON rv.id = rs.current_version_id
|
||||||
|
WHERE b.doc_type_id = :doc_type_id
|
||||||
|
AND b.is_active = true
|
||||||
|
ORDER BY b.priority DESC, b.id DESC
|
||||||
|
LIMIT 1
|
||||||
|
"""
|
||||||
|
),
|
||||||
|
{"doc_type_id": document.typeId},
|
||||||
|
)
|
||||||
|
binding = bindingResult.mappings().first()
|
||||||
|
if not binding or not binding["rule_set_id"] or not binding["rule_version_id"]:
|
||||||
|
raise LeauditException(StatusCodeEnum.HTTP_400_BAD_REQUEST, "No usable rule version is bound to this document type")
|
||||||
|
|
||||||
|
run = LeauditAuditRun(
|
||||||
|
documentId=DocumentId,
|
||||||
|
documentFileId=documentFile.Id,
|
||||||
|
runNo=int(latestRunNo) + 1,
|
||||||
|
triggerSource="manual" if not Force else "retry",
|
||||||
|
status="pending",
|
||||||
|
ruleSetId=int(binding["rule_set_id"]),
|
||||||
|
ruleVersionId=int(binding["rule_version_id"]),
|
||||||
|
ruleTypeId=binding["rule_type_id"],
|
||||||
|
ruleSourceOssUrl=binding["rule_source_oss_url"],
|
||||||
|
ruleSourceSha256=binding["rule_source_sha256"],
|
||||||
|
startedAt=datetime.now(),
|
||||||
|
)
|
||||||
|
session.add(run)
|
||||||
|
await session.flush()
|
||||||
|
|
||||||
|
document.currentRunId = run.Id
|
||||||
|
document.processingStatus = "running"
|
||||||
|
await session.commit()
|
||||||
|
await session.refresh(run)
|
||||||
|
|
||||||
|
try:
|
||||||
|
Resolver = FileSourceResolver()
|
||||||
|
Payload = await Resolver.ResolvePayload(documentFile)
|
||||||
|
except Exception as Error:
|
||||||
|
raise LeauditException(
|
||||||
|
StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR,
|
||||||
|
f"Failed to read the audit file: {Error}",
|
||||||
|
) from Error
|
||||||
|
|
||||||
|
dispatch_leaudit_task(
|
||||||
|
document_id=DocumentId,
|
||||||
|
file_content=Payload.fileContent,
|
||||||
|
filename=Payload.fileName,
|
||||||
|
upload_info={
|
||||||
|
"run_id": run.Id,
|
||||||
|
"rule_version_id": run.ruleVersionId,
|
||||||
|
"rule_source_oss_url": run.ruleSourceOssUrl,
|
||||||
|
"source_type": Payload.sourceType,
|
||||||
|
"source_path": Payload.sourcePath,
|
||||||
|
},
|
||||||
|
rules_path=RuleType,
|
||||||
|
)
|
||||||
|
|
||||||
|
await session.refresh(run)
|
||||||
|
return AuditRunVO(
|
||||||
|
runId=run.Id,
|
||||||
|
documentId=run.documentId,
|
||||||
|
runNo=run.runNo,
|
||||||
|
status=run.status,
|
||||||
|
phase=run.phase,
|
||||||
|
totalScore=float(run.totalScore) if run.totalScore is not None else None,  # keep a legitimate score of 0
|
||||||
|
passedCount=run.passedCount,
|
||||||
|
failedCount=run.failedCount,
|
||||||
|
startedAt=run.startedAt,
|
||||||
|
finishedAt=run.finishedAt,
|
||||||
|
)
|
||||||
|
|
||||||
async def GetRunStatus(self, RunId: int) -> AuditRunVO:
|
async def GetRunStatus(self, RunId: int) -> AuditRunVO:
|
||||||
"""Query the status of an audit run."""
|
"""Query the status of an audit run."""
|
||||||
@@ -52,7 +165,33 @@ class AuditServiceImpl(IAuditService):
|
|||||||
run = await session.get(LeauditAuditRun, RunId)
|
run = await session.get(LeauditAuditRun, RunId)
|
||||||
if not run:
|
if not run:
|
||||||
raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "Audit run record not found")
|
raise LeauditException(StatusCodeEnum.HTTP_404_NOT_FOUND, "Audit run record not found")
|
||||||
# TODO: query rule-level results from the leaudit_rule_results table
|
result = await session.execute(
|
||||||
|
text(
|
||||||
|
"""
|
||||||
|
SELECT
|
||||||
|
rule_id,
|
||||||
|
rule_name,
|
||||||
|
risk,
|
||||||
|
score,
|
||||||
|
passed,
|
||||||
|
status,
|
||||||
|
skip_reason,
|
||||||
|
confidence,
|
||||||
|
pass_message,
|
||||||
|
fail_message,
|
||||||
|
remediation,
|
||||||
|
extracted_fields,
|
||||||
|
field_positions,
|
||||||
|
rescue_applied,
|
||||||
|
rescue_passed
|
||||||
|
FROM leaudit_rule_results
|
||||||
|
WHERE run_id = :run_id
|
||||||
|
ORDER BY id ASC
|
||||||
|
"""
|
||||||
|
),
|
||||||
|
{"run_id": RunId},
|
||||||
|
)
|
||||||
|
rules = [dict(row) for row in result.mappings().all()]
|
||||||
return AuditResultVO(
|
return AuditResultVO(
|
||||||
runId=run.Id,
|
runId=run.Id,
|
||||||
totalScore=float(run.totalScore) if run.totalScore is not None else None,
|
totalScore=float(run.totalScore) if run.totalScore is not None else None,
|
||||||
@@ -61,5 +200,5 @@ class AuditServiceImpl(IAuditService):
|
|||||||
skippedCount=run.skippedCount or 0,
|
skippedCount=run.skippedCount or 0,
|
||||||
phase=run.phase,
|
phase=run.phase,
|
||||||
rescueApplied=run.rescueApplied or False,
|
rescueApplied=run.rescueApplied or False,
|
||||||
rules=[],
|
rules=rules,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -0,0 +1,69 @@
|
|||||||
|
"""OSS service implementation."""
|
||||||
|
|
||||||
|
from fastapi_common.fastapi_common_storage.oss_client import OssClient
|
||||||
|
|
||||||
|
from fastapi_modules.fastapi_leaudit.services.ossService import IOssService
|
||||||
|
|
||||||
|
|
||||||
|
class OssServiceImpl(IOssService):
|
||||||
|
"""OSS service implementation."""
|
||||||
|
|
||||||
|
def __init__(self, Client: OssClient | None = None) -> None:
|
||||||
|
self.Client = Client or OssClient()
|
||||||
|
|
||||||
|
async def DownloadBytes(self, Source: str, Bucket: str | None = None) -> bytes:
|
||||||
|
"""Download the object's content."""
|
||||||
|
return self.Client.DownloadBytes(Source=Source, Bucket=Bucket)
|
||||||
|
|
||||||
|
async def DownloadToTempFile(
|
||||||
|
self,
|
||||||
|
Source: str,
|
||||||
|
Suffix: str = "",
|
||||||
|
Prefix: str = "oss-",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Download the object to a local temporary file."""
|
||||||
|
return self.Client.DownloadToTempFile(
|
||||||
|
Source=Source,
|
||||||
|
Suffix=Suffix,
|
||||||
|
Prefix=Prefix,
|
||||||
|
Bucket=Bucket,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def UploadBytes(
|
||||||
|
self,
|
||||||
|
ObjectKey: str,
|
||||||
|
Content: bytes,
|
||||||
|
ContentType: str = "application/octet-stream",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Upload binary content."""
|
||||||
|
return self.Client.UploadBytes(
|
||||||
|
ObjectKey=ObjectKey,
|
||||||
|
Content=Content,
|
||||||
|
ContentType=ContentType,
|
||||||
|
Bucket=Bucket,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def UploadText(
|
||||||
|
self,
|
||||||
|
ObjectKey: str,
|
||||||
|
Content: str,
|
||||||
|
ContentType: str = "text/plain; charset=utf-8",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Upload text content."""
|
||||||
|
return self.Client.UploadText(
|
||||||
|
ObjectKey=ObjectKey,
|
||||||
|
Content=Content,
|
||||||
|
ContentType=ContentType,
|
||||||
|
Bucket=Bucket,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def ObjectExists(self, Source: str, Bucket: str | None = None) -> bool:
|
||||||
|
"""Check whether the object exists."""
|
||||||
|
return self.Client.ObjectExists(Source=Source, Bucket=Bucket)
|
||||||
|
|
||||||
|
async def PresignGetUrl(self, Source: str, Bucket: str | None = None) -> str:
|
||||||
|
"""Generate a presigned download URL for the object."""
|
||||||
|
return self.Client.PresignGetUrl(Source=Source, Bucket=Bucket)
|
||||||
@@ -0,0 +1,55 @@
|
|||||||
|
"""OSS service interface."""
|
||||||
|
|
||||||
|
from abc import ABC, abstractmethod
|
||||||
|
|
||||||
|
|
||||||
|
class IOssService(ABC):
|
||||||
|
"""OSS service interface."""
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def DownloadBytes(self, Source: str, Bucket: str | None = None) -> bytes:
|
||||||
|
"""Download the object's content."""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def DownloadToTempFile(
|
||||||
|
self,
|
||||||
|
Source: str,
|
||||||
|
Suffix: str = "",
|
||||||
|
Prefix: str = "oss-",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Download the object to a local temporary file."""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def UploadBytes(
|
||||||
|
self,
|
||||||
|
ObjectKey: str,
|
||||||
|
Content: bytes,
|
||||||
|
ContentType: str = "application/octet-stream",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Upload binary content."""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def UploadText(
|
||||||
|
self,
|
||||||
|
ObjectKey: str,
|
||||||
|
Content: str,
|
||||||
|
ContentType: str = "text/plain; charset=utf-8",
|
||||||
|
Bucket: str | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Upload text content."""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def ObjectExists(self, Source: str, Bucket: str | None = None) -> bool:
|
||||||
|
"""Check whether the object exists."""
|
||||||
|
...
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def PresignGetUrl(self, Source: str, Bucket: str | None = None) -> str:
|
||||||
|
"""Generate a presigned download URL for the object."""
|
||||||
|
...
|
||||||
@@ -21,6 +21,7 @@ dependencies = [
|
|||||||
"pyjwt>=2.10.0",
|
"pyjwt>=2.10.0",
|
||||||
"pillow>=11.0.0",
|
"pillow>=11.0.0",
|
||||||
"pyyaml>=6.0",
|
"pyyaml>=6.0",
|
||||||
|
"minio>=7.2.8",
|
||||||
"leaudit",
|
"leaudit",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|||||||
@@ -18,8 +18,8 @@ CREATE TABLE IF NOT EXISTS leaudit_entry_modules (
|
|||||||
areas JSONB,
|
areas JSONB,
|
||||||
sort_order INTEGER NOT NULL DEFAULT 0,
|
sort_order INTEGER NOT NULL DEFAULT 0,
|
||||||
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
||||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
create_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
update_time TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
);
|
);
|
||||||
|
|
||||||
COMMENT ON TABLE leaudit_entry_modules IS '入口模块表 — 定义前端导航入口菜单及其区域配置';
|
COMMENT ON TABLE leaudit_entry_modules IS '入口模块表 — 定义前端导航入口菜单及其区域配置';
|
||||||
@@ -32,8 +32,8 @@ COMMENT ON COLUMN leaudit_entry_modules.icon_path IS '入口菜单图标OSS路
|
|||||||
COMMENT ON COLUMN leaudit_entry_modules.areas IS '区域配置JSON,格式: [{"area":"梅州","enabled":true,"sort_order":1}]';
|
COMMENT ON COLUMN leaudit_entry_modules.areas IS '区域配置JSON,格式: [{"area":"梅州","enabled":true,"sort_order":1}]';
|
||||||
COMMENT ON COLUMN leaudit_entry_modules.sort_order IS '排序序号';
|
COMMENT ON COLUMN leaudit_entry_modules.sort_order IS '排序序号';
|
||||||
COMMENT ON COLUMN leaudit_entry_modules.is_enabled IS '是否启用';
|
COMMENT ON COLUMN leaudit_entry_modules.is_enabled IS '是否启用';
|
||||||
COMMENT ON COLUMN leaudit_entry_modules.created_at IS '创建时间';
|
COMMENT ON COLUMN leaudit_entry_modules.create_time IS '创建时间';
|
||||||
COMMENT ON COLUMN leaudit_entry_modules.updated_at IS '更新时间';
|
COMMENT ON COLUMN leaudit_entry_modules.update_time IS '更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- ============================================================================
|
-- ============================================================================
|
||||||
@@ -51,9 +51,9 @@ CREATE TABLE IF NOT EXISTS leaudit_document_types (
|
|||||||
prompt_config JSONB,
|
prompt_config JSONB,
|
||||||
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
||||||
sort_order INTEGER NOT NULL DEFAULT 0,
|
sort_order INTEGER NOT NULL DEFAULT 0,
|
||||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
create_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
update_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
deleted_at TIMESTAMPTZ
|
delete_time TIMESTAMPTZ
|
||||||
);
|
);
|
||||||
|
|
||||||
COMMENT ON TABLE leaudit_document_types IS '文档类型定义表 — 定义系统支持的文档类型及其分类/抽取配置';
|
COMMENT ON TABLE leaudit_document_types IS '文档类型定义表 — 定义系统支持的文档类型及其分类/抽取配置';
|
||||||
@@ -69,9 +69,9 @@ COMMENT ON COLUMN leaudit_document_types.extraction_mode IS '抽取模式: singl
|
|||||||
COMMENT ON COLUMN leaudit_document_types.prompt_config IS '提示词配置JSON,覆盖默认的LLM提示词模板';
|
COMMENT ON COLUMN leaudit_document_types.prompt_config IS '提示词配置JSON,覆盖默认的LLM提示词模板';
|
||||||
COMMENT ON COLUMN leaudit_document_types.is_enabled IS '是否启用';
|
COMMENT ON COLUMN leaudit_document_types.is_enabled IS '是否启用';
|
||||||
COMMENT ON COLUMN leaudit_document_types.sort_order IS '排序序号';
|
COMMENT ON COLUMN leaudit_document_types.sort_order IS '排序序号';
|
||||||
COMMENT ON COLUMN leaudit_document_types.created_at IS '创建时间';
|
COMMENT ON COLUMN leaudit_document_types.create_time IS '创建时间';
|
||||||
COMMENT ON COLUMN leaudit_document_types.updated_at IS '更新时间';
|
COMMENT ON COLUMN leaudit_document_types.update_time IS '更新时间';
|
||||||
COMMENT ON COLUMN leaudit_document_types.deleted_at IS '软删除时间,NULL表示未删除';
|
COMMENT ON COLUMN leaudit_document_types.delete_time IS '软删除时间,NULL表示未删除';
|
||||||
|
|
||||||
|
|
||||||
-- ============================================================================
|
-- ============================================================================
|
||||||
@@ -86,9 +86,9 @@ CREATE TABLE IF NOT EXISTS leaudit_evaluation_point_groups (
|
|||||||
rule_set_id BIGINT,
|
rule_set_id BIGINT,
|
||||||
sort_order INTEGER NOT NULL DEFAULT 0,
|
sort_order INTEGER NOT NULL DEFAULT 0,
|
||||||
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
is_enabled BOOLEAN NOT NULL DEFAULT true,
|
||||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
create_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
update_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
deleted_at TIMESTAMPTZ
|
delete_time TIMESTAMPTZ
|
||||||
);
|
);
|
||||||
|
|
||||||
COMMENT ON TABLE leaudit_evaluation_point_groups IS '评查点规则组表 — 评查点的树形分组结构,支持PID层级嵌套';
|
COMMENT ON TABLE leaudit_evaluation_point_groups IS '评查点规则组表 — 评查点的树形分组结构,支持PID层级嵌套';
|
||||||
@@ -101,9 +101,9 @@ COMMENT ON COLUMN leaudit_evaluation_point_groups.description IS '分组描述';
|
|||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.rule_set_id IS '关联的规则集ID,外键引用 leaudit_rule_sets.id';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.rule_set_id IS '关联的规则集ID,外键引用 leaudit_rule_sets.id';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.sort_order IS '同级排序序号';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.sort_order IS '同级排序序号';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.is_enabled IS '是否启用';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.is_enabled IS '是否启用';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.created_at IS '创建时间';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.create_time IS '创建时间';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.updated_at IS '更新时间';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.update_time IS '更新时间';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_point_groups.deleted_at IS '软删除时间,NULL表示未删除';
|
COMMENT ON COLUMN leaudit_evaluation_point_groups.delete_time IS '软删除时间,NULL表示未删除';
|
||||||
|
|
||||||
|
|
||||||
-- ============================================================================
|
-- ============================================================================
|
||||||
@@ -133,9 +133,9 @@ CREATE TABLE IF NOT EXISTS leaudit_evaluation_points (
|
|||||||
action_config TEXT,
|
action_config TEXT,
|
||||||
document_attribute_type VARCHAR(20) DEFAULT 'ALL',
|
document_attribute_type VARCHAR(20) DEFAULT 'ALL',
|
||||||
sort_order INTEGER NOT NULL DEFAULT 0,
|
sort_order INTEGER NOT NULL DEFAULT 0,
|
||||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
create_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
|
update_time TIMESTAMPTZ NOT NULL DEFAULT now(),
|
||||||
deleted_at TIMESTAMPTZ
|
delete_time TIMESTAMPTZ
|
||||||
);
|
);
|
||||||
|
|
||||||
COMMENT ON TABLE leaudit_evaluation_points IS '规则点(评查点)表 — 评查点的完整元数据定义,与DSL YAML中的rule一一对应';
|
COMMENT ON TABLE leaudit_evaluation_points IS '规则点(评查点)表 — 评查点的完整元数据定义,与DSL YAML中的rule一一对应';
|
||||||
@@ -163,9 +163,9 @@ COMMENT ON COLUMN leaudit_evaluation_points.post_action IS '评查后动作: non
|
|||||||
COMMENT ON COLUMN leaudit_evaluation_points.action_config IS '动作配置文本,配合 post_action 使用';
|
COMMENT ON COLUMN leaudit_evaluation_points.action_config IS '动作配置文本,配合 post_action 使用';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_points.document_attribute_type IS '适用文档属性: ALL(通用) | SALE(买卖) | LEASE(租赁) | SERVICE(服务) | MANDATE(委托) | CONSTRUCTION(建设工程) | TRAINING(培训) | TECHNOLOGY(技术) | DONATION(赠与) | TRANSPORT(运输) | STORAGE(仓储) | COOPERATION(合作) | UNDERTAKING(承揽)';
|
COMMENT ON COLUMN leaudit_evaluation_points.document_attribute_type IS '适用文档属性: ALL(通用) | SALE(买卖) | LEASE(租赁) | SERVICE(服务) | MANDATE(委托) | CONSTRUCTION(建设工程) | TRAINING(培训) | TECHNOLOGY(技术) | DONATION(赠与) | TRANSPORT(运输) | STORAGE(仓储) | COOPERATION(合作) | UNDERTAKING(承揽)';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_points.sort_order IS '排序序号';
|
COMMENT ON COLUMN leaudit_evaluation_points.sort_order IS '排序序号';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_points.created_at IS '创建时间';
|
COMMENT ON COLUMN leaudit_evaluation_points.create_time IS '创建时间';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_points.updated_at IS '更新时间';
|
COMMENT ON COLUMN leaudit_evaluation_points.update_time IS '更新时间';
|
||||||
COMMENT ON COLUMN leaudit_evaluation_points.deleted_at IS '软删除时间,NULL表示未删除';
|
COMMENT ON COLUMN leaudit_evaluation_points.delete_time IS '软删除时间,NULL表示未删除';
|
||||||
|
|
||||||
|
|
||||||
-- ============================================================================
|
-- ============================================================================
|
||||||
@@ -194,8 +194,8 @@ COMMENT ON COLUMN jwt_tokens.last_used_at IS '令牌最近一次使用时间';
|
|||||||
COMMENT ON COLUMN jwt_tokens.is_revoked IS '是否已被吊销';
|
COMMENT ON COLUMN jwt_tokens.is_revoked IS '是否已被吊销';
|
||||||
COMMENT ON COLUMN jwt_tokens.revoked_at IS '吊销时间';
|
COMMENT ON COLUMN jwt_tokens.revoked_at IS '吊销时间';
|
||||||
COMMENT ON COLUMN jwt_tokens.revoke_reason IS '吊销原因: logout(主动登出) | password_change(密码修改) | admin(管理员强制) | security(安全事件)';
|
COMMENT ON COLUMN jwt_tokens.revoke_reason IS '吊销原因: logout(主动登出) | password_change(密码修改) | admin(管理员强制) | security(安全事件)';
|
||||||
COMMENT ON COLUMN jwt_tokens.created_at IS '记录创建时间';
|
COMMENT ON COLUMN jwt_tokens.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN jwt_tokens.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN jwt_tokens.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -208,8 +208,8 @@ COMMENT ON COLUMN leaudit_documents.biz_document_id IS '业务系统文档ID,
|
|||||||
COMMENT ON COLUMN leaudit_documents.type_id IS '文档类型ID,外键引用 leaudit_document_types.id';
|
COMMENT ON COLUMN leaudit_documents.type_id IS '文档类型ID,外键引用 leaudit_document_types.id';
|
||||||
COMMENT ON COLUMN leaudit_documents.processing_status IS '处理状态: waiting(等待处理) | running(处理中) | completed(已完成) | failed(失败)';
|
COMMENT ON COLUMN leaudit_documents.processing_status IS '处理状态: waiting(等待处理) | running(处理中) | completed(已完成) | failed(失败)';
|
||||||
COMMENT ON COLUMN leaudit_documents.current_run_id IS '当前活跃的评查运行ID,外键引用 leaudit_audit_runs.id';
|
COMMENT ON COLUMN leaudit_documents.current_run_id IS '当前活跃的评查运行ID,外键引用 leaudit_audit_runs.id';
|
||||||
COMMENT ON COLUMN leaudit_documents.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_documents.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_documents.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_documents.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -230,7 +230,7 @@ COMMENT ON COLUMN leaudit_document_files.oss_url IS 'OSS对象存储URL(唯一
|
|||||||
COMMENT ON COLUMN leaudit_document_files.storage_provider IS '存储提供商: minio | aliyun_oss | local';
|
COMMENT ON COLUMN leaudit_document_files.storage_provider IS '存储提供商: minio | aliyun_oss | local';
|
||||||
COMMENT ON COLUMN leaudit_document_files.is_active IS '是否为当前活跃版本';
|
COMMENT ON COLUMN leaudit_document_files.is_active IS '是否为当前活跃版本';
|
||||||
COMMENT ON COLUMN leaudit_document_files.created_by IS '上传者用户ID,关联 sso_users.id';
|
COMMENT ON COLUMN leaudit_document_files.created_by IS '上传者用户ID,关联 sso_users.id';
|
||||||
COMMENT ON COLUMN leaudit_document_files.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_document_files.create_time IS '记录创建时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -269,8 +269,8 @@ COMMENT ON COLUMN leaudit_audit_runs.skipped_count IS '跳过的规则数(条
|
|||||||
COMMENT ON COLUMN leaudit_audit_runs.result_status IS '整体结果: pass(通过) | fail(不通过) | partial(部分通过) | review(需人工复核)';
|
COMMENT ON COLUMN leaudit_audit_runs.result_status IS '整体结果: pass(通过) | fail(不通过) | partial(部分通过) | review(需人工复核)';
|
||||||
COMMENT ON COLUMN leaudit_audit_runs.started_at IS '评查开始时间';
|
COMMENT ON COLUMN leaudit_audit_runs.started_at IS '评查开始时间';
|
||||||
COMMENT ON COLUMN leaudit_audit_runs.finished_at IS '评查结束时间';
|
COMMENT ON COLUMN leaudit_audit_runs.finished_at IS '评查结束时间';
|
||||||
COMMENT ON COLUMN leaudit_audit_runs.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_audit_runs.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_audit_runs.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_audit_runs.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -297,7 +297,7 @@ COMMENT ON COLUMN leaudit_artifacts.oss_url IS 'OSS对象存储URL(唯一真
|
|||||||
COMMENT ON COLUMN leaudit_artifacts.storage_provider IS '存储提供商: minio | aliyun_oss | local';
|
COMMENT ON COLUMN leaudit_artifacts.storage_provider IS '存储提供商: minio | aliyun_oss | local';
|
||||||
COMMENT ON COLUMN leaudit_artifacts.is_persisted IS '是否已持久化到OSS';
|
COMMENT ON COLUMN leaudit_artifacts.is_persisted IS '是否已持久化到OSS';
|
||||||
COMMENT ON COLUMN leaudit_artifacts.retention_policy IS '保留策略: temp(临时) | run_life(评查周期) | permanent(永久)';
|
COMMENT ON COLUMN leaudit_artifacts.retention_policy IS '保留策略: temp(临时) | run_life(评查周期) | permanent(永久)';
|
||||||
COMMENT ON COLUMN leaudit_artifacts.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_artifacts.create_time IS '记录创建时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -327,8 +327,8 @@ COMMENT ON COLUMN leaudit_rule_results.remediation IS '修正建议JSON,结构
|
|||||||
COMMENT ON COLUMN leaudit_rule_results.rescue_applied IS '是否已应用补救流程';
|
COMMENT ON COLUMN leaudit_rule_results.rescue_applied IS '是否已应用补救流程';
|
||||||
COMMENT ON COLUMN leaudit_rule_results.rescue_passed IS '补救后是否通过';
|
COMMENT ON COLUMN leaudit_rule_results.rescue_passed IS '补救后是否通过';
|
||||||
COMMENT ON COLUMN leaudit_rule_results.result_payload IS '原始结果JSON(完整保留leaudit引擎输出)';
|
COMMENT ON COLUMN leaudit_rule_results.result_payload IS '原始结果JSON(完整保留leaudit引擎输出)';
|
||||||
COMMENT ON COLUMN leaudit_rule_results.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_rule_results.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_results.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_rule_results.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -352,7 +352,7 @@ COMMENT ON COLUMN leaudit_field_results.rule_score IS '规则置信分(业务
|
|||||||
COMMENT ON COLUMN leaudit_field_results.hard_failed IS '是否硬失败(落在置信阈值以下且无兜底值)';
|
COMMENT ON COLUMN leaudit_field_results.hard_failed IS '是否硬失败(落在置信阈值以下且无兜底值)';
|
||||||
COMMENT ON COLUMN leaudit_field_results.fallback_value IS '兜底/替代值(置信不足时使用)';
|
COMMENT ON COLUMN leaudit_field_results.fallback_value IS '兜底/替代值(置信不足时使用)';
|
||||||
COMMENT ON COLUMN leaudit_field_results.meta_json IS '扩展元数据JSON(phase信息、field位置、分组等)';
|
COMMENT ON COLUMN leaudit_field_results.meta_json IS '扩展元数据JSON(phase信息、field位置、分组等)';
|
||||||
COMMENT ON COLUMN leaudit_field_results.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_field_results.create_time IS '记录创建时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -376,7 +376,7 @@ COMMENT ON COLUMN leaudit_run_metrics.llm_call_count IS 'LLM调用次数';
|
|||||||
COMMENT ON COLUMN leaudit_run_metrics.vlm_call_count IS 'VLM调用次数';
|
COMMENT ON COLUMN leaudit_run_metrics.vlm_call_count IS 'VLM调用次数';
|
||||||
COMMENT ON COLUMN leaudit_run_metrics.rescue_rule_count IS '补救规则数(触发rescue的规则数量)';
|
COMMENT ON COLUMN leaudit_run_metrics.rescue_rule_count IS '补救规则数(触发rescue的规则数量)';
|
||||||
COMMENT ON COLUMN leaudit_run_metrics.artifact_count IS '产出的产物文件总数';
|
COMMENT ON COLUMN leaudit_run_metrics.artifact_count IS '产出的产物文件总数';
|
||||||
COMMENT ON COLUMN leaudit_run_metrics.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_run_metrics.create_time IS '记录创建时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -392,7 +392,7 @@ COMMENT ON COLUMN leaudit_run_errors.level IS '错误级别: fatal(致命) | err
|
|||||||
COMMENT ON COLUMN leaudit_run_errors.error_code IS '错误码(标准化标识,如 OCR_TIMEOUT / LLM_API_ERROR)';
|
COMMENT ON COLUMN leaudit_run_errors.error_code IS '错误码(标准化标识,如 OCR_TIMEOUT / LLM_API_ERROR)';
|
||||||
COMMENT ON COLUMN leaudit_run_errors.message IS '错误描述信息';
|
COMMENT ON COLUMN leaudit_run_errors.message IS '错误描述信息';
|
||||||
COMMENT ON COLUMN leaudit_run_errors.detail_json IS '错误详情JSON(堆栈、上下文、请求参数等)';
|
COMMENT ON COLUMN leaudit_run_errors.detail_json IS '错误详情JSON(堆栈、上下文、请求参数等)';
|
||||||
COMMENT ON COLUMN leaudit_run_errors.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_run_errors.create_time IS '记录创建时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -414,8 +414,8 @@ COMMENT ON COLUMN leaudit_rescue_outcomes.vlm_calls IS '补救过程VLM调用次
|
|||||||
COMMENT ON COLUMN leaudit_rescue_outcomes.duration_ms IS '补救总耗时(毫秒)';
|
COMMENT ON COLUMN leaudit_rescue_outcomes.duration_ms IS '补救总耗时(毫秒)';
|
||||||
COMMENT ON COLUMN leaudit_rescue_outcomes.requires_human_review IS '是否需要人工复核';
|
COMMENT ON COLUMN leaudit_rescue_outcomes.requires_human_review IS '是否需要人工复核';
|
||||||
COMMENT ON COLUMN leaudit_rescue_outcomes.payload IS '补救过程完整数据JSON(含每层rescue的输入输出)';
|
COMMENT ON COLUMN leaudit_rescue_outcomes.payload IS '补救过程完整数据JSON(含每层rescue的输入输出)';
|
||||||
COMMENT ON COLUMN leaudit_rescue_outcomes.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_rescue_outcomes.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_rescue_outcomes.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_rescue_outcomes.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -433,9 +433,9 @@ COMMENT ON COLUMN leaudit_rule_sets.current_version_id IS '当前生效版本ID
|
|||||||
COMMENT ON COLUMN leaudit_rule_sets.status IS '状态: draft(草稿) | active(生效) | deprecated(废弃) | archived(归档)';
|
COMMENT ON COLUMN leaudit_rule_sets.status IS '状态: draft(草稿) | active(生效) | deprecated(废弃) | archived(归档)';
|
||||||
COMMENT ON COLUMN leaudit_rule_sets.is_builtin IS '是否为系统内置规则(内置规则不可删除)';
|
COMMENT ON COLUMN leaudit_rule_sets.is_builtin IS '是否为系统内置规则(内置规则不可删除)';
|
||||||
COMMENT ON COLUMN leaudit_rule_sets.owner_user_id IS '负责人用户ID,外键引用 sso_users.id';
|
COMMENT ON COLUMN leaudit_rule_sets.owner_user_id IS '负责人用户ID,外键引用 sso_users.id';
|
||||||
COMMENT ON COLUMN leaudit_rule_sets.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_rule_sets.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_sets.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_rule_sets.update_time IS '记录更新时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_sets.deleted_at IS '软删除时间,NULL表示未删除';
|
COMMENT ON COLUMN leaudit_rule_sets.delete_time IS '软删除时间,NULL表示未删除';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -461,8 +461,8 @@ COMMENT ON COLUMN leaudit_rule_versions.change_note IS '版本变更说明';
|
|||||||
COMMENT ON COLUMN leaudit_rule_versions.editor_user_id IS '编辑者用户ID,外键引用 sso_users.id';
|
COMMENT ON COLUMN leaudit_rule_versions.editor_user_id IS '编辑者用户ID,外键引用 sso_users.id';
|
||||||
COMMENT ON COLUMN leaudit_rule_versions.publisher_user_id IS '发布者用户ID,外键引用 sso_users.id';
|
COMMENT ON COLUMN leaudit_rule_versions.publisher_user_id IS '发布者用户ID,外键引用 sso_users.id';
|
||||||
COMMENT ON COLUMN leaudit_rule_versions.published_at IS '发布时间';
|
COMMENT ON COLUMN leaudit_rule_versions.published_at IS '发布时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_versions.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_rule_versions.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_versions.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_rule_versions.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- --------------------------------------------------------------------------
|
-- --------------------------------------------------------------------------
|
||||||
@@ -478,8 +478,8 @@ COMMENT ON COLUMN leaudit_rule_type_bindings.binding_mode IS '绑定模式: expl
|
|||||||
COMMENT ON COLUMN leaudit_rule_type_bindings.priority IS '优先级(数值越大优先级越高),通配符绑定通常优先级较低';
|
COMMENT ON COLUMN leaudit_rule_type_bindings.priority IS '优先级(数值越大优先级越高),通配符绑定通常优先级较低';
|
||||||
COMMENT ON COLUMN leaudit_rule_type_bindings.is_active IS '是否激活';
|
COMMENT ON COLUMN leaudit_rule_type_bindings.is_active IS '是否激活';
|
||||||
COMMENT ON COLUMN leaudit_rule_type_bindings.note IS '备注说明';
|
COMMENT ON COLUMN leaudit_rule_type_bindings.note IS '备注说明';
|
||||||
COMMENT ON COLUMN leaudit_rule_type_bindings.created_at IS '记录创建时间';
|
COMMENT ON COLUMN leaudit_rule_type_bindings.create_time IS '记录创建时间';
|
||||||
COMMENT ON COLUMN leaudit_rule_type_bindings.updated_at IS '记录更新时间';
|
COMMENT ON COLUMN leaudit_rule_type_bindings.update_time IS '记录更新时间';
|
||||||
|
|
||||||
|
|
||||||
-- ============================================================================
|
-- ============================================================================
|
||||||
|
|||||||