feat: integrate govdoc platform updates #5

Merged
TanWenyan merged 3 commits from feat/govdoc-module-integration into main 2026-05-18 16:10:27 +08:00
10 changed files with 2151 additions and 92 deletions
Showing only changes of commit 1bacfe41b7
@@ -0,0 +1,513 @@
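# Internal Official Document (govdoc) Frontend Split: Implementation Checklist
## 1. Purpose
This document addresses exactly one question:
- How to split the current frontend implementation into a standalone page architecture at the same level as "cross-checking", without changing the business semantics of "internal official documents"
It covers:
- Page orchestration boundaries
- Component responsibility boundaries
- The reuse boundary between `govdoc` and `reviews / cross-checking`
- The phased implementation order
It does not:
- Change backend business semantics
- Require copying code from the old project
- Treat `Collabora` as the entire middle-column preview architecture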
---
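## 2. Conclusion First
The govdoc frontend should be refactored on the following principle:
> **Be a standalone page like cross-checking, but reuse the locating preview capability from reviews.**
More precisely:
- govdoc should have its own page orchestrator
- govdoc should have its own business component layer
- govdoc should have its own TS service / adapter layer
- The middle-column PDF / DOCX preview should not be reinvented; the existing `reviews` capability should be reused first
- `Collabora` should serve only as the DOCX viewer and should not carry the main finding-location architecture
The goal is therefore not to:
- Keep extending the current `govdoc-audit` page patch by patch
but to:
- Converge the govdoc frontend into a platform-style implementation: standalone page orchestration + unified preview protocol + standalone business shell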
---
## 3. Current State
## 3.1 Current govdoc frontend entry points
The main entry of the govdoc detail page is currently:
- [components/govdoc-audit/audit.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/govdoc-audit/audit.tsx:1)
The main entry of the govdoc list page is currently:
- [components/govdoc-audit/audits.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/govdoc-audit/audits.tsx:1)
The route entries are currently:
- [app/(audit)/govdoc/audits/page.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/govdoc/audits/page.tsx:1)
- [app/(audit)/govdoc/detail/[documentId]/page.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/govdoc/detail/[documentId]/page.tsx:1)
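The detail page already provides:
- A top summary and report download actions
- Audit / structure / outline / entity tabs
- A middle-column document view
- A right-column display of findings / checked rules
The problem is not missing functionality; it is that page responsibilities are mixed together.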
---
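## 3.2 Coupling points in the current detail page
[audit.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/govdoc-audit/audit.tsx:1) currently handles all of the following at once:
- Page data loading
- Rendering the top action area
- Tab state switching
- Rendering the result statistics bar
- Dispatching the middle-column document view
- Dispatching the right-column finding panel
- Dispatching the rule dialog
This causes three problems:
- The page orchestrator and the business components are not layered
- The middle-column preview protocol is not aligned with the platform's existing `reviews` capability
- The right-column finding area cannot reuse the locate interaction of `reviews / cross-checking`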
---
## 3.3 Why the current middle-column preview should not be extended further
The govdoc middle column currently relies mainly on:
- [components/govdoc-audit/doc-view.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/govdoc-audit/doc-view.tsx:1)
The platform's mature locating preview capability lives in:
- [components/reviews/previewComponents/PdfPreviewTest.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/reviews/previewComponents/PdfPreviewTest.tsx:1)
- [components/reviews/previewComponents/DocxPreviewTest.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/reviews/previewComponents/DocxPreviewTest.tsx:1)
- [app/(audit)/reviews-test/ReviewsTestClient.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/reviews-test/ReviewsTestClient.tsx:1)
- [app/(audit)/cross-checking/result/CrossCheckingResultClient.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/cross-checking/result/CrossCheckingResultClient.tsx:1)
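To be explicit:
- The main locating capability of the PDF middle column is not `Collabora`
- The DOCX middle column does currently use `CollaboraViewer`, but it handles document rendering, not a complete finding-location architecture
If govdoc needs to:
- Jump to the corresponding page when a finding is clicked
- Highlight the offending field or paragraph
- Leave room to upgrade to line-level finding location later
then the middle column must align with the existing platform preview input protocol rather than keep growing `DocView` into an isolated implementation.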
---
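## 4. Target Architecture
## 4.1 Overall principles
The target frontend architecture should satisfy four conditions:
- govdoc pages are orchestrated independently
- The middle-column preview capability is reused from the platform
- Business panels are owned by govdoc
- Data adaptation is concentrated in the adapter layer
In short:
> **govdoc owns the business shell; the platform owns the generic preview capability.**
---
## 4.2 Target directory structure
The suggested structure is: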
```text
legal-platform-frontend/
  app/(audit)/govdoc/
    audits/page.tsx
    detail/[documentId]/page.tsx
  components/govdoc-audit/
    GovdocAuditListPage.tsx
    GovdocAuditResultPage.tsx
    GovdocSummaryHeader.tsx
    GovdocFindingPanel.tsx
    GovdocStructurePanel.tsx
    GovdocOutlinePanel.tsx
    GovdocEntityPanel.tsx
    GovdocReportActions.tsx
  lib/api/govdoc-audit/
    api.ts
    types.ts
    adapters.ts
    govdoc-routes.ts
```
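Notes:
- `page.tsx` keeps only the route entry responsibility
- `GovdocAuditResultPage.tsx` is the detail page orchestrator
- `GovdocAuditListPage.tsx` is the list page orchestrator
- `Govdoc*Panel` components hold the business views unique to govdoc
- `adapters.ts` converts govdoc backend responses into frontend view models
---
## 4.3 Page orchestration responsibilities
### Detail page orchestrator
Suggested new file:
- `components/govdoc-audit/GovdocAuditResultPage.tsx`
This component is responsible only for:
- Reading `documentId / runId`
- Calling the govdoc API
- Maintaining tab state
- Maintaining the currently active finding
- Maintaining the current preview locate target
- Assembling the middle and right columns
It should not handle:
- Rendering details of individual finding cards
- UI details of the entity / structure / outline views
- Low-level preview rendering logic
This part should follow: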
- [CrossCheckingResultClient.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/cross-checking/result/CrossCheckingResultClient.tsx:1)
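The orchestrator state described above (tab, active finding, preview locate target) can be sketched as a small pure reducer. This is only an illustration under stated assumptions: `GovdocTab`, `GovdocPageState`, `GovdocPageAction`, and `govdocPageReducer` are hypothetical names that do not come from the repository, and the real `CrossCheckingResultClient.tsx` may organize its state differently.

```typescript
// Hypothetical types: none of these names exist in the repository.
type GovdocTab = "audit" | "structure" | "outline" | "entities";

interface GovdocActiveTarget {
  page: number;
  highlightValue?: string;
}

interface GovdocPageState {
  tab: GovdocTab;
  activeFindingId: string | null;
  previewTarget: GovdocActiveTarget | null;
}

type GovdocPageAction =
  | { type: "switchTab"; tab: GovdocTab }
  | { type: "selectFinding"; findingId: string; target: GovdocActiveTarget | null }
  | { type: "clearSelection" };

const initialGovdocPageState: GovdocPageState = {
  tab: "audit",
  activeFindingId: null,
  previewTarget: null,
};

// The orchestrator only tracks which finding is active and where the preview
// should point; card rendering and preview internals stay in child components.
function govdocPageReducer(
  state: GovdocPageState,
  action: GovdocPageAction,
): GovdocPageState {
  if (action.type === "switchTab") {
    // Switching tabs keeps the current selection so the preview does not jump.
    return { ...state, tab: action.tab };
  }
  if (action.type === "selectFinding") {
    return {
      ...state,
      activeFindingId: action.findingId,
      previewTarget: action.target,
    };
  }
  // Remaining case: "clearSelection".
  return { ...state, activeFindingId: null, previewTarget: null };
}
```

Keeping this state in one reducer means the right column only dispatches `selectFinding`, and the middle column only reads `previewTarget`, which matches the split of responsibilities listed above.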
---
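### List page orchestrator
Suggested new file:
- `components/govdoc-audit/GovdocAuditListPage.tsx`
This component is responsible for:
- List data loading
- Filter state
- Batch operation state
- Export and delete
- Navigation to the detail page
It should keep govdoc's own filter semantics, while its UI rhythm should move toward the platform document list page.
The list page design should follow: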
- [DocumentsListClient.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/app/(audit)/documents/list/DocumentsListClient.tsx:1)
---
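## 5. Reuse Boundary with reviews / cross-checking
## 5.1 Capabilities to reuse
govdoc should reuse the following:
- The PDF middle-column preview component
- The DOCX middle-column preview component
- The preview locate protocol triggered by clicking a finding
- Page jumping, highlighting, and bbox / charPositions locating
Preferred reuse targets: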
- [PdfPreviewTest.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/reviews/previewComponents/PdfPreviewTest.tsx:1)
- [DocxPreviewTest.tsx](/home/wren-dev/Porject/leaudit-platform/legal-platform-frontend/components/reviews/previewComponents/DocxPreviewTest.tsx:1)
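What is reused:
- Preview capability
- Locate protocol
- User interaction model
What is not reused:
- Contract business semantics
- Case-file business naming
- The old page shell
---
## 5.2 Parts that should not be reused
The following should not be reused directly:
- The business titles and business field names of `reviews-test` itself
- The right-column business explanations specific to contracts / case files
- The scoring collaboration, proposal voting, and cross-opinion panels of `cross-checking`
The reason:
- These belong to the business shell, not to generic platform capability
govdoc should keep its own definitions of:
- findings
- checked rules
- structure / outline / entities
- report download
---
## 5.3 The correct role of Collabora
The boundary of `CollaboraViewer` must be explicit:
- It is a DOCX viewer
- It can handle page jumps, text highlighting, and editable / read-only viewing
- It is not the business orchestrator of the govdoc detail page
- Nor is it a complete solution for precise line-level finding location
Therefore:
- `Collabora` should stay confined to the `DocxPreviewTest` layer
- govdoc pages should not keep growing a full detail-page system directly around `Collabora`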
---
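## 6. The Adapter Layer That Must Be Added
## 6.1 Why adapters.ts is required
`lib/api/govdoc-audit` already contains:
- `api.ts`
- `types.ts`
- `govdoc-routes.ts`
One layer is still missing:
- `adapters.ts`
This layer must exist because it converts business result semantics into preview and page view semantics.
Without it:
- govdoc pages would maintain their own findings display model
- reviews pages would maintain their own preview target model
- The same kind of locate interaction would end up with two incompatible implementations
---
## 6.2 Suggested responsibilities of adapters.ts
`adapters.ts` should provide at least the following:
- Convert the `govdoc` result object into the result page view model
- Convert `finding / checked_rule` into right-column display items
- Convert `finding / paragraph / entity` into middle-column jump targets
- Produce a unified preview target according to file type
- Convert backend report artifact status into button display state
The suggested output data includes:
- `previewKind`
- `previewPath`
- `activeTarget`
- `findingItems`
- `summaryCards`
- `reportActions`
- `structureItems`
- `outlineItems`
- `entityItems`
This way the page layer only orchestrates and never interprets backend field details.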
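As a rough sketch of what part of `adapters.ts` could look like, the block below converts raw findings into `findingItems` and `summaryCards`. The raw field names (`finding_id`, `rule_id`, `rule_name`, `severity`, `message`, `location.paragraph_index`) mirror what the backend HTML report renderer reads, but whether the API serializes them this way is an assumption; all type and function names here are hypothetical, and treating a negative `paragraph_index` as "unlocated" is likewise an assumption.

```typescript
// Assumed raw shape; the real API payload may differ.
interface RawGovdocFinding {
  finding_id: string;
  rule_id: string;
  rule_name: string;
  severity: string;
  message: string;
  location: { paragraph_index: number | null };
}

type GovdocSeverity = "error" | "warning" | "info";

interface GovdocFindingItem {
  id: string;
  ruleLabel: string;
  severity: GovdocSeverity;
  message: string;
  locationLabel: string;
}

interface GovdocSummaryCard {
  severity: GovdocSeverity;
  count: number;
}

// Unknown severities fall back to "info" so the UI never receives a raw string.
function normalizeSeverity(value: string): GovdocSeverity {
  return value === "error" || value === "warning" ? value : "info";
}

function toFindingItem(raw: RawGovdocFinding): GovdocFindingItem {
  const index = raw.location.paragraph_index;
  return {
    id: raw.finding_id,
    ruleLabel: `${raw.rule_id} ${raw.rule_name}`.trim(),
    severity: normalizeSeverity(raw.severity),
    message: raw.message,
    // Assumption: a missing or negative index means the finding has no location.
    locationLabel: index === null || index < 0 ? "unlocated" : `P${index}`,
  };
}

// One card per severity, always in the same order, including zero counts.
function toSummaryCards(items: GovdocFindingItem[]): GovdocSummaryCard[] {
  const order: GovdocSeverity[] = ["error", "warning", "info"];
  return order.map((severity) => ({
    severity,
    count: items.filter((item) => item.severity === severity).length,
  }));
}
```

With helpers like these, a page component would receive `findingItems` and `summaryCards` ready to render and would not need to know the backend's snake_case fields.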
---
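## 7. Phased Implementation Order
## 7.1 Phase 1: add the adapter without changing page semantics
Goal:
- Fill in the data adaptation layer first
Actions:
- Add `lib/api/govdoc-audit/adapters.ts`
- Consolidate the places in `audit.tsx` that interpret raw API fields directly
- Unify preview target semantics as:
  - `page`
  - `highlightValue`
  - `bboxHighlight`
  - `charPositions`
Benefits of this phase:
- No change to user-visible business logic
- A stable base for later replacing the middle and right columns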
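The four unified fields named above (`page`, `highlightValue`, `bboxHighlight`, `charPositions`) can be sketched as one target type plus a normalizer. This is a minimal sketch under assumptions: the raw anchor shape (`page`, `bbox`, `char_positions`, `text`), the 4-number box layout, and 1-based pages are all guesses for illustration, and `RawAnchor`, `UnifiedPreviewTarget`, and `toUnifiedPreviewTarget` are hypothetical names.

```typescript
// Assumption: a box is four finite numbers; the coordinate convention is not specified here.
type BBox = [number, number, number, number];

// Hypothetical raw anchor as the backend might return it.
interface RawAnchor {
  page?: number | null;
  bbox?: number[] | null;
  char_positions?: number[][] | null;
  text?: string | null;
}

interface UnifiedPreviewTarget {
  page: number;
  highlightValue?: string;
  bboxHighlight?: BBox;
  charPositions?: BBox[];
}

function isBBox(value: number[] | null | undefined): value is BBox {
  return (
    Array.isArray(value) &&
    value.length === 4 &&
    value.every((n) => Number.isFinite(n))
  );
}

// Returns null when there is no usable page, because every locate action
// in this sketch needs at least a page to jump to.
function toUnifiedPreviewTarget(anchor: RawAnchor): UnifiedPreviewTarget | null {
  const page = anchor.page;
  if (typeof page !== "number" || page < 1) return null;

  const target: UnifiedPreviewTarget = { page };
  const text = anchor.text?.trim();
  if (text) target.highlightValue = text;

  const bbox = anchor.bbox;
  if (isBBox(bbox)) target.bboxHighlight = bbox;

  // Malformed boxes are dropped instead of failing the whole target.
  const boxes = (anchor.char_positions ?? []).filter(isBBox);
  if (boxes.length > 0) target.charPositions = boxes;

  return target;
}
```

Optional fields are only set when valid data exists, so a consumer can check for the presence of `bboxHighlight` or `charPositions` to decide how precisely it can highlight.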
---
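## 7.2 Phase 2: split out the detail page orchestrator
Goal:
- Give the govdoc detail page its own orchestrator, as `cross-checking` has
Actions:
- Add `GovdocAuditResultPage.tsx`
- Move the existing `audit.tsx` logic into the new component
- Change the route entry to mount the new component
- Split the top summary, download actions, and tab switching into child components
Benefits of this phase:
- Clear page responsibilities
- The middle and right columns can evolve independently afterwards
---
## 7.3 Phase 3: switch the middle column to the locating preview
Goal:
- The middle column of the govdoc detail page no longer centers on `DocView`
Actions:
- Switch to `PdfPreviewTest / DocxPreviewTest` according to file type
- Output a unified preview target from the govdoc adapter
- Let right-column clicks drive middle-column locating directly
Notes for this phase:
- PDF locating supports `bboxHighlight / charPositions` first
- DOCX supports `targetPage + highlightValue` first
- Precise line-level DOCX locating is not promised in this phase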
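The file-type switch in this phase can be sketched as a function that picks the preview kind and forwards only the fields each preview is expected to use first. The actual props of `PdfPreviewTest` and `DocxPreviewTest` are not shown in this document, so `MiddleColumnProps` below is a hypothetical shape, and detecting the kind from the file extension is an assumption.

```typescript
type GovdocPreviewKind = "pdf" | "docx" | "unsupported";

// Unified target as produced by the adapter (field names from phase 1).
interface LocateTarget {
  page: number;
  highlightValue?: string;
  bboxHighlight?: number[];
  charPositions?: number[][];
}

// Hypothetical prop shapes; the real preview components may expect other names.
type MiddleColumnProps =
  | { kind: "pdf"; page: number; bboxHighlight?: number[]; charPositions?: number[][] }
  | { kind: "docx"; targetPage: number; highlightValue?: string }
  | { kind: "unsupported" };

// Assumption: the file extension is a good enough signal for the preview kind.
function detectPreviewKind(filename: string): GovdocPreviewKind {
  const lower = filename.trim().toLowerCase();
  if (lower.endsWith(".pdf")) return "pdf";
  if (lower.endsWith(".docx")) return "docx";
  return "unsupported";
}

function buildMiddleColumnProps(
  filename: string,
  target: LocateTarget,
): MiddleColumnProps {
  const kind = detectPreviewKind(filename);
  if (kind === "pdf") {
    // PDF: region data first (bboxHighlight / charPositions).
    return {
      kind,
      page: target.page,
      ...(target.bboxHighlight ? { bboxHighlight: target.bboxHighlight } : {}),
      ...(target.charPositions ? { charPositions: target.charPositions } : {}),
    };
  }
  if (kind === "docx") {
    // DOCX: page jump plus text highlight only; no line-level anchor yet.
    return {
      kind,
      targetPage: target.page,
      ...(target.highlightValue ? { highlightValue: target.highlightValue } : {}),
    };
  }
  return { kind };
}
```

Because the DOCX branch never forwards region data, the "no line-level DOCX locating in this phase" constraint is enforced by the types rather than by convention.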
---
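## 7.4 Phase 4: refactor the right column and the tab business shell
Goal:
- Keep govdoc's own business panels, but align the interaction model with the platform
Actions:
- Refactor the current `RightPanel` into `GovdocFindingPanel`
- Componentize the structure, outline, and entity panels
- Consolidate the old isolated interaction state
Benefits of this phase:
- govdoc stays independent as a business module
- It also gains the platform's unified interaction experience
---
## 7.5 Phase 5: style and layout cleanup
Goal:
- Visually align govdoc pages with the platform's existing green theme and generic panel rhythm
Actions:
- Reduce duplicate definitions in `.govdoc-audit-scope`
- Prefer reusing `layout-primitives.css`
- Keep the necessary govdoc business style namespace
Principles for this phase:
- Unify layout and interaction rhythm first
- Then reduce style duplication
- Do not start with a major visual redesign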
---
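## 8. Risks and Prerequisites
## 8.1 The biggest risk is locating data granularity, not frontend components
For govdoc to show which line has a problem, the frontend is only the presentation layer; the real upper bound is set by the data the backend provides.
The data the frontend can consume reliably falls into two groups:
- PDF
  - `page`
  - `bbox`
  - `page_box`
  - `char_positions`
- DOCX
  - `targetPage`
  - `highlightValue`
  - If higher precision is needed in the future, finer-grained anchors will also be required
If the backend provides only:
- Finding descriptions
- Rule results
and no locating data, the frontend can at most offer:
- Page-level locating
- Keyword text highlighting
It cannot promise stable line-level locating.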
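One way to make this limit explicit in code is to classify each finding by the best precision its data allows, so the UI never offers a locate action it cannot honor. The sketch below is one possible reading of the section, not an existing API: the four precision levels, the `BackendLocateData` shape, and the rule that region data without a page is unusable are all assumptions.

```typescript
// Hypothetical precision levels, ordered from most to least precise.
type LocatePrecision = "region" | "page" | "keyword" | "none";

// Assumed subset of the locating fields listed above.
interface BackendLocateData {
  page?: number | null;
  bbox?: number[] | null;
  char_positions?: number[][] | null;
  highlightValue?: string | null;
}

function classifyLocatePrecision(data: BackendLocateData): LocatePrecision {
  const hasPage = typeof data.page === "number" && data.page >= 1;
  const hasRegion =
    (data.bbox?.length ?? 0) === 4 || (data.char_positions?.length ?? 0) > 0;

  // Region data is only usable together with the page it belongs to.
  if (hasPage && hasRegion) return "region";
  if (hasPage) return "page";
  // Without a page, only a text search highlight remains possible.
  if ((data.highlightValue ?? "").trim() !== "") return "keyword";
  return "none";
}
```

Note that no level here corresponds to line-level locating; under these assumptions that would need the finer-grained anchors mentioned for DOCX.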
---
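## 8.2 What should not be done in this phase
The following should not be done together with this split:
- Rewriting the entire govdoc visual design
- Recasting govdoc rule semantics as contract / case-file semantics
- Trying to move all `reviews` business components directly into govdoc
- Replacing pages at scale before the adapter exists
The reason is simple:
- These actions would mix frontend layering refactoring with business changes and increase regression risk
---
## 9. Final Boundary Conclusion
The correct implementation boundary for the govdoc frontend should be fixed as:
- **Like cross-checking: a standalone page**
- **Like reviews: reuse the locating preview capability**
- **Like a platform module: data interpretation concentrated in the adapter layer**
- **Like govdoc itself: keep the business semantics of findings / checked rules / structure / outline / entities**
Put more plainly:
- The govdoc page handles business orchestration
- The `reviews` preview components handle middle-column locating
- `Collabora` handles DOCX rendering only
- `adapters.ts` translates govdoc backend results into reusable frontend semantics
This is the fixed boundary for subsequent govdoc frontend work.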
@@ -0,0 +1,654 @@
<!doctype html>
<html lang="zh-CN">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Internal Official Document Report UI Sample</title>
<style>
:root {
--bg: #f8fafc;
--panel: #ffffff;
--panel-soft: #f1f5f9;
--line: #e2e8f0;
--text: #0f172a;
--muted: #64748b;
--primary: #00684a;
--primary-soft: #e8f3ef;
--danger: #dc2626;
--danger-soft: #fef2f2;
--warn: #b45309;
--warn-soft: #fffbeb;
--info: #1d4ed8;
--info-soft: #eff6ff;
--shadow: 0 10px 30px rgba(15, 23, 42, 0.08);
}
* {
box-sizing: border-box;
}
body {
margin: 0;
background:
radial-gradient(circle at top left, rgba(0, 104, 74, 0.06), transparent 22%),
linear-gradient(180deg, #fbfdff 0%, var(--bg) 100%);
color: var(--text);
font: 14px/1.6 "PingFang SC", "Microsoft YaHei", sans-serif;
}
.page {
max-width: 1360px;
margin: 0 auto;
padding: 32px 24px 56px;
}
.hero {
display: grid;
grid-template-columns: 240px minmax(0, 1fr);
gap: 20px;
align-items: stretch;
margin-bottom: 20px;
}
.card {
background: var(--panel);
border: 1px solid var(--line);
border-radius: 18px;
box-shadow: var(--shadow);
}
.score-card {
padding: 24px 20px;
display: flex;
flex-direction: column;
justify-content: space-between;
background:
linear-gradient(180deg, rgba(0, 104, 74, 0.05) 0%, rgba(255, 255, 255, 0.96) 36%),
var(--panel);
}
.score-ring {
width: 132px;
height: 132px;
margin: 0 auto 18px;
border-radius: 50%;
background: conic-gradient(var(--primary) 0%, #dbe4ea 0%);
display: grid;
place-items: center;
}
.score-ring::before {
content: "";
width: 102px;
height: 102px;
border-radius: 50%;
background: #fff;
box-shadow: inset 0 0 0 1px rgba(226, 232, 240, 0.9);
grid-area: 1 / 1;
}
.score-value {
grid-area: 1 / 1;
position: relative;
z-index: 1;
text-align: center;
}
.score-value strong {
display: block;
font-size: 34px;
line-height: 1;
letter-spacing: -0.04em;
}
.score-value span {
display: block;
margin-top: 6px;
font-size: 12px;
color: var(--muted);
}
.score-caption {
text-align: center;
color: var(--muted);
font-size: 12px;
}
.hero-main {
padding: 24px 26px;
display: flex;
flex-direction: column;
justify-content: space-between;
gap: 18px;
}
.eyebrow {
display: inline-flex;
align-items: center;
gap: 8px;
min-height: 28px;
padding: 0 12px;
border-radius: 999px;
background: var(--primary-soft);
color: var(--primary);
font-size: 12px;
font-weight: 600;
}
h1 {
margin: 10px 0 8px;
font-size: 30px;
line-height: 1.2;
}
.doc-meta {
color: var(--muted);
font-size: 13px;
}
.summary-grid {
display: grid;
grid-template-columns: repeat(4, minmax(0, 1fr));
gap: 12px;
}
.metric {
border: 1px solid var(--line);
border-radius: 14px;
background: linear-gradient(180deg, #fff, #f8fafc);
padding: 14px 16px;
}
.metric-label {
color: var(--muted);
font-size: 12px;
}
.metric-value {
margin-top: 6px;
display: flex;
align-items: baseline;
gap: 6px;
}
.metric-value strong {
font-size: 26px;
line-height: 1;
letter-spacing: -0.04em;
}
.metric-value span {
color: var(--muted);
font-size: 12px;
}
.chips {
display: flex;
flex-wrap: wrap;
gap: 8px;
}
.chip {
display: inline-flex;
align-items: center;
gap: 8px;
min-height: 30px;
padding: 0 12px;
border-radius: 999px;
border: 1px solid transparent;
font-size: 12px;
font-weight: 600;
}
.chip.error {
color: var(--danger);
background: var(--danger-soft);
border-color: #fecaca;
}
.chip.warning {
color: var(--warn);
background: var(--warn-soft);
border-color: #fde68a;
}
.chip.info {
color: var(--info);
background: var(--info-soft);
border-color: #bfdbfe;
}
.content {
display: grid;
grid-template-columns: 320px minmax(0, 1fr);
gap: 20px;
align-items: start;
}
.sidebar {
padding: 18px;
position: sticky;
top: 20px;
}
.section-title {
margin: 0 0 14px;
font-size: 13px;
font-weight: 700;
}
.stat-list {
display: grid;
gap: 10px;
}
.stat-item {
border: 1px solid var(--line);
border-radius: 12px;
background: #fff;
padding: 12px 14px;
}
.stat-item strong {
display: block;
font-size: 20px;
line-height: 1;
margin-bottom: 6px;
}
.stat-item span {
color: var(--muted);
font-size: 12px;
}
.legend {
margin-top: 16px;
padding-top: 16px;
border-top: 1px solid var(--line);
}
.legend-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 12px;
padding: 8px 0;
color: var(--muted);
font-size: 12px;
}
.legend-row b {
color: var(--text);
font-size: 13px;
}
.table-card {
overflow: hidden;
}
.toolbar {
display: flex;
align-items: center;
justify-content: space-between;
gap: 12px;
padding: 16px 18px;
border-bottom: 1px solid var(--line);
background: linear-gradient(180deg, #ffffff 0%, #f8fafc 100%);
}
.toolbar h2 {
margin: 0;
font-size: 15px;
}
.toolbar span {
color: var(--muted);
font-size: 12px;
}
.filters {
display: flex;
gap: 8px;
}
.filter {
min-height: 30px;
padding: 0 12px;
border-radius: 999px;
border: 1px solid var(--line);
background: #fff;
color: var(--muted);
font-size: 12px;
font-weight: 600;
}
.filter.active {
color: var(--primary);
background: var(--primary-soft);
border-color: rgba(0, 104, 74, 0.2);
}
table {
width: 100%;
border-collapse: collapse;
}
thead th {
padding: 12px 14px;
text-align: left;
background: #f8fafc;
color: var(--muted);
font-size: 12px;
font-weight: 700;
border-bottom: 1px solid var(--line);
white-space: nowrap;
}
tbody td {
padding: 14px;
border-bottom: 1px solid #edf2f7;
vertical-align: top;
}
tbody tr:hover {
background: #fcfdfd;
}
.finding-id {
font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
color: var(--muted);
font-size: 12px;
}
.rule-cell strong {
display: block;
font-size: 13px;
}
.rule-cell span {
display: block;
margin-top: 4px;
color: var(--muted);
font-size: 12px;
}
.severity-tag {
display: inline-flex;
align-items: center;
min-height: 26px;
padding: 0 10px;
border-radius: 999px;
border: 1px solid transparent;
font-size: 12px;
font-weight: 700;
text-transform: uppercase;
}
.severity-tag.error {
color: var(--danger);
background: var(--danger-soft);
border-color: #fecaca;
}
.severity-tag.warning {
color: var(--warn);
background: var(--warn-soft);
border-color: #fde68a;
}
.location {
font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
color: #334155;
font-size: 12px;
}
.msg {
min-width: 360px;
}
.msg-main {
font-size: 13px;
color: var(--text);
}
.context {
margin-top: 8px;
padding: 10px 12px;
border-radius: 12px;
background: #f8fafc;
color: var(--muted);
font-size: 12px;
}
.suggestion {
margin-top: 8px;
padding: 10px 12px;
border-radius: 12px;
background: #eff6ff;
color: #0f4c81;
font-size: 12px;
}
@media (max-width: 1080px) {
.hero,
.content {
grid-template-columns: 1fr;
}
.summary-grid {
grid-template-columns: repeat(2, minmax(0, 1fr));
}
.sidebar {
position: static;
}
}
@media (max-width: 720px) {
.page {
padding: 18px 14px 32px;
}
.summary-grid {
grid-template-columns: 1fr;
}
.toolbar {
align-items: flex-start;
flex-direction: column;
}
.msg {
min-width: 0;
}
}
</style>
</head>
<body>
<div class="page">
<section class="hero">
<div class="card score-card">
<div class="score-ring">
<div class="score-value">
<strong>0</strong>
<span>Overall score</span>
</div>
</div>
<div class="score-caption">This sample reuses the actual report data you provided and leaves the business semantics unchanged</div>
</div>
<div class="card hero-main">
<div>
<div class="eyebrow">Unified report style anchor</div>
<h1>Official Document Format Audit Report</h1>
<div class="doc-meta">买卖合同 (1).docx · 123 findings in total · sample for confirming the UI / color direction</div>
</div>
<div class="summary-grid">
<div class="metric">
<div class="metric-label">Errors</div>
<div class="metric-value"><strong>31</strong><span>error</span></div>
</div>
<div class="metric">
<div class="metric-label">Warnings</div>
<div class="metric-value"><strong>92</strong><span>warning</span></div>
</div>
<div class="metric">
<div class="metric-label">Info</div>
<div class="metric-value"><strong>0</strong><span>info</span></div>
</div>
<div class="metric">
<div class="metric-label">Finding categories</div>
<div class="metric-value"><strong>4</strong><span>Title / Issuance / Format / Other</span></div>
</div>
</div>
<div class="chips">
<span class="chip error">Errors 31</span>
<span class="chip warning">Warnings 92</span>
<span class="chip info">Info 0</span>
</div>
</div>
</section>
<section class="content">
<aside class="card sidebar">
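<h2 class="section-title">Report summary</h2>
<div class="stat-list">
<div class="stat-item">
<strong>GW-F-003</strong>
<span>Most frequently hit rule: level-2 headings use KaiTi size 3 (三号)</span>
</div>
<div class="stat-item">
<strong>P35 - P83</strong>
<span>Findings are concentrated in level-2 heading styles in the middle of the body</span>
</div>
<div class="stat-item">
<strong>Title / Issuance</strong>
<span>Top-of-document entities were not recognized, so the title document-type rule and the document-number precondition rule both fired</span>
</div>
</div>
<div class="legend">
<div class="legend-row">
<span>Design tone</span>
<b>Matches the system's unified workbench</b>
</div>
<div class="legend-row">
<span>Information layout</span>
<b>Summary + findings table</b>
</div>
<div class="legend-row">
<span>Color strategy</span>
<b>Slate + platform green</b>
</div>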
</div>
</aside>
<div class="card table-card">
<div class="toolbar">
<div>
<h2>Finding details</h2>
<span>Keeps the current report semantics; only the layout and visual hierarchy are tightened</span>
</div>
<div class="filters">
<button class="filter active" type="button">All</button>
<button class="filter" type="button">Errors</button>
<button class="filter" type="button">Warnings</button>
</div>
</div>
<table>
<thead>
<tr>
<th>ID</th>
<th>Rule</th>
<th>Severity</th>
<th>Category</th>
<th>Location</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="finding-id">F-c0dfd361</td>
<td class="rule-cell">
<strong>GW-T-001</strong>
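<span>Title document-type compliance</span>
</td>
<td><span class="severity-tag error">error</span></td>
<td>Title</td>
<td class="location">P-1 ()</td>
<td class="msg">
<div class="msg-main">Target entity "title" was not recognized</div>
<div class="context">Source text: no title content was recognized, so the title document-type compliance check cannot proceed.</div>
<div class="suggestion">Suggestion: add the title and make sure its document type meets the rule.</div>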
</td>
</tr>
<tr>
<td class="finding-id">F-a896eaa4</td>
<td class="rule-cell">
<strong>GW-N-001</strong>
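<span>Document number must use hexagonal brackets</span>
</td>
<td><span class="severity-tag error">error</span></td>
<td>Issuance</td>
<td class="location">P-1 ()</td>
<td class="msg">
<div class="msg-main">Target entity "doc_number" was not recognized</div>
<div class="context">Source text: no document number was recognized, so the year-bracket rule cannot be matched.</div>
<div class="suggestion">Suggestion: the year in the document number should use hexagonal brackets 〔〕, not square brackets or parentheses.</div>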
</td>
</tr>
<tr>
<td class="finding-id">F-087a4841</td>
<td class="rule-cell">
<strong>GW-F-003</strong>
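<span>Level-2 headings use KaiTi size 3 (三号)</span>
</td>
<td><span class="severity-tag error">error</span></td>
<td>Format</td>
<td class="location">P35 (heading_2)</td>
<td class="msg">
<div class="msg-main">Font or size mismatch (actual FangSong Nonept, expected KaiTi 16pt</div>
<div class="context">Source text: (一)甲方从乙方处购买:</div>
<div class="suggestion">Suggestion: level-2 headings should use KaiTi size 3 (三号), with consistent styling across headings of the same level.</div>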
</td>
</tr>
<tr>
<td class="finding-id">F-37b4bb81</td>
<td class="rule-cell">
<strong>GW-F-003</strong>
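<span>Level-2 headings use KaiTi size 3 (三号)</span>
</td>
<td><span class="severity-tag error">error</span></td>
<td>Format</td>
<td class="location">P39 (heading_2)</td>
<td class="msg">
<div class="msg-main">Font or size mismatch (actual FangSong Nonept, expected KaiTi 16pt</div>
<div class="context">Source text: (二)质量要求:</div>
<div class="suggestion">Suggestion: in the production version, same-level headings like this could support collapsing and grouping findings of the same rule.</div>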
</td>
</tr>
<tr>
<td class="finding-id">F-b2140a78</td>
<td class="rule-cell">
<strong>GW-F-003</strong>
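<span>Level-2 headings use KaiTi size 3 (三号)</span>
</td>
<td><span class="severity-tag warning">warning</span></td>
<td>Format</td>
<td class="location">P62 (heading_2)</td>
<td class="msg">
<div class="msg-main">Format is close but does not fully meet the rule</div>
<div class="context">Source text: (一)交付时间: 。</div>
<div class="suggestion">Suggestion: keep the original rule description and only separate the visual hierarchy of warnings and errors.</div>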
</td>
</tr>
</tbody>
</table>
</div>
</section>
</div>
</body>
</html>
@@ -34,6 +34,7 @@ class RagMessageItemVO(BaseModel):
     answer: str = Field(...)
     feedback: dict | None = Field(None)
     retrieverResources: list[dict] | None = Field(None)
+    suggestedQuestions: list[str] = Field(default_factory=list)
     createdAt: int = Field(0)
@@ -1,76 +1,594 @@
 """把 AuditResult 渲染成单文件 HTML 报告。"""
 from __future__ import annotations
+from collections import Counter
 from html import escape
 from fastapi_modules.fastapi_leaudit.govdoc_engine.engine.result import AuditResult
 _CSS = """
-body { font-family: -apple-system, "PingFang SC", sans-serif; margin: 0; padding: 24px;
-background: #f7f7f9; color: #1a1a1a; }
-.header { display: flex; align-items: center; gap: 16px; margin-bottom: 24px; }
-.score { width: 96px; height: 96px; border-radius: 50%;
-background: conic-gradient(#22c55e var(--p), #e5e7eb var(--p));
-display: grid; place-items: center; font-weight: 700; font-size: 22px; color: #111; }
-.score-inner { background: white; width: 76px; height: 76px; border-radius: 50%;
-display: grid; place-items: center; }
-.tag { padding: 2px 8px; border-radius: 999px; font-size: 12px; }
-.error { background: #fee2e2; color: #b91c1c; }
-.warning { background: #fef9c3; color: #a16207; }
-.info { background: #dbeafe; color: #1d4ed8; }
-table { width: 100%; border-collapse: collapse; background: white; border-radius: 8px;
-overflow: hidden; box-shadow: 0 1px 3px rgba(0,0,0,0.06); }
-th, td { padding: 10px 12px; text-align: left; border-bottom: 1px solid #f1f5f9; vertical-align: top; }
-th { background: #f8fafc; font-size: 13px; }
-td.msg { max-width: 480px; }
-.context { color: #64748b; font-size: 12px; margin-top: 4px; }
+* { box-sizing: border-box; }
+html, body { margin: 0; padding: 0; }
+body {
+font-family: -apple-system, "PingFang SC", "Microsoft YaHei", sans-serif;
+background: #f3f6f5;
+color: #0f172a;
+}
+a { color: inherit; }
+.page {
+width: 100%;
+padding: 20px 24px 32px;
+}
+.stack {
+display: flex;
+flex-direction: column;
+gap: 20px;
+}
+.card {
background: #ffffff;
border: 1px solid #e2e8f0;
border-radius: 12px;
box-shadow: 0 1px 3px rgba(15, 23, 42, 0.08);
overflow: hidden;
}
.card-head {
height: 48px;
display: flex;
align-items: center;
justify-content: space-between;
gap: 12px;
padding: 0 20px;
border-bottom: 1px solid #e2e8f0;
background: #fcfdfd;
}
.card-title {
font-size: 14px;
font-weight: 600;
color: #1e293b;
}
.card-subtitle {
font-size: 12px;
color: #64748b;
}
.summary-grid {
display: grid;
grid-template-columns: 220px minmax(0, 1fr);
gap: 20px;
padding: 20px;
}
.score-box {
border: 1px solid #cfe4dc;
background: #f7fbf9;
border-radius: 10px;
padding: 20px;
}
.score-label {
font-size: 12px;
font-weight: 500;
color: #475569;
}
.score-value {
margin-top: 12px;
font-size: 42px;
line-height: 1;
font-weight: 600;
letter-spacing: -0.05em;
color: #0f172a;
}
.score-track {
margin-top: 16px;
height: 8px;
background: #dbe8e3;
border-radius: 999px;
overflow: hidden;
}
.score-fill {
height: 100%;
background: #00684a;
}
.score-note {
margin-top: 16px;
font-size: 12px;
line-height: 1.75;
color: #475569;
}
.summary-main {
min-width: 0;
}
.eyebrow {
display: inline-flex;
align-items: center;
height: 28px;
padding: 0 12px;
border: 1px solid #cfe4dc;
border-radius: 6px;
background: #e8f3ef;
color: #00684a;
font-size: 12px;
font-weight: 500;
}
.report-title {
margin: 12px 0 0;
font-size: 32px;
line-height: 1.25;
letter-spacing: -0.03em;
font-weight: 600;
color: #0f172a;
}
.report-meta {
margin-top: 8px;
font-size: 15px;
color: #475569;
}
.metrics {
margin-top: 20px;
display: grid;
grid-template-columns: repeat(4, minmax(0, 1fr));
gap: 16px;
}
.metric {
border: 1px solid #e2e8f0;
border-radius: 10px;
background: #fcfdfd;
padding: 16px 20px;
}
.metric-label {
font-size: 13px;
font-weight: 500;
color: #64748b;
}
.metric-value {
margin-top: 12px;
display: flex;
align-items: baseline;
gap: 8px;
}
.metric-value strong {
font-size: 30px;
line-height: 1;
letter-spacing: -0.04em;
font-weight: 600;
color: #0f172a;
}
.metric-value span {
font-size: 13px;
color: #64748b;
}
.chips {
margin-top: 20px;
display: flex;
flex-wrap: wrap;
gap: 12px;
}
.chip,
.severity-tag {
display: inline-flex;
align-items: center;
border: 1px solid transparent;
border-radius: 6px;
font-weight: 600;
}
.chip {
height: 32px;
padding: 0 12px;
font-size: 12px;
}
.severity-tag {
height: 32px;
padding: 0 12px;
font-size: 12px;
text-transform: uppercase;
}
.error {
border-color: #fecaca;
background: #fef2f2;
color: #b91c1c;
}
.warning {
border-color: #fde68a;
background: #fffbeb;
color: #b45309;
}
.info {
border-color: #bfdbfe;
background: #eff6ff;
color: #1d4ed8;
}
.content-grid {
display: grid;
grid-template-columns: 340px minmax(0, 1fr);
gap: 20px;
}
.sidebar-body {
padding: 16px;
display: flex;
flex-direction: column;
gap: 16px;
}
.summary-row {
border: 1px solid #e2e8f0;
border-radius: 10px;
background: #fcfdfd;
padding: 16px;
}
.summary-row-label {
font-size: 12px;
font-weight: 500;
color: #64748b;
}
.summary-row-value {
margin-top: 8px;
font-size: 22px;
line-height: 1;
letter-spacing: -0.03em;
font-weight: 600;
color: #0f172a;
}
.summary-row-desc {
margin-top: 12px;
font-size: 13px;
line-height: 1.75;
color: #475569;
}
.table-toolbar {
height: 48px;
display: flex;
align-items: center;
justify-content: space-between;
gap: 12px;
padding: 0 20px;
border-bottom: 1px solid #e2e8f0;
background: #fcfdfd;
}
.toolbar-left {
min-width: 0;
}
.toolbar-title {
font-size: 14px;
font-weight: 600;
color: #1e293b;
}
.toolbar-desc {
margin-top: 2px;
font-size: 12px;
color: #64748b;
}
.toolbar-filters {
display: flex;
gap: 8px;
}
.filter {
display: inline-flex;
align-items: center;
height: 32px;
padding: 0 12px;
border: 1px solid #e2e8f0;
border-radius: 6px;
background: #ffffff;
color: #64748b;
font-size: 12px;
font-weight: 500;
}
.filter.active {
border-color: rgba(0, 104, 74, 0.2);
background: #e8f3ef;
color: #00684a;
}
.table-wrap {
overflow-x: auto;
}
table {
width: 100%;
min-width: 1320px;
border-collapse: collapse;
}
thead tr {
background: #f8fafc;
color: #475569;
font-size: 13px;
font-weight: 500;
}
th {
padding: 16px 20px;
text-align: left;
border-bottom: 1px solid #e2e8f0;
white-space: nowrap;
}
td {
padding: 20px;
vertical-align: top;
border-bottom: 1px solid #f1f5f9;
}
tbody tr:hover {
background: #f8fafc;
}
.mono {
font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
}
.id-cell {
font-size: 13px;
color: #64748b;
}
.rule-id {
font-size: 15px;
font-weight: 600;
color: #1e293b;
}
.rule-name {
margin-top: 4px;
font-size: 13px;
color: #64748b;
}
.category-cell {
font-size: 14px;
color: #334155;
}
.location-cell {
font-size: 13px;
color: #334155;
}
.message-cell {
min-width: 560px;
}
.message-main {
font-size: 15px;
line-height: 1.8;
color: #0f172a;
}
.context-box,
.suggestion-box {
margin-top: 12px;
border-radius: 6px;
padding: 12px 16px;
font-size: 13px;
line-height: 1.8;
}
.context-box {
border: 1px solid #e2e8f0;
background: #f8fafc;
color: #475569;
}
.suggestion-box {
border: 1px solid #cfe4dc;
background: #f4faf7;
color: #0d6b4d;
}
.empty {
padding: 24px 20px;
text-align: center;
color: #64748b;
font-size: 14px;
}
@media (max-width: 1200px) {
.summary-grid,
.content-grid {
grid-template-columns: 1fr;
}
.metrics {
grid-template-columns: repeat(2, minmax(0, 1fr));
}
}
@media (max-width: 720px) {
.page {
padding: 16px;
}
.metrics {
grid-template-columns: 1fr;
}
.table-toolbar,
.card-head {
height: auto;
min-height: 48px;
padding-top: 12px;
padding-bottom: 12px;
align-items: flex-start;
flex-direction: column;
}
}
 """
 def render_html(result: AuditResult) -> str:
-    s = result.summary
-    score = s.score
-    pct = f"{score}%"
-    rows = []
-    for f in result.findings:
-        loc = f.location
-        suggest = (
-            f'<div style="color:#0369a1">建议: {escape(f.suggestion)}</div>'
-            if f.suggestion else ""
-        )
-        rows.append(f"""
-<tr>
-<td>{escape(f.finding_id)}</td>
-<td>{escape(f.rule_id)}<br><span style="color:#64748b;font-size:12px">{escape(f.rule_name)}</span></td>
-<td><span class="tag {f.severity}">{f.severity}</span></td>
-<td>{escape(f.category)}</td>
-<td>P{loc.paragraph_index} ({escape(loc.role or '')})</td>
-<td class="msg">{escape(f.message)}
-<div class="context">原文: {escape((loc.context or '')[:80])}</div>
-{suggest}
-</td>
-</tr>""")
+    summary = result.summary
+    score = int(summary.score or 0)
+    score_pct = max(0, min(score, 100))
+    severity_counts = _severity_counts(result)
+    category_count = len([key for key, value in (summary.by_category or {}).items() if key and value])
+    filename = escape(str(result.document.get("filename", "")))
+    top_rule_id, top_rule_count = _top_rule(result)
+    line_range = _line_range(result)
+    entity_summary = _entity_summary(result)

-    body = f"""<!doctype html>
-<html lang="zh"><head><meta charset="utf-8"><title>公文审核报告</title>
-<style>{_CSS}</style></head><body>
-<div class="header">
-<div class="score" style="--p:{pct}"><div class="score-inner">{score}</div></div>
-<div>
-<h1 style="margin:0">公文格式审核报告</h1>
-<div style="color:#64748b">{escape(result.document.get('filename', ''))} · 共 {s.total_findings} 项</div>
-<div style="margin-top:6px">
-<span class="tag error">错误 {s.by_severity.get('error', 0)}</span>
-<span class="tag warning">警告 {s.by_severity.get('warning', 0)}</span>
-<span class="tag info">提示 {s.by_severity.get('info', 0)}</span>
+    rows = []
+    for finding in result.findings:
+        location_label = _format_location(finding.location.paragraph_index)
+        context = escape((finding.location.context or "").strip())
+        message = escape(finding.message)
+        suggestion = escape(finding.suggestion) if finding.suggestion else "按规则要求修正对应内容。"
+
+        rows.append(
+            f"""
+<tr>
+<td class="mono id-cell">{escape(finding.finding_id)}</td>
+<td>
+<div class="rule-id">{escape(finding.rule_id)}</div>
+<div class="rule-name">{escape(finding.rule_name)}</div>
+</td>
+<td><span class="severity-tag {escape(finding.severity)}">{escape(finding.severity)}</span></td>
+<td class="category-cell">{escape(finding.category)}</td>
+<td class="mono location-cell">{location_label}</td>
+<td class="message-cell">
+<div class="message-main">{message}</div>
+<div class="context-box">原文:{context or "未提取到上下文"}</div>
+<div class="suggestion-box">建议:{suggestion}</div>
+</td>
+</tr>"""
+        )
return f"""<!doctype html>
<html lang="zh">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>公文审核报告</title>
<style>{_CSS}</style>
</head>
<body>
<div class="page">
<div class="stack">
<section class="card">
<div class="card-head">
<div class="card-title">报告摘要</div>
<div class="card-subtitle">不改报告语义,仅收敛样式、配色与信息层级</div>
</div>
<div class="summary-grid">
<div class="score-box">
<div class="score-label">综合得分</div>
<div class="score-value">{score}</div>
<div class="score-track"><div class="score-fill" style="width:{score_pct}%"></div></div>
<div class="score-note">这份正式 HTML 报告沿用平台工作台的版式语言,突出摘要、明细和建议三层信息。</div>
</div>
<div class="summary-main">
<div class="eyebrow">正式报告样式方向</div>
<h1 class="report-title">公文格式审核报告</h1>
<div class="report-meta">{filename} · 共 {summary.total_findings} 项问题 · 用作正式 HTML 报告输出</div>
<div class="metrics">
<div class="metric">
<div class="metric-label">错误项</div>
<div class="metric-value"><strong>{severity_counts["error"]}</strong><span>error</span></div>
</div>
<div class="metric">
<div class="metric-label">警告项</div>
<div class="metric-value"><strong>{severity_counts["warning"]}</strong><span>warning</span></div>
</div>
<div class="metric">
<div class="metric-label">提示项</div>
<div class="metric-value"><strong>{severity_counts["info"]}</strong><span>info</span></div>
</div>
<div class="metric">
<div class="metric-label">问题类别</div>
<div class="metric-value"><strong>{category_count}</strong><span>标题 / 发文 / 格式 / 其他</span></div>
</div>
</div>
<div class="chips">
<span class="chip error">错误 {severity_counts["error"]}</span>
<span class="chip warning">警告 {severity_counts["warning"]}</span>
<span class="chip info">提示 {severity_counts["info"]}</span>
</div>
</div>
</div>
</section>
<section class="content-grid">
<aside class="card">
<div class="card-head">
<div class="card-title">侧边摘要</div>
<div class="card-subtitle">工作台侧栏语义</div>
</div>
<div class="sidebar-body">
<article class="summary-row">
<div class="summary-row-label">命中最多规则</div>
<div class="summary-row-value">{escape(top_rule_id)}</div>
<div class="summary-row-desc">当前命中最多的规则共 {top_rule_count} 项,适合在正式版中作为摘要提示保留。</div>
</article>
<article class="summary-row">
<div class="summary-row-label">集中行号</div>
<div class="summary-row-value">{escape(line_range)}</div>
<div class="summary-row-desc">问题主要集中在这一段行号范围,便于阅读者快速判断问题分布区域。</div>
</article>
<article class="summary-row">
<div class="summary-row-label">实体状态</div>
<div class="summary-row-value">{escape(entity_summary)}</div>
<div class="summary-row-desc">按现有识别结果汇总实体抽取状态,用于辅助理解顶部结构类问题。</div>
</article>
</div>
</aside>
<article class="card">
<div class="table-toolbar">
<div class="toolbar-left">
<div class="toolbar-title">问题明细</div>
<div class="toolbar-desc">保留当前报告语义,只收敛版式、层级和配色。</div>
</div>
<div class="toolbar-filters">
<span class="filter active">全部</span>
<span class="filter">错误</span>
<span class="filter">警告</span>
</div>
</div>
<div class="table-wrap">
<table>
<thead>
<tr>
<th>编号</th>
<th>规则</th>
<th>严重度</th>
<th>类别</th>
<th>位置</th>
<th>说明</th>
</tr>
</thead>
<tbody>
{''.join(rows) or '<tr><td colspan="6" class="empty">未发现问题</td></tr>'}
</tbody>
</table>
</div>
</article>
</section>
 </div>
 </div>
-</div>
-<table>
-<thead><tr>
-<th>编号</th><th>规则</th><th>严重度</th><th>类别</th><th>位置</th><th>说明</th>
-</tr></thead>
-<tbody>{''.join(rows) or '<tr><td colspan=6>未发现问题</td></tr>'}</tbody>
-</table>
-</body></html>"""
-    return body
+</body>
+</html>"""
+
+
+def _severity_counts(result: AuditResult) -> dict[str, int]:
+    counts = Counter(finding.severity for finding in result.findings)
+    return {
+        "error": counts.get("error", 0),
+        "warning": counts.get("warning", 0),
+        "info": counts.get("info", 0),
+    }
+
+
+def _top_rule(result: AuditResult) -> tuple[str, int]:
+    counter = Counter(finding.rule_id for finding in result.findings if finding.rule_id)
+    if not counter:
+        return "", 0
+    rule_id, count = counter.most_common(1)[0]
+    return rule_id, count
+
+
+def _line_range(result: AuditResult) -> str:
+    indices = sorted(
+        {
+            int(finding.location.paragraph_index) + 1
+            for finding in result.findings
+            if finding.location.paragraph_index is not None
+        }
+    )
+    if not indices:
+        return "未定位"
+    if len(indices) == 1:
+        return f"{indices[0]}"
+    return f"{indices[0]} 行 - 第 {indices[-1]}"
+
+
+def _entity_summary(result: AuditResult) -> str:
+    expected = ["title", "doc_number", "recipient", "date"]
+    missing = [key for key in expected if not result.entities.get(key)]
+    if not missing:
+        return "核心实体齐全"
+    if len(missing) == len(expected):
+        return "标题 / 发文"
+    return "缺少 " + " / ".join(missing[:2])
+
+
+def _format_location(paragraph_index: int | None) -> str:
+    if paragraph_index is None:
+        return "未定位"
+    return f"{int(paragraph_index) + 1}"
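Review note: the summary helpers added above all reduce the findings list with `collections.Counter` and convert 0-based paragraph indices into 1-based line numbers. The following is a minimal sketch of that aggregation, assuming a simplified `SketchFinding` dataclass as a hypothetical stand-in for the engine's real `Finding` model (which nests the index under `location`).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchFinding:
    """Hypothetical, flattened stand-in for the engine's Finding model."""
    severity: str
    rule_id: str
    paragraph_index: Optional[int]


def severity_counts(findings):
    # Always report the three known levels, even when a level has no findings.
    counts = Counter(f.severity for f in findings)
    return {level: counts.get(level, 0) for level in ("error", "warning", "info")}


def top_rule(findings):
    # Most frequently hit rule id and its hit count; ("", 0) when nothing matched.
    counter = Counter(f.rule_id for f in findings if f.rule_id)
    if not counter:
        return "", 0
    return counter.most_common(1)[0]


def one_based_lines(findings):
    # Unlocated findings (index None) are skipped; indices are shifted to 1-based.
    return sorted({f.paragraph_index + 1 for f in findings if f.paragraph_index is not None})


findings = [
    SketchFinding("error", "R1", 0),
    SketchFinding("error", "R1", 4),
    SketchFinding("warning", "R2", None),
]
```

Using a set comprehension before sorting means several findings on the same paragraph contribute a single line number to the reported range.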
@@ -7,7 +7,7 @@ import json
 import mimetypes
 import time
 from dataclasses import dataclass
-from datetime import datetime
+from datetime import date, datetime
 from pathlib import Path
 from typing import Any
@@ -60,6 +60,20 @@ class GovdocServiceImpl(IGovdocService):
         self.OssService = OssService or OssServiceImpl()
         self.Storage = StorageAdapter()

+    def _parse_date_filter(self, value: str | None, field_name: str) -> date | None:
+        if value is None:
+            return None
+        normalized = value.strip()
+        if not normalized:
+            return None
+        try:
+            return date.fromisoformat(normalized)
+        except ValueError as exc:
+            raise LeauditException(
+                StatusCodeEnum.HTTP_400_BAD_REQUEST,
+                f"{field_name} 格式非法,应为 YYYY-MM-DD",
+            ) from exc
+
     # ── 文档 ──────────────────────────────────────────────

     async def UploadDocument(
@@ -250,12 +264,14 @@ class GovdocServiceImpl(IGovdocService):
         if resultStatus:
             filters.append("COALESCE(gr.result_status, '') = :result_status")
             params["result_status"] = resultStatus.strip()
-        if dateFrom:
-            filters.append("d.created_at >= CAST(:date_from AS date)")
-            params["date_from"] = dateFrom.strip()
-        if dateTo:
-            filters.append("d.created_at < (CAST(:date_to AS date) + INTERVAL '1 day')")
-            params["date_to"] = dateTo.strip()
+        parsedDateFrom = self._parse_date_filter(dateFrom, "dateFrom")
+        parsedDateTo = self._parse_date_filter(dateTo, "dateTo")
+        if parsedDateFrom:
+            filters.append("d.created_at::date >= :date_from")
+            params["date_from"] = parsedDateFrom
+        if parsedDateTo:
+            filters.append("d.created_at::date <= :date_to")
+            params["date_to"] = parsedDateTo

         whereClause = " AND ".join(filters)
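Review note: the hunk above stops passing raw strings into the SQL cast and validates the filter in Python first, so a malformed value fails with a 400 instead of a database error. A minimal sketch of that validation step follows; it raises a plain `ValueError` as an assumption, since the project's `LeauditException` and `StatusCodeEnum` are not reproduced here.

```python
from datetime import date
from typing import Optional


def parse_date_filter(value: Optional[str], field_name: str) -> Optional[date]:
    """Return a date for a YYYY-MM-DD string, or None when the filter is absent."""
    if value is None:
        return None
    normalized = value.strip()
    if not normalized:
        # Blank input is treated as "no filter", not as an error.
        return None
    try:
        return date.fromisoformat(normalized)
    except ValueError as exc:
        # Assumption: the real service maps this to an HTTP 400 response.
        raise ValueError(f"{field_name} must be formatted as YYYY-MM-DD") from exc
```

Binding a `date` object rather than a string also lets the driver handle the type, which is why the new filters compare `d.created_at::date` directly against the parameter.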
@@ -901,9 +917,10 @@ class GovdocServiceImpl(IGovdocService):
         artifact = await self._get_report_artifact(runId, "html_report")
         if not artifact:
             return {"runId": runId, "htmlUrl": ""}
+        content = await self.OssService.DownloadBytes(str(artifact["oss_url"]))
         return {
             "runId": runId,
-            "htmlUrl": await self.OssService.PresignGetUrl(str(artifact["oss_url"])),
+            "html": content.decode("utf-8"),
         }

     async def GetReportDocx(self, runId: int) -> dict[str, Any]:
@@ -4,6 +4,7 @@ import json
 import uuid
 from typing import AsyncGenerator

+import httpx
 from sqlalchemy import text
 from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
@@ -25,6 +26,7 @@ from fastapi_modules.fastapi_leaudit.domian.vo.ragChatVo import (
     RagMessagePageVO,
     RagOperationResultVO,
 )
+from fastapi_modules.fastapi_leaudit.rag_engine.config import RAG_CONFIG
 from fastapi_modules.fastapi_leaudit.rag_engine.generator import generate_stream
 from fastapi_modules.fastapi_leaudit.rag_engine.question_chains import generate_followups
 from fastapi_modules.fastapi_leaudit.services.ragChatService import IRagChatService
@@ -194,7 +196,7 @@ class RagChatServiceImpl(IRagChatService):
                 await session.execute(
                     text(
                         """
-                        SELECT message_id, role, content, sources, feedback, created_at
+                        SELECT message_id, role, content, sources, metadata, feedback, created_at
                         FROM rag_message
                         WHERE conversation_id = :conversation_id
                         ORDER BY created_at ASC
@@ -216,6 +218,11 @@ class RagChatServiceImpl(IRagChatService):
             row = items[idx]
             if row["role"] == "user":
                 answer = items[idx + 1] if idx + 1 < len(items) and items[idx + 1]["role"] == "assistant" else None
+                answer_sources = self._parse_json_field(answer.get("sources")) if answer else []
+                answer_metadata = self._parse_json_field(answer.get("metadata")) if answer else {}
+                suggested_questions = answer_metadata.get("suggested_questions") if isinstance(answer_metadata, dict) else []
+                if not isinstance(suggested_questions, list):
+                    suggested_questions = []
                 data.append(
                     RagMessageItemVO(
                         id=(answer["message_id"] if answer else row["message_id"]),
@@ -223,7 +230,8 @@ class RagChatServiceImpl(IRagChatService):
                         query=row["content"],
                         answer=answer["content"] if answer else "",
                         feedback=({"rating": answer["feedback"]} if answer and answer.get("feedback") else None),
-                        retrieverResources=(answer.get("sources") if answer else None),
+                        retrieverResources=answer_sources or None,
+                        suggestedQuestions=[str(item) for item in suggested_questions],
                         createdAt=int(row["created_at"].timestamp()) if row.get("created_at") else 0,
                     )
                 )
@@ -392,6 +400,18 @@ class RagChatServiceImpl(IRagChatService):
         area = row.get("area") or ""
         return area in ("", "省级", user_area or "") or bool(row.get("dataset_public"))

+    def _parse_json_field(self, value):
+        if value is None:
+            return {}
+        if isinstance(value, (dict, list)):
+            return value
+        if isinstance(value, str):
+            try:
+                return json.loads(value)
+            except Exception:
+                return {}
+        return {}
+
     async def _ensure_conversation(self, user_id: int, conversation_id: str | None, app_id: int | None) -> str:
         if conversation_id and conversation_id != "-1":
             async with GetAsyncSession() as session:
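Review note: `_parse_json_field` exists because the `sources` and `metadata` columns may come back from the driver either already decoded or as raw JSON text. A standalone sketch of the same tolerant decoding is shown below; the function name and the narrower `except` clause are illustrative choices, not the service's exact code.

```python
import json
from typing import Any


def parse_json_field(value: Any) -> Any:
    """Decode a JSON column that may arrive as None, a container, or a string."""
    if value is None:
        return {}
    if isinstance(value, (dict, list)):
        # Already decoded by the database driver.
        return value
    if isinstance(value, str):
        try:
            return json.loads(value)
        except ValueError:
            # json.JSONDecodeError subclasses ValueError; bad text degrades to {}.
            return {}
    return {}
```

Because the fallback is always `{}`, callers that expect a list (as the `sources` path does) still need their own `or None` / `isinstance` guard, which is what the history-assembly hunk adds.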
@@ -450,7 +470,7 @@ class RagChatServiceImpl(IRagChatService):
                 await session.execute(
                     text(
                         """
-                        SELECT id, name, collection_name, retrieval_model
+                        SELECT id, name, collection_name, retrieval_model, embedding_model
                         FROM rag_dataset
                         WHERE id = :dataset_id AND deleted_at IS NULL
                         LIMIT 1
@@ -475,7 +495,12 @@ class RagChatServiceImpl(IRagChatService):
             return [], dataset.get("name") or ""
         try:
             collection = get_chroma().get_or_create_collection(dataset["collection_name"])
-            result = collection.query(query_texts=[query], n_results=max(top_k, 1))
+            query_embedding = await self._embed_texts([query], dataset.get("embedding_model") or "")
+            result = collection.query(
+                query_embeddings=query_embedding,
+                n_results=max(top_k, 1),
+                include=["documents", "metadatas", "distances"],
+            )
             docs = (result.get("documents") or [[]])[0]
             metas = (result.get("metadatas") or [[]])[0]
             distances = (result.get("distances") or [[]])[0]
@@ -483,7 +508,8 @@ class RagChatServiceImpl(IRagChatService):
             for idx, doc in enumerate(docs):
                 meta = metas[idx] if idx < len(metas) else {}
                 dist = distances[idx] if idx < len(distances) else 0.0
-                score = 1 - float(dist or 0.0)
+                distance = max(0.0, float(dist or 0.0))
+                score = 1.0 / (1.0 + distance)
                 if score_threshold is not None and score < score_threshold:
                     continue
                 chunks.append(
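Review note: the old `1 - distance` mapping goes negative once a distance exceeds 1, which can happen with unnormalized distance metrics. The new `1 / (1 + distance)` mapping stays within (0, 1] for any non-negative distance. A small sketch comparing the two, with function names chosen only for illustration:

```python
def legacy_score(distance):
    # Previous mapping: linear, unbounded below for distances greater than 1.
    return 1 - float(distance or 0.0)


def bounded_score(distance):
    # New mapping: clamp negative distances to 0, then squash into (0, 1].
    clamped = max(0.0, float(distance or 0.0))
    return 1.0 / (1.0 + clamped)
```

For distances between 0 and 1 the new score is never lower than the old one, so any `score_threshold` tuned against the previous formula may let more chunks through and is worth re-checking.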
@@ -501,6 +527,46 @@ class RagChatServiceImpl(IRagChatService):
         except Exception:
             return [], dataset.get("name") or ""

+    async def _embed_texts(self, texts: list[str], model_name: str) -> list[list[float]]:
+        embed_url = (RAG_CONFIG.get("EMBED_URL") or "").strip() or f"{RAG_CONFIG['LLM_BASE_URL'].rstrip('/')}/embeddings"
+        embed_key = (RAG_CONFIG.get("EMBED_KEY") or "").strip() or RAG_CONFIG["LLM_API_KEY"]
+        embed_model = model_name or (RAG_CONFIG.get("EMBED_MODEL") or "").strip() or "text-embedding-v4"
+        batch_size = max(1, int(RAG_CONFIG.get("EMBED_BATCH_SIZE") or 10))
+
+        if not embed_url or not embed_key:
+            raise LeauditException(StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR, "未配置可用的向量化服务")
+
+        embeddings: list[list[float]] = []
+        async with httpx.AsyncClient(timeout=120.0) as client:
+            for start in range(0, len(texts), batch_size):
+                batch_texts = texts[start:start + batch_size]
+                try:
+                    response = await client.post(
+                        embed_url,
+                        headers={
+                            "Content-Type": "application/json",
+                            "Authorization": f"Bearer {embed_key}",
+                        },
+                        json={"model": embed_model, "input": batch_texts},
+                    )
+                    response.raise_for_status()
+                except httpx.HTTPStatusError as exc:
+                    error_message = exc.response.text.strip() or f"{exc.response.status_code} {exc.response.reason_phrase}"
+                    raise LeauditException(
+                        StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR,
+                        f"向量化服务调用失败: {error_message[:300]}",
+                    ) from exc
+                payload = response.json()
+                rows = payload.get("data") or []
+                batch_embeddings = [row.get("embedding") for row in rows if isinstance(row, dict) and row.get("embedding")]
+                if len(batch_embeddings) != len(batch_texts):
+                    raise LeauditException(StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR, "向量化结果数量异常")
+                embeddings.extend(batch_embeddings)
+
+        if len(embeddings) != len(texts):
+            raise LeauditException(StatusCodeEnum.HTTP_500_INTERNAL_SERVER_ERROR, "向量化结果数量异常")
+        return embeddings
+
     def _build_sources(self, context_chunks: list[dict], dataset_name: str) -> list[dict]:
         return [
             {
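Review note: the core of `_embed_texts` is a batching loop that checks, per batch, that the service returned exactly one vector per input. The sketch below isolates that loop with the network call replaced by an injected `embed_batch` callable; `fake_embed_batch` is a hypothetical stand-in for the HTTP request, and `ValueError` stands in for the project's exception type.

```python
from typing import Callable, List


def embed_in_batches(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 10,
) -> List[List[float]]:
    """Embed texts in fixed-size batches, preserving input order."""
    batch_size = max(1, int(batch_size))
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors = embed_batch(batch)
        if len(vectors) != len(batch):
            # A short or long response would silently misalign vectors and texts.
            raise ValueError("embedding count does not match input count")
        embeddings.extend(vectors)
    return embeddings


def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    # Hypothetical embedder: one-dimensional vector holding the text length.
    return [[float(len(text))] for text in batch]
```

Checking the count per batch, rather than only at the end, reports the failure at the batch that caused it.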
@@ -1186,7 +1186,7 @@ class RagDatasetServiceImpl(IRagDatasetService):
             content = documents[index] if index < len(documents) else ""
             metadata = metadatas[index] if index < len(metadatas) and isinstance(metadatas[index], dict) else {}
             distance = float(distances[index]) if index < len(distances) and distances[index] is not None else 1.0
-            score = max(0.0, min(1.0, 1.0 - distance))
+            score = max(0.0, min(1.0, 1.0 / (1.0 + max(0.0, distance))))
             if score_threshold_enabled and score_threshold is not None and score < score_threshold:
                 continue
+8 -16
@@ -167,10 +167,8 @@ start_backend() {
     log_info "启动后端服务 (端口: $BACKEND_PORT)..."
     : > "$BACKEND_LOG"
-    (
-        cd "$BACKEND_DIR"
-        exec "$BACKEND_PYTHON" run.py
-    ) >> "$BACKEND_LOG" 2>&1 &
+    nohup bash -lc "cd \"$BACKEND_DIR\" && exec \"$BACKEND_PYTHON\" run.py" \
+        >> "$BACKEND_LOG" 2>&1 < /dev/null &
     pid=$!
     sleep 2
@@ -201,10 +199,8 @@ start_frontend() {
     log_info "启动前端开发服务 (端口: $FRONTEND_DEV_PORT,代理入口: $FRONTEND_PUBLIC_PORT)..."
     : > "$FRONTEND_LOG"
-    (
-        cd "$FRONTEND_DIR"
-        exec npm run dev:dev
-    ) >> "$FRONTEND_LOG" 2>&1 &
+    nohup bash -lc "cd \"$FRONTEND_DIR\" && exec npm run dev:dev" \
+        >> "$FRONTEND_LOG" 2>&1 < /dev/null &
     pid=$!
     sleep 4
@@ -238,10 +234,8 @@ start_worker() {
     log_info "启动 Worker 服务..."
     : > "$WORKER_LOG"
-    (
-        cd "$PROJECT_DIR"
-        exec "$WORKER_SCRIPT"
-    ) >> "$WORKER_LOG" 2>&1 &
+    nohup bash -lc "cd \"$PROJECT_DIR\" && exec \"$WORKER_SCRIPT\"" \
+        >> "$WORKER_LOG" 2>&1 < /dev/null &
     pid=$!
     echo "$pid" > "$WORKER_PID_FILE"
     sleep 2
@@ -273,10 +267,8 @@ start_beat() {
     log_info "启动 Beat 调度服务..."
     : > "$BEAT_LOG"
-    (
-        cd "$PROJECT_DIR"
-        exec "$BEAT_SCRIPT"
-    ) >> "$BEAT_LOG" 2>&1 &
+    nohup bash -lc "cd \"$PROJECT_DIR\" && exec \"$BEAT_SCRIPT\"" \
+        >> "$BEAT_LOG" 2>&1 < /dev/null &
     pid=$!
     echo "$pid" > "$BEAT_PID_FILE"
     sleep 2
Submodule legal-platform-frontend updated: f6bb4aa552...d541eb74ae
+298
@@ -0,0 +1,298 @@
#!/usr/bin/env python3
"""按已有 govdoc run 重生成 HTML 报告并覆盖 OSS 产物。"""

from __future__ import annotations

import argparse
import asyncio
import hashlib
import json
from typing import Any

from sqlalchemy import text

from fastapi_common.fastapi_common_sqlalchemy.database import GetAsyncSession
from fastapi_modules.fastapi_leaudit.govdoc_engine.engine.result import (
    AuditResult,
    AuditSummary,
    CheckedRule,
    OutlineNode,
    StructureItem,
)
from fastapi_modules.fastapi_leaudit.govdoc_engine.models import Finding, Location
from fastapi_modules.fastapi_leaudit.govdoc_engine.parser.entities import SemanticEntity
from fastapi_modules.fastapi_leaudit.govdoc_engine.reporter.html_renderer import render_html
from fastapi_modules.fastapi_leaudit.services.impl.ossServiceImpl import OssServiceImpl


def _parse_json(raw: Any) -> Any:
    if raw is None or raw == "":
        return None
    if isinstance(raw, (dict, list)):
        return raw
    try:
        return json.loads(raw)
    except Exception:
        return None


def _build_checked_rules(rule_rows: list[dict[str, Any]]) -> list[CheckedRule]:
    checked_rules: list[CheckedRule] = []
    seen_rule_ids: set[str] = set()
    for row in rule_rows:
        rule_id = str(row["rule_id"])
        if rule_id in seen_rule_ids:
            continue
        seen_rule_ids.add(rule_id)
        status = str(row.get("result") or "pass")
        checked_rules.append(
            CheckedRule(
                rule_id=rule_id,
                name=row.get("rule_name") or rule_id,
                severity=row.get("severity") or "info",
                category=row.get("category") or "",
                status=status if status in {"pass", "fail", "skipped"} else "pass",
                skip_reason=row.get("skip_reason") or "",
            )
        )
    return checked_rules


def _build_findings(rule_rows: list[dict[str, Any]]) -> list[Finding]:
    findings: list[Finding] = []
    for index, row in enumerate(rule_rows):
        if row.get("result") != "fail":
            continue
        paragraph_index = int(row.get("paragraph_index") or 0)
        findings.append(
            Finding(
                finding_id=f"{row['rule_id']}-{paragraph_index or index}",
                rule_id=str(row["rule_id"]),
                rule_name=row.get("rule_name") or str(row["rule_id"]),
                severity=row.get("severity") or "info",
                category=row.get("category") or "",
                location=Location(
                    paragraph_index=paragraph_index,
                    role=row.get("location_path"),
                    char_start=0,
                    char_end=0,
                    context=row.get("paragraph_text") or "",
                ),
                actual=_parse_json(row.get("actual")) or {},
                expected=_parse_json(row.get("expected")) or {},
                message=row.get("message") or "",
                suggestion=row.get("suggestion") or "",
                evidence=str(row.get("evidence") or ""),
                confidence=1.0,
            )
        )
    return findings


def _build_summary(run_row: dict[str, Any], findings: list[Finding]) -> AuditSummary:
    severity_stats: dict[str, int] = {}
    category_stats: dict[str, int] = {}
    for finding in findings:
        severity_stats[finding.severity] = severity_stats.get(finding.severity, 0) + 1
        if finding.category:
            category_stats[finding.category] = category_stats.get(finding.category, 0) + 1
    return AuditSummary(
        score=int(float(run_row.get("total_score") or 0)),
        total_findings=len(findings),
        by_severity=severity_stats,
        by_category=category_stats,
        passed_count=int(run_row.get("passed_count") or 0),
        failed_count=int(run_row.get("failed_count") or 0),
        skipped_count=int(run_row.get("skipped_count") or 0),
    )


def _normalize_structure_item(item: dict[str, Any]) -> dict[str, Any]:
    return {
        "role": item.get("role"),
        "label": item.get("label") or "",
        "count": item.get("count") or 0,
        "expected": bool(item.get("expected", False)),
        "paragraph_indices": item.get("paragraphIndices") or item.get("paragraph_indices") or [],
        "samples": item.get("samples") or [],
        "char_total": item.get("charTotal") or item.get("char_total") or 0,
        "dominant_font": item.get("dominantFont") or item.get("dominant_font"),
        "dominant_size_pt": item.get("dominantSizePt") or item.get("dominant_size_pt"),
        "style_uniform": bool(item.get("styleUniform", item.get("style_uniform", True))),
    }


def _normalize_outline_node(item: dict[str, Any]) -> dict[str, Any]:
    return {
        "paragraph_index": item.get("paragraphIndex") or item.get("paragraph_index") or 0,
        "level": item.get("level") or 0,
        "text": item.get("text") or "",
        "children": [_normalize_outline_node(child) for child in (item.get("children") or [])],
    }
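Review note: the two normalizers accept either camelCase or snake_case keys because the stored summary JSON may have been written in either form. A compact sketch of that lookup pattern follows; `pick` is a hypothetical helper introduced only for illustration and does not exist in the script.

```python
from typing import Any, Dict


def pick(item: Dict[str, Any], camel: str, snake: str, default: Any = None) -> Any:
    """Return the first truthy value among the camelCase key, the snake_case key, and the default."""
    return item.get(camel) or item.get(snake) or default


def normalize_outline_node(item: Dict[str, Any]) -> Dict[str, Any]:
    # Recursively rewrite a stored outline node into snake_case keys.
    return {
        "paragraph_index": pick(item, "paragraphIndex", "paragraph_index", 0),
        "level": item.get("level") or 0,
        "text": item.get("text") or "",
        "children": [normalize_outline_node(child) for child in (item.get("children") or [])],
    }
```

Since the lookup chains with `or`, a stored value of `0` falls through to the next key and then to the default. That is harmless here because the default is also `0`, but the same pattern would misbehave for a field whose default differs from its falsy value.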


async def regenerate_html_report(run_id: int) -> None:
    oss_service = OssServiceImpl()
    async with GetAsyncSession() as session:
        run_row = (
            await session.execute(
                text(
                    """
                    SELECT
                        gr.id,
                        gr.document_id,
                        gr.total_score,
                        gr.passed_count,
                        gr.failed_count,
                        gr.skipped_count,
                        gr.result_summary_json,
                        d.region,
                        f.file_name
                    FROM govdoc_runs gr
                    JOIN leaudit_documents d
                      ON d.id = gr.document_id
                     AND d.deleted_at IS NULL
                    JOIN leaudit_document_files f
                      ON f.document_id = d.id
                     AND f.file_role = 'original'
                     AND f.is_active = true
                     AND f.deleted_at IS NULL
                    WHERE gr.id = :run_id
                      AND gr.deleted_at IS NULL
                    LIMIT 1
                    """
                ),
                {"run_id": run_id},
            )
        ).mappings().first()
        if not run_row:
            raise RuntimeError(f"run {run_id} 不存在")

        rule_rows = (
            await session.execute(
                text(
                    """
                    SELECT
                        rule_id,
                        rule_name,
                        severity,
                        category,
                        result,
                        skip_reason,
                        message,
                        suggestion,
                        actual,
                        expected,
                        evidence,
                        paragraph_index,
                        paragraph_text,
                        location_path
                    FROM govdoc_rule_results
                    WHERE run_id = :run_id
                      AND deleted_at IS NULL
                    ORDER BY id ASC
                    """
                ),
                {"run_id": run_id},
            )
        ).mappings().all()

        artifact_row = (
            await session.execute(
                text(
                    """
                    SELECT id, file_name, oss_url
                    FROM govdoc_report_artifacts
                    WHERE run_id = :run_id
                      AND artifact_type = 'html_report'
                      AND deleted_at IS NULL
                    ORDER BY id DESC
                    LIMIT 1
                    """
                ),
                {"run_id": run_id},
            )
        ).mappings().first()
        if not artifact_row:
            raise RuntimeError(f"run {run_id} 没有 html_report 产物记录")

        aux = _parse_json(run_row.get("result_summary_json")) or {}
        findings = _build_findings(rule_rows)
        result = AuditResult(
            audit_id=str(run_id),
            document={
                "documentId": int(run_row["document_id"]),
                "filename": run_row.get("file_name") or "",
            },
            summary=_build_summary(run_row, findings),
            findings=findings,
            checked_rules=_build_checked_rules(rule_rows),
            structure=[
                StructureItem.model_validate(_normalize_structure_item(item))
                for item in aux.get("structure", [])
            ],
            outline=[
                OutlineNode.model_validate(_normalize_outline_node(item))
                for item in aux.get("outline", [])
            ],
            entities={
                name: SemanticEntity.model_validate(value)
                for name, value in (aux.get("entities") or {}).items()
                if value is not None
            },
        )

        html = render_html(result)
        html_bytes = html.encode("utf-8")
        sha256 = hashlib.sha256(html_bytes).hexdigest()

        await oss_service.UploadText(
            ObjectKey=str(artifact_row["oss_url"]),
            Content=html,
            ContentType="text/html; charset=utf-8",
        )

        await session.execute(
            text(
                """
                UPDATE govdoc_report_artifacts
                SET file_size = :file_size,
                    sha256 = :sha256,
                    mime_type = 'text/html; charset=utf-8',
                    updated_at = now()
                WHERE id = :artifact_id
                """
            ),
            {
                "artifact_id": int(artifact_row["id"]),
                "file_size": len(html_bytes),
                "sha256": sha256,
            },
        )
        await session.commit()

    print(
        json.dumps(
            {
                "runId": run_id,
                "documentId": int(run_row["document_id"]),
                "fileName": run_row.get("file_name") or "",
                "artifactOssKey": artifact_row["oss_url"],
                "htmlBytes": len(html_bytes),
                "sha256": sha256,
            },
            ensure_ascii=False,
        )
    )


def main() -> None:
    parser = argparse.ArgumentParser(description="重生成 govdoc HTML 报告")
    parser.add_argument("run_id", type=int, help="govdoc run id")
    args = parser.parse_args()
    asyncio.run(regenerate_html_report(args.run_id))


if __name__ == "__main__":
    main()
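Review note: the script records `file_size` and `sha256` from the UTF-8 encoded bytes rather than from the Python string, which matters because the report contains multi-byte Chinese text. A minimal sketch of that fingerprinting step, with `artifact_fingerprint` as a hypothetical helper name:

```python
import hashlib
from typing import Tuple


def artifact_fingerprint(html: str) -> Tuple[int, str]:
    """Return (size in bytes, hex SHA-256) for the UTF-8 encoding of the HTML."""
    html_bytes = html.encode("utf-8")
    return len(html_bytes), hashlib.sha256(html_bytes).hexdigest()


sample = "<p>公文</p>"
size, digest = artifact_fingerprint(sample)
```

Here `len(sample)` counts 9 characters while `size` is 13 bytes, so storing the character count would understate the object size in the artifact table.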