feat(rag): add temporary chat attachments

wren
2026-05-25 15:37:37 +08:00
parent 0f385c9839
commit 75c077da77
16 changed files with 2257 additions and 16 deletions
@@ -0,0 +1,89 @@
# Chat Temporary RAG Attachments Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build conversation-scoped temporary RAG attachments for chat with 7-day TTL, parser/OCR indexing, strict tenant/user/conversation isolation, and dual retrieval with the existing formal knowledge base.
**Architecture:** Add a focused `RagChatAttachmentServiceImpl` responsible for attachment lifecycle, parsing, chunking, indexing, retrieval, validation, and cleanup. Extend `RagChatServiceImpl` so a chat message can include an `AttachmentId`, retrieve attachment facts first, then formal KB legal context, and generate one answer with source-aware chunks. Add frontend upload/poll/status plumbing inside the existing chat input path without touching unrelated dirty frontend files.
**Tech Stack:** FastAPI, SQLAlchemy text queries, Chroma, existing `RagRetriever`, existing OSS client, existing LeAudit OCR bridge, React/Next.js, Ant Design upload controls.
---
### Task 1: Backend Attachment Contract And Unit Tests
**Files:**
- Create: `tests/test_rag_chat_attachment_service.py`
- Create: `fastapi_modules/fastapi_leaudit/domian/vo/ragChatAttachmentVo.py`
- Modify: `fastapi_modules/fastapi_leaudit/domian/Dto/ragChatDto.py`
- [ ] Write tests for TTL, collection naming, scope matching, and chunk metadata.
- [ ] Run: `pytest tests/test_rag_chat_attachment_service.py -q`; expected failures mention missing `RagChatAttachmentServiceImpl`.
- [ ] Add attachment DTO and VO classes.
- [ ] Re-run the same test; the remaining failures should point only at the missing service implementation.
### Task 2: Backend Attachment Service
**Files:**
- Create: `fastapi_modules/fastapi_leaudit/services/ragChatAttachmentService.py`
- Create: `fastapi_modules/fastapi_leaudit/services/impl/ragChatAttachmentServiceImpl.py`
- Create: `scripts/创建sql/schema_add_rag_chat_attachments.sql`
- [ ] Implement schema creation for `rag_chat_attachment`.
- [ ] Implement `BuildCollectionName`, `BuildChunks`, `CreateAttachment`, `GetAttachment`, `ValidateAttachmentForChat`, `RetrieveAttachmentContext`, `DeleteAttachment`, and `CleanupExpiredAttachments`.
- [ ] Implement parsers for txt/md/json/csv/docx/pdf/xlsx/images.
- [ ] Implement async indexing with status transitions and best-effort cleanup.
- [ ] Run: `pytest tests/test_rag_chat_attachment_service.py -q`; expected pass.
### Task 3: Chat Message Dual Retrieval
**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/services/ragChatService.py`
- Modify: `fastapi_modules/fastapi_leaudit/services/impl/ragChatServiceImpl.py`
- Modify: `fastapi_modules/fastapi_leaudit/rag_engine/generator.py`
- Modify: `tests/test_rag_chat_streaming_sources.py`
- [ ] Add `AttachmentId` to chat service and DTO call path.
- [ ] Add tests proving `_run_message_task` merges attachment chunks and formal KB chunks with distinct source scopes.
- [ ] Run targeted test and confirm red.
- [ ] Implement retrieval merge and grounded legal query construction.
- [ ] Update generator prompt to group uploaded file facts separately from formal KB basis.
- [ ] Run targeted tests and confirm green.
### Task 4: Backend Controller Routes
**Files:**
- Modify: `fastapi_modules/fastapi_leaudit/controllers/ragChatController.py`
- [ ] Add `POST /chat/attachments`, `GET /chat/attachments/{AttachmentId}`, and `DELETE /chat/attachments/{AttachmentId}`.
- [ ] Extend message send route to pass `Body.attachmentId`.
- [ ] Reuse `rag:chat:use` permission and existing tenant context.
- [ ] Run backend attachment and chat streaming tests.
### Task 5: Frontend Upload And Send Plumbing
**Files:**
- Create: `legal-platform-frontend/app/api/chat-attachments/route.ts`
- Create: `legal-platform-frontend/app/api/chat-attachments/[attachmentId]/route.ts`
- Create: `legal-platform-frontend/lib/api/legacy/dify-chat/attachment.ts`
- Modify: `legal-platform-frontend/lib/api/legacy/dify-chat/types.ts`
- Modify: `legal-platform-frontend/lib/api/legacy/dify-chat/client.ts`
- Modify: `legal-platform-frontend/app/api/chat-messages/route.ts`
- Modify: `legal-platform-frontend/hooks/use-chat-message.ts`
- Modify: `legal-platform-frontend/components/dify-chat/index.tsx`
- Modify: `legal-platform-frontend/components/dify-chat/chat-input.tsx`
- [ ] Add frontend attachment API client and Next route proxies.
- [ ] Update chat input to allow one file, upload immediately, poll status, show completed/error/removable state, and block send while indexing.
- [ ] Pass `attachmentId` through `Chat` -> `useChatMessage` -> `/api/chat-messages` -> FastAPI.
- [ ] Keep uploads scoped to the current conversation id. For new chats, either create the conversation first and attach the file once the backend returns its id, or require an existing conversation before allowing an attachment.
### Task 6: Verification
**Files:**
- All touched files
- [ ] Run: `pytest tests/test_rag_chat_attachment_service.py tests/test_rag_chat_streaming_sources.py -q`.
- [ ] Run the frontend type-check/test command if one exists and it can be scoped to the touched files.
- [ ] Check `git status --short` at root and frontend subrepo.
- [ ] Report changed files, verification output, and any known gaps.
@@ -0,0 +1,150 @@
# Chat Temporary RAG Attachments Design
## Goal
Implement in-chat file upload for `/chat-with-llm/chat` so a user can upload one document or image, have it parsed and indexed into a temporary conversation-scoped RAG collection, and then ask questions that combine facts from the uploaded file with legal knowledge from the existing app knowledge base. Temporary attachment indexes expire after 7 days by default.
## Core Behavior
The upload belongs to the chat experience, not the permanent knowledge-base management UI. A file uploaded inside a conversation creates a temporary attachment record and a temporary vector collection. Chat messages may reference that attachment by `attachmentId`.
The answer flow is dual-source RAG:
1. Retrieve from the temporary attachment collection to establish facts from the uploaded file.
2. Build a grounded legal search query from the user question plus attachment hits.
3. Retrieve from the existing formal app knowledge base to find laws, penalties, cases, or policy rules.
4. Generate a single answer that clearly uses uploaded-file content as facts and formal knowledge-base content as legal basis.
If no attachment is selected, chat continues using the existing formal knowledge-base retrieval path.
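The four steps above can be sketched as a small orchestration function. This is a minimal illustration, not the actual `_run_message_task` logic: `Chunk`, `build_legal_query`, `retrieve_dual`, and the 500-character fact budget are hypothetical names and values chosen for the example, and the retrievers are plain callables standing in for the real attachment and formal KB retrievers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Chunk:
    text: str
    source_scope: str  # "chat_attachment" or "formal_kb"


# Hypothetical retriever shape: takes a query string, returns matching chunks.
Retriever = Callable[[str], List[Chunk]]


def build_legal_query(question: str, attachment_hits: List[Chunk], max_chars: int = 500) -> str:
    """Ground the legal search in facts found in the uploaded file."""
    facts = " ".join(chunk.text for chunk in attachment_hits)[:max_chars]
    return f"{question}\n{facts}".strip()


def retrieve_dual(
    question: str,
    formal_kb: Retriever,
    attachment: Optional[Retriever] = None,
) -> List[Chunk]:
    """Return attachment facts first, then formal KB context."""
    # Step 1: facts from the temporary attachment collection (if one is selected).
    attachment_hits = attachment(question) if attachment else []
    # Step 2: build the grounded legal query; fall back to the raw question.
    legal_query = build_legal_query(question, attachment_hits) if attachment_hits else question
    # Step 3: legal context from the formal knowledge base.
    kb_hits = formal_kb(legal_query)
    # Step 4 (generation) consumes the merged, scope-tagged list.
    return attachment_hits + kb_hits
```

When `attachment` is `None`, the function degrades to the existing single-source path, which matches the no-attachment behavior described above.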
## Scope And Isolation
Temporary knowledge must never leak between conversations or users. Backend validation must never trust frontend `attachmentId` alone. Every attachment operation validates:
- `tenant_code` or resolved tenant context
- `user_id`
- `conversation_id`
- `attachment_id`
- `deleted_at IS NULL`
- `expires_at > NOW()`
- `indexing_status = 'completed'` for chat retrieval
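The checklist above could be expressed as a single validation function. The sketch below works on an in-memory row for clarity; in the real service the tenant, user, conversation, and `deleted_at` conditions would more likely be part of the SQL `WHERE` clause. `AttachmentRow`, `AttachmentAccessError`, and the reason strings are hypothetical names for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AttachmentRow:
    attachment_id: str
    tenant_code: str
    user_id: str
    conversation_id: str
    indexing_status: str
    expires_at: datetime
    deleted_at: Optional[datetime] = None


class AttachmentAccessError(Exception):
    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def validate_attachment_for_chat(
    row: Optional[AttachmentRow],
    tenant_code: str,
    user_id: str,
    conversation_id: str,
    now: Optional[datetime] = None,
) -> AttachmentRow:
    """Raise unless the row is owned by the caller, live, and fully indexed."""
    now = now or datetime.now(timezone.utc)
    # Missing and soft-deleted rows look identical to the caller.
    if row is None or row.deleted_at is not None:
        raise AttachmentAccessError("not_found")
    # A scope mismatch is reported like a missing row so ids cannot be probed.
    if (row.tenant_code, row.user_id, row.conversation_id) != (
        tenant_code,
        user_id,
        conversation_id,
    ):
        raise AttachmentAccessError("not_found")
    # Mirrors `expires_at > NOW()`: equal timestamps count as expired.
    if row.expires_at <= now:
        raise AttachmentAccessError("expired")
    if row.indexing_status != "completed":
        raise AttachmentAccessError("not_ready")
    return row
```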
The temporary collection name must include all isolation dimensions:
```text
chat_attachment_{tenantCode}_{userId}_{conversationHash}_{attachmentId}
```
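One way `BuildCollectionName` might produce this pattern is sketched below. The 12-character SHA-256 prefix for `conversationHash`, the allowed character set, and the 63-character cap are assumptions for illustration; check the naming rules of the Chroma version in use, since this sketch does not enforce all of them (for example, rules about the first and last character).

```python
import hashlib
import re

# Assumption: only these characters are kept; everything else becomes "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def _sanitize(value: str) -> str:
    return _UNSAFE.sub("_", str(value))


def build_collection_name(
    tenant_code: str,
    user_id: str,
    conversation_id: str,
    attachment_id: str,
    max_len: int = 63,  # assumed limit, not taken from the design doc
) -> str:
    """Build chat_attachment_{tenantCode}_{userId}_{conversationHash}_{attachmentId}."""
    conversation_hash = hashlib.sha256(conversation_id.encode("utf-8")).hexdigest()[:12]
    name = "chat_attachment_{}_{}_{}_{}".format(
        _sanitize(tenant_code),
        _sanitize(user_id),
        conversation_hash,
        _sanitize(attachment_id),
    )
    if len(name) > max_len:
        # Keep a readable prefix and append a digest of the full name so that
        # two long names sharing a prefix still map to different collections.
        digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:16]
        name = name[: max_len - 17] + "_" + digest
    return name
```

Sanitizing can map two different raw values to the same string (for example `a/b` and `a_b`), so the name alone should not be treated as proof of ownership; the database checks and chunk metadata remain the backstop.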
Chunk metadata also carries:
- `tenant_code`
- `user_id`
- `conversation_id`
- `attachment_id`
- `source_scope = "chat_attachment"`
- `document_name`
- `page`
- `chunk_index`
This creates defense in depth: database filtering protects attachment ownership, and vector metadata protects retrieval integrity if collection access is ever broadened.
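To make the metadata contract concrete, here is a hedged sketch of how `BuildChunks` could stamp every chunk with the isolation fields. The character-based splitting, the `chunk_size` and `overlap` defaults, and the `{"text", "metadata"}` dictionary shape are assumptions; the real splitter may be token- or sentence-based.

```python
from typing import Dict, List


def build_chunks(
    pages: List[str],
    scope: Dict[str, str],
    document_name: str,
    chunk_size: int = 500,
    overlap: int = 50,
) -> List[Dict]:
    """Split page texts into overlapping chunks carrying isolation metadata.

    `scope` is expected to hold tenant_code, user_id, conversation_id,
    and attachment_id.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[Dict] = []
    chunk_index = 0  # runs across the whole document, not per page
    for page_number, page_text in enumerate(pages, start=1):
        text = page_text.strip()
        for start in range(0, len(text), step):
            chunks.append(
                {
                    "text": text[start : start + chunk_size],
                    "metadata": {
                        **scope,
                        "source_scope": "chat_attachment",
                        "document_name": document_name,
                        "page": page_number,
                        "chunk_index": chunk_index,
                    },
                }
            )
            chunk_index += 1
            # Stop once this chunk reached the end of the page, so no
            # redundant tail chunk made only of overlap is emitted.
            if start + chunk_size >= len(text):
                break
    return chunks
```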
## Supported Files
Initial backend support:
- Text: `.txt`, `.md`, `.json`, `.csv`
- Word: `.docx`
- PDF: `.pdf`
- Excel: `.xlsx` using `openpyxl` when available, with a clear 400 error if the dependency is missing
- Images: `.png`, `.jpg`, `.jpeg`, `.webp`, `.bmp`, `.tif`, `.tiff`
Images are parsed through the existing LeAudit OCR bridge. If OCR returns structured pages, the service converts pages to text. If only a raw dictionary is returned, it extracts common fields such as `full_text`, `text`, `content`, `pages`, and OCR line text.
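The raw-dictionary fallback could look roughly like the following. The actual response shape of the LeAudit OCR bridge is not specified here, so the nesting handled below (a `pages` list whose items may hold `lines`) and the `lines` key itself are assumptions.

```python
from typing import Any

# Keys named in the design, checked in this order.
_TEXT_KEYS = ("full_text", "text", "content")
# "lines" is an assumed key for OCR line items.
_CONTAINER_KEYS = ("pages", "lines")


def extract_ocr_text(result: Any) -> str:
    """Best-effort text extraction from a loosely structured OCR result."""
    if isinstance(result, str):
        return result.strip()
    if isinstance(result, list):
        parts = [extract_ocr_text(item) for item in result]
        return "\n".join(part for part in parts if part)
    if isinstance(result, dict):
        # Prefer a direct text field when one is present and non-empty.
        for key in _TEXT_KEYS:
            value = result.get(key)
            if isinstance(value, str) and value.strip():
                return value.strip()
        # Otherwise descend into page or line containers.
        for key in _CONTAINER_KEYS:
            if key in result:
                text = extract_ocr_text(result[key])
                if text:
                    return text
    return ""
```

An empty return value corresponds to the "parser found no text" case, which should move the attachment to `error` rather than index an empty document.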
## Lifecycle
Upload flow:
1. Frontend posts multipart file with `conversation_id` and optional `app_id`.
2. Backend validates chat permission and conversation ownership.
3. Backend creates `rag_chat_attachment` with `indexing_status = 'waiting'` and `expires_at = now + 7 days`.
4. Backend stores the original file in OSS under a temporary chat path.
5. Backend starts async indexing:
   - `parsing`
   - `splitting`
   - `indexing`
   - `completed`
   - or `error`
6. Frontend polls attachment status and enables send only after `completed`.
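The indexing statuses in this flow form a small state machine, sketched below. The transition table is inferred from the order listed above; `run_indexing`, its callback signature, and the 200-character error cap are hypothetical. The real job would run asynchronously and persist each status to `rag_chat_attachment`.

```python
from typing import Callable, Dict, List, Optional, Set

# Every non-terminal status may also fall through to "error".
ALLOWED: Dict[str, Set[str]] = {
    "waiting": {"parsing", "error"},
    "parsing": {"splitting", "error"},
    "splitting": {"indexing", "error"},
    "indexing": {"completed", "error"},
    "completed": set(),
    "error": set(),
}


def advance(current: str, new: str) -> str:
    """Return `new` if the transition is legal, otherwise raise."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


def run_indexing(
    parse: Callable[[], List[str]],
    split: Callable[[List[str]], List[str]],
    index: Callable[[List[str]], None],
    on_status: Callable[[str, Optional[str]], None],
) -> str:
    """Run parse -> split -> index, reporting each status via on_status."""
    status = "waiting"
    try:
        status = advance(status, "parsing")
        on_status(status, None)
        pages = parse()
        status = advance(status, "splitting")
        on_status(status, None)
        chunks = split(pages)
        if not chunks:
            raise ValueError("no text extracted")
        status = advance(status, "indexing")
        on_status(status, None)
        index(chunks)
        status = advance(status, "completed")
    except Exception as exc:
        # Cap the message so it stays displayable in the chat UI.
        on_status("error", str(exc)[:200])
        return "error"
    on_status(status, None)
    return status
```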
Deletion flow:
- User can remove the attachment from the chat UI.
- Backend soft-deletes the row.
- Backend attempts to delete the temporary Chroma collection and OSS object.
- Deletion failures in Chroma/OSS do not block the user-facing delete result.
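A minimal sketch of this "soft delete must succeed, artifact cleanup is best-effort" rule follows. The callables are placeholders for the real database update, Chroma collection delete, and OSS object delete; `delete_attachment` and its return value are hypothetical.

```python
import logging
from typing import Callable, List, Sequence, Tuple

logger = logging.getLogger(__name__)


def delete_attachment(
    soft_delete_row: Callable[[], None],
    cleanup_steps: Sequence[Tuple[str, Callable[[], None]]],
) -> List[str]:
    """Soft-delete the row, then try each cleanup step without failing the call.

    Returns the names of cleanup steps that failed, so a later sweep can retry.
    """
    # The soft delete is the user-facing result; let its errors propagate.
    soft_delete_row()
    failed: List[str] = []
    for name, step in cleanup_steps:
        try:
            step()
        except Exception:
            # Log and continue: a failure here must not block the delete.
            logger.warning("best-effort cleanup failed: %s", name, exc_info=True)
            failed.append(name)
    return failed
```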
TTL cleanup:
- `expires_at` defaults to 7 days after upload.
- Retrieval rejects expired attachments immediately.
- A cleanup method soft-deletes expired records and best-effort deletes Chroma collections and OSS objects.
- The cleanup method can later be wired to an existing scheduler if one exists.
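The TTL rules can be illustrated with a few pure functions. Rows are plain dictionaries here for brevity; in the service this selection would more likely be a SQL query on `expires_at` and `deleted_at`. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

DEFAULT_TTL = timedelta(days=7)


def compute_expires_at(created_at: datetime, ttl: timedelta = DEFAULT_TTL) -> datetime:
    return created_at + ttl


def is_expired(expires_at: datetime, now: datetime) -> bool:
    # Inverse of the `expires_at > NOW()` validity check.
    return expires_at <= now


def select_expired(rows: List[Dict], now: datetime) -> List[Dict]:
    """Pick rows a cleanup run should soft-delete: expired and not yet deleted."""
    return [
        row
        for row in rows
        if row.get("deleted_at") is None and is_expired(row["expires_at"], now)
    ]
```

Using timezone-aware timestamps throughout (for example `datetime.now(timezone.utc)`) avoids comparing naive and aware values, which raises a `TypeError` in Python.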
## API Shape
Backend routes under `/api/v3/rag`:
- `POST /chat/attachments`
  - multipart: `file`, `conversation_id`, optional `app_id`
  - returns attachment id, filename, status, expires timestamp, and collection name
- `GET /chat/attachments/{AttachmentId}`
  - returns status and metadata after ownership validation
- `DELETE /chat/attachments/{AttachmentId}`
  - soft deletes and best-effort removes temporary artifacts
- `POST /chat/messages`
  - extends existing body with optional `attachmentId`
Frontend routes:
- `POST /api/chat-attachments`
- `GET /api/chat-attachments/[attachmentId]`
- `DELETE /api/chat-attachments/[attachmentId]`
- The existing `/api/chat-messages` route forwards the attachment id (`attachment_id` / `attachmentId`) to FastAPI.
## Chat Generation Contract
The generator must receive context chunks with an explicit `source_scope`:
- `chat_attachment`: facts extracted from uploaded file
- `formal_kb`: laws, penalties, and authoritative references from the configured app knowledge base
Prompt construction must tell the model:
- Uploaded attachment chunks are factual input from the user's file.
- Formal knowledge-base chunks are the source for legal rules and penalties.
- Do not invent laws, penalties, or file facts that are absent from the provided contexts.
Sources returned to the UI must preserve source type so users can tell whether a cited segment came from the uploaded file or the formal knowledge base.
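As an illustration of this contract, the sketch below groups chunks by `source_scope` before rendering the prompt. The section titles and instruction wording are placeholders, not the prompt used in `generator.py`, and `build_grounded_prompt` is a hypothetical name.

```python
from typing import Dict, List


def _format_section(title: str, chunks: List[Dict]) -> str:
    if not chunks:
        return f"{title}\n(none provided)"
    lines = [f"[{i}] {chunk['text']}" for i, chunk in enumerate(chunks, start=1)]
    return title + "\n" + "\n".join(lines)


def build_grounded_prompt(question: str, chunks: List[Dict]) -> str:
    """Render uploaded-file facts and formal KB basis as separate sections."""
    facts = [c for c in chunks if c.get("source_scope") == "chat_attachment"]
    legal = [c for c in chunks if c.get("source_scope") == "formal_kb"]
    return "\n\n".join(
        [
            "Answer using only the contexts below. Do not invent laws, "
            "penalties, or file facts that are absent from them.",
            _format_section("Facts from the uploaded file:", facts),
            _format_section("Legal basis from the formal knowledge base:", legal),
            f"Question: {question}",
        ]
    )
```

Chunks with an unknown or missing `source_scope` are dropped in this sketch; whether to drop or reject them is a design decision the plan leaves open.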
## Error Handling
- Empty file: 400
- Unsupported type: 400
- Attachment from another user, tenant, or conversation: 403 or 404
- Expired attachment: 410-like business error using the existing exception mechanism
- Attachment not completed when sending: 400
- Parser found no text: status `error`, send disabled in UI
- OCR/index failures: status `error`, error message capped for display
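The upload-time checks (empty file, unsupported type) and the capped error message could be sketched as follows. The extension set is taken from the supported-files list in this design; `UploadRejected`, `validate_upload`, `cap_error_message`, and the 200-character limit are hypothetical, and the real service would raise through the project's existing exception mechanism.

```python
import os

SUPPORTED_EXTENSIONS = frozenset(
    {
        ".txt", ".md", ".json", ".csv",
        ".docx", ".pdf", ".xlsx",
        ".png", ".jpg", ".jpeg", ".webp", ".bmp", ".tif", ".tiff",
    }
)


class UploadRejected(Exception):
    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the normalized extension, or raise a 400-style rejection."""
    if size_bytes <= 0:
        raise UploadRejected(400, "empty file")
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise UploadRejected(400, f"unsupported file type: {extension or 'none'}")
    return extension


def cap_error_message(message: str, limit: int = 200) -> str:
    """Trim long parser/OCR errors before storing them for display."""
    if len(message) <= limit:
        return message
    return message[: limit - 3] + "..."
```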
## Testing Requirements
Backend tests must cover:
- Default `expires_at` is 7 days after creation.
- Collection names include sanitized tenant/user/conversation/attachment isolation fields.
- Scope validation rejects mismatched tenant, user, or conversation.
- Expired or non-completed attachments cannot be used for chat retrieval.
- Built chunk metadata contains tenant/user/conversation/attachment isolation fields.
- Chat task merges temporary attachment chunks and formal KB chunks, preserving source metadata.
Frontend tests are optional in the first slice if the project does not already have focused chat component tests, but the implementation must keep the UI constrained to one selected attachment.