leaudit-platform-backend/docs/superpowers/specs/2026-05-25-chat-temporary-rag-attachments-design.md
2026-05-25 15:37:37 +08:00

Chat Temporary RAG Attachments Design

Goal

Implement in-chat file upload for /chat-with-llm/chat so a user can upload one document or image, have it parsed and indexed into a temporary conversation-scoped RAG collection, and then ask questions that combine facts from the uploaded file with legal knowledge from the existing app knowledge base. Temporary attachment indexes expire after 7 days by default.

Core Behavior

The upload belongs to the chat experience, not the permanent knowledge-base management UI. A file uploaded inside a conversation creates a temporary attachment record and a temporary vector collection. Chat messages may reference that attachment by attachmentId.

The answer flow is dual-source RAG:

  1. Retrieve from the temporary attachment collection to establish facts from the uploaded file.
  2. Build a grounded legal search query from the user question plus attachment hits.
  3. Retrieve from the existing formal app knowledge base to find laws, penalties, cases, or policy rules.
  4. Generate a single answer that clearly uses uploaded-file content as facts and formal knowledge-base content as legal basis.

If no attachment is selected, chat continues using the existing formal knowledge-base retrieval path.
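The retrieval half of this flow (steps 1 to 3, plus the no-attachment fallback) can be sketched as below. This is a minimal illustration, not the service's actual code: `Chunk`, `Retriever`, `build_legal_query`, and `retrieve_context` are hypothetical names, and the way the legal query is assembled (question plus truncated attachment text) is only one possible strategy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Chunk:
    text: str
    source_scope: str  # "chat_attachment" or "formal_kb"


# Hypothetical retriever shape: takes a query string, returns matching chunk texts.
Retriever = Callable[[str], List[str]]


def build_legal_query(question: str, attachment_hits: List[str], max_chars: int = 500) -> str:
    """Ground the legal search in the question plus facts found in the file."""
    facts = " ".join(attachment_hits)[:max_chars]
    return f"{question}\n{facts}".strip()


def retrieve_context(
    question: str,
    formal_kb: Retriever,
    attachment: Optional[Retriever] = None,
) -> List[Chunk]:
    """Return labelled chunks; generation (step 4) consumes this list."""
    if attachment is None:
        # No attachment selected: existing formal knowledge-base path only.
        return [Chunk(t, "formal_kb") for t in formal_kb(question)]

    hits = attachment(question)                      # step 1: facts from the file
    legal_query = build_legal_query(question, hits)  # step 2: grounded query
    legal = formal_kb(legal_query)                   # step 3: legal basis
    return [Chunk(t, "chat_attachment") for t in hits] + [
        Chunk(t, "formal_kb") for t in legal
    ]
```

Keeping `source_scope` on every chunk from the moment of retrieval means the generator and the UI never have to guess where a passage came from.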

Scope And Isolation

Temporary knowledge must never leak between conversations or users. The backend must never trust a frontend-supplied attachmentId on its own. Every attachment operation validates:

  • tenant_code or resolved tenant context
  • user_id
  • conversation_id
  • attachment_id
  • deleted_at IS NULL
  • expires_at > NOW()
  • indexing_status = 'completed' for chat retrieval
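A minimal sketch of the checks listed above, assuming the row has already been loaded by attachment_id. `AttachmentRow`, `AttachmentAccessError`, and `assert_usable_for_chat` are hypothetical names; in the real service the same conditions would more likely live in the SQL WHERE clause, with this function acting as a second guard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class AttachmentRow:
    tenant_code: str
    user_id: str
    conversation_id: str
    attachment_id: str
    indexing_status: str
    expires_at: datetime
    deleted_at: Optional[datetime] = None


class AttachmentAccessError(Exception):
    """Hypothetical error type; `reason` names the check that failed."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def assert_usable_for_chat(
    row: AttachmentRow,
    *,
    tenant_code: str,
    user_id: str,
    conversation_id: str,
    attachment_id: str,
    now: Optional[datetime] = None,
) -> None:
    """Raise unless the row passes every isolation and lifecycle check."""
    now = now or datetime.now(timezone.utc)
    expected = (tenant_code, user_id, conversation_id, attachment_id)
    actual = (row.tenant_code, row.user_id, row.conversation_id, row.attachment_id)
    if actual != expected:
        raise AttachmentAccessError("scope_mismatch")
    if row.deleted_at is not None:
        raise AttachmentAccessError("deleted")
    if row.expires_at <= now:  # mirrors expires_at > NOW()
        raise AttachmentAccessError("expired")
    if row.indexing_status != "completed":  # required only for chat retrieval
        raise AttachmentAccessError("not_completed")
```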

The temporary collection name must include all isolation dimensions:

chat_attachment_{tenantCode}_{userId}_{conversationHash}_{attachmentId}
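One way to build a name in this shape is sketched below. The spec does not say how conversationHash is derived or how fields are sanitized, so both are assumptions here: a truncated SHA-256 of the conversation id, and replacement of every character outside letters, digits, and hyphens (including underscores, so the `_` separators stay unambiguous). The helper names are hypothetical, and the sketch does not check any name-length or character limits the vector store may impose.

```python
import hashlib
import re


def _sanitize(value: str) -> str:
    # Assumption: anything outside [A-Za-z0-9-] becomes "-", so a field can
    # never contain the "_" used as the separator between fields.
    return re.sub(r"[^A-Za-z0-9-]", "-", str(value))


def conversation_hash(conversation_id: str, length: int = 12) -> str:
    # Assumption: a truncated SHA-256 hex digest stands in for conversationHash.
    return hashlib.sha256(conversation_id.encode("utf-8")).hexdigest()[:length]


def build_collection_name(
    tenant_code: str, user_id: str, conversation_id: str, attachment_id: str
) -> str:
    return "chat_attachment_{}_{}_{}_{}".format(
        _sanitize(tenant_code),
        _sanitize(user_id),
        conversation_hash(conversation_id),
        _sanitize(attachment_id),
    )
```

The function is deterministic, so the backend can recompute the expected name from validated scope fields instead of trusting a name sent by the client.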

Chunk metadata also carries:

  • tenant_code
  • user_id
  • conversation_id
  • attachment_id
  • source_scope = "chat_attachment"
  • document_name
  • page
  • chunk_index

This creates defense in depth: database filtering protects attachment ownership, and vector metadata protects retrieval integrity if collection access is ever broadened.

Supported Files

Initial backend support:

  • Text: .txt, .md, .json, .csv
  • Word: .docx
  • PDF: .pdf
  • Excel: .xlsx using openpyxl when available, with a clear 400 error if the dependency is missing
  • Images: .png, .jpg, .jpeg, .webp, .bmp, .tif, .tiff
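The extension list above can be turned into a small dispatch table. This is a sketch only: `pick_parser`, `UploadRejected`, and the parser kind labels are hypothetical, and the openpyxl check uses the standard-library `importlib.util.find_spec` to detect the optional dependency without importing it.

```python
import importlib.util
from pathlib import PurePosixPath

PARSER_BY_EXTENSION = {
    ".txt": "text", ".md": "text", ".json": "text", ".csv": "text",
    ".docx": "word",
    ".pdf": "pdf",
    ".xlsx": "excel",
    ".png": "image", ".jpg": "image", ".jpeg": "image", ".webp": "image",
    ".bmp": "image", ".tif": "image", ".tiff": "image",
}


class UploadRejected(Exception):
    """Hypothetical error type that the API layer would map to HTTP 400."""

    status_code = 400


def pick_parser(filename: str, size_bytes: int) -> str:
    """Return the parser kind for an upload, or raise UploadRejected."""
    if size_bytes <= 0:
        raise UploadRejected("empty file")
    ext = PurePosixPath(filename).suffix.lower()
    kind = PARSER_BY_EXTENSION.get(ext)
    if kind is None:
        raise UploadRejected(f"unsupported file type: {ext or '(none)'}")
    if kind == "excel" and importlib.util.find_spec("openpyxl") is None:
        raise UploadRejected("xlsx support requires openpyxl, which is not installed")
    return kind
```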

Images are parsed through the existing LeAudit OCR bridge. If OCR returns structured pages, the service converts pages to text. If only a raw dictionary is returned, it extracts common fields such as full_text, text, content, pages, and OCR line text.
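The raw-dictionary fallback might look like the sketch below. The keys full_text, text, content, and pages come from the description above; the `lines` key used for OCR line text is an assumption, as is the function name `extract_ocr_text`. The actual OCR bridge payload may be shaped differently.

```python
from typing import Any

TEXT_KEYS = ("full_text", "text", "content")
NESTED_KEYS = ("pages", "lines")  # "lines" is an assumed key for OCR line text


def extract_ocr_text(payload: Any) -> str:
    """Best-effort text extraction from a loosely structured OCR result."""
    if isinstance(payload, str):
        return payload.strip()
    if isinstance(payload, list):
        parts = [extract_ocr_text(item) for item in payload]
        return "\n".join(part for part in parts if part)
    if isinstance(payload, dict):
        # Prefer a ready-made text field when one is present and non-empty.
        for key in TEXT_KEYS:
            value = payload.get(key)
            if isinstance(value, str) and value.strip():
                return value.strip()
        # Otherwise descend into page or line containers.
        for key in NESTED_KEYS:
            if key in payload:
                text = extract_ocr_text(payload[key])
                if text:
                    return text
    return ""
```

An empty return value corresponds to the "parser found no text" case, which the upload flow treats as an indexing error.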

Lifecycle

Upload flow:

  1. Frontend posts multipart file with conversation_id and optional app_id.
  2. Backend validates chat permission and conversation ownership.
  3. Backend creates rag_chat_attachment with indexing_status = 'waiting' and expires_at = now + 7 days.
  4. Backend stores the original file in OSS under a temporary chat path.
  5. Backend starts async indexing, moving indexing_status through:
    • parsing
    • splitting
    • indexing
    • completed
    • or error, if any step fails
  6. Frontend polls attachment status and enables send only once the status is completed.
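The status progression in step 5 can be modelled as a small transition table. This sketch assumes the states advance strictly in the listed order and that any non-terminal state may fall into error; `ALLOWED_TRANSITIONS`, `advance`, and `can_send` are hypothetical names.

```python
ALLOWED_TRANSITIONS = {
    "waiting": {"parsing", "error"},
    "parsing": {"splitting", "error"},
    "splitting": {"indexing", "error"},
    "indexing": {"completed", "error"},
    "completed": set(),  # terminal
    "error": set(),      # terminal
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal status transition: {current} -> {new}")
    return new


def can_send(status: str) -> bool:
    """The chat UI enables send only for fully indexed attachments."""
    return status == "completed"
```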

Deletion flow:

  • User can remove the attachment from the chat UI.
  • Backend soft-deletes the row.
  • Backend attempts to delete the temporary Chroma collection and OSS object.
  • Deletion failures in Chroma/OSS do not block the user-facing delete result.

TTL cleanup:

  • expires_at defaults to 7 days after upload.
  • Retrieval rejects expired attachments immediately.
  • A cleanup method soft-deletes expired records and best-effort deletes Chroma collections and OSS objects.
  • The cleanup method can later be wired to an existing scheduler if one exists.
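A minimal sketch of such a cleanup method, working on in-memory rows for illustration. `Attachment`, `cleanup_expired`, and the two delete callbacks are hypothetical; a real implementation would select expired rows from the database and call the actual Chroma and OSS clients.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional

DEFAULT_TTL = timedelta(days=7)


def default_expires_at(now: datetime) -> datetime:
    return now + DEFAULT_TTL


@dataclass
class Attachment:
    attachment_id: str
    collection_name: str
    oss_key: str
    expires_at: datetime
    deleted_at: Optional[datetime] = None


def cleanup_expired(
    rows: Iterable[Attachment],
    delete_collection: Callable[[str], None],
    delete_object: Callable[[str], None],
    now: Optional[datetime] = None,
) -> List[str]:
    """Soft-delete expired rows, then best-effort remove their artifacts."""
    now = now or datetime.now(timezone.utc)
    cleaned = []
    for row in rows:
        if row.deleted_at is not None or row.expires_at > now:
            continue
        row.deleted_at = now  # soft delete first, so retrieval stops immediately
        for action, target in (
            (delete_collection, row.collection_name),
            (delete_object, row.oss_key),
        ):
            try:
                action(target)
            except Exception:
                # Best effort: an artifact failure must not undo the soft delete.
                pass
        cleaned.append(row.attachment_id)
    return cleaned
```

Soft-deleting before touching external stores keeps the database as the source of truth: even if Chroma or OSS is unreachable, the attachment is already unusable for retrieval.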

API Shape

Backend routes under /api/v3/rag:

  • POST /chat/attachments
    • multipart: file, conversation_id, optional app_id
    • returns attachment id, filename, status, expires timestamp, and collection name
  • GET /chat/attachments/{attachmentId}
    • returns status and metadata after ownership validation
  • DELETE /chat/attachments/{attachmentId}
    • soft deletes and best-effort removes temporary artifacts
  • POST /chat/messages
    • extends existing body with optional attachmentId

Frontend routes:

  • POST /api/chat-attachments
  • GET /api/chat-attachments/[attachmentId]
  • DELETE /api/chat-attachments/[attachmentId]
  • Existing /api/chat-messages forwards attachment_id / attachmentId.

Chat Generation Contract

The generator must receive context chunks with an explicit source_scope:

  • chat_attachment: facts extracted from uploaded file
  • formal_kb: laws, penalties, and authoritative references from the configured app knowledge base

Prompt construction must tell the model:

  • Uploaded attachment chunks are factual input from the user's file.
  • Formal knowledge-base chunks are the source for legal rules and penalties.
  • Do not invent laws, penalties, or file facts that are absent from the provided contexts.

Sources returned to the UI must preserve source type so users can tell whether a cited segment came from the uploaded file or the formal knowledge base.
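A prompt builder that follows this contract could look like the sketch below. The wording of the instructions and section titles is illustrative only, and `build_prompt` and the `[scope#n]` citation tags are hypothetical conventions, not the existing generator's format.

```python
from typing import Dict, List

# Section order is fixed: file facts first, then legal basis.
SECTION_TITLES = {
    "chat_attachment": "Facts from the user's uploaded file",
    "formal_kb": "Legal rules and penalties from the knowledge base",
}


def build_prompt(question: str, chunks: List[Dict[str, str]]) -> str:
    """Group chunks by source_scope and tag each one so citations keep their origin."""
    lines = [
        "Treat uploaded-file excerpts as facts and knowledge-base excerpts as legal basis.",
        "Do not invent laws, penalties, or file facts that are absent from the excerpts.",
    ]
    for scope, title in SECTION_TITLES.items():
        scoped = [c for c in chunks if c["source_scope"] == scope]
        if not scoped:
            continue
        lines.append("")
        lines.append(f"## {title}")
        for i, chunk in enumerate(scoped, start=1):
            lines.append(f"[{scope}#{i}] {chunk['text']}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Because each excerpt carries its scope in the tag, a cited `[chat_attachment#1]` or `[formal_kb#2]` can be mapped straight back to a typed source entry for the UI.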

Error Handling

  • Empty file: 400
  • Unsupported type: 400
  • Attachment from another user, tenant, or conversation: 403 or 404
  • Expired attachment: a business error analogous to HTTP 410 Gone, raised through the existing exception mechanism
  • Attachment not completed when sending: 400
  • Parser found no text: indexing_status becomes error, and send stays disabled in the UI
  • OCR/index failures: indexing_status becomes error, with the error message capped for display
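The request-time errors above could be centralized in one lookup, with a helper for capping stored error messages. Everything here is an assumption for illustration: the reason keys, the choice of 404 over 403 for scope mismatches, the use of 410 for expiry, and the 200-character cap are not fixed by this design.

```python
ERROR_STATUS = {
    "empty_file": 400,
    "unsupported_type": 400,
    # The spec allows 403 or 404; 404 is assumed here because it avoids
    # confirming that someone else's attachment exists.
    "scope_mismatch": 404,
    "expired": 410,  # assumed code for the 410-like business error
    "not_completed": 400,
}


def cap_error_message(message: str, limit: int = 200) -> str:
    """Collapse whitespace and truncate so the UI never shows a full traceback."""
    message = " ".join(message.split())
    if len(message) <= limit:
        return message
    return message[: limit - 3] + "..."
```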

Testing Requirements

Backend tests must cover:

  • Default expires_at is 7 days after creation.
  • Collection names include sanitized tenant/user/conversation/attachment isolation fields.
  • Scope validation rejects mismatched tenant, user, or conversation.
  • Expired or non-completed attachments cannot be used for chat retrieval.
  • Built chunk metadata contains tenant/user/conversation/attachment isolation fields.
  • Chat task merges temporary attachment chunks and formal KB chunks, preserving source metadata.

Frontend tests are optional in the first slice if the project does not already have focused chat component tests, but the implementation must keep the UI constrained to one selected attachment.