FUTURE SCOPE
Organization Context Enrichment
This feature is out of scope for the current Workerline Enhancements sprint. It is documented here as a standalone specification for future implementation.
Overview
Allow organizations to upload documents (PDFs, text files, policy manuals) that are automatically processed and injected as context into AI conversations for that organization's sessions. This enriches the AI's responses with org-specific knowledge — company policies, vessel procedures, benefit details, etc.
How It Works
Org Instructions (Existing)
The survey_instructions table already exists in the schema. Enhancement: improved dashboard UI with template suggestions and preview of how instructions affect AI behavior.
Org-Specific Documents (New)
Upload PDFs/text files → extract text → store → concatenate into the AI system prompt for that org's sessions (simple text injection, not semantic retrieval).
```mermaid
sequenceDiagram
    participant Admin as 🖥️ Admin
    participant API as ⚡ scb-api
    participant DB as 💾 SQLite
    Admin->>API: Upload PDF
    API->>API: Extract + chunk text
    API->>DB: Store org_document
    Note over API,DB: During conversations...
    participant W as 🚢 Worker
    participant OAI as 🤖 OpenAI
    W->>API: Chat message
    API->>DB: Load org_documents
    API->>API: Build prompt + doc excerpts
    API->>OAI: Chat completion (with context)
    OAI-->>W: Streaming response
```
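A minimal sketch of the upload-time extraction step in the flow above, assuming a Node/TypeScript API and the pdf-parse library named in the implementation notes; the function name and file-type handling are illustrative, not the actual scb-api code.

```typescript
// Sketch only: extract plain text from an uploaded file buffer.
// Assumes pdf-parse for PDFs; plain-text files are decoded directly.
import pdfParse from "pdf-parse";

export async function extractText(
  buffer: Buffer,
  fileType: "pdf" | "text"
): Promise<string> {
  if (fileType === "pdf") {
    const parsed = await pdfParse(buffer); // resolves to { text, numpages, ... }
    return parsed.text;
  }
  return buffer.toString("utf-8");
}
```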
Data Model
```mermaid
erDiagram
    orgs ||--o{ org_documents : has
    org_documents {
        text id PK
        text org_id FK
        text name
        text file_type
        text extracted_text
        text created_at
    }
```
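One possible SQLite shape for the table above, sketched as a better-sqlite3 migration; the driver choice, database file name, and default expressions are assumptions rather than part of the spec.

```typescript
// Sketch of org_documents from the ER diagram above.
// better-sqlite3 is an assumed driver; the spec only specifies SQLite.
import Database from "better-sqlite3";

const db = new Database("scb.db"); // hypothetical database file name

db.exec(`
  CREATE TABLE IF NOT EXISTS org_documents (
    id             TEXT PRIMARY KEY,
    org_id         TEXT NOT NULL REFERENCES orgs(id),
    name           TEXT NOT NULL,
    file_type      TEXT NOT NULL,          -- 'pdf' | 'text'
    extracted_text TEXT NOT NULL,
    created_at     TEXT NOT NULL DEFAULT (datetime('now'))
  );
`);
```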
API Routes
| Method | Route | Auth | Description |
| --- | --- | --- | --- |
| POST | /org/:id/documents | Admin | Upload org document (PDF/text) |
| GET | /org/:id/documents | Admin | List org documents |
| DELETE | /org/:id/documents/:docId | Admin | Delete org document |
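A sketch of how the upload route could be wired, assuming an Express-style router with multer for multipart uploads; the framework choice, import paths, and the extractText / insertOrgDocument helpers are assumptions for illustration. Only verifyOrgAccess is named by the spec.

```typescript
// Illustrative route wiring only; everything except verifyOrgAccess is assumed.
import express from "express";
import multer from "multer";
import { randomUUID } from "node:crypto";
import { extractText } from "./extract";       // hypothetical module from the sketch above
import { verifyOrgAccess } from "./auth";      // existing middleware (assumed import path)
import { insertOrgDocument } from "./db";      // hypothetical DB helper

const upload = multer({ storage: multer.memoryStorage() });
const router = express.Router();

router.post(
  "/org/:id/documents",
  verifyOrgAccess,
  upload.single("file"),
  async (req, res) => {
    const fileType = req.file!.mimetype === "application/pdf" ? "pdf" : "text";
    const extractedText = await extractText(req.file!.buffer, fileType);

    const doc = {
      id: randomUUID(),
      org_id: req.params.id,
      name: req.file!.originalname,
      file_type: fileType,
      extracted_text: extractedText,
    };
    insertOrgDocument(doc);

    res.status(201).json({ id: doc.id, name: doc.name });
  }
);
```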
Implementation Notes
- PDF text extraction via pdf-parse or a similar lightweight library
- Simple text injection into the system prompt (no vector DB / RAG for MVP); see the sketch after this list
- Character limit per org (~8K chars) to prevent prompt overflow
- Dashboard UI: upload widget, list with delete, preview of extracted text
- Access scoped by org via the existing verifyOrgAccess middleware
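A minimal sketch of the simple-concatenation approach, assuming the official openai Node SDK; the 8K character cap, excerpt formatting, and function names are illustrative assumptions, not the production pipeline.

```typescript
// Sketch: concatenate extracted document text into the system prompt,
// capped per org (limit value and cutoff strategy are assumptions).
import OpenAI from "openai";

const ORG_CONTEXT_CHAR_LIMIT = 8_000;

type OrgDoc = { name: string; extracted_text: string };

function buildOrgContext(docs: OrgDoc[]): string {
  let context = "";
  for (const doc of docs) {
    const excerpt = `\n\n[${doc.name}]\n${doc.extracted_text}`;
    if (context.length + excerpt.length > ORG_CONTEXT_CHAR_LIMIT) break; // plain cutoff, no ranking
    context += excerpt;
  }
  return context;
}

// Hypothetical call site inside the existing chat pipeline:
async function chatWithOrgContext(
  openai: OpenAI,
  basePrompt: string,
  docs: OrgDoc[],
  userMessage: string
) {
  return openai.chat.completions.create({
    model: "gpt-4", // per the spec's GPT-4 mention
    stream: true,
    messages: [
      { role: "system", content: basePrompt + buildOrgContext(docs) },
      { role: "user", content: userMessage },
    ],
  });
}
```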
Future enhancement: If document volume grows, consider vector embeddings + similarity search (RAG) for more precise context injection. For now, simple text concatenation is sufficient.
Foundation Analysis — What Already Exists
This feature has a small footprint because it builds almost entirely on existing infrastructure:
- survey_instructions table — already exists in the production schema (not shown in the main spec's simplified ERD) for org-level AI prompt customization. Context enrichment extends this pattern.
- OpenAI GPT-4 integration — existing chat completion pipeline already accepts system prompts. Document excerpts are injected as additional system context.
- verifyOrgAccess middleware — existing auth layer scopes all admin routes by org.
- Dashboard (scb) — existing admin UI patterns (CRUD lists, file upload) can be reused for the document management interface.
- File storage — R2/S3 integration (if used for learn content uploads in the current sprint) can be reused for document storage.
Key insight: This is primarily a data-pipeline + dashboard-UI task. No new external services, no new infrastructure patterns. The heaviest lift is the text extraction and the dashboard upload interface.
Estimated Cost
Organization Context Enrichment: $2,200 · 2–3 days · agentic coding workflow · builds entirely on existing infrastructure
Pricing note: This estimate assumes the same AI-driven development workflow used in the base sprint ($13,200 for 7 features / 2–3 weeks). Context Enrichment is a small, well-bounded task — 3 API routes, 1 table, PDF text extraction, and a basic dashboard upload UI. No new external services or infrastructure patterns.
What's Included
| Deliverable | Details |
| --- | --- |
| Document upload API | POST/GET/DELETE routes with org-scoped access. PDF + plain text support. |
| Text extraction pipeline | Automatic text extraction on upload via pdf-parse. Chunking to stay within prompt limits (~8K chars per org). |
| AI context injection | Org documents concatenated into the GPT-4 system prompt during worker conversations. Full text injected up to the ~8K character limit per org — no topic-based filtering in MVP (simple concatenation, not semantic search). |
| Dashboard UI | Upload widget, document list with preview of extracted text, delete confirmation. Integrated into existing org settings page. |
| Improved instructions UI | Better editing experience for existing survey_instructions with template suggestions. |
Ongoing Costs
- OpenAI: Marginal increase in prompt tokens (~$0.001–0.005 per conversation from injected context)
- Storage: Negligible (text files are small; PDFs stored in existing R2/S3)