AI Assistant & Tooling Constraints
This spec captures the operational, security, and behavioral constraints that every AI-agent feature must respect. The ai-service (see `ai-service/src/routes/chat.routes.ts` and `ai-service/src/index.ts`) is the single source of truth for conversations, tool execution, and policy generation.
1. Architecture & Observability
- Express-based host: The service runs on Express with `helmet`, `cors`, `morgan`, and JSON parsing enabled (`ai-service/src/index.ts`). Kong forwards `/api/ai` and `/api/agent` requests without stripping the path, so the router expects the full path.
- Tracing: OpenTelemetry is initialized via `ai-service/src/tracing.ts`, exporting to `PHOENIX_COLLECTOR_ENDPOINT` (default `http://phoenix:4317`). Every handler uses the `ai-service` tracer to annotate spans and push attributes for Phoenix.
- HTTP keep-alive: The OpenAI client (`OpenAI` from `openai` in `chat.routes.ts`) is constructed with shared `http.Agent`/`https.Agent` instances (`keepAlive`, `maxSockets`), a 60 s timeout, and `maxRetries: 2`.
2. Authentication & Access
- Preferred path: Kong injects `X-User-Id`/`X-User-Role` headers after validating JWTs. `authenticate` first trusts these headers, populates `req.user`, and logs the decision.
- Fallback: If the Kong headers are missing, the code falls back to verifying a bearer JWT with `JWT_SECRET` (or the development fallback `your-secret-key`), so any future feature must ensure tokens include `userId`/`sub` and `role`.
- Session invariants: Every route under `/api/agent` and `/api/ai` requires authentication; unauthenticated requests receive `401`.
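A dependency-free sketch of that two-step decision. The `makeAuthenticate` factory, the minimal request shape, and the numeric return value are illustrative; the real middleware uses Express types and a JWT library.

```typescript
// Minimal request shape so the sketch stays dependency-free; Express
// lowercases inbound header names, hence the lowercase keys below.
interface Req {
  headers: Record<string, string | undefined>;
  user?: { userId: string; role: string };
}

// Stand-in for jwt.verify; the real code verifies against JWT_SECRET.
type VerifyJwt = (token: string, secret: string) => { userId?: string; sub?: string; role: string };

export function makeAuthenticate(verify: VerifyJwt, secret: string) {
  // Returns the HTTP status the middleware would produce (200 = continue).
  return (req: Req): number => {
    // Preferred path: trust the headers Kong injected after JWT validation.
    const kongId = req.headers["x-user-id"];
    const kongRole = req.headers["x-user-role"];
    if (kongId && kongRole) {
      req.user = { userId: kongId, role: kongRole };
      return 200;
    }
    // Fallback: verify the bearer token ourselves.
    const auth = req.headers["authorization"];
    if (!auth?.startsWith("Bearer ")) return 401;
    try {
      const claims = verify(auth.slice("Bearer ".length), secret);
      const userId = claims.userId ?? claims.sub; // tokens must carry one of these
      if (!userId) return 401;
      req.user = { userId, role: claims.role };
      return 200;
    } catch {
      return 401; // invalid/expired token
    }
  };
}
```

Note the ordering: the Kong headers win when present, and the JWT path only runs as a fallback, matching the "preferred path" described above.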
3. Conversation & Chat Flow
- AI user anchor: Conversations always pair a human user with the synthetic `[email protected]` user (role `compliance`). `getOrCreateAiUser` ensures this user exists before recording messages.
- Endpoints:
  - `GET /api/agent/conversations` and `GET /api/agent/conversations/:id/messages` list conversations scoped to the authenticated user, excluding AI-only threads from the human-facing chat UI.
  - `POST /api/agent/chat` is the central handler. It builds a system prompt seeded with the user's role, current projects, and requested tools; persists the user message; executes tool calls when GPT requests them; logs responses; and appends assistant outputs to the conversation.
  - `POST /api/agent/preference` and `GET /api/agent/switch-view` let the UI persist dashboard view choices.
  - `POST /api/ai/policy-generate`, `GET /api/ai/templates`, and `GET /api/ai/user-context` orchestrate policy generation and onboarding context.
- Observability hooks: Every chat request wraps its logic in an OpenTelemetry span, adds `session.id`, `user.id`, and `llm.model_name`, attaches tool-execution events, and records token usage under `llm.token_count.*`.
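The span annotation can be sketched with a stand-in span interface. The attribute keys are the ones listed above; the helper name and the exact token-count plumbing are illustrative assumptions.

```typescript
// Stand-in for an OpenTelemetry Span; the real handlers obtain one from
// the `ai-service` tracer.
interface SpanLike {
  setAttribute(key: string, value: string | number): void;
}

export function annotateChatSpan(
  span: SpanLike,
  opts: { sessionId: string; userId: string; model: string; promptTokens: number; completionTokens: number },
): void {
  span.setAttribute("session.id", opts.sessionId);
  span.setAttribute("user.id", opts.userId);
  span.setAttribute("llm.model_name", opts.model);
  // Token usage lands under llm.token_count.* so Phoenix can aggregate it.
  span.setAttribute("llm.token_count.prompt", opts.promptTokens);
  span.setAttribute("llm.token_count.completion", opts.completionTokens);
  span.setAttribute("llm.token_count.total", opts.promptTokens + opts.completionTokens);
}
```

Keeping these attributes on one helper means every chat request emits the same attribute set, which is what makes per-session filtering in Phoenix reliable.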
4. Tool Contracts (`ai-service/src/tools`)
Toolkit functions must live in `ai-service/src/tools/implementations.ts`, return a `ToolResponse` with text and an optional widget, and never mutate shared state. The current registry exposes:
- Project management
  - `show_project_creation_form` serves dropdown data for frameworks.
  - `submit_project_creation` posts to `POST /api/customer/projects` and requires the confirmation code `WIDGET_VERIFIED_X9`.
- Evidence & controls
  - `upload_evidence_intent` renders an upload widget around a control.
  - `link_evidence` calls `/api/auditor/projects/:projectId/controls/:controlId/evidence/link` after injecting Kong headers.
- Compliance analytics
  - `generate_compliance_graph`, `get_unified_compliance_summary`, and `get_compliance_projection` consume backend endpoints (`/api/customer/dashboard`, `/api/compliance/summary`, `/api/compliance/projection`) depending on user role.
- Agent & risk telemetry
  - `get_agent_stats` reads `FLEET_SERVICE_URL/api/agents/stats`.
  - `get_risk_analysis` hits `/api/customer/risk/overview`.
- Search experiences
  - `search_cybersecurity_standards` uses `VECTOR_STORE_URL` to query a vector store.
  - `search_project_documents` fans out to `DOCUMENT_RAG_URL/search`.
  - `get_document_context` hits `DOCUMENT_RAG_URL/context` for chunk-level detail.
  - `select_projects_for_search` fetches `/api/{customer|auditor}/projects` to build selectors.
- Human-in-the-loop
  - `ask_confirmation` returns a button widget for asynchronous approvals.
Any new tool must declare JSON-schema parameters, handle `context.userId`/`context.userRole`, and produce user-level telemetry via the optional span.
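A hypothetical new tool following that contract might look like this. Only `ToolResponse`, the `context.userId`/`context.userRole` fields, and the optional span come from the spec; the tool name (`get_control_status`), schema layout, widget shape, and role check are illustrative assumptions.

```typescript
// Assumed shapes for the contract described above.
interface ToolResponse {
  text: string;
  widget?: { type: string; data: unknown };
}

interface ToolContext {
  userId: string;
  userRole: string;
  span?: { setAttribute(key: string, value: string): void };
}

// JSON-schema parameter declaration in the shape OpenAI-style tool
// calling expects.
export const getControlStatusSchema = {
  name: "get_control_status",
  description: "Summarise the status of a single compliance control",
  parameters: {
    type: "object",
    properties: {
      controlId: { type: "string", description: "Control to summarise" },
    },
    required: ["controlId"],
  },
};

export async function getControlStatus(
  args: { controlId: string },
  context: ToolContext,
): Promise<ToolResponse> {
  context.span?.setAttribute("user.id", context.userId); // user-level telemetry
  if (context.userRole !== "customer" && context.userRole !== "auditor") {
    return { text: "You do not have access to control status." };
  }
  // A real implementation would call the backend via axios with the Kong
  // headers; this sketch returns a canned response and mutates no shared state.
  return {
    text: `Status summary for control ${args.controlId}`,
    widget: { type: "control-status", data: { controlId: args.controlId } },
  };
}
```

Returning a plain `ToolResponse` (rather than throwing) on the access-denied path keeps the LLM loop simple: every tool call yields something stitchable back into the conversation.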
5. Policy & Template Generation
- Templates: Markdown policy files live under `ai-service/templates`. `/api/ai/policy-generate` reads the requested template, merges user `customizations`, and enriches it with context fetched from `GET /api/ai/user-context-internal` (backend service at `BACKEND_URL`).
- Platform guardrails: The policy prompt instructs the model to act as “Kimi” (Moonshot AI) and to refuse undesirable content. Responses are expected to be raw Markdown.
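The merge step might be sketched as below, assuming `{{placeholder}}`-style markers in the Markdown templates. The helper name and the placeholder syntax are assumptions; the spec does not state how the templates mark customization points.

```typescript
// Hypothetical merge helper: substitute each {{key}} marker with the
// matching user customization. Unknown markers are left intact so missing
// context is visible in the output rather than silently dropped.
export function renderPolicyTemplate(
  template: string,
  customizations: Record<string, string>,
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) => customizations[key] ?? match);
}
```

In the real flow this merged Markdown, plus the fetched user context, is what gets handed to the model along with the guardrail prompt.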
6. Deployment Configuration
- Feature flags & env vars:
  - `USE_AI_GATEWAY=true` routes requests via the Cloudflare AI Gateway (`/v1/.../compat`); otherwise the default Moonshot endpoint is used.
  - `MOONSHOT_API_KEY`, `GEMINI_API_KEY`, `GOOGLE_API_KEY`: whichever is provided becomes the active credential.
  - `MOONSHOT_MODEL`, `GEMINI_MODEL`: override the default `kimi-k2-turbo-preview`.
  - `BACKEND_URL`, `FLEET_SERVICE_URL`, `VECTOR_STORE_URL`, `DOCUMENT_RAG_URL`: base URLs each tool depends on.
  - `JWT_SECRET`: required for verifying fallback tokens.
- Logging/tracing endpoints: `PHOENIX_COLLECTOR_ENDPOINT`, `OTEL_SERVICE_NAME`.
- HTTP clients: All outbound requests (`axios`) pass the Kong headers (`X-User-Id`, `X-User-Role`) plus `Authorization` when available, so the backend/RAG services can re-evaluate access via their RBAC layer.
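The header propagation can be sketched as follows; the helper name and the context shape are illustrative, while the header names come from the spec.

```typescript
// Context carried from the inbound request into every outbound call.
interface UserCtx {
  userId: string;
  userRole: string;
  token?: string; // raw JWT, when the inbound request carried one
}

export function kongHeaders(ctx: UserCtx): Record<string, string> {
  const headers: Record<string, string> = {
    "X-User-Id": ctx.userId,
    "X-User-Role": ctx.userRole,
  };
  // Forward the bearer token so downstream RBAC can re-check access.
  if (ctx.token) headers["Authorization"] = `Bearer ${ctx.token}`;
  return headers;
}

// Usage with axios (assumed call site):
// axios.get(`${process.env.BACKEND_URL}/api/customer/dashboard`, { headers: kongHeaders(ctx) });
```

Building the header set in one place keeps the ai-service from ever amplifying privilege: downstream services always see the same identity Kong validated.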
7. Behavior Expectations
- Tool error handling: Tools catch axios errors, log `response.data.message`, and return user-friendly fallback text/widgets instead of bubbling up raw stack traces.
- Conversation consistency: Tool outputs are stitched back into the LLM context via `toolMessages`, and a final LLM pass runs when any tool returns data.
- Persistence: The AI user and conversation data are persisted via Prisma (`conversation`, `message`, `conversationParticipant`). Future features must respect the same schema to keep historical transcripts consistent.
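The error-handling convention above can be sketched without axios itself; the helper and its names are illustrative, and the `response.data.message` path is the one the spec cites.

```typescript
// Axios-shaped error: the backend's message lives at response.data.message.
interface AxiosLikeError {
  response?: { data?: { message?: string } };
}

export function toolErrorFallback(
  err: unknown,
  friendlyText: string,
  log: (msg: string) => void,
): { text: string } {
  const backendMessage = (err as AxiosLikeError)?.response?.data?.message;
  log(backendMessage ?? String(err)); // raw detail stays in the logs
  return { text: friendlyText }; // never bubble a stack trace to the user
}
```

Because the fallback is still a valid tool response, the chat loop's final LLM pass can explain the failure conversationally instead of the request dying with a 500.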