Publiclear Simulation Studio Implementation Spec
Single handoff document for frontend, backend, and realtime infrastructure. Covers screen flow, data contracts, streaming events, and acceptance tests.
Updated: February 22, 2026 (Asia/Kolkata)
0) Spec Index
- Product Architecture
- Screen Map and Route Flow
- Ingestion State Machine
- Personality Calibration Schema
- Voice Training Contract
- Avatar Upload Contract
- REST API Endpoints
- OpenAPI Snippets
- TypeScript Shared Interfaces
- Realtime Streaming Events
- Frontend Realtime Handling Rules
- Acceptance Test Matrix
- Performance and SLO Targets
- Security and Compliance Controls
1) Product Architecture
Frontend
- React + TypeScript SPA.
- Zustand or equivalent store for multi-step studio state.
- WebRTC/WebSocket realtime transport.
- UI modules: Demo, Studio, Portal, Admin/Owner controls.
Backend
- API service for CRUD and auth.
- Worker queue for ingestion and model jobs.
- Vector database for memory retrieval.
- Object storage for audio/photo/docs.
Core Services
- simulation-service: workspace lifecycle, persona, relationships, access links.
- ingestion-service: parse docs/chats, chunk, embed, index.
- voice-service: quality checks, training orchestration, speaker profiles.
- avatar-service: photo validation, render jobs, output manifests.
- realtime-orchestrator: turn execution, RAG, LLM response, TTS, visemes.
2) Screen Map and Route Flow
| Screen | Route | Goal | Primary UI | API Calls | Done Condition |
|---|---|---|---|---|---|
| S0 | /studio/new | Create workspace | Name, language, consent checkbox | POST /v1/simulations | simulation_id exists |
| S1 | /studio/:id/ingest | Add source memories | Upload queue, stage progress, retry | POST /v1/uploads/presign, POST /v1/simulations/:id/ingestions, GET /v1/jobs/:job_id/events | At least one ingestion complete |
| S2 | /studio/:id/personality | Calibrate personality | Sliders, values, catchphrases, taboo topics | PUT /v1/simulations/:id/persona | Persona validation pass |
| S3 | /studio/:id/relationships | Relationship behavior map | Relation cards and tone overrides | PUT /v1/simulations/:id/relationships | One relation profile created |
| S4 | /studio/:id/voice | Train voice model | Recorder, noise checks, segment quality report | POST /v1/simulations/:id/voice/samples, POST /v1/simulations/:id/voice/train | Voice status ready |
| S5 | /studio/:id/avatar | Generate avatar | Photo upload, quality checks, preview | POST /v1/simulations/:id/avatar/upload, POST /v1/simulations/:id/avatar/render | Avatar status ready |
| S6 | /studio/:id/preview | End-to-end test | Realtime chat + avatar + timeline | POST /v1/realtime/sessions | Audio, text, viseme streams healthy |
| S7 | /studio/:id/publish | Go live | Access rules, share link, slug setup | POST /v1/simulations/:id/publish, POST /v1/simulations/:id/access-links | Portal link active |
| S8 | /portal/:slug | Family usage | Avatar call, chat transcript, memory timeline | POST /v1/realtime/sessions, POST /v1/conversations/:id/turns | Stable session and persisted transcript |
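The done conditions in the table could drive step gating in the studio store. The sketch below is one possible shape, not a prescribed design: `StudioProgress`, `studioSteps`, and `firstIncompleteScreen` are hypothetical names, and it assumes the studio screens S0 to S7 are completed in table order, which the spec does not state explicitly.

```typescript
// Hypothetical progress snapshot; each field mirrors one "Done Condition" in the table.
interface StudioProgress {
  simulationId: string | null; // S0: simulation_id exists
  ingestionsComplete: number;  // S1: at least one ingestion complete
  personaValid: boolean;       // S2: persona validation pass
  relationProfiles: number;    // S3: one relation profile created
  voiceReady: boolean;         // S4: voice status ready
  avatarReady: boolean;        // S5: avatar status ready
  previewHealthy: boolean;     // S6: audio, text, viseme streams healthy
  portalLinkActive: boolean;   // S7: portal link active
}

type StudioScreen = "S0" | "S1" | "S2" | "S3" | "S4" | "S5" | "S6" | "S7";

// Assumed linear order. S8 is the family portal, so it is not a setup step.
const studioSteps: { screen: StudioScreen; done: (p: StudioProgress) => boolean }[] = [
  { screen: "S0", done: (p) => p.simulationId !== null },
  { screen: "S1", done: (p) => p.ingestionsComplete >= 1 },
  { screen: "S2", done: (p) => p.personaValid },
  { screen: "S3", done: (p) => p.relationProfiles >= 1 },
  { screen: "S4", done: (p) => p.voiceReady },
  { screen: "S5", done: (p) => p.avatarReady },
  { screen: "S6", done: (p) => p.previewHealthy },
  { screen: "S7", done: (p) => p.portalLinkActive },
];

// Returns the first screen whose done condition is not met, or null when all pass.
function firstIncompleteScreen(p: StudioProgress): StudioScreen | null {
  for (const step of studioSteps) {
    if (!step.done(p)) return step.screen;
  }
  return null;
}
```

A route guard could redirect to the screen returned here when a user deep-links past an unfinished step.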
3) Ingestion State Machine
{
  "states": [
    "idle",
    "uploading",
    "queued",
    "extracting_text",
    "chunking",
    "embedding",
    "indexing",
    "complete",
    "failed"
  ],
  "metrics": {
    "progress_pct": "number",
    "chunks_done": "number",
    "chunks_total": "number",
    "token_count": "number"
  }
}
UI Rules During Ingestion
- Show current stage text, percent, and chunk counters at all times.
- Support parallel source processing without blocking whole pipeline.
- Allow retry on failed item; keep successful items untouched.
- Show a "do not close this page" warning while jobs are running.
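The state list and UI rules above can be sketched as a small transition guard plus a helper for the page-close warning. The spec lists the states but not the allowed transitions, so this assumes a linear happy path, that any in-flight stage may fail, and that a retry re-enters at `queued`. All function names are hypothetical.

```typescript
type IngestionStage =
  | "idle" | "uploading" | "queued" | "extracting_text"
  | "chunking" | "embedding" | "indexing" | "complete" | "failed";

// Assumed happy-path order, taken from the order of the "states" array.
const stageOrder: IngestionStage[] = [
  "idle", "uploading", "queued", "extracting_text",
  "chunking", "embedding", "indexing", "complete",
];

// True when moving from one stage to another is allowed under the assumptions above.
function canTransition(from: IngestionStage, to: IngestionStage): boolean {
  if (from === "complete") return false;          // terminal state
  if (from === "failed") return to === "queued";  // assumption: retry re-queues the item
  if (to === "failed") return true;               // any non-terminal stage may fail
  return stageOrder.indexOf(to) === stageOrder.indexOf(from) + 1;
}

// A job counts as running when it is neither idle nor in a terminal state.
function isRunning(stage: IngestionStage): boolean {
  return stage !== "idle" && stage !== "complete" && stage !== "failed";
}

// Drives the page-close warning: warn while at least one item is still running.
function shouldWarnBeforeUnload(stages: IngestionStage[]): boolean {
  return stages.some(isRunning);
}

// Status line with stage text, percent, and chunk counters, as the UI rules require.
function statusLine(stage: IngestionStage, pct: number, done: number, total: number): string {
  return `${stage} ${Math.round(pct)}% (${done}/${total} chunks)`;
}
```

Because each item carries its own stage, a failed item can be retried without touching items that already reached `complete`.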
4) Personality Calibration Schema
{
  "display_name": "string (2-80)",
  "primary_language": "en-IN | hi-IN | ...",
  "secondary_languages": ["string"],
  "speech_style": {
    "warmth": 0,
    "humor": 0,
    "formality": 0,
    "directness": 0,
    "spirituality": 0
  },
  "catchphrases": ["string"],
  "core_values_ranked": ["family", "discipline", "kindness"],
  "taboo_topics": ["string"],
  "favorite_topics": ["string"],
  "memory_cards": [
    {
      "year": 1983,
      "title": "World Cup memory",
      "people": ["brother", "neighbors"],
      "context": "street radio listening",
      "tone": "joyful"
    }
  ]
}
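A client-side check for this schema might look like the sketch below. It covers only what the schema states (the 2 to 80 character `display_name`) and what the acceptance matrix implies (all five sliders are required). The slider value range is not given in the spec, so it is not checked here. `PersonaDraft` and `validatePersona` are hypothetical names.

```typescript
const sliderKeys = ["warmth", "humor", "formality", "directness", "spirituality"] as const;
type SliderKey = (typeof sliderKeys)[number];

// Hypothetical draft shape: a partially filled persona form.
interface PersonaDraft {
  display_name?: string;
  speech_style?: Partial<Record<SliderKey, number>>;
}

// Returns field-level errors keyed by field path; an empty object means the draft can be saved.
function validatePersona(draft: PersonaDraft): Record<string, string> {
  const errors: Record<string, string> = {};
  const displayName = draft.display_name ?? "";
  if (displayName.length < 2 || displayName.length > 80) {
    errors["display_name"] = "Must be 2 to 80 characters";
  }
  for (const key of sliderKeys) {
    const value = draft.speech_style?.[key];
    if (typeof value !== "number" || !Number.isFinite(value)) {
      errors[`speech_style.${key}`] = "Required";
    }
  }
  return errors;
}
```

The S2 screen could block the PUT /v1/simulations/:id/persona call while the returned object is non-empty.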
5) Voice Training Contract
Validation Requirements
- Total voiced duration: at least 120 seconds.
- Segment duration: 8 to 25 seconds each.
- Sample rate: at least 16kHz.
- SNR: at least 18dB.
- Silence ratio: less than 35%.
- Single speaker confidence: at least 0.85.
Quality Failure Example
{
  "status": "quality_failed",
  "report": {
    "snr_db": 13.2,
    "clipping_pct": 0.4,
    "silence_pct": 42.1,
    "single_speaker_confidence": 0.91
  },
  "issues": [
    {"code": "LOW_SNR", "message": "Move to quieter room"},
    {"code": "TOO_MUCH_SILENCE", "message": "Speak continuously"}
  ]
}
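The validation requirements map directly to threshold checks. The sketch below shows one way voice-service or a client pre-check might produce the `issues` array. Only `LOW_SNR` and `TOO_MUCH_SILENCE` come from the example above; the other codes and messages, the input field names beyond those in `report`, and `checkVoiceQuality` itself are assumptions.

```typescript
// Hypothetical measured inputs; snr_db, silence_pct, and single_speaker_confidence match the report fields.
interface VoiceMeasurements {
  total_voiced_s: number;
  segment_durations_s: number[];
  sample_rate_hz: number;
  snr_db: number;
  silence_pct: number;
  single_speaker_confidence: number;
}

interface VoiceIssue {
  code: string;
  message: string;
}

// Applies each validation requirement and collects one issue per violated rule.
function checkVoiceQuality(m: VoiceMeasurements): VoiceIssue[] {
  const issues: VoiceIssue[] = [];
  if (m.snr_db < 18) issues.push({ code: "LOW_SNR", message: "Move to quieter room" });
  if (m.silence_pct >= 35) issues.push({ code: "TOO_MUCH_SILENCE", message: "Speak continuously" });
  // The codes below are illustrative; the spec does not name them.
  if (m.total_voiced_s < 120) {
    issues.push({ code: "TOO_SHORT", message: "Record at least 120 seconds of speech" });
  }
  if (m.segment_durations_s.some((d) => d < 8 || d > 25)) {
    issues.push({ code: "BAD_SEGMENT_LENGTH", message: "Keep each segment between 8 and 25 seconds" });
  }
  if (m.sample_rate_hz < 16000) {
    issues.push({ code: "LOW_SAMPLE_RATE", message: "Record at 16kHz or higher" });
  }
  if (m.single_speaker_confidence < 0.85) {
    issues.push({ code: "MULTIPLE_SPEAKERS", message: "Record one speaker at a time" });
  }
  return issues;
}
```

With the report values from the failure example (SNR 13.2dB, silence 42.1%), this returns exactly the two issues shown there, provided the remaining inputs are within limits.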
6) Avatar Upload Contract
- Formats: jpg, png, webp.
- Max size: 10MB.
- Resolution: at least 1024x1024 (recommended 1536+).
- Single face only; no heavy occlusion.
- Pose limits: yaw/pitch/roll around 15 degrees max.
- Face area should occupy 25% to 70% of frame.
{
  "status": "accepted",
  "quality_report": {
    "face_count": 1,
    "occlusion_score": 0.08,
    "yaw_deg": 6.1,
    "pitch_deg": 4.2,
    "lighting_score": 0.82
  }
}
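The upload rules above could be checked as in the following sketch. Everything here beyond the listed limits is an assumption: the input field names (the sample report has no roll or face-area field), the rejection codes, reading 10MB as 10 * 1024 * 1024 bytes, and applying the 1024 minimum to the shorter image side. No occlusion threshold is given in the spec, so occlusion is not checked.

```typescript
// Hypothetical measurements for an uploaded photo.
interface AvatarPhoto {
  format: string;          // lowercase file extension
  size_bytes: number;
  width_px: number;
  height_px: number;
  face_count: number;
  yaw_deg: number;
  pitch_deg: number;
  roll_deg: number;
  face_area_ratio: number; // face area divided by frame area, 0 to 1
}

const allowedFormats = ["jpg", "png", "webp"];
const maxBytes = 10 * 1024 * 1024; // assumption: 10MB means 10 MiB
const maxPoseDeg = 15;

// Returns illustrative rejection codes; an empty array means the photo passes.
function checkAvatarPhoto(p: AvatarPhoto): string[] {
  const codes: string[] = [];
  if (allowedFormats.indexOf(p.format) === -1) codes.push("UNSUPPORTED_FORMAT");
  if (p.size_bytes > maxBytes) codes.push("FILE_TOO_LARGE");
  if (Math.min(p.width_px, p.height_px) < 1024) codes.push("LOW_RESOLUTION");
  if (p.face_count === 0) codes.push("NO_FACE");
  if (p.face_count > 1) codes.push("MULTIPLE_FACES");
  const pose = [p.yaw_deg, p.pitch_deg, p.roll_deg];
  if (pose.some((deg) => Math.abs(deg) > maxPoseDeg)) codes.push("POSE_OUT_OF_RANGE");
  if (p.face_area_ratio < 0.25 || p.face_area_ratio > 0.7) codes.push("FACE_AREA_OUT_OF_RANGE");
  return codes;
}
```

Running the cheap checks (format, size, resolution) in the browser before upload would avoid a round trip for obviously invalid files; face and pose checks would still need avatar-service.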
7) REST API Endpoints
| Method | Path | Purpose | Auth |
|---|---|---|---|
| POST | /v1/simulations | Create simulation workspace | Bearer |
| GET | /v1/simulations/:id | Read workspace metadata | Bearer |
| POST | /v1/uploads/presign | Create upload URL | Bearer |
| POST | /v1/simulations/:id/ingestions | Start ingestion job | Bearer |
| GET | /v1/jobs/:job_id/events | Stream job progress | Bearer |
| PUT | /v1/simulations/:id/persona | Save personality config | Bearer |
| PUT | /v1/simulations/:id/relationships | Save relationship maps | Bearer |
| POST | /v1/simulations/:id/voice/samples | Register voice segments | Bearer |
| POST | /v1/simulations/:id/voice/train | Start voice training | Bearer |
| POST | /v1/simulations/:id/avatar/upload | Upload avatar source image | Bearer |
| POST | /v1/simulations/:id/avatar/render | Start avatar render job | Bearer |
| POST | /v1/realtime/sessions | Create realtime session token | Bearer |
| POST | /v1/simulations/:id/publish | Publish simulation portal | Bearer |
| POST | /v1/simulations/:id/access-links | Create family share links | Bearer |
| POST | /v1/conversations/:id/turns | Persist chat turns | Bearer |
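Since every endpoint uses Bearer auth and `:param` path segments, a shared request builder is one way to keep client calls consistent. This is a sketch under assumptions: `buildRequest`, `ApiRequest`, and the base URL are hypothetical, and the result is a plain object that a caller would pass to whatever HTTP client the frontend uses.

```typescript
interface ApiRequest {
  method: "GET" | "POST" | "PUT";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Fills ":name" segments in a path from the endpoint table and attaches the Bearer token.
function buildRequest(
  baseUrl: string,
  token: string,
  method: ApiRequest["method"],
  pathTemplate: string,
  params: Record<string, string> = {},
  body?: unknown,
): ApiRequest {
  const path = pathTemplate.replace(/:([a-z_]+)/g, (_match: string, key: string) => {
    const value = params[key];
    if (value === undefined) throw new Error(`Missing path parameter: ${key}`);
    return encodeURIComponent(value);
  });
  const request: ApiRequest = {
    method,
    url: baseUrl + path,
    headers: { Authorization: `Bearer ${token}` },
  };
  if (body !== undefined) {
    request.headers["Content-Type"] = "application/json";
    request.body = JSON.stringify(body);
  }
  return request;
}
```

For example, `buildRequest(base, token, "GET", "/v1/jobs/:job_id/events", { job_id: "job_42" })` yields a URL ending in `/v1/jobs/job_42/events`.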
8) OpenAPI Snippets (YAML)
openapi: 3.1.0
info:
  title: Publiclear API
  version: 1.0.0
paths:
  /v1/simulations:
    post:
      summary: Create simulation
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateSimulationRequest'
      responses:
        '201':
          description: Created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Simulation'
  /v1/realtime/sessions:
    post:
      summary: Create realtime session
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/RealtimeSessionRequest'
      responses:
        '200':
          description: Session token
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/RealtimeSession'
components:
  schemas:
    CreateSimulationRequest:
      type: object
      required: [name, primary_language, consent]
      properties:
        name: {type: string, minLength: 2, maxLength: 80}
        primary_language: {type: string}
        consent: {type: boolean}
    Simulation:
      type: object
      properties:
        simulation_id: {type: string}
        owner_id: {type: string}
        status: {type: string}
    RealtimeSessionRequest:
      type: object
      required: [simulation_id, relationship_id]
      properties:
        simulation_id: {type: string}
        relationship_id: {type: string}
        mode: {type: string, enum: [audio, text, multimodal]}
    RealtimeSession:
      type: object
      properties:
        session_id: {type: string}
        token: {type: string}
        expires_at: {type: string, format: date-time}
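The `CreateSimulationRequest` constraints can be mirrored on the S0 form so that invalid input never reaches the API. In this sketch, `validateCreateSimulation` is a hypothetical name, and treating `consent: false` as an error is taken from the acceptance matrix (submit blocked when consent is unchecked) rather than from the schema, which only requires a boolean.

```typescript
interface CreateSimulationForm {
  name: string;
  primary_language: string;
  consent: boolean;
}

// Returns inline validation messages keyed by field; an empty object means submit is allowed.
function validateCreateSimulation(form: CreateSimulationForm): Record<string, string> {
  const errors: Record<string, string> = {};
  // .length counts UTF-16 code units; the schema counts characters, which differs only outside the BMP.
  if (form.name.length < 2 || form.name.length > 80) {
    errors["name"] = "Name must be 2 to 80 characters";
  }
  if (form.primary_language.length === 0) {
    errors["primary_language"] = "Select a primary language";
  }
  if (!form.consent) {
    errors["consent"] = "Consent is required to create a simulation";
  }
  return errors;
}
```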
9) TypeScript Shared Interfaces
export interface Simulation {
  simulation_id: string;
  owner_id: string;
  name: string;
  primary_language: string;
  status: "draft" | "ready" | "published";
  created_at: string;
  updated_at: string;
}

export interface IngestionItem {
  ingestion_id: string;
  simulation_id: string;
  source_type: "file" | "whatsapp" | "manual" | "email_export";
  stage:
    | "idle"
    | "uploading"
    | "queued"
    | "extracting_text"
    | "chunking"
    | "embedding"
    | "indexing"
    | "complete"
    | "failed";
  progress_pct: number;
  chunks_done: number;
  chunks_total: number;
  token_count: number;
  error_code?: string;
  error_message?: string;
}

export interface RelationshipProfile {
  person_id: string;
  name: string;
  relation: string;
  closeness: 1 | 2 | 3 | 4 | 5;
  address_style: string;
  inside_jokes: string[];
  avoid_topics: string[];
  response_tone_override?: {
    warmth?: number;
    formality?: number;
    humor?: number;
    directness?: number;
  };
}

export interface TurnRequest {
  session_id: string;
  simulation_id: string;
  relationship_id: string;
  input: {
    modality: "audio" | "text";
    text?: string;
    language: string;
    audio_ref?: string;
  };
  options: {
    stream: boolean;
    temperature: number;
    max_output_tokens: number;
  };
}

export interface TurnResponse {
  turn_id: string;
  output_text: string;
  memory_citations: string[];
  latency_ms: {
    retrieval: number;
    first_token: number;
    total: number;
  };
}
10) Realtime Streaming Events
Event Types
- turn.started
- input_transcript.partial
- input_transcript.final
- retrieval.ready
- response.output_text.delta
- response.output_text.done
- response.output_audio.delta
- response.output_audio.done
- avatar.viseme.delta
- turn.completed
- turn.error
{
  "type": "response.output_text.delta",
  "turn_id": "turn_001",
  "seq": 12,
  "delta": "In school, we loved",
  "timestamp": "2026-02-22T10:15:17.311Z"
}

{
  "type": "avatar.viseme.delta",
  "turn_id": "turn_001",
  "seq": 12,
  "audio_offset_ms": 1640,
  "visemes": [
    {"t_ms": 0, "v": "AA", "w": 0.7},
    {"t_ms": 90, "v": "M", "w": 0.6},
    {"t_ms": 170, "v": "EH", "w": 0.8}
  ]
}
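The two payloads above suggest a discriminated union keyed on `type`. The sketch below types only those two events and parses a raw message defensively. `parseStreamEvent` and `visemeTimeline` are hypothetical helpers, and it assumes each viseme's `t_ms` is relative to the event's `audio_offset_ms`, which the sample values suggest but the spec does not state.

```typescript
interface TextDeltaEvent {
  type: "response.output_text.delta";
  turn_id: string;
  seq: number;
  delta: string;
  timestamp: string;
}

interface VisemeDeltaEvent {
  type: "avatar.viseme.delta";
  turn_id: string;
  seq: number;
  audio_offset_ms: number;
  visemes: { t_ms: number; v: string; w: number }[];
}

type KnownStreamEvent = TextDeltaEvent | VisemeDeltaEvent;

// Parses one raw message; returns null for malformed JSON or event types not modelled here.
function parseStreamEvent(raw: string): KnownStreamEvent | null {
  let data: any;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  if (typeof data.turn_id !== "string" || typeof data.seq !== "number") return null;
  if (data.type === "response.output_text.delta" && typeof data.delta === "string") {
    return data as TextDeltaEvent;
  }
  if (
    data.type === "avatar.viseme.delta" &&
    typeof data.audio_offset_ms === "number" &&
    Array.isArray(data.visemes)
  ) {
    return data as VisemeDeltaEvent;
  }
  return null;
}

// Converts visemes to absolute positions on the audio clock (assumes t_ms is relative to audio_offset_ms).
function visemeTimeline(ev: VisemeDeltaEvent): { at_ms: number; v: string; w: number }[] {
  return ev.visemes.map((vis) => ({ at_ms: ev.audio_offset_ms + vis.t_ms, v: vis.v, w: vis.w }));
}
```

Under that assumption, the sample viseme event schedules its three visemes at 1640ms, 1730ms, and 1810ms of audio playback.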
11) Frontend Realtime Handling Rules
- Append output_text.delta to the active bubble immediately.
- Audio chunks go through an 80ms to 120ms jitter buffer.
- Apply visemes against the same audio playback clock.
- If visemes are delayed, fall back to amplitude-based mouth animation.
- A turn is complete only after both the text and audio done events arrive.
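These rules fit a small pure reducer over the turn state. The sketch below covers the append, completion, and viseme-fallback rules; it leaves out the jitter buffer. `TurnView`, `reduceTurn`, `mouthDriver`, and the 120ms lag threshold are all assumptions, since the spec does not say how late a viseme must be before falling back.

```typescript
interface TurnView {
  text: string;
  textDone: boolean;
  audioDone: boolean;
}

type TurnStreamEvent =
  | { type: "response.output_text.delta"; delta: string }
  | { type: "response.output_text.done" }
  | { type: "response.output_audio.done" };

const emptyTurn: TurnView = { text: "", textDone: false, audioDone: false };

// Pure reducer: returns a new view for each event and never mutates the old one.
function reduceTurn(state: TurnView, ev: TurnStreamEvent): TurnView {
  switch (ev.type) {
    case "response.output_text.delta":
      return { ...state, text: state.text + ev.delta }; // append to the active bubble right away
    case "response.output_text.done":
      return { ...state, textDone: true };
    case "response.output_audio.done":
      return { ...state, audioDone: true };
    default:
      return state;
  }
}

// A turn is complete only after both done events have arrived.
function isTurnComplete(state: TurnView): boolean {
  return state.textDone && state.audioDone;
}

// Hypothetical threshold for how far visemes may trail the audio clock.
const maxVisemeLagMs = 120;

// Chooses the mouth animation source from the audio playback clock and the latest viseme time.
function mouthDriver(playbackMs: number, latestVisemeMs: number): "visemes" | "amplitude" {
  return playbackMs - latestVisemeMs > maxVisemeLagMs ? "amplitude" : "visemes";
}
```

Keeping the reducer pure makes it straightforward to plug into a Zustand store and to replay recorded event streams in tests.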
12) Acceptance Test Matrix
| Area | Test Case | Expected Result |
|---|---|---|
| S0 Create | Create simulation with consent unchecked | Submit blocked with inline validation |
| S1 Ingestion | Upload mixed files (pdf, txt, invalid) | Valid files proceed, invalid file shows actionable error |
| S1 Ingestion | Interrupt network during embedding | Progress resumes or retry starts from last persisted checkpoint |
| S2 Persona | Save with missing required sliders | Field-level errors and save prevented |
| S3 Relationships | Add tone override for one relationship | Override reflected in preview responses |
| S4 Voice | Audio with high noise | Quality fail with exact issue codes and recapture guidance |
| S5 Avatar | Upload photo with two faces | Rejected with multi-face error and guidance |
| S6 Preview | Run multimodal turn | Text delta, audio delta, viseme delta all stream successfully |
| S7 Publish | Create share link with expiry | Link generated, expiry enforced server-side |
| S8 Portal | Concurrent family sessions | No cross-session memory leakage |
13) Performance and SLO Targets
- P95 first text token under 700ms for warm sessions.
- P95 end-to-end voice turn under 2200ms.
- P95 websocket disconnect rate under 1% per day.
- Ingestion completion success above 99% excluding corrupted uploads.
- Portal availability target: 99.9% monthly.
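The latency targets can be checked against collected samples with a percentile helper. The spec does not say which percentile method to use, so this sketch assumes the nearest-rank method. The sample source (for example `latency_ms.first_token` from `TurnResponse`) and the function names are also assumptions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("No samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("Rank out of range");
  return value;
}

// Latency targets from the list above, in milliseconds.
const sloTargetsMs = {
  firstTextToken: 700, // warm sessions
  voiceTurn: 2200,     // end-to-end voice turn
};

// True when the P95 of the samples is under the given limit.
function meetsP95Target(samplesMs: number[], limitMs: number): boolean {
  return percentile(samplesMs, 95) < limitMs;
}
```

A dashboard or nightly job could call `meetsP95Target(firstTokenSamples, sloTargetsMs.firstTextToken)` per day of traffic.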
14) Security and Compliance Controls
- Consent record required before training or publish actions.
- Encryption in transit (TLS 1.2+) and at rest.
- Role-based access for owner, editor, and family viewer roles.
- Signed URL uploads with short expiry windows.
- Audit log for model changes, publication, and share link events.
- Explicit AI disclosure inside portal UI and metadata responses.