
XRAI Viewer — hybrid architecture

Goal: one XRAI document renders consistently across every platform (iOS / Android / macOS / Windows / visionOS / Quest), through any of N rendering engines, with live multiplayer + XRAI file I/O + WebAR entry, all without vendor lock-in.

Per-engine parity is enforced by the 8 lock gates in RUNTIMES_EVALUATION.md — no engine lands in the default viewer until it passes conformance + O9 parity.


Four layers

```text
┌─────────────────────────────────────────────────────────────────┐
│  1. DATA LAYER                 XRAI v1.0 JSON (canonical)       │
│                                Load/save via File API + fetch   │
├─────────────────────────────────────────────────────────────────┤
│  2. LAYOUT LAYER               Graph / hypergraph / scene       │
│     Options:                   d3-force · ECharts-GL ·          │
│                                WebGPU compute (experimental)    │
├─────────────────────────────────────────────────────────────────┤
│  3. RENDERING LAYER            Engine-per-target:               │
│                                3d-force-graph (web default) ·   │
│                                PlayCanvas (PBR + WebXR) ·       │
│                                Needle (Unity → web, visionOS) · │
│                                Icosa (AR viewer + Gallery) ·    │
│                                Three.js + WebGPU (perf)         │
├─────────────────────────────────────────────────────────────────┤
│  4. TRANSPORT LAYER            LiveKit (primary per spec 010) · │
│                                WebRTC DataChannels · WebSocket  │
└─────────────────────────────────────────────────────────────────┘
```

Each layer swaps independently. Changing the renderer doesn’t change the data. Changing the transport doesn’t change the layout.
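
Each boundary is a narrow contract. A minimal sketch of what those contracts could look like in TypeScript (the interface and function names here are illustrative, not the shipped API):

```ts
// Illustrative layer contracts — not the shipped API. Each layer depends only
// on the XRAI document shape, never on a sibling layer's internals.
interface XraiDoc { version: string; entities: unknown[]; relations: unknown[] }

interface LayoutEngine {
  layout(doc: XraiDoc): Map<string, [number, number, number]>; // id → position
}
interface RenderEngine {
  mount(el: HTMLElement): void;
  render(doc: XraiDoc, positions: Map<string, [number, number, number]>): void;
}
interface Transport {
  send(delta: object): void;
  onDelta(cb: (delta: object) => void): void;
}

// Swapping an engine is a constructor argument, not a rewrite:
const makeViewer = (l: LayoutEngine, r: RenderEngine, t: Transport) => ({ l, r, t });
```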

Procedural worlds — DNA → polymorphic phenotype (new)

One XRAI document: infinite worlds, any platform, smarter every day.

200 bytes of rules expand into a 4D world. Same rules render as voxels in a browser, particles on iPhone, gaussians on Vision Pro. Save → share → re-render anywhere with byte-identical output. Every render contributes telemetry to the global rule library — tomorrow’s worlds improve while everyone sleeps.

```mermaid
flowchart LR
    subgraph DNA["DNA · 200B"]
        A[XRAI document<br/>seed + archetype refs<br/>+ component params]
    end
    subgraph PHENOTYPE["PHENOTYPE · per-platform"]
        B[Strategy Registry<br/>hot-swap pure fns]
        C1[Unity VFX]
        C2[Three.js voxels]
        C3[Gaussian Splats]
        C4[SDF / Shader]
        C5[Icosa glb]
        C6[ASCII / emoji]
    end
    subgraph LEARN["AUTO-IMPROVE · global"]
        D[SSE telemetry<br/>RFC 0009 deltas]
        E[RFC 0013<br/>continuous learning]
        F[Rule library<br/>grows daily]
    end
    A --> B
    B --> C1 & C2 & C3 & C4 & C5 & C6
    C1 & C2 & C3 & C4 & C5 & C6 --> D
    D --> E --> F --> A
```
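
The byte-identical claim rests on deterministic expansion: the same seed and rules must produce the same world on every platform. A minimal sketch of the idea using mulberry32, a common tiny seeded PRNG; the archetype shape and function names are hypothetical:

```ts
// Hypothetical sketch: deterministic DNA → phenotype expansion.
// Same (seed, rules) ⇒ identical output on every platform.
function mulberry32(seed: number) {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function expandWorld(seed: number, archetype: { density: number }) {
  const rand = mulberry32(seed);               // never Math.random()
  const voxels: [number, number, number][] = [];
  for (let i = 0; i < archetype.density; i++) {
    voxels.push([(rand() * 64) | 0, (rand() * 64) | 0, (rand() * 64) | 0]);
  }
  return voxels;                               // byte-identical for a given seed
}
```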

Canonical machinery (no reinvention)

Tier ladder (graceful degradation, RFC 0012:79–148)

```mermaid
flowchart LR
    T0[T0 ID] --> T2[T2 ASCII] --> T3[T3 grammar] --> T4[T4 emoji 🌳]
    T4 --> T8[T8 voxels] --> T11[T11 mesh]
    T11 --> T16[T16 hologram] --> T19[T19 neural]
```

T0–T6 are mandatory (RFC 0012:109): every runtime ships these as zero-dep built-ins. Nothing ever shows nothing.
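
Tier selection is therefore a capability clamp: render the highest tier both the document and the device support, and bottom out at T0 rather than at nothing. A sketch (the function name and signature are illustrative):

```ts
// Sketch: clamp rendering to the highest mutually supported tier (RFC 0012 ladder).
const LADDER: number[] = [0, 2, 3, 4, 8, 11, 16, 19];

function pickTier(deviceMaxTier: number, docTiers: number[]): number {
  const candidates = docTiers.filter((t) => t <= deviceMaxTier && LADDER.includes(t));
  // T0 (bare ID) is always renderable, so "nothing ever shows nothing".
  return candidates.length ? Math.max(...candidates) : 0;
}

pickTier(11, [8, 11, 19]); // → 11: render the mesh rendition
pickTier(3,  [8, 11, 19]); // → 0:  no match, fall back to the T0 ID placeholder
```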

User journey — voice to magic in <100ms

```mermaid
sequenceDiagram
    participant User
    participant Mic as Voice/STT
    participant SSE as XRAI SSE
    participant Reg as Strategy Registry
    participant View as Renderer

    User->>Mic: "create a misty rainforest"
    Mic->>SSE: archetype + seed (T0–T4 instant)
    SSE-->>View: 🌳 emoji placeholder
    View-->>User: visible <100ms ⚡
    SSE-->>Reg: components stream (T8–T19)
    Reg-->>View: voxels OR mesh OR GSplat by capability
    View-->>User: full world materializes
    View-->>SSE: telemetry (fps, dwell, latency)
    Note over User,View: Next user's "rainforest" is smarter
```

For developers: add a new world type in 10 lines

```ts
// strategyRegistry.ts — drop-in, no other file touched
// (voxel, vfx, gsplat, shader, icosa, hybrid are imported strategy fns)
export const STRATEGIES = {
  voxel, vfx, gsplat, shader, icosa, hybrid,
  fluid: (sse) => sse.components.fluid_params
    ? expandFluidSim(sse.seed, sse.components.fluid_params)
    : []  // ← T8 fallback
};
```

Plus one decoder manifest declaring consumes: ["FluidSim"] + produces.tier: [8, 11, 19]. Same XRAI doc now renders as fluid on capable devices, falls back to voxel on others. Zero changes to engines, transports, or other adapters.
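
What such a decoder manifest might look like (the consumes / produces.tier fields follow the sentence above; the other field names are illustrative):

```ts
// Hypothetical decoder manifest for the fluid strategy — illustrative shape only.
export const fluidManifest = {
  name: "fluid",
  consumes: ["FluidSim"],          // component types this decoder understands
  produces: { tier: [8, 11, 19] }, // voxel, mesh, neural renditions it can emit
  fallback: 8,                     // devices below the top tier degrade to voxels
};
```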

Live demo

/voxel/ — Voxel Plant Generator. 80+ asset types, Burst-style noise, LOD, wind sim. Demonstrates the rule-library-as-data pattern that the canonical machinery formalizes.

Reference spec: specs/023-voxel-world-generator/PROCEDURAL_WORLDS_SPEC.md.

Universal encode / decode (new)

Ships at js/xrai-core.js + js/adapters/. 12 adapters today — each is a single async function, (input) → XRAI doc:

| Adapter | Input | Produces |
|---|---|---|
| webpage | URL or HTML | object.web-container + headings as subtree |
| wikipedia | article title | concept root + 24 linked concepts |
| arxiv | paper ID | concept + author people + abstract |
| twitter | @handle (+ bearer) | person root + up to 50 follows |
| linkedin | profile slug | person stub (OAuth required for full) |
| calendar | .ics text | event entities + temporal events |
| github-repo | owner/repo | code-repo + 80 file modules |
| github-commits | owner/repo | commits + history events |
| code-deps | package text | package + dependency nodes |
| markdown-spec | markdown file | root + section concepts |
| test-workflow | {name,steps[]} | suite + pass/fail events |
| concept-graph | free-form | concepts + edges |

Registered via registerAdapter(name, fn) — add your own by dropping a file in js/adapters/ and importing it. Every adapter produces a valid v1.0 XRAI doc (round-trip-stable).
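
A hypothetical 13th adapter, following that contract (the RSS source and the doc field shapes are illustrative; registerAdapter and newScene are the real exports named in this doc):

```ts
// js/adapters/rss.js — hypothetical example adapter, not one of the shipped 12.
import { registerAdapter, newScene } from "../xrai-core.js";

registerAdapter("rss", async (feedUrl) => {
  const xml = await (await fetch(feedUrl)).text();
  const titles = [...xml.matchAll(/<title>(.*?)<\/title>/g)].map((m) => m[1]);
  const doc = newScene();                      // valid empty v1.0 XRAI doc
  doc.entities.push({ id: "feed", type: "object.web-container", name: feedUrl });
  titles.slice(1, 25).forEach((t, i) => {
    doc.entities.push({ id: `item-${i}`, type: "concept", name: t });
    doc.relations.push({ from: "feed", to: `item-${i}`, type: "contains" });
  });
  return doc;                                  // must round-trip as valid v1.0
});
```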

Mini pipeline editor (new)

Lives at configs.html. 2D SVG node editor, no deps. Nodes are adapters · transforms · renderers · sinks; wires carry XRAI docs between them. Pipelines save as .pipeline.json.
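
A .pipeline.json could serialize to something like this (the node kinds match the sentence above; the exact schema is illustrative):

```json
{
  "nodes": [
    { "id": "n1", "kind": "adapter",   "name": "wikipedia", "params": { "title": "Hypergraph" } },
    { "id": "n2", "kind": "transform", "name": "filter-depth", "params": { "max": 2 } },
    { "id": "n3", "kind": "renderer",  "name": "3d-force-graph" },
    { "id": "n4", "kind": "sink",      "name": "download" }
  ],
  "wires": [
    { "from": "n1", "to": "n2" },
    { "from": "n2", "to": "n3" },
    { "from": "n3", "to": "n4" }
  ]
}
```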

Surfaced in three places:

Portals iOS / Unity consumption

Same modules, three transports:

  1. ESM inline (iOS): src/services/xrai/XraiWebBridge.ts re-exports the core types plus newScene/validate, so RN screens can build XRAI docs without a WebView.
  2. WebView (iOS): XraiWebBridge.webBridgeInject() + injectXraiDoc(doc) let the RN app post docs into the live xrai.dev graph and receive pipeline results back.
  3. Unity Editor: XraiHubTab.cs renders the Pipelines foldout with a URL bar + inbox watcher. Files a pipeline “sink” downloads into Temp/xrai-inbox/ auto-load.

Engine-per-target matrix

| Target | Primary engine | Fallback | Rationale |
|---|---|---|---|
| iOS Safari (iPhone / iPad) | 3d-force-graph | PlayCanvas | WebGL works; WebXR unlocks when Apple ships Safari WebXR-AR |
| Android Chrome | PlayCanvas | 3d-force-graph | WebXR mature; PlayCanvas has best perf-per-watt on mid-tier Android |
| macOS browser | 3d-force-graph | WebGPU + Three.js | Best DPR / perf for data-dense graphs |
| Windows browser | same as macOS | same as macOS | WebGPU has broad Windows support via Chrome/Edge |
| visionOS Safari | Needle Engine | 3d-force-graph | Needle has native visionOS Safari WebXR-AR (the critical Apple-glasses hedge per spec 015) |
| Meta Quest browser | Needle or PlayCanvas | | Both support Quest WebXR; pick per user preference |
| Portals native (iOS) | Unity (canonical) | | Via portals:// URL scheme — opens the XRAI in the Portals app for the full VFX pipeline |

Icosa AR viewer is a CONSUMER of XRAI, not an engine: our object.glb / object.tilt entities publish to Icosa Gallery; their viewer renders the glTF/tilt payload. XRAI adds the relational + temporal metadata on top. See RUNTIMES.md § What Icosa IS and IS NOT.


Layout layer choice — 3d-force-graph today, ECharts-GL + WebGPU tomorrow

Why 3d-force-graph for launch:

Why ECharts-GL for scale:

Why WebGPU for largest:

Selection rule: if node count < 500 use 3d-force-graph. 500–10K → ECharts-GL. >10K → WebGPU compute (experimental, v1.2 target).
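
The same rule as code (trivial, but it pins down the boundaries; the names are illustrative):

```ts
// Sketch of the layout-engine selection rule stated above.
type LayoutChoice = "3d-force-graph" | "echarts-gl" | "webgpu-compute";

function chooseLayoutEngine(nodeCount: number): LayoutChoice {
  if (nodeCount < 500) return "3d-force-graph";   // interactive CPU d3-force
  if (nodeCount <= 10_000) return "echarts-gl";   // GPU-accelerated styling
  return "webgpu-compute";                        // experimental, v1.2 target
}
```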

WebGPU ECharts hypergraph — v2 upgrade path (research note)

Inspiration: Keijiro Takahashi’s WebGPU experiments (keijiro/WebGPU samples on GitHub) — particle audio-reactive systems, globe visualizations, and compute-shader force simulations. Plus the Portals sibling project MetavidoVFX which pioneered the depth + stencil + audio + ML-pose shared compute substrate used in the CVPR paper.

Target: replace 3d-force-graph with a native WebGPU compute-shader force solver + ECharts graphGL styling, rendering typed glyphs as GPU-instanced sprites instead of per-node CanvasTexture sprites (current approach).

Benefits of the upgrade:

  1. 100K+ node budget — compute shaders run the force iterations in parallel on the GPU (see the WGSL sketch after this list). The current CPU-bound d3-force caps out around 2K nodes while staying interactive.
  2. Keijiro-style ambient field — a WebGPU particle background that reacts to node hover/click (same audio-reactive pattern MetavidoVFX uses for holograms), giving the viewer the “jARvis HUD of the future” feel the brand calls for.
  3. Typed glyph textures — a single sprite-atlas texture holds all 9 glyphs (▦ ◆ ◇ ○ ▲ ● ⬡ ⬢ ▤). GPU-instanced draws render 10K glyphs per frame with no CPU cost.
  4. Hyperedge geometry — n-ary relations (RFC 0002) can be rendered as shaded polygons connecting N participants, instead of forcing binary-edge decomposition.
  5. Live feed integration — Portals-app scene authoring streams XRAI deltas over LiveKit (spec 010); a WebGPU viewer can ingest + re-layout 60fps as scenes evolve.
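
A minimal sketch of such a force pass in WGSL, embedded the way a TS host would ship it (naive O(n²) repulsion for clarity; the shipped solver, buffer layout, and constants would differ):

```ts
// Hypothetical WGSL force kernel — real solvers tile or use Barnes-Hut style grids.
const forceShader = /* wgsl */ `
struct Node { pos : vec4<f32>, vel : vec4<f32> };
@group(0) @binding(0) var<storage, read_write> nodes : array<Node>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id : vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&nodes)) { return; }
  var force = vec3<f32>(0.0);
  for (var j = 0u; j < arrayLength(&nodes); j++) {
    if (j == i) { continue; }
    let d = nodes[i].pos.xyz - nodes[j].pos.xyz;
    force += d / max(dot(d, d), 1e-4);                  // Coulomb-style repulsion
  }
  let vel = (nodes[i].vel.xyz + force * 0.01) * 0.9;    // integrate + damp
  nodes[i].vel = vec4<f32>(vel, 0.0);
  nodes[i].pos = vec4<f32>(nodes[i].pos.xyz + vel * 0.016, 1.0);
}`;
```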

v1 status (this release):

RFC candidates from this section:

This matches spec 006 KB Visualizer § Platform renderers: R3F (mobile), 3d-force-graph (web), Needle (visionOS), ECharts-GL (analytics), Mermaid (docs) — adopting the existing decision.


Input modalities — voice, hands, head, touch

Every viewer surface supports a layered input stack. Users opt in per modality.

| Modality | How | Where | Status |
|---|---|---|---|
| Touch / mouse | OrbitControls via 3d-force-graph | every browser | ✅ always on |
| Voice ("hey jarvis") | Web Speech API SpeechRecognition + speechSynthesis; resolveQuery → camera fly + HUD | Chrome/Edge/Safari desktop + iOS Safari 14.5+ | js/jarvis-web.js |
| Hand nav (webcam) | MediaPipe HandLandmarker via CDN (dynamic import) → pinch → raycast + click; two-hand spread → zoom | desktop webcam only (fine-pointer check) | js/hands-web.js (opt-in) |
| Hand nav (WebXR) | WebXR hand-input API during immersive-ar/vr session | Quest browser; visionOS Safari once Apple unlocks WebXR-AR | 🚧 stub — activates inside immersive session |
| Head pose | WebXR viewerSpace transforms; drives LiveKit presence (avatar head) | Quest / visionOS native / AR Foundation on iOS | 🚧 multiplayer integration only |
| Gaze | Quest / visionOS eye-tracking ('eye-gaze' feature) | Quest Pro, Vision Pro | 🚧 experimental |

Design principles:

  1. Voice first, touch never removed. Every voice command has a touch equivalent. Users on quiet trains or with disabilities are not second-class.
  2. Hand nav is additive. Enabling hands never disables touch or voice. Pinch = click; touch-tap still works.
  3. Privacy signals are mandatory. Mic on = visible dot. Camera on = live preview thumbnail. Never hide them.
  4. Latency budget: wake-word detect < 100ms, pinch detect < 50ms, touch-tap < 16ms. All local; no server round-trip in the input path.

Hand gestures (v0)

| Gesture | Effect |
|---|---|
| Pinch (thumb-tip to index-tip, one hand) | Click — raycasts to nearest node on screen |
| Two-hand spread → wider | Zoom out |
| Two-hand spread → narrower | Zoom in |
| Open-palm swipe horizontal | Orbit (future) |
| Point + hold | Hover preview (future) |
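
Pinch detection reduces to a distance check between two MediaPipe landmarks (landmark 4 is the thumb tip, 8 the index fingertip); the threshold and event wiring below are illustrative:

```ts
// Sketch of pinch-to-click on MediaPipe HandLandmarker output.
// Landmarks are normalized [0..1] coords; 4 = thumb tip, 8 = index fingertip.
type Landmark = { x: number; y: number; z: number };

const PINCH_THRESHOLD = 0.05;   // illustrative value — tuned per camera FOV
let wasPinched = false;

function onHandFrame(landmarks: Landmark[], click: (x: number, y: number) => void) {
  const thumb = landmarks[4];
  const index = landmarks[8];
  const d = Math.hypot(thumb.x - index.x, thumb.y - index.y, thumb.z - index.z);
  const isPinched = d < PINCH_THRESHOLD;
  if (isPinched && !wasPinched) {
    // Fire once on pinch onset, at the midpoint between the two fingertips.
    click((thumb.x + index.x) / 2, (thumb.y + index.y) / 2);
  }
  wasPinched = isPinched;
}
```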

Inspired by Google AI Studio hand-tracking prototypes + Portals spec 007 (HoloKit iOS native + Sentis editor webcam). v1 adds per-fingertip particle trails per Keijiro WebGPU demos — lands with the WebGPU ECharts hypergraph renderer (RFC 0008).


Multiplayer transport — LiveKit (spec 010)

Per spec 010 Multiplayer Normcore → LiveKit decision (2026-03-05) + spec 003 Hologram Telepresence Phase 2. LiveKit is the unified cross-platform transport.

Cross-surface interop: the same room ID works from the Portals iOS app AND xrai.dev browsers. LiveKit server is wss://portals-dev.livekit.cloud (shared between src/services/livekit/LiveKitService.ts on iOS + js/live-web.js on web).

Flow (automated room creation + sharing):

  1. User clicks invite: js/live-web.js generates a room ID if none is in the URL, mirrors it to ?room=<id>, and copies the URL via navigator.share (mobile) or navigator.clipboard (desktop).
  2. Second user visits the URL → room ID already in URL params.
  3. Either user clicks live — the module:
    • tries /api/livekit-token?room=<id>&identity=<uuid> for automatic token (Cloudflare Worker stub)
    • on miss, prompts for server URL + paste-token (same pattern as web/rgbd-viewer)
  4. LiveKit WebRTC:
    • Video tracks — published when user toggles local camera; subscribed from remotes, rendered in bottom-left corner tile (240×160)
    • DataChannel topics:
      • xrai-delta — scene edits (same message shape as spec 001 bridge types: add_object, update_transform, modify_objects, save_scene)
      • presence — head pose + hand landmarks (MediaPipe or WebXR)
      • voice — push-to-talk audio (LiveKit handles natively)
  5. Each client’s renderer applies the delta locally → renderer-agnostic: a PlayCanvas client and a Three.js client see the same scene (sketched below).
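
In livekit-client terms, the topic split could look like this (the delta shape follows the spec 001 message names above; the token endpoint is step 3's stub, and applyDelta stands in for the renderer hook):

```ts
// Sketch: xrai-delta over LiveKit data channels (topics from the list above).
import { Room, RoomEvent } from "livekit-client";

declare function applyDelta(delta: object): void; // renderer-specific hook (hypothetical)

const token = await fetch("/api/livekit-token?room=demo&identity=me").then((r) => r.text());
const room = new Room();
await room.connect("wss://portals-dev.livekit.cloud", token);

// Publish a scene edit — reliable delivery, routed by topic.
const delta = { type: "update_transform", id: "node-42", position: [0, 1.5, -2] };
await room.localParticipant.publishData(
  new TextEncoder().encode(JSON.stringify(delta)),
  { reliable: true, topic: "xrai-delta" }
);

// Apply inbound deltas locally, whatever the renderer is.
room.on(RoomEvent.DataReceived, (payload, _participant, _kind, topic) => {
  if (topic !== "xrai-delta") return;
  applyDelta(JSON.parse(new TextDecoder().decode(payload)));
});
```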

Reference patterns:

Live hologram mode (v1 upgrade — RFC 0005 scope):

Reuses the existing web/rgbd-viewer/src/RGBDPointCloud.ts + HueDepthCodec.ts. When a participant publishes an RGBD video track (iOS ARKit on the Portals app, or browser webcam + ONNX depth), other participants’ js/live-web.js switches the corner tile to a full-scene hologram reconstruction. Backed by the Portals compute substrate (CVPR paper § 2).

Scene import / edit / save / share — the shared editor loop:

Any participant can:

Current state (this release):

Security note: all multiplayer deltas are validated against XRAI schema before apply. No arbitrary code execution path. See SECURITY.md § Threat model.


Multiplayer presence — heads + hands + scenes

When LiveKit ships (per spec 010 + spec 003 Phase 2), every connected participant broadcasts:

Shared scene editing:

Each participant can:

  1. Import an XRAI scene — drag-drop / file-picker loads a .xrai.json into the shared graph; other participants see it appear in real time
  2. Edit — move a node (drag), add an entity (voice command or click toolbar), delete (select + Del), link (drag from one node to another)
  3. Save — current scene state → download + optionally publish to a shared URL
  4. Share — copy invite URL with ?room=<id> → new participants join the same scene

Live hologram mode:

Participants with AR-capable devices (Portals app iOS / visionOS Safari WebXR / Quest) can toggle “hologram” mode:

Privacy:

This layer is stubbed today — the invite button in the topnav generates ?room=<id> URLs, and the hydratePortalsFeed() loop fetches /api/portals-feed.json for pre-populated participants. Full transport lands with RFC 0005 (multiplayer delta protocol).


XRAI file I/O

Load (📂 Load XRAI):

Save (💾 Save XRAI):

Round-trip guarantee: Load → render → Save produces a semantically-equivalent XRAI document. Byte-identical is not guaranteed (formatting, key ordering may differ), but entity + relation sets match.
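
Both buttons reduce to a few lines of File API; a sketch under that guarantee (validate is the core export named earlier; its exact signature here is assumed):

```ts
// Sketch of the 📂 Load / 💾 Save paths via the File API.
import { validate } from "./xrai-core.js";

async function loadXrai(file: File): Promise<object> {
  const doc = JSON.parse(await file.text());
  validate(doc);                 // schema check before anything renders
  return doc;
}

function saveXrai(doc: object, name = "scene.xrai.json") {
  // Serialization may reorder keys/whitespace — semantic, not byte, equality.
  const blob = new Blob([JSON.stringify(doc, null, 2)], { type: "application/json" });
  const a = Object.assign(document.createElement("a"), {
    href: URL.createObjectURL(blob),
    download: name,
  });
  a.click();
  URL.revokeObjectURL(a.href);
}
```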


Cross-platform viewing paths

| Platform | URL path | Renders via |
|---|---|---|
| xrai.dev (default) | index.html | 3d-force-graph |
| xrai.dev?engine=echarts | runtimes/echarts/viewer.html | ECharts-GL |
| xrai.dev?engine=playcanvas | runtimes/playcanvas/viewer.html | PlayCanvas |
| xrai.dev?engine=needle | runtimes/needle/viewer.html | Needle Engine |
| xrai.dev?engine=icosa | runtimes/icosa/viewer.html | Icosa AR (links out to Gallery) |
| Portals app (iOS) | portals://xrai/open?src=<url> | Unity VFX Graph (canonical) |

When a browser visits xrai.dev with a portals:// URL scheme handler registered (after installing the Portals iOS app), the “📱 Open in Portals app” engine option routes there for full-fidelity rendering.


This viewer architecture reuses existing Portals specs — does not invent new requirements:

| Spec | What it covers | Relevance to viewer |
|---|---|---|
| 001 Unity-RN bridge | 69 typed bridge message types | Same message types carry multiplayer deltas |
| 003 Hologram telepresence | LiveKit integration (Phase 2) | Multiplayer transport for XRAI viewer |
| 004 Scene save XRAI | Scene persistence | XRAI load/save semantics |
| 006 KB visualizer | R3F / 3d-force-graph / Needle / ECharts-GL + hypergraph primitives | Direct reuse of engine-per-target decisions |
| 010 Multiplayer LiveKit | Unified transport (spec frozen, LiveKit selected) | Multiplayer layer |
| 014 Web integration | Login-gated Viewer/Editor + RGBD hologram web viewer | viewer.portals.app (Portals-branded) + this xrai.dev (public MIT) share adapter patterns |
| 015 visionOS / Needle | Needle for visionOS Safari WebXR | Primary engine for visionOS target |
| 022 Universal asset IO | Federated asset search + Icosa Gallery | object.glb asset references render consistently across engines |

RFCs this architecture implies (future)

None of these block the v1.0 launch. The viewer ships with stubs for every engine except force-graph, which is fully working. Later engines land as their adapters, conformance markers, and parity screenshots drop.


Adoption signals (what we measure)

| Signal | Target by | Meaning |
|---|---|---|
| Non-Portals runtime passes v1.0 conformance | end of Q3 2026 | XRAI parse is portable |
| External XRAI doc loaded via File API | within 30 days of launch | File I/O actually used |
| Cross-engine parity screenshots committed | Q3 2026 | RUNTIMES_EVALUATION O9 gate passing |
| Live multiplayer session (2+ clients) | Q4 2026 | LiveKit transport landed |
| Portals app opens via portals:// deep-link | launch day | URL scheme registered |

Meta: why a 3D force-graph landing page?

Because the format describes spatial hypergraphs, the landing page is an XRAI scene rendering itself. Self-referential. Dogfood. Every doc node is the doc it links to. Every edge is the cross-reference it represents.

When a visitor clicks a node and drills into SPEC.md, they’re traversing the same graph structure the spec defines. The medium is the message.