
RFC 0009: Time-based stream ingestion via schema mapping + sparse semantic encoding


Summary

Make XRAI a neutral semantic bus. Any time-based stream or log (sensor, agent trace, sim tick, app log, MCP tool call, ARKit anchor, LiveKit packet, Rerun recording, OpenTelemetry span, ROS bag topic, …) can be ingested via a declarative schema mapping into sparse semantic encoding (SSE): deltas typed against a public ontology. Any decoder (RFC 0012: AI agents, LLMs, procedural/generative renderers, shaders/VFX, neural renderers, 3D viewers, game engines, bridges, converters, simulators) can then render, simulate, replay, or generate from the same stream.

Concretely: XRAI grows three layers below the entity/relation/event schema (Sources → Mappings → SSE) and one layer above it (Decoders, RFC 0012). The v1.0 JSON doc becomes the snapshot projection of an SSE stream at a given (timeline, t).

Motivation

Design

Four layers

┌───────────────────────────────────────────────────────────────────────────────┐
│  SOURCES           Any time-based stream / log                                │
│                    ARKit · LiveKit · OpenTelemetry · ROS bag · Rerun .rrd ·   │
│                    MCP tool trace · LLM chain-of-thought · sim tick · app log │
├───────────────────────────────────────────────────────────────────────────────┤
│  MAPPINGS          Declarative schema-mapping docs (xrai://schemas/…)         │
│                    source-field → xrai ontology slot, with transform + unit   │
├───────────────────────────────────────────────────────────────────────────────┤
│  SSE               Sparse Semantic Encoding — delta stream on the wire        │
│                    {entity_path, component, timeline.*, payload, delta_kind}  │
│                    Wire-format agnostic (JSON-Patch · Arrow IPC · CBOR · …)   │
├───────────────────────────────────────────────────────────────────────────────┤
│  DECODERS (RFC 0012)                                                          │
│                    Renderers (shader / VFX / PBR / neural) · game engines ·  │
│                    3D viewers · simulators · LLM agents · generators ·        │
│                    bridges (Rerun / OpenTelemetry / glTF / USD out) ·         │
│                    converters (diffusion-conditioned, procedural, L-system)   │
└───────────────────────────────────────────────────────────────────────────────┘

Each layer is independently versionable. A new source adds one mapping doc. A new decoder adds one consumer. Neither touches the others.

1. Sources — what counts as ingestible

Anything that emits rows at a rate ≥0 Hz (event-driven and offline streams included) with at least one timestamp and at least one named field. XRAI takes no opinion on the source’s native shape; a mapping (§2) declares the translation. Non-exhaustive catalog (each row has at least one reference mapping planned in the implementation plan):

| Source kind | Example stream | Typical rate | Mapping id (planned) |
|---|---|---|---|
| Session / tool telemetry | ~/.claude/session-stats/history.jsonl (duration_min, commits, est_tokens) | event-driven | claude-session-history-v1 |
| Structured app log | console.log + console.error lines, Unity Debug.Log, OSLog | event-driven | console-jsonl-v1, unity-debuglog-v1 |
| Distributed trace / metric | OpenTelemetry span / metric / log | varies | opentelemetry-v1 |
| Financial market data | stock tick (NYSE/NASDAQ WebSocket), crypto order book (Coinbase/Kraken), OHLCV bar feed | 1 Hz – 10 kHz | market-tick-v1, ohlcv-bar-v1, orderbook-l2-v1 |
| News / text stream | RSS/Atom, news API (AP/Reuters/BBC), Twitter/Bluesky firehose, Mastodon stream | event-driven | rss-atom-v1, bluesky-firehose-v1 |
| Motion capture | OptiTrack NatNet, Vicon DataStream, Rokoko Smartsuit, Xsens MVN, MediaPipe Pose 33pt, OpenPose, Azure Kinect body | 60–240 Hz | natnet-v1, mediapipe-pose-v1, azure-kinect-body-v1 |
| Video frame stream | RTSP H.264/H.265, WebRTC VP9/AV1, HLS .ts segments, raw NAL, RGBD (HueDepth) | 30–120 FPS | video-rtsp-v1, webrtc-vp9-v1, rgbd-huedepth-v1 |
| Image sequence | PNG/JPG/EXR folder, photo burst, ARKit keyframe chain, scientific microscopy TIFF stack | varies | image-seq-v1, arkit-keyframe-v1 |
| Depth / point cloud | iPhone LiDAR mesh, Livox LiDAR, Velodyne, RealSense depth, Microsoft Kinect | 10–60 Hz | ios-lidar-v1, velodyne-pcap-v1, realsense-d4xx-v1 |
| AR / XR runtime | ARKit anchors + face/body, ARCore trackables, OpenXR actions, WebXR transforms | 60–120 Hz | arkit-v1, arcore-v1, openxr-actions-v1, webxr-transforms-v1 |
| Real-time transport | LiveKit DataChannel (spec 010), WebRTC, WebSocket, MQTT, NATS, Kafka | varies | livekit-datachannel-v1, mqtt-topic-v1, kafka-topic-v1 |
| Agent trace | LLM tool call (JSON), MCP server invocation, chain-of-thought step, LangGraph run | event-driven | mcp-tool-trace-v1, llm-cot-v1, langgraph-run-v1 |
| Sim tick | physics step (Bullet/PhysX/MuJoCo), ECS system pass (Unity DOTS/Bevy), cloth/fluid step | 30–1000 Hz | mujoco-tick-v1, unity-dots-v1 |
| Recorded corpora | Rerun .rrd, ROS bag / rosbag2, HDF5 LeRobot episode, CSV, Parquet | offline | rerun-rrd-v1, ros-bag-v1, lerobot-episode-v1 |
| User input | touch / gesture / voice / gaze / keyboard / MIDI | 60 Hz | input-events-v1, midi-stream-v1 |
| Environmental / IoT sensor | weather API, air quality, seismic, GPS / GNSS NMEA, Bluetooth beacons, industrial PLC (Modbus/OPC-UA) | varies | nmea-gps-v1, modbus-v1, weather-api-v1 |
| Biosignal | heart rate (Polar/Apple Watch), EEG (Muse/OpenBCI), EMG, pulse-ox, respiration | 1–512 Hz | heart-rate-v1, eeg-openbci-v1 |
| Generative / synthesis | LLM diffusion frame, procedural tick, WFC step, Shader Graph param, neural-renderer latent | varies | diffusion-frame-v1, procedural-tick-v1 |

The important property: XRAI does not privilege “spatial” sources. A stock market tick and a motion-capture joint pose land on the same bus with the same delta shape — the only difference is which ontology components they populate (metric_value + asset_symbol vs pose_6dof + joint_id). That’s what makes the substrate general.
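To make the "same delta shape" claim concrete, here is a minimal sketch that builds deltas for a stock tick and for a motion-capture joint and checks that they share one record shape. The field names follow the SSE record in §3; the `make_delta` helper, the entity paths, and the payload layouts are illustrative assumptions, not part of this RFC.

```python
def make_delta(entity_path, component, payload, log_time_ns, delta_kind="update"):
    """Build one SSE-shaped record as a plain dict (one per changed component)."""
    return {
        "entity_path": entity_path,
        "component": component,
        "timeline": {"log_time": log_time_ns},
        "payload": payload,
        "delta_kind": delta_kind,
    }

T = 1_700_000_000_000_000_000  # arbitrary nanosecond timestamp

# Hypothetical paths and payloads; only the component names come from the text.
stock_deltas = [
    make_delta("/market/tick/AAPL", "asset_symbol", "AAPL", T, "add"),
    make_delta("/market/tick/AAPL", "metric_value", 189.25, T),
]
mocap_deltas = [
    make_delta("/mocap/actor_1/left_wrist", "joint_id", 15, T, "add"),
    make_delta("/mocap/actor_1/left_wrist", "pose_6dof",
               {"position": [0.1, 1.2, 0.3], "rotation": [0.0, 0.0, 0.0]}, T),
]

# Every record exposes the same set of fields; only `component` differs.
shapes = {frozenset(d) for d in stock_deltas + mocap_deltas}
components = [d["component"] for d in stock_deltas + mocap_deltas]
```

A decoder that understands only `metric_value` can skip the pose records without any special casing, because dispatch happens on the `component` string rather than on the source kind.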

Conformance requirement: any adapter claiming to ingest source X MUST produce SSE deltas that round-trip to a v1.0 JSON snapshot at any supported (timeline, t). Sources with no natural spatial structure (e.g. a news feed) are still conformant — they emit entities under synthetic paths like /news/article/<uuid> with text + embedding_<dim> components; decoders decide whether to render them in space (e.g. word-cloud), as text overlay, or as a graph node.
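As a rough sketch of the non-spatial case, the adapter below turns one news item into deltas under a synthetic `/news/article/<uuid>` path with `text` and `embedding_<dim>` components. The function name, its arguments, and the choice of `add` as the delta kind are assumptions made for illustration.

```python
import uuid

def news_article_deltas(text, embedding, log_time_ns, article_id=None):
    """Emit SSE-shaped deltas for one article (hypothetical adapter)."""
    article_id = article_id or str(uuid.uuid4())
    base = {
        "entity_path": f"/news/article/{article_id}",
        "timeline": {"log_time": log_time_ns},
        "delta_kind": "add",  # assumption: a new article is an add
    }
    return [
        {**base, "component": "text", "payload": text},
        # The component name carries the embedding dimension, e.g. embedding_4.
        {**base, "component": f"embedding_{len(embedding)}", "payload": list(embedding)},
    ]

deltas = news_article_deltas(
    "Example headline", [0.1, 0.2, 0.3, 0.4],
    log_time_ns=1_700_000_000_000_000_000, article_id="abc",
)
```

Nothing in these records says how to draw the article; that choice stays with the decoder.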

2. Mappings — schema mapping docs

A mapping is a small JSON/YAML document at a well-known URI (xrai://schemas/mappings/<id>.json). It declares how fields from a source stream become XRAI ontology slots.

{
  "xrai_mapping_version": "1.2",
  "id": "arkit-v1",
  "source": { "kind": "arkit", "producer_hint": "Portals iOS app" },
  "ontology": "xrai://ontology/v1",
  "rules": [
    {
      "match":  { "source_type": "ARAnchor" },
      "emit":   {
        "entity_path": "/ar/anchor/{uuid}",
        "archetype":   "Anchor",
        "components":  {
          "transform":  { "from": "transform_mat4", "as": "mat4_to_pose6dof", "units": "m/rad" },
          "confidence": { "from": "tracking_state", "as": "enum_to_float01" }
        },
        "timeline": {
          "log_time":    { "from": "timestamp", "as": "ns" },
          "arkit_frame": { "from": "frame_index", "as": "int64" }
        }
      }
    },
    {
      "match": { "source_type": "ARFaceAnchor.blendShapes" },
      "emit":  {
        "entity_path": "/ar/face/{uuid}/blendshapes",
        "archetype":   "BlendShapes",
        "components":  { "blend_weights": { "from": "*", "as": "dict_float01" } }
      }
    }
  ]
}
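A minimal sketch of how a runtime might evaluate the first rule of the mapping above against one source record. Everything beyond the mapping document itself is an assumption: the `apply_mapping` function, the transform implementations (including the roll/pitch/yaw convention, the tracking-state values, and treating `timestamp` as seconds), the default `delta_kind` of `update`, and the shape of the input record. The wildcard rule (`"from": "*"`) is left out.

```python
import math

def mat4_to_pose6dof(cols):
    """cols: 4 columns of 4 floats (column-major). Assumes R = Rz(yaw) Ry(pitch) Rx(roll)."""
    r = lambda row, col: cols[col][row]
    return {
        "position": list(cols[3][:3]),
        "rotation": [
            math.atan2(r(2, 1), r(2, 2)),                  # roll about x
            -math.asin(max(-1.0, min(1.0, r(2, 0)))),      # pitch about y
            math.atan2(r(1, 0), r(0, 0)),                  # yaw about z
        ],
    }

# Hypothetical transform registry keyed by the mapping's "as" names.
TRACKING_STATE = {"notAvailable": 0.0, "limited": 0.5, "normal": 1.0}
TRANSFORMS = {
    "mat4_to_pose6dof": mat4_to_pose6dof,
    "enum_to_float01": lambda v: TRACKING_STATE[v],
    "ns": lambda seconds: int(round(seconds * 1e9)),  # assumes input in seconds
    "int64": int,
}

ARKIT_V1 = {
    "id": "arkit-v1",
    "rules": [{
        "match": {"source_type": "ARAnchor"},
        "emit": {
            "entity_path": "/ar/anchor/{uuid}",
            "archetype": "Anchor",
            "components": {
                "transform": {"from": "transform_mat4", "as": "mat4_to_pose6dof", "units": "m/rad"},
                "confidence": {"from": "tracking_state", "as": "enum_to_float01"},
            },
            "timeline": {
                "log_time": {"from": "timestamp", "as": "ns"},
                "arkit_frame": {"from": "frame_index", "as": "int64"},
            },
        },
    }],
}

def convert(spec, record):
    """Apply the transform named by spec["as"] to the field named by spec["from"]."""
    return TRANSFORMS[spec["as"]](record[spec["from"]])

def apply_mapping(mapping, record):
    """Return one SSE delta per component for the first rule matching `record`."""
    for rule in mapping["rules"]:
        if all(record.get(k) == v for k, v in rule["match"].items()):
            emit = rule["emit"]
            timeline = {name: convert(spec, record)
                        for name, spec in emit.get("timeline", {}).items()}
            return [{
                "entity_path": emit["entity_path"].format(**record),
                "archetype": emit["archetype"],
                "component": component,
                "timeline": timeline,
                "payload": convert(spec, record),
                "delta_kind": "update",  # assumption: the mapping does not set this
                "provenance": {"mapping_id": mapping["id"]},
            } for component, spec in emit["components"].items()]
    return []

record = {
    "source_type": "ARAnchor", "uuid": "A1B2", "tracking_state": "normal",
    "transform_mat4": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0.5, 1.0, -2.0, 1]],
    "timestamp": 12.5, "frame_index": 750,
}
deltas = apply_mapping(ARKIT_V1, record)
```

The mapping document stays declarative here; all source-specific code lives in the transform registry, so adding a source means adding a document and, at most, a few registry entries.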

Mapping semantics:

3. SSE — sparse semantic encoding (wire format)

SSE is the delta shape XRAI runtimes consume and emit. One record per changed component per entity per timeline-step.

| Field | Type | Required | Notes |
|---|---|---|---|
| entity_path | string | | /scene/anchor_1/hologram_a (RFC 0010 hierarchy) |
| archetype | string | | optional hint; decoders MAY dispatch on it |
| component | string | | ontology-typed (e.g. pose_6dof, audio_reactive_gain, blend_weights) |
| timeline | map<string, scalar> | | ≥1 entry; log_time recommended |
| payload | ontology-typed value | | shape determined by component’s ontology entry |
| delta_kind | enum | | add/update/remove/annotate; annotate = metadata-only, doesn’t change state |
| provenance | map<string, any> | | source id, mapping id, signer, model version; used by decoders that care |
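The record shape can be sketched as a small validated type. This is an illustration only: the class name `SseRecord` is hypothetical, treating `archetype` and `provenance` as the optional fields is an assumption (the table describes `archetype` as an optional hint), and so is the rule that `entity_path` must start with `/`.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

DELTA_KINDS = frozenset({"add", "update", "remove", "annotate"})

@dataclass(frozen=True)
class SseRecord:
    """One changed component of one entity at one timeline step."""
    entity_path: str
    component: str
    timeline: dict
    payload: Any
    delta_kind: str
    archetype: Optional[str] = None                  # optional dispatch hint
    provenance: dict = field(default_factory=dict)   # assumed optional

    def __post_init__(self):
        if not self.entity_path.startswith("/"):
            raise ValueError(f"entity_path must be absolute: {self.entity_path!r}")
        if not self.timeline:
            raise ValueError("timeline needs at least one entry")
        if self.delta_kind not in DELTA_KINDS:
            raise ValueError(f"unknown delta_kind: {self.delta_kind!r}")

record = SseRecord(
    entity_path="/scene/anchor_1/hologram_a",
    component="pose_6dof",
    timeline={"log_time": 1_700_000_000_000_000_000},
    payload={"position": [0.0, 1.5, -2.0], "rotation": [0.0, 0.0, 0.0]},
    delta_kind="update",
)
```

Payload validation is deliberately absent: its shape depends on the component's ontology entry, which this sketch does not model.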

Wire-format-agnostic. SSE is a logical shape. Producers pick:

A runtime declares which wire formats it accepts; discovery by MIME:

All four are semantically equivalent.

4. Ontology — what component strings mean

component strings resolve against the master ontology (RFC 0013). That ontology is:

Starter corpus for v1.2 (shipped at xrai-website/ontology/v1.json, see RFC 0013 §10):

Every ontology entry links back to prior art (schema.org, Wikidata QIDs, OpenTelemetry semantic conventions, W3C verifiable credentials, WordNet synsets, USD schemas, NIST SI) so XRAI is a superset-merger, not a parallel invention. Full governance + learning pipeline in RFC 0013.

Snapshot projection (v1.0 compatibility)

Given an SSE stream + a (timeline, t) query, a runtime projects to a v1.0 JSON doc:

for each entity_path E that has received ≥1 delta at t' ≤ t (and has not since been removed):
  entity.id = E
  entity.type = archetype OR inferred from components
  entity.components = { c : latest(payload) for each component c active at t }
  entity.transform = compose(pose_6dof up E's path)
emit { xrai_version:"1.0", scene: { entities:[…], relations:[…], events:[…] } }

This guarantees every SSE stream has an isomorphic v1.0 snapshot — the JSON spec stays canonical for humans + LLMs authoring small docs; SSE is the machine-native streaming form.
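The projection above could be sketched in Python as a fold over the deltas. This is a simplified reading under stated assumptions: `remove` deletes a single component and an entity with no remaining components is dropped, `type` is taken from the last `archetype` seen (no inference from components), transform composition up the entity path is omitted, and `relations` and `events` are left empty. The function name `project_snapshot` is hypothetical.

```python
def project_snapshot(deltas, timeline, t):
    """Fold all deltas with timeline value <= t into a v1.0-style snapshot dict."""
    visible = [d for d in deltas
               if timeline in d["timeline"] and d["timeline"][timeline] <= t]
    entities = {}
    # sorted() is stable, so deltas sharing a timestamp keep their input order.
    for d in sorted(visible, key=lambda d: d["timeline"][timeline]):
        if d["delta_kind"] == "annotate":
            continue  # metadata-only: does not change state
        entity = entities.setdefault(
            d["entity_path"],
            {"id": d["entity_path"], "type": None, "components": {}},
        )
        if d["delta_kind"] == "remove":
            entity["components"].pop(d["component"], None)
        else:  # add or update: latest payload wins
            entity["components"][d["component"]] = d["payload"]
            entity["type"] = d.get("archetype") or entity["type"]
    live = [e for e in entities.values() if e["components"]]
    return {"xrai_version": "1.0",
            "scene": {"entities": live, "relations": [], "events": []}}

P1, P2 = {"position": [0, 0, 0]}, {"position": [1, 0, 0]}
deltas = [
    {"entity_path": "/scene/a", "component": "pose_6dof", "archetype": "Anchor",
     "timeline": {"log_time": 1}, "payload": P1, "delta_kind": "add"},
    {"entity_path": "/scene/a", "component": "pose_6dof",
     "timeline": {"log_time": 3}, "payload": P2, "delta_kind": "update"},
    {"entity_path": "/scene/b", "component": "text",
     "timeline": {"log_time": 2}, "payload": "hi", "delta_kind": "add"},
    {"entity_path": "/scene/b", "component": "text",
     "timeline": {"log_time": 4}, "payload": None, "delta_kind": "remove"},
]
snapshot_at_3 = project_snapshot(deltas, "log_time", 3)
snapshot_at_4 = project_snapshot(deltas, "log_time", 4)
```

Querying the same stream at two times yields two different snapshots: `/scene/b` is present at `t = 3` and gone at `t = 4`.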

Conformance impact

A v1.2 runtime MUST:

  1. Accept at least one SSE wire format.
  2. Honor ontology-typed components (reject unknown types gracefully — annotate+preserve, don’t crash).
  3. Project to a v1.0 snapshot on demand.
  4. Emit SSE when authoring (so the whole loop is round-trippable).

v1.0/1.1 runtimes are unaffected — they see only the projected JSON.
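One possible reading of "annotate+preserve" in requirement 2, sketched below: a delta whose component is not in the runtime's ontology is downgraded to `annotate` so it cannot change state, while its payload and original kind are kept for later consumers. The `admit` function, the provenance keys, and the small set of known components are all hypothetical.

```python
# Hypothetical subset of the ontology known to this runtime.
KNOWN_COMPONENTS = frozenset({"pose_6dof", "blend_weights", "metric_value"})

def admit(delta, known=KNOWN_COMPONENTS):
    """Return a delta that is safe to apply; never raises on unknown components."""
    if delta["component"] in known:
        return delta
    preserved = dict(delta)  # shallow copy: the caller's record is not mutated
    preserved["delta_kind"] = "annotate"
    preserved["provenance"] = {
        **delta.get("provenance", {}),
        "unknown_component": True,
        "original_delta_kind": delta["delta_kind"],
    }
    return preserved

known_delta = {"entity_path": "/scene/a", "component": "pose_6dof",
               "timeline": {"log_time": 1}, "payload": {}, "delta_kind": "update"}
unknown_delta = {"entity_path": "/scene/a", "component": "smell_intensity",
                 "timeline": {"log_time": 2}, "payload": 0.7, "delta_kind": "update"}

passed = admit(known_delta)
downgraded = admit(unknown_delta)
```

Because the original kind is recorded, a runtime that later learns the component type could in principle replay the preserved delta.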

Error semantics

Alternatives considered

A — hard-port Rerun’s .rrd + archetype set

What the earlier draft of this RFC proposed. Rejected because it conflates wire format (Arrow) with semantic shape (archetypes) and forces every ingestion path through Rerun’s specific archetype taxonomy. That locks out agent traces, app logs, and generative streams that don’t fit an image/mesh/points3d mental model. The current design keeps Arrow as ONE wire format, archetypes (RFC 0010) as ONE component-bundling convention, and opens the door to everything else.

B — adopt OpenTelemetry as-is

OpenTelemetry’s data model is observability-centric (logs, metrics, spans). Missing: spatial entities, hierarchy, transforms, material/shader/audio, relations. XRAI’s ontology would become a second-class dialect of OpenTelemetry. Rejected. However, XRAI’s telemetry-shaped components (log_level, span_id, metric_value) reuse OpenTelemetry semantic conventions verbatim, with no reinvention.

C — adopt USD layers

USD is layered composition, not streaming. Excellent for asset pipelines; poor for 60 Hz deltas. Rejected for the wire layer but kept as an input mapping (usd → SSE) and an output decoder (SSE → usd snapshot).

D — no ingestion layer, keep XRAI as author-only format

Status quo. Works for xra1.com author pages; falls apart when users ask “can I ingest this ARKit session / LiveKit recording / LLM trace and render it in Portals?” Rejected — the v2.1 embeddings and v3.0 meta-layer roadmap is unbuildable without a stream substrate.

E — one frozen wire format (Arrow only)

Simpler. Rejected: LLM agents and scripting users author in JSON-Patch; embedded devices prefer CBOR; enterprise prefers protobuf. Wire-format plurality with a single semantic shape is the cheapest path to broad adoption.

Backwards compatibility

Implementation plan

  1. RFC merged. Ontology-v1 frozen (starter set above).
  2. xrai://schemas/mappings/ registry established. First-batch mappings (one per source family in §1, ordered by near-term Portals need):
    • Tier 1 (dogfood / ships with Portals v4): arkit-v1, livekit-datachannel-v1, rgbd-huedepth-v1, mediapipe-pose-v1, claude-session-history-v1, console-jsonl-v1, unity-debuglog-v1, mcp-tool-trace-v1.
    • Tier 2 (broad adoption surface): opentelemetry-v1, rerun-rrd-v1, ros-bag-v1, lerobot-episode-v1, webxr-transforms-v1, openxr-actions-v1, rss-atom-v1, bluesky-firehose-v1.
    • Tier 3 (long-tail / contributed): market-tick-v1, ohlcv-bar-v1, orderbook-l2-v1, natnet-v1, azure-kinect-body-v1, ios-lidar-v1, velodyne-pcap-v1, nmea-gps-v1, modbus-v1, eeg-openbci-v1, heart-rate-v1, midi-stream-v1, kafka-topic-v1, mqtt-topic-v1, mujoco-tick-v1, diffusion-frame-v1.
  3. Reference SSE encoders/decoders in js/xrai-sse.js + Python + Rust.
  4. LiveKit DataChannel topic xrai-sse (supersedes xrai-delta wire spec, RFC 0005 updated in parallel).
  5. Conformance corpus (RFC 0003) extended — 10 round-trip tests per wire format.
  6. Decoder contract ships alongside (RFC 0012).

Unresolved

Prior art (primary sources only)

Future work

Adoption signals