# ONNX Module
This document describes the ONNX/WebGPU integration used to generate Gaussian Splatting data directly on the GPU with no CPU readbacks. The module provides GPU-only inference for dynamic point-cloud generation and supports both static (one-time) and dynamic (per-frame) inference modes.
## Overview
- Execution Provider: onnxruntime-web (WebGPU)
- I/O Binding: All inputs and outputs are GPU buffers (no CPU roundtrips; see the sketch after this list)
- Device Sharing: Reuses the app's WebGPU device for buffer compatibility
- Precision Support: Automatic detection and configuration for float16, float32, int8, uint8
- Outputs:
  - `gaussian`: (N, 10) packed gaussian parameters (precision auto-detected)
  - `sh` or `rgb`: (N, colorDim) color data (SH or RGB, precision auto-detected)
  - `num_points`: int32[1] actual count per inference
- Consumers: `DynamicPointCloud` + `GaussianPreprocessor` (num_points GPU overwrite)
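Concretely, the execution provider and output location are chosen at session creation, and inputs wrap existing `GPUBuffer`s. Below is a minimal sketch, assuming onnxruntime-web's WebGPU execution provider and a hypothetical `cameraMatrixBuffer`; exact device-sharing mechanics depend on the ORT version:

```ts
import * as ort from 'onnxruntime-web';

// Sketch only: create a WebGPU session whose outputs stay GPU-resident.
const session = await ort.InferenceSession.create('/models/gaussians3d.onnx', {
  executionProviders: ['webgpu'],
  preferredOutputLocation: 'gpu-buffer', // keep outputs on the GPU
  // enableGraphCapture: true,           // optional WebGPU graph capture (see Key Features)
});

// Bind an existing GPUBuffer as a GPU-resident input tensor.
const cameraTensor = ort.Tensor.fromGpuBuffer(cameraMatrixBuffer, {
  dataType: 'float32',
  dims: [4, 4],
});
```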
## Key Features
- GPU-Only Pipeline: No CPU readbacks, all data stays on GPU
- Automatic Precision Detection: Detects data types from model metadata or output names (see the sketch after this list)
- Static & Dynamic Modes: Supports one-time inference or per-frame updates
- Color Mode Detection: Automatically detects SH vs RGB output
- Capacity Detection: Auto-detects max points from model metadata
- Graph Capture Support: Optional WebGPU graph capture for performance
- Resource Management: Proper cleanup and disposal of GPU resources
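As an illustration of the name-based path, a detector can fall back to output-name suffixes when model metadata is missing. This is a hypothetical heuristic, not the verbatim logic of src/onnx/precision-detector.ts:

```ts
// Hypothetical fallback: infer precision from an output-name suffix.
// The real detector may rely on model metadata instead.
function detectPrecisionFromName(outputName: string): 'float16' | 'float32' {
  return /(_fp16|_f16)$/.test(outputName) ? 'float16' : 'float32';
}

detectPrecisionFromName('gaussian_fp16'); // 'float16'
detectPrecisionFromName('gaussian');      // 'float32' (default)
```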
## Files of Interest
- `src/onnx/onnx_generator.ts` – High-level generator facade
- `src/onnx/onnx_gpu_io.ts` – Low-level GPU I/O binding
- `src/onnx/precision-detector.ts` – Automatic precision detection
- `src/onnx/precision-types.ts` – Precision type definitions
- `src/app/managers/onnx-manager.ts` – ONNX model lifecycle management
## Quick Start
### Basic Usage (via ONNXManager)
```ts
import { ONNXManager } from './app/managers/onnx-manager';

const onnxManager = new ONNXManager(modelManager);

// Load ONNX model (static inference)
const entry = await onnxManager.loadONNXModel(
  device,
  '/models/gaussians3d.onnx',
  cameraMatrix,
  projectionMatrix,
  'static-model',
  {
    staticInference: true,
    maxPoints: 2_000_000,
    debugLogging: true
  }
);

// Load ONNX model (dynamic per-frame inference)
const dynamicEntry = await onnxManager.loadONNXModel(
  device,
  '/models/gaussians3d_dynamic.onnx',
  cameraMatrix,
  projectionMatrix,
  'dynamic-model',
  {
    staticInference: false, // Enable per-frame updates
    maxPoints: 2_000_000,
    debugLogging: true
  }
);
```
### Direct Generator Usage
```ts
import { ONNXGenerator } from './onnx/onnx_generator';

const generator = new ONNXGenerator({
  modelUrl: '/models/gaussians3d.onnx',
  maxPoints: 2_000_000,
  debugLogging: true,
  device: gpuDevice
});

await generator.initialize();

await generator.generate({
  cameraMatrix: viewMatrix,
  projectionMatrix: projMatrix,
  time: performance.now() / 1000
});

// Access GPU buffers
const gaussianBuffer = generator.getGaussianBuffer();
const shBuffer = generator.getSHBuffer();
const countBuffer = generator.getCountBuffer();
```
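The getters return plain `GPUBuffer`s, so they can be bound straight into the app's compute or render passes. A sketch, where `preprocessLayout` is a hypothetical `GPUBindGroupLayout` owned by the app:

```ts
// Bind the generator outputs for a preprocessing pass (illustrative only).
const bindGroup = gpuDevice.createBindGroup({
  layout: preprocessLayout,
  entries: [
    { binding: 0, resource: { buffer: gaussianBuffer } }, // (N, 10) packed params
    { binding: 1, resource: { buffer: shBuffer } },       // SH or RGB color data
    { binding: 2, resource: { buffer: countBuffer } },    // int32[1] live count
  ],
});
```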
### Precision Configuration
```ts
// Manual precision override
const entry = await onnxManager.loadONNXModel(
  device,
  '/models/model.onnx',
  cam, proj, 'model',
  {
    precisionConfig: {
      gaussian: { dataType: 'float32', bytesPerElement: 4 },
      color: { dataType: 'float16', bytesPerElement: 2 }
    }
  }
);
```
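The override object mirrors the per-output shape above. A plausible reconstruction of the types in src/onnx/precision-types.ts (assumed, not verbatim):

```ts
// Assumed shape, inferred from the override example above.
type ONNXDataType = 'float16' | 'float32' | 'int8' | 'uint8';

interface OutputPrecision {
  dataType: ONNXDataType;
  bytesPerElement: number; // 2 (float16), 4 (float32), 1 (int8/uint8)
}

interface PrecisionConfig {
  gaussian?: OutputPrecision; // packed gaussian parameters
  color?: OutputPrecision;    // SH or RGB output
}
```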
## Data Flow
- Loading: `ONNXManager.loadONNXModel()` creates `ONNXGenerator` and `DynamicPointCloud`
- Initialization: Generator initializes the ONNX Runtime session and preallocates GPU buffers
- Inference: `ONNXGenerator.generate()` runs inference and writes to the GPU buffers
- Dynamic Updates: `AnimationManager` calls `DynamicPointCloud.update()` → `ONNXGenerator.generate()` per frame (see the sketch after this list)
- Rendering: The preprocessor reads `countBuffer` to update the instance count, then sorts and renders
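For dynamic models, the per-frame path reduces to calling `generate()` again with fresh matrices and time; outputs land in the same preallocated buffers. A minimal sketch using the `ONNXGenerator` instance from Quick Start, with `camera` as a placeholder matrix source:

```ts
// Per-frame dynamic inference; all writes stay on the GPU.
async function frame(timeMs: number) {
  await generator.generate({
    cameraMatrix: camera.viewMatrix,           // placeholder source
    projectionMatrix: camera.projectionMatrix, // placeholder source
    time: timeMs / 1000,
  });
  // The preprocessor reads countBuffer on the GPU; no readback is needed here.
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```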
See: Architecture and API Reference for details.
## Related Docs
- Architecture – Session lifecycle, buffer ownership, and precision-detection flow.
- API Reference – Generator, manager, and precision-config APIs with usage notes.
- Point Cloud Module – Shows how ONNX outputs feed `DynamicPointCloud`.
- Preprocess Module – Details how ONNX counts and precision flags drive projection.
- Timeline Module – Covers per-frame animation hooks for dynamic generators.
- Config Module – Documents how ORT WASM paths are configured before inference.
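On that last point, onnxruntime-web locates its runtime binaries through `ort.env.wasm.wasmPaths`, which must be set before the first session is created; the path below is only an example:

```ts
import * as ort from 'onnxruntime-web';

// Example path; must run before the first InferenceSession.create() call.
ort.env.wasm.wasmPaths = '/ort/';
```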