ONNX Module

This page documents the ONNX/WebGPU integration, which generates Gaussian Splatting data directly on the GPU with no CPU readbacks. The module provides GPU-only inference for dynamic point cloud generation and supports both static (one-time) and dynamic (per-frame) inference modes.

Overview

  • Execution Provider: onnxruntime-web (WebGPU)
  • I/O Binding: All inputs and outputs are GPU buffers (no CPU roundtrips)
  • Device Sharing: Reuses the app's WebGPU device for buffer compatibility
  • Precision Support: Automatic detection and configuration for float16, float32, int8, uint8
  • Outputs:
      • gaussian: (N, 10) packed gaussian parameters (precision auto-detected)
      • sh or rgb: (N, colorDim) color data (SH or RGB, precision auto-detected)
      • num_points: int32[1] actual point count per inference
  • Consumers: DynamicPointCloud and GaussianPreprocessor (the point count is overwritten on the GPU from num_points)
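
The output shapes above determine how much GPU memory each preallocated buffer needs. The following is a minimal sketch of that arithmetic, not the module's actual allocation code: the helper names (`alignTo4`, `outputByteSizes`) are hypothetical, and rounding sizes up to a multiple of 4 bytes is an assumption made here because WebGPU buffer copies require 4-byte-aligned sizes.

```typescript
// Hypothetical sizing helper; names are illustrative, not part of the module's API.

// Each point carries 10 packed gaussian parameters, per the (N, 10) output shape.
const GAUSSIAN_STRIDE = 10;

interface OutputByteSizes {
  gaussian: number;  // bytes for the (N, 10) gaussian output
  color: number;     // bytes for the (N, colorDim) sh or rgb output
  numPoints: number; // bytes for the int32[1] num_points output
}

// Round a byte length up to the next multiple of 4.
function alignTo4(byteLength: number): number {
  return Math.ceil(byteLength / 4) * 4;
}

// Compute buffer sizes for a given capacity and per-element widths
// (for example 2 for float16, 4 for float32, 1 for int8/uint8).
function outputByteSizes(
  maxPoints: number,
  colorDim: number,
  gaussianBytesPerElement: number,
  colorBytesPerElement: number
): OutputByteSizes {
  return {
    gaussian: alignTo4(maxPoints * GAUSSIAN_STRIDE * gaussianBytesPerElement),
    color: alignTo4(maxPoints * colorDim * colorBytesPerElement),
    numPoints: 4, // a single int32
  };
}
```

Sizing by the maximum capacity rather than the current point count lets the same buffers be reused across inferences, with num_points reporting how many entries are valid.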

Key Features

  • GPU-Only Pipeline: No CPU readbacks, all data stays on GPU
  • Automatic Precision Detection: Detects data types from model metadata or output names
  • Static & Dynamic Modes: Supports one-time inference or per-frame updates
  • Color Mode Detection: Automatically detects SH vs RGB output
  • Capacity Detection: Auto-detects max points from model metadata
  • Graph Capture Support: Optional WebGPU graph capture for performance
  • Resource Management: Proper cleanup and disposal of GPU resources
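
As an illustration of color mode detection, the sketch below picks SH or RGB from a model's output names. It assumes the color output is named exactly `sh` or `rgb` (case-insensitive), as in the Overview; the function `detectColorMode` is hypothetical, and the real detector may use additional signals such as model metadata.

```typescript
// Hypothetical sketch; the module's actual detection logic may differ.
type ColorMode = 'sh' | 'rgb';

// Given the output names of a model (for example, the `outputNames` of an
// onnxruntime-web InferenceSession), decide which color output is present.
function detectColorMode(outputNames: readonly string[]): ColorMode | undefined {
  const names = outputNames.map((name) => name.toLowerCase());
  if (names.includes('sh')) return 'sh';
  if (names.includes('rgb')) return 'rgb';
  return undefined; // neither output found; the caller decides how to fall back
}
```

Returning `undefined` instead of guessing keeps the fallback policy explicit at the call site.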

Files of Interest

  • src/onnx/onnx_generator.ts - High-level generator facade
  • src/onnx/onnx_gpu_io.ts - Low-level GPU I/O binding
  • src/onnx/precision-detector.ts - Automatic precision detection
  • src/onnx/precision-types.ts - Precision type definitions
  • src/app/managers/onnx-manager.ts - ONNX model lifecycle management

Quick Start

Basic Usage (via ONNXManager)

import { ONNXManager } from './app/managers/onnx-manager';

const onnxManager = new ONNXManager(modelManager);

// Load ONNX model (static inference)
const entry = await onnxManager.loadONNXModel(
  device,
  '/models/gaussians3d.onnx',
  cameraMatrix,
  projectionMatrix,
  'static-model',
  { 
    staticInference: true,
    maxPoints: 2_000_000,
    debugLogging: true
  }
);

// Load ONNX model (dynamic per-frame inference)
const dynamicEntry = await onnxManager.loadONNXModel(
  device,
  '/models/gaussians3d_dynamic.onnx',
  cameraMatrix,
  projectionMatrix,
  'dynamic-model',
  { 
    staticInference: false,  // Enable per-frame updates
    maxPoints: 2_000_000,
    debugLogging: true
  }
);

Direct Generator Usage

import { ONNXGenerator } from './onnx/onnx_generator';

const generator = new ONNXGenerator({
  modelUrl: '/models/gaussians3d.onnx',
  maxPoints: 2_000_000,
  debugLogging: true,
  device: gpuDevice
});

await generator.initialize();
await generator.generate({
  cameraMatrix: viewMatrix,
  projectionMatrix: projMatrix,
  time: performance.now() / 1000
});

// Access GPU buffers
const gaussianBuffer = generator.getGaussianBuffer();
const shBuffer = generator.getSHBuffer();
const countBuffer = generator.getCountBuffer();

Precision Configuration

// Manual precision override
const entry = await onnxManager.loadONNXModel(
  device,
  '/models/model.onnx',
  cam, proj, 'model',
  {
    precisionConfig: {
      gaussian: { dataType: 'float32', bytesPerElement: 4 },
      color: { dataType: 'float16', bytesPerElement: 2 }
    }
  }
);
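
Writing `dataType` and `bytesPerElement` by hand leaves room for mismatched pairs. One way to avoid that is to derive the byte width from the type name, as in the sketch below. The type and function names (`OnnxDataType`, `PrecisionInfo`, `PrecisionConfig`, `makePrecisionConfig`) are assumptions for illustration and may not match the definitions in src/onnx/precision-types.ts.

```typescript
// Hypothetical types mirroring the shape of the precisionConfig option above.
type OnnxDataType = 'float16' | 'float32' | 'int8' | 'uint8';

interface PrecisionInfo {
  dataType: OnnxDataType;
  bytesPerElement: number;
}

interface PrecisionConfig {
  gaussian: PrecisionInfo;
  color: PrecisionInfo;
}

// Byte width of one element for each supported data type.
const BYTES_PER_ELEMENT: Record<OnnxDataType, number> = {
  float16: 2,
  float32: 4,
  int8: 1,
  uint8: 1,
};

// Build a config whose bytesPerElement always agrees with its dataType.
function makePrecisionConfig(gaussian: OnnxDataType, color: OnnxDataType): PrecisionConfig {
  const info = (dataType: OnnxDataType): PrecisionInfo => ({
    dataType,
    bytesPerElement: BYTES_PER_ELEMENT[dataType],
  });
  return { gaussian: info(gaussian), color: info(color) };
}
```

With such a helper, the override above could be written as `precisionConfig: makePrecisionConfig('float32', 'float16')`.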

Data Flow

  1. Loading: ONNXManager.loadONNXModel() creates ONNXGenerator and DynamicPointCloud
  2. Initialization: Generator initializes ONNX Runtime session and preallocates GPU buffers
  3. Inference: ONNXGenerator.generate() runs inference and writes to GPU buffers
  4. Dynamic Updates: AnimationManager calls DynamicPointCloud.update() → ONNXGenerator.generate() per frame
  5. Rendering: Preprocessor reads countBuffer to update the instance count; the splats are then sorted and rendered
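
The difference between static and dynamic modes in the steps above can be sketched with stand-in types. Everything here is hypothetical (`FrameInputs`, `FrameGenerator`, `PointCloudDriver`); it only models the decision of when inference runs, assuming static mode means a single run and dynamic mode means one run per update. It is not the actual DynamicPointCloud implementation.

```typescript
// Hypothetical stand-ins for the real generator and point cloud classes.
interface FrameInputs {
  cameraMatrix: Float32Array;
  projectionMatrix: Float32Array;
  time: number; // seconds
}

interface FrameGenerator {
  generate(inputs: FrameInputs): Promise<void>;
}

class PointCloudDriver {
  private runs = 0;

  constructor(
    private readonly generator: FrameGenerator,
    private readonly staticInference: boolean
  ) {}

  // Static mode needs exactly one inference; dynamic mode needs one per frame.
  needsInference(): boolean {
    return !this.staticInference || this.runs === 0;
  }

  // Called once per frame by the animation loop.
  async update(inputs: FrameInputs): Promise<void> {
    if (!this.needsInference()) return;
    // Count the run before awaiting so overlapping frames
    // cannot trigger a second inference in static mode.
    this.runs++;
    await this.generator.generate(inputs);
  }
}
```

Because the outputs stay in GPU buffers, the driver never inspects results; the renderer picks up the new point count from the count buffer on its next pass.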

See: Architecture and API Reference for details.

  • Architecture – Session lifecycle, buffer ownership, and precision-detection flow.
  • API Reference – Generator, manager, and precision-config APIs with usage notes.
  • Point Cloud Module – Shows how ONNX outputs feed DynamicPointCloud.
  • Preprocess Module – Details how ONNX counts and precision flags drive projection.
  • Timeline Module – Covers per-frame animation hooks for dynamic generators.
  • Config Module – Documents how ORT WASM paths are configured before inference.