Utils Module Architecture

The Utils module is the shared foundation for Visionary. Instead of being a single-purpose math library, it is divided into four cooperating subsystems that feed the renderer, managers, and IO pipelines:

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                        │
│  (Renderers · Managers · IO · Controllers · Diagnostics)    │
├─────────────────────────────────────────────────────────────┤
│                      Utils Module                           │
│  ┌────────────┬──────────────┬──────────────┬─────────────┐ │
│  │ Math Core  │ Data Layout  │ GPU Runtime  │ Renderer Ops│ │
│  │ (vec/aabb) │ (float16/SH) │ (read/profil)│ (env/init)  │ │
│  └────────────┴──────────────┴──────────────┴─────────────┘ │
├─────────────────────────────────────────────────────────────┤
│                gl-matrix · WebGPU · three/webgpu            │
└─────────────────────────────────────────────────────────────┘

Architectural Principles

Single Responsibility – each file addresses exactly one family of problems (e.g., half.ts owns every float16 concern).
Allocation Awareness – hot paths reuse typed arrays and staging buffers to stay GC-friendly.
Immutable Inputs – math helpers (Aabb, planeFromPoints, camera math) clone or return new values to avoid upstream state leaks.
Fail-Fast Debuggability – renderer and GPU helpers perform defensive checks (device availability, buffer usage flags, timestamp feature detection) and provide explicit logs to shorten feedback loops.

Subsystems

1. Math Core

Components: aabb.ts, camera-math.ts, transforms.ts, vector-math.ts

Aabb encapsulates min/max bounds and provides read-only center() / radius() helpers. Internally it clones vectors with vec3.clone to guarantee immutability.
Camera math functions implement the pinhole camera model (fov = 2 * atan(pixels / (2 * focal))) and live in their own module so that both IO code and runtime controllers can import them without pulling in the rest of Utils.
VIEWPORT_Y_FLIP is precomputed once to prevent repeated mat4.fromValues calls inside render loops.
vector-math exposes simple tuple-based operations and planeFromPoints. The plane fitter relies on a covariance build + power-iteration eigenvector finder; it can optionally enforce an upward normal to match UI expectations.
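The covariance plus power-iteration approach to plane fitting can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual planeFromPoints code: the names fitPlane and normalize, the optional up parameter, and the fixed iteration count are all hypothetical. Because power iteration converges to the dominant eigenvector while the plane normal is the eigenvector with the smallest eigenvalue, the sketch iterates on trace(C) * I - C instead of C.

```typescript
type Vec3 = [number, number, number];

// Returns a unit vector, or a NaN triplet when the vector cannot be normalized.
function normalize(v: Vec3): Vec3 {
  const len = Math.hypot(v[0], v[1], v[2]);
  if (!(len > 0) || !Number.isFinite(len)) return [NaN, NaN, NaN];
  return [v[0] / len, v[1] / len, v[2] / len];
}

// Hypothetical signature: least-squares plane through `points`.
// `up` (assumed parameter) flips the normal so it points along that direction.
function fitPlane(points: Vec3[], up?: Vec3, iterations = 64): { centroid: Vec3; normal: Vec3 } {
  const n = points.length;
  const c: Vec3 = [0, 0, 0];
  for (const p of points) {
    c[0] += p[0] / n;
    c[1] += p[1] / n;
    c[2] += p[2] / n;
  }

  // Symmetric 3x3 covariance (unnormalized), stored as xx, xy, xz, yy, yz, zz.
  let xx = 0, xy = 0, xz = 0, yy = 0, yz = 0, zz = 0;
  for (const p of points) {
    const dx = p[0] - c[0], dy = p[1] - c[1], dz = p[2] - c[2];
    xx += dx * dx; xy += dx * dy; xz += dx * dz;
    yy += dy * dy; yz += dy * dz; zz += dz * dz;
  }

  // B = trace(C) * I - C has the same eigenvectors as C with the eigenvalue
  // order reversed, so its dominant eigenvector is the plane normal.
  const t = xx + yy + zz;
  const b = [t - xx, -xy, -xz, t - yy, -yz, t - zz];

  // Limitation of this sketch: a start vector exactly orthogonal to the
  // true normal would not converge; a production version would guard that.
  let v: Vec3 = normalize([1, 1, 1]);
  for (let i = 0; i < iterations; i++) {
    v = normalize([
      b[0] * v[0] + b[1] * v[1] + b[2] * v[2],
      b[1] * v[0] + b[3] * v[1] + b[4] * v[2],
      b[2] * v[0] + b[4] * v[1] + b[5] * v[2],
    ]);
  }

  if (up && v[0] * up[0] + v[1] * up[1] + v[2] * up[2] < 0) {
    v = [-v[0], -v[1], -v[2]];
  }
  return { centroid: c, normal: v };
}
```

With no points (or all points identical) the covariance is zero, normalization fails, and the normal comes back as a NaN triplet rather than an arbitrary direction.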

Design goals: zero shared state; numeric safety (helpers return NaN triplets only when a vector cannot be normalized); and easy portability to workers if heavy preprocessing ever moves off the main thread.
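The pinhole relation used by the camera math helpers is easy to write out in both directions. The function names below are hypothetical and chosen only for illustration; focal length and image extent are assumed to be in pixels, and angles in radians.

```typescript
// Field of view (radians) spanned by `pixels` at a focal length of `focal` pixels.
function focalToFov(focal: number, pixels: number): number {
  return 2 * Math.atan(pixels / (2 * focal));
}

// Inverse relation: focal length (pixels) that yields `fov` across `pixels`.
function fovToFocal(fov: number, pixels: number): number {
  return pixels / (2 * Math.tan(fov / 2));
}
```

Both functions are pure and hold no state, so they can run unchanged on the main thread or in a worker.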

2. Data Layout & Float16

Components: half.ts, relevant exports in gpu.ts

f32_to_f16 / f16_to_f32 wrap an IEEE‑754 compliant conversion pipeline. Features such as configurable rounding, FTZ, overflow saturation, canonical NaN, and legacy exponent cutoffs let us mimic historical datasets when validating regressions.
packF16Array / unpackF16Array are batch helpers used when loading ONNX weights or streaming Gaussian parameters.
makeCopySH_PackedF16 repacks SH coefficients from channel-major source properties (f_dc_*, f_rest_*) into the shader-friendly [R0,G0,B0, R1,G1,B1, ...] order. It always writes 48 half values (24 u32) so WGSL structs can assume constant strides no matter which SH degree is used.
shNumCoefficients / shDegreeFromNumCoeffs allow managers to validate payload metadata before attempting to allocate GPU buffers.
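The SH bookkeeping and the channel-major to interleaved reorder can be sketched as below. This is an illustration under assumptions, not the module's code: the return type of shDegreeFromNumCoeffs for invalid counts, the helper name interleaveSH, and the assumed f_rest_* ordering (all red coefficients, then green, then blue) are not confirmed by this document. The half-float conversion step is left out so that only the layout is shown.

```typescript
const SH_MAX_VALUES = 48; // 16 coefficients x 3 channels (degree 3)

// Number of SH coefficients per channel for a given degree: (degree + 1)^2.
function shNumCoefficients(degree: number): number {
  return (degree + 1) * (degree + 1);
}

// Inverse lookup; returns undefined when the count is not a perfect square
// (assumed behavior for invalid metadata).
function shDegreeFromNumCoeffs(numCoeffs: number): number | undefined {
  const degree = Math.round(Math.sqrt(numCoeffs)) - 1;
  return degree >= 0 && shNumCoefficients(degree) === numCoeffs ? degree : undefined;
}

// dc:   [R0, G0, B0]
// rest: channel-major [R1..Rk, G1..Gk, B1..Bk] (assumed source ordering)
// out:  [R0,G0,B0, R1,G1,B1, ...] zero-padded to 48 values.
function interleaveSH(dc: ArrayLike<number>, rest: ArrayLike<number>): Float32Array {
  const out = new Float32Array(SH_MAX_VALUES); // already zero-filled
  const k = Math.floor(rest.length / 3);
  if (3 * (k + 1) > SH_MAX_VALUES) {
    throw new RangeError("SH payload exceeds degree 3");
  }
  out[0] = dc[0];
  out[1] = dc[1];
  out[2] = dc[2];
  for (let i = 0; i < k; i++) {
    const base = 3 * (i + 1);
    out[base] = rest[i];             // red
    out[base + 1] = rest[k + i];     // green
    out[base + 2] = rest[2 * k + i]; // blue
  }
  return out;
}
```

Padding lower degrees with zeros is what lets the shader-side struct keep a constant stride regardless of the SH degree in the file.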

Data Flow:

PLY/ONNX row → float32 math (sigmoid/exp/cov) → packF16Array/makeCopySH → Uint32Array buffer → GPU upload

The conversion subsystem is intentionally stateless; the IO module orchestrates it, but Utils guarantees the math and layout are consistent across every importer.
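The packing step in the data flow above can be illustrated with a reduced float16 encoder. This sketch covers only round-to-nearest-even with overflow to infinity and a canonical NaN; it does not model the configurable rounding, FTZ, saturation, or legacy cutoffs that f32_to_f16 is described as supporting. The names f32ToF16Bits and packHalves are hypothetical, and storing the first half in the low 16 bits of each word is an assumption.

```typescript
// One shared view pair, reused for every conversion (no per-call allocation).
const f32View = new Float32Array(1);
const u32View = new Uint32Array(f32View.buffer);

// Converts a number to IEEE 754 binary16 bits using round-to-nearest-even.
function f32ToF16Bits(value: number): number {
  f32View[0] = value;
  const x = u32View[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  const mant = x & 0x7fffff;

  if (exp === 0xff) return sign | 0x7c00 | (mant ? 0x200 : 0); // Inf or NaN
  const e = exp - 127 + 15; // re-bias the exponent for binary16
  if (e >= 0x1f) return sign | 0x7c00; // too large: infinity
  if (e <= 0) {
    if (e < -10) return sign; // below half the smallest subnormal: signed zero
    const m = mant | 0x800000; // restore the implicit leading 1
    const shift = 14 - e;
    let sub = m >>> shift;
    const rem = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (rem > halfway || (rem === halfway && (sub & 1))) sub++;
    return sign | sub;
  }
  let half = (e << 10) | (mant >>> 13);
  const rem = mant & 0x1fff;
  // A carry out of the mantissa bumps the exponent, which is the correct result.
  if (rem > 0x1000 || (rem === 0x1000 && (half & 1))) half++;
  return sign | half;
}

// Packs two halves per u32, first value in the low 16 bits (assumed order).
function packHalves(values: ArrayLike<number>): Uint32Array {
  const words = new Uint32Array(Math.ceil(values.length / 2));
  for (let i = 0; i < values.length; i++) {
    const h = f32ToF16Bits(values[i]);
    words[i >> 1] |= (i & 1) ? (h << 16) : h;
  }
  return words;
}
```

An odd number of inputs leaves the high half of the last word zero, so the resulting Uint32Array can be uploaded as-is.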

3. GPU Runtime & Instrumentation

Components: gpu.ts, debug-gpu-buffers.ts

align4 / align8 round a byte count up to the next multiple of 4 or 8 with a bit mask, satisfying WebGPU copy and map alignment constraints. They are reused by readWholeBuffer as well as the ONNX upload helpers.
readWholeBuffer performs an end-to-end readback: create staging buffer → copy → wait → map → slice requested bytes → destroy staging buffer. The optional queue.onSubmittedWorkDone() call is guarded, so the readback still works on browsers that lack the API.
GPUStopwatch wraps timestamp query sets. It enforces one label per measurement, prevents capacity overflow, and resolves queries into its own buffer before mapping. All timings are returned in nanoseconds.
Gaussian math helpers (buildCov, sigmoid) live here because they run immediately before GPU uploads.
debug-gpu-buffers.ts adds higher-level diagnostics: reading ONNX count buffers, comparing values across buffers, and tracing the entire count pipeline. These helpers are safe to call in production builds because they fence GPU queues and destroy temporary buffers after use.
createShaderDebugBuffer / readShaderDebugBuffer give shader authors a sanctioned pathway to dump intermediate compute results back to JS for inspection.
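The alignment helpers and the readback sequence can be sketched as follows. This is an outline under assumptions rather than the code in gpu.ts: the BufferLike, EncoderLike, and DeviceLike interfaces are local stand-ins that mirror the subset of WebGPU method names used here, so the sketch type-checks without WebGPU typings, and readWholeBufferSketch is a hypothetical name. The numeric constants match the WebGPU values for GPUBufferUsage.MAP_READ, GPUBufferUsage.COPY_DST, and GPUMapMode.READ.

```typescript
// Round up to a multiple of 4 or 8. Bitwise operators work on 32-bit
// integers, so these are only valid for sizes below 2^31.
function align4(n: number): number { return (n + 3) & ~3; }
function align8(n: number): number { return (n + 7) & ~7; }

const USAGE_MAP_READ = 0x0001; // GPUBufferUsage.MAP_READ
const USAGE_COPY_DST = 0x0008; // GPUBufferUsage.COPY_DST
const MAP_MODE_READ = 0x0001;  // GPUMapMode.READ

// Local stand-ins for the WebGPU objects touched by this sketch.
interface BufferLike {
  mapAsync(mode: number): Promise<void>;
  getMappedRange(): ArrayBuffer;
  unmap(): void;
  destroy(): void;
}
interface EncoderLike {
  copyBufferToBuffer(src: BufferLike, srcOffset: number,
                     dst: BufferLike, dstOffset: number, size: number): void;
  finish(): unknown;
}
interface DeviceLike {
  createBuffer(desc: { size: number; usage: number }): BufferLike;
  createCommandEncoder(): EncoderLike;
  queue: { submit(commands: unknown[]): void; onSubmittedWorkDone?(): Promise<void> };
}

// Assumes `src` was created with COPY_SRC usage and holds at least
// align4(byteLength) bytes.
async function readWholeBufferSketch(
  device: DeviceLike, src: BufferLike, byteLength: number,
): Promise<ArrayBuffer> {
  const size = align4(byteLength); // copy sizes must be multiples of 4
  const staging = device.createBuffer({ size, usage: USAGE_MAP_READ | USAGE_COPY_DST });
  try {
    const encoder = device.createCommandEncoder();
    encoder.copyBufferToBuffer(src, 0, staging, 0, size);
    device.queue.submit([encoder.finish()]);
    // Guarded: skip the explicit wait where the API is missing.
    if (device.queue.onSubmittedWorkDone) {
      await device.queue.onSubmittedWorkDone();
    }
    await staging.mapAsync(MAP_MODE_READ);
    // slice() copies the bytes, so the result survives unmap().
    const bytes = staging.getMappedRange().slice(0, byteLength);
    staging.unmap();
    return bytes;
  } finally {
    staging.destroy(); // always release the staging buffer
  }
}
```

Copying out of the mapped range before unmapping matters because the mapped ArrayBuffer is detached once the buffer is unmapped.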

4. Renderer & Environment Operations

Components: env-map-helper.ts, renderer-init-helper.ts

EnvMapHelper manages everything related to HDR environments: loading .hdr files via RGBELoader, generating PMREMs with WebGPU-compatible PMREMGenerator, synchronizing tone-mapping/exposure with an existing renderer, and updating scenes using WebGPU’s updateEnvironment hook when available. Every step validates renderer backend/device availability because build outputs may asynchronously hydrate Three.js internals.
RendererInitHelper is the orchestrator that clones renderer state while ensuring GPU resources are not shared. It exposes:
RendererInitOptions for passing a source renderer, original HDR texture, desired size, and fallback HDR URLs.
initializeRenderer which (1) copies config from a source renderer or applies defaults, (2) enforces size/pixel ratio, (3) decides whether to reuse an existing environment or load the fallback HDR, (4) creates a fresh PMREM texture per renderer, and (5) updates renderer + scene references.
isRendererInitialized for guard clauses before performing expensive operations (e.g., screenshot capture).
Internally the helper splits responsibilities: configuration copying (setupClearColor, tone mapping, color space, lighting flags), environment sourcing (setupEnvironmentFromSource, setupDefaultEnvironment), and renderer updates. This decomposition keeps the logic composable: callers can skip certain steps if they only need environment cloning.

Renderer Resource Rule:

Configuration (tone mapping, clear color) → shared via copy
GPU resources (PMREM textures, buffers)  → re-created per renderer
Effects/passes                            → recreated by caller

Key Data Flows

Environment Bootstrap

sourceRenderer? ──► setupRendererConfig ──► renderer
             │
             └─ has scene.environment?
                     │ yes                          no
                     ▼                              ▼
          setupEnvironmentFromSource        setupDefaultEnvironment
                     │                              │
        originalTexture? ──► createEnvironmentMap ◄─ fallback HDR
                     │
                     ▼
           scene.environment / background updated
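The branch in the diagram above reduces to a small pure decision, sketched here with hypothetical types (EnvInputs, EnvSource, chooseEnvironmentSource). One assumption is not stated in the diagram: when the source scene has an environment but no original HDR texture was supplied, the sketch falls back to the first fallback HDR URL.

```typescript
// Hypothetical description of where the environment map should come from.
type EnvSource =
  | { kind: "original-texture" }           // re-run PMREM on the supplied HDR texture
  | { kind: "fallback-hdr"; url: string }  // load an HDR file, then run PMREM
  | { kind: "none" };                      // nothing available

interface EnvInputs {
  sourceHasEnvironment: boolean; // does the source scene define scene.environment?
  hasOriginalTexture: boolean;   // was the original HDR texture passed in?
  fallbackHdrUrls: string[];     // candidate fallback HDR files
}

function chooseEnvironmentSource(inputs: EnvInputs): EnvSource {
  if (inputs.sourceHasEnvironment && inputs.hasOriginalTexture) {
    return { kind: "original-texture" };
  }
  const url = inputs.fallbackHdrUrls[0];
  return url !== undefined ? { kind: "fallback-hdr", url } : { kind: "none" };
}
```

In either successful case the PMREM texture itself would still be generated per renderer, in line with the resource rule that GPU resources are re-created rather than shared.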

ONNX Count Debug Pipeline

ONNX Output Buffer ──► readONNXCountBuffer (u32)
                                  │
                                  ▼
ModelParams Buffer ──► readModelParamsNumPoints (u32)
                                  │
                                  ▼
Shader Storage ──► compareBufferValues / readShaderDebugBuffer

This pipeline gives immediate visibility into where a mismatch first occurs, allowing engineers to distinguish ONNX inference failures from GPU copy timing issues.
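Once the three counts have been read back, locating the first divergence is a simple comparison. The sketch below assumes the values are already available as plain numbers; CountReadings, MismatchStage, and firstCountMismatch are hypothetical names, not part of debug-gpu-buffers.ts.

```typescript
// Values read back from each stage of the count pipeline.
interface CountReadings {
  onnxCount: number;            // from the ONNX output buffer
  modelParamsNumPoints: number; // from the ModelParams buffer
  shaderCount: number;          // from shader storage / debug buffer
}

type MismatchStage = "none" | "onnx->modelParams" | "modelParams->shader";

// Reports the first hand-off at which the count changes.
function firstCountMismatch(r: CountReadings): MismatchStage {
  if (r.onnxCount !== r.modelParamsNumPoints) return "onnx->modelParams";
  if (r.modelParamsNumPoints !== r.shaderCount) return "modelParams->shader";
  return "none";
}
```

A mismatch at the first hand-off points toward the inference output or its copy, while a mismatch only at the second suggests the shader is reading stale or differently laid out parameters.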

Memory & Reliability Strategies

View Reuse – half.ts maintains one Float32Array/Uint32Array pair for all conversions; makeCopySH_PackedF16 packs into a temporary Uint16Array(48) per point to keep code straightforward yet predictable.
Explicit Cleanup – every staging buffer (readWholeBuffer, debug helpers) is explicitly unmapped/destroyed; PMREM generators are disposed as soon as an environment texture is produced.
Input Validation – renderer helpers bail early if the backend or device is missing, and GPU debug helpers refuse to start when buffers are undefined.
Deterministic Layouts – SH packing pads to a constant stride, and float16 packing exposes the wordsPerPoint metadata so downstream GPU structs never guess.

Integration Points

Managers & Controllers: call renderer helpers to clone views for thumbnails, re-use math utilities for camera constraints, and rely on GPUStopwatch for per-stage timing overlays.
IO Module: owns file parsing but delegates every float16/SH conversion to Utils to ensure ONNX, KSplat, and PLY feeds produce identical GPU payloads.
Renderer: consumes Aabb, covariance math, and transform constants inside the Three.js/WebGPU pipeline.
Diagnostics & QA: run debugCountPipeline and shader debug buffers during field testing without building custom scripts.

By isolating these responsibilities while keeping the APIs tiny, the Utils module stays stable even as Visionary evolves: new formats or renderer modes can reuse the same primitives without duplicating brittle math or GPU code.