What is gaussian splatting?

Quick definition

Gaussian splatting is a 3D representation technique that models scenes as collections of oriented 3D Gaussian functions, each of which describes how a small region of space contributes color and opacity when seen from a given viewpoint. To help visualize this, imagine Gaussian splats as ‘clouds of colored fog’ or ‘dabs of paint in space’ that collectively form complex images. Unlike traditional polygon meshes that define surfaces through connected vertices, Gaussian splatting represents scenes through millions of small, semi-transparent ellipsoids that blend together during rendering to create photorealistic 3D representations with efficient computation and natural support for progressive refinement.

What is gaussian splatting?

Gaussian splatting represents a fundamental shift in how 3D scenes are stored and rendered. For decades, 3D graphics have relied primarily on polygon meshes—surfaces defined by vertices, edges, and faces that approximate object geometry. While mesh-based approaches work well for hand-modeled content and enable precise geometric control, they struggle with the photorealistic reproduction of real-world scenes captured through photogrammetry or other scanning techniques. Converting complex real-world materials, lighting effects, and fine surface details into mesh geometry with textures requires significant manual optimization and often sacrifices fidelity.

Gaussian splatting takes a different approach, rooted in volumetric representations rather than surface geometry. Instead of defining explicit surfaces, Gaussian splat representations describe scenes through large collections of 3D Gaussian functions—each one a mathematical formula defining how a small region of space contributes color and opacity from any viewing angle. Each Gaussian is a semi-transparent, oriented ellipsoid floating in 3D space. Individually, these ellipsoids are simple. But arranged by the millions and rendered with appropriate blending, they can reconstruct photorealistic scenes with remarkable efficiency.
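As a point of reference, the "mathematical formula" behind a single splat is commonly written as an unnormalized multivariate Gaussian. The symbols below are introduced here for illustration and do not come from this article: \boldsymbol{\mu} is the splat's center, \Sigma its 3x3 covariance, R a rotation matrix, and S a diagonal matrix of per-axis scales.

```latex
G(\mathbf{x}) = \exp\!\left(-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{\top}\,\Sigma^{-1}\,(\mathbf{x}-\boldsymbol{\mu})\right),
\qquad
\Sigma = R\,S\,S^{\top}R^{\top}
```

Factoring the covariance into a rotation and a scale is one common way to keep \Sigma a valid (positive semi-definite) matrix, and it is what gives each ellipsoid its orientation and size.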

How gaussian splatting works

Gaussian splatting systems transform captured scenes into renderable Gaussian representations through several stages. Scene capture begins with photographs or video frames from different viewpoints. Structure-from-motion algorithms determine camera positions and estimate rough 3D geometry. The reconstruction process then optimizes a Gaussian splat representation to match the input photographs—initially placing Gaussians throughout the scene volume, then iteratively adjusting their parameters (position, size, orientation, color, opacity) to minimize differences between rendered views and actual photographs.
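The "render, compare, adjust" loop can be illustrated with a deliberately tiny toy. The sketch below fits one 1D Gaussian to a 1D target signal, using finite-difference gradients and plain gradient descent. Real systems optimize millions of 3D Gaussians against photographs on the GPU with analytic gradients; every name here (`render`, `loss`, `fit`) is hypothetical and chosen only for this example.

```python
import math

def render(params, xs):
    """Render a 1D 'image' from one Gaussian: center mu, width sigma, amplitude a."""
    mu, sigma, a = params
    return [a * math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in xs]

def loss(params, xs, target):
    """Mean squared difference between the rendered signal and the target."""
    pred = render(params, xs)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(xs)

def fit(params, xs, target, steps=500, lr=0.05, eps=1e-5):
    """Gradient descent with central finite differences (toy stand-in for autodiff)."""
    params = list(params)
    for _ in range(steps):
        grads = []
        for i in range(len(params)):
            hi = params[:]
            lo = params[:]
            hi[i] += eps
            lo[i] -= eps
            grads.append((loss(hi, xs, target) - loss(lo, xs, target)) / (2 * eps))
        # Step each parameter against its gradient to reduce the loss.
        params = [p - lr * g for p, g in zip(params, grads)]
    return params
```

For example, generating a target with `render((0.3, 0.5, 0.8), xs)` and calling `fit((0.0, 1.0, 0.5), xs, target)` moves the parameters toward the target and lowers the loss. The toy leaves `sigma` unconstrained, so it is only suitable for well-behaved starting points like this one.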

Each Gaussian is defined by: position in 3D space, a covariance matrix describing orientation and scale, spherical harmonic coefficients encoding view-dependent color, and opacity. Rendering projects these 3D Gaussians onto the 2D image plane, sorts them by depth, and blends them back to front. Modern implementations use tile-based approaches that partition the screen into small regions, enabling parallel GPU processing of millions of Gaussians at real-time frame rates.
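The blending step can be sketched for a single pixel. This minimal example assumes each splat has already been projected and evaluated at the pixel, so it arrives as a hypothetical `(depth, rgb, alpha)` tuple. It composites front to back, which produces the same color as back-to-front "over" blending but lets the loop stop early once the pixel is effectively opaque.

```python
def composite_pixel(splats, min_transmittance=1e-4):
    """Alpha-composite splats covering one pixel.

    splats: iterable of (depth, (r, g, b), alpha), already evaluated at this pixel.
    Returns the blended color and the remaining transmittance.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through to farther splats
    for _depth, rgb, alpha in sorted(splats, key=lambda s: s[0]):  # nearest first
        weight = alpha * transmittance
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # pixel is effectively opaque; farther splats cannot contribute
    return tuple(color), transmittance
```

A half-transparent red splat in front of an opaque blue one yields an even mix: `composite_pixel([(2.0, (0, 0, 1), 1.0), (1.0, (1, 0, 0), 0.5)])` returns `((0.5, 0.0, 0.5), 0.0)`.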

Adaptive level-of-detail emerges naturally from the representation. Gaussians can be selectively removed based on viewing distance or screen-space contribution without restructuring the dataset, making Gaussian splatting inherently compatible with streaming architectures.
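One way to picture this selective removal is a filter on projected size. The sketch below assumes a simple pinhole camera and treats each splat as a hypothetical `(center, radius)` pair; the `focal_px` and `min_radius_px` parameters are illustrative choices, not values from any particular renderer.

```python
import math

def select_splats(splats, camera_pos, focal_px, min_radius_px=0.5):
    """Keep splats whose approximate on-screen radius is at least min_radius_px.

    splats: list of (center_xyz, world_radius) pairs.
    """
    kept = []
    for center, radius in splats:
        dist = math.dist(center, camera_pos)
        if dist == 0.0:
            kept.append((center, radius))  # camera sits inside the splat; always keep
            continue
        projected_px = focal_px * radius / dist  # pinhole approximation
        if projected_px >= min_radius_px:
            kept.append((center, radius))
    return kept
```

Because the filter only drops entries from a flat list, no connectivity has to be repaired afterwards, which is the property the paragraph above relies on.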

Why gaussian splatting matters

Photorealistic capture becomes substantially more accessible. Converting real-world scenes into traditional 3D representations requires extensive manual optimization. Gaussian splatting automates this conversion while maintaining photographic quality, enabling rapid deployment of captured content without manual asset work. Real-time rendering at this quality also broadens what is feasible, bringing photorealistic experiences to consumer devices, including mobile phones and standalone VR headsets.

File size efficiency is another practical advantage. High-quality Gaussian splat representations often compress to tens of megabytes for scenes that would require hundreds of megabytes as high-resolution meshes with detailed textures. This compression efficiency, combined with progressive refinement support, makes Gaussian splatting particularly suitable for web delivery.
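A rough back-of-the-envelope calculation shows where such sizes come from. The per-splat layout below is an assumption made for illustration (position, per-axis scale, a rotation quaternion, opacity, and RGB spherical harmonic coefficients); actual file formats and compression schemes vary.

```python
def splat_bytes(sh_degree=3, bytes_per_value=4):
    """Bytes per splat under an assumed layout (not a specific file format)."""
    sh_coeffs = 3 * (sh_degree + 1) ** 2  # RGB x (degree + 1)^2 basis functions
    values = 3 + 3 + 4 + 1 + sh_coeffs    # position, scale, quaternion, opacity, SH
    return values * bytes_per_value

def scene_megabytes(num_splats, sh_degree=3, bytes_per_value=4):
    """Uncompressed scene size in megabytes (10^6 bytes)."""
    return num_splats * splat_bytes(sh_degree, bytes_per_value) / 1e6
```

Under these assumptions, one million splats stored as 32-bit floats with degree-3 harmonics take about 236 MB (`scene_megabytes(1_000_000)`), while dropping to degree-0 color and 16-bit values gives 28 MB (`scene_megabytes(1_000_000, sh_degree=0, bytes_per_value=2)`). Reductions of this kind, usually combined with pruning and quantization, are what bring scenes into the tens-of-megabytes range.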

Gaussian splatting vs. polygon meshes vs. neural radiance fields

Polygon meshes define surfaces through vertices connected into triangular faces, providing precise geometric control ideal for hand-modeled content. They struggle with photorealistic reproduction of complex real-world appearance, requiring manual optimization to balance quality with performance.

Neural radiance fields (NeRF) encode scenes in neural network weights that learn to predict color and density at any point in space, achieving remarkable photorealistic quality. But rendering requires evaluating the network at many sample points along each pixel’s ray, typically hundreds of evaluations per pixel in the original formulation, which makes real-time performance hard to reach without specialized acceleration. The representation is also opaque: you can’t easily edit or segment specific scene elements.

Gaussian splatting occupies a productive middle ground: like NeRF, it reconstructs scenes automatically from photographs without manual modeling. Like meshes, it renders efficiently on standard GPU hardware at real-time frame rates. The explicit representation enables editing, segmentation, and progressive refinement that neural approaches make difficult. Gaussian splatting serves applications requiring photorealistic capture with real-time rendering and efficient distribution—retail product visualization, real estate walkthroughs, digital twins, and training simulations.

Gaussian splatting and content streaming

The discrete structure of Gaussian splat representations aligns well with progressive streaming architectures. Spatial partitioning divides scenes into transmittable regions: Gaussians for the currently viewed region are sent first, and adjacent areas load as users navigate. Importance-based prioritization transmits Gaussians covering larger screen areas or user focus regions first, so that quality is reasonable from the first frames.
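Spatial partitioning with nearest-first delivery can be sketched as follows. This is a minimal illustration that assumes a uniform grid and orders cells by distance from the camera; the function names and the grid scheme are hypothetical, and production systems may use octrees or other structures instead.

```python
import math
from collections import defaultdict

def partition(splat_centers, cell_size):
    """Group splat indices by the uniform grid cell containing their center."""
    cells = defaultdict(list)
    for index, (x, y, z) in enumerate(splat_centers):
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        cells[key].append(index)
    return dict(cells)

def transmit_order(cells, camera_pos, cell_size):
    """Return cell keys sorted so the cell nearest the camera is sent first."""
    def distance_to_camera(key):
        cell_center = [(k + 0.5) * cell_size for k in key]
        return math.dist(cell_center, camera_pos)
    return sorted(cells, key=distance_to_camera)
```

Calling `transmit_order(partition(centers, 1.0), camera_pos, 1.0)` gives the sequence in which cells would be sent; re-running it as the camera moves reorders the cells that are still pending.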

Progressive density refinement transmits Gaussians in multiple passes. An initial sparse subset provides basic scene structure; subsequent passes add Gaussians that capture finer detail. Users interact with functional representations within seconds while quality continuously improves. Unlike mesh streaming (where missing triangles create visible holes), missing Gaussians simply reduce accuracy without creating discontinuities—the remaining Gaussians blend together gracefully.
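The multi-pass idea can be sketched as a split of splats into batches of growing size, most important first. The importance score, the `first_pass` fraction, and the `growth` factor are all assumptions for illustration; a real system would derive importance from factors such as opacity and screen coverage.

```python
def progressive_passes(importance, first_pass=0.1, growth=2.0):
    """Split splat indices into transmission passes, most important first.

    importance: one score per splat (higher means send earlier).
    Returns a list of passes; each pass is a list of splat indices.
    """
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    passes = []
    start = 0
    size = max(1, int(len(order) * first_pass))  # small sparse pass for basic structure
    while start < len(order):
        passes.append(order[start:start + size])
        start += size
        size = max(1, int(size * growth))  # later passes carry more fine detail
    return passes
```

A client can render after every pass: each new batch only adds splats to the set already on screen, so the image sharpens without any rebuild step.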

Related terms & concepts

See also: 3D streaming — The progressive delivery architecture that leverages Gaussian splatting’s discrete structure for efficient, adaptive content transmission.

See also: Level of detail (LOD) — The rendering technique Gaussian splatting supports natively through selective splat transmission by viewing distance.

See also: Neural radiance fields (NeRFs) — The related neural capture technique whose rendering speed Gaussian splatting improves upon for real-time use.