crispigt.

Evaluation of performance, perception, and the final call

2026-05-15
C#, C++, Unity, Buoyancy, Performance, Evaluation

The accuracy post proved the math works. This post asks two harder questions: how fast is it, and does it actually look different?

Stress test setup

To get a realistic "can a game ship this?" number, I skipped micro-benchmarks and went straight to the brutal approach: 100 Unity spheres, 712 triangles each, dropped into rolling Gerstner waves. Frame rate was measured with a simple FPSLogger script that averages the frame count over 1-second windows and prints the result to the console.
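The windowed averaging described above can be sketched in a few lines. This is not the actual FPSLogger script (which is a Unity C# component); it is a minimal C++ illustration of the same idea, and the class name FpsWindow and its Tick method are hypothetical.

```cpp
// Hypothetical stand-in for a windowed FPS logger: counts frames and
// reports one average per window (1 second in the post).
class FpsWindow {
public:
    explicit FpsWindow(double windowSeconds = 1.0) : window_(windowSeconds) {}

    // Call once per rendered frame with that frame's duration in seconds.
    // Returns true and writes the average FPS when a window has just closed.
    bool Tick(double deltaSeconds, double& fpsOut) {
        elapsed_ += deltaSeconds;
        ++frames_;
        if (elapsed_ < window_) {
            return false;
        }
        // Divide by the time that actually elapsed, not the nominal window,
        // so one long frame does not inflate the reported rate.
        fpsOut = frames_ / elapsed_;
        elapsed_ = 0.0;
        frames_ = 0;
        return true;
    }

private:
    double window_;
    double elapsed_ = 0.0;
    int frames_ = 0;
};
```

One consequence of this kind of measurement: when a single frame takes longer than the window, the logger reports once per frame, so readings in the low single digits are coarse.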

Three configurations, same scene, same waves:

  1. Linear path: all 100 bodies use the C++ DLL's built-in triangle clipper, with no adaptive refinement.
  2. Adaptive N=1: one refinement sample per intersecting chord. The lightest possible adaptive cost.
  3. Adaptive N=2: two refinement samples per chord.

Stress test results

The numbers

The Linear path cruises at 363 FPS average, rock-steady across the entire 10-second run. That's about 6× the 60 FPS real-time target with 100 simultaneous physics bodies. The C++ DLL and Burst-compiled wave sampler are doing exactly what they were designed to do.

[Figure: Spheres]

The Adaptive paths tell a very different story. Both start reasonably (N=1 opens at 206 FPS, N=2 at 44 FPS) but then collapse over time. By 7 seconds, N=2 is at 2 FPS. By 35 seconds, N=1 is at 3 FPS.

Why does it get worse over time?

When the spheres are falling through the air, zero triangles intersect the water, so the adaptive clipper has nothing to do. As more spheres land and reach equilibrium, more triangles straddle the waterline on every physics update. At equilibrium, a floating sphere has a belt of triangles around its waterline that is permanently half-submerged, which maximises the number of SampleHeight calls per frame.
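To make "straddling the waterline" concrete, here is a minimal sketch of how triangles could be sorted into dry, submerged, and straddling. It assumes each vertex already has a signed depth (water height at the vertex minus the vertex height, positive when under water); the names TriState, Classify, and CountStraddling are hypothetical and not taken from the project.

```cpp
#include <array>
#include <vector>

enum class TriState { Dry, Submerged, Straddling };

// depth[i] > 0 means vertex i is below the water surface (assumed convention).
TriState Classify(const std::array<double, 3>& depth) {
    int under = 0;
    for (double d : depth) {
        if (d > 0.0) {
            ++under;
        }
    }
    if (under == 0) return TriState::Dry;        // falling through the air
    if (under == 3) return TriState::Submerged;  // fully under water
    return TriState::Straddling;                 // crosses the waterline
}

// Only straddling triangles would need the adaptive refinement work.
int CountStraddling(const std::vector<std::array<double, 3>>& triangles) {
    int count = 0;
    for (const auto& tri : triangles) {
        if (Classify(tri) == TriState::Straddling) {
            ++count;
        }
    }
    return count;
}
```

While a sphere is airborne every triangle classifies as Dry and the count is zero; once it floats, the count settles at the size of the waterline belt and stays there.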

Each SampleHeight is a 2-iteration Newton solve evaluating three Gerstner waves with 6 trig calls per iteration. With N=2 adaptive samples, each intersecting triangle fires 2 extra SampleHeight calls. Across 100 spheres with 712 triangles each (71,200 triangles in total), with a significant fraction intersecting the surface, that's tens of thousands of Newton solves per FixedUpdate.
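For a sense of where the trig cost comes from, here is a rough sketch of what a 2-iteration Newton height lookup over three Gerstner waves can look like. It is an illustration under stated assumptions, not the project's SampleHeight: the Wave fields, the Displace helper, and the standard Gerstner form (horizontal offset Q·A·D·cos θ, height A·sin θ) are all assumed here. Because a Gerstner wave shifts points sideways, the solve has to find the rest position that ends up above the queried (x, z). Each iteration evaluates one sine and one cosine per wave, which gives the 6 trig calls per iteration for three waves.

```cpp
#include <array>
#include <cmath>

// Hypothetical wave description; field names are illustrative only.
struct Wave {
    double amplitude;   // A
    double wavenumber;  // k
    double steepness;   // Q, scales the horizontal displacement
    double dirX, dirZ;  // unit travel direction D
    double omega;       // angular frequency
};
using Waves = std::array<Wave, 3>;

struct Point3 { double x, y, z; };

// Forward Gerstner map: where rest position (px, pz) ends up at time t.
Point3 Displace(const Waves& waves, double px, double pz, double t) {
    Point3 out{px, 0.0, pz};
    for (const Wave& w : waves) {
        const double theta = w.wavenumber * (w.dirX * px + w.dirZ * pz) + w.omega * t;
        const double qa = w.steepness * w.amplitude;
        out.x += qa * w.dirX * std::cos(theta);
        out.z += qa * w.dirZ * std::cos(theta);
        out.y += w.amplitude * std::sin(theta);
    }
    return out;
}

// Surface height above world position (x, z): Newton iterations on the
// horizontal part of the map, then one height evaluation.
// Assumes gentle waves (sum of Q*A*k below 1) so the Jacobian stays invertible.
double SampleHeight(const Waves& waves, double x, double z, double t, int iterations = 2) {
    double px = x, pz = z;  // initial guess: no horizontal displacement
    for (int it = 0; it < iterations; ++it) {
        double rx = px - x, rz = pz - z;          // residual of the horizontal map
        double jxx = 1.0, jxz = 0.0, jzz = 1.0;   // symmetric 2x2 Jacobian
        for (const Wave& w : waves) {
            const double theta = w.wavenumber * (w.dirX * px + w.dirZ * pz) + w.omega * t;
            const double c = std::cos(theta), s = std::sin(theta);  // 2 trig calls per wave
            const double qa = w.steepness * w.amplitude;
            rx += qa * w.dirX * c;
            rz += qa * w.dirZ * c;
            const double g = qa * w.wavenumber * s;
            jxx -= g * w.dirX * w.dirX;
            jxz -= g * w.dirX * w.dirZ;
            jzz -= g * w.dirZ * w.dirZ;
        }
        const double det = jxx * jzz - jxz * jxz;
        px -= (jzz * rx - jxz * rz) / det;   // p -= J^-1 * r
        pz -= (-jxz * rx + jxx * rz) / det;
    }
    double height = 0.0;
    for (const Wave& w : waves) {
        height += w.amplitude * std::sin(w.wavenumber * (w.dirX * px + w.dirZ * pz) + w.omega * t);
    }
    return height;
}
```

In this sketch one lookup costs 2 × 6 trig calls for the solve plus 3 more for the final height, so every extra sample per straddling triangle multiplies directly into the per-step trig budget.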

Worse, the AdaptiveClipper builds its sub-triangle lists using C# List<T>: managed heap allocations that the garbage collector has to clean up every frame. The combination of trig overhead and GC pressure creates a vicious feedback loop: slower frames lead to more queued physics steps, which lead to even more work per frame.
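The feedback loop can be reproduced with a toy model of a fixed-timestep scheduler. This is a generic accumulator loop, not Unity's actual implementation, and the function names and cost parameters are hypothetical. Each frame runs as many fixed steps as the previous frame's duration requires, and the frame then lasts as long as rendering plus those steps took.

```cpp
#include <vector>

// Number of fixed steps owed after a frame that lasted frameSeconds.
int StepsForFrame(double& accumulator, double frameSeconds, double fixedDt) {
    accumulator += frameSeconds;
    int steps = 0;
    while (accumulator >= fixedDt) {
        accumulator -= fixedDt;
        ++steps;
    }
    return steps;
}

// Toy simulation: returns the number of physics steps run in each frame.
// stepCost is the wall-clock time of one physics step, renderCost the rest
// of the frame. Both are assumed constant here, which real scenes are not.
std::vector<int> SimulateSteps(double stepCost, double renderCost, double fixedDt, int frames) {
    std::vector<int> stepsPerFrame;
    double accumulator = 0.0;
    double lastFrame = renderCost;  // assume the first frame had no physics work
    for (int i = 0; i < frames; ++i) {
        const int steps = StepsForFrame(accumulator, lastFrame, fixedDt);
        lastFrame = renderCost + steps * stepCost;  // this frame's duration
        stepsPerFrame.push_back(steps);
    }
    return stepsPerFrame;
}
```

As long as one physics step costs less wall-clock time than the fixed timestep it simulates, the step count per frame stays at 0 or 1. Once a step costs more than that, each frame owes more steps than the one before and the count grows without bound in this uncapped model. Unity bounds the catch-up with its Maximum Allowed Timestep setting (Time.maximumDeltaTime), so in practice the frame rate sinks to a floor rather than diverging.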