Both implementations calculate radiosity coupling between surface elements:
- CPU version uses object-oriented C++ with explicit loops
- OpenCL version parallelizes calculations using GPU kernels

CPU version:

    for each surface element i:
        for each surface element j:
            calculate geometric coupling
            if coupling > threshold:
                calculate occlusion
                update coupling matrix


OpenCL version:

    kernel makeRadiosityCouplings:
        for each work item (parallel surface element i):
            for each surface element j:
                calculate geometric coupling
                if coupling > threshold:
                    calculate occlusion
                    update coupling matrix

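
The loop structure above can be sketched in C++ as follows. This is a minimal illustration, not the code from either implementation: the `Element` struct, the `geometricCoupling` formula (a point-to-point form factor) and the `makeCouplings` name are assumptions made for the example, and the occlusion test is passed in as a callable that returns 1.0 for a blocked path and 0.0 otherwise.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical surface element: centre, unit normal and area.
struct Element { Vec3 pos; Vec3 normal; double area; };

// Assumed geometric coupling (not taken from the source):
// cos(theta_a) * cos(theta_b) * area_b / (pi * r^2), zero if the elements face away.
double geometricCoupling(const Element& a, const Element& b) {
    const double pi = 3.14159265358979323846;
    Vec3 d = sub(b.pos, a.pos);
    double r2 = dot(d, d);
    if (r2 == 0.0) return 0.0;
    double r = std::sqrt(r2);
    double cosA = dot(a.normal, d) / r;
    double cosB = -dot(b.normal, d) / r;
    if (cosA <= 0.0 || cosB <= 0.0) return 0.0;
    return cosA * cosB * b.area / (pi * r2);
}

// Fills a row-major n x n coupling matrix. The occlusion callable is only
// invoked when the geometric coupling exceeds the threshold.
template <class Occlusion>
std::vector<double> makeCouplings(const std::vector<Element>& els,
                                  double threshold, Occlusion occlusion) {
    const std::size_t n = els.size();
    std::vector<double> M(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {        // one work item per i in the kernel
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            double c = geometricCoupling(els[i], els[j]);
            if (c > threshold) {
                double occ = occlusion(els[i].pos, els[j].pos);  // 1.0 = blocked
                M[i * n + j] = c * (1.0 - occ);
            }
        }
    }
    return M;
}
```

In the OpenCL version the outer loop disappears: each work item handles one value of `i`, so only the inner loop over `j` remains inside the kernel.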
- OpenCL uses float4 vectors for better GPU utilization
- CPU version has more detailed geometric calculations
- OpenCL version handles memory differently (global buffers)
- CPU version includes more debug visualization options

CPU version:

- Exact ray-triangle intersection in getOcclusion() method (lines 142-160)
- Uses the rayInTriangle() method to check intersection
- Processes obstacles sequentially in a for loop
- Binary result (1 for intersection, 0 for no intersection)
- No distance calculation or fuzziness
- Example:

      for(int i=0; i<triangleObstacles.size(); i++) {
          if(triangleObstacles[i].rayIn(ray0,hX,hY)) {
              return 1.0;
          }
      }

- No explicit spatial acceleration structures
- Processes all obstacles sequentially
- Relies on CPU cache for performance
- Simple for loop through all obstacles

      for(int i=0; i<triangleObstacles.size(); i++)

OpenCL version:

- Work group processing (lines 898-961)
- Chunked obstacle processing

      for(int i0=0; i0<ns.y; i0+= nL)

- Local memory caching for better memory access patterns

      LOC[iL*3] = obstacles[i3];

- Two-stage processing:
  - Groups of points bounded within spheres
  - Groups of triangles bounded within larger triangles
- Uses barrier synchronization for parallel processing
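
The chunked pattern described above can be emulated on the CPU to show its shape. The sketch below is plain C++, not OpenCL: the `Tri` record, the `occludedChunked` name and the `hit` callable are hypothetical, a `std::vector` stands in for local memory, and the comments mark where a real kernel would call `barrier(CLK_LOCAL_MEM_FENCE)`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical obstacle record: nine floats for three triangle vertices.
struct Tri { float v[9]; };

// Returns 1.0f if `hit` is true for any obstacle, 0.0f otherwise.
// Obstacles are visited in chunks of `chunkSize`, mirroring a work group
// that first caches a chunk and then tests against the cached copy.
template <class Hit>
float occludedChunked(const std::vector<Tri>& obstacles,
                      std::size_t chunkSize, Hit hit) {
    if (chunkSize == 0) chunkSize = 1;           // avoid an endless loop
    std::vector<Tri> local(chunkSize);           // stands in for local memory
    for (std::size_t i0 = 0; i0 < obstacles.size(); i0 += chunkSize) {
        std::size_t n = std::min(chunkSize, obstacles.size() - i0);

        // Stage 1: load the chunk. In a kernel each work item copies one
        // slot, followed by a barrier so the whole chunk is visible.
        std::copy_n(obstacles.begin() + i0, n, local.begin());

        // Stage 2: test against the cached chunk. A kernel would place a
        // second barrier here before the next chunk overwrites the cache.
        for (std::size_t k = 0; k < n; ++k) {
            if (hit(local[k])) return 1.0f;
        }
    }
    return 0.0f;
}
```

The early return is safe in this sequential emulation only. In a kernel, every work item in the group must reach each barrier, so a work item that has already found a hit usually records the result and keeps participating in the loads.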

CPU version:

- Uses exact ray-triangle intersection
- Checks if ray intersects triangle using the rayInTriangle() method
- Returns binary result (1 for intersection, 0 for no intersection)
- Processes obstacles sequentially
- No distance calculation or fuzziness
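
To make the binary occlusion test concrete, here is a sketch that uses the Möller-Trumbore segment-triangle test. The source does not show how rayInTriangle() is implemented, so the algorithm choice, the `Triangle` struct and the names `segmentHitsTriangle` and `getOcclusionSketch` are all assumptions for illustration.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Hypothetical obstacle triangle with vertices a, b, c.
struct Triangle { Vec3 a, b, c; };

// Moeller-Trumbore test restricted to the open segment p0 -> p1.
bool segmentHitsTriangle(Vec3 p0, Vec3 p1, const Triangle& t) {
    const double eps = 1e-12;
    Vec3 dir = sub(p1, p0);
    Vec3 e1 = sub(t.b, t.a);
    Vec3 e2 = sub(t.c, t.a);
    Vec3 p = cross(dir, e2);
    double det = dot(e1, p);
    if (std::fabs(det) < eps) return false;      // segment parallel to triangle
    double inv = 1.0 / det;
    Vec3 s = sub(p0, t.a);
    double u = dot(s, p) * inv;                  // first barycentric coordinate
    if (u < 0.0 || u > 1.0) return false;
    Vec3 q = cross(s, e1);
    double v = dot(dir, q) * inv;                // second barycentric coordinate
    if (v < 0.0 || u + v > 1.0) return false;
    double tt = dot(e2, q) * inv;                // position along the segment
    return tt > 0.0 && tt < 1.0;
}

// Binary occlusion: 1.0 on the first hit, 0.0 if no obstacle blocks the segment.
double getOcclusionSketch(Vec3 p0, Vec3 p1, const std::vector<Triangle>& obstacles) {
    for (const Triangle& t : obstacles) {
        if (segmentHitsTriangle(p0, p1, t)) return 1.0;
    }
    return 0.0;
}
```

Because the result is binary, the loop can stop at the first hit, and no hit distance needs to be kept.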

OpenCL version:

- Uses local memory caching for obstacle triangles
- Implements parallel processing using work groups
- Includes surface index checking to skip self-intersections
- Processes obstacles in chunks for better memory access patterns
- Uses barrier synchronization between processing stages

CPU version:

- No explicit spatial acceleration structures
- Processes all obstacles sequentially
- Relies on CPU cache for performance
- Doesn't group obstacles into larger blocks

OpenCL version:

- Uses work groups to process chunks of obstacles
- Caches obstacle triangles in local memory
- Implements two-stage processing:
  - Groups of points bounded within spheres
  - Groups of triangles bounded within larger triangles
- Uses barrier synchronization for parallel processing
- Skips self-intersections using surface indices
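
The source does not show how the bounding spheres and bounding triangles are tested, so the following is only a generic illustration of a two-stage scheme with a self-intersection skip. A cheap, conservative segment-versus-sphere test selects the groups that need the exact test at all, and groups that belong to the ray's own surface are skipped by index. The `Group` struct and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical group record: bounding sphere plus the index of the surface
// that owns the group's members.
struct Group { Vec3 center; double radius; int surface; };

// Conservative test: true if the segment p0 -> p1 comes within `radius` of `c`.
bool segmentNearSphere(Vec3 p0, Vec3 p1, Vec3 c, double radius) {
    Vec3 d = sub(p1, p0);
    double len2 = dot(d, d);
    double t = len2 > 0.0 ? dot(sub(c, p0), d) / len2 : 0.0;
    t = std::max(0.0, std::min(1.0, t));         // clamp to the segment
    Vec3 closest = {p0.x + t * d.x, p0.y + t * d.y, p0.z + t * d.z};
    Vec3 diff = sub(c, closest);
    return dot(diff, diff) <= radius * radius;
}

// Stage 1: indices of groups that still need the exact per-triangle test.
std::vector<std::size_t> candidateGroups(Vec3 p0, Vec3 p1, int sourceSurface,
                                         const std::vector<Group>& groups) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (groups[i].surface == sourceSurface) continue;   // skip self-intersection
        if (segmentNearSphere(p0, p1, groups[i].center, groups[i].radius)) {
            out.push_back(i);
        }
    }
    return out;
}
```

Stage 2 would then run the exact ray-triangle test only on the members of the returned groups, which is where the saving over a flat loop comes from.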