Skip to content

Refactor case BC hooks: rename to define_BC/apply_BC and hoist BC fields to base#307

Open
kaanolgu wants to merge 9 commits into
mainfrom
boundary_precorrect_fix
Open

Refactor case BC hooks: rename to define_BC/apply_BC and hoist BC fields to base#307
kaanolgu wants to merge 9 commits into
mainfrom
boundary_precorrect_fix

Conversation

@kaanolgu
Copy link
Copy Markdown
Collaborator

  • Rename boundary_conditions -> define_BC and pre_correction -> apply_BC across base and child cases.
  • Hoist 12 BC field pointers (bc_{start,end}_{u,v,w}_{x,y}) onto base_case_t; cases use only what they need.
  • Add field_set_face_from_field backend hook (CUDA + OMP) for spatially-varying face BCs.
  • Channel: y-wall BCs now use persistent device fields refreshed with noise per substep, applied via the new hook.
  • Cylinder: inflow profile now built field-side using new inlet_noise(3) config (replaces bc_start_u/v/w scalars).
  • Cylinder: compute_outflow_params moved from apply_BC into define_BC; results stored on self%out_vel / self%flow_rate_diff (drops _cached suffix).
  • Behavioural note: outflow params sampled one substep earlier than before.
  • Breaking change: existing cylinder_nml files must replace bc_start_u/v/w with inlet_noise.

@kaanolgu kaanolgu requested a review from ia267 May 14, 2026 09:30
@kaanolgu kaanolgu self-assigned this May 14, 2026
@kaanolgu kaanolgu added the enhancement New feature or request label May 14, 2026
Comment thread src/case/cylinder.f90
! to fix it here too accordingly
call self%compute_outflow_params(self%out_vel, self%flow_rate_diff)

! Allocate persistent device BC fields on first call.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hu%data, hv%data, hw%data are used before it's actually populated. We should do something similar to what's done in initial_conditions_cylinder:

call random_number(hu%data(1, 1:dims(2), 1:dims(3)))
call random_number(hv%data(1, 1:dims(2), 1:dims(3)))
call random_number(hw%data(1, 1:dims(2), 1:dims(3)))

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We populate the data at around line 197, It would be a duplicate work which eventually hinders the performance. The initial_conditions_cylinder is only run once but if we have 100k iterations then we can't afford to fill in values in each run twice. That's why I avoided that

Line 197- 203:

    do k = 1, dims(3)
      do j = 1, dims(2)
        hu%data(1, j, k) = 1._dp + noise(1)*um*(2._dp*hu%data(1, j, k) - 1._dp)
        hv%data(1, j, k) = noise(2)*um*(2._dp*hv%data(1, j, k) - 1._dp)
        hw%data(1, j, k) = noise(3)*um*(2._dp*hw%data(1, j, k) - 1._dp)
      end do
    end do

!$omp private(k, val)
do k_j = 1, (n_j - 1)/SZ + 1
do k_i = 1, n_i
k = k_j + (k_i - 1)*((n_j - 1)/SZ + 1)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the block-index formula is wrong here. From field_set_face_omp (see k_end = k + (dims(2)-1)/SZ*dims(3)) (line 852), the convention for DIR_X in OMP isk = z + (y_block -1)*nz`, i.e. z varies fast in the third dimension. Here we have the opposite order.

The correct formula should be k = k_i + (k_j - 1)*n_i

if (i + (b_j - 1)*blockDim%x <= n_j) then
val = f(i, i_slice, b)
ierr = atomicadd(sum_f, val)
ierr = atomicmax(max_f, val)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the atomicmax is not fully implemented for double (works for integers), we need to check this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out! I will check it and update if required

call resolve_field_t(f_d, f)
call resolve_field_t(f_start_d, f_start)

! NOTE: NULL_LOC == VERT == 0 in this codebase, so guarding on
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not correct, NULL_LOC = -0001 and VERT = 0 (see common.f90). I think the code still works because VERT is used unconditionally, but we need to fix the comment


end subroutine field_set_y_face
attributes(global) subroutine field_set_y_face_from_field( &
f, f_start, flow_rate_diff, nx, ny, nz)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uses flow_rate_diff as an argument but it's never used in the kernel

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out! I will flow_rate_diff into the subroutine

end subroutine field_max_sum

attributes(global) subroutine field_set_y_face( &
f, c_start, c_end, flow_rate_diff, nx, ny, nz)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here flow_rate_diff is not used in the subroutine

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out! I will flow_rate_diff into the subroutine

Comment thread src/case/cylinder.f90
use m_base_backend, only: base_backend_t
use m_base_case, only: base_case_t
use m_common, only: dp, MPI_X3D2_DP, get_argument, DIR_C, DIR_X, &
VERT, CELL, X_FACE, BC_DIRICHLET
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove CELL as it's not used

Comment thread src/config.f90
real(dp) :: bc_start_u = 1._dp
real(dp) :: bc_start_v = 0._dp
real(dp) :: bc_start_w = 0._dp
real(dp) :: inlet_noise(3)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to initialise inlet_noise in case an input file omits them

real(dp) :: inlet_noise(3) = [0._dp, 0._dp, 0._dp]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then that means we also need to initialise init_noise too right ? Because in the original version that wasn't initialised too so I kept the same format

k_end = k + (dims(2) - 1)/SZ*dims(3)
do j = 1, dims(1)
f%data(1, j, k) = c_start
! TODO: fix these from OpenMP looking at CUDA implementation
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do an error stop here like we do for Y_FACE?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants