Refactor case BC hooks: rename to define_BC/apply_BC and hoist BC fields to base#307
Refactor case BC hooks: rename to define_BC/apply_BC and hoist BC fields to base#307kaanolgu wants to merge 9 commits into
Conversation
| ! to fix it here too accordingly | ||
| call self%compute_outflow_params(self%out_vel, self%flow_rate_diff) | ||
|
|
||
| ! Allocate persistent device BC fields on first call. |
There was a problem hiding this comment.
hu%data, hv%data, hw%data are used before it's actually populated. We should do something similar to what's done in initial_conditions_cylinder:
call random_number(hu%data(1, 1:dims(2), 1:dims(3)))
call random_number(hv%data(1, 1:dims(2), 1:dims(3)))
call random_number(hw%data(1, 1:dims(2), 1:dims(3)))There was a problem hiding this comment.
We populate the data at around line 197, It would be a duplicate work which eventually hinders the performance. The initial_conditions_cylinder is only run once but if we have 100k iterations then we can't afford to fill in values in each run twice. That's why I avoided that
Line 197- 203:
do k = 1, dims(3)
do j = 1, dims(2)
hu%data(1, j, k) = 1._dp + noise(1)*um*(2._dp*hu%data(1, j, k) - 1._dp)
hv%data(1, j, k) = noise(2)*um*(2._dp*hv%data(1, j, k) - 1._dp)
hw%data(1, j, k) = noise(3)*um*(2._dp*hw%data(1, j, k) - 1._dp)
end do
end do
| !$omp private(k, val) | ||
| do k_j = 1, (n_j - 1)/SZ + 1 | ||
| do k_i = 1, n_i | ||
| k = k_j + (k_i - 1)*((n_j - 1)/SZ + 1) |
There was a problem hiding this comment.
I think the block-index formula is wrong here. From field_set_face_omp (see k_end = k + (dims(2)-1)/SZ*dims(3)) (line 852), the convention for DIR_X in OMP isk = z + (y_block -1)*nz`, i.e. z varies fast in the third dimension. Here we have the opposite order.
The correct formula should be k = k_i + (k_j - 1)*n_i
| if (i + (b_j - 1)*blockDim%x <= n_j) then | ||
| val = f(i, i_slice, b) | ||
| ierr = atomicadd(sum_f, val) | ||
| ierr = atomicmax(max_f, val) |
There was a problem hiding this comment.
I think the atomicmax is not fully implemented for double (works for integers), we need to check this.
There was a problem hiding this comment.
Thanks for pointing out! I will check it and update if required
| call resolve_field_t(f_d, f) | ||
| call resolve_field_t(f_start_d, f_start) | ||
|
|
||
| ! NOTE: NULL_LOC == VERT == 0 in this codebase, so guarding on |
There was a problem hiding this comment.
This comment is not correct, NULL_LOC = -0001 and VERT = 0 (see common.f90). I think the code still works because VERT is used unconditionally, but we need to fix the comment
|
|
||
| end subroutine field_set_y_face | ||
| attributes(global) subroutine field_set_y_face_from_field( & | ||
| f, f_start, flow_rate_diff, nx, ny, nz) |
There was a problem hiding this comment.
uses flow_rate_diff as an argument but it's never used in the kernel
There was a problem hiding this comment.
Thanks for pointing out! I will flow_rate_diff into the subroutine
| end subroutine field_max_sum | ||
|
|
||
| attributes(global) subroutine field_set_y_face( & | ||
| f, c_start, c_end, flow_rate_diff, nx, ny, nz) |
There was a problem hiding this comment.
same here flow_rate_diff is not used in the subroutine
There was a problem hiding this comment.
Thanks for pointing out! I will flow_rate_diff into the subroutine
| use m_base_backend, only: base_backend_t | ||
| use m_base_case, only: base_case_t | ||
| use m_common, only: dp, MPI_X3D2_DP, get_argument, DIR_C, DIR_X, & | ||
| VERT, CELL, X_FACE, BC_DIRICHLET |
There was a problem hiding this comment.
we can remove CELL as it's not used
| real(dp) :: bc_start_u = 1._dp | ||
| real(dp) :: bc_start_v = 0._dp | ||
| real(dp) :: bc_start_w = 0._dp | ||
| real(dp) :: inlet_noise(3) |
There was a problem hiding this comment.
We need to initialise inlet_noise in case an input file omits them
real(dp) :: inlet_noise(3) = [0._dp, 0._dp, 0._dp]There was a problem hiding this comment.
Then that means we also need to initialise init_noise too right ? Because in the original version that wasn't initialised too so I kept the same format
| k_end = k + (dims(2) - 1)/SZ*dims(3) | ||
| do j = 1, dims(1) | ||
| f%data(1, j, k) = c_start | ||
| ! TODO: fix these from OpenMP looking at CUDA implementation |
There was a problem hiding this comment.
should we do an error stop here like we do for Y_FACE?
boundary_conditions->define_BCandpre_correction->apply_BCacross base and child cases.bc_{start,end}_{u,v,w}_{x,y}) ontobase_case_t; cases use only what they need.field_set_face_from_fieldbackend hook (CUDA + OMP) for spatially-varying face BCs.inlet_noise(3)config (replacesbc_start_u/v/wscalars).compute_outflow_paramsmoved fromapply_BCintodefine_BC; results stored onself%out_vel/self%flow_rate_diff(drops_cachedsuffix).cylinder_nmlfiles must replacebc_start_u/v/wwithinlet_noise.