2 changes: 1 addition & 1 deletion Checks/PWD005/README.md
@@ -11,7 +11,7 @@ Update the copied array range to match the actual array usage in the code.

### Relevance

Minimising data transfers is one of the main optimization points when offloading
Minimizing data transfers is one of the main optimization points when offloading
computations to the GPU. An opportunity for such optimization occurs whenever
only part of an array is required in a computation. In such cases, only a part
of the array may be transferred to or from the GPU. However, the developer must
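
A minimal sketch of the partial transfer PWD005 describes, assuming OpenMP offloading and a kernel that touches only the first half of a hypothetical array `a`:

```c
// Sketch: only the used range a[0:n/2] is mapped to the device, halving
// the transfer relative to mapping the whole array.
void half_update(double *a, int n) {
  #pragma omp target teams distribute parallel for map(tofrom: a[0:n/2])
  for (int i = 0; i < n / 2; i++) {
    a[i] += 1.0;
  }
}
```
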
14 changes: 7 additions & 7 deletions Checks/PWD006/README.md
@@ -3,7 +3,7 @@
### Issue

The copy of a non-scalar variable to an accelerator device has been requested
but none or only a part of its data will be transferred because it is laid out
but none or only a part of its data will be transferred because it is laid out
non-contiguously in memory.

### Actions
@@ -13,9 +13,9 @@ memory segments are copied to the memory of the accelerator device.

### Relevance

The data of non-scalar variables might be spread across memory, laid out in non-
contiguous regions. One classical example is a dynamically-allocated two-
dimensional array in C/C++, which consists of a contiguous array of pointers
The data of non-scalar variables might be spread across memory, laid out in
non-contiguous regions. One classical example is a dynamically-allocated
two-dimensional array in C/C++, which consists of a contiguous array of pointers
pointing to separate contiguous arrays that contain the actual data. Note that
the elements of each individual array are contiguous in memory but the different
arrays are scattered in the memory. This also holds for dynamically-allocated
@@ -25,7 +25,7 @@ In order to offload such non-scalar variables to an accelerator device using
OpenMP or OpenACC, it is not enough to add it to a data movement clause. This is
known as deep copy and currently is not automatically supported by either OpenMP
or OpenACC. To overcome this limitation, all the non-contiguous memory segments
must be explicitly transferred by the programmer. In OpenMP 4.5, this can be
must be explicitly transferred by the programmer. In OpenMP 4.5, this can be
achieved through the *enter/exit data* execution statements. Alternatively, the
code could be refactored so that it uses variables with contiguous data layouts
(eg. flatten an array of arrays).
@@ -85,12 +85,12 @@ void foo(int **A) {
}
```

The *enter/exit data* statements ressemble how the dynamic bi-dimensional memory
The *enter/exit data* statements resemble how the dynamic bi-dimensional memory
is allocated in the CPU. An array of pointers is allocated first, followed by
the allocation of all the separate arrays that contain the actual data. Each
allocation constitutes a contiguous memory segment and must be transferred
individually using *enter data*. The deallocation takes place in the inverted
order and the same happens with the *exit *data statements.
order and the same happens with the *exit* data statements.
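
A sketch of that inverted order, assuming a square matrix mapped row by row with *enter data* (OpenMP 4.5 syntax; names and shape are hypothetical):

```c
#include <stdlib.h>

// Sketch: the rows leave the device first, then the pointer array, mirroring
// the host-side deallocation that follows.
void free_matrix(int **A, int n) {
  for (int i = 0; i < n; i++) {
    #pragma omp target exit data map(delete: A[i][0:n])
  }
  #pragma omp target exit data map(delete: A[0:n])
  for (int i = 0; i < n; i++) {
    free(A[i]);
  }
  free(A);
}
```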

### Related resources

2 changes: 1 addition & 1 deletion Checks/PWD007/README.md
@@ -12,7 +12,7 @@ Protect the recurrence or execute the code sequentially if that is not possible.
### Relevance

The recurrence computation pattern occurs when the same memory position is read
and written to, at least once, in different iterations of a loop. It englobes
and written to, at least once, in different iterations of a loop. It englobes
both true dependencies (read-after-write) and anti-dependencies (write-after-
read) across loop iterations. Sometimes the term "loop-carried dependencies" is
also used. If a loop with a recurrence computation pattern is parallelized
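
A minimal sketch of the recurrence pattern PWD007 targets (hypothetical arrays):

```c
// Sketch: iteration i reads a[i - 1], which iteration i - 1 wrote, so a
// read-after-write dependency is carried across iterations and the loop
// cannot be parallelized as written.
void prefix_sum(double *a, const double *b, int n) {
  for (int i = 1; i < n; i++) {
    a[i] = a[i - 1] + b[i];
  }
}
```
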
2 changes: 1 addition & 1 deletion Checks/PWD009/README.md
@@ -12,7 +12,7 @@ Change the data scope of the variable from private to shared.

Specifying an invalid scope for a variable may introduce race conditions and
produce incorrect results. For instance, when a variable must be shared among
threads but it is privatized instead.
threads, but it is privatized instead.

### Code example

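
A sketch of the PWD009 scenario, assuming a counter that every thread updates and that is read after the loop:

```c
// Sketch: "count" must be shared, or each thread's updates would be lost in
// a discarded private copy; the atomic protects the concurrent increments.
int count_positives(const double *a, int n) {
  int count = 0;
  #pragma omp parallel for shared(count)
  for (int i = 0; i < n; i++) {
    if (a[i] > 0.0) {
      #pragma omp atomic
      count++;
    }
  }
  return count;
}
```
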
6 changes: 3 additions & 3 deletions Checks/PWR002/README.md
@@ -6,7 +6,7 @@ A scalar variable should be declared in the smallest
[scope](../../Glossary/Variable-scope.md) possible. In computer programming, the term
scope of a variable usually refers to the part of the code where the variable
can be used (e.g. a function, a loop). During the execution of a program, a
variable cannot be accessed from outside of its scope. This effectively limits
variable cannot be accessed from outside its scope. This effectively limits
the visibility of the variable, which prevents its value from being read or
written in other parts of the code.

@@ -40,7 +40,7 @@ incompatible purposes, making code testing significantly easier.

In the following code, the function `example` declares a variable `t` used in
each iteration of the loop to hold a value that is then assigned to the array
`result`. The variable `t` is not used outside of the loop.
`result`. The variable `t` is not used outside the loop.

```c
void example() {
@@ -96,7 +96,7 @@ code within larger programs by grouping sections together. Conveniently,

In the following code, the subroutine `example` declares a variable `t` used in
each iteration of the loop to hold a value that is then assigned to the array
`result`. The variable `t` is not used outside of the loop.
`result`. The variable `t` is not used outside the loop.

```fortran
subroutine example()
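
A sketch of the narrowed scope PWR002 recommends (hypothetical computation):

```c
// Sketch: declaring t inside the loop body confines it to one iteration, so
// no stale value can leak between iterations or escape the loop.
void example(double *result, int n) {
  for (int i = 0; i < n; i++) {
    double t = 0.5 * i;
    result[i] = t * t;
  }
}
```
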
10 changes: 5 additions & 5 deletions Checks/PWR003/README.md
@@ -77,19 +77,19 @@ int example_impure(int a) {
* `const` function:
* Depends only on `a` and `b`. If successive calls are made with the same `a`
and `b` values, the output will not change.
* Returns a value without modifying any data outside of the function.
* Returns a value without modifying any data outside the function.

* `pure` function:
* Depends on `c`, a global variable whose value can be modified between
successive calls to the function by other parts of the program. Even if
successive calls are made with the same `a` value, the output can differ
depending on the state of `c`.
* Returns a value without modifying any data outside of the function.
* Returns a value without modifying any data outside the function.

* "Normal" function:
* Depends on `c`, a global variable. This restricts the function to be
`pure`, at most.
* However, the function also modifies `c`, memory outside of its scope, thus
* However, the function also modifies `c`, memory outside its scope, thus
leading to a "normal" function.

In the case of the `pure` and "normal" functions, it is equivalent that they
@@ -129,12 +129,12 @@ end module example_module
successive calls to the function by other parts of the program. Even if
successive calls are made with the same `a` value, the output can be
different depending on the state of `c`.
* Returns a value without modifying any data outside of the function.
* Returns a value without modifying any data outside the function.

* "Normal" function:
* Depends on `c`, a public variable. This restricts the function to be
`pure`, at most.
* However, the function also modifies `c`, memory outside of its scope, thus
* However, the function also modifies `c`, memory outside its scope, thus
leading to a "normal" function.

>[!WARNING]
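
A compact sketch of the three kinds of function PWR003 contrasts, using GCC/Clang attribute syntax and hypothetical bodies:

```c
int c = 0; // global state referenced below

// const: depends only on its arguments and touches no outside data.
__attribute__((const)) int add(int a, int b) { return a + b; }

// pure: reads the global c but modifies nothing outside itself.
__attribute__((pure)) int add_c(int a) { return a + c; }

// "normal": modifies memory outside its own scope.
int add_and_bump(int a) { return a + c++; }
```
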
2 changes: 1 addition & 1 deletion Checks/PWR005/README.md
@@ -15,7 +15,7 @@ Add `default(none)` to disable default OpenMP scoping.
When the scope for a variable is not specified in an
[OpenMP](../../Glossary/OpenMP.md) `parallel` directive, a default scope is assigned
to it. Even when set explicitly, using a default scope is considered a bad
practice since it can lead to wrong data scopes inadvertently being applied to
practice since it can lead to wrong data scopes inadvertently being applied to
variables. Thus, it is recommended to explicitly set the scope for each
variable.

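
A sketch of the pattern PWR005 recommends, with every variable scoped explicitly:

```c
// Sketch: with default(none), forgetting a, b or n in the clauses becomes a
// compile-time error instead of a silently wrong default scope.
void scale(double *a, const double *b, int n) {
  #pragma omp parallel for default(none) shared(a, b, n)
  for (int i = 0; i < n; i++) {
    a[i] = 2.0 * b[i];
  }
}
```
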
2 changes: 1 addition & 1 deletion Checks/PWR006/README.md
@@ -13,7 +13,7 @@ Set the scope of the read-only variable to shared.

Since a read-only variable is never written to, it can be safely shared without
any risk of race conditions. **Sharing variables is more efficient than
privatizing** them from a memory perspective so it should be favored whenever
privatizing** them from a memory perspective, so it should be favored whenever
possible.

### Code example
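
A sketch of PWR006's advice, assuming a read-only `factor` used by every iteration:

```c
// Sketch: factor is only read inside the loop, so sharing it is race-free
// and avoids one private copy per thread.
void apply(double *a, int n, double factor) {
  #pragma omp parallel for default(none) shared(a, n, factor)
  for (int i = 0; i < n; i++) {
    a[i] *= factor;
  }
}
```
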
2 changes: 1 addition & 1 deletion Checks/PWR009/README.md
@@ -20,7 +20,7 @@ specific setup in order to better exploit its capabilities.
The OpenMP `parallel` construct specifies a parallel region of the code that
will be executed by a team of threads. It is normally accompanied by a
worksharing construct so that each thread of the team takes care of part of the
work (e.g the `for` construct assigns a subset of the loop iterations to each
work (e.g., the `for` construct assigns a subset of the loop iterations to each
thread). This attains a single level of parallelism since all work is
distributed across a team of threads. This works well for multi-core CPUs but
GPUs are composed of a high number of processing units organized into groups
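
A sketch of the multi-level parallelism PWR009 alludes to, assuming OpenMP offloading and a hypothetical kernel:

```c
// Sketch: "teams distribute" spreads iterations over groups of processing
// units, and "parallel for" over the threads within each group.
void double_all(double *a, int n) {
  #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
  for (int i = 0; i < n; i++) {
    a[i] *= 2.0;
  }
}
```
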
2 changes: 1 addition & 1 deletion Checks/PWR012/README.md
@@ -24,7 +24,7 @@ variable modifications, and also contributes to improve compiler and static
analyzer code coverage.

In parallel programming, derived data types are often discouraged when
offloading to the GPU because they may inhibit compiler analyses and
offloading to the GPU because they may inhibit compiler analyses and
optimizations due to [pointer aliasing](../../Glossary/Pointer-aliasing.md). Also, it
can cause unnecessary data movements impacting performance or incorrect data
movements impacting correctness and even crashes impacting code quality.
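
One refactoring consistent with PWR012's advice is to pass the members a kernel needs as plain arguments; a sketch, assuming a hypothetical `vec_t` derived type:

```c
typedef struct { double *data; int n; } vec_t; // hypothetical derived type

// Sketch: only data[0:n] is mapped, and no struct-internal pointer has to
// be resolved on the device.
void vec_scale(double *data, int n, double f) {
  #pragma omp target teams distribute parallel for map(tofrom: data[0:n])
  for (int i = 0; i < n; i++) {
    data[i] *= f;
  }
}
// call site: vec_scale(v.data, v.n, 2.0);
```
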
2 changes: 1 addition & 1 deletion Checks/PWR019/README.md
@@ -12,7 +12,7 @@ innermost loop.

### Relevance

Vectorization takes advantage of having as high a trip count (ie. number of
Vectorization takes advantage of having as high a trip count (i.e., number of
iterations) as possible. When loops are
[perfectly nested](../../Glossary/Perfect-loop-nesting.md) and they can be safely
interchanged, making the loop with the highest trip count the innermost should
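
A sketch of the interchange PWR019 suggests, assuming `n` is much larger than `m` and that the swap is legal for this access pattern:

```c
// before: for (j = 0; j < n; j++) for (i = 0; i < m; i++) a[i][j] = 0.0;
// after: the high-trip-count j loop is innermost, giving the vectorizer
// long, consecutive (row-major) runs.
void init(int m, int n, double a[m][n]) {
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
      a[i][j] = 0.0;
    }
  }
}
```
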
4 changes: 2 additions & 2 deletions Checks/PWR020/README.md
@@ -14,7 +14,7 @@ statements in a first loop and the non-vectorizable statements in a second loop.

[vectorization](../../Glossary/Vectorization.md) is one of the most important ways to
speed up the computation of a loop. In practice, loops may contain a mix of
computations where only a part of the loop body introduces loop-carrie
computations where only a part of the loop body introduces loop-carried
dependencies that prevent vectorization. Different types of compute patterns
make explicit the loop-carried dependencies present in the loop. On the one
hand, the
@@ -25,7 +25,7 @@ vectorized:

* The
[sparse reduction compute pattern](../../Glossary/Patterns-for-performance-optimization/Sparse-reduction.md) - e.g.
the reduction variable has an read-write indirect memory access pattern which
the reduction variable has a read-write indirect memory access pattern which
does not allow to determine the dependencies between the loop iterations at
compile-time.

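
A sketch of the fission PWR020 describes, separating a vectorizable statement from a sparse reduction (hypothetical arrays):

```c
// Sketch: after fission the first loop vectorizes cleanly, while the second
// keeps the indirect, idx-driven update whose dependencies are unknown at
// compile time.
void split(double *b, double *hist, const double *a, const int *idx, int n) {
  for (int i = 0; i < n; i++) {
    b[i] = 2.0 * a[i]; // vectorizable
  }
  for (int i = 0; i < n; i++) {
    hist[idx[i]] += a[i]; // sparse reduction
  }
}
```
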
2 changes: 1 addition & 1 deletion Checks/PWR021/README.md
@@ -28,7 +28,7 @@ vectorized:

* The
[sparse reduction compute pattern](../../Glossary/Patterns-for-performance-optimization/Sparse-reduction.md) - e.g.
the reduction variable has an read-write indirect memory access pattern which
the reduction variable has a read-write indirect memory access pattern which
does not allow to determine the dependencies between the loop iterations at
compile-time.

8 changes: 4 additions & 4 deletions Checks/PWR022/README.md
@@ -3,18 +3,18 @@
### Issue

Conditional evaluates to the same value for all loop iterations and can be
[moved outside of the loop](../../Glossary/Loop-unswitching.md) to favor
[moved outside the loop](../../Glossary/Loop-unswitching.md) to favor
[vectorization](../../Glossary/Vectorization.md).

### Actions

Move the invariant conditional outside of the loop by duplicating the loop body.
Move the invariant conditional outside the loop by duplicating the loop body.

### Relevance

Classical vectorization requirements do not allow branching inside the loop
body, which would mean no `if` and `switch` statements inside the loop body are
allowed. However, loop invariant conditionals can be extracted outside of the
allowed. However, loop invariant conditionals can be extracted outside the
loop to facilitate vectorization. Therefore, it is often good to extract
invariant conditional statements out of vectorizable loops to increase
performance. A conditional whose expression evaluates to the same value for all
@@ -25,7 +25,7 @@ it will always be either true or false.
> This optimization is called
> [loop unswitching](../../Glossary/Loop-unswitching.md) and the compilers can do
> it automatically in simple cases. However, in more complex cases, the compiler
> will omit this optimization and therefore it is beneficial to do it manually.
> will omit this optimization and, therefore, it is beneficial to do it manually.

### Code example

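
A sketch of the unswitching PWR022 describes (hypothetical flag and arrays):

```c
// Sketch: the loop-invariant test is evaluated once and the body duplicated,
// leaving two branch-free loops the compiler can vectorize.
void unswitched(double *a, const double *b, double off, int use_off, int n) {
  if (use_off) {
    for (int i = 0; i < n; i++) {
      a[i] = b[i] + off;
    }
  } else {
    for (int i = 0; i < n; i++) {
      a[i] = b[i];
    }
  }
}
```
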
2 changes: 1 addition & 1 deletion Checks/PWR023/README.md
@@ -18,7 +18,7 @@ guarantee that the pointers do not alias one another, i.e. no memory address is
accessible through two different pointers. The developer can use the `restrict`
C keyword to inform the compiler that the specified block of memory is not
aliased by any other block. Providing this information can help the compiler
generate more efficient code or vectorize the loop. Therefore it is always
generate more efficient code or vectorize the loop. Therefore, it is always
recommended to use `restrict` whenever possible so that the compiler has as much
information as possible to perform optimizations such as vectorization.

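
A sketch of the `restrict` usage PWR023 recommends:

```c
// Sketch: restrict asserts that dst and src never overlap, removing the
// aliasing assumption that would otherwise block vectorization.
void copy(double *restrict dst, const double *restrict src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i] = src[i];
  }
}
```
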
4 changes: 2 additions & 2 deletions Checks/PWR024/README.md
@@ -3,8 +3,8 @@
### Issue

The loop is currently not in
[OpenMP canonical](../../Glossary/OpenMP-canonical-form.md) form but it can be made
OpenMP compliant through refactoring.
[OpenMP canonical](../../Glossary/OpenMP-canonical-form.md) form, but it can be
made OpenMP compliant through refactoring.

### Actions

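
As a sketch of the refactoring PWR024 suggests, a pointer-chasing loop rewritten in canonical form (hypothetical code):

```c
// before (not canonical): while (p != end) { *p++ = 0.0; }
// after: init, test and increment are all visible in the for statement, so
// the loop can carry an OpenMP worksharing construct.
void zero(double *p, int n) {
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    p[i] = 0.0;
  }
}
```
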
2 changes: 1 addition & 1 deletion Checks/PWR031/README.md
@@ -22,7 +22,7 @@ or square roots.
> [!NOTE]
> Some compilers under some circumstances (e.g. relaxed IEEE 754 semantics) can
> do this optimization automatically. However, doing it manually will guarantee
> best performance across all the compilers.
> the best performance across all the compilers.

### Code example

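
A sketch of the manual rewrite PWR031 describes, assuming a small integer exponent:

```c
// Sketch: the explicit product avoids a libm pow() call that compilers only
// simplify automatically under relaxed IEEE 754 semantics.
double cube(double x) {
  return x * x * x; // instead of pow(x, 3.0)
}
```
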
2 changes: 1 addition & 1 deletion Checks/PWR032/README.md
@@ -17,7 +17,7 @@ In C, there are several versions of the same mathematical function for different
types. For example, the square root function is available for floats, doubles
and long doubles through `sqrtf`, `sqrt` and `sqrtl`, respectively. Oftentimes,
the developer who is not careful will not use the function matching the data
type. For instance, most developers will just use "sqrt" for any data type,
type. For instance, most developers will just use `sqrt` for any data type,
instead of using `sqrtf` when the argument is float.

The type mismatch does not cause a compiler error because of the implicit type
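
A sketch of the matched-type call PWR032 recommends:

```c
#include <math.h>

// Sketch: sqrtf keeps the computation in single precision; calling sqrt here
// would insert an implicit float -> double -> float round trip.
float hypotenuse(float x, float y) {
  return sqrtf(x * x + y * y);
}
```
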
2 changes: 1 addition & 1 deletion Checks/PWR034/README.md
@@ -1,4 +1,4 @@
# PWR034: avoid strided array access to improve performance
# PWR034: Avoid strided array access to improve performance

### Issue

2 changes: 1 addition & 1 deletion Checks/PWR035/README.md
@@ -15,7 +15,7 @@ or changing the data layout to avoid non-consecutive access in hot loops.
### Relevance

Accessing an array in a non-consecutive order is less efficient than accessing
consecutive positions because the latter maximises
consecutive positions because the latter maximizes
[locality of reference](../../Glossary/Locality-of-reference.md).

### Code example
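
A sketch of the consecutive traversal PWR035 favors, on a row-major C array:

```c
// Sketch: with j innermost, successive iterations touch adjacent memory,
// maximizing locality of reference; interchanging the loops would stride
// by n elements per access.
double sum_all(int n, double a[n][n]) {
  double sum = 0.0;
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
      sum += a[i][j];
    }
  }
  return sum;
}
```
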
4 changes: 2 additions & 2 deletions Checks/PWR040/README.md
@@ -19,8 +19,8 @@ for low performance on modern computer systems. Matrices are
Iterating over them column-wise (in C) and row-wise (in Fortran) is inefficient,
because it uses the memory subsystem suboptimally.

Nested loops that iterate over matrices in an inefficient manner can be
optimized by applying [loop tiling](../../Glossary/Loop-tiling.md). In contrast to
Nested loops that iterate over matrices inefficiently can be optimized by
applying [loop tiling](../../Glossary/Loop-tiling.md). In contrast to
[loop interchange](../../Glossary/Loop-interchange.md), loop tiling doesn't remove
the inefficient memory access, but instead breaks the problem into smaller
subproblems. Smaller subproblems have a much better
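
A sketch of the tiling PWR040 describes, on a transpose-like nest (the tile size `TS` is an assumed, tunable constant):

```c
#define TS 64 // assumed tile size; tune to the targeted cache level

// Sketch: the unavoidable strided access is confined to TS x TS blocks
// small enough to stay resident in cache.
void transpose(int n, double b[n][n], const double a[n][n]) {
  for (int ii = 0; ii < n; ii += TS) {
    for (int jj = 0; jj < n; jj += TS) {
      for (int i = ii; i < ii + TS && i < n; i++) {
        for (int j = jj; j < jj + TS && j < n; j++) {
          b[j][i] = a[i][j];
        }
      }
    }
  }
}
```
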
6 changes: 3 additions & 3 deletions Checks/PWR042/README.md
@@ -30,7 +30,7 @@ efficient one.
In order to perform the loop interchange, the loops need to be
[perfectly nested](../../Glossary/Perfect-loop-nesting.md), i.e. all the statements
need to be inside the innermost loop. However, due to the initialization of a
reduction variablе, loop interchange is not directly applicable.
reduction variable, loop interchange is not directly applicable.

> [!NOTE]
> Often, loop interchange enables vectorization of the innermost loop which
@@ -104,7 +104,7 @@ first and the third loops are single non-nested loops, so let's focus on the
second loop nest as it will have a higher impact on performance.

Note that this loop nest is perfectly nested, making loop interchange
applicable. This optimization will turn the `ij` order into `ji`, improving
applicable. This optimization will turn the `ij` order into `ji`, improving
the locality of reference:

```c
@@ -187,7 +187,7 @@ first and the third loops are single non-nested loops, so let's focus on the
second loop nest as it will have a higher impact on performance.

Note that this loop nest is perfectly nested, making loop interchange
applicable. This optimization will turn the `ij` order into `ji`, improving
applicable. This optimization will turn the `ij` order into `ji`, improving
the locality of reference:

```fortran
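
For orientation, a generic sketch of such an interchange on a row-major C array (not PWR042's own example):

```c
// before: for (j...) for (i...) a[i][j] += 1.0;  (stride-n accesses)
// after the interchange, the innermost index is also the fastest-varying one:
void bump(int n, double a[n][n]) {
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
      a[i][j] += 1.0;
    }
  }
}
```
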
4 changes: 2 additions & 2 deletions Checks/PWR045/README.md
@@ -8,7 +8,7 @@ boost.

### Actions

Calculate the reciprocal outside of the loop and replace the division with
Calculate the reciprocal outside the loop and replace the division with
multiplication with a reciprocal

### Relevance
@@ -18,7 +18,7 @@ performing the division in each iteration of the loop, one could do the
following:

* For the expression `A / B`, calculate the reciprocal of the denominator
(`RECIP_B = 1.0 / B`) and put it outside of the loop.
(`RECIP_B = 1.0 / B`) and put it outside the loop.

* Replace the expression `A / B`, use `A * RECIP_B`.

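
A sketch of the rewrite PWR045 describes (the rounding may differ slightly from true division):

```c
// Sketch: one division outside the loop replaces n divisions inside it.
void divide_all(double *a, double b, int n) {
  const double recip_b = 1.0 / b;
  for (int i = 0; i < n; i++) {
    a[i] = a[i] * recip_b; // instead of a[i] / b
  }
}
```
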
4 changes: 2 additions & 2 deletions Checks/PWR048/README.md
@@ -34,8 +34,8 @@ __attribute__((const)) double example(double a, double b, double c) {
}
```

In the above example, the expression `a + b * c` is effectively a FMA operation
and it can be replaced with a call to `fma`:
In the above example, the expression `a + b * c` is effectively an FMA
operation and it can be replaced with a call to `fma`:

```c
#include <math.h>
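
For reference, a sketch of the `fma` form; the C `fma(x, y, z)` prototype computes `x * y + z` with a single rounding:

```c
#include <math.h>

// Sketch: b * c + a folded into one fused multiply-add.
__attribute__((const)) double example(double a, double b, double c) {
  return fma(b, c, a);
}
```
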
10 changes: 5 additions & 5 deletions Checks/PWR049/README.md
@@ -2,26 +2,26 @@

### Issue

A condition that depends only on the iterator variable can be moved outside of
the loop.
A condition that depends only on the iterator variable can be moved outside the
loop.

### Actions

Move iterator-dependent condition outside of the loop.
Move iterator-dependent condition outside the loop.

### Relevance

A condition that depends only on the iterator is predictable: we know exactly at
which iteration of the loop it is going to be true. Nevertheless, it is
evaluated in each iteration of the loop.

Moving the iterator-dependent condition outside of the loop will result in fewer
Moving the iterator-dependent condition outside the loop will result in fewer
instructions executed in the loop. This transformation can occasionally enable
vectorization, and for the loops that are already vectorized, it can increase
vectorization efficiency.

> [!NOTE]
> Moving an iterator-dependent condition outside of the loop is a creative
> Moving an iterator-dependent condition outside the loop is a creative
> process. Depending on the type of condition, it can involve loop peeling,
> [loop fission](../../Glossary/Loop-fission.md) or loop unrolling.

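
As one concrete instance of the peeling PWR049 mentions, a sketch for a condition that is true only at `i == 0` (hypothetical code):

```c
// before: for (i = 0; i < n; i++) { if (i == 0) a[i] = 0.0; else a[i] = b[i - 1]; }
// after peeling the first iteration, the remaining loop is branch-free:
void shift(double *a, const double *b, int n) {
  if (n > 0) {
    a[0] = 0.0;
    for (int i = 1; i < n; i++) {
      a[i] = b[i - 1];
    }
  }
}
```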