Conversation

@hotislandn

PR checklist

  • Updated function header with a short description and version number
  • Added test case for bug fix or new feature
  • Validated on real hardware

When compiled with arm-none-eabi-gcc 13.2.1 at "-O2" for Armv8-R, this test case always reports "0" as the result.

@fdesbiens
Contributor

Hi @hotislandn.

I will bring this to the attention of the team. We will be in touch.

@offer796m

Hi @hotislandn.

I will bring this to the attention of the team. We will be in touch.

Thanks.

@fdesbiens
Contributor

Hi @hotislandn.

I will start investigating this issue soon. In the meantime, can you provide additional details, please? Did you try different GCC versions? Are you getting the same result with -O1 or -O3? What other flags are you using?

@fdesbiens fdesbiens self-assigned this Dec 9, 2025
@hotislandn
Author

  1. We have not yet tried other GCC versions or the -O1/-O3 options. However, we tried LLVM (clang 21), and it works as expected for this case.
  2. The flags in use: -march=armv8-r+fp.sp -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mlittle-endian -mthumb -fdata-sections -ffunction-sections -O2 -g3
  3. We examined the disassembly of the ELF file generated by GCC. It shows that tm_basic_processing_counter is treated as a local register variable in the function tm_basic_processing_thread_0_entry, and the increment is never written back to the global variable, so the report thread reads 0 every time.
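The behaviour described in item 3 can be illustrated with a minimal sketch. It assumes the counter is a plain (non-volatile) global that is incremented inside a loop that never exits, which is what the disassembly suggests; the benchmark source itself is not quoted in this thread. All names below (demo_array, demo_counter, demo_worker, demo_report) are hypothetical, and the loop is bounded only so that the sketch can run to completion. Qualifying the counter as volatile is one possible way to force a store on every pass, not necessarily the change made in this PR.

```c
#include <stdint.h>

#define DEMO_ARRAY_SIZE 1024u

/* Hypothetical stand-ins for the benchmark globals (assumed, not copied). */
static volatile uint32_t demo_array[DEMO_ARRAY_SIZE];

/*
 * volatile makes every read and write of the counter an observable access,
 * so the compiler may not keep the value only in a register across the loop.
 */
static volatile uint32_t demo_counter;

/* Bounded variant of the worker loop; the real entry function loops forever. */
static void demo_worker(uint32_t passes)
{
    uint32_t i;

    /* Clear the working array once before the main loop. */
    for (i = 0; i < DEMO_ARRAY_SIZE; i++) {
        demo_array[i] = 0;
    }

    while (passes-- > 0u) {
        for (i = 0; i < DEMO_ARRAY_SIZE; i++) {
            demo_array[i] = (demo_array[i] + demo_counter) ^ demo_array[i];
        }
        /* Explicit load, add, store; each pass writes the global back. */
        demo_counter = demo_counter + 1u;
    }
}

/* What a reporting thread would read from memory. */
static uint32_t demo_report(void)
{
    return demo_counter;
}
```

With a non-volatile counter and an unbounded while (1) loop, the compiler is allowed to delay the store until after the loop, which never arrives, so a reader in another thread keeps seeing the initial value.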

@hotislandn
Author

hotislandn commented Dec 10, 2025

We ran a compile-only test today with GCC and LLVM against the latest ThreadX code base.
The details are:
GCC: gcc version 14.2.1 20241119 (Arm GNU Toolchain 14.2.Rel1 (Build arm-14.52)) <--- new version here
Clang: version 21.1.0
ThreadX source code at commit: c4ad279

Compile command for GCC:
threadx/utility/benchmarks/thread_metric$ arm-none-eabi-gcc -march=armv8-r -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mlittle-endian -mthumb -fdata-sections -ffunction-sections -O2 -g3 -c tm_basic_processing_test.c
objdump shows:

00000000 <tm_basic_processing_thread_0_entry>:
   0:   2300            movs    r3, #0
   2:   f240 0100       movw    r1, #0
   6:   461a            mov     r2, r3
   8:   f2c0 0100       movt    r1, #0
   c:   f841 2023       str.w   r2, [r1, r3, lsl #2]
  10:   3301            adds    r3, #1
  12:   f5b3 6f80       cmp.w   r3, #1024       @ 0x400
  16:   d1f9            bne.n   c <tm_basic_processing_thread_0_entry+0xc>
  18:   f240 0300       movw    r3, #0
  1c:   f2c0 0300       movt    r3, #0
  20:   f8d3 c000       ldr.w   ip, [r3]
  24:   2300            movs    r3, #0
  26:   f851 2023       ldr.w   r2, [r1, r3, lsl #2]
  2a:   f851 0023       ldr.w   r0, [r1, r3, lsl #2]
  2e:   4462            add     r2, ip
  30:   4042            eors    r2, r0
  32:   f841 2023       str.w   r2, [r1, r3, lsl #2]
  36:   3301            adds    r3, #1
  38:   f5b3 6f80       cmp.w   r3, #1024       @ 0x400
  3c:   d1f3            bne.n   26 <tm_basic_processing_thread_0_entry+0x26>
  3e:   f10c 0c01       add.w   ip, ip, #1
  42:   e7ef            b.n     24 <tm_basic_processing_thread_0_entry+0x24>

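The increment of "tm_basic_processing_counter" is the add.w at address 3e. It only updates register ip; no store back to the global variable follows before the branch at 42.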

Compile command for LLVM:
clang --target=arm-none-eabi -march=armv8-r -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mlittle-endian -mthumb -fdata-sections -ffunction-sections -O2 -g3 -c tm_basic_processing_test.c
It generates:

00000000 <tm_basic_processing_thread_0_entry>:
   0:   f240 0000       movw    r0, #0
   4:   2100            movs    r1, #0
   6:   f2c0 0000       movt    r0, #0
   a:   2200            movs    r2, #0
   c:   f840 1022       str.w   r1, [r0, r2, lsl #2]
  10:   3201            adds    r2, #1
  12:   f5b2 6f80       cmp.w   r2, #1024       @ 0x400
  16:   d1f9            bne.n   c <tm_basic_processing_thread_0_entry+0xc>
  18:   f240 0c00       movw    ip, #0
  1c:   f2c0 0c00       movt    ip, #0
  20:   f8dc e000       ldr.w   lr, [ip]
  24:   2300            movs    r3, #0
  26:   f850 1023       ldr.w   r1, [r0, r3, lsl #2]
  2a:   f850 2023       ldr.w   r2, [r0, r3, lsl #2]
  2e:   4471            add     r1, lr
  30:   4051            eors    r1, r2
  32:   f840 1023       str.w   r1, [r0, r3, lsl #2]
  36:   3301            adds    r3, #1
  38:   f5b3 6f80       cmp.w   r3, #1024       @ 0x400
  3c:   d1f3            bne.n   26 <tm_basic_processing_thread_0_entry+0x26>
  3e:   f10e 0e01       add.w   lr, lr, #1
  42:   f8cc e000       str.w   lr, [ip]
  46:   e7ed            b.n     24 <tm_basic_processing_thread_0_entry+0x24>

The instructions at addresses 3e and 42 do the expected job (increment the counter, then store it back to the global variable), which is different from the GCC output.

BTW, -O1 and -O3 with GCC do not solve the problem either.

Hope these tedious logs are helpful for this topic.
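For comparison with the two listings, here is a hedged sketch of another way to get the load, add, store sequence emitted on every pass regardless of compiler: a C11 atomic counter with relaxed ordering. This is an illustration under assumptions, not the change proposed in this PR. The names (sketch_array, sketch_counter, sketch_worker, sketch_report) are hypothetical, the loop is bounded so the sketch terminates, and whether <stdatomic.h> is acceptable in the benchmark code base is not something this thread establishes.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_ARRAY_SIZE 1024u

/* Hypothetical stand-ins for the benchmark globals (assumed, not copied). */
static volatile uint32_t sketch_array[SKETCH_ARRAY_SIZE];

/* Atomic counter; static storage duration means it starts at zero. */
static atomic_uint sketch_counter;

/* Bounded variant of the worker loop; the real entry function loops forever. */
static void sketch_worker(unsigned passes)
{
    unsigned i;

    for (i = 0; i < SKETCH_ARRAY_SIZE; i++) {
        sketch_array[i] = 0;
    }

    while (passes-- > 0u) {
        /* Read the counter once per pass, as both listings do. */
        uint32_t c = (uint32_t)atomic_load_explicit(&sketch_counter,
                                                    memory_order_relaxed);
        for (i = 0; i < SKETCH_ARRAY_SIZE; i++) {
            sketch_array[i] = (sketch_array[i] + c) ^ sketch_array[i];
        }
        /* Atomic read-modify-write; relaxed ordering, no extra fences requested. */
        atomic_fetch_add_explicit(&sketch_counter, 1u, memory_order_relaxed);
    }
}

/* What a reporting thread would read. */
static unsigned sketch_report(void)
{
    return atomic_load_explicit(&sketch_counter, memory_order_relaxed);
}
```

Compared with a volatile qualifier, the atomic form also makes the cross-thread access well defined in the C11 memory model, at the cost of depending on the toolchain's atomic support for the target.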


3 participants