fs: Add Kernel-level VFS Performance Profiler #18607
Sumit6307 wants to merge 1 commit into apache:master
Conversation
Force-pushed 6cbaa23 to b34527a
cederom left a comment:
- Thank you @Sumit6307, very nice idea! :-)
- My remarks are noted in the code.
- We should align the nomenclature PROFILE vs PROFILER (the second seems better suited imho), as both names are used for the same functionality. Maybe PERF or PERFPROF would clearly indicate a performance profiler?
- Please also provide simple nuttx/Documentation for the new functionality.
@Sumit6307 why are you including the mnemofs commit here?
Yup, I would put that into a separate PR too :-P
@Sumit6307 why not reuse the sched_note syscall to profile fs performance? You can learn from the Documentation.
Force-pushed b34527a to f8a783e
Force-pushed f8a783e to f68f6d2
@xiaoxiang781216 Thank you for the detailed review! I have addressed all the inline feedback.
Regarding [...] By exposing this purely through [...] The CI checks (both [...])
Force-pushed f68f6d2 to 4cf5ea7
@simbit18 Thank you so much for catching this! I have just updated both build files. Specifically: [...]
Everything should now build smoothly across both CMake and Make infrastructures. Let me know if you spot anything else!
Force-pushed 4cf5ea7 to df2577d
@xiaoxiang781216 maybe @Sumit6307's idea to use it to find regressions easily during the CI test is a good one. But I am still not sure it is a good idea to keep it enabled all the time. Maybe just select some board profiles that have it enabled by default, to be validated during the CI check.
@Sumit6307 please move the mnemofs fix to another PR to avoid polluting this PR |
This adds a kernel-level performance profiler for the VFS. When CONFIG_FS_PROFILER is enabled, the core VFS system calls (file_read, file_write, file_open, and file_close) are instrumented with clock_systime_timespec() to track high-resolution execution times. The collected statistics are exposed dynamically via a new procfs node at /proc/fs/profile, allowing CI regression testing without external debugging tools.
Signed-off-by: Sumit6307 <sumitkesar6307@gmail.com>
Force-pushed df2577d to f4c0133
@xiaoxiang781216 @acassis I deeply apologize for the confusion! When I initially opened this PR, my local branch had accidentally inherited the mnemofs commit. I have just performed a strict rebase to remove it.
@acassis Thank you for validating this! I completely agree with your approach. To prevent any bloat on production hardware, I just pushed a fix changing [...]. Since you believe this is valuable for CI regression testing, I would be more than happy to explicitly enable it via [...]. Which simulator configuration would you suggest?
I think sim:citest is the right place to include it! Please @simbit18 @lupyuen confirm it |
I agree with @xiaoxiang781216 that it would be good to use the existing framework for profiling this. I'm also not sure why this type of profiling would need to exist in the kernel space? It is just a timer surrounding open/close/read/write calls, which could be done from user space applications. What regressions is this catching? |
Note: Please adhere to Contributing Guidelines.
Summary
Currently, assessing the latency or throughput of VFS operations requires external tools, ad-hoc test apps, or complex debug setups. This makes automated performance regression testing in CI difficult.
This PR introduces a Kernel-level VFS Performance Profiler to address this gap.
By enabling the new CONFIG_FS_PROFILER configuration option, the core VFS system calls (file_read, file_write, file_open, and file_close) are instrumented with clock_systime_timespec() to track high-resolution execution times (in nanoseconds) and invocation counts.
The collected statistics are exposed dynamically via a new procfs node at /proc/fs/profile. This enables any testing script, CI workflow, or user-space application to monitor filesystem performance bottlenecks and catch regressions.
Impact
- Statistics can be read with a simple cat /proc/fs/profile.
- The feature is fully optional (guarded by CONFIG_FS_PROFILER). When disabled, code size and performance impact are exactly zero.
- Critical sections (enter_critical_section) are kept short to ensure SMP (multi-core) scaling is not bottlenecked.
Testing
Tested on Host: Windows 11 (via WSL2).
Tested on Board: sim:nsh (NuttX Simulator).
Test procedure: enable CONFIG_FS_PROFILER=y and CONFIG_FS_PROCFS=y. [...]
Test Log: