[Deepin-Kernel-SIG] [linux 6.18.y] [deepin] deepin: arm64: decrease NR_CPUS to 2048 #1753
Reviewer's Guide
This PR adjusts the Deepin ARM64 desktop kernel configuration to reduce the maximum supported CPU count from 4096 to 1024, improving memory usage and cacheline behavior for realistic Deepin desktop/server/HPC workloads.
Is there any before/after comparison data?
Pull request overview
This PR adjusts the Deepin arm64 desktop kernel default configuration to reduce compile-time CPU scalability limits, lowering static memory overhead associated with large NR_CPUS builds.
Changes:
- Reduce CONFIG_NR_CPUS from 4096 to 1024 in the Deepin arm64 desktop defconfig.
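As a rough illustration of where the static overhead comes from: a CPU mask is a bitmap with one bit per possible CPU, so its size grows linearly with CONFIG_NR_CPUS. The sketch below only does that arithmetic. The 64-byte cacheline is an assumption (some arm64 parts use other sizes), the helper name `cpumask_bytes` is made up for this example, and the real saving depends on how many kernel structures embed such masks and on other config options.

```python
BITS_PER_LONG = 64     # arm64 is a 64-bit architecture
CACHELINE_BYTES = 64   # assumption: typical arm64 cacheline size, not taken from the PR


def cpumask_bytes(nr_cpus: int) -> int:
    """Size in bytes of a bitmap holding nr_cpus bits, rounded up to whole longs."""
    longs = (nr_cpus + BITS_PER_LONG - 1) // BITS_PER_LONG
    return longs * (BITS_PER_LONG // 8)


for nr_cpus in (4096, 2048, 1024):
    size = cpumask_bytes(nr_cpus)
    # Ceiling division: number of cachelines one mask spans.
    cachelines = -(-size // CACHELINE_BYTES)
    print(f"NR_CPUS={nr_cpus}: {size} bytes per mask, {cachelines} cachelines")
```

Under these assumptions a single mask shrinks from 512 bytes at 4096 CPUs to 256 bytes at 2048 and 128 bytes at 1024.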
cd9b19f to 83fa9ea
deepin inclusion
category: performance
CONFIG_NR_CPUS=4096 clearly exceeds what any of our desktop/server/HPC users require.
Lower it to a realistic value to better fit cachelines and save memory.
UnixBench tests show no regression.
Log:
without:
24 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 29982441.6 lps (10.0 s, 1 samples)
Double-Precision Whetstone 4732.1 MWIPS (9.5 s, 1 samples)
Execl Throughput 2432.1 lps (29.8 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 605217.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 173412.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 1757219.0 KBps (30.0 s, 1 samples)
Pipe Throughput 1418412.0 lps (10.0 s, 1 samples)
Pipe-based Context Switching 156680.7 lps (10.0 s, 1 samples)
Process Creation 5021.4 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 7183.0 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 5890.7 lpm (60.0 s, 1 samples)
System Call Overhead 903826.0 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 29982441.6 2569.2
Double-Precision Whetstone 55.0 4732.1 860.4
Execl Throughput 43.0 2432.1 565.6
File Copy 1024 bufsize 2000 maxblocks 3960.0 605217.0 1528.3
File Copy 256 bufsize 500 maxblocks 1655.0 173412.0 1047.8
File Copy 4096 bufsize 8000 maxblocks 5800.0 1757219.0 3029.7
Pipe Throughput 12440.0 1418412.0 1140.2
Pipe-based Context Switching 4000.0 156680.7 391.7
Process Creation 126.0 5021.4 398.5
Shell Scripts (1 concurrent) 42.4 7183.0 1694.1
Shell Scripts (8 concurrent) 6.0 5890.7 9817.8
System Call Overhead 15000.0 903826.0 602.6
========
System Benchmarks Index Score 1219.5
------------------------------------------------------------------------
Benchmark Run: Tue Jun 02 2026 10:43:23 - 10:50:06
24 CPUs in system; running 24 parallel copies of tests
Dhrystone 2 using register variables 721768353.8 lps (10.0 s, 1 samples)
Double-Precision Whetstone 108404.8 MWIPS (10.0 s, 1 samples)
Execl Throughput 41784.6 lps (29.0 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 5908598.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 4161488.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 7629387.0 KBps (30.0 s, 1 samples)
Pipe Throughput 33967811.2 lps (10.0 s, 1 samples)
Pipe-based Context Switching 5096634.8 lps (10.0 s, 1 samples)
Process Creation 61454.0 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 89999.3 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 11291.7 lpm (60.0 s, 1 samples)
System Call Overhead 21695535.8 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 721768353.8 61848.2
Double-Precision Whetstone 55.0 108404.8 19710.0
Execl Throughput 43.0 41784.6 9717.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 5908598.0 14920.7
File Copy 256 bufsize 500 maxblocks 1655.0 4161488.0 25144.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 7629387.0 13154.1
Pipe Throughput 12440.0 33967811.2 27305.3
Pipe-based Context Switching 4000.0 5096634.8 12741.6
Process Creation 126.0 61454.0 4877.3
Shell Scripts (1 concurrent) 42.4 89999.3 21226.2
Shell Scripts (8 concurrent) 6.0 11291.7 18819.6
System Call Overhead 15000.0 21695535.8 14463.7
========
System Benchmarks Index Score 16976.8
with:
Benchmark Run: Tue Jun 02 2026 12:38:58 - 12:45:42
24 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 30184659.9 lps (10.0 s, 1 samples)
Double-Precision Whetstone 4517.4 MWIPS (9.9 s, 1 samples)
Execl Throughput 2421.8 lps (29.2 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 604313.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 177897.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 1750778.0 KBps (30.0 s, 1 samples)
Pipe Throughput 1414020.3 lps (10.0 s, 1 samples)
Pipe-based Context Switching 149830.5 lps (10.0 s, 1 samples)
Process Creation 5016.6 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 7164.2 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 5882.4 lpm (60.0 s, 1 samples)
System Call Overhead 902406.4 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 30184659.9 2586.5
Double-Precision Whetstone 55.0 4517.4 821.3
Execl Throughput 43.0 2421.8 563.2
File Copy 1024 bufsize 2000 maxblocks 3960.0 604313.0 1526.0
File Copy 256 bufsize 500 maxblocks 1655.0 177897.0 1074.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 1750778.0 3018.6
Pipe Throughput 12440.0 1414020.3 1136.7
Pipe-based Context Switching 4000.0 149830.5 374.6
Process Creation 126.0 5016.6 398.1
Shell Scripts (1 concurrent) 42.4 7164.2 1689.7
Shell Scripts (8 concurrent) 6.0 5882.4 9803.9
System Call Overhead 15000.0 902406.4 601.6
========
System Benchmarks Index Score 1211.6
------------------------------------------------------------------------
Benchmark Run: Tue Jun 02 2026 12:45:42 - 12:52:25
24 CPUs in system; running 24 parallel copies of tests
Dhrystone 2 using register variables 720910401.8 lps (10.0 s, 1 samples)
Double-Precision Whetstone 108403.0 MWIPS (9.9 s, 1 samples)
Execl Throughput 41148.2 lps (29.1 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 5873695.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 4199501.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 7592616.0 KBps (30.0 s, 1 samples)
Pipe Throughput 33760461.7 lps (10.0 s, 1 samples)
Pipe-based Context Switching 4999475.3 lps (10.0 s, 1 samples)
Process Creation 61476.4 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 89711.7 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 11251.2 lpm (60.1 s, 1 samples)
System Call Overhead 21762060.0 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 720910401.8 61774.7
Double-Precision Whetstone 55.0 108403.0 19709.6
Execl Throughput 43.0 41148.2 9569.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 5873695.0 14832.6
File Copy 256 bufsize 500 maxblocks 1655.0 4199501.0 25374.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 7592616.0 13090.7
Pipe Throughput 12440.0 33760461.7 27138.6
Pipe-based Context Switching 4000.0 4999475.3 12498.7
Process Creation 126.0 61476.4 4879.1
Shell Scripts (1 concurrent) 42.4 89711.7 21158.4
Shell Scripts (8 concurrent) 6.0 11251.2 18752.0
System Call Overhead 15000.0 21762060.0 14508.0
========
System Benchmarks Index Score 16910.5
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
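To put the "no regression" claim in numbers, the following sketch compares the final index scores from the logs above and, as a sanity check, recomputes one score as the geometric mean of its per-test INDEX values. All figures are copied from the logs; the function and variable names are made up for this example.

```python
import math

# Final "System Benchmarks Index Score" values from the logs: (without, with).
INDEX_SCORES = {
    "1 parallel copy": (1219.5, 1211.6),
    "24 parallel copies": (16976.8, 16910.5),
}

# Per-test INDEX column of the "without" single-copy run, copied from the log.
WITHOUT_SINGLE_COPY = [
    2569.2, 860.4, 565.6, 1528.3, 1047.8, 3029.7,
    1140.2, 391.7, 398.5, 1694.1, 9817.8, 602.6,
]


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def geometric_mean(values: list[float]) -> float:
    """Geometric mean, computed via logs to avoid overflow in the product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


for label, (without, patched) in INDEX_SCORES.items():
    print(f"{label}: {percent_change(without, patched):+.2f}%")

print(f"recomputed single-copy score: {geometric_mean(WITHOUT_SINGLE_COPY):.1f}")
```

The two scores differ by about -0.65% (1 copy) and -0.39% (24 copies). Each test ran with a single sample, so differences this small are probably within run-to-run noise, though the logs alone cannot confirm that.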
83fa9ea to b29db32
Done.
deepin inclusion
category: performance
CONFIG_NR_CPUS=2048 clearly exceeds what any of our desktop/server/HPC users require. Lower it to a realistic value to better fit cachelines and save memory.