From b29db32670cf1cae26dce504b994a275fc0ae7bb Mon Sep 17 00:00:00 2001
From: Wentao Guan
Date: Wed, 20 May 2026 17:12:37 +0800
Subject: [PATCH] deepin: arm64: decrease NR_CPUS to 2048
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

deepin inclusion
category: performance

CONFIG_NR_CPUS=4096 is well beyond what any of our desktop, server, or
HPC users need. Lower it to 2048, a more realistic limit, so that data
structures sized by NR_CPUS fit cache lines better and use less memory.

UnixBench shows no meaningful regression: the System Benchmarks Index
Score changes by less than 1% in both the 1-copy run (1219.5 vs. 1211.6)
and the 24-copy run (16976.8 vs. 16910.5).

Log:

without this patch:

24 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables    29982441.6 lps   (10.0 s, 1 samples)
Double-Precision Whetstone                  4732.1 MWIPS (9.5 s, 1 samples)
Execl Throughput                            2432.1 lps   (29.8 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks     605217.0 KBps  (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks       173412.0 KBps  (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks    1757219.0 KBps  (30.0 s, 1 samples)
Pipe Throughput                          1418412.0 lps   (10.0 s, 1 samples)
Pipe-based Context Switching              156680.7 lps   (10.0 s, 1 samples)
Process Creation                            5021.4 lps   (30.0 s, 1 samples)
Shell Scripts (1 concurrent)                7183.0 lpm   (60.0 s, 1 samples)
Shell Scripts (8 concurrent)                5890.7 lpm   (60.0 s, 1 samples)
System Call Overhead                      903826.0 lps   (10.0 s, 1 samples)

System Benchmarks Index Values            BASELINE       RESULT    INDEX
Dhrystone 2 using register variables      116700.0   29982441.6   2569.2
Double-Precision Whetstone                    55.0       4732.1    860.4
Execl Throughput                              43.0       2432.1    565.6
File Copy 1024 bufsize 2000 maxblocks       3960.0     605217.0   1528.3
File Copy 256 bufsize 500 maxblocks         1655.0     173412.0   1047.8
File Copy 4096 bufsize 8000 maxblocks       5800.0    1757219.0   3029.7
Pipe Throughput                            12440.0    1418412.0   1140.2
Pipe-based Context Switching                4000.0     156680.7    391.7
Process Creation                             126.0       5021.4    398.5
Shell Scripts (1 concurrent)                  42.4       7183.0   1694.1
Shell Scripts (8 concurrent)                   6.0       5890.7   9817.8
System Call Overhead                       15000.0     903826.0    602.6
                                                                ========
System Benchmarks Index Score                                     1219.5

------------------------------------------------------------------------
Benchmark Run: Tue Jun 02 2026 10:43:23 - 10:50:06
24 CPUs in system; running 24 parallel copies of tests

Dhrystone 2 using register variables   721768353.8 lps   (10.0 s, 1 samples)
Double-Precision Whetstone                108404.8 MWIPS (10.0 s, 1 samples)
Execl Throughput                           41784.6 lps   (29.0 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks    5908598.0 KBps  (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks      4161488.0 KBps  (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks    7629387.0 KBps  (30.0 s, 1 samples)
Pipe Throughput                         33967811.2 lps   (10.0 s, 1 samples)
Pipe-based Context Switching             5096634.8 lps   (10.0 s, 1 samples)
Process Creation                           61454.0 lps   (30.0 s, 1 samples)
Shell Scripts (1 concurrent)               89999.3 lpm   (60.0 s, 1 samples)
Shell Scripts (8 concurrent)               11291.7 lpm   (60.0 s, 1 samples)
System Call Overhead                    21695535.8 lps   (10.0 s, 1 samples)

System Benchmarks Index Values            BASELINE       RESULT    INDEX
Dhrystone 2 using register variables      116700.0  721768353.8  61848.2
Double-Precision Whetstone                    55.0     108404.8  19710.0
Execl Throughput                              43.0      41784.6   9717.3
File Copy 1024 bufsize 2000 maxblocks       3960.0    5908598.0  14920.7
File Copy 256 bufsize 500 maxblocks         1655.0    4161488.0  25144.9
File Copy 4096 bufsize 8000 maxblocks       5800.0    7629387.0  13154.1
Pipe Throughput                            12440.0   33967811.2  27305.3
Pipe-based Context Switching                4000.0    5096634.8  12741.6
Process Creation                             126.0      61454.0   4877.3
Shell Scripts (1 concurrent)                  42.4      89999.3  21226.2
Shell Scripts (8 concurrent)                   6.0      11291.7  18819.6
System Call Overhead                       15000.0   21695535.8  14463.7
                                                                ========
System Benchmarks Index Score                                    16976.8

with this patch:

Benchmark Run: Tue Jun 02 2026 12:38:58 - 12:45:42
24 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables    30184659.9 lps   (10.0 s, 1 samples)
Double-Precision Whetstone                  4517.4 MWIPS (9.9 s, 1 samples)
Execl Throughput                            2421.8 lps   (29.2 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks     604313.0 KBps  (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks       177897.0 KBps  (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks    1750778.0 KBps  (30.0 s, 1 samples)
Pipe Throughput                          1414020.3 lps   (10.0 s, 1 samples)
Pipe-based Context Switching              149830.5 lps   (10.0 s, 1 samples)
Process Creation                            5016.6 lps   (30.0 s, 1 samples)
Shell Scripts (1 concurrent)                7164.2 lpm   (60.0 s, 1 samples)
Shell Scripts (8 concurrent)                5882.4 lpm   (60.0 s, 1 samples)
System Call Overhead                      902406.4 lps   (10.0 s, 1 samples)

System Benchmarks Index Values            BASELINE       RESULT    INDEX
Dhrystone 2 using register variables      116700.0   30184659.9   2586.5
Double-Precision Whetstone                    55.0       4517.4    821.3
Execl Throughput                              43.0       2421.8    563.2
File Copy 1024 bufsize 2000 maxblocks       3960.0     604313.0   1526.0
File Copy 256 bufsize 500 maxblocks         1655.0     177897.0   1074.9
File Copy 4096 bufsize 8000 maxblocks       5800.0    1750778.0   3018.6
Pipe Throughput                            12440.0    1414020.3   1136.7
Pipe-based Context Switching                4000.0     149830.5    374.6
Process Creation                             126.0       5016.6    398.1
Shell Scripts (1 concurrent)                  42.4       7164.2   1689.7
Shell Scripts (8 concurrent)                   6.0       5882.4   9803.9
System Call Overhead                       15000.0     902406.4    601.6
                                                                ========
System Benchmarks Index Score                                     1211.6

------------------------------------------------------------------------
Benchmark Run: Tue Jun 02 2026 12:45:42 - 12:52:25
24 CPUs in system; running 24 parallel copies of tests

Dhrystone 2 using register variables   720910401.8 lps   (10.0 s, 1 samples)
Double-Precision Whetstone                108403.0 MWIPS (9.9 s, 1 samples)
Execl Throughput                           41148.2 lps   (29.1 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks    5873695.0 KBps  (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks      4199501.0 KBps  (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks    7592616.0 KBps  (30.0 s, 1 samples)
Pipe Throughput                         33760461.7 lps   (10.0 s, 1 samples)
Pipe-based Context Switching             4999475.3 lps   (10.0 s, 1 samples)
Process Creation                           61476.4 lps   (30.0 s, 1 samples)
Shell Scripts (1 concurrent)               89711.7 lpm   (60.0 s, 1 samples)
Shell Scripts (8 concurrent)               11251.2 lpm   (60.1 s, 1 samples)
System Call Overhead                    21762060.0 lps   (10.0 s, 1 samples)

System Benchmarks Index Values            BASELINE       RESULT    INDEX
Dhrystone 2 using register variables      116700.0  720910401.8  61774.7
Double-Precision Whetstone                    55.0     108403.0  19709.6
Execl Throughput                              43.0      41148.2   9569.4
File Copy 1024 bufsize 2000 maxblocks       3960.0    5873695.0  14832.6
File Copy 256 bufsize 500 maxblocks         1655.0    4199501.0  25374.6
File Copy 4096 bufsize 8000 maxblocks       5800.0    7592616.0  13090.7
Pipe Throughput                            12440.0   33760461.7  27138.6
Pipe-based Context Switching                4000.0    4999475.3  12498.7
Process Creation                             126.0      61476.4   4879.1
Shell Scripts (1 concurrent)                  42.4      89711.7  21158.4
Shell Scripts (8 concurrent)                   6.0      11251.2  18752.0
System Call Overhead                       15000.0   21762060.0  14508.0
                                                                ========
System Benchmarks Index Score                                    16910.5

Signed-off-by: Wentao Guan
---
 arch/arm64/configs/deepin_arm64_desktop_defconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/configs/deepin_arm64_desktop_defconfig b/arch/arm64/configs/deepin_arm64_desktop_defconfig
index 8b336ff14b9e..0378938feba6 100644
--- a/arch/arm64/configs/deepin_arm64_desktop_defconfig
+++ b/arch/arm64/configs/deepin_arm64_desktop_defconfig
@@ -97,7 +97,7 @@ CONFIG_ARCH_VISCONTI=y
 CONFIG_ARCH_XGENE=y
 CONFIG_ARCH_ZYNQMP=y
 CONFIG_ARM64_VA_BITS_48=y
-CONFIG_NR_CPUS=4096
+CONFIG_NR_CPUS=2048
 CONFIG_NUMA=y
 CONFIG_HZ_1000=y
 CONFIG_XEN=y