@illwieckz
Member

@illwieckz illwieckz commented Apr 7, 2025

I added some CMakeLists.txt code to rebuild sel_ldr and nacl_helper_bootstrap.

This relies on the unified DaemonPlatform framework, copied on purpose from DaemonEngine/Daemon#1641, so that this CMake code makes cross-compiled builds just as easy.

I need help to complete the src/trusted/service_runtime/CMakeLists.txt file.

The CMakeLists.txt files are a rewrite of the SConstruct file and the *.scons files, with all unit tests removed. The remaining unported code is commented out, on lines starting with #TODO:.
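
A minimal sketch for tracking how much porting work is left, assuming the `#TODO:` convention described above. The helper name `count_todo_lines` and the directory walk are illustrative assumptions, not part of this PR:

```python
from pathlib import Path


def count_todo_lines(root):
    """Map each CMakeLists.txt under root to its number of '#TODO:' lines."""
    counts = {}
    for path in sorted(Path(root).rglob("CMakeLists.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Count lines that start with the '#TODO:' marker, ignoring indentation.
        todo = sum(1 for line in text.splitlines()
                   if line.lstrip().startswith("#TODO:"))
        if todo:
            counts[str(path)] = todo
    return counts


if __name__ == "__main__":
    # Run from the repository root to list files that still have unported code.
    for name, todo in count_todo_lines(".").items():
        print(f"{todo:4d}  {name}")
```

Files that no longer appear in the output have no commented-out SCons leftovers under this convention.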

Build status:

| system | arch | build sel_ldr | build nacl_helper_bootstrap | run helloworld nexe |
| --- | --- | --- | --- | --- |
| windows | amd64 | not tested | N/A | not tested |
| windows | i686 | not tested | N/A | not tested |
| mingw | amd64 | ✅️ | N/A | not tested |
| mingw | i686 | ✅️ | N/A | not tested |
| linux | amd64 | ✅️ | ✅️ | ✅️ |
| linux | i686 | ✅️ | ✅️ | not tested |
| linux | armhf | ✅️ | ✅️ | not tested |
| linux | armhf 16k | ✅️ | ✅️ | not tested |
| linux | armel | ✅️ | ✅️ | not tested |
| linux | mips | ✅️ | ✅️ | not tested |
| android | armel | ✅️ | ✅️ | ❌️ |
| macos | amd64 | ✅️ | N/A | not tested |

Dynamically linked loader:

| system | arch | shared link sel_ldr | run helloworld nexe |
| --- | --- | --- | --- |
| windows | amd64 | not tested | not tested |
| windows | i686 | not tested | not tested |
| mingw | amd64 | ✅️ | not tested |
| mingw | i686 | ✅️ | not tested |
| linux | amd64 | ✅️ | ✅️ |
| linux | i686 | ✅️ | not tested |
| linux | armhf | ✅️ | ✅️ |
| linux | armhf 16k | ✅️ | not tested |
| linux | armel | ✅️ | not tested |
| linux | mips | ✅️ | not tested |
| android | armel | ✅️ | not tested |
| macos | amd64 | ✅️ | not tested |

Statically linked loader: NOW REMOVED.

| system | arch | static link sel_ldr | run helloworld nexe |
| --- | --- | --- | --- |
| windows | amd64 | not implemented | not implemented |
| windows | i686 | not implemented | not implemented |
| mingw | amd64 | ✅️ | not tested |
| mingw | i686 | ✅️ | not tested |
| linux | amd64 | ✅️ | ✅️ |
| linux | i686 | ✅️ | not tested |
| linux | armhf | ✅️ | ❌️ segfault |
| linux | armhf 16k | ✅️ | not tested |
| linux | armel | ✅️ | not tested |
| linux | mips | ✅️ | not tested |
| android | armel | ✅️ | not tested |
| macos | amd64 | ✅️ | not tested |

Things like linux-mips were tested only because I ported the SCons code to CMake for completeness and to make sure I forgot nothing.

I tested android-armel because I remember that in the past @cu-kai tried to get daemon-tty running on Android to get a console for his server.

@illwieckz illwieckz marked this pull request as draft April 7, 2025 18:42
@illwieckz illwieckz force-pushed the illwieckz/cmake branch 2 times, most recently from 2775579 to 993144e Compare April 7, 2025 20:34
@slipher
Member

slipher commented Apr 7, 2025

Is it really so bad to use Scons? Unlike CMake, it can handle multiple toolchains, which makes it well-suited for this repo. Also if we keep the same build system, we can easily compare things between our version and upstream.

@illwieckz
Member Author

illwieckz commented Apr 7, 2025

> Is it really so bad to use Scons?

Yes. 😅️

> Unlike CMake, it can handle multiple toolchains, which makes it well-suited for this repo.

Uh, totally not. The Scons scripts are not compatible with cross-compiling to begin with.

The first purpose of this effort is to make it possible to use multiple toolchains.

@slipher
Member

slipher commented Apr 7, 2025

> The Scons scripts are not compatible with cross-compiling to begin with.

Wrong. With Chromium native_client I can do ./scons --mode=nacl,opt-linux platform=x86-32 sel_ldr irt_core_raw or ./scons --mode=nacl,opt-linux platform=arm sel_ldr irt_core_raw and everything completes successfully and produces executables of the expected architectures.

@illwieckz
Member Author

How do I build with MinGW for Windows on Linux? For Linux Arm?

How do I make a static nacl_loader? How do I rebuild with a 16k PageSize for Arm?

This scons stuff is over-convoluted…

@illwieckz
Member Author

illwieckz commented Apr 7, 2025

> Wrong. With Chromium native_client I can do ./scons --mode=nacl,opt-linux platform=arm sel_ldr

With this exact command I get that:

Exception: Cannot find a toolchain for arm in toolchain/linux_x86/pnacl_newlib_raw:
  File "SConstruct", line 2799:
    if UsingNaclMode(): nacl_env = nacl_env.Clone(
  File "/usr/lib/python3/dist-packages/SCons/Environment.py", line 1610:
    apply_tools(clone, tools, toolpath)
  File "/usr/lib/python3/dist-packages/SCons/Environment.py", line 117:
    _ = env.Tool(tool)
  File "/usr/lib/python3/dist-packages/SCons/Environment.py", line 2033:
    tool(self)
  File "/usr/lib/python3/dist-packages/SCons/Tool/__init__.py", line 265:
    self.generate(env, *args, **kw)
  File "site_scons/site_tools/naclsdk.py", line 756:
    _SetEnvForNativeSdk(env, root)
  File "site_scons/site_tools/naclsdk.py", line 109:
    raise Exception("Cannot find a toolchain for %s in %s" %

I have arm-linux-gnueabihf-gcc (from the gcc-arm-linux-gnueabihf package), and also clang.

@slipher
Member

slipher commented Apr 7, 2025

> How do I build with MinGW for Windows on Linux?

Found some documentation: https://github.com/DaemonEngine/native_client/blob/master/docs/build_systems.md. Although cross-architecture builds are supported, cross-OS builds are not. And you can't build with MinGW at all as it is designed for the MSVC toolchain. It seems building with MinGW would imply a porting effort beyond just the build system, as there is, e.g., a Microsoft assembler file.

The lack of cross-OS support would seem to be a limitation of Native Client though, not a limitation of Scons.

> For Linux Arm?

I posted that in the previous message.

> > Wrong. With Chromium native_client I can do ./scons --mode=nacl,opt-linux platform=arm sel_ldr
>
> With this exact command I get that:

You're getting an error finding a PNaCl toolchain, which should have been downloaded by gclient or whatever. Make sure you are in a fully equipped Chromium environment.

@illwieckz
Member Author

> > > Wrong. With Chromium native_client I can do ./scons --mode=nacl,opt-linux platform=arm sel_ldr
> >
> > With this exact command I get that:
>
> You're getting an error finding a PNaCl toolchain, which should have been downloaded by gclient or whatever. Make sure you are in a fully equipped Chromium environment.

Why do I need a PNaCl toolchain to build an Arm sel_ldr?

> The lack of cross-OS support would seem to be a limitation of Native Client though, not a limitation of Scons.

Yes, all that scons code in the repository is not meant for cross-compilation, that's what I meant.

> And you can't build with MinGW at all as it is designed for the MSVC toolchain. It seems building with MinGW would imply a porting effort beyond just the build system, as there is, e.g., a Microsoft assembler file.

Very annoying… I have seen they also have some Cygwin code, so I wonder if that can be used on MinGW; I haven't looked at this deeply, though.

@illwieckz
Member Author

Sorry, the output of ./scons --mode=opt-linux platform=arm sel_ldr is (I forgot to remove the nacl mode):

AttributeError: 'SConsEnvironment' object has no attribute 'Program':
  File "SConstruct", line 3889:
    BuildEnvironments(selected_envs)
  File "site_init", line 198:
    
  File "/usr/lib/python3/dist-packages/SCons/Util/envs.py", line 242:
    return self.method(*nargs, **kwargs)
  File "site_scons/site_tools/defer.py", line 148:
    func(env)
  File "site_init", line 125:
    
  File "/usr/lib/python3/dist-packages/SCons/Script/SConscript.py", line 598:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/usr/lib/python3/dist-packages/SCons/Script/SConscript.py", line 285:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "src/trusted/validator_arm/build.scons", line 271:
    nexe = untrusted_env.ComponentProgram(test, 'testdata/' + test + '.S',
  File "/usr/lib/python3/dist-packages/SCons/Util/envs.py", line 242:
    return self.method(*nargs, **kwargs)
  File "site_scons/site_tools/component_builders.py", line 485:
    out_nodes = env.Program(prog_name, *args, **kwargs)

@illwieckz
Member Author

This can already build sel_ldr on linux-amd64.

@illwieckz
Member Author

This can now rebuild sel_ldr on linux-armhf.

@illwieckz
Member Author

This can now rebuild sel_ldr on linux-i686.

@illwieckz
Member Author

This can now rebuild nacl_helper_bootstrap on linux-amd64, linux-i686 and linux-armhf.

@illwieckz
Member Author

This can now rebuild sel_ldr on macos-amd64.

@illwieckz
Member Author

illwieckz commented Apr 8, 2025

By using JWasm as a MASM-compatible assembler I can cross-build (from Linux to Windows with MinGW) the sel_ldr binary for windows-amd64 as long as I put ksamd64.inc, kxamd64.inc and macamd64.inc in ../../src/trusted/service_runtime/arch/x86_64.

I got them from:

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 2 times, most recently from 37f0e62 to 0c40165 Compare April 8, 2025 10:30
@illwieckz
Member Author

illwieckz commented Apr 8, 2025

Using the same tricks, it is now possible to rebuild sel_ldr for windows-i686 with MinGW.

It means that it is now possible to rebuild the loader for all the DæmonEngine platforms using CMake (including the related nacl_helper_bootstrap when needed): on macOS for the macOS loader, on Linux for all other platforms.

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 2 times, most recently from 8d956b8 to 11e641b Compare April 8, 2025 10:45
@illwieckz
Member Author

In fact, the single file macamd64.inc is enough. One can put it in include-hax/winsdk/ and it works.

@illwieckz
Member Author

illwieckz commented Apr 8, 2025

Even better, those two obvious macros are enough:

push_reg macro Reg
	push Reg
	.pushreg Reg
	endm

alloc_stack macro Size
	sub rsp, Size
	.allocstack Size
	endm

So I provided a file named include-hax/fake_masm/macamd64.inc that just contains that, and when using MinGW, JWasm is used with include-hax/fake_masm as include directory.

On MSVC with the original MASM, the system's macamd64.inc is meant to be used, but this is untested.

@illwieckz
Member Author

So this repository can now cross-compile sel_ldr for Windows using MinGW on Linux out of the box.

@illwieckz
Member Author

illwieckz commented Apr 8, 2025

I added a commit to statically link the sel_ldr binary; unfortunately, linking prints this warning:

transport_common.cc:(.text+0x78c): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

I don't know if that's a problem for us.

@illwieckz
Member Author

To run on a kernel with a 16K page size, it looks like we would also need to rebuild the nexe:

$ readelf -l irt_core-armhf.nexe | grep LOAD
  LOAD           0x010000 0x0ffc0000 0x0ffc0000 0x30000 0x30000 R E 0x10000
  LOAD           0x000000 0x3efe0000 0x3efe0000 0x041b8 0x041b8 R   0x10000
  LOAD           0x0041b8 0x3eff41b8 0x3eff41b8 0x00b90 0x011b0 RW  0x10000

@slipher
Member

slipher commented Apr 10, 2025

The nexe is loaded by sel_ldr, not the system executable loader, right? So maybe its "page size" doesn't matter, or maybe the code of sel_ldr has to be changed. Hard to speculate since I don't understand the concept of page size of an executable.

@illwieckz
Member Author

illwieckz commented Apr 10, 2025

As far as I know, this is related to the kernel, not the executable loader.

Systems like Box64 or FEX-Emu, which run amd64/i686 binaries on arm64, have to do some translation on the binaries (not just for the architecture, but for the page size itself). Wine cannot run them out of the box for the same reason, even with an amd64 translator.

To run 4k binaries on a 16k kernel, the current solution is to run a 4k secondary kernel in a microvm using some pass-through techniques for input and graphics: https://asahilinux.org/2024/10/aaa-gaming-on-asahi-linux

I assume the NaCl virtual machine is not low-level enough to emulate a kernel interface; it would be surprising if it were.

@slipher
Member

slipher commented Apr 10, 2025

https://tristanxr.com/post/why-16k-page-size/ has more details on Asahi Linux's experiences with page size portability issues. It says that most Linux programs work without issues. The ones with problems are ones that manage their own memory mappings somehow, e.g. using a custom allocator instead of libc's. In the thing about Windows games you linked, the problem is probably with the Windows memory allocator or some other part of the Windows runtime. The problem is surely not with executable alignment: given that you are translating to a different ISA, you can lay out the translated instructions however you want. So I don't think the section alignments in the program header really matter for 16k page compatibility. As a last bit of evidence, I tried setting a binary's executable section's alignment to 1, and it still worked.

I do very much expect that sel_ldr is a program that manages its own memory mappings. Firstly, I believe that it maps the nexe into memory itself, rather than using the kernel's executable loading. Secondly, the NaCl sandbox model specifies a specific range of memory that the untrusted code is allowed to read and write. All memory allocations by untrusted code must go in this range and all allocations by trusted code outside it. If there is an NaCl syscall for mapping more memory pages, we should check whether the page size is a hard-coded number, baked in at compile time from a system header, or queried at runtime.
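
To illustrate why that distinction matters, here is a minimal sketch in Python (used only for illustration; sel_ldr itself is C, where the runtime query would be `sysconf(_SC_PAGESIZE)`). `HARDCODED_PAGE_SIZE`, `runtime_page_size` and `round_up_to_page` are hypothetical names, not taken from the NaCl sources:

```python
import mmap

# Hypothetical stand-in for a page size baked in at compile time.
HARDCODED_PAGE_SIZE = 4096


def runtime_page_size():
    """Return the page size of the system the interpreter is running on."""
    return mmap.PAGESIZE


def round_up_to_page(length, page_size):
    """Round a mapping length up to a whole number of pages."""
    return (length + page_size - 1) // page_size * page_size


if __name__ == "__main__":
    page = runtime_page_size()
    print(f"runtime page size: {page}")
    print(f"hard-coded assumption holds: {page == HARDCODED_PAGE_SIZE}")
    # A length rounded for 4 KiB pages is not page-aligned on a 16 KiB kernel.
    length = 0x41B8
    print(hex(round_up_to_page(length, HARDCODED_PAGE_SIZE)))
    print(hex(round_up_to_page(length, page)))
```

If the two rounded values differ on the target machine, any code relying on the hard-coded constant would request mappings that the kernel does not consider page-aligned.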

Note that if you read the nexe program header dump closely, the alignment is not 4k, but 64k. So no problem even if it somehow mattered.
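
That reading of the dump can be checked mechanically. The sketch below parses the three LOAD lines quoted earlier in this thread and tests whether each Align value is a multiple of a candidate page size; the function names are illustrative, and it says nothing about whether sel_ldr actually uses that field:

```python
# LOAD program headers as printed by readelf -l for irt_core-armhf.nexe.
READELF_LOAD_LINES = """\
  LOAD           0x010000 0x0ffc0000 0x0ffc0000 0x30000 0x30000 R E 0x10000
  LOAD           0x000000 0x3efe0000 0x3efe0000 0x041b8 0x041b8 R   0x10000
  LOAD           0x0041b8 0x3eff41b8 0x3eff41b8 0x00b90 0x011b0 RW  0x10000
"""


def load_alignments(readelf_output):
    """Extract the Align column (last field) of every LOAD program header."""
    alignments = []
    for line in readelf_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOAD":
            alignments.append(int(fields[-1], 16))
    return alignments


def compatible_with_page_size(alignments, page_size):
    """True when every segment alignment is a multiple of the page size."""
    return all(align % page_size == 0 for align in alignments)


if __name__ == "__main__":
    aligns = load_alignments(READELF_LOAD_LINES)
    for page_size in (4 * 1024, 16 * 1024, 64 * 1024):
        ok = compatible_with_page_size(aligns, page_size)
        print(f"{page_size // 1024}k pages: {ok}")
```

With an alignment of 0x10000 (64k) on every segment, the check passes for 4k, 16k and 64k pages alike.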

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 2 times, most recently from 2dc28fc to a5ae566 Compare April 11, 2025 00:31
@illwieckz
Member Author

The nacl_helper_bootstrap for linux-armhf crashes when the sel_ldr is linked statically.

It works on linux-amd64, but not on linux-armhf… That's annoying because arm is the platform where it would be easier to have a static sel_ldr (fewer files to ship, and a simpler loading chain).

Even when building with debug symbols, I cannot debug the tool:

Reading symbols from ./nacl_helper_bootstrap...
(gdb) r
Starting program: ./nacl_helper_bootstrap ./sel_ldr --r_debug=0xXXXXXXXXXXXXXXXX --reserved_at_zero=0xXXXXXXXXXXXXXXXX -a -S -B irt_core.nexe -- helloworld-armhf.nexe

Program received signal SIGSEGV, Segmentation fault.
0xf7f34924 in ?? ()
(gdb) bt
#0  0xf7f34924 in ?? ()
#1  0xf7ea78c0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 4 times, most recently from a8436ab to 961bfa0 Compare April 11, 2025 04:31
@illwieckz
Member Author

illwieckz commented Apr 13, 2025

The inability to build a working nacl_helper_bootstrap on amd64 is in fact a GCC bug.

When running in a debugger, it fails at the empty line preceding the assembly startup code (outside of any code block).

When building with GCC 8 instead of GCC 13, it works.

I also noticed that building it with optimization enabled introduces other bugs, whatever GCC version I tested. Disabling optimization (-O0) fixes it on older GCC like GCC 8, but is not enough to fix it on GCC 13.

I'll file a bug report with GCC at some point, with a reduced source sample.

…-reorder disabled on armhf

The -ftoplevel-reorder option breaks the build for armhf.
@illwieckz
Member Author

> When building with GCC 8 instead of GCC 13, it works.

It works up to GCC 12.

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 2 times, most recently from 63bbde8 to 4bad110 Compare October 15, 2025 10:58
@illwieckz illwieckz changed the title from "WIP: Write a CMakeLists.txt to build the NaCl loader" to "Write a CMakeLists.txt to build the NaCl loader" Dec 9, 2025
@illwieckz illwieckz marked this pull request as ready for review December 9, 2025 06:09
@illwieckz
Member Author

Now fixed.

That PR looks ready now.

#endif /* 64-bit Windows */

-#if NACL_WINDOWS && (NACL_BUILD_SUBARCH == 64)
+#if !(NACL_WINDOWS && (NACL_BUILD_SUBARCH == 64)) || !defined(MSC_VER)

This here inverts the condition in a way that doesn't look intentional and doesn't match the comment below.

Member Author

Right, also it's _MSC_VER, not MSC_VER. It looks like there is no MinGW implementation so I just disabled that code on MinGW the same way it is already disabled (as said by this comment) on 64-bit MSC.

@illwieckz illwieckz force-pushed the illwieckz/cmake branch 4 times, most recently from 12459bb to 4340799 Compare December 10, 2025 10:54
@illwieckz
Member Author

I removed everything that wasn't CMake; other things, like the MinGW compatibility code, will be submitted in a separate PR.

@illwieckz
Member Author

The CMake code already includes stuff for MinGW:

So this part references files that aren't there yet.
