@zdevito zdevito commented Dec 8, 2025

Stack from ghstack (oldest at bottom):

Draft of how we can link everything into monarch statically.
After looking at the binary size, it is pretty clear that we want to link libnccl dynamically (it can be large if it supports many architectures). So I will change that to an approach similar to the one used for libcuda, where we load the needed functions dynamically at runtime.


```
(monarch) [zdevito@devgpu014 /data/users/zdevito/fbsource/fbcode/monarch]$ ldd ./python/monarch/_rust_bindings.so
        linux-vdso.so.1 (0x00007ffdabdf5000)
        libpython3.11.so.1.0 => /home/zdevito/local/miniconda3/envs/monarch/lib/libpython3.11.so.1.0 (0x00007fe1fae00000)
        libgcc_s.so.1 => /home/zdevito/local/miniconda3/envs/monarch/lib/libgcc_s.so.1 (0x00007fe201a0f000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fe201921000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fe1faa00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe201a2b000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe20191a000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fe201915000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00007fe201910000)
```

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
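
To make the "load the needed functions dynamically" idea concrete, here is a minimal sketch in Python using `ctypes`, which wraps `dlopen`/`dlsym`. It is not the code in this PR (the real bindings live in the Rust extension); `LazyLibrary` and its methods are hypothetical names, and libm stands in for libnccl/libcuda so the example runs on a machine without a GPU stack.

```python
from __future__ import annotations

import ctypes
import ctypes.util


class LazyLibrary:
    """Hypothetical helper: open a shared library on first use, not at import time."""

    def __init__(self, candidates: list[str]) -> None:
        self._candidates = candidates
        self._handle = None  # filled in by the first successful load

    def _load(self) -> ctypes.CDLL:
        if self._handle is None:
            errors = []
            for name in self._candidates:
                try:
                    self._handle = ctypes.CDLL(name)  # dlopen under the hood
                    break
                except OSError as exc:
                    errors.append(f"{name}: {exc}")
            else:
                raise RuntimeError("could not load any of: " + "; ".join(errors))
        return self._handle

    def symbol(self, name: str, restype, argtypes):
        """Look up one function by name (dlsym) and attach its C signature."""
        lib = self._load()
        try:
            fn = getattr(lib, name)
        except AttributeError as exc:
            raise RuntimeError(f"symbol {name!r} not found") from exc
        fn.restype = restype
        fn.argtypes = argtypes
        return fn


# libm is only a stand-in here; a real caller would list libnccl/libcuda sonames.
libm = LazyLibrary([ctypes.util.find_library("m") or "libm.so.6"])
cos = libm.symbol("cos", ctypes.c_double, [ctypes.c_double])
print(cos(0.0))
```

Because nothing is opened until `symbol` is first called, the extension module itself carries no link-time dependency on the library, and a missing library surfaces as a regular error at the call site rather than as an import failure.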

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Dec 8, 2025
This was referenced Dec 9, 2025
zdevito added a commit that referenced this pull request Dec 9, 2025
Pull Request resolved: #2085

Link everything into monarch statically, or dlopen it.

This links the RDMA-related libraries and libc++ statically for increased portability: a monarch build with RDMA support can still be used on a system whose libc++ version differs from the one it was built against.

By building rdma-core from source, we can also make sure Dennis' 64-bit patch is applied.

The total size of our .so is now 63 MB, with no uncommon dependencies:

```
(monarch) [zdevito@devgpu014 /data/users/zdevito/fbsource/fbcode/monarch]$ ldd ./python/monarch/_rust_bindings.so
        linux-vdso.so.1 (0x00007f347a684000)
        libpython3.11.so.1.0 => /home/zdevito/local/miniconda3/envs/monarch/lib/libpython3.11.so.1.0 (0x00007f3477200000)
        libgcc_s.so.1 => /home/zdevito/local/miniconda3/envs/monarch/lib/libgcc_s.so.1 (0x00007f347a664000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f3477125000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f3476e00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f347a686000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f347a64a000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f347a645000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00007f347a640000)
(monarch) [zdevito@devgpu014 /data/users/zdevito/fbsource/fbcode/monarch]$ du -h ./python/monarch/_rust_bindings.so
63M     ./python/monarch/_rust_bindings.so
```

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
ghstack-source-id: 327971955
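
The "no uncommon deps" claim above can be checked mechanically. The following is a rough sketch, not part of this PR: it parses `ldd`-style output and reports any library outside an allowlist. The names `ALLOWED_PREFIXES`, `parse_ldd`, and `unexpected_deps` are hypothetical, and the allowlist is simply the set of libraries shown in the output above.

```python
from __future__ import annotations

import os
import re

# Assumption: these prefixes mirror the libraries listed in the ldd output above.
ALLOWED_PREFIXES = (
    "linux-vdso.so", "libpython", "libgcc_s.so", "libm.so", "libc.so",
    "ld-linux", "libpthread.so", "libdl.so", "libutil.so",
)

# Matches "name => path (0xaddr)" as well as "name (0xaddr)".
LDD_LINE = re.compile(r"^(\S+)(?: => (\S+))? \((0x[0-9a-f]+)\)$")


def parse_ldd(output: str) -> list[str]:
    """Return the base names of the libraries listed in ldd output."""
    names = []
    for raw in output.splitlines():
        line = raw.strip()
        if line.endswith("=> not found"):
            names.append(os.path.basename(line.split()[0]))
            continue
        match = LDD_LINE.match(line)
        if match:  # shell prompts and `du` output do not match and are skipped
            names.append(os.path.basename(match.group(1)))
    return names


def unexpected_deps(output: str, allowed=ALLOWED_PREFIXES) -> list[str]:
    """List dependencies whose name does not start with an allowed prefix."""
    return [n for n in parse_ldd(output) if not n.startswith(tuple(allowed))]


SAMPLE = """
        linux-vdso.so.1 (0x00007f347a684000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f3477125000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f3476e00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f347a686000)
63M     ./python/monarch/_rust_bindings.so
"""
print(unexpected_deps(SAMPLE))
```

In a build check, the input would come from running `ldd` on the built `_rust_bindings.so`; a non-empty result would indicate that something (for example an rdma-core library) is still being linked dynamically.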
zdevito added a commit that referenced this pull request Dec 9, 2025
Pull Request resolved: #2085

ghstack-source-id: 328253183

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328257055

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328257535

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328260415

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328274952

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328276528

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328278142

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
zdevito added a commit that referenced this pull request Dec 10, 2025
Pull Request resolved: #2085

ghstack-source-id: 328288065

Differential Revision: [D88540873](https://our.internmc.facebook.com/intern/diff/D88540873/)
@zdevito zdevito mentioned this pull request Dec 10, 2025
@meta-codesync meta-codesync bot closed this in be3d6bc Dec 10, 2025
meta-codesync bot commented Dec 10, 2025

This pull request has been merged in be3d6bc.

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged meta-exported

3 participants