214 changes: 214 additions & 0 deletions docs/DeveloperDocs/AArch64/IFunc.md
@@ -0,0 +1,214 @@
# GNU IFunc support with AArch64

GNU IFunc functionality enables a developer to provide multiple implementations
of a function and a resolver function that selects which implementation to use at runtime.
It is typically used for selecting the most optimized implementation for a given CPU / runtime.
The resolver function is called once at the start of the application runtime and then the
selected implementation is fixed.

Example:

```c
__attribute__((ifunc("foo_resolver")))
int foo(int);

int foo_impl(int u) {
  return u;
}

int (*foo_resolver(void))(int) {
  return foo_impl;
}

int bar(int u) {
  int v = foo(u); // Calls the function that is returned by foo_resolver
  return v;
}
```

## Static linking

eld generates a PLT slot for each ifunc symbol. Each PLT slot has a corresponding
GOTPLT slot. This model is very similar to how preemptible dynamic symbols are handled.
However, instead of doing symbol resolution at runtime to fill the GOTPLT slot, the runtime
calls the corresponding resolver function to fill the GOTPLT slot.

For static executables, libc plays the role of the runtime and takes on a task that is
unusual for a libc: resolving symbols and patching the binary at runtime with the resolution information.

eld emits `R_AARCH64_IRELATIVE` relocations in the `.rela.plt` section, and defines the
`__rela_iplt_start` and `__rela_iplt_end` symbols, which hold the start and end addresses of the `.rela.plt` section.

The tiny-loader in the libc processes the `R_AARCH64_IRELATIVE` relocations by
iterating over the [`__rela_iplt_start`, `__rela_iplt_end`) range.

```
Relocation section '.rela.plt' at offset 0x190 contains 6 entries:
Offset Info Type Symbol's Value
0000000000490000 0000000000000408 R_AARCH64_IRELATIVE 4358c0
0000000000490008 0000000000000408 R_AARCH64_IRELATIVE 40db20
```

The addend of each `IRELATIVE` relocation (printed in the symbol-value column above, because
the relocation has no associated symbol) holds the IFunc resolver address, and
the relocation offset points to the GOTPLT slot for the IFunc symbol. For each relocation,
the tiny-loader in the libc calls the IFunc resolver and stores the result in the GOTPLT slot.
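
The loop that the tiny-loader runs can be sketched roughly as follows. This is a simplified illustration, not the code of any particular libc: `rela_entry`, `apply_irelative`, and the `demo_*` helpers are hypothetical names, the resolver is assumed to take no arguments (real resolvers may receive hardware-capability arguments), and the table is a local array rather than the real [`__rela_iplt_start`, `__rela_iplt_end`) range.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as Elf64_Rela; redefined so the sketch needs no <elf.h>. */
typedef struct {
    uint64_t r_offset; /* address of the GOTPLT slot to patch */
    uint64_t r_info;   /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend; /* address of the IFunc resolver */
} rela_entry;

enum { RELOC_IRELATIVE = 1032 }; /* 0x408, the type seen in the Info column */

/* Assumption: resolver takes no arguments and returns the chosen address. */
typedef uintptr_t (*resolver_fn)(void);

/* Walks [start, end), calls each resolver, and stores its result in the
   slot named by r_offset. Returns the number of relocations applied. */
size_t apply_irelative(const rela_entry *start, const rela_entry *end) {
    size_t applied = 0;
    for (const rela_entry *r = start; r != end; ++r) {
        if ((uint32_t)(r->r_info & 0xffffffffu) != RELOC_IRELATIVE)
            continue;
        resolver_fn resolve = (resolver_fn)(uintptr_t)r->r_addend;
        *(uintptr_t *)(uintptr_t)r->r_offset = resolve();
        ++applied;
    }
    return applied;
}

/* Hypothetical stand-ins for one IFunc symbol, used only to exercise the loop. */
int demo_impl(int u) { return u + 1; }
uintptr_t demo_resolver(void) { return (uintptr_t)demo_impl; }

int demo_call_through_slot(int arg) {
    uintptr_t gotplt_slot = 0; /* plays the role of GOTPLT[foo] */
    rela_entry table[1] = {
        { (uint64_t)(uintptr_t)&gotplt_slot, RELOC_IRELATIVE,
          (int64_t)(uintptr_t)demo_resolver },
    };
    apply_irelative(table, table + 1);
    /* A PLT stub would now branch to whatever the slot contains. */
    return ((int (*)(int))gotplt_slot)(arg);
}
```

Because every slot is filled in one pass before the program proper starts running, a later call through the slot never has to re-enter the resolver, which matches the absence of lazy binding.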

Note that there is no lazy binding here.

### Direct reference to an IFunc symbol

Direct references to an IFunc symbol are resolved to the PLT slot of the IFunc symbol.

```
// global variable!
// R_AARCH64_ABS64
int (*foo_gp)(int) = foo;
```

eld will resolve `R_AARCH64_ABS64` relocation to the address of PLT[foo].

### GOT references to an IFunc symbol

GOT references to an IFunc symbol are resolved to the GOTPLT slot of the IFunc symbol
when there are no direct references to the IFunc symbol. When there is a direct reference
to the IFunc symbol, the GOT references to the IFunc symbol are resolved to the
GOT slot of the IFunc symbol.

Let's see why:

Case 1: PIC code and no direct reference

```c
// foo is an ifunc symbol!
int main() {
  // adrp x0, :got:foo ; R_AARCH64_ADR_GOT_PAGE
  // ldr x0, [x0, :got_lo12:foo] ; R_AARCH64_LD64_GOT_LO12_NC
  int (*foo_lp)(int) = foo;
}
```

Here eld resolves the `adrp + ldr` relocation pair to the address of GOTPLT[foo],
so `ldr` loads the address stored in GOTPLT[foo], and `foo_lp` holds the address
of the resolved function. With this design, there is no indirection penalty for
calls through `foo_lp`.

Case 2: PIC code and direct reference

```c
// foo is an ifunc symbol!

// R_AARCH64_ABS64
int (*foo_gp)(int) = foo;

int main() {
  // adrp x0, :got:foo ; R_AARCH64_ADR_GOT_PAGE
  // ldr x0, [x0, :got_lo12:foo] ; R_AARCH64_LD64_GOT_LO12_NC
  int (*foo_lp)(int) = foo;
}
```

Here eld will resolve `ABS64` to the address of PLT[foo]. With this, calls to `foo_gp` will
work as expected. Note that we cannot resolve `ABS64` to the address of
the resolved function because at link time we cannot know the resolved function.

Now, if we resolve `adrp + ldr` relocation pair as before, then `foo_lp` will store the address of
the resolved function. This is a problem because `foo_gp` and `foo_lp`, both pointers to the same
function `foo`, have different values.

To resolve this, whenever there is a direct reference to an ifunc symbol `foo`,
eld creates an additional GOT slot for `foo`, fills it with the address of
PLT[foo], and resolves all GOT references to `foo` to GOT[foo] instead of GOTPLT[foo].
With this design, `foo_lp` stores the address of PLT[foo], the same as `foo_gp`, so
pointer equality holds.
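
The pointer-equality argument can be checked with a small host-side model in plain C. This is only a sketch: `plt_foo`, `gotplt_foo`, `got_foo`, `startup`, and `load_foo_via_got` are hypothetical stand-ins for PLT[foo], GOTPLT[foo], GOT[foo], the libc startup code, and the `adrp + ldr` sequence; none of them are eld or libc names.

```c
#include <stdbool.h>

typedef int (*fn_t)(int);

/* The implementation that the resolver selects. */
int foo_impl(int u) { return u; }
fn_t foo_resolver(void) { return foo_impl; }

/* GOTPLT[foo]: filled at startup with the resolver's result. */
fn_t gotplt_foo;

/* PLT[foo]: a stub that jumps through GOTPLT[foo]. */
int plt_foo(int u) { return gotplt_foo(u); }

/* GOT[foo]: created only when foo has a direct reference; it holds the
   link-time constant address of PLT[foo]. */
fn_t got_foo = plt_foo;

/* Models the libc processing the IRELATIVE relocation for foo. */
void startup(void) { gotplt_foo = foo_resolver(); }

/* Value that a GOT-based reference to foo loads, for both cases. */
fn_t load_foo_via_got(bool has_direct_ref) {
    return has_direct_ref ? got_foo : gotplt_foo;
}
```

With `fn_t foo_gp = plt_foo;` standing in for the `ABS64` reference, `load_foo_via_got(true)` compares equal to `foo_gp`, while `load_foo_via_got(false)` yields `foo_impl`, which differs from `foo_gp`. Calls through either pointer still reach `foo_impl`; the GOT[foo] variant just pays for one extra hop through the stub.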

### IFunc behavior across all relocations

To describe IFunc behavior for all relocations, we categorize the relocations
into the following categories:

- Absolute / PC-relative data relocations
- GOT-related data relocations
- Control flow relocations
- GOT-related instruction relocations
- Absolute / PC-relative address-forming relocations
- Absolute / PC-relative load/store relocations
- General computation relocation

The relocations that GNU ld does not support for IFunc symbols are annotated with
NotSupportedInGNULDForIFunc. The GNU toolchain used to verify this is:
aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 15.2.Rel1 (Build arm-15.86)) 15.2.1 20251203


#### Absolute / PC-relative data relocations

Resolves to PLT[IFuncSymbol]. Sets HasDirectReference[IFuncSymbol] to true.

- R_AARCH64_ABS{16, 32, 64}

Not handled currently.

> [!IMPORTANT]
> They should be resolved to PLT[IFuncSymbol] as well!

- R_AARCH64_PREL{16, 32, 64} (NotSupportedInGNULDForIFunc)
- R_AARCH64_PLT32 (UNSUPPORTED)

#### GOT-related data relocations

- GOTREL{32, 64} (UNSUPPORTED)
- GOTPCREL32 (UNSUPPORTED)

#### Control flow relocations

Resolves to PLT[IFuncSymbol].

- TSTBR14
- CONDBR19
- JUMP26
- CALL26

#### GOT-related instruction relocations

Resolves to GOTPLT[IFuncSymbol] if there is no direct reference to
IFuncSymbol; otherwise resolves to GOT[IFuncSymbol].

- R_AARCH64_ADR_GOT_PAGE
- R_AARCH64_LD{32,64}_GOT_LO12_NC (LD32 variant UNSUPPORTED)
- R_AARCH64_LD{32,64}_GOTPAGE_LO15 (LD32 variant UNSUPPORTED)
- R_AARCH64_GOT_LD_PREL19 (UNSUPPORTED)
- AUTH-ABI GOT relocations (UNSUPPORTED)
- R_AARCH64_MOVW_GOTOFF_G{0,1}{_NC} (UNSUPPORTED)

#### Absolute / PC-relative address-forming relocations

- R_AARCH64_ADR_PREL_LO21 (NotSupportedInGNULDForIFunc)
- R_AARCH64_ADR_PREL_PG_HI21{_NC}

Resolves to PLT[IFuncSymbol]. Sets HasDirectReference[IFuncSymbol] to true.

- R_AARCH64_MOVW_UABS_G{0,1,2,3}{_NC} (NotSupportedInGNULDForIFunc)
- R_AARCH64_MOVW_SABS_G{0,1,2} (NotSupportedInGNULDForIFunc)
- MOVW_PREL_G{0, 1, 2, 3}{_NC} (UNSUPPORTED)

Resolves to IFuncSymbol.

> [!IMPORTANT]
> FIXME: We should perhaps resolve these relocations to PLT[IFuncSymbol]
> instead of IFuncSymbol to ensure pointer equality.

#### Absolute / PC-relative load/store relocations

- LD_PREL_LO19 (NotSupportedInGNULDForIFunc)
- LDST{8, 16, 32, 64, 128}_ABS_LO12_NC (NotSupportedInGNULDForIFunc)

These relocations do not make sense with IFunc symbols. Loading a value
from a function address is invalid, and so is storing a value
to a function address.

> [!IMPORTANT]
> FIXME: It should be an error to use these relocations with IFunc symbols.

#### General computation relocation

- R_AARCH64_ADD_ABS_LO12_NC

Resolves to PLT[IFuncSymbol]. *Do not* set HasDirectReference[IFuncSymbol].
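
The supported categories above can be condensed into one lookup, sketched here in C. This is an illustration of the documented behavior, not eld code: the enum and function names are hypothetical, `has_direct_ref` is assumed to be the final value of HasDirectReference[IFuncSymbol] after all relocations have been scanned, and where the text does not mention the flag the sketch assumes it is left unchanged.

```c
#include <stdbool.h>

typedef enum {
    CAT_ABS_DATA,     /* R_AARCH64_ABS{16,32,64} */
    CAT_CONTROL_FLOW, /* TSTBR14, CONDBR19, JUMP26, CALL26 */
    CAT_GOT_INSN,     /* ADR_GOT_PAGE, LD64_GOT_LO12_NC, LD64_GOTPAGE_LO15 */
    CAT_ADR_PREL,     /* ADR_PREL_LO21, ADR_PREL_PG_HI21{_NC} */
    CAT_MOVW_ABS,     /* MOVW_UABS_*, MOVW_SABS_* */
    CAT_ADD_ABS_LO12  /* R_AARCH64_ADD_ABS_LO12_NC */
} reloc_category;

typedef enum { TARGET_PLT, TARGET_GOTPLT, TARGET_GOT, TARGET_SYMBOL } reloc_target;

typedef struct {
    reloc_target target;  /* what the relocation resolves to */
    bool sets_direct_ref; /* whether scanning it sets HasDirectReference */
} ifunc_action;

ifunc_action ifunc_action_for(reloc_category cat, bool has_direct_ref) {
    switch (cat) {
    case CAT_ABS_DATA:
    case CAT_ADR_PREL:
        return (ifunc_action){ TARGET_PLT, true };
    case CAT_CONTROL_FLOW:
    case CAT_ADD_ABS_LO12:
        return (ifunc_action){ TARGET_PLT, false };
    case CAT_GOT_INSN:
        /* GOT[foo] exists only when some direct reference was seen. */
        return (ifunc_action){ has_direct_ref ? TARGET_GOT : TARGET_GOTPLT, false };
    case CAT_MOVW_ABS:
        /* Current behavior per the FIXME above: the symbol itself. */
        return (ifunc_action){ TARGET_SYMBOL, false };
    }
    return (ifunc_action){ TARGET_SYMBOL, false };
}
```

The choice for `CAT_GOT_INSN` depends on flags set by other relocations, so it can only be made once scanning has finished.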
25 changes: 22 additions & 3 deletions include/eld/SymbolResolver/ResolveInfo.h
@@ -149,6 +149,18 @@ class ResolveInfo {
return ((ThisBitField & PatchableMask) == PatchableMask);
}

  void setIFuncDirectRef() { ThisBitField |= IFuncDirectRefFlag; }

  bool hasIFuncDirectRef() const {
    return ((ThisBitField & IFuncDirectRefMask) == IFuncDirectRefMask);
  }

  void setIFuncNeedsGOT() { ThisBitField |= IFuncNeedsGOTFlag; }

  bool hasIFuncNeedsGOT() const {
    return ((ThisBitField & IFuncNeedsGOTMask) == IFuncNeedsGOTMask);
  }

// ----- observers ----- //
bool isNull() const;

@@ -314,14 +326,19 @@ class ResolveInfo {
static const uint32_t PreserveOffset = 18;
static const uint32_t PreserveMask = 1 << PreserveOffset;

-  // FIXME: offset 19 can be used here!
+  static const uint32_t IFuncAbsRefOffset = 19;
+  static const uint32_t IFuncDirectRefMask = 1 << IFuncAbsRefOffset;

static const uint32_t PatchableOffset = 20;
static const uint32_t PatchableMask = 1 << PatchableOffset;

static const uint32_t IFuncNeedsGOTOffset = 21;
static const uint32_t IFuncNeedsGOTMask = 1 << IFuncNeedsGOTOffset;

static const uint32_t InfoMask = 0xF;

-  // Bits are from 0-20.
-  static const uint32_t ResolveMask = 0x1FFFFF;
+  // Bits are from 0-21.
+  static const uint32_t ResolveMask = 0x3FFFFF;

public:
static const uint32_t GlobalFlag = 0 << GlobalOffset;
@@ -343,6 +360,8 @@ class ResolveInfo {
static const uint32_t InbitcodeFlag = 1 << InBitcodeOffset;
static const uint32_t PreserveFlag = 1 << PreserveOffset;
static const uint32_t PatchableFlag = 1 << PatchableOffset;
static const uint32_t IFuncDirectRefFlag = 1 << IFuncAbsRefOffset;
static const uint32_t IFuncNeedsGOTFlag = 1 << IFuncNeedsGOTOffset;
ResolveInfo();
ResolveInfo(llvm::StringRef SymbolName);
~ResolveInfo();
27 changes: 27 additions & 0 deletions lib/Target/AArch64/AArch64LDBackend.cpp
@@ -574,6 +574,33 @@ bool AArch64LDBackend::finalizeTargetSymbols() {
return true;
}

bool AArch64LDBackend::finalizeScanRelocations() {
  if (!config().isCodeStatic())
    return true;

  ELFObjectFile *Obj = getDynamicSectionHeadersInputFile();
  if (!Obj)
    return true;

  for (auto &[symInfo, plt] : m_PLTMap) {
    if (!symInfo->isIFunc() || !symInfo->hasIFuncDirectRef() ||
        !symInfo->hasIFuncNeedsGOT())
      continue;

    AArch64GOT *G = AArch64GOT::Create(Obj->getGOT(), symInfo);

    FragmentRef *PLTFragRef = make<FragmentRef>(*plt, 0);
    Relocation *r = Relocation::Create(llvm::ELF::R_AARCH64_ABS64, 64,
                                       make<FragmentRef>(*G, 0), 0);
    Obj->getGOT()->addRelocation(r);
    r->modifyRelocationFragmentRef(PLTFragRef);

    recordGOT(symInfo, G);
    symInfo->setReserved(symInfo->reserved() | Relocator::ReserveGOT);
  }
Comment from Contributor:

The GNU linker creates an igot and igot.plt; it would probably be useful to follow the same pattern for IFUNC symbols.

Comment from Contributor:

This can simplify the ifunc implementation too; we can probably reserve the GOT and PLT slots up front and later remove the GOT slots that are not needed.

Reply from Contributor Author:

> The GNU linker creates an igot and igot.plt; it would probably be useful to follow the same pattern for IFUNC symbols.

This is interesting. I did not observe this while experimenting with IFunc functionality in GNU ld. Can you please share which GNU ld version you are using?

$ aarch64-none-linux-gnu-ld.bfd --version
GNU ld (Arm GNU Toolchain 15.2.Rel1 (Build arm-15.86)) 2.45.1.20251203
Copyright (C) 2025 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
$ llvm-readelf -S 1.bfd.out | grep -i .got
  [20] .got              PROGBITS        000000000001ffb8 00ffb8 000030 08  WA  0   0  8
  [21] .got.plt          PROGBITS        000000000001ffe8 00ffe8 000048 08  WA  0   0  8

> This can simplify the ifunc implementation too; we can probably reserve the GOT and PLT slots up front and later remove the GOT slots that are not needed.

This would be the same even if we use the normal .got slot instead of the .igot slot, right? I do prefer to have mechanisms in place so that we only create the slots if they are required. Preemptively creating the slots may lead to unnecessary operations, and reverting/removing the slot may silently bring in bugs in the future if we add any new side-effect to the .got slot creation.

return true;
}

void AArch64LDBackend::setupStaticTCBForTLSSupport() {
if (!config().isCodeStatic())
return;
3 changes: 3 additions & 0 deletions lib/Target/AArch64/AArch64LDBackend.h
@@ -75,6 +75,9 @@ class AArch64LDBackend : public GNULDBackend {
/// finalizeTargetSymbols - finalize the symbol value
bool finalizeTargetSymbols() override;

/// Currently is only used to create GOT entries for ifunc with direct refs
bool finalizeScanRelocations() override;

void setOptions() override;

void initSegmentFromLinkerScript(ELFSegment *pSegment) override;