ssa: emit go semantic metadata by luoliwoshang · Pull Request #1728 · goplus/llgo

luoliwoshang · 2026-03-18T06:54:31Z

Background

Ordinary call/ref reachability only sees direct symbol references, but in Go the question of which methods must be kept also depends on additional semantics such as:

interface conversions
interface method calls
MethodByName
conservative reflection handling

An ordinary reference graph alone cannot answer:

which concrete types enter the interface dispatch domain
which interface method slots are actually demanded
which method names are requested by name
which sites must fall back to conservative reflection handling
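
A small Go program makes the gap concrete. In the sketch below, `Shape`, `Circle`, and both helper functions are illustrative names that do not come from the PR. Neither helper names `Circle.Area` or `Circle.Name` as a direct symbol reference: one goes through an interface conversion and an interface method call, the other through `MethodByName`.

```go
package main

import (
	"fmt"
	"reflect"
)

// Shape is a hypothetical interface used only for this illustration.
type Shape interface {
	Area() float64
}

// Circle is a hypothetical concrete type with three methods.
type Circle struct{ R float64 }

func (c Circle) Area() float64  { return 3 * c.R * c.R } // rough area, kept simple
func (c Circle) Name() string   { return "circle" }
func (c Circle) Unused() string { return "never requested" }

// areaViaInterface converts Circle to Shape (an interface conversion) and
// then calls Area through the interface (an interface method call).
func areaViaInterface(c Circle) float64 {
	var s Shape = c
	return s.Area()
}

// nameViaReflection requests a method by its name at run time.
func nameViaReflection(c Circle) string {
	m := reflect.ValueOf(c).MethodByName("Name")
	return m.Call(nil)[0].String()
}

func main() {
	c := Circle{R: 2}
	fmt.Println(areaViaInterface(c))
	fmt.Println(nameViaReflection(c))
}
```

A purely symbol-based graph has no edge from either helper to the methods it ends up running, and it also cannot tell that `Unused` is never requested. Those are the facts the metadata in this PR is meant to record.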

This PR is not the end of the full pipeline. It is one implementation stage of the proposal: it emits these Go semantic facts as llgo.xxx metadata during ssa/cl, and adds stable readback for them.

This PR corresponds to the producer / readback portion of:

llgo#1727: Proposal: Semantic Pruning of Unreachable Methods from a Global Graph

This PR only covers the producer / readback layer of the proposal. It does not include:

whole-program analysis after metadata aggregation
global method liveness driven by OrdinaryEdges / TypeChildren

What This PR Does

1. Emit Go semantic metadata during `ssa/cl`

This PR adds and wires up the following named metadata:

!llgo.useiface
!llgo.useifacemethod
!llgo.interfaceinfo
!llgo.methodinfo
!llgo.usenamedmethod
!llgo.reflectmethod

They represent:

useiface
- which concrete types enter the interface semantic domain once an owner becomes reachable
useifacemethod
- which interface method demands are produced once an owner becomes reachable
interfaceinfo
- the complete method set of an interface
methodinfo
- the method-table slots of a concrete type, plus MType / IFn / TFn
usenamedmethod
- which method names are requested exactly through MethodByName
reflectmethod
- which owners must fall back to conservative reflection handling

2. Define the metadata encoding baseline

The current encoding is row-oriented to keep emission and later aggregation simple:

llgo.interfaceinfo
- one row per interface method
llgo.methodinfo
- one row per method slot

In particular:

methodinfo is emitted only when a type actually has method slots
unexported method names are normalized with package qualification
MethodInfo keeps Index / MethodSig / IFn / TFn, matching the needs of later analysis
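
The package qualification of unexported names matters because Go treats unexported methods with the same spelling in different packages as distinct methods. The following is a minimal sketch of such a normalization; `qualifiedMethodName` and the `path.name` layout are assumptions made for illustration, not the encoding used by the PR.

```go
package main

import (
	"fmt"
	"go/token"
)

// qualifiedMethodName is a hypothetical helper. Exported names are kept
// as-is; unexported names are prefixed with their package path so that
// identically spelled unexported methods from different packages stay
// distinct. The "path.name" layout is an assumption.
func qualifiedMethodName(pkgPath, name string) string {
	if token.IsExported(name) {
		return name
	}
	return pkgPath + "." + name
}

func main() {
	fmt.Println(qualifiedMethodName("foo/bar", "Read"))
	fmt.Println(qualifiedMethodName("foo/bar", "read"))
	fmt.Println(qualifiedMethodName("foo/baz", "read"))
}
```

With this shape, an interface demand for an unexported `read` in `foo/bar` can never be matched against a method slot named `read` that belongs to `foo/baz`.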

3. Distinguish the emission ownership of `MethodInfo` and `InterfaceInfo`

Both are inputs to later analysis, but they intentionally use different emission strategies in the current implementation.

`MethodInfo`

MethodInfo is tied to the method-table layout of a concrete type. Its ownership is stronger, and duplicate emission is much noisier.

So this PR tightens its emission:

for ordinary imported, non-generic named types
- llgo.methodinfo is no longer emitted redundantly at use sites
- their method-slot metadata is emitted only by the defining package

Generic instances remain conservatively allowed at use sites for now:

current LLGo still materializes generic instance methods primarily at use sites
a concrete imported generic named instance may only be instantiated and compiled in the current package
if imported generic instances were forced into definition-site-only emission now, metadata could be missing even though the instance methods were materialized locally

So the current policy is:

ordinary imported, non-generic named types
- deduplicated; emitted only by the defining package
imported generic instances
- still conservatively allowed at use sites, prioritizing completeness
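
The policy as stated in this description can be written as a small predicate. This is only a sketch: `typeOrigin` and `emitsMethodInfo` are hypothetical names, and the real decision is made inside the compiler on `go/types` objects. Note that the review discussion further down this page questions the defining-package restriction for imported types.

```go
package main

import "fmt"

// typeOrigin is a hypothetical description of a named type as seen while
// compiling one package.
type typeOrigin struct {
	DefPkg          string // package that defines the type
	GenericInstance bool   // true for an instantiated generic type
}

// emitsMethodInfo mirrors the policy described in the text: generic
// instances may emit at use sites, everything else only in the defining
// package.
func emitsMethodInfo(t typeOrigin, currentPkg string) bool {
	if t.GenericInstance {
		return true // conservatively allowed at use sites
	}
	return t.DefPkg == currentPkg
}

func main() {
	local := typeOrigin{DefPkg: "app"}
	imported := typeOrigin{DefPkg: "lib"}
	instance := typeOrigin{DefPkg: "lib", GenericInstance: true}

	fmt.Println(emitsMethodInfo(local, "app"))
	fmt.Println(emitsMethodInfo(imported, "app"))
	fmt.Println(emitsMethodInfo(instance, "app"))
}
```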

`InterfaceInfo`

InterfaceInfo records the complete method set of an interface. In principle it also looks like definition-site information, but under current LLGo interface lowering, named interfaces are often erased early into their underlying *types.Interface. That makes it unreliable to recover a stable named owner from the definition-side type materialization path.

So the current implementation chooses:

emit InterfaceInfo at the use site
specifically, along the path that already produces the interface method demand

This is intentional because:

that path always has the relevant interface shape in hand
duplicate InterfaceInfo contributions from multiple modules remain semantically safe after whole-program merge
this is more reliable than forcing a definition-site-only rule under the current lowering model
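
Duplicates are harmless because merging method sets is a set union, which is idempotent. A minimal sketch of that argument, assuming a hypothetical `interfaceInfo` map from interface symbol to method names (the interface symbol reuses the `_llgo_foo/bar.IFmt` example that appears in a code comment quoted in the review below):

```go
package main

import (
	"fmt"
	"sort"
)

// interfaceInfo maps an interface symbol to its method set. The shape is
// hypothetical and chosen only for this illustration.
type interfaceInfo map[string]map[string]struct{}

// merge folds one module's contribution into dst. Adding the same
// (interface, method) pair twice changes nothing.
func (dst interfaceInfo) merge(src interfaceInfo) {
	for iface, methods := range src {
		if dst[iface] == nil {
			dst[iface] = make(map[string]struct{})
		}
		for m := range methods {
			dst[iface][m] = struct{}{}
		}
	}
}

// methodsOf returns the sorted method names recorded for iface.
func (info interfaceInfo) methodsOf(iface string) []string {
	names := make([]string, 0, len(info[iface]))
	for m := range info[iface] {
		names = append(names, m)
	}
	sort.Strings(names)
	return names
}

func main() {
	modA := interfaceInfo{"_llgo_foo/bar.IFmt": {"Format": {}}}
	modB := interfaceInfo{"_llgo_foo/bar.IFmt": {"Format": {}}}

	whole := interfaceInfo{}
	whole.merge(modA)
	whole.merge(modB) // duplicate contribution from a second module
	fmt.Println(whole.methodsOf("_llgo_foo/bar.IFmt"))
}
```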

In other words, this PR intentionally treats the two differently:

MethodInfo
- heavier and more strongly bound to concrete type ownership, so duplicate emission is reduced aggressively
InterfaceInfo
- definition-site ownership is unstable under current lowering, so use-site emission is preferred to keep the input complete

4. Add `internal/semmeta`

This PR adds internal/semmeta, which is responsible for:

reading llgo.xxx metadata back from a single llvm.Module
folding it into semantic ModuleInfo
owning the metadata protocol on both the write side and the read side

This layer only handles semantic metadata. It does not handle:

ordinary LLVM reference graphs
TypeChildren
whole-program aggregation
DCE analyze / rewrite
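
The folding step can be pictured as grouping row-oriented records by their owner. The sketch below does this for method-slot rows; `methodRow` and `foldMethodInfo` are hypothetical names, and the field set simply follows the Index / MethodSig / IFn / TFn list given earlier rather than the actual `semmeta` types.

```go
package main

import (
	"fmt"
	"sort"
)

// methodRow is a hypothetical decoded llgo.methodinfo row: one method slot
// of one concrete type.
type methodRow struct {
	Type  string // concrete type symbol
	Index int    // slot index in the type's method table
	Name  string // normalized method name
	Sig   string // method signature symbol
	IFn   string
	TFn   string
}

// foldMethodInfo groups rows by type and orders each group by slot index,
// so the result does not depend on the order in which rows were read.
func foldMethodInfo(rows []methodRow) map[string][]methodRow {
	out := make(map[string][]methodRow)
	for _, r := range rows {
		out[r.Type] = append(out[r.Type], r)
	}
	for _, slots := range out {
		sort.Slice(slots, func(i, j int) bool { return slots[i].Index < slots[j].Index })
	}
	return out
}

func main() {
	rows := []methodRow{
		{Type: "T", Index: 1, Name: "Write"},
		{Type: "U", Index: 0, Name: "Close"},
		{Type: "T", Index: 0, Name: "Read"},
	}
	for typ, slots := range foldMethodInfo(rows) {
		fmt.Println(typ, len(slots), slots[0].Name)
	}
}
```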

5. Use the new metadata read API from `goplus/llvm`

This PR also switches to the metadata read API that has already landed in goplus/llvm, removing the temporary cgo-based reader previously used by tests.

The current code directly uses:

Module.NamedMetadataOperands
Value.MDNodeOperands
Value.MDString

to read named metadata back from llvm.Module.

6. Add `llgen.GenModuleFrom`

This PR adds GenModuleFrom to internal/llgen, so callers can directly obtain an llvm.Module.

That makes it possible to use the same llvm.Module for both:

obtaining IR text
reading semantic metadata back

without a stringify-then-reparse detour.

7. Build a stable semantic view for metadata

The readback result is not a raw row dump. It is a stable semantic view grouped by meaning, for example:

UseIface
UseIfaceMethod
InterfaceInfo
MethodInfo
UseNamedMethod
ReflectMethod

This captures semantic content rather than LLVM metadata node numbering or row layout details.
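
One way to obtain such a view is to sort everything before rendering it, so that two modules with the same facts produce the same text regardless of emission order. The sketch below assumes a hypothetical `render` function and a made-up text layout; it is not the format the PR writes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// render prints one section (for example UseIface) as sorted
// "owner: value, value" lines. The layout is an assumption.
func render(section string, facts map[string][]string) string {
	owners := make([]string, 0, len(facts))
	for o := range facts {
		owners = append(owners, o)
	}
	sort.Strings(owners)

	var b strings.Builder
	fmt.Fprintf(&b, "[%s]\n", section)
	for _, o := range owners {
		vals := append([]string(nil), facts[o]...) // copy before sorting
		sort.Strings(vals)
		fmt.Fprintf(&b, "%s: %s\n", o, strings.Join(vals, ", "))
	}
	return b.String()
}

func main() {
	a := map[string][]string{"main.run": {"T2", "T1"}, "main.init": {"T3"}}
	b := map[string][]string{"main.init": {"T3"}, "main.run": {"T1", "T2"}}
	fmt.Print(render("UseIface", a))
	fmt.Println(render("UseIface", a) == render("UseIface", b))
}
```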

Current Scope

This PR only covers the producer / readback layer of the proposal. It does not include:

whole-program DCE analysis after metadata aggregation
global method liveness driven by OrdinaryEdges / TypeChildren

In other words, this PR establishes the input contract and observability foundation required by the later algorithm.

Representative Coverage

This PR covers several important semantic categories:

ifaceconv
- interface conversions driving useiface
reader / invoke / interface
- interface method demands
embedunexport / geometry1370
- interface methods with unexported names
reflectmk
- Method, MethodByName, and conservative reflection
abimethod
- named / anonymous / promoted methods and embedded method sets
cursor / reflectconv / abinamed
- large method-slot sets and more complex type coverage

Follow-up Work

Later PRs will continue from here with:

whole-program aggregation of OrdinaryEdges, TypeChildren, and semantic metadata
the analysis that computes type symbol -> live method indexes
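
To show the shape of that final result only, here is a deliberately simplified sketch. It is not the algorithm from the proposal: it ignores owner reachability, OrdinaryEdges, TypeChildren, and conservative reflection, and every name in it (`slot`, `facts`, `liveMethodIndexes`) is hypothetical.

```go
package main

import "fmt"

// slot is one method-table entry of a concrete type.
type slot struct {
	Index int
	Name  string
	Sig   string
}

// facts is a hypothetical, already-aggregated view of the metadata.
type facts struct {
	UsedInIface  map[string]bool   // concrete types that entered the interface domain
	Demanded     map[string]bool   // demanded interface methods, keyed by Name + " " + Sig
	NamedMethods map[string]bool   // method names requested through MethodByName
	MethodInfo   map[string][]slot // method slots per concrete type
}

// liveMethodIndexes keeps a slot when its type is used through an interface
// and the slot is either demanded by an interface method or requested by name.
func liveMethodIndexes(f facts) map[string][]int {
	live := make(map[string][]int)
	for typ, slots := range f.MethodInfo {
		if !f.UsedInIface[typ] {
			continue
		}
		for _, s := range slots {
			if f.Demanded[s.Name+" "+s.Sig] || f.NamedMethods[s.Name] {
				live[typ] = append(live[typ], s.Index)
			}
		}
	}
	return live
}

func main() {
	f := facts{
		UsedInIface:  map[string]bool{"Circle": true},
		Demanded:     map[string]bool{"Area func() float64": true},
		NamedMethods: map[string]bool{"Name": true},
		MethodInfo: map[string][]slot{
			"Circle": {
				{Index: 0, Name: "Area", Sig: "func() float64"},
				{Index: 1, Name: "Name", Sig: "func() string"},
				{Index: 2, Name: "Unused", Sig: "func() string"},
			},
		},
	}
	fmt.Println(liveMethodIndexes(f)["Circle"])
}
```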

gemini-code-assist · 2026-03-18T06:55:16Z

Summary of Changes

This pull request enhances the LLGo compiler's SSA generation by adding a mechanism to emit semantic metadata. Specifically, it now tracks and annotates instances where concrete types are converted to interface types using !llgo.useiface metadata. This change provides more detailed information in the generated LLVM IR, which can be beneficial for downstream analysis or optimization passes. The update also includes the necessary regeneration of test output files to align with this new metadata emission.

Highlights

Semantic Metadata Emission: Introduced a new package-level semantic metadata emitter infrastructure within the SSA (Static Single Assignment) form generation.
Interface Conversion Metadata: Implemented the emission of !llgo.useiface metadata for non-interface to interface conversions originating from the MakeInterface operation.
Test File Regeneration: Regenerated numerous existing IR golden files to reflect the newly introduced metadata, ensuring consistency and correctness.

xgopilot · 2026-03-18T06:58:15Z

Overall: Clean, well-scoped PR. The semantic metadata emitter is a solid foundation for recording interface conversions. The golden file updates are consistent. A few items below worth addressing — mainly a redundant abiTypeGlobal call and a missing license header.

gemini-code-assist

Code Review

This pull request introduces an infrastructure for emitting package-level semantic metadata in the SSA backend and uses it to emit !llgo.useiface metadata for non-interface to interface conversions. The implementation includes a new semanticMetadataEmitter to manage and prevent duplicate metadata entries in the LLVM module. The changes are well-structured, with a logical refactoring in abitype.go to support the new functionality in interface.go. The code is clean, correct, and the regenerated golden files confirm the intended behavior. Overall, this is a solid improvement to the compiler's metadata emission capabilities.

codecov · 2026-03-18T09:27:29Z

Codecov Report

❌ Patch coverage is 96.65272% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.56%. Comparing base (2c3c5c1) to head (957d908).
⚠️ Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
internal/semmeta/semmeta.go	95.74%	3 Missing and 3 partials ⚠️
ssa/metadata.go	66.66%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1728      +/-   ##
==========================================
+ Coverage   88.44%   88.56%   +0.12%     
==========================================
  Files          50       52       +2     
  Lines       13656    13891     +235     
==========================================
+ Hits        12078    12303     +225     
- Misses       1369     1374       +5     
- Partials      209      214       +5

luoliwoshang · 2026-03-19T09:41:47Z

Will use goplus/llvm instead.

Update: goplus/llvm is already in use now (ssa: use llvm metadata read APIs).

zhouguangyuan0718 · 2026-04-16T11:27:13Z

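I would not recommend adding meta-expect.txt-style test cases here. They have the same problem as the earlier out.ll files: they are not targeted enough, and they are affected by changes in unrelated parts. I suggest adding some targeted test cases instead and reusing the same LITTEST mechanism for the checks, rather than broadly testing the output of large amounts of code.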

luoliwoshang · 2026-04-16T11:56:01Z

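> I would not recommend adding meta-expect.txt-style test cases here. They have the same problem as the earlier out.ll files: they are not targeted enough, and they are affected by changes in unrelated parts.

Understood. Emitting a meta-expect.txt for every cl/testxxx would create too much noise, so I plan to change this to targeted cases that verify the metadata output. That said, if the emitter output logic and the abi.Type names are stable, this regression output should not change either, and if the abi.Type names do change, updating these files is actually expected.

> I suggest adding some targeted test cases instead and reusing the same LITTEST mechanism for the checks, rather than broadly testing the output of large amounts of code.

By reusing the LITTEST mechanism, do you mean adding a corresponding in.go under cl/testxxx and then annotating it to match the metadata nodes emitted in the LLVM module?
What this regression is meant to pin down is the stability of the metadata output, not the metadata node layout in the llvm.Module (if caching is involved later, the metadata may end up in other files). So I would still lean toward comparing the serialized semantic module info, which is also more readable.

So I would lean toward adding targeted cases such as ifaceuse, interface, and reflect under cl/_testmeta, where each package contains an in.go plus a meta-expect.txt in its current shape.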

zhouguangyuan0718 · 2026-04-16T12:17:54Z

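> Understood. Emitting a meta-expect.txt for every cl/testxxx would create too much noise, so I plan to change this to targeted cases that verify the metadata output. That said, if the emitter output logic and the abi.Type names are stable, this regression output should not change either, and if the abi.Type names do change, updating these files is actually expected.

The main concern is the extra maintenance. For example, if a type name, the ordering, or the fields in an unrelated test case change for some other reason, the metadata output has to be updated as well. That is similar to the current situation, where an occasional change requires refreshing a large number of out.ll files.

> So I would lean toward adding targeted cases such as ifaceuse, interface, and reflect under cl/_testmeta, where each package contains an in.go plus a meta-expect.txt in its current shape.

Yes, I agree with this change. My main point is that isolating what each test case focuses on is better than checking output across the board.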

zhouguangyuan0718 · 2026-04-16T15:01:45Z

+		if obj == nil || obj.Pkg() == nil {
+			return true
+		}
+		return abi.PathOf(obj.Pkg()) == p.Path()


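One question here: it looks like this only evaluates to true when the type is used in its defining package. For an imported type that is not used in the defining package and is only used in other packages, would the MethodInfo for that type be missing?

That is indeed a mis-optimization! MethodInfo is currently emitted on demand along with abiUncommonMethods/abiType, rather than being produced uniformly by the defining package, so it should not be restricted to emission in the owner package only. I will remove this condition so that it is consistent with the current InterfaceInfo behavior.

Fixed in d553c34.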

zhouguangyuan0718 · 2026-04-16T15:04:58Z

+	// only on definition/type-processing paths would miss rows such as
+	// "_llgo_foo/bar.IFmt" even though later whole-program analysis still
+	// needs that complete interface method set.
+	if _, ok := types.Unalias(intf.raw.Type).(*types.Named); ok {


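Do anonymous interfaces also need InterfaceInfo?

Fixed. Anonymous interfaces also need InterfaceInfo; otherwise the later whole-program computation would misjudge whether a type implements an interface when that interface is anonymous. A regression case has been added at cl/_testmeta/interface_anonymous/in.go.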

xgopilot bot reviewed Mar 18, 2026

Comment thread ssa/interface.go Outdated

xgopilot bot reviewed Mar 18, 2026

Comment thread ssa/metadata.go

xgopilot bot reviewed Mar 18, 2026

Comment thread ssa/metadata.go Outdated

xgopilot bot reviewed Mar 18, 2026

Comment thread ssa/metadata.go Outdated

luoliwoshang force-pushed the codex/useiface-metadata-producer branch from 1958ac9 to 8e77b05 Compare March 18, 2026 06:58

gemini-code-assist bot reviewed Mar 18, 2026

luoliwoshang changed the title ~~Add llgo.useiface metadata emission~~ ssa: emit llgo.useiface metadata Mar 18, 2026

luoliwoshang force-pushed the codex/useiface-metadata-producer branch from 57e867d to b0f7c04 Compare March 18, 2026 08:53

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 4 times, most recently from 646581d to 1b2e89b Compare March 18, 2026 10:13

luoliwoshang changed the title ~~ssa: emit llgo.useiface metadata~~ ssa: emit deadcode metadata Mar 18, 2026

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 4 times, most recently from 728feba to a4b76a5 Compare March 19, 2026 02:47

luoliwoshang commented Mar 19, 2026

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 3 times, most recently from 3e35f66 to 7222776 Compare March 19, 2026 11:09

luoliwoshang changed the title ~~ssa: emit deadcode metadata~~ ssa: emit go semantic metadata Mar 19, 2026

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 4 times, most recently from b2b62bb to 6cad6b2 Compare March 20, 2026 07:14

luoliwoshang mentioned this pull request Mar 20, 2026

build: Go Linktime-like Unreachable Method Pruning #1736

Open

luoliwoshang added 4 commits April 15, 2026 14:53

ssa: document metadata method-name normalization

78b7de9

ssa: keep method metadata emission close to main

c4e7376

ssa: compute metadata method names once

f6396dd

ssa: simplify metadata method-name helper

7680b8a

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 3 times, most recently from f41a4c8 to e8fbec6 Compare April 15, 2026 09:42

ssa: emit interfaceinfo at use sites

ee34c8d

luoliwoshang force-pushed the codex/useiface-metadata-producer branch from e8fbec6 to ee34c8d Compare April 15, 2026 10:17

luoliwoshang added 3 commits April 15, 2026 20:42

semmeta: rename iface method use type

eb3fca1

semmeta: document metadata semantics

4284460

ssa: drop unused interface metadata owner

8af1691

luoliwoshang changed the title ~~[wip] ssa: emit go semantic metadata~~ ssa: emit go semantic metadata Apr 15, 2026

Merge remote-tracking branch 'upstream/main' into codex/useiface-meta…

da16e3b

…data-producer-rework

luoliwoshang force-pushed the codex/useiface-metadata-producer branch 5 times, most recently from 4c2a85e to 038df93 Compare April 16, 2026 14:26

cltest: add focused semantic metadata cases

3b43c7f

luoliwoshang force-pushed the codex/useiface-metadata-producer branch from 038df93 to 3b43c7f Compare April 16, 2026 14:35

zhouguangyuan0718 reviewed Apr 16, 2026

ssa: emit methodinfo at use sites

d553c34

luoliwoshang force-pushed the codex/useiface-metadata-producer branch from e41d813 to d553c34 Compare April 16, 2026 17:07

luoliwoshang added 2 commits April 17, 2026 10:49

ssa: emit interfaceinfo for anonymous interfaces

27b7b57

ssa: extract interface metadata emit helpers

957d908
