Merged
32 commits
9412961
reproduce some mismatch points
youge325 Mar 27, 2026
8b6aa36
reproduce all
youge325 Mar 27, 2026
36caa05
resolve some mismatches
youge325 Mar 28, 2026
05f096a
update doc
youge325 Mar 28, 2026
3c4e33f
update .gitignore
youge325 Mar 28, 2026
c8d938f
refactor directory structure and update imports
youge325 Mar 28, 2026
c4dc263
refactor directory structure for doc
youge325 Mar 28, 2026
8d74789
split mismatch_api_record.md by directory
youge325 Mar 28, 2026
2fda0f0
align some APIs
youge325 Mar 28, 2026
a92b5b2
already align, update doc
youge325 Mar 28, 2026
9869dbf
align cuda related APIs
youge325 Mar 28, 2026
43ba5fc
fix compiling error
youge325 Mar 28, 2026
60e1972
align allocator
youge325 Mar 29, 2026
f2e77bf
Align Device and update doc
youge325 Mar 29, 2026
3a8b760
add doc
youge325 Mar 29, 2026
77c3f42
Align TORCH_CHECK_OP and update Device macro doc
youge325 Mar 29, 2026
26b7567
update tests and docs for already aligned APIs
youge325 Mar 30, 2026
cd27e03
move mismatch tests to new test file
youge325 Mar 30, 2026
955104c
move mismatch tests to new test file and update doc
youge325 Mar 30, 2026
c4c891c
align arange without dtype provided
youge325 Mar 30, 2026
ec44f48
fix resize_
youge325 Mar 30, 2026
d51cac1
update CUDADataType test and doc
youge325 Mar 30, 2026
5a27989
fix resize_ again
youge325 Mar 30, 2026
4dd48f6
add mismatch Allocator tests and update doc
youge325 Mar 31, 2026
5d8b1f1
add mismatch Event tests and update doc
youge325 Apr 1, 2026
0b3f576
update Event doc
youge325 Apr 2, 2026
270d9fd
update arange doc
youge325 Apr 2, 2026
533af42
update cuda_stream doc
youge325 Apr 2, 2026
de73d56
update ScalarType mismatch API record
youge325 Apr 2, 2026
871436b
delete stack info in exception message, since the stack info is not c…
youge325 Apr 3, 2026
193de17
add ccache to speed up compilation
youge325 Apr 3, 2026
d76600b
update doc
youge325 Apr 3, 2026
72 changes: 57 additions & 15 deletions .github/skills/compat-doc-authoring/SKILL.md
Expand Up @@ -33,17 +33,39 @@ argument-hint: '要编写的头文件或模块,例如 typeid.h、Stream.h、Te

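## Standard template (recommended)

1. Title: `<HeaderName> 头文件 API 兼容性`
2. List of compared files (Paddle / Torch)
3. Status legend (`✅/🔧/❌`)
4. Grouped tables (by constructors, accessors, static functions, macros, etc.)
5. `### 兼容性统计` (compatibility statistics)
6. `### 备注` (notes, optional)
1. **File header**: date and review note (`> YYYY-MM-DD 编制/复核:...`)
2. **List of compared files** (as a bulleted list)
3. **Status legend paragraph** (defines what `✅/🔧/❌/🟦` mean)
4. **API comparison tables (grouped)** (by constructors, accessors, operations, etc.)
5. `## 兼容性统计` (compatibility statistics, simplified two-column table)
6. `## 关键差异说明` (key differences, numbered)
7. `## 备注` (notes: implementation details, build dependencies, etc.)

Default table columns:
### Standard table columns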

```markdown
| torch API | paddle API 兼容性 | 测试用例状态 | 优先级 | 备注 |
|---|---|---|---|---|
|-----------|------------------|------------|-------|------|
```

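**Conventions**:
- `paddle API 兼容性`: use the symbols ✅/🔧/❌/🟦
- `测试用例状态`: use checkboxes, `- [ ]` or `- [x]`
- `优先级`: use the labels P0/P1/P2/P3
- P0: core functionality, must be supported
- P1: commonly used functionality, high priority
- P2: advanced functionality, medium priority
- P3: edge-case functionality, low priority

### Compatibility statistics table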

```markdown
| 状态 | 数量 |
|---|---|
| ✅ 已实现 | N1 |
| 🔧 部分兼容 | N2 |
| ❌ 未实现 | N3 |
```

## Workflow

Expand Down Expand Up @@ -94,11 +116,13 @@ argument-hint: '要编写的头文件或模块,例如 typeid.h、Stream.h、Te

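Before publishing, check that:

1. The document contains `### 兼容性统计`
1. The document contains `## 兼容性统计` (a level-2 heading)
2. The statistics match the number of table rows
3. The paths of the compared files are correct and accessible
4. Every `🔧` entry explains where the difference lies
5. There are no obviously outdated statements (e.g. "not wired in" when the code already is)
4. Every 🔧 entry is explained in detail under "关键差异说明" (key differences)
5. Every test case status uses a checkbox, `- [ ]` / `- [x]`
6. Every priority uses a P0/P1/P2/P3 label
7. There are no obviously outdated statements (e.g. "not wired in" when the code already is)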
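
The two format rules in this checklist (checkbox-only test status, P0 to P3 priorities) are easy to check mechanically. The following is a minimal sketch, assuming each table cell has already been extracted as a string; the function names `is_valid_priority` and `is_valid_test_status` are hypothetical helpers for illustration and are not part of the repository.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the cell is exactly one of P0, P1, P2, P3,
// allowing surrounding whitespace as found in Markdown table cells.
bool is_valid_priority(const std::string& cell) {
  static const std::regex re(R"(\s*P[0-3]\s*)");
  return std::regex_match(cell, re);
}

// Hypothetical helper: true if the cell is exactly "- [ ]" or "- [x]",
// allowing surrounding whitespace.
bool is_valid_test_status(const std::string& cell) {
  static const std::regex re(R"(\s*- \[[ x]\]\s*)");
  return std::regex_match(cell, re);
}
```

A reviewer script could run both predicates over the priority and test status columns and report the first offending row.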

## Decisions and branches

Expand All @@ -123,10 +147,28 @@ argument-hint: '要编写的头文件或模块,例如 typeid.h、Stream.h、Te

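## Quality criteria

1. Reviewable: every difference has a pinpointed explanation
2. Verifiable: the statistics table can be recomputed from the row count
3. Maintainable: stable structure, consistent naming, clear indexing
4. Reusable: the same template can be applied directly to the next header file
1. **Reviewable**: every 🔧 entry is pinpointed in detail under "关键差异说明" (key differences)
2. **Verifiable**: the statistics match the number of table rows (checkable row by row)
3. **Maintainable**: consistent formatting, stable heading levels, consistent priorities
4. **Reusable**: the same template can be applied directly to the next header file (e.g. Stream.h, TensorBase.h)

## Common pitfalls and fixes

### Pitfall 1: mixed priority notations
**❌ Wrong**: some rows use `P0`, some use `高/中/低` (high/medium/low), some use `H/M/L`
**✅ Required**: use P0/P1/P2/P3 throughout

### Pitfall 2: non-standard test case status
**❌ Wrong**: using `✅/⚠️/❌` or `过` / `漏` (pass / missing)
**✅ Required**: use the checkboxes `- [ ]` and `- [x]` throughout

### Pitfall 3: statistics do not match the table row count
**❌ Wrong**: the table has 15 rows, but the statistics add up to ✅:9 + 🔧:4 = 13
**✅ Required**: count the rows one by one, then fill in the statistics table

### Pitfall 4: 🔧 entry without a note
**❌ Wrong**: writing only `🔧` without saying where the difference lies
**✅ Required**: every 🔧 has its own subsection under "关键差异说明" (key differences)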
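
The requirement that the statistics match the table rows can also be checked mechanically. Below is a rough sketch, assuming the compatibility symbol sits in the second column of each row and that no cell contains a literal `|`; `cell`, `count_statuses` and `stats_match` are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Returns the index-th cell of a Markdown table row (0-based), or "" if absent.
// Assumption: the row starts with '|' and cells contain no '|' characters.
std::string cell(const std::string& row, std::size_t index) {
  std::stringstream ss(row);
  std::string field;
  std::getline(ss, field, '|');  // discard whatever precedes the leading '|'
  std::size_t i = 0;
  while (std::getline(ss, field, '|')) {
    if (i++ == index) return field;
  }
  return "";
}

// The status symbols used by the template.
const std::vector<std::string> kSymbols = {"✅", "🔧", "❌", "🟦"};

// Counts rows per status symbol, looking only at the second column.
// Header and separator rows carry no symbol and are therefore ignored.
std::map<std::string, int> count_statuses(const std::vector<std::string>& rows) {
  std::map<std::string, int> counts;
  for (const auto& row : rows) {
    const std::string status = cell(row, 1);
    for (const auto& symbol : kSymbols) {
      if (status.find(symbol) != std::string::npos) {
        ++counts[symbol];
        break;
      }
    }
  }
  return counts;
}

// True if the declared statistics agree with the counted rows for every symbol.
// A symbol missing from either map is treated as zero.
bool stats_match(const std::vector<std::string>& rows,
                 const std::map<std::string, int>& declared) {
  const std::map<std::string, int> counted = count_statuses(rows);
  auto get = [](const std::map<std::string, int>& m, const std::string& key) {
    auto it = m.find(key);
    return it == m.end() ? 0 : it->second;
  };
  for (const auto& symbol : kSymbols) {
    if (get(counted, symbol) != get(declared, symbol)) return false;
  }
  return true;
}
```

Feeding the rows of every API table into `stats_match` together with the numbers from the statistics table would catch the "15 rows but 13 counted" situation before review.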

## Suggested trigger phrases

2 changes: 2 additions & 0 deletions .gitignore
Expand Up @@ -47,3 +47,5 @@ Makefile
Thumbs.db

.humanize
.codex
.codex_tmp
15 changes: 8 additions & 7 deletions CMakeLists.txt
Expand Up @@ -18,6 +18,7 @@ endif()
set(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake ${CMAKE_MODULE_PATH})
get_filename_component(THIRD_ROOT "${PROJECT_BINARY_DIR}/3rd_party" ABSOLUTE)

include(ccache)
include(external)

find_package(Threads REQUIRED)
Expand Down Expand Up @@ -65,13 +66,10 @@ enable_testing()
include_directories(${COMMON_INCLUDES})
include(cmake/build.cmake)

file(GLOB TEST_SRC_FILES ${PROJECT_SOURCE_DIR}/test/*.cpp
${PROJECT_SOURCE_DIR}/test/ops/*.cpp)
# Exclude files whose names start with unmatch
file(GLOB UNMATCH_FILES ${PROJECT_SOURCE_DIR}/test/unmatch*.cpp
${PROJECT_SOURCE_DIR}/test/ops/unmatch*.cpp)

list(REMOVE_ITEM TEST_SRC_FILES ${UNMATCH_FILES})
file(GLOB_RECURSE TEST_SRC_FILES CONFIGURE_DEPENDS
${PROJECT_SOURCE_DIR}/test/*.cpp)
# Exclude test files prefixed with unmatch_ at any directory depth
list(FILTER TEST_SRC_FILES EXCLUDE REGEX "/unmatch_[^/]+\\.cpp$")

file(GLOB_RECURSE TEST_BASE_FILES ${PROJECT_SOURCE_DIR}/src/*.cpp)
set(PADDLE_TARGET_FOLDER ${CMAKE_BINARY_DIR}/paddle)
Expand Down Expand Up @@ -101,6 +99,9 @@ set(TORCH_DIR
set(TORCH_LIBRARIES "")
file(GLOB_RECURSE TORCH_LIBRARIES "${TORCH_DIR}/lib/*.so"
"${TORCH_DIR}/lib/*.a")
if(CUDAToolkit_FOUND)
list(APPEND TORCH_LIBRARIES CUDA::cudart)
endif()

find_package(CUDAToolkit QUIET)
set(TORCH_INCLUDE_DIR "${TORCH_DIR}/include"
Expand Down
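
To see which files the new exclusion pattern `/unmatch_[^/]+\.cpp$` removes from `TEST_SRC_FILES`, the same pattern can be tried outside CMake. The sketch below uses `std::regex` as a stand-in for CMake's regex engine, which is an assumption that holds for this simple pattern but not in general; the function name and the paths in the usage note are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Keeps every path that does NOT match the exclusion pattern, mimicking
// list(FILTER TEST_SRC_FILES EXCLUDE REGEX "/unmatch_[^/]+\\.cpp$").
std::vector<std::string> filter_test_sources(const std::vector<std::string>& paths) {
  // The file name itself must start with "unmatch_" and end with ".cpp";
  // [^/]+ prevents the match from spanning a directory separator.
  static const std::regex unmatch_re(R"(/unmatch_[^/]+\.cpp$)");
  std::vector<std::string> kept;
  for (const auto& path : paths) {
    if (!std::regex_search(path, unmatch_re)) kept.push_back(path);
  }
  return kept;
}
```

With this pattern a file such as `test/ATen/core/unmatch_TensorTest.cpp` is dropped, while `test/unmatch_helpers/Foo.cpp` stays because only the directory, not the file name, carries the prefix.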
2 changes: 1 addition & 1 deletion README.md
Expand Up @@ -80,7 +80,7 @@ cd .. && ./test/result_cmp.sh build

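### Use cases

- Compatibility tests for newly added ATen operators (e.g. `test/ops/AbsTest.cpp`)
- Compatibility tests for newly added ATen operators (e.g. `test/ATen/ops/AbsTest.cpp`)
- Investigating behavioral differences between Paddle and PyTorch for a specific operator
- Extending the shape / dtype coverage of existing tests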

30 changes: 30 additions & 0 deletions cmake/ccache.cmake
@@ -0,0 +1,30 @@
# Use ccache if the ccache program is found

if(NOT WIN32)
find_program(CCACHE_PATH ccache)
if(CCACHE_PATH)
execute_process(COMMAND ccache -V OUTPUT_VARIABLE ccache_output)
execute_process(COMMAND ccache -v -s cache directory
OUTPUT_VARIABLE cache_directory)
string(REGEX MATCH "[0-9]+.[0-9]+" ccache_version ${ccache_output})
message(STATUS "ccache is founded, use ccache to speed up compile on Unix.")
# show statistics summary of ccache
message("ccache version\t\t\t " ${ccache_version} "\n"
${cache_directory})
set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_PATH})
set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PATH})
set(CMAKE_CUDA_COMPILER_LAUNCHER ${CCACHE_PATH})
endif()
elseif("${CMAKE_GENERATOR}" STREQUAL "Ninja")
# Only Ninja Generator can support sccache now
find_program(SCCACHE_PATH sccache)
if(SCCACHE_PATH)
execute_process(COMMAND sccache -V OUTPUT_VARIABLE sccache_version)
message(
STATUS
"sccache is founded, use [${SCCACHE_PATH}] to speed up compile on Windows."
)
set(CMAKE_C_COMPILER_LAUNCHER ${SCCACHE_PATH})
set(CMAKE_CXX_COMPILER_LAUNCHER ${SCCACHE_PATH})
endif()
endif()
2 changes: 1 addition & 1 deletion doc/generator.md → doc/ATen/core/generator.md
Expand Up @@ -132,7 +132,7 @@
- Paddle's `check_generator<T>` does not verify that `T::device_type()` matches `gen->device().type()`.

3. **Test basis**:
- See the cases already covered in `test/GeneratorTest.cpp`: `defined`, seed/offset, `device`, `clone`, `get_generator_or_default`, etc.
- See the cases already covered in `test/ATen/cuda/GeneratorTest.cpp`: `defined`, seed/offset, `device`, `clone`, `get_generator_or_default`, etc.
- Most items marked 🔧 or `- [ ]` are interfaces that exist in the header but lack a direct unit test.

4. **Change log**:
123 changes: 123 additions & 0 deletions doc/ATen/core/mismatch_api_record.md
@@ -0,0 +1,123 @@
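# IValue (aligned)

> Paddle header: `ATen/core/ivalue.h`
> Status: aligned (2026-03-28)

The compat `IValue` now provides the following PyTorch-style interfaces:

1. A `c10::IValue` entry point, so it can be referenced the same way as in PyTorch.
2. camelCase methods: `isNone()`, `isBool()`, `isInt()`, `isDouble()`, `isString()`, `isList()`, `isTensor()`, `isCustomClass()`, `isTuple()`.
3. Extraction methods: `toBool()`, `toInt()`, `toDouble()`, `toStringRef()`, `toStringView()`, `toTensor()`, `toScalarType()`.
4. Debugging interface: `tagKind()`.

Verified in:

- `test/ATen/core/IValueTest.cpp`
- `test/torch/LibraryTest.cpp`

Notes:

- The `torch::IValue` compatibility entry point is retained so that existing callers can migrate smoothly.
- The differences previously recorded in this section (namespace, camelCase naming, `tagKind()`, `toStringRef()`) are no longer blocking.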


---

# Tensor::resize_ (aligned)

> Related Paddle headers: `ATen/core/TensorBody.h`, `ATen/ops/resize.h`
> Status: basic `resize_` semantics aligned (2026-03-30)

## Diff test case location

Test file: `test/ATen/core/TensorTest.cpp`

### Test case source

```cpp
// Test resize_: shrinking the element count should succeed and keep the leading data
TEST_F(TensorTest, Resize) {
  auto file_name = g_custom_param.get();
  FileManerger file(file_name);
  file.openAppend();
  file << "Resize ";
  tensor.resize_({4, 5});
  file << std::to_string(tensor.sizes()[0]) << " ";
  file << std::to_string(tensor.sizes()[1]) << " ";
  file << std::to_string(tensor.numel()) << " ";
  file << std::to_string(tensor.data_ptr<float>()[0]) << " ";
  file << std::to_string(tensor.data_ptr<float>()[19]) << " ";
  file << "\n";
  file.saveFile();
}
```

---

## Output comparison

| Test case | Paddle output | Torch output |
|---------|------------|------------|
| Resize | `4 5 20 1.000000 1.000000` | `4 5 20 1.000000 1.000000` |

---

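## Current behavior

The compat `resize_` now uses a hybrid implementation: when the total number of elements is unchanged it goes through `reshape`, and when the total changes it goes through Paddle's native `set_` path. This covers the `2x3x4 -> 4x5` shrinking case in the current diff test and keeps repeated `resize_()` calls stable.

Current scope:

1. `resize_()` calls that change the total element count are supported and no longer degrade to `reshape` only.
2. The existing comparison case verifies the shape, `numel()`, and retention of the leading data after shrinking.
3. `memory_format` currently covers only the `nullopt` / `Contiguous` paths, which matches how it is used in this repository.

Notes:

- The earlier conclusion in this section ("unsupported, throws an exception") no longer holds; when debugging, rely on the current implementation and test results.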
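
The two-branch behavior described in this section can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTensor` and `make_filled` are hypothetical names, the storage is a plain `std::vector<float>`, and nothing here reflects how the compat layer actually calls `reshape` or `set_`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a contiguous float tensor; NOT the compat at::Tensor.
struct ToyTensor {
  std::vector<std::int64_t> shape;
  std::vector<float> data;

  // Product of all dimensions.
  static std::size_t numel_of(const std::vector<std::int64_t>& s) {
    std::size_t n = 1;
    for (std::int64_t d : s) n *= static_cast<std::size_t>(d);
    return n;
  }

  std::size_t numel() const { return numel_of(shape); }

  // Mirrors the two branches described in this section.
  void resize_(const std::vector<std::int64_t>& new_shape) {
    const std::size_t new_numel = numel_of(new_shape);
    if (new_numel == data.size()) {
      // Same element count: only the shape metadata changes ("reshape" branch).
      shape = new_shape;
      return;
    }
    // Different element count: adjust the storage and keep the leading elements.
    // std::vector zero-fills a grown tail; that is a property of this toy model only.
    data.resize(new_numel);
    shape = new_shape;
  }
};

// Hypothetical helper: builds a tensor of the given shape filled with `value`.
ToyTensor make_filled(const std::vector<std::int64_t>& shape, float value) {
  ToyTensor t;
  t.shape = shape;
  t.data.assign(ToyTensor::numel_of(shape), value);
  return t;
}
```

Starting from a `2x3x4` tensor filled with `1.0f` (an assumed fixture) and calling `resize_({4, 5})` leaves 20 elements whose first and last values are still `1.0f`, which is the shape of result the comparison table above reports.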

---

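# Tensor::pin_memory / is_pinned (aligned)

> Related Paddle header: `ATen/core/TensorBody.h`
> Status: aligned with PyTorch semantics (2026-03-28)

Current behavior:

1. `pin_memory()` accepts only CPU tensors; a non-CPU tensor raises an error immediately.
2. `is_pinned()` returns true only for pinned host tensors.
3. The `device` parameter is kept as a compatibility entry point but is treated as deprecated, following PyTorch semantics.

Verified in:

- `test/ATen/core/TensorTest.cpp`

Notes:

- This section originally recorded a historical difference. The current implementation no longer matches the old conclusion, so rely on the current implementation and test results when debugging.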

---



# Tensor pointer API (`const_data_ptr<T>` / `mutable_data_ptr<T>`, aligned)

> Paddle header: `ATen/core/TensorBody.h`
> Status: aligned (2026-03-28)

Current status:

1. The template versions `const_data_ptr<T>()` / `mutable_data_ptr<T>()` now link and can be called normally.
2. `test/ATen/ops/TensorPtrTest.cpp` once again verifies the `float*` / `const float*` paths directly.

Verified in:

- `test/ATen/ops/TensorPtrTest.cpp`

Notes:

- This section is kept as a historical record; the old `undefined reference` conclusion no longer applies.

---
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion doc/tensor_body.md → doc/ATen/core/tensor_body.md
Expand Up @@ -300,7 +300,7 @@
| `reshape` | ✅ | ✅ | P2 | |
| `reshape_as` | ⏳ | ⏳ | P2 | |
| `reshape_symint` | ⏳ | ⏳ | P2 | |
| `resize_` | 🚧 | 🚧 | P2 | |
| `resize_` | | | P2 | `memory_format` currently covers `nullopt/Contiguous` |
| `resize__symint` | ⏳ | ⏳ | P2 | |
| `resize_as_` | ⏳ | ⏳ | P2 | |
| `resize_as_sparse_` | ⏳ | ⏳ | P2 | |
File renamed without changes.
File renamed without changes.