23 changes: 15 additions & 8 deletions docs/PTO_IR_manual.md
@@ -8346,6 +8346,10 @@ frontend/framework generated IR. The detailed design document is:
 function.
 - `slot_size` is expressed in bytes and uses the pre-split logical pipe-entry
 size.
+- `slot_num` is an optional compile-time integer attribute on
+`pto.aic_initialize_pipe` / `pto.aiv_initialize_pipe`. It controls the GM
+ring FIFO depth and defaults to `8` for `dir_mask = 1/2` or `4` for
+`dir_mask = 3`.
 - `local_slot_num` is an optional compile-time integer attribute on
 `pto.aic_initialize_pipe` / `pto.aiv_initialize_pipe`.
 On A2/A3 it overrides the default consumer-side local FIFO slot count only
@@ -8369,9 +8373,10 @@ frontend/framework generated IR. The detailed design document is:
 (`pto.initialize_l2g2l_pipe`). It does not implicitly execute `pto.tstore` or
 `pto.tload`; callers move data explicitly before `tpush` or after `tpop`.
 - When every transfer op bound to one pipe id uses a global entry, the pipe is
-a global-only GM FIFO. Its frontend initialize op carries only
-`gm_slot_tensor`; `gm_slot_buffer`, `c2v_consumer_buf`, `v2c_consumer_buf`, `local_slot_num`,
-`pto.reserve_buffer`, and `pto.import_reserved_buffer` are not used.
+a global-only GM FIFO. Its frontend initialize op carries `gm_slot_tensor`
+and may carry `slot_num`; `gm_slot_buffer`, `c2v_consumer_buf`,
+`v2c_consumer_buf`, `local_slot_num`, `pto.reserve_buffer`, and
+`pto.import_reserved_buffer` are not used.
 - For global entries, the matched initialize op's `gm_slot_tensor` describes
 one FIFO slot entry, not the full multi-slot FIFO buffer. Its dtype, shape,
 stride, and layout must match the `tensor_view` returned by `talloc` /
@@ -8505,7 +8510,7 @@ this op.

```mlir
// A2/A3 (with GM slot buffer):
-pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, local_slot_num = 1}
+pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, slot_num = 2, local_slot_num = 1}
(gm_slot_buffer = %gm_buf : !pto.ptr<f32>,
c2v_consumer_buf = %c2v_import : i32,
v2c_consumer_buf = %c0_i32 : i32)
@@ -8529,6 +8534,8 @@ pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, nosplit = true}
 the same function
 - `dir_mask`: communication direction encoding
 - `slot_size`: logical slot size in bytes
+- `slot_num`: optional GM ring FIFO slot count; omitted defaults to `8` for
+`dir_mask = 1/2` or `4` for `dir_mask = 3`
 - `local_slot_num`: optional A2/A3-only local FIFO slot count override for the
 lowered `pto.initialize_l2g2l_pipe`; omitted for global-only GM FIFO
 - `nosplit`: optional compile-time boolean controlling no-split pipe mode
@@ -8551,12 +8558,12 @@ pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, nosplit = true}
 - Must appear in Cube kernels
 - Multiple `pto.aic_initialize_pipe` ops are allowed in one Cube function, but
 `id` must be unique among frontend initialize ops in that function
+- If `slot_num` is present, it must be greater than `0`
 - If `local_slot_num` is present, it must be greater than `0` and no greater
-than the legacy slot count implied by `dir_mask`
-(`8` for `dir_mask = 1/2`, `4` for `dir_mask = 3`)
+than the effective `slot_num`
 - A global-only GM FIFO initialize carries only `gm_slot_tensor`; it must not
 carry `gm_slot_buffer`, `local_slot_num`, `c2v_consumer_buf`, or
-`v2c_consumer_buf`
+`v2c_consumer_buf`; it may carry `slot_num`
 - For global-only GM FIFO, `slot_size` must match the byte size of
 `gm_slot_tensor`
 - Global-entry `talloc` / `tpush` / `tpop` / `tfree` entry types must match the
@@ -8576,7 +8583,7 @@ pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, nosplit = true}

```mlir
// A2/A3 (with GM slot buffer):
-pto.aiv_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, local_slot_num = 1}
+pto.aiv_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, slot_num = 2, local_slot_num = 1}
(gm_slot_buffer = %gm_buf : !pto.ptr<f32>,
c2v_consumer_buf = %c2v_local : i32,
v2c_consumer_buf = %c0_i32 : i32)
25 changes: 13 additions & 12 deletions docs/designs/ptoas-tpush-tpop-design.md
@@ -390,7 +390,7 @@ func.func @vector_kernel(%gm_slot_buffer : !pto.ptr<f32>,
 - Compilation flows with local address planning enabled: `reserve_buffer` only allows `auto = true`
 - Compilation flows that skip local address planning: `reserve_buffer` only allows `auto = false` with an explicitly provided `base`
 - `import_reserved_buffer` must be able to find a `reserve_buffer` with the same name in `peer_func`
-- The initialize op of a global-only GM FIFO provides only `gm_slot_tensor`; it does not provide `gm_slot_buffer`, `local_slot_num`, `c2v_consumer_buf`, or `v2c_consumer_buf`, and does not require a paired `reserve_buffer` / `import_reserved_buffer`
+- The initialize op of a global-only GM FIFO provides only `gm_slot_tensor` (optionally accompanied by `slot_num`); it does not provide `gm_slot_buffer`, `local_slot_num`, `c2v_consumer_buf`, or `v2c_consumer_buf`, and does not require a paired `reserve_buffer` / `import_reserved_buffer`

 ## 4. Core conventions

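@@ -516,7 +516,7 @@ DIR_BOTH example:
 - Denotes the number of slots in the consumer-side local slot buffer on the GM path; meaningful only on the tile-entry path where a local FIFO buffer exists
 - Affects the underlying `TPipe` template parameters only when data is transferred through GM; does not change the GM FIFO's `slot_num`
 - When a local FIFO buffer exists and the attribute is omitted, the default equals the `slot_num` of that internal pipe
-- Therefore, under the current fixed rules
+- Therefore, when the frontend does not explicitly specify `slot_num`
 - With direct lowering for `DIR_MASK=1/2`, `local_slot_num = 8`
 - With a single DIR_BOTH pipe for `DIR_MASK=3`, `local_slot_num = 4`
 - A global-only GM FIFO does not carry `local_slot_num`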
@@ -658,10 +658,10 @@ pto.tfree(%entry, %pipe : !pto.tensor_view<128x512xf32>, !pto.pipe) {split = 0}
 #### A2/A3

 - `pto.aic_initialize_pipe` and `pto.aiv_initialize_pipe` lower to `pto.initialize_l2g2l_pipe`
-- If the frontend init provides only `gm_slot_tensor`, it lowers to a global-only GM FIFO that carries only `gm_slot_tensor`; no `local_slot_num` is filled in, no local consumer address operand is generated, and there is no dependency on `reserve_buffer` / `import_reserved_buffer`
+- If the frontend init provides only `gm_slot_tensor` (optionally accompanied by `slot_num`), it lowers to a global-only GM FIFO that carries only `gm_slot_tensor`; no `local_slot_num` is filled in, no local consumer address operand is generated, and there is no dependency on `reserve_buffer` / `import_reserved_buffer`
 - If the frontend provides a consumer-side local FIFO buffer together with `local_slot_num`, the value is forwarded directly to the lowered
 `pto.initialize_l2g2l_pipe`
-- If the frontend provides a consumer-side local FIFO buffer but no more specific information, lowering fills in `local_slot_num = slot_num` by default
+- If the frontend provides a consumer-side local FIFO buffer but does not provide `local_slot_num`, lowering fills in `local_slot_num = slot_num` by default
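
The A2/A3 rules above amount to a small defaulting step for `local_slot_num`. The following is a minimal C++ sketch with no MLIR dependency; `FrontendInitInfo` and `resolveLocalSlotNum` are hypothetical names introduced only for illustration and are not part of the PTO sources.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified view of a frontend initialize op (illustration only).
struct FrontendInitInfo {
  int32_t slotNum;                      // effective GM FIFO slot count
  std::optional<int32_t> localSlotNum;  // explicit local_slot_num, if provided
  bool hasLocalConsumerBuf;             // false for a global-only GM FIFO
};

// Returns the local_slot_num the lowered op would carry, or std::nullopt when
// the pipe is global-only and therefore carries no local_slot_num at all.
std::optional<int32_t> resolveLocalSlotNum(const FrontendInitInfo &init) {
  if (!init.hasLocalConsumerBuf)
    return std::nullopt;        // global-only: nothing is filled in
  if (init.localSlotNum)
    return init.localSlotNum;   // explicit value is forwarded unchanged
  return init.slotNum;          // default: local_slot_num = slot_num
}
```

Under these assumptions, a global-only init yields no value, an init with a local buffer but no explicit attribute inherits `slot_num`, and an explicit value passes through as written.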

#### A5

@@ -670,17 +670,17 @@ pto.tfree(%entry, %pipe : !pto.tensor_view<128x512xf32>, !pto.pipe) {split = 0}
 ### 6.2 `DIR_MASK=1/2`

 - Only one internal pipe is generated
-- `slot_num = 8`
-- For an `initialize_l2g2l_pipe` with a consumer-side local FIFO buffer, `local_slot_num` defaults to `8`
+- `slot_num` defaults to `8` and may also be specified explicitly by the frontend
+- For an `initialize_l2g2l_pipe` with a consumer-side local FIFO buffer, `local_slot_num` defaults to `slot_num`
 - If the frontend explicitly provides `local_slot_num`, the explicit value is used
 - A global-only GM FIFO does not carry `local_slot_num`; its only address/descriptor operand is `gm_slot_tensor`

 ### 6.3 `DIR_MASK=3`

 One frontend init op generates a **single** DIR_BOTH internal pipe:

-- `%pipe`: `dir_mask = 3`, `slot_num = 4`
-- If lowered to an `initialize_l2g2l_pipe` with a consumer-side local FIFO buffer, `local_slot_num` defaults to `4`
+- `%pipe`: `dir_mask = 3`; `slot_num` defaults to `4` and may also be specified explicitly by the frontend
+- If lowered to an `initialize_l2g2l_pipe` with a consumer-side local FIFO buffer, `local_slot_num` defaults to `slot_num`
 - If the frontend explicitly provides `local_slot_num`, the explicit value is used
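
The `slot_num` defaults in 6.2 and 6.3 reduce to one selection on `dir_mask`. The sketch here is illustrative only: the constant names echo those used by the lowering pass in this change, their values are assumed from the documented defaults (`8` and `4`), and `effectiveSlotNum` is a hypothetical helper that takes plain values instead of an MLIR op.

```cpp
#include <cstdint>
#include <optional>

// Assumed values, taken from the documented defaults rather than the sources.
constexpr int8_t kBidirectionalDirMask = 3;
constexpr int32_t kSingleDirectionSlotNum = 8;  // dir_mask = 1 or 2
constexpr int32_t kBidirectionalSlotNum = 4;    // dir_mask = 3

// Hypothetical helper: an explicit slot_num wins; otherwise the default is
// chosen by direction.
int32_t effectiveSlotNum(int8_t dirMask, std::optional<int32_t> slotNum) {
  if (slotNum)
    return *slotNum;
  return dirMask == kBidirectionalDirMask ? kBidirectionalSlotNum
                                          : kSingleDirectionSlotNum;
}
```

For instance, `effectiveSlotNum(2, 2)` keeps the explicit depth of 2, while `effectiveSlotNum(3, std::nullopt)` falls back to 4.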

Address selection rules:
@@ -995,7 +995,7 @@ The pass runs in two steps at module level:
 - Direction-specific ops may appear only in valid kernels
 - The `split` of a frontend data-transfer op must be a valid compile-time constant attribute
 - `talloc_to_*` / `tpush_to_*` / `tpop_from_*` / `tfree_from_*` in `global` entry form may bind only to a GM FIFO pipe (the A2/A3 `initialize_l2g2l_pipe` path)
-- An initialize op bound to a global-only GM FIFO may carry only `gm_slot_tensor`; it must not carry `gm_slot_buffer`, `local_slot_num`, `c2v_consumer_buf`, or `v2c_consumer_buf`; this path does not require `reserve_buffer` / `import_reserved_buffer`
+- An initialize op bound to a global-only GM FIFO may carry only `gm_slot_tensor` (optionally accompanied by `slot_num`); it must not carry `gm_slot_buffer`, `local_slot_num`, `c2v_consumer_buf`, or `v2c_consumer_buf`; this path does not require `reserve_buffer` / `import_reserved_buffer`
 - `gm_slot_tensor` itself describes a single slot entry; its byte size must match `slot_size`
 - The `tensor_view` type returned by `talloc_to_*` / `tpop_from_*` must match `gm_slot_tensor`
 - The dtype, shape, and stride/layout of a `global` entry must be sufficient to generate the underlying `GlobalTensor<RawDType, Shape, Stride, Layout>` type
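@@ -1008,11 +1008,12 @@ The pass runs in two steps at module level:
 The internal verifier is responsible for checking:

 - `slot_size > 0`
-- `slot_num` may only be `8` or `4`
-- When `DIR_MASK=1/2`, `slot_num` must be consistent with the unidirectional/bidirectional lowering rules
+- `slot_num >= 1`
+- The legacy frontend `pto.aic_initialize_pipe` / `pto.aiv_initialize_pipe` may explicitly provide
+`slot_num`; when omitted, `DIR_MASK=1/2` uses `8` and `DIR_MASK=3` uses `4`
 - `local_slot_num`, if present, may appear on `pto.initialize_l2g2l_pipe` or on the legacy frontend
 `pto.aic_initialize_pipe` / `pto.aiv_initialize_pipe`, and must be greater than `0`
-and no greater than the `slot_num` under its corresponding lowering rule; a global-only GM FIFO does not carry `local_slot_num`
+and no greater than its effective `slot_num`; a global-only GM FIFO does not carry `local_slot_num`
 - `flag_base`, if present, must satisfy basic validity; whether it has been filled in and its concrete assigned value are guaranteed by flag allocation
 - `pto.initialize_l2g2l_pipe` must provide `gm_addr` or `gm_slot_tensor`; `local_addr` / `peer_local_addr` are provided only when a consumer-side local FIFO buffer exists
 - `pto.initialize_l2l_pipe` must provide `local_addr`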
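
The count-related checks in the list above can be condensed into one function. This is a simplified sketch, not the verifier itself: `checkPipeCounts` is a hypothetical name, it works on plain integers rather than MLIR ops, and its messages only loosely follow the diagnostics added in this change.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for the count checks; returns an error message on
// failure and std::nullopt when all checks pass.
std::optional<std::string> checkPipeCounts(int32_t slotSize, int32_t slotNum,
                                           std::optional<int32_t> localSlotNum) {
  if (slotSize <= 0)
    return std::string("expects 'slot_size' to be greater than 0");
  if (slotNum <= 0)  // any positive depth is accepted, not only 4 or 8
    return std::string("expects 'slot_num' to be greater than 0");
  if (localSlotNum) {
    if (*localSlotNum <= 0)
      return std::string("expects 'local_slot_num' to be greater than 0");
    if (*localSlotNum > slotNum)  // bounded by the effective slot_num
      return "expects 'local_slot_num' to be less than or equal to slot_num (" +
             std::to_string(slotNum) + ")";
  }
  return std::nullopt;
}
```

With these rules, `slot_num = 2, local_slot_num = 2` passes, while `slot_num = 2, local_slot_num = 3` and `slot_num = 0` are both rejected, which matches the two negative lit tests added by this change.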
2 changes: 2 additions & 0 deletions include/PTO/IR/PTOOps.td
@@ -1505,6 +1505,7 @@ def AicInitializePipeOp : PTO_Op<"aic_initialize_pipe",
 DefaultValuedOptionalAttr<I32Attr, "0">:$id,
 I8Attr:$dir_mask,
 I32Attr:$slot_size,
+OptionalAttr<I32Attr>:$slot_num,
 OptionalAttr<I32Attr>:$local_slot_num,
 OptionalAttr<BoolAttr>:$nosplit,
 Optional<PtrType>:$gm_slot_buffer,
@@ -1526,6 +1527,7 @@ def AivInitializePipeOp : PTO_Op<"aiv_initialize_pipe",
 DefaultValuedOptionalAttr<I32Attr, "0">:$id,
 I8Attr:$dir_mask,
 I32Attr:$slot_size,
+OptionalAttr<I32Attr>:$slot_num,
 OptionalAttr<I32Attr>:$local_slot_num,
 OptionalAttr<BoolAttr>:$nosplit,
 Optional<PtrType>:$gm_slot_buffer,
32 changes: 25 additions & 7 deletions lib/PTO/IR/PTO.cpp
@@ -11457,6 +11457,7 @@ static ParseResult parseFrontendInitializePipeOp(OpAsmParser &parser,
 bool sawId = false;
 bool sawDirMask = false;
 bool sawSlotSize = false;
+bool sawSlotNum = false;
 bool sawLocalSlotNum = false;
 bool sawNoSplit = false;

@@ -11495,6 +11496,15 @@ static ParseResult parseFrontendInitializePipeOp(OpAsmParser &parser,
 "slot_size", attrs))
 return failure();
 sawSlotSize = true;
+} else if (keyword == "slot_num") {
+if (sawSlotNum)
+return parser.emitError(parser.getCurrentLocation(),
+"duplicate 'slot_num' clause");
+IntegerAttr slotNumAttr;
+if (parser.parseAttribute(slotNumAttr, parser.getBuilder().getI32Type(),
+"slot_num", attrs))
+return failure();
+sawSlotNum = true;
 } else if (keyword == "local_slot_num") {
 if (sawLocalSlotNum)
 return parser.emitError(parser.getCurrentLocation(),
@@ -11632,6 +11642,8 @@ static void printFrontendInitializePipeOp(InitOpT op, OpAsmPrinter &p) {
 printClause("id", op.getId());
 printClause("dir_mask", static_cast<int32_t>(op.getDirMask()));
 printClause("slot_size", op.getSlotSize());
+if (auto slotNumAttr = op.getSlotNumAttr())
+printClause("slot_num", slotNumAttr.getInt());
 if (auto localSlotNumAttr = op.getLocalSlotNumAttr())
 printClause("local_slot_num", localSlotNumAttr.getInt());
 if (auto noSplitAttr = op.getNosplitAttr())
@@ -11658,7 +11670,8 @@ static void printFrontendInitializePipeOp(InitOpT op, OpAsmPrinter &p) {
 p << ")";
 p.printOptionalAttrDict(
 op->getAttrs(),
-/*elidedAttrs=*/{"id", "dir_mask", "slot_size", "local_slot_num",
+/*elidedAttrs=*/{"id", "dir_mask", "slot_size", "slot_num",
+"local_slot_num",
 "nosplit", "operandSegmentSizes"});
 }

@@ -11744,6 +11757,12 @@ static LogicalResult verifyFrontendInitCommon(InitOpT op,
 return op.emitOpError("expects 'dir_mask' to be 1, 2, or 3");
 if (op.getSlotSize() <= 0)
 return op.emitOpError("expects 'slot_size' to be greater than 0");
+int32_t slotNum = dirMask == 3 ? 4 : 8;
+if (auto slotNumAttr = op.getSlotNumAttr()) {
+slotNum = slotNumAttr.getInt();
+if (slotNum <= 0)
+return op.emitOpError("expects 'slot_num' to be greater than 0");
+}

 bool hasGlobalSlotTensor = static_cast<bool>(op.getGmSlotTensor());
 bool hasC2vConsumerBuf = static_cast<bool>(op.getC2vConsumerBuf());
@@ -11779,11 +11798,10 @@ static LogicalResult verifyFrontendInitCommon(InitOpT op,
 int32_t localSlotNum = localSlotNumAttr.getInt();
 if (localSlotNum <= 0)
 return op.emitOpError("expects 'local_slot_num' to be greater than 0");
-int32_t loweredSlotNum = dirMask == 3 ? 4 : 8;
-if (localSlotNum > loweredSlotNum) {
+if (localSlotNum > slotNum) {
 return op.emitOpError()
-<< "expects 'local_slot_num' to be less than or equal to "
-<< loweredSlotNum << " for dir_mask = " << static_cast<int>(dirMask);
+<< "expects 'local_slot_num' to be less than or equal to slot_num ("
+<< slotNum << ") for dir_mask = " << static_cast<int>(dirMask);
 }
 }

@@ -12060,8 +12078,8 @@ static LogicalResult verifyPipeShape(Operation *op, int8_t dirMask, int32_t slot
 return op->emitOpError("expects 'dir_mask' to be 1, 2, or 3");
 if (slotSize <= 0)
 return op->emitOpError("expects 'slot_size' to be greater than 0");
-if (slotNum != 4 && slotNum != 8)
-return op->emitOpError("expects 'slot_num' to be 4 or 8");
+if (slotNum <= 0)
+return op->emitOpError("expects 'slot_num' to be greater than 0");
 if (flagBase && *flagBase < 0)
 return op->emitOpError("expects 'flag_base' to be non-negative when present");
 if (flagBase) {
18 changes: 14 additions & 4 deletions lib/PTO/Transforms/PTOLowerFrontendPipeOpsPass.cpp
@@ -65,6 +65,15 @@ static void propagateFrontendIdAttr(InitOpT initOp, Operation *pipeOp,
 rewriter.getI32IntegerAttr(initOp.getId()));
 }

+template <typename InitOpT>
+static int32_t getFrontendSlotNum(InitOpT initOp) {
+if (auto slotNumAttr = initOp.getSlotNumAttr())
+return slotNumAttr.getInt();
+return initOp.getDirMask() == kBidirectionalDirMask
+? kBidirectionalSlotNum
+: kSingleDirectionSlotNum;
+}
+
 static std::optional<int64_t> getStaticIndexLikeValue(Value value) {
 if (auto cst = value.getDefiningOp<arith::ConstantIndexOp>())
 return cst.value();
@@ -166,9 +175,10 @@ static FailureOr<FrontendPipeHandles>
 lowerSingleDirectionFrontendInit(InitOpT initOp, IRRewriter &rewriter,
 PTOArch arch, Type pipeTy, int8_t dirMask,
 Value localAddr) {
+int32_t slotNum = getFrontendSlotNum(initOp);
 auto pipeOr =
-createFrontendPipe(initOp, rewriter, arch, pipeTy, dirMask,
-kSingleDirectionSlotNum, localAddr);
+createFrontendPipe(initOp, rewriter, arch, pipeTy, dirMask, slotNum,
+localAddr);
 if (failed(pipeOr))
 return failure();

@@ -190,9 +200,9 @@ template <typename InitOpT>
 static FailureOr<FrontendPipeHandles>
 lowerBidirectionalFrontendInit(InitOpT initOp, IRRewriter &rewriter,
 PTOArch arch, Type pipeTy) {
+int32_t slotNum = getFrontendSlotNum(initOp);
 auto pipeOr = createFrontendPipe(initOp, rewriter, arch, pipeTy,
-kBidirectionalDirMask,
-kBidirectionalSlotNum,
+kBidirectionalDirMask, slotNum,
 initOp.getC2vConsumerBuf(),
 initOp.getV2cConsumerBuf());
 if (failed(pipeOr))
@@ -12,4 +12,4 @@ module {
 }
 }

-// CHECK: error: 'pto.aic_initialize_pipe' op expects 'local_slot_num' to be less than or equal to 4 for dir_mask = 3
+// CHECK: error: 'pto.aic_initialize_pipe' op expects 'local_slot_num' to be less than or equal to slot_num (4) for dir_mask = 3
49 changes: 49 additions & 0 deletions test/lit/pto/tpush_tpop_frontend_slot_num_a3.pto
@@ -0,0 +1,49 @@
// RUN: ptoas --pto-arch=a3 %s 2>&1 | FileCheck %s --check-prefix=A3

module {
func.func @cube_kernel(%gm_slot_buffer: !pto.ptr<f32>)
attributes {pto.kernel_kind = #pto.kernel_kind<cube>} {
%c0_i32 = arith.constant 0 : i32
%v2c_local = pto.reserve_buffer {
name = "v2c_fifo",
size = 2048,
location = #pto.address_space<mat>,
auto = true
} -> i32
pto.aic_initialize_pipe {id = 0, dir_mask = 2, slot_size = 1024, slot_num = 2}
(gm_slot_buffer = %gm_slot_buffer : !pto.ptr<f32>,
c2v_consumer_buf = %c0_i32 : i32,
v2c_consumer_buf = %v2c_local : i32)

%recv_tile = pto.tpop_from_aiv {id = 0, split = 0}
-> !pto.tile_buf<loc=mat, dtype=f32, rows=16, cols=16, v_row=16, v_col=16, blayout=col_major, slayout=row_major, fractal=512, pad=0>
pto.tfree_from_aiv {id = 0, split = 0}
return
}

func.func @vector_kernel(%gm_slot_buffer: !pto.ptr<f32>)
attributes {pto.kernel_kind = #pto.kernel_kind<vector>} {
%c0_i32 = arith.constant 0 : i32
%v2c_import = pto.import_reserved_buffer {
name = "v2c_fifo",
peer_func = @cube_kernel
} -> i32
pto.aiv_initialize_pipe {id = 0, dir_mask = 2, slot_size = 1024, slot_num = 2}
(gm_slot_buffer = %gm_slot_buffer : !pto.ptr<f32>,
c2v_consumer_buf = %c0_i32 : i32,
v2c_consumer_buf = %v2c_import : i32)

%vec_tile = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=16, cols=16, v_row=16, v_col=16, blayout=row_major, slayout=none_box, fractal=512, pad=0>
pto.tpush_to_aic(%vec_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=16, cols=16, v_row=16, v_col=16, blayout=row_major, slayout=none_box, fractal=512, pad=0>) {id = 0, split = 0}
return
}
}

// A3-LABEL: AICORE void cube_kernel(__gm__ float*
// A3: auto {{v[0-9]+}} = TPipe<0, Direction::DIR_V2C, 1024, 2, 2, true>(
// A3: TPOP<TPipe<0, Direction::DIR_V2C, 1024, 2, 2, true>
// A3: TFREE<TPipe<0, Direction::DIR_V2C, 1024, 2, 2, true>, TileSplitAxis::TILE_NO_SPLIT>(

// A3-LABEL: AICORE void vector_kernel(__gm__ float*
// A3: auto {{v[0-9]+}} = TPipe<0, Direction::DIR_V2C, 1024, 2, 2, true>(
// A3: TPUSH<TPipe<0, Direction::DIR_V2C, 1024, 2, 2, true>
15 changes: 15 additions & 0 deletions test/lit/pto/tpush_tpop_frontend_slot_num_invalid.pto
@@ -0,0 +1,15 @@
// RUN: not ptoas --pto-arch=a3 %s 2>&1 | FileCheck %s

module {
func.func @cube_kernel(%gm_slot_buffer: !pto.ptr<f32>)
attributes {pto.kernel_kind = #pto.kernel_kind<cube>} {
%c0_i32 = arith.constant 0 : i32
pto.aic_initialize_pipe {id = 0, dir_mask = 1, slot_size = 1024, slot_num = 0}
(gm_slot_buffer = %gm_slot_buffer : !pto.ptr<f32>,
c2v_consumer_buf = %c0_i32 : i32,
v2c_consumer_buf = %c0_i32 : i32)
return
}
}

// CHECK: error: 'pto.aic_initialize_pipe' op expects 'slot_num' to be greater than 0
15 changes: 15 additions & 0 deletions test/lit/pto/tpush_tpop_frontend_slot_num_local_invalid.pto
@@ -0,0 +1,15 @@
// RUN: not ptoas --pto-arch=a3 %s 2>&1 | FileCheck %s

module {
func.func @cube_kernel(%gm_slot_buffer: !pto.ptr<f32>)
attributes {pto.kernel_kind = #pto.kernel_kind<cube>} {
%c0_i32 = arith.constant 0 : i32
pto.aic_initialize_pipe {id = 0, dir_mask = 2, slot_size = 1024, slot_num = 2, local_slot_num = 3}
(gm_slot_buffer = %gm_slot_buffer : !pto.ptr<f32>,
c2v_consumer_buf = %c0_i32 : i32,
v2c_consumer_buf = %c0_i32 : i32)
return
}
}

// CHECK: error: 'pto.aic_initialize_pipe' op expects 'local_slot_num' to be less than or equal to slot_num (2) for dir_mask = 2