CANN/pto-isa Interleave Instruction

# TINTERLEAVE

pto-isa: Parallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms.

Project address: https://gitcode.com/cann/pto-isa

## Introduction

Interleaves two source Tiles, `src0` and `src1`, into two destination Tiles, `dst0` and `dst1`. The operation combines the elements of `src0` and `src1` in an alternating pattern: `src0` elements occupy the even positions of the interleaved stream and `src1` elements occupy the odd positions. The stream is then split at its midpoint, so each destination Tile holds one half: `dst0` the first half and `dst1` the second half.

`TInterleave` is the inverse of `TDeInterleave`.

## Mathematical Semantics

### Dual-source form

Given two source Tiles `src0` and `src1` with the same valid shape `(validRows, validCols)`, an interleaved stream of length `2 × validCols` is constructed for each row `i`:

$$ \mathrm{interleaved}_{2k} = \mathrm{src0}_{i, k}, \quad \mathrm{interleaved}_{2k+1} = \mathrm{src1}_{i, k}, \quad 0 \le k < \mathrm{validCols} $$

The interleaved stream is then split into two halves:

$$ \mathrm{dst0}_{i, j} = \mathrm{interleaved}_{j}, \quad 0 \le j < \mathrm{validCols} $$

$$ \mathrm{dst1}_{i, j} = \mathrm{interleaved}_{\mathrm{validCols} + j}, \quad 0 \le j < \mathrm{validCols} $$

where `validRows = dst0.GetValidRow()` and `validCols = dst0.GetValidCol()`.

## Assembly Syntax

For the PTO-AS form, see the PTO-AS specification.

Synchronous form:

```text
%dst0, %dst1 = tinterleave %src0, %src1 : !pto.tile<...>
```

### AS Level 1 (SSA)

```text
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
```

### AS Level 2 (DPS)

```text
pto.tinterleave ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst0, %dst1 : !pto.tile_buf<...>, !pto.tile_buf<...>)
```

## C++ Intrinsic

Declared in `include/pto/common/pto_instr.hpp`:

```cpp
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TInterleave(TileDataDst dst1, TileDataDst dst0, TileDataSrc src1, TileDataSrc src0, WaitEvents ...events);
```

Note that the parameter order is `(dst1, dst0, src1, src0)`. `dst0` receives the first half of the interleaved stream (positions `0 … validCols-1`), and `dst1` receives the second half (positions `validCols … 2×validCols-1`).

## Constraints

Implementation checks (A5):

- `TileData::DType` must be one of: `int32_t`, `uint32_t`, `float`, `int16_t`, `uint16_t`, `half`, `bfloat16_t`, `uint8_t`, `int8_t`.
- The Tile layout must be row-major (`TileData::isRowMajor`).
- All Tiles (`dst0`, `dst1`, `src0`, `src1`) must have the same `DType` and the same valid shape.
- The `validCol` of every Tile must be even. Because all Tiles share the same valid shape, this is equivalent to requiring `dst0.GetValidCol() % 2 == 0`.

Valid region: the operation uses `dst0.GetValidRow()` / `dst0.GetValidCol()` as its iteration domain and assumes that `src0`, `src1`, and `dst1` are compatible.

## Examples

### Auto

```cpp
#include <pto/pto-inst.hpp>

using namespace pto;

void example_auto() {
  using TileT = Tile<TileType::Vec, float, 16, 64>;
  // All four Tiles share the same valid shape (16 x 64).
  TileT src0(16, 64), src1(16, 64);
  TileT dst0(16, 64), dst1(16, 64);
  // Argument order is (dst1, dst0, src1, src0).
  TInterleave(dst1, dst0, src1, src0);
}
```

### Manual

```cpp
#include <pto/pto-inst.hpp>

using namespace pto;

void example_manual() {
  using TileT = Tile<TileType::Vec, half, 16, 256, BLayout::RowMajor, 16, 256>;
  TileT src0, src1, dst0, dst1;
  // Bind each Tile to an explicit address before issuing the instruction.
  TASSIGN(src0, 0x1000);
  TASSIGN(src1, 0x2000);
  TASSIGN(dst0, 0x3000);
  TASSIGN(dst1, 0x4000);
  // Argument order is (dst1, dst0, src1, src0).
  TInterleave(dst1, dst0, src1, src0);
}
```

## Assembly Examples (ASM)

### Auto mode

```text
# Auto mode: the compiler/runtime is responsible for resource placement and scheduling.
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
```

### Manual mode

```text
# Manual mode: bind resources explicitly, then issue the instruction.
# Optional, when the instruction has tile operands:
# pto.tassign %src0, tile(0x1000)
# pto.tassign %src1, tile(0x2000)
# pto.tassign %dst0, tile(0x3000)
# pto.tassign %dst1, tile(0x4000)
%dst0, %dst1 = pto.tinterleave %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> (!pto.tile<...>, !pto.tile<...>)
```

### PTO assembly form

```text
%dst0, %dst1 = tinterleave %src0, %src1 : !pto.tile<...>
# AS Level 2 (DPS)
pto.tinterleave ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst0, %dst1 : !pto.tile_buf<...>, !pto.tile_buf<...>)
```

## Related Instructions

- `TDeInterleave`: de-interleaves two Tiles back into the original even/odd streams (the inverse of `TInterleave`).

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
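
## Reference Sketch of the Semantics

To make the dual-source formulas concrete, the following is a minimal host-side sketch of the mathematical semantics in plain C++. It is not the PTO implementation and does not use any PTO API: the `Matrix` alias and the `interleave_reference` helper are hypothetical names introduced only for illustration, and a `std::vector` of rows stands in for the valid region of a row-major Tile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the valid region of a row-major Tile.
using Matrix = std::vector<std::vector<float>>;

// Scalar reference model (hypothetical helper, not part of pto-isa).
// For each row: build the interleaved stream, then split it at the midpoint.
void interleave_reference(const Matrix &src0, const Matrix &src1,
                          Matrix &dst0, Matrix &dst1) {
  const std::size_t rows = src0.size();
  const std::size_t cols = rows ? src0[0].size() : 0;
  assert(src1.size() == rows);  // both sources share the same valid shape

  dst0.assign(rows, std::vector<float>(cols));
  dst1.assign(rows, std::vector<float>(cols));

  for (std::size_t i = 0; i < rows; ++i) {
    assert(src0[i].size() == cols && src1[i].size() == cols);

    // interleaved[2k] = src0[i][k], interleaved[2k+1] = src1[i][k]
    std::vector<float> stream(2 * cols);
    for (std::size_t k = 0; k < cols; ++k) {
      stream[2 * k] = src0[i][k];
      stream[2 * k + 1] = src1[i][k];
    }

    // dst0 takes positions [0, cols), dst1 takes positions [cols, 2*cols).
    for (std::size_t j = 0; j < cols; ++j) {
      dst0[i][j] = stream[j];
      dst1[i][j] = stream[cols + j];
    }
  }
}
```

For example, with a single row `src0 = [0, 1, 2, 3]` and `src1 = [10, 11, 12, 13]`, the interleaved stream is `[0, 10, 1, 11, 2, 12, 3, 13]`, so `dst0 = [0, 10, 1, 11]` and `dst1 = [2, 12, 3, 13]`. The sketch materializes the full stream for readability; a real kernel would presumably write the destinations directly.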