
FlashAttention-3

attention

Hopper-optimized FlashAttention-3 kernel that uses TMA asynchronous copies and FP8 compute paths
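
For orientation, a minimal usage sketch follows. It assumes the flash-attn Python package is installed (pip install flash-attn) and a CUDA GPU is available, and it uses that package's flash_attn_func entry point; the FA3 Hopper build is distributed separately and may expose this function under a different module path.

```python
# Minimal usage sketch; package availability and the FA3 Hopper build are assumptions.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 4096, 16, 128
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.bfloat16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Fused attention: softmax(Q·Kᵀ / sqrt(d))·V computed tile-by-tile in SRAM,
# never materializing the full seqlen × seqlen score matrix in HBM.
out = flash_attn_func(q, k, v, causal=True)  # shape: (batch, seqlen, nheads, headdim)
```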

Formula

FLOPs
Same as standard attention: roughly 4·B·H·S_q·S_kv·d for the forward pass (2·B·H·S_q·S_kv·d each for Q·Kᵀ and P·V). The hardware-aware tiling reorders the computation but does not change the FLOP count.
Bytes
HBM reads are reduced relative to naive attention because the full S_q×S_kv score matrix is never materialized in HBM: Q, K, and V tiles are streamed through SRAM using TMA asynchronous copies and warp-specialized producer/consumer scheduling, and the FP8 path further halves the bytes moved per element.
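
To make the cost model above concrete, here is a small estimator sketch. The shape names (batch, heads, seq_q, seq_kv, head_dim) and the bf16/FP8 byte sizes are illustrative assumptions, not values taken from this catalog entry, and the byte models are rough bounds rather than measured traffic.

```python
# Rough cost-model sketch; shape names and element sizes are illustrative assumptions.

def attn_flops(batch, heads, seq_q, seq_kv, head_dim):
    # Forward pass: 2*B*H*Sq*Skv*d for Q@K^T plus 2*B*H*Sq*Skv*d for P@V.
    # Tiling (FlashAttention) reorders this work but does not change the count.
    return 4 * batch * heads * seq_q * seq_kv * head_dim

def naive_attn_bytes(batch, heads, seq_q, seq_kv, head_dim, elem_bytes=2):
    # Naive attention writes and re-reads the full Sq x Skv score and probability
    # matrices in HBM, which dominates traffic at long sequence lengths.
    qkv_out = (2 * seq_q + 2 * seq_kv) * head_dim  # read Q, K, V; write O
    scores = 4 * seq_q * seq_kv                    # write+read scores, write+read probs
    return batch * heads * (qkv_out + scores) * elem_bytes

def flash_attn_bytes(batch, heads, seq_q, seq_kv, head_dim, elem_bytes=2):
    # Idealized FlashAttention traffic: Q, K, V, O each touch HBM roughly once;
    # the score matrix stays in SRAM. Real kernels re-read K/V tiles per query
    # block, so this is a lower bound.
    return batch * heads * (2 * seq_q + 2 * seq_kv) * head_dim * elem_bytes

if __name__ == "__main__":
    args = dict(batch=1, heads=32, seq_q=8192, seq_kv=8192, head_dim=128)
    print(f"FLOPs:              {attn_flops(**args):.3e}")
    print(f"naive bytes (bf16): {naive_attn_bytes(**args):.3e}")
    print(f"flash bytes (bf16): {flash_attn_bytes(**args):.3e}")
    print(f"flash bytes (fp8):  {flash_attn_bytes(**args, elem_bytes=1):.3e}")
```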

Models using this operator (0)

No models reference this operator in their operator breakdowns yet.