[Stick-breaking Attention: a Triton-based implementation of the attention mechanism for variable-length sequences, aimed at improving performance on GPUs] 'Stick-breaking attention: Triton-based implementation of Stick-breaking Attention on GPUs. This implementation is for variable length.' GitHub: github.com/shawntan/stickbreaking-attention
Artificial Intelligence, GPU Computing, Attention Mechanism