Kimi Linear is a hybrid attention architecture that combines the efficiency of linear attention with the performance of full attention. This ...
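5D parallelism = DP × PP × TP × SP × EP
Example configuration (1024 GPUs):
- DP: 8-way (8 data replicas)
- PP: 8-way (8 pipeline stages)
- TP: 8-way (8-way tensor parallelism)
- SP: 2-way (2-way sequence parallelism)
- EP: 1-way (all experts in the same group)
Total model size ≈ single-GPU ...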
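
The product formula above can be checked with a few lines of Python. This is a minimal sketch that takes the formula as stated in the text (total GPUs = product of the five degrees). The helper names `world_size` and `rank_to_coords`, and the choice to make EP the fastest-varying dimension when numbering ranks, are assumptions for illustration only, not the layout of any particular training framework.

```python
from math import prod

# Parallelism degrees from the example configuration in the text.
DEGREES = {"DP": 8, "PP": 8, "TP": 8, "SP": 2, "EP": 1}


def world_size(degrees):
    """Total GPU count, taken as the product of all parallelism degrees."""
    return prod(degrees.values())


def rank_to_coords(rank, degrees):
    """Map a flat GPU rank to one index per parallelism dimension.

    Assumption: the last dimension listed (EP) varies fastest. Real
    frameworks choose their own ordering to match network topology.
    """
    if not 0 <= rank < world_size(degrees):
        raise ValueError("rank out of range")
    coords = {}
    # Mixed-radix decomposition, peeling off the fastest-varying dimension first.
    for name, size in reversed(list(degrees.items())):
        coords[name] = rank % size
        rank //= size
    return coords


if __name__ == "__main__":
    print(world_size(DEGREES))
    print(rank_to_coords(1023, DEGREES))
```

With the degrees listed, 8 × 8 × 8 × 2 × 1 gives the 1024 GPUs quoted in the example, and every rank from 0 to 1023 maps to a unique coordinate in the five-dimensional grid.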