Mixture of Experts (MoE): a brief introduction
Phase 4, Session 2. Speaker: 韩益增
Outline
Concepts and early works
- HydraNet
- Multi-gate Mixture-of-Experts
- Sparsely gated MoE layer (a minimal sketch follows this outline)
Classic Transformer-based methods
- Switch Transformer
- GShard
- Vision MoE
Recent works
- LIMoE
- DSelect-k
- BASE
- Hash Layers
- Sparse MLP
- Swin-MoE
- Uni-Perceiver-MoE
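For orientation, below is a minimal sketch of the sparsely gated (top-k routed) MoE layer idea listed under "Concepts and early works". It is written in PyTorch; the class name `SparseMoE`, the expert architecture, and all hyperparameters are illustrative assumptions and are not taken from any specific paper above (auxiliary load-balancing losses and capacity limits are omitted).

```python
# Minimal sketch of a top-k sparsely gated MoE layer (illustrative, not from any
# specific paper in this outline).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        # The gate produces one logit per expert for each token.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.gate(x)                               # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)                # renormalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


# Usage: route 8 tokens of width 16 through the layer.
tokens = torch.randn(8, 16)
print(SparseMoE(d_model=16, d_hidden=32)(tokens).shape)  # torch.Size([8, 16])
```

Setting `top_k = 1` gives the Switch-Transformer-style routing discussed later in the talk, where each token is sent to a single expert.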
Talk video replay
https://meeting.tencent.com/v2/cloud-record/share?id=36844c5c-30f5-474a-a62c-fa6f817e9a55&from=3
Slides
Link: https://pan.baidu.com/s/1WzvNxFre7mjpd0d1NhiThg?pwd=c9kh (extraction code: c9kh)