Li Li's Blog

Translation: DeepSeek Explained 6: All you need to know about Reinforcement Learning in LLM training

This post is a translation of DeepSeek Explained 6: All you need to know about Reinforcement Learning in LLM training.


Translation: DeepSeek Explained 5: DeepSeek-V3-Base

This post is a translation of DeepSeek Explained 5: DeepSeek-V3-Base.


Translation: DeepSeek Explained 4: Multi-Token Prediction

This post is a translation of DeepSeek Explained 4: Multi-Token Prediction.


Multi-head Latent Attention Code Analysis

This post walks through and explains the code for MLA (Multi-head Latent Attention).


Translation: DeepSeek-V3 Explained 3: Auxiliary-Loss-Free Load Balancing

This post is a translation of DeepSeek-V3 Explained 3: Auxiliary-Loss-Free Load Balancing.


Translation: DeepSeek-V3 Explained 2: DeepSeekMoE

This post is a translation of DeepSeek-V3 Explained 2: DeepSeekMoE.


RoPE Code Analysis

This post introduces several different code implementations of RoPE.


Translation: DeepSeek-V3 Explained 1: Multi-head Latent Attention

This post is a translation of DeepSeek-V3 Explained 1: Multi-head Latent Attention.


Translation: The Llama 3 Herd of Models

This post is a close reading and analysis of The Llama 3 Herd of Models.


Huggingface Whisper Code Reading (Part 1)

This post reads through and analyzes the Huggingface Whisper code.