Li Li's Blog

Building and Optimizing a BPE Tokenizer from Scratch—Part 0: Introduction

This series of articles implements a subtask of Stanford's CS336 Assignment 1: building an efficient training algorithm for a BPE tokenizer. Through a series of optimizations, our algorithm's training time on OpenWebText drops from over 10 hours to under 10 minutes. The series explains these optimizations: algorithmic improvements, better data structures, parallelization with OpenMP, Cython optimization, and implementing the key code in C++ together with integrating that C++ library via Cython. This first article introduces the task and explains how to get the source code and set up the development environment.
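
For a rough sense of where that time goes: naive BPE training repeatedly finds the most frequent adjacent symbol pair and merges it everywhere, recounting all pairs from scratch after every merge. The sketch below shows a baseline of this kind in plain Python; the function names and the simplified word-frequency representation are illustrative, not the assignment's actual interface. The recounting cost of this loop is the kind of bottleneck the series' algorithmic, data-structure, OpenMP, Cython, and C++ optimizations target.

```python
from collections import Counter

def count_pairs(word_freqs):
    """Count occurrences of each adjacent symbol pair, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in word_freqs.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(word_freqs, pair):
    """Rewrite every word, replacing each occurrence of `pair` with one merged symbol."""
    merged = pair[0] + pair[1]
    new_freqs = {}
    for symbols, freq in word_freqs.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(merged)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        new_freqs[key] = new_freqs.get(key, 0) + freq
    return new_freqs

def train_bpe(word_freqs, num_merges):
    """Naive training loop: recount *all* pairs after every merge (the slow part)."""
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(word_freqs)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        word_freqs = merge_pair(word_freqs, best)
        merges.append(best)
    return merges

# Toy usage: words pre-split into characters, mapped to their corpus frequency.
freqs = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
print(train_bpe(freqs, 3))  # [('w', 'e'), ('we', 'r'), ('l', 'o')]
```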


Model Optimization


Negative Log-Likelihood and Cross-Entropy


Translation: The Log-Sum-Exp Trick

This article is a translation of The Log-Sum-Exp Trick.

Normalizing a vector of log probabilities is a common task in statistical modeling, but exponentiating large values can cause underflow or overflow. This article discusses the log-sum-exp trick used to solve this problem.
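
The identity behind the trick is log Σᵢ exp(xᵢ) = c + log Σᵢ exp(xᵢ − c) for any constant c; choosing c = max(x) keeps the largest exponent at exp(0) = 1, so nothing overflows. A minimal sketch (assuming numpy; not code from the translated article):

```python
import numpy as np

def log_sum_exp(x):
    """Stable log(sum(exp(x))): shift by the max so the largest exponent is exp(0) = 1."""
    c = np.max(x)
    return c + np.log(np.sum(np.exp(x - c)))

def log_normalize(log_p):
    """Normalize a vector of log probabilities without ever leaving log space."""
    return log_p - log_sum_exp(log_p)

log_p = np.array([1000.0, 1001.0, 999.0])
print(np.exp(log_p) / np.sum(np.exp(log_p)))  # naive: nan (exp(1000) overflows to inf)
print(np.exp(log_normalize(log_p)))           # stable: approx. [0.2447 0.6652 0.0900]
```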


Translation: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

This article is a translation of DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.


Translation: YaRN: Efficient Context Window Extension of Large Language Models

This article is a translation of YaRN: Efficient Context Window Extension of Large Language Models.


Translation: DeepSeek-R1: Advancing LLM Reasoning with Reinforcement Learning

This article is a translation of DeepSeek-R1: Advancing LLM Reasoning with Reinforcement Learning.


Translation: DeepSeek Explained 6: All you need to know about Reinforcement Learning in LLM training

This article is a translation of DeepSeek Explained 6: All you need to know about Reinforcement Learning in LLM training.


Translation: DeepSeek Explained 5: DeepSeek-V3-Base

This article is a translation of DeepSeek Explained 5: DeepSeek-V3-Base.