Li Li's Blog

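Implementing and Optimizing a BPE Tokenizer from Scratch, Part 2: Optimizing the Algorithm

This series of articles implements a subtask of Stanford's CS336 Assignment 1: building an efficient training algorithm for a BPE Tokenizer. Through a series of optimizations, our algorithm's training time on OpenWebText was reduced from over 10 hours to less than 10 minutes. This series explains these optimizations, including algorithmic improvements, data structure enhancements, parallelization with OpenMP, Cython optimization, and implementing key code in C++ along with its integration via Cython. This is the third article, which optimizes the algorithm from the previous part.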

Posted by lili on September 8, 2025

Implementing and Optimizing a BPE Tokenizer from Scratch—Part 1: The Simplest Implementation

This series of articles implements a subtask of Stanford’s CS336 Assignment 1: building an efficient training algorithm for a BPE Tokenizer. Through a series of optimizations, our algorithm’s training time on OpenWebText was reduced from over 10 hours to less than 10 minutes. This series explains these optimizations, including algorithmic improvements, data structure enhancements, parallelization with OpenMP, Cython optimization, and implementing key code in C++ along with its integration via Cython. This is the second article, covering the implementation of the simplest algorithm.

Posted by lili on September 7, 2025

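Implementing and Optimizing a BPE Tokenizer from Scratch, Part 1: The Simplest Implementation (in Chinese)

This series of articles implements a subtask of Stanford's CS336 Assignment 1: building an efficient training algorithm for a BPE Tokenizer. Through a series of optimizations, our algorithm's training time on OpenWebText was reduced from over 10 hours to less than 10 minutes. This series explains these optimizations, including algorithmic improvements, data structure enhancements, parallelization with OpenMP, Cython optimization, and implementing key code in C++ along with its integration via Cython. This is the second article, which implements the simplest algorithm.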

Posted by lili on September 7, 2025

Building and Optimizing a BPE Tokenizer from Scratch—Part 0: Introduction

This series of articles implements a subtask of Stanford’s CS336 Assignment 1: building an efficient training algorithm for a BPE Tokenizer. Through a series of optimizations, our algorithm’s training time on OpenWebText was reduced from over 10 hours to less than 10 minutes. This series explains these optimizations, including algorithmic improvements, data structure enhancements, parallelization with OpenMP, Cython optimization, and implementing key code in C++ along with its integration via Cython. This first article covers the task’s introduction, how to get the source code, and how to set up the development environment.

Posted by lili on September 5, 2025

Posted by lili on July 6, 2025

Translation: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

This post is a Chinese translation of DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.

Posted by lili on June 30, 2025

Translation: YaRN: Efficient Context Window Extension of Large Language Models

This post is a Chinese translation of YaRN: Efficient Context Window Extension of Large Language Models.

Posted by lili on June 29, 2025

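Implementing and Optimizing a BPE Tokenizer from Scratch, Part 0: Introduction (in Chinese)

Model Optimization

Negative Log-Likelihood and Cross-Entropy

Translation: The Log-Sum-Exp Trick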
