Best Articles by Topic
A curated, non-chronological guide to the articles I would recommend first. If the blog archive is the full library, this page is the shelf of strongest starting points by topic.
Start here if you want to compare modern open-weight models, attention variants, mixture-of-experts designs, and practical architecture tradeoffs.
The Big LLM Architecture Comparison
The main long-form comparison of recent LLM architecture choices.
A Visual Guide to Attention Variants in Modern LLMs
A focused guide to MHA, GQA, MLA, sparse attention, and hybrid attention.
LLM Architecture Gallery
A visual reference collection of architecture figures and compact fact sheets.
My Workflow for Understanding LLM Architectures
How to inspect open-weight model configs and implementation details.
From GPT-2 to gpt-oss
A bridge from classic GPT-style designs to newer open-weight releases.
From DeepSeek V3 to V3.2
A technical look at DeepSeek architecture, sparse attention, and RL updates.
These are the most useful articles if your preferred way to understand a method is to implement it and inspect the moving parts.
Understanding and Coding the KV Cache in LLMs from Scratch
A practical explanation of one of the key mechanisms behind efficient inference.
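To give a flavor of the mechanism before you read the full piece, here is a minimal sketch of the caching idea; the projection matrices and dimensions are toy placeholders, not the article's code.

```python
import torch

# Minimal KV-cache sketch: during autoregressive decoding, keys and values
# for past tokens are cached, so each step only projects the newest token.
torch.manual_seed(0)
d = 8                       # toy embedding/head dimension
W_k = torch.randn(d, d)     # hypothetical key projection
W_v = torch.randn(d, d)     # hypothetical value projection

k_cache, v_cache = [], []
for step in range(4):
    x_new = torch.randn(1, d)       # embedding of the newest token only
    k_cache.append(x_new @ W_k)     # append instead of recomputing history
    v_cache.append(x_new @ W_v)
    K = torch.cat(k_cache, dim=0)   # (step + 1, d) keys available to attention
    V = torch.cat(v_cache, dim=0)
    print(step, K.shape, V.shape)
```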
Self-Attention from Scratch
A step-by-step implementation of the core mechanism behind transformer LLMs.
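The core computation fits in a few lines; below is a compact, unmasked sketch with toy dimensions, not the article's full step-by-step walkthrough.

```python
import torch
import torch.nn.functional as F

# Unmasked scaled dot-product self-attention on toy dimensions.
torch.manual_seed(0)
seq_len, d = 6, 16
x = torch.randn(seq_len, d)                  # token embeddings

W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))  # hypothetical projections
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / d**0.5                    # scaled dot products
weights = F.softmax(scores, dim=-1)          # attention weights per query
context = weights @ V                        # weighted sum of value vectors
print(context.shape)                         # torch.Size([6, 16])
```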
Byte Pair Encoding Tokenizer from Scratch
A standalone implementation of the tokenization algorithm used by many LLMs.
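As a taste of the algorithm, here is a toy version of the core merge step (count adjacent pairs, fuse the most frequent); the article builds the complete, trainable tokenizer.

```python
from collections import Counter

# One BPE merge step on a toy corpus: find the most frequent adjacent
# symbol pair and merge every occurrence into a single new symbol.
def most_frequent_pair(tokens):
    return Counter(zip(tokens, tokens[1:])).most_common(1)[0][0]

def merge(tokens, pair):
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])  # fuse the pair
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("low lower lowest")
for _ in range(3):  # a few merge iterations
    tokens = merge(tokens, most_frequent_pair(tokens))
print(tokens)
```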
Understanding and Implementing Qwen3 from Scratch
A detailed implementation-focused look at a modern open-weight LLM family.
LoRA and DoRA from Scratch
A code-oriented explanation of parameter-efficient adaptation methods.
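For orientation, a minimal LoRA layer can look like the sketch below, assuming the common W x + (alpha / r) · B A x formulation; the names and sizes are illustrative, not the article's exact code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update."""
    def __init__(self, in_dim, out_dim, rank=4, alpha=8):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        for p in self.base.parameters():
            p.requires_grad_(False)            # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(32, 32)
print(layer(torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```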
The 4 Main Approaches to LLM Evaluation from Scratch
Benchmarks, verifiers, leaderboards, and LLM judges with concrete examples.
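As a minimal illustration of the benchmark approach (one of the four), here is multiple-choice accuracy on made-up answers; the article treats verifiers, leaderboards, and LLM judges in the same concrete style.

```python
# Benchmark-style evaluation in miniature: compare model picks against
# gold labels for multiple-choice questions. All data here is made up.
predictions = ["B", "A", "C", "B"]   # hypothetical model answers
gold        = ["B", "A", "D", "B"]

accuracy = sum(p == g for p, g in zip(predictions, gold)) / len(gold)
print(f"accuracy: {accuracy:.2f}")   # accuracy: 0.75
```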
Use this path for reinforcement learning in LLM training, inference-time scaling, distillation, and other advanced reasoning-model methods.
Understanding Reasoning LLMs
The high-level map of reasoning model strategies and where they fit.
Categories of Inference-Time Scaling
A guide to using more test-time compute to improve LLM reasoning.
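One simple category covered there is self-consistency: sample several answers and keep the majority vote. The sketch below uses a mock sampler as a stand-in for a real model call.

```python
import random
from collections import Counter

# Self-consistency in miniature: sample multiple answers, majority-vote.
# `sample_answer` is a mock stand-in for an actual LLM call.
def sample_answer(prompt: str) -> str:
    return random.choice(["42", "42", "42", "41"])  # noisy mock model

random.seed(0)
samples = [sample_answer("What is 6 * 7?") for _ in range(9)]
answer, votes = Counter(samples).most_common(1)[0]
print(answer, f"({votes}/{len(samples)} votes)")
```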
The State of Reinforcement Learning for LLM Reasoning
GRPO, reinforcement learning, and post-training ideas for reasoning models.
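For a feel of the group-relative idea in GRPO, the advantage computation alone is tiny: normalize rewards within a group of completions sampled for the same prompt. The rewards below are made up, and this is only one step of the full algorithm.

```python
import torch

# GRPO's group-relative advantage step only: rewards for a group of
# sampled completions to the same prompt, normalized within the group.
rewards = torch.tensor([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])  # e.g., verifier pass/fail
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
print(advantages)  # positive above the group mean, negative below it
```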
Inference-Time Compute Scaling Methods
A deeper look at inference-time compute methods for reasoning-optimized LLMs.
First Look at Reasoning From Scratch
A bridge from the conceptual articles into the code-first reasoning book.
These articles and tutorials are useful when you want concrete PyTorch practice, training efficiency, and implementation details.
PyTorch in One Hour
A compact tutorial for getting productive with tensors, training loops, and GPUs.
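For a preview of the level, the tutorial builds up to end-to-end loops like this toy example on synthetic data.

```python
import torch
import torch.nn.functional as F

# Tensors, a small model, a loss, and an optimizer on synthetic data.
torch.manual_seed(1)
X = torch.randn(100, 4)
y = (X.sum(dim=1) > 0).long()   # synthetic binary labels

model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2)
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(20):
    optimizer.zero_grad()
    loss = F.cross_entropy(model(X), y)
    loss.backward()
    optimizer.step()

print("final loss:", loss.item())
```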
Optimizing Memory Usage in PyTorch
Practical memory-saving methods for LLMs and vision transformers.
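One representative method from this family is activation (gradient) checkpointing, which trades recomputation for memory; a minimal sketch with toy sizes:

```python
import torch
from torch.utils.checkpoint import checkpoint

# Activations inside `block` are not stored during the forward pass and
# are recomputed during backward, trading compute for memory.
block = torch.nn.Sequential(
    torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 64)
)
x = torch.randn(4, 64, requires_grad=True)

y = checkpoint(block, x, use_reentrant=False)  # recompute in backward
y.sum().backward()
print(x.grad.shape)  # torch.Size([4, 64])
```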
Make Your PyTorch Models Train Faster
A practical tour of performance improvements without changing model accuracy.
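One speedup in this spirit (though not necessarily on the article's exact list) is torch.compile in PyTorch 2.x, which leaves the computed function unchanged:

```python
import torch

# torch.compile (PyTorch 2.x): the compiled model computes the same
# function, typically faster after the first (compilation) call.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 32), torch.nn.GELU(), torch.nn.Linear(32, 2)
)
compiled_model = torch.compile(model)

x = torch.randn(16, 32)
print(torch.allclose(model(x), compiled_model(x), atol=1e-5))  # True
```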
Mixed-Precision Techniques for LLMs
A code-backed explanation of float16, bfloat16, and training speedups.
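To see the moving parts at a glance, here is a hedged sketch of automatic mixed precision with torch.amp on a toy model; the article explains when and why this is safe.

```python
import torch
import torch.nn.functional as F

# Automatic mixed precision (recent PyTorch): low-precision forward pass
# under autocast, with loss scaling to avoid float16 gradient underflow.
use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = torch.nn.Linear(64, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.amp.GradScaler("cuda", enabled=use_cuda)

x = torch.randn(8, 64, device=device)
y = torch.randint(0, 2, (8,), device=device)

# float16 on GPU, bfloat16 on CPU
with torch.amp.autocast(device_type=device,
                        dtype=torch.float16 if use_cuda else torch.bfloat16):
    loss = F.cross_entropy(model(x), y)

scaler.scale(loss).backward()   # scaled loss -> scaled gradients
scaler.step(optimizer)          # unscales gradients, then steps
scaler.update()
```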
Finetuning Falcon LLMs with LoRA and Adapters
A practical finetuning comparison for parameter-efficient methods.
Single-GPU LLM Finetuning with Gradient Accumulation
A practical workaround for larger effective batch sizes when GPU memory is tight.
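The trick compresses to a few lines: run several small backward passes, then take one optimizer step. The model and batch sizes below are toy placeholders.

```python
import torch
import torch.nn.functional as F

# Gradient accumulation: 4 micro-batch backward passes per optimizer
# step, for an effective batch size of 4 * micro_batch.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
accum_steps = 4

optimizer.zero_grad()
for step in range(8):
    x = torch.randn(2, 16)              # micro-batch that fits in memory
    y = torch.randint(0, 2, (2,))
    loss = F.cross_entropy(model(x), y) / accum_steps  # average over micro-batches
    loss.backward()                     # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                # one update per accum_steps passes
        optimizer.zero_grad()
```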
Start here if you want broader context before jumping into longer technical walkthroughs or code-heavy implementations.
Understanding Large Language Models: A Reading List
A guided set of papers and resources for building a first mental model of LLMs.
Developing an LLM: Building, Training, Finetuning
A broad overview of the LLM development lifecycle.
New LLM Pre-training and Post-training Paradigms
A readable tour of how modern LLM training has changed.
Keeping Up With AI Research and News
A practical workflow for navigating the flood of AI papers and updates.
Getting the Most Out of a Technical Book
Study advice for turning technical reading into working knowledge.
Components of a Coding Agent
A practical entry point into tools, memory, repo context, and coding agents.
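To make the tools component concrete, a deliberately simplified dispatch might look like the sketch below; the tool set and calling convention are illustrative assumptions, not any specific framework's API.

```python
import os

# A toy "tools" registry for a coding agent: named tools that a
# model-chosen action gets dispatched to. Purely illustrative.
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def list_dir(path: str) -> str:
    return "\n".join(sorted(os.listdir(path)))

TOOLS = {"read_file": read_file, "list_dir": list_dir}

def dispatch(tool_name: str, **kwargs) -> str:
    return TOOLS[tool_name](**kwargs)   # in a real agent, args come from the model

print(dispatch("list_dir", path="."))
```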