Build gpt-oss from scratch
2025-12-19
Build Gemma from scratch
Build DeepSeek from scratch - Part 21: Mixed Precision and Fine-Grained Quantization
2025-12-18
Build DeepSeek from scratch - Part 11: Sinusoidal Positional Encoding
2025-12-17
Build DeepSeek from scratch - Part 12: Rotary Positional Encoding (RoPE)
Build DeepSeek from scratch - Part 13: Decoupling Attention and Position
Build DeepSeek from scratch - Part 14: Mixture of Experts (MoE)
Build DeepSeek from scratch - Part 15: DeepSeek's MoE Innovations
Build DeepSeek from scratch - Part 16: Understanding Mixture of Experts
Build DeepSeek from scratch - Part 17: DeepSeek's MoE Innovations
Build DeepSeek from scratch - Part 18: Multi-Token Prediction Foundation
Build DeepSeek from scratch - Part 19: DeepSeek's Multi-Token Prediction (MTP)
Build DeepSeek from scratch - Part 20: The Foundation of Quantization
Build DeepSeek from scratch - Part 22: Advanced Quantization Techniques
Build DeepSeek from scratch - Part 10: Positional Encodings (Integer and Binary)
2025-12-16
Build DeepSeek from scratch - Part 7: The KV Cache Memory Problem
Build DeepSeek from scratch - Part 8: Grouped Query Attention (GQA)
Build DeepSeek from scratch - Part 9: Multi-Head Latent Attention (MHLA)
Build DeepSeek from scratch - Part 4: Causal Attention
2025-12-15
Build DeepSeek from scratch - Part 5: Multi-Head Attention
Build DeepSeek from scratch - Part 6: The KV Cache Memory Problem
Build DeepSeek from scratch - Part 1: The Foundation
2025-12-13
Build DeepSeek from scratch - Part 2: Opening the hood of the DeepSeek engine
Build DeepSeek from scratch - Part 3: The Attention Mechanism
RAG vs. CAG: Comparing Approaches to Enhance LLM Knowledge
2025-04-19
Optimizing Large Language Models: RAG vs. Fine-Tuning vs. Prompt Engineering
2025-04-16
YOLO Object Detection: From Training to Prediction
Application Permission Systems: A Deep Dive into RBAC, ABAC, and Best Practices
2025-04-10
Deep Dive into 5 Essential Sorting Algorithms
2025-04-09
Deep Dive into LLMs like ChatGPT
2025-03-20