Featured Posts
-
DeepSeek OCR, and why I think vision eats language
Notes on DeepSeek OCR
-
My Take on GPT-5
First impressions
-
The Murmuring Woman
A Parable for LLM Thinking
-
Dropout - Review
Revisiting the foundational 2014 paper on Dropout
-
Rethinking Sequence-to-Sequence - Review
Looking back at the 2015 paper that introduced an attention-like mechanism to neural machine translation (NMT).
-
Knowledge Distillation - Review
Reviewing the elegant 2015 paper by Hinton, Vinyals, and Dean on knowledge distillation.