Posts
-
What video codecs are, how they evolved, and how to work with them
-
A brief explanation of how the attention mechanism works, as well as the quadratic scaling of attention.
-
A reflective poem about identity, the tension between dreams and reality, and the struggle between belief and self
-
FPGAs: the ultimate flex by Jon Y from Asianometry
-
C code style by Malcolm Inglis
-
Layer normalization of GPT by Andrej Karpathy
-
Understanding the Text Corpus and Training Datasets of GPT-3
-
How I understand the Decoder Transformer in Generative Text Models
-
A brief history of large language models, from bigrams to transformers