Tags → #ml-ai
16 Sept 2025: Visual Geometry Grounded Transformer - CVPR2025
3 Mar 2025: A summary of my understanding of reinforcement learning, based on the book Reinforcement Learning: An Introduction by Sutton and Barto and supplemented with the YouTube series.
25 Feb 2025: A post on how FPGA lost to NVIDIA. Not written by me.
23 Feb 2025: Trying to understand how Flash Attention works on Tenstorrent and how it compares to CUDA.
7 Sept 2024: Understanding Adaptive Layer Normalization, first introduced in the DiT paper.
31 Jul 2024: A brief explanation of how the attention mechanism works, as well as the quadratic scaling of attention.
1 May 2024: Layer normalization of GPT by Andrej Karpathy.
4 Apr 2024: Understanding the text corpus and training datasets of GPT-3.
30 Mar 2024: How I understand the Decoder Transformer in generative text models.