GPT-2 WebGL Inference Demo

Transformer Block Outputs

First Layer Attention Matrices

Last Layer Attention Matrices