TheTensorCoreProject

A microarchitecture implementation of my interpretation of NVIDIA's SIMT CUDA cores and hybrid-precision (mixed-precision) Tensor Cores, and of the systolic-array MXU in Google's TPU.

Tensor Core Versions

TensorCore v0: Volta Architecture [FP16MUL FP32ADD]

Volta Tensor Core Architecture Diagram
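
As a rough behavioral reference for the `[FP16MUL FP32ADD]` datapath, the sketch below models a matrix multiply-accumulate `D = A x B + C` in plain Python. It is not taken from this repository's RTL: the names `to_fp16`, `to_fp32`, and `mma_fp16_fp32` are hypothetical, and rounding the accumulator after every addition is an assumption that real hardware does not necessarily share.

```python
import struct

def to_fp16(x: float) -> float:
    """Round to the nearest IEEE 754 binary16 value (raises OverflowError if out of range)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mma_fp16_fp32(a, b, c):
    """Behavioral model of D = A x B + C with FP16 operands and an FP32 accumulator."""
    rows, inner, cols = len(a), len(b), len(b[0])
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = to_fp32(c[i][j])
            for k in range(inner):
                # The product of two FP16 values needs at most 22 significand
                # bits, so it is exactly representable in FP32.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                # Assumption: round to FP32 after every accumulation step.
                acc = to_fp32(acc + prod)
            d[i][j] = acc
    return d
```

A model like this can serve as a golden reference when checking simulation output, provided the rounding points are adjusted to match the actual design.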

TensorCore v1: Ampere Architecture [TF32MUL FP32ADD / BF16MUL FP32ADD] + Fine-Grained Structured Sparsity

Ampere Tensor Core Architecture Diagram
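
Fine-grained structured sparsity on Ampere follows a 2:4 pattern: at most two non-zero values in every group of four consecutive weights. The sketch below illustrates the idea in plain Python. The helper names `compress_2_4` and `sparse_dot` are hypothetical, keeping the two largest-magnitude entries is just one possible pruning rule, and the metadata layout (one position index per kept value) is a simplification rather than the exact hardware format.

```python
def compress_2_4(row):
    """Keep the 2 largest-magnitude entries in each group of 4.

    Returns (values, indices), where each index is the position (0..3)
    of the kept value inside its group, i.e. 2 bits of metadata.
    """
    assert len(row) % 4 == 0, "row length must be a multiple of 4"
    values, indices = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        for i in sorted(by_magnitude[:2]):
            values.append(group[i])
            indices.append(i)
    return values, indices

def sparse_dot(values, indices, dense):
    """Dot product of a compressed 2:4 row with a dense vector.

    The index metadata selects which dense operand pairs with each kept
    weight, so only half of the multiplications are performed.
    """
    acc = 0.0
    for n, (value, idx) in enumerate(zip(values, indices)):
        group = n // 2  # two kept values per group of four
        acc += value * dense[4 * group + idx]
    return acc
```

For a row that already satisfies the 2:4 pattern, `sparse_dot` gives the same result as the dense dot product while touching only half of the dense operands.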

TensorCore v2: Hopper Architecture [FP8(E5M2/E4M3)MUL FP16ADD]

Hopper Tensor Core Architecture Diagram
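
The two FP8 encodings differ in how they split the 7 non-sign bits: E5M2 uses 5 exponent and 2 mantissa bits (bias 15, with infinities and NaNs as in IEEE 754), while E4M3 uses 4 exponent and 3 mantissa bits (bias 7, no infinities, a single NaN mantissa pattern). The sketch below decodes both and models a dot product with FP16 accumulation, following the `[FP8 MUL FP16ADD]` interpretation above. The names `decode_fp8`, `to_fp16`, and `dot_fp8_fp16` are hypothetical, and rounding after every addition is an assumption.

```python
import math
import struct

def decode_fp8(byte: int, fmt: str) -> float:
    """Decode one FP8 byte in 'E5M2' or 'E4M3' format to a Python float."""
    if fmt == "E5M2":
        exp_bits, man_bits, bias = 5, 2, 15
    elif fmt == "E4M3":
        exp_bits, man_bits, bias = 4, 3, 7
    else:
        raise ValueError(f"unknown FP8 format: {fmt}")
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    if fmt == "E5M2" and exp == exp_max:
        return sign * math.inf if man == 0 else math.nan
    if fmt == "E4M3" and exp == exp_max and man == man_max:
        return math.nan  # E4M3 has no infinities, only this NaN pattern
    if exp == 0:
        # Subnormal: no implicit leading 1
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def to_fp16(x: float) -> float:
    """Round to the nearest binary16 value (raises OverflowError if out of range)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def dot_fp8_fp16(a_bytes, b_bytes, fmt="E4M3"):
    """Dot product of two FP8 byte vectors, accumulating in FP16.

    Assumption: the accumulator is rounded to FP16 after every addition.
    """
    acc = 0.0
    for a, b in zip(a_bytes, b_bytes):
        acc = to_fp16(acc + decode_fp8(a, fmt) * decode_fp8(b, fmt))
    return acc
```

For example, `decode_fp8(0x7E, "E4M3")` is the largest finite E4M3 value, 448.0, and `decode_fp8(0x7B, "E5M2")` is the largest finite E5M2 value, 57344.0.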