Ervin Tasnadi's blog

GPU programming & deep learning

  • Nanobenchmarking: cycle accurate benchmarking of CUDA kernels

    Dec 3
  • Proton profile location

    Jan 28
  • Memory efficient Scaled Dot Product Attention (SDPA) with Tensor Cores acceleration implemented in Vulkan

    Jan 19
  • Gradient of the attention op

    Oct 9
