Vedant Puri

PhD Candidate, Carnegie Mellon University

Efficient Transformer Architectures | Scientific Machine Learning

I design transformer architectures with explicit attention to scaling and memory efficiency. My recent work, FLARE, scales attention to million-token sequences on a single GPU. I implement new architectures directly in PyTorch and Triton. My background spans high-performance computing, numerical analysis, and computational fluid dynamics.

LinkedIn | GitHub | Google Scholar | vedantpuri@cmu.edu

Research Interests

  • Efficient attention architectures
  • Numerical methods for ML and for PDEs
  • Scientific machine learning

Previous Work: Computational fluid dynamics on HPC systems

I previously worked on turbulence simulation and analysis workflows in high-performance computing settings, with emphasis on spectral element methods and large-scale post-processing. This background in numerical methods and PDE solvers informs how I design stable and efficient transformer architectures for scientific ML.

Velocity magnitude for flow past a wall-mounted cube at Reynolds number 3900, based on cube height. Computation performed using the spectral element code NEK5000 at the Argonne Leadership Computing Facility.

Not Work

Not So Up-to-Date Photography Portfolio

For the past decade, I have used a Canon DSLR as an excuse to walk around and photograph people, geometry, and city texture.

Open portfolio page | Flickr

Hobbies

  • Sports: squash, golf, CrossFit

Open Source

FLARE | GitHub

Fast Low-rank Attention Routing Engine for scalable transformer attention.

mlutils.py | GitHub

Lightweight PyTorch project template and utility toolkit for ML experiments.

Julia Open Source Tools

Below is a nonexhaustive list of Julia projects that I have contributed to.

SciMLOperators.jl

Operator abstractions for SciML and PDE workflows.

LinearSolve.jl

Linear solver interface for scientific machine learning.

KolmogorovArnold.jl | GitHub

Julia implementation of Kolmogorov-Arnold Networks with custom gradients for faster training.

FastDiffusion.py | GitHub

Experiments with a trigonometric noise schedule for few-step diffusion.

NekTools | GitHub

FORTRAN 77 utilities for turbulence statistics and post-processing in NEK5000.