Working on reinforcement learning and generative models.
Zero-shot off-policy learning.
Your latent reasoning is secretly policy improvement operator.
Rethinking optimal transport in offline reinforcement learning.
Neural optimal transport with general cost functionals.
A minimalist approach for domain adaptation with optimal transport.
Exploring and exploiting conditioning of reinforcement learning agents.
See all my papers here.
Applications: Designed neural computer architectures that generate molecules with favorable pharmacokinetics. These models were later validated in real-world applications.
Developed a CowSwap solver and built various AI layers to optimize token swaps, asset bridging, and other decentralized finance operations.
Teaching: Created and taught several courses on reinforcement learning and deep generative models, covering topics from GANs to flow matching.
Open to discussing new ideas and potential collaborations. Feel free to contact me.
Social: @machinestein
Last updated: Feb 20, 2026