Avner May

Staff Research Scientist at together.ai
CV

About me

I am a staff research scientist at together.ai. Prior to joining Together, I was a research scientist in Google's speech recognition group (2020-2023), and a postdoctoral scholar working in Prof. Chris Ré's group at Stanford University (2018-2020). I completed my PhD in Computer Science at Columbia University in December 2017, advised by Prof. Michael Collins. Prior to my PhD, I worked for two years as a software development engineer at Microsoft, living in Seattle, WA. I graduated in 2009 from Harvard College, where I majored in Mathematics, with a minor in Computer Science. I am originally from Potomac, MD.

Research Interests

My research interests center around designing simpler, better understood, and more efficient, machine learning models. For example, during my PhD, I showed that kernel approximation methods can perform comparably to fully-connected deep neural networks on the challenging non-linear classification problems in speech recognition systems. More recently, I have worked on better understanding what makes an approximate feature representation perform well on downstream tasks, both in the context of kernel approximation methods and word embedding compression. This understanding is important for efficiently selecting among existing feature approximations or designing new ones, and for navigating the trade-offs between computation, memory, and downstream performance.

Prior to working on machine learning, I did two years of research in social network analysis, advised by Prof. Augustin Chaintreau; I studied whether social networks like Facebook or Twitter are efficient systems for delivering content of interest to their users (2011-2013).

Other Interests

I love most things that involve being active and outdoors — running, biking, snowboarding, hiking, camping, and basically anything in the mountains. During the summer of 2017 I spent 2.5 months on the Pacific Crest Trail. I am very interested in food systems and nutrition, and how they affect our health, the environment, and the well-being of animals.

Publications

Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
A. Narayan*, D. Biderman*, S. Eyuboglu*, A. May, S. Linderman, J. Zou, C. Ré
ICML 2025
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
R. Sadhukhan*, J. Chen*, Z. Chen, V. Tiwari, R. Lai, J. Shi, I. Yen, A. May, T. Chen, B. Chen
ICLR 2025
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
J. Wang*, D. Paliotta*, A. May, A. Rush, T. Dao
NeurIPS 2024
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
R. Svirschevski*, A. May*, Z. Chen*, B. Chen, Z. Jia, M. Ryabinin
NeurIPS 2024
Sequoia: Scalable, Robust, and Hardware-Aware Speculative Decoding
Z. Chen*, A. May*, R. Svirschevski*, Y. Huang, M. Ryabinin, Z. Jia, B. Chen
NeurIPS 2024
Audio-Visual Fine-tuning of Audio-Only ASR Models
A. May, D. Serdyuk, A. Shah, O. Braga, O. Siohan
ArXiv 2023
Contextual Embeddings: When are they worth it?
S. Arora*, A. May*, J. Zhang, C. Ré
ACL 2020
Understanding the Downstream Instability of Word Embeddings
M. Leszczynski, A. May, J. Zhang, S. Wu, C. Aberger, C. Ré
MLSys 2020
On the Downstream Performance of Compressed Word Embeddings
A. May, J. Zhang, T. Dao, C. Ré
NeurIPS 2019 (Spotlight, 3% acceptance) [slides] [video]
Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation
J. Zhang*, A. May*, T. Dao, C. Ré
AISTATS 2019
Kernel Approximation Methods for Speech Recognition
A. May, A.B. Garakani, Z. Lu, D. Guo, K. Liu, A. Bellet, L. Fan, M. Collins, D. Hsu, B. Kingsbury, M. Picheny, F. Sha
JMLR 2019 (arXiv 2017)
Kernel Approximation Methods for Speech Recognition
Avner May
PhD Thesis, 2017 [slides]
Compact Kernel Models for Acoustic Modeling via Random Feature Selection
A. May, M. Collins, D. Hsu, B. Kingsbury
ICASSP 2016
A Comparison Between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition
Z. Lu, D. Guo, A.B. Garakani, K. Liu, A. May, A. Bellet, L. Fan, M. Collins, B. Kingsbury, M. Picheny, F. Sha
ICASSP 2016
How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets
Z. Lu*, A. May*, K. Lu, A. Garakani, D. Guo, A. Bellet, L. Fan, F. Sha, M. Collins, B. Kingsbury
arXiv 2014
Filter & Follow: How Social Media Foster Content Curation
A. May, A. Chaintreau, N. Korula, S. Lattanzi
SIGMETRICS 2014

* Equal contribution.

Internships

Summer 2015: Google Research, New York, NY.
Summer 2014: Microsoft Research, Redmond, WA.

Community Service

I have been a reviewer for ICLR 2018, 2022-2024, ICML 2017-2020, 2022 (2019 Top Reviewer), NeurIPS 2017-2019, 2022-2023, IJCAI 2019-2020 (2019 Distinguished PC member), AAAI 2020, 2022, ICASSP 2023 (Outstanding Reviewer), EMNLP 2022, JMLR, IEEE Transactions on Multimedia.