publications
2025
- Universal Approximation with Softmax Attention. 2025. *Equal Contribution. Under review at ICLR 2026.
- Transformers versus the EM Algorithm in Multi-class Clustering. arXiv preprint arXiv:2502.06007, 2025.
- Learning spectral methods by transformers. arXiv preprint arXiv:2501.01312, 2025. Under review at Journal of the American Statistical Association.
- Transformers Simulate MLE for Sequence Generation in Bayesian Networks. arXiv preprint arXiv:2501.02547, 2025. Under review at Journal of the American Statistical Association.