2025

Universal Approximation with Softmax Attention
Jerry Yao-Chieh Hu*, Hude Liu*, Hong-Yu Chen*, Weimin Wu, and Han Liu
2025
*Equal Contribution

Transformers versus the EM Algorithm in Multi-class Clustering
Yihan He, Hong-Yu Chen, Yuan Cao, Jianqing Fan, and Han Liu
arXiv preprint arXiv:2502.06007, 2025

Learning spectral methods by transformers
Yihan He, Yuan Cao, Hong-Yu Chen, Dennis Wu, Jianqing Fan, and Han Liu
arXiv preprint arXiv:2501.01312, 2025

Transformers Simulate MLE for Sequence Generation in Bayesian Networks
Yuan Cao, Yihan He, Dennis Wu, Hong-Yu Chen, Jianqing Fan, and Han Liu
arXiv preprint arXiv:2501.02547, 2025

2024

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo, Hong-Yu Chen, Weijian Li, Wei-Po Wang, and Han Liu
In Proceedings of the 41st International Conference on Machine Learning, 21–27 Jul 2024