InterSketch: Interleaved Visual–Textual Chain-of-Thought with Tool-Generated Sketches and Reinforcement Learning
Under review at International Conference on Machine Learning (ICML), 2025
An interleaved multimodal reasoning framework that leverages tool-generated sketches to support long-horizon visual-text Chain-of-Thought reasoning.
Recommended citation: Z. Ning, W. Tong, X. Kong, S. Ma, Z. Shang, J. Ni, T. Hu, Y. X. Chng, J. Ying, Z. Wu, J. Yang, W. Liu, H. Deng, L. Lu. "InterSketch: Interleaved Visual–Textual Chain-of-Thought with Tool-Generated Sketches and Reinforcement Learning." ICML 2026, under review.
