Selected Publications
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025.
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
International Journal of Computer Vision (IJCV), 2025.
[280+ Star]
GitHub Repo PDF
Can GRPO Boost Complex Multimodal Table Reasoning?
The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025.
PDF
MedMAP: Promoting Incomplete Multi-modal Brain Tumor Segmentation with Alignment
IEEE Journal of Biomedical and Health Informatics (JBHI), 2025.
PDF
DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model
The 18th ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2025.
PDF
Towards Training-Free Open-World Classification with 3D Generative Models
The 33rd ACM International Conference on Multimedia (ACM MM), 2025.
PDF
KDTalker++: Controllable Talking Portrait Generation with Audio, Text, and Expression Editing
The 33rd ACM International Conference on Multimedia - Demo Track (ACM MM), 2025.
GitHub Repo
Towards a Universal 3D Medical Multi-modality Generalization via Learning Personalized Invariant Representation
The International Conference on Computer Vision (ICCV), 2025.
PDF
SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025.
PDF
KMD: Koopman Multi-modality Decomposition for Generalized Brain Tumor Segmentation under Incomplete Modalities
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025.
PO3AD: Predicting Point Offsets toward Better 3D Point Cloud Anomaly Detection
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025.
PDF
BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025.
PDF
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
The Thirteenth International Conference on Learning Representations (ICLR), 2025.
PDF
ZeroDiff: Solidified Visual-semantic Correlation in Zero-Shot Learning
The Thirteenth International Conference on Learning Representations (ICLR), 2025.
PDF
Disentangling Tabular Data towards Better One-Class Anomaly Detection
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
PDF
GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
PDF
Towards Better Robustness Against Natural Corruptions in Document Tampering Localization
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
PDF
Template-Driven LLM-Paraphrased Framework for Tabular Math Word Problem Generation
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), 2025.
PDF
Covariance-based Space Regularization for Few-shot Class Incremental Learning
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025.
PDF
Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification
Neural Information Processing Systems (NeurIPS), 2024.
[Spotlight]
PDF
ES-GNN: Generalizing Graph Neural Networks Beyond Homophily with Edge Splitting
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.
DOI PDF
Latest Breaking Work
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
International Journal of Computer Vision (IJCV), 2025.
[280+ Star]
GitHub Repo PDF