This review systematically analyzes recent research on multimodal fusion for medical image classification, focusing on the fusion strategies in use and their effectiveness in classification tasks. The studies surveyed indicate that multimodal fusion methods significantly enhance classification performance and show potential for clinical decision support. However, challenges remain, including insufficient dataset sharing, limited use of text modalities, and inadequate integration of fusion strategies with medical knowledge. Future work should focus on developing large-scale public datasets and optimizing deep fusion strategies for image and text modalities, so that these methods can be applied more broadly in medical settings.