Abstract: Video question answering (VideoQA), a critical task in vision-language understanding and reasoning, encounters significant challenges in integrating visual concepts for compositional ...
Abstract: Medical visual question answering (Med-VQA) is a crucial multimodal task in clinical decision support and telemedicine. Recent methods fail to fully leverage domain-specific medical ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results