Machine Learning with Audio Python

Object-Aware Image Augmentation for Audio-Visual Zero-Shot Learning

Abstract: Audio-visual zero-shot learning (ZSL) leverages both video and audio information for model training, aiming to classify new video categories that were not seen during training. However, ...

GitHub

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning

Abstract: Affective Video Facial Analysis (AVFA) is important for advancing emotion-aware AI, yet the persistent data scarcity in AVFA presents challenges. Recently, the self-supervised learning (SSL) ...

IEEE

Text-Based Audio Retrieval by Learning From Similarities Between Audio Captions

Abstract: This letter proposes to use similarities of audio captions for estimating audio-caption relevances to be used for training text-based audio retrieval systems. Current audio-caption datasets ...
