Research papers and academic contributions in computer vision, machine learning, and natural language processing.
The SAFE Challenge evaluates synthetic speech detection across three tasks: unmodified audio, processed audio with compression artifacts, and laundered audio designed to evade detection. We systematically explore self-supervised learning (SSL) front-ends, training data compositions, and audio length configurations for robust deepfake detection. Our AASIST-based approach incorporates WavLM Large with RawBoost augmentation, trained on a multilingual dataset of 256,600 samples spanning 9 languages and over 70 TTS systems from CodecFake, MLAAD v5, SpoofCeleb, Famous Figures, and MAILABS. Through extensive experimentation with different SSL front-ends, three training data versions, and two audio lengths, we achieved second place in both Task 1 (unmodified audio detection) and Task 3 (laundered audio detection), demonstrating strong generalization and robustness.
This paper introduces a speaker-specific framework for detecting audio deepfakes. By combining self-supervised learning embeddings with a one-class SVM trained only on genuine speech, the method reliably identifies synthetic voices. Evaluations on benchmark and real-world datasets show strong performance across diverse spoofing techniques, making it a practical solution for safeguarding individuals, such as political figures, against audio impersonation.
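The one-class idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the random vectors below are stand-ins for utterance-level SSL embeddings, and the embedding size, the `nu` value, and the RBF kernel are all assumptions chosen for illustration. It uses scikit-learn's `OneClassSVM`, fitted only on the target speaker's genuine samples.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
DIM = 8  # hypothetical embedding size; real SSL embeddings are much larger

# Synthetic stand-ins for SSL embeddings (assumption: in practice these
# would come from a pretrained SSL model applied to each utterance).
genuine_train = rng.normal(0.0, 1.0, size=(200, DIM))  # target speaker, enrollment
genuine_test = rng.normal(0.0, 1.0, size=(50, DIM))    # target speaker, held out
spoof_test = rng.normal(4.0, 1.0, size=(50, DIM))      # simulated synthetic speech

# Fit the scaler and the one-class SVM on genuine speech only.
scaler = StandardScaler().fit(genuine_train)
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
detector.fit(scaler.transform(genuine_train))

# predict() returns +1 for inliers (accepted as genuine) and -1 for outliers.
genuine_pred = detector.predict(scaler.transform(genuine_test))
spoof_pred = detector.predict(scaler.transform(spoof_test))

genuine_accept_rate = float(np.mean(genuine_pred == 1))
spoof_reject_rate = float(np.mean(spoof_pred == -1))
print(f"genuine accepted: {genuine_accept_rate:.2f}")
print(f"spoof rejected:   {spoof_reject_rate:.2f}")
```

Because no spoofed audio is needed at training time, a detector of this kind does not depend on which spoofing systems happen to be available. Where a tunable operating point is needed, `detector.decision_function` gives a continuous score that can be thresholded instead of using the hard `predict` labels.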
This paper investigates how surface texture influences the performance of the μTesla rotor version 3. By varying the amplitude and frequency of sinusoidal textures on the rotor surfaces, the authors demonstrate that the boundary layer flow and pump output can be effectively controlled, as confirmed through simulations and experimental measurements.