Abstract: In this study, we explore the use of Vector Quantized Variational Autoencoders (VQ-VAE) for real-time audio spectrogram inpainting, with a focus on minimizing environmental impact. We ...
Abstract: Deep learning-based object detectors have become increasingly critical in spectrogram-based wideband multi-signal detection, recognition, and time-frequency localization. Current methods ...
This study proposes a novel heterogeneous stacking ensemble learning model for the fusion of phonocardiogram (PCG) spectrogram texture and deep features to detect heart failure with preserved ejection ...
Diffusion Speech is a diffusion-based text-to-speech model with a simple synthesis pipeline. A diffusion transformer (DiT) first predicts the duration of each phoneme. Then we ...
The development of machine learning for cardiac care is severely hampered by privacy restrictions on sharing real patient electrocardiogram (ECG) data. Although generative AI offers a promising ...