asr, nlp, research
From 34% to 28% WER: lessons from code-switching ASR
What I learned building a Whisper + LLaMA speech recognizer for Malay–English code-switching — where the WER actually came from, and what didn't help.
2 posts
Why combining CNNs, InceptionNeXt, and a Vision Transformer beat any one of them alone for video deepfake detection, and why cross-dataset generalization is the metric that matters.