AudioRAG Demo - Semantic Audio Search

This demo builds on the work from the ColQwen team, expanding retrieval capabilities beyond images to include audio and video.

Unlike traditional pipelines that first transcribe audio to text, this model searches the raw audio directly. It captures semantic meaning in sound, speech, and audio patterns, making "AudioRAG" a real possibility.
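As a rough illustration of how retrieval over audio chunks could work, the sketch below ranks chunks against a query with a late-interaction (MaxSim) score, the scoring style used by ColQwen-family retrievers. It is an assumption that this demo scores chunks this way. The function names `maxsim_score` and `rank_chunks` are hypothetical, and the embeddings are plain NumPy arrays standing in for whatever the model actually produces.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, chunk_vecs: np.ndarray) -> float:
    """Late-interaction score between one query and one audio chunk.

    query_vecs: (num_query_tokens, dim) array of query embeddings.
    chunk_vecs: (num_chunk_tokens, dim) array of chunk embeddings.
    Each query vector is matched to its most similar chunk vector
    (cosine similarity), and the per-token maxima are summed.
    """
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    similarity = q @ c.T  # shape: (num_query_tokens, num_chunk_tokens)
    return float(similarity.max(axis=1).sum())


def rank_chunks(query_vecs, chunk_vecs_list, top_k=3):
    """Return (chunk_index, score) pairs for the top_k best-scoring chunks."""
    scores = [maxsim_score(query_vecs, c) for c in chunk_vecs_list]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]


if __name__ == "__main__":
    # Placeholder embeddings; a real system would get these from the model.
    rng = np.random.default_rng(0)
    query = rng.normal(size=(4, 8))
    chunks = [rng.normal(size=(20, 8)) for _ in range(5)]
    for index, score in rank_chunks(query, chunks):
        print(f"chunk {index}: score {score:.3f}")
```

The top-ranked chunks are the ones that would be played back or passed on to a text generator.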

📖 Blog post | 🤗 Model on Hugging Face | 📓 Colab Notebook | 🎙️ Sample from Newsroom Robots


Generate a textual answer from the retrieved audio chunks (requires an OpenAI API key)

Examples
Upload Audio File | Search Query | Chunk Length (seconds)
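The Chunk Length (seconds) input suggests that the uploaded audio is split into fixed-length segments before it is indexed. A minimal sketch of that step, assuming a mono waveform held in a NumPy array and non-overlapping chunks (the demo's actual preprocessing is not shown here, and `split_into_chunks` is a hypothetical helper):

```python
import numpy as np


def split_into_chunks(waveform: np.ndarray, sample_rate: int, chunk_seconds: float):
    """Split a mono waveform into consecutive, non-overlapping chunks.

    Returns a list of (start_time_seconds, samples) pairs. The final
    chunk may be shorter than chunk_seconds.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    samples_per_chunk = int(sample_rate * chunk_seconds)
    if samples_per_chunk == 0:
        raise ValueError("chunk_seconds is too small for this sample rate")
    chunks = []
    for start in range(0, len(waveform), samples_per_chunk):
        segment = waveform[start:start + samples_per_chunk]
        chunks.append((start / sample_rate, segment))
    return chunks


if __name__ == "__main__":
    sample_rate = 16_000  # assumed sample rate for illustration
    audio = np.zeros(25 * sample_rate, dtype=np.float32)  # 25 s of silence
    for start_time, segment in split_into_chunks(audio, sample_rate, 10):
        print(f"chunk at {start_time:.0f}s, {len(segment) / sample_rate:.0f}s long")
```

Keeping the start time with each chunk makes it possible to point back to the right position in the original recording when a chunk is retrieved.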