AudioRAG Demo - Semantic Audio Search
This demo builds on the work from the ColQwen team, expanding retrieval capabilities beyond images to include audio and video.
Unlike traditional methods, this model searches directly through raw audio without converting it to text. It understands semantic meaning in sound, speech, and audio patterns, making "AudioRAG" a real possibility.
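As a rough illustration of searching audio chunks without transcription, the sketch below splits a waveform into fixed-length chunks and ranks them against a query with a ColBERT-style late-interaction (MaxSim) score. It assumes the model produces multi-vector embeddings, as the ColQwen family does for images; the demo's actual scoring may differ. The function names `chunk_audio`, `maxsim_score`, and `rank_chunks` are hypothetical, and the embeddings are taken as plain NumPy arrays that the model would have to supply.

```python
import numpy as np


def chunk_audio(samples, sample_rate, chunk_seconds):
    """Split a 1-D waveform into consecutive chunks of chunk_seconds each.

    The final chunk may be shorter than the others.
    """
    step = int(sample_rate * chunk_seconds)
    return [samples[i:i + step] for i in range(0, len(samples), step)]


def maxsim_score(query_vecs, chunk_vecs):
    """Late-interaction score (assumed, not confirmed for this demo).

    For each query vector, take its best dot product over all chunk
    vectors, then sum those maxima.
    """
    sims = query_vecs @ chunk_vecs.T  # shape: (n_query_vecs, n_chunk_vecs)
    return float(sims.max(axis=1).sum())


def rank_chunks(query_vecs, chunk_vecs_list, top_k=3):
    """Return (chunk_index, score) pairs for the top_k best-scoring chunks."""
    scores = [maxsim_score(query_vecs, c) for c in chunk_vecs_list]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]
```

In this sketch the chunk length plays the same role as the demo's chunk-length setting: shorter chunks give finer-grained hits, at the cost of more embeddings to score.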
Blog post | Model on Hugging Face | Colab Notebook | Sample from Newsroom Robots
Optionally, provide an OpenAI API key to generate a textual answer from the retrieved audio chunks.
Examples
Upload Audio File | Search Query | Chunk Length (seconds)
---|---|---