Whisper/VAD Multi-Model Segmentation
The session explains how merging Whisper, WebRTC‑VAD, and other public models creates a multi‑model pipeline that improves audio segmentation without extra preprocessing.
This work in progress is an audio processing pipeline: the audio is segmented and transcribed, and linguistic features are then extracted from the transcript. Using publicly available pre-trained AI models has produced high-quality results so far. The next development stage will focus on finer segmentation of the audio, driven by the outputs of these public models.
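One way to picture the planned finer segmentation is to intersect each transcribed segment with the speech regions reported by a voice activity detector, then compute simple text features over the voiced time only. The sketch below is illustrative and is not the presenter's implementation. It assumes segments carry `start`, `end`, and `text` fields (the shape Whisper transcription segments use) and that VAD frame decisions have already been merged into `(start, end)` regions in seconds. The names `Segment`, `voiced_spans`, and `text_features` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Segment:
    """A transcribed segment; times are in seconds (assumed shape)."""
    start: float
    end: float
    text: str


def voiced_spans(seg: Segment,
                 speech_regions: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Clip the VAD speech regions to the segment's time range.

    Returns the sub-spans of the segment that the VAD marked as speech.
    """
    spans = []
    for r_start, r_end in speech_regions:
        start = max(seg.start, r_start)
        end = min(seg.end, r_end)
        if end > start:  # keep only regions that actually overlap the segment
            spans.append((start, end))
    return spans


def text_features(seg: Segment,
                  spans: List[Tuple[float, float]]) -> Dict[str, float]:
    """Compute a few simple linguistic features for one segment."""
    words = seg.text.split()
    voiced = sum(end - start for start, end in spans)
    return {
        "n_words": len(words),
        "voiced_seconds": voiced,
        # Speaking rate over voiced time only; 0.0 if no speech was detected.
        "words_per_second": len(words) / voiced if voiced > 0 else 0.0,
    }


# Hypothetical inputs: one transcribed segment and two VAD speech regions.
segment = Segment(start=0.0, end=5.0, text="hello there how are you")
regions = [(0.5, 2.0), (3.0, 6.0)]

spans = voiced_spans(segment, regions)
print(spans)  # [(0.5, 2.0), (3.0, 5.0)]
print(text_features(segment, spans))
```

Keeping the two models' outputs as plain time intervals means either model can be swapped out without touching the feature code, at the cost of ignoring word-level timing inside each segment.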
Helps businesses transform using data science, digital strategy, and analytics expertise.