Speaker diarization inaccuracies

Jeremy Hadfield

The speaker diarization model in our Expression Measurement API is sometimes inaccurate. Specifically:

In some recordings, the speaker is always labeled unknown despite being clearly audible
Male and female voices together work well, but two female or two male voices are detected as a single speaker

We are not prioritizing this right now because of other work, but please upvote or comment if you're experiencing similar issues!

Nathan Guenther

This is a fairly significant hurdle for me at the moment.

The workaround, then, is to use a separate diarization service and stitch its output together with the emotion predictions using timestamps.
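
For anyone attempting the same workaround, here is a rough sketch of the stitching step. It assumes both services return segments with start and end times in seconds on the same clock. The `SpeakerTurn` and `EmotionSegment` types, their field names, and `assign_speakers` are hypothetical and not part of any API; you would need to map each service's real response format onto them. Each emotion segment gets the speaker whose turn overlaps it the most, and segments with no overlap stay unlabeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeakerTurn:
    """One turn from the external diarization service (hypothetical shape)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class EmotionSegment:
    """One timed emotion prediction (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    emotions: dict[str, float]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    turns: list[SpeakerTurn], segments: list[EmotionSegment]
) -> list[tuple[Optional[str], EmotionSegment]]:
    """Pair each emotion segment with the speaker whose turn overlaps it most.

    Segments that overlap no turn are paired with None.
    """
    labeled = []
    for seg in segments:
        best_speaker: Optional[str] = None
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg))
    return labeled


if __name__ == "__main__":
    # Made-up values, for illustration only.
    turns = [SpeakerTurn("A", 0.0, 2.5), SpeakerTurn("B", 2.5, 5.0)]
    segments = [
        EmotionSegment(0.2, 1.8, {"Joy": 0.7}),
        EmotionSegment(2.0, 4.0, {"Calmness": 0.6}),
    ]
    for speaker, seg in assign_speakers(turns, segments):
        print(speaker, seg.start, seg.end, seg.emotions)
```

This is a quadratic scan, which is fine for short recordings. For long files, sorting both lists by start time and walking them together would be faster. It also does nothing about overlapping speech, where a segment may legitimately belong to two speakers.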