GSLM Metrics

ASR Metrics

The metrics in this suite use an ASR model to transcribe the synthesized speech into text, and then apply text-based metrics to the transcripts. The word error rate (WER) of the ASR transcription is itself one of the metrics.
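As a rough illustration of the word error rate mentioned above, the sketch below computes WER for a single utterance as the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name `word_error_rate` and whitespace tokenization are assumptions made for this example; it is not the scoring script used by this suite, which may normalize text and aggregate over a corpus differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, this value can exceed 1.0 when the hypothesis is much longer than the reference.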

ABX Metrics

We use the ABX metric to evaluate how well the quantized representations separate phonetic categories.
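The idea behind an ABX test can be sketched as follows: given A and X from the same category and B from a different one, the representation makes an error when X is not closer to A than to B. The example below is a simplified sketch under stated assumptions: each item is a single fixed-length, non-zero vector compared with cosine distance, and the names `cosine_distance` and `abx_error_rate` are hypothetical. Actual ABX evaluations typically compare variable-length frame sequences and aggregate over speakers and contexts, which this sketch does not attempt.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def abx_error_rate(category_a, category_b):
    """Fraction of (A, B, X) triplets where X is not closer to A than to B.

    A and X are distinct items from category_a; B comes from category_b.
    Ties count as half an error. Hypothetical helper for illustration.
    """
    errors = 0.0
    total = 0
    for i, a in enumerate(category_a):
        for k, x in enumerate(category_a):
            if i == k:
                continue  # A and X must be different items
            for b in category_b:
                d_ax = cosine_distance(a, x)
                d_bx = cosine_distance(b, x)
                if d_ax > d_bx:
                    errors += 1.0
                elif d_ax == d_bx:
                    errors += 0.5
                total += 1
    if total == 0:
        raise ValueError("need at least two items in category_a and one in category_b")
    return errors / total


if __name__ == "__main__":
    # Toy vectors: the two categories point in clearly different directions.
    same = [[1.0, 0.0], [0.9, 0.1]]
    other = [[0.0, 1.0]]
    print(abx_error_rate(same, other))
```

A lower error rate means the categories are better separated; a value near 0.5 corresponds to chance-level discrimination.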

sWUGGY and sBLIMP

We refer to the ZeroSpeech challenge for details on the sWUGGY and sBLIMP metrics.