Spaces: argmaxinc/whisperkit-benchmarks

ardaatahan committed on Oct 25, 2024

Commit

1543414

0 Parent(s):

initial commit

Files changed (23)

.gitignore +58 -0
.pre-commit-config.yaml +18 -0
Makefile +12 -0
README.md +85 -0
constants.py +254 -0
dashboard_data/config.json +136 -0
dashboard_data/device_map.json +14 -0
dashboard_data/diff_checker_data.json +0 -0
dashboard_data/multilingual_confusion_matrices.json +0 -0
dashboard_data/multilingual_results.csv +17 -0
dashboard_data/performance_data.json +0 -0
dashboard_data/quality_data.json +23 -0
dashboard_data/support_data.csv +23 -0
main.py +1302 -0
multilingual_generate.py +132 -0
performance_generate.py +465 -0
quality_generate.py +186 -0
requirements.txt +122 -0
static/Zwizz-Medium.woff +0 -0
static/Zwizz-Regular.woff +0 -0
static/Zwizz-SemiBold.woff +0 -0
text_normalizer.py +2374 -0
utils.py +991 -0

.gitignore ADDED

	@@ -0,0 +1,58 @@

+# OS generated files
+.DS_Store
+Thumbs.db
+# Environment files
+*.env
+.env
+# Python virtual environment
+venv/
+env/
+*.pyc
+__pycache__/
+# Hugging Face related
+.huggingface
+# Project specific
+argmaxinc/
+table_data.json
+# Jupyter Notebook
+.ipynb_checkpoints
+# PyCharm
+.idea/
+# VS Code
+.vscode/
+# Gradio temporary files
+gradio_cached_examples/
+# Logs
+*.log
+# Dependency directories
+node_modules/
+# Distribution / packaging
+dist/
+build/
+*.egg-info/
+# Temporary files
+*.tmp
+*.bak
+*.swp
+# Dataset files (if you don't want to track them)
+*.jsonl
+# Model files (if you don't want to track them)
+*.pth
+*.h5
+*.ckpt
+.gradio/

.pre-commit-config.yaml ADDED

	@@ -0,0 +1,18 @@

+repos:
+  - repo: https://github.com/pycqa/isort
+    rev: 5.13.2
+    hooks:
+      - id: isort
+        args: ["--profile", "black"]
+  - repo: https://github.com/psf/black
+    rev: 23.3.0
+    hooks:
+      - id: black
+        name: black
+        language: python
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.5.0
+    hooks:
+      - id: end-of-file-fixer

Makefile ADDED

	@@ -0,0 +1,12 @@

+.PHONY: format use-huggingface-data use-local-data
+format:
+	@pre-commit run --all-files
+use-huggingface-data:
+	@python multilingual_generate.py download
+	@python performance_generate.py download
+	@python quality_generate.py
+use-local-data:
+	@python performance_generate.py

README.md ADDED

	@@ -0,0 +1,85 @@

+---
+title: WhisperKit Benchmarks
+emoji: 🏆
+colorFrom: green
+colorTo: indigo
+sdk: gradio
+app_file: main.py
+license: mit
+---
+## Prerequisites
+Ensure you have the following software installed:
+- Python 3.10 or higher
+- pip (Python package installer)
+## Installation
+1. **Clone the repository**:
+   ```sh
+   git clone https://github.com/argmaxinc/model-performance-dashboard.git
+   cd model-performance-dashboard
+   ```
+2. **Create a virtual environment**:
+   ```sh
+   python -m venv venv
+   source venv/bin/activate
+   ```
+3. **Install required packages**:
+   ```sh
+   pip install -r requirements.txt
+   ```
+## Usage
+1. **Run the application**:
+   ```sh
+   gradio main.py
+   ```
+2. **Access the application**:
+   After you run the command above, a local server starts and prints an interface URL in the terminal. Open the URL in your web browser to interact with the Argmax Benchmark dashboard.
+## Data Generation
+The data generation process involves three main scripts: performance_generate.py, multilingual_generate.py, and quality_generate.py. Each script is responsible for updating a specific aspect of the benchmark data.
+1. **Performance Data Update (performance_generate.py)**:
+   - Downloads benchmark data from [WhisperKit Evals Dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset).
+   - Processes the data to extract performance metrics for various models, devices, and operating systems.
+   - Calculates metrics such as speed and tokens per second for both long- and short-form data.
+   - Saves the results in `performance_data.json` and `support_data.csv`.
+2. **Multilingual Data Update (multilingual_generate.py)**:
+   - Downloads multilingual evaluation data from [WhisperKit Multilingual Evals Dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual).
+   - Processes the data to generate confusion matrices for language detection.
+   - Calculates metrics for both forced and unforced language detection scenarios.
+   - Saves the results in `multilingual_confusion_matrices.json` and `multilingual_results.csv`.
+3. **Quality Data Update (quality_generate.py)**:
+   - Downloads quality evaluation data from [WhisperKit Evals](https://huggingface.co/datasets/argmaxinc/whisperkit-evals).
+   - Processes the data to calculate Word Error Rate (WER) and Quality of Inference (QoI) metrics for each dataset.
+   - Saves the results in `quality_data.json`.
+## Data Update
+To update the dashboard with the latest data from our Hugging Face datasets, run:
+```sh
+make use-huggingface-data
+```
+Alternatively, you can use our on-device testing code [TODO:INSERT_LINK_TO_OS_TEST_CODE] on your device to update the dashboard with your own data. After generating the Xcode data, place the resulting `.json` files in the `whisperkit-evals/xcresults/benchmark_data` directory, then run:
+```sh
+make use-local-data
+```
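
The README says `performance_generate.py` calculates speed and tokens per second. The following is a minimal sketch of those two calculations, assuming the definitions given in `METHODOLOGY_TEXT` in `constants.py` (audio length divided by end-to-end latency, and text decoder forward passes divided by end-to-end time). The function names and the sample numbers are hypothetical and are not taken from the repository.

```python
def speed_factor(audio_seconds: float, latency_seconds: float) -> float:
    """Seconds of input audio transcribed per second of wall-clock time."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return audio_seconds / latency_seconds


def tokens_per_second(decoder_passes: int, latency_seconds: float) -> float:
    """Text decoder forward passes divided by end-to-end processing time."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return decoder_passes / latency_seconds


# Hypothetical run: a 600 s clip transcribed in 24 s with 1800 decoder passes.
speed = speed_factor(600.0, 24.0)
tok_s = tokens_per_second(1800, 24.0)
print(f"speed factor: {speed:.1f}x, tok/s: {tok_s:.1f}")
```

Because both metrics share the end-to-end latency as the denominator, tok/s depends on how much speech the clip contains, which matches the note in the methodology that it varies with the input data.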

constants.py ADDED

	@@ -0,0 +1,254 @@

+from textwrap import dedent
+from iso639 import Lang
+BANNER_TEXT = """
+<div style="text-align: center;">
+    <h1><a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit Benchmarks</a></h1>
+</div>
+"""
+INTRO_LABEL = """We present comprehensive benchmarks for WhisperKit, our on-device ASR solution, compared against a reference implementation. These benchmarks aim to help developers and enterprises make informed decisions when choosing optimized or compressed variants of machine learning models for production use. Show more."""
+INTRO_TEXT = """
+<h3 style="display: flex;
+  justify-content: center;
+  align-items: center;
+"></h3>
+\n📈 Key Metrics:
+Word Error Rate (WER) (⬇️): The percentage of words incorrectly transcribed. Lower is better.
+Quality of Inference (QoI) (⬆️): Percentage of examples where WhisperKit performs no worse than the reference model. Higher is better.
+Tokens per Second (⬆️): The number of output tokens generated per second. Higher is better.
+Speed (⬆️): Input audio seconds transcribed per second. Higher is better.
+🎯 WhisperKit is evaluated across different datasets, with a focus on per-example no-regressions (QoI) and overall accuracy (WER).
+\n💻 Our benchmarks include:
+Reference: <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> (OpenAI's Whisper API)
+On-device: <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> (various versions and optimizations)
+ℹ️ Reference Implementation:
+<a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> sets the reference standard. We assume it uses the equivalent of openai/whisper-large-v2 in float16 precision, along with additional undisclosed optimizations from OpenAI. As of 02/29/24, it costs $0.36 per hour of audio and has a 25MB file size limit per request.
+\n🔍 We use two primary datasets:
+<a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>: ~5 hours of short English audio clips
+<a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>: ~120 hours of English audio from earnings calls
+🌐 Multilingual Benchmarks:
+These benchmarks aim to demonstrate WhisperKit's capabilities across diverse languages, helping developers assess its suitability for multilingual applications.
+\nDataset:
+<a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>: Short-form audio files (<30s/clip), with a maximum of 400 samples per language, from Common Voice 17.0. The test set covers a wide range of languages to test the model's versatility.
+\nMetrics:
+Average WER: Provides an overall measure of model performance across all languages.
+Language-specific WER: Allows for detailed analysis of model performance for each supported language.
+Language Detection Accuracy: Measured using a confusion matrix, showing the model's ability to identify the correct language.
+Results are shown for both forced (correct language given as input) and unforced (model detects language) scenarios.
+🔄 Results are periodically updated using our automated evaluation pipeline on Apple Silicon Macs.
+\n🛠️ Developers can use <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> to reproduce these results or run evaluations on their own custom datasets.
+🔗 Links:
+- <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a>
+- <a href='https://github.com/argmaxinc/whisperkittools'>whisperkittools</a>
+- <a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>
+- <a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>
+- <a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>
+- <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a>
+"""
+METHODOLOGY_TEXT = dedent(
+    """
+    # Methodology
+    ## Overview
+    WhisperKit Benchmarks is the one-stop shop for on-device performance and quality testing of WhisperKit models across supported devices, OS versions and audio datasets.
+    ## Metrics
+    - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio was transcribed in 1 second.
+    - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
+        - This metric varies with input data given that the pace of speech changes the text decoder % of overall latency. This metric should not be confused with the reciprocal of the text decoder latency which is constant across input files.
+    - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
+    - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
+        - This metric does not capture improvements to the reference. It only measures potential regressions.
+    - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
+    - **Multilingual results**: Separated into "language hinted" and "language predicted" categories to evaluate performance with and without prior knowledge of the input language.
+    ## Data
+    - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech). Proxy for average streaming performance.
+    - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours). Proxy for average from-file performance.
+    - Full datasets are used for English Quality tests and random 10-minute subsets are used for Performance tests.
+    - **Multilingual**: Max 400 samples per language with <30s/clip from [Common Voice 17.0 Test Set](https://huggingface.co/datasets/argmaxinc/common_voice_17_0-argmax_subset-400). Common Voice covers 77 of the 99 languages supported by Whisper.
+    ## Performance Measurement
+    1. On-device testing is conducted with [WhisperKit Regression Test Automations](https://github.com/argmaxinc/WhisperKit/blob/main/BENCHMARKS.md) on iPhones, iPads, and Macs, across different iOS and macOS versions.
+    2. Performance is recorded on the 10-minute datasets described above for short- and long-form audio.
+    3. Quality metrics are recorded on full datasets on Apple M2 Ultra Mac Studios to allow for fast processing of many configurations and to provide a consistent, high-performance baseline for all evaluations displayed in the English Quality tab.
+    4. Quality is also sanity-checked on 10-minute datasets in order to catch potential correctness regressions across different device and OS combinations despite running the same version of WhisperKit.
+    5. Results are aggregated and presented in the dashboard, allowing for easy comparison and analysis.
+    ## Dashboard Features
+    - Performance: Interactive filtering by model, device, OS, and performance metrics
+    - Timeline: Visualizations of performance trends
+    - English Quality: English transcription quality on short- and long-form audio
+    - Multilingual Quality: Multilingual (77) transcription quality on short-form audio with and without language prediction
+    - Device Support: Matrix of supported device, OS and model version combinations. Unsupported combinations are marked with :warning:.
+    - This methodology ensures a comprehensive and fair evaluation of speech recognition models supported by WhisperKit across a wide range of scenarios and use cases.
+"""
+)
+PERFORMANCE_TEXT = dedent(
+    """
+    ## Metrics
+    - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio was transcribed in 1 second.
+    - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
+    - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
+    ## Data
+    - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech).
+    - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours).
+"""
+)
+QUALITY_TEXT = dedent(
+    """
+    ## Metrics
+    - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
+    - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
+        - This metric does not capture improvements to the reference. It only measures potential regressions.
+"""
+)
+COL_NAMES = {
+    "model.model_version": "Model",
+    "device.product_name": "Device",
+    "device.os": "OS",
+    "average_wer": "Average WER",
+    "qoi": "QoI",
+    "speed": "Speed",
+    "tokens_per_second": "Tok / s",
+    "model": "Model",
+    "device": "Device",
+    "os": "OS",
+    "parity": "Parity %",
+}
+CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
+CITATION_BUTTON_TEXT = r"""@misc{whisperkit-argmax,
+   title = {WhisperKit},
+   author = {Argmax, Inc.},
+   year = {2024},
+   URL = {https://github.com/argmaxinc/WhisperKit}
+}"""
+HEADER = """<div align="center">
+        <div style="position: relative;">
+        <img
+            src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAbgAAAG5CAYAAAD8liEWAAAdN0lEQVR4Ae3db3IbRbfH8XNGMhUeyzzKCtBdQXxXECW2c133Dc4KYlaQZAUhKwheQewVoLyhXNixhxVgVoBYASKWIWVp+txuxeEBbhxsSzM9an0/VRQQ/liZGc2vT/8VAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABadSsk27g9NANSYDVSz3MxeHRy1dgXRba6fdp1lj8Vs1b9AOzKXbCCqJ2Ky1/ikke/vf9qXihFwAP6s719G92K8jOCDbfP3TjEqXvpQ6Epa+qru6XevP+tJhTIBgP/oFOfFD+FFK6jUJNzOi+MEwy3omGXfrPvKVCpEwAH4u/bkRYtKXVzzjiRMnX6ztfVLWypCwAH4kM6DtbMtQSXW1t6Ea92R9LV/f7O0LRUh4AB8mHPbgkpkkn0hC8KpVPZ7JeAAfJCp3hFUZVUWhVX3eyXgAFymI6iGLVDA+W5KqQgBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBB+BSW1u/VHZ2FzBrBByASw2HzUU6iDOKzfXTrqAUBByAS2VOuoJSFaaPBKUg4ABcykQfb27+3hGUYnJtjUZEWQg4AB/TLkbFS0EpLq5tR1CKpgDAx/gKY2Nt+ENjqfFwf//TvmBqoXKbhBvVW6lKDzgV6QtQMybW9k8nMwSvymS1OC9+8kG3q+JeNbOs70aNgdzQrfatQa+nN/7vY9jsTt9V6xqjVdHsbnE+3ub5K58KsOA2usNVa/rAc7qqKl/Qqq6KDUT1xF/vvYOj1q7U0Mb94bZ/Sz4Ss1UCaXb8/a4kewg44G9CS32cjbdV9JmgKv3GJ417dekCvehC/CZUroKZI+CAyELQuaw4NiYBVGXgQ+6/Y4fcJNzOi2PhvpemqoBjFiVwif38037mGvcYR65MLWZsTio3wi0JBBzwESHkXGZfCqrhxz/XI+7sMdlVhG7JZBBwwD84PFzJxdyeoBJqsiWRsKtIWgg44AqsobuCSqjoXYmnI0gGAQdcQajiGIurhplEm47vq8eOIBkEHHBV5r4XAHODgAOuyGl2IgDmBgEHXJGp6wuAuUHAAVe0VCxRwQFzhIADACSJgAOuKCz6FgBzg4ADACSJgAMAJImAAwAkiYADACSJgAMAJImAAwAkiYADACSJgAMAJImAAwAkiYADUDcdicQk3ll0mD0CDkDtrK+fdqViGxvDVSHgkkLAAaidzElXKmYmjwRJIeAA1I6JPt7a+qWyampz8/eOOtkSJIWAA1BH7bPTpW+kIsWoeCkRx/5QjqYAQB2ZdDfWhj80lhoP9/fLOaooVG6TcLPqu0RRPgIOf7G1Ze23g7d/dA3dat8a9Ho6ENSSivTDn00l3KPr3qe22rtJFVbX6sVktTgvfvJBt6viXmXF0sm05/Jtdn/vSHPccaJfFOfjbX8Vazux5P39Dfw97sv11P/+lkylZBv3hyaYczYQ1V5DbW//cCWXBVaH5zm89JzITsst7/by2TQ+wku/yIqu/38/W9SXYV2UcX+3utY+y8626nJ/D45apWdPQMDhelR2l1dGT3u92wtZ1UV/ntV2lovWV7N68X3I+v3Tr1T0maB6
C3J/qwo4Jpngeky2/eD/cZUz3HDBv/wOXq88KfPlFxwerXzlxHYEVdvl/s4WAYfr8+Miv51+8lJQmdBtFVr2UpHCjb+S64/p4YbC/W24xnOpyKLcXwION2JmWzF2m1hUvl/0edkt+z/L89sDpYqrjL+/+bSTZ65jUe4vAYcbU5ex80NFlpycSMVcJrmgEpbZnlQsc81dSRwBhylYV1CJb/NW5QE3Ho8r/5mLqtWq/lpXWTHGQsAB9deXCEI3lqASsWYla6RnqyoEHAAgSQQcACBJBBwAIEkEHAAgSQQcgI9hognmFgEH4FJKwFWBa1wSAg4A4iLgSkLAYRpsuAygtgg4TIOAA1BbBByAS93gFGmgNgg4TCWcBC1Iljr3s6BcNCJKQ8BhKpkUdFMmzFSZAIG5RcBhKqNMVgXJMtG+oGT2o6AUBBymo0bAJUwbnAlXNpUsF5SCgMN0TL8QJOvgYHIOHd2UJcqKjHP3SkLAYVqd9fXTrqBMHYmq+tOmF4WavYp58KhFf7bKRcBhapmTrqBU/9sdxusKdrorKEVmzScSycbGMPnhBQIOUzPRx5ubLBco0zizLYnkIG+dOLEdwWyp7USt3kweSeIIOMxCuzgvjre2fmHJQElCIyLm9X19tPJElAknM6Nu7+D1SrTqLTRI1Um0RlNVCDjMSufszdIPVHKlaZ+dLn0jER28bt2jkpueiT0/eP3ZtkQ0Ho1fSOLjbwEBh1nq+Erup4214UuCrgQmXX9tj2Ne21DJNVzjv8QcE0+uxQb+j51w7Q6PVr6SSLa2rB2eITVNvnoLVEr24P7wJ0HtlTSbqh+2IVKVE3H2aybWl0z7merAjRqDW+1bg15vvnbK2Lg/NIkvXNe8oba3f7iSS0STGbROV1WsY5J9rmp/dKOqffyZmrcZfCof31LLdLKcYnDxN/7fdb9OFspndnIY+T5t+vvkXHbXxPluUY0+lHBw1Co9e4JKfgjmx1bX2mdy1rGmtcPsSFO9GyoHiSEEo8hJHV7k79Uk4P7GBipXayi8fwmHRoc5+bHxSSPf34830WGjO1y1TLq+K+lx/QLPBv4z7YxdczePOBkkVF1vh8PV8aQhoXd8y6Hjf7ntGxBXCqo6NiQIONRG2FC50NFXolm8WVcqu8sro6e93u2oFV89A25KvhpcyuTpt+8WdUcRnrFx5l74SrAeXWfq9paLlSe9PF4PQ6i6CtNnYmG3oPhV1ywRcKid8BJyWXEcrUXoqw4fcvdihlySAXdBM/n6u8PWU4loY+3NrlgWdfq6ifX8ONlDiSRUbL+9OXvmH7RosyzLVlXAMckEVxbW7GSucU9ibd1ksvrbcOmZoBS+y/LJxtrwh5jLEUZFEV7q0RowYZyt6ZrRQj6E29np2XHK4VYlAg7XEkLOMovWug0vYWZolihyIyLPb/vxxHhLEXyw5DEXX4fKLdwDwUwQcLi2yYywiIt+i9GY1m2JQiMi5v6i1tCeROIbb9GWP4QxNyq32SLgcCOm9lwiUdG7glJpmNwQyUHEyS6t1jjazy4iXvNUEXC4kYt1PVHGSszS34EhOpNuzLG4f1pzVpZYE5gm3e7GpuWzRsDhxtTcK4mDPS8rcPZmaSF2u6iD0WjEuFsJCDjcmFMOakwap7VXpmHKtS4BAYcbM3V9QbLU0RVcFadyRzBzBBxuLMuyviBZlum/BZXQxHYqqQsCDjfWGDXmaqNkXNPiTebheU4MAYcbi7kgFigBAZcYAg4A4qOLsgQEHABEdtWjb3A9BByA2rk4tw6YCgEH4DIxq4rqA07j7J6C8hBwAC4TLeBMjAoOUyPgANRO5uxXqZhG+JkoFwEHoHaibAOnwtZziSHgANRPZhECLt7eqiZsi1YGAg5A7YzHk3PZKh2H+9fKeS5ICgEH4FKxzoTL89uDKrsM1exVrLPgUB4CDsClhsNmtGNcqjw13jXka4lkY2PIUTklIeAAXEpNoh16Gk6Nd2I7Uja1nYsT6qMw
k0eCUhBwAC5n+ihWN2Xw+mjlia/kelIWlfzg9coTiWRz8/eOuniNiNQRcAA+pn12uvSNRHT4euWhSQndlb5yO3jduicRjUfjF8IMytIQcJiKCtsbJc+ku7E2PA7VhkRyeLTyVcM1/kvM7U33zIUdUmwn/L9iVm5bW9YO11RNqd5K1JQFFB6ut4O3f3S7cK4Z8A98yBXnxU/+pbzbUNuTcbNf9ffm4udth7/e6A5XffN81X+wVZPsc1V7/31+/+d3MyJN/X/jfvV/cSIuOznIV6KtdZu8d4bDVeeyu2dvhj5cOcW7bCoLYnP9tFu47JGKddNbVKm7B0fLX0oED+4Pf4pxPQ+OWlGe3Y37QxN8hA1U9I/p9vZ+A2MfNJrZwDn5uZnZya1W66TX07malr/Z9RVsNuq6LLvjfz8d323a+fsxN/7X2gTXP6vq+5t8BRe6VYpR8bJw0p08fgKgPNq2P2/SbO8bP/675961qAuncvbmTHw1mPtf3vMvu12psUnj2PRZYUV3Mqrjwq++e5P8//fJwtQMcyHpgJuE23lxLAziAvVjodE5Gd973FhqPNzfr9dQQehS/P307IVvHG8L5lLSk0xC5SaEG1BvJquhIRpzEsuHnJ2eHTsj3OZZsgHnx0q2L1qIAOqvc9EgrQU/tvwiBK9grqVbwSm7AwBzxTdI1/14l0QWKkk/thZtCQFmJ92Ao/UFzB01fSaR+Uoy+mfAbCQZcN3uZGshpuoC88Y3TGNuDXbxGVh8nYgkA+6W3CLcgPnUPh8sdSSSzXddpLw/EpFkwLEzSXU4iRizNsriDS+MnTK0kRD2osQ86gvSpRYtZFSsI0gGAQegXiz7t0SiZnRPJoSAA1Arau62RGJZ9rkgGQQcgFqxTKNVcEhLkgE32fUbwFxSZjFiRqjgANSKGQGH2Ugz4JrjjiBlHUG6NN4sWZuc+I1UUMEBwIXM2a+CZCQZcIVjLQuA6zOdr1PG8XFJBpyKdgTAfDLtSyQqQsAlJM0uSqOCA+aVivtZIvFjcH1BMpIMOBZrAnNMsxOJpRHxZ2PmEq3gOAsOmFf/WjnPJZLRaNQXuimTkVzAcdwFMMdU8l7vdrSAyXP/s1Wo4hKRXMAVpo8EwFwytecSWR0+A2YjqYCbVG8m2wJgHu0eHq7kEtnkM6jbE8y9ZAJuc/P3TuH0pQCYP75bcNktP5WaGBXFE7oq518SARcqt+K8OBa2cALmj6+Wlovle728Pousw1jcqBjdE6OSm2dNmWOTYPNjboWjWxKYLzYQ1ZMw3nV4+FkuNTSZcCKyvb5+uisue6xiW4K5UvuA29qy9tvB23a2VLTHznWcZR0VveMftq4Pto7MmEq8jV5vyr8k2D8PFbOB/x7+Y8Vl7zdOvtidxMR+lMxOWuPWSZ0qto+5GBcMf4gPu66ZtXXyHgobSmT/Fv3LxhL+n/11Fre/Bu9+n5MNKJQZ3hUqPeAe3B/+ZFN0HZ69OZt0pBZF+LssBJBMviYyhdC3br716L9spq5fFEsnbbk1mJcvHBCHDfz3bmfsmrt5/mlfIvrfjeGqK9yqy7I75nxovAuZP8LlfbBOQlit75z83PTBuj/lJJZpJ8FsdIervkv2iWp21xhSKd1cd1Fez7svZ8u1vibIgGuajJOtPIn53QkTydy4eGzOtkdFCDLf8nXhn5i8b/H+0fC1d+FhF/8sNIwLp7Jx/zR0jfYaanv7EWZsHuStMHFlOxzKXDSKl/6zdQWlWYyA8xVbo2g+3I/c6gTmkQ+J3uHrz7YlIt8T9KI4L568+zuVm/PVnsm2H7vf3lgb7jaWGs/396t/L1y8i+75z3BMyJUn/fPgwvTjYvke4QZcXxiTbrpm1On7IQR8EfZEZi0E3XlxHCpDiWRUjB4KW4OVJumAC1/ORtF4SJckcDM+WPKYjcNQuZVc4XSKURFt/WyYqaliO4JSJB1w/sv5nMoNuDnLLNo6sLAMqJTK7e98gK6/28M2CmtoT1CK
lANucHDU2hUAN9ZqjaPt5uHHyZ5JRbTCn/V3BwctdkwpSbIBp2bfC4CpxNrZfzIuVuXkC5PVra1foq1Rm8f1t/Mg3QqOfeSAuTUajao+07E9HDY5RzIx6VZwwsQSYF41TCsPm4bTaAHHou9yJBtwpu+2BgIwf5zKHamY++uWW0hAsgHn1FHBlSzsxiBACTTCno3qqKJSk2zAfTKmixKY0kJ9hyzTfwuSkm4FJw0CDpjOYn2HjAouNelv1QVgHi3UsTIsEygHAQegdv5+phpwEwQcALzTkUj+OBQVM0XAAfgg5aVbGRPrC2Yu2YDLpKCLA5iCcYxLZTJnvwpmLtmAGze0IyhXc9wRAFMzZVlTGdLdycSMCq5kI65x4uxHQUWMvXNLkPBmy8bGqSXLTLuCZKlkuaAajYyAK0HCk0y08r3sFo5Gu8Z051QgK3jpVuXiTDie6xlLN+Ain++UusrP6/oTVRbFlk3NXu3nn/Ylkli768d9Z8Q7PT1VKS8TaP/2pln+cfcLqhgV0U5ANuOsv7JZU7+SSDY2htGGF2KeCddwza8FM5X0OjgTfUwVN3sX1du2RKKqrwTlUdu56DKL8+ML25JI1CTazw4VsxPbEcxM6gu922enS8eE3OyEcCvOi2OJJOzZ993r5Z6gHCr5weuVaD0f4fnyDdNHEovpo0kDLpLXR/7a+3sgmIn0dzLxY3Eh5GI+tKnYXD/tXoRbRyII4Za5xj1BOULl9roV9fqOR+MXEvd067bvfn8pEYV7QCU3G4uxVZcPufBifnD/7BlBd30h2DbWhseF02jhFlq1IdxiTnxIkw18l2/PMrsXs3Lb2rL2/6wNX6pptC7CP5h0H6ydfRO7kgv3xFez9FZMQaVkD+4PfzKp2TlL/mWpZt/7FutJM8v6nywv93s9dhIIwovm/Oys4wq36rLsjjjZinP/bOAH2/ri75Nl0js8XMmlBjbuD01q6p+OXLH3s09N+yruZ5dJ3hq3Tnp5vGc/nArvMvfIxPlw1boNJfTVP3tNlb1vI45JbnWtfdo47TZMV51kd1TfbbCgHzm/rnbv3L85OGqVnj1BJT9kGt3uf1pRzea4E3YoCTfaVO/Ofpq6b83KYgadSfjSzPYFE164vgWa+//3j/6FOhCXnYwuDqLN57QSq0fA2cB/iB3f/5LHDv5Q3Y+drmaZfG7OPz9qHf/L7ascd1PGM1em65zZ9r4xoSonYu77bGnpZH8/3jO/7u+TFratmt2tQ/gRcFcQWn+Fjr4SzeINSuNDdn33yl5dqq5Zih5wvvdhuVh+GLXq8l13blw8Nmfb8xRQ0ansNpYaz2MGXXhnjjP3QiXeTNWAgLuGje5w1XclfFP3sjx1oYXrMvsyxWB7L2bA+Yqnd3i08lAi8kMOL/wFYH3pFPy76uvvDltPJaKNtTe7YvEKg6oCLolJJgd56+Ridh3jaJGEcPuXW/7vlMMtpnB9m64Z+aU4PCbcpmdOnoRrKRGNiiLcx+Tfl8nMogyz63y3WNTW7aJ6P30/ZrdZ6nywPI85gzRUbrG2ZktSmKm5PnwhkeT57YEuwFKEpJYJTKoHFklWzonsMH2/XEsu3vZkYSIJldvshUouTP6QSKyR/hKE5NbBmdpzQZUGrc9Gu4JSfZvHm6ZemEbbdzR1GvHaxtyOrSrJBdzFGBBdZRUJ6wl7vdtc73JFnTFJ12SJ/LWNuZXgdZY+zKMkdzLxVVwuqISq5oKyRQu44rzoCkp19mYp/u4tiUoy4DKTHwWVKDLj6Jqk2aqgXMo1LkuaFZxYX1CJ5rjZF6Qr3qntC0Md63fLkmTAOWUMriq32m+51sA0Mv1cUIokA25JWY9VFSaYpO1jG/piNuwK+3biZtI8LoduMwBYeItxHhwA1FdHUAoCDtOgexJAbSUZcGwbVRkCDpiBmIu9U0YFBwCRvR3cIuBKQMABuBRnLGKeEXAAgCQRcAAQGRsmlIOAA4DI2DChHAQcUH9MQABuINmA
S/2co1pQrnFFCLjExVomkPokomQDzinHuJTOCLiqbHZ/70jFNtdPu4JKDIfNyo/M2dgYJn9MT7pdlKbfC0plme0JKmFZUfmhmIXpI0El1KT6Q08LIeDm1diNdummLJHKyeHhSi6ohBN5XGU31uamrxhNuoJq+MbE5JpX65kkLtmAy/PbA3PyUNhOauZCw6FRNB4KqtT5bbhU2QtpPBq/EBZ5V6ldjIpvqmrEPLg/XIj7m/QsyoO8deKbvveo5GYnXMvMNe6x32f1fIPtyYP14YsyX4JbW9b+n7XhSzWtvsts0Zmsnp0uHZdZyYX7G8LNRJ7IAlBZEBv3h9v+d/tIzHy/szIr7VpsIKon/gu4d3DU2pUF5p8jk/j6/kPsfNKQ/NuD1kwmU4VJLC5zj0zcE74f0fX9uypfymRnFvc3hNr52VlnXOgXdbm//j1SSfYsTMABs1CTgFswFoYZ9iyT3njc7OdT9h50fZg3m+OOFratmt2t+1T59z1QKU3pJ+CAGiLgqmVivZZrfdnLtZSx9FC5jjP3QsXokq1QVQHXFACooRBuh0crpU5muhhLfrix9mZXLGNZRGLYqgtA7YRuuaZrPpWKjIoiTLpgxnViCDgAteMDbqfKmbphWZHvptwRJIWAA1A7DSe5VCxzzV1BUgg4ALXzbT6b5Q/XcVEx0k2ZEAIOQN30JRIl4JJCwAEAkkTAAQCSRMABAJJEwAEAkkTAAQCSRMABAJJEwAEAkkTAAQCSRMABAJJEwAFXFM4OEwBzg4ADrqhYKtoCYG4QcMAVNcwIOGCOEHDAFY2drgqAuUHAAVekKncEwNwg4ICrMukKgLlBwAFXsLk5mUHZEZTOV8rRzmQz5Ty4lBBwwBUUo+KZoBJmUvlp3n/64d8LkkHAAf9gUr2ZbAsqoaqvJBLLpCdIBgEHfEQIt+K8OBZUQkX6371ejhYyh4cruf8QuSAJBBxwiT+FW0dQuhBumWvck8gaRePL8FkEc4+AA/5mc/20++D+2bPifPyDEG7V8FVTCLf9/NO+RBY+Q/gshNz8UwEW1NaWtd8O3rZdY7TqLOtkqnfNXNd/Ldix5EpsoKJTzTo00dwytzfpGqyhtbU3W2qNR6rWUZMbPxcmYRccnqv3Do5alWRP6T/kwf3hTwLUjFGZ3YAN/HXb8f0+eWvcOunlypT6a1r3vQNa2LZqdneRn8FkAm7j/tAEwHxTt7dcrDwh1GYjnEwxztwLFduSBVRVwDUFAD7Cd6/1Dl9/ti2YmYuxxocba292xbJHglIwyQTApcJEi6ZrPhWUYlQUT/yfqIpLQsABuJQfX8jrMLMxVXl+e+C7KXcEpSDgAFzKMtsTlMplLCwvCwEH4FKt1jjevpALYjzmGpeFgANwqV7vNuNDJQvdlIJSEHAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwABAfR+aUgIAD8EGqwkGcFdHFCrjKfq8EHIAPc/azoBrmvpdFUWHDiYAD8EGm2hNUwjV0VxaFyZ5UhIAD8P/4LrP+wVFrV1CJw8OV3F/0XBIXnquGa+RSEQIOwF+El1DmGvcElWoUjS/DtZeUqT7dzz/tS0UIOAD/4cdHQrhV+RLCO+Gah2ufYsiF35Nldu+718uVdnurlGzj/tAEQG2Fl48TPZHM7Uy6yhCdf29um+gXqtbxY1arMpds4Cu2EzN51XLLu71cWQoBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAoN7+D8MSQdHK7cEnAAAAAElFTkSuQmCC"
+            style="display:block;width:7%;height:auto;"
+        />
+        </div>
+</div>"""
+EARNINGS22_URL = (
+    "https://huggingface.co/datasets/argmaxinc/earnings22-debug/resolve/main/{0}"
+)
+LIBRISPEECH_URL = (
+    "https://huggingface.co/datasets/argmaxinc/librispeech-debug/resolve/main/{0}"
+)
+AUDIO_URL = (
+    "https://huggingface.co/datasets/argmaxinc/whisperkit-test-data/resolve/main/"
+)
+WHISPER_OPEN_AI_LINK = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/{}/{}"
+BASE_WHISPERKIT_BENCHMARK_URL = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data"
+AVAILABLE_LANGUAGES = [
+    "af",
+    "am",
+    "ar",
+    "as",
+    "az",
+    "ba",
+    "be",
+    "bg",
+    "bn",
+    "br",
+    "ca",
+    "cs",
+    "cy",
+    "da",
+    "de",
+    "el",
+    "en",
+    "es",
+    "et",
+    "eu",
+    "fa",
+    "fi",
+    "fr",
+    "gl",
+    "ha",
+    "he",
+    "hi",
+    "hu",
+    "hy",
+    "id",
+    "it",
+    "ja",
+    "ka",
+    "kk",
+    "ko",
+    "lo",
+    "lt",
+    "lv",
+    "mk",
+    "ml",
+    "mn",
+    "mr",
+    "mt",
+    "ne",
+    "nl",
+    "nn",
+    "oc",
+    "pa",
+    "pl",
+    "ps",
+    "pt",
+    "ro",
+    "ru",
+    "sk",
+    "sl",
+    "sq",
+    "sr",
+    "sv",
+    "sw",
+    "ta",
+    "te",
+    "th",
+    "tk",
+    "tr",
+    "tt",
+    "uk",
+    "ur",
+    "uz",
+    "vi",
+    "yi",
+    "yo",
+    "yue",
+    "zh",
+]
+LANGUAGE_MAP = {lang: Lang(lang).name for lang in AVAILABLE_LANGUAGES}
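
The WER and QoI definitions in `QUALITY_TEXT` above can be illustrated with a short sketch. It assumes a plain word-level edit distance for WER and a per-example "no worse than the reference" comparison for QoI. It applies no text normalization (the repository ships a separate `text_normalizer.py`), and the function names and sample values are hypothetical rather than the dashboard's own implementation.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def quality_of_inference(
    model_wers: Sequence[float], reference_wers: Sequence[float]
) -> float:
    """Fraction of examples where the model's WER is no worse than the reference's."""
    if len(model_wers) != len(reference_wers) or not model_wers:
        raise ValueError("expected two non-empty sequences of equal length")
    no_worse = sum(m <= r for m, r in zip(model_wers, reference_wers))
    return no_worse / len(model_wers)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
# Hypothetical per-example WERs: the model regresses on the second example only.
qoi = quality_of_inference([0.10, 0.20, 0.05], [0.10, 0.15, 0.08])
print(f"WER: {wer:.3f}, QoI: {qoi:.3f}")
```

As the methodology notes, QoI counts only regressions: the third example, where the model beats the reference, contributes the same as a tie.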

dashboard_data/config.json ADDED

	@@ -0,0 +1,136 @@

+{
+    "name": "whisperkit-coreml",
+    "version": "0.2",
+    "device_support": [
+        {
+            "identifiers": ["iPhone11", "iPhone12", "Watch7", "Watch8"],
+            "models": {
+                "default": "openai_whisper-tiny",
+                "supported": [
+                    "openai_whisper-tiny",
+                    "openai_whisper-tiny.en",
+                    "openai_whisper-base",
+                    "openai_whisper-base.en"
+                ]
+            }
+        },
+        {
+            "identifiers": ["iPhone13", "iPad13,18", "iPad13,1"],
+            "models": {
+                "default": "openai_whisper-base",
+                "supported": [
+                    "openai_whisper-tiny",
+                    "openai_whisper-tiny.en",
+                    "openai_whisper-base",
+                    "openai_whisper-base.en",
+                    "openai_whisper-small",
+                    "openai_whisper-small.en"
+                ]
+            }
+        },
+        {
+            "identifiers": [
+                "iPhone14",
+                "iPhone15",
+                "iPhone16",
+                "iPhone17",
+                "iPad14,1",
+                "iPad14,2"
+            ],
+            "models": {
+                "default": "openai_whisper-base",
+                "supported": [
+                    "openai_whisper-tiny",
+                    "openai_whisper-tiny.en",
+                    "openai_whisper-base",
+                    "openai_whisper-base.en",
+                    "openai_whisper-small",
+                    "openai_whisper-small.en",
+                    "openai_whisper-large-v2_949MB",
+                    "openai_whisper-large-v2_turbo_955MB",
+                    "openai_whisper-large-v3_947MB",
+                    "openai_whisper-large-v3_turbo_954MB",
+                    "distil-whisper_distil-large-v3_594MB",
+                    "distil-whisper_distil-large-v3_turbo_600MB",
+                    "openai_whisper-large-v3-v20240930_626MB",
+                    "openai_whisper-large-v3-v20240930_turbo_632MB"
+                ]
+            }
+        },
+        {
+            "identifiers": [
+                "Mac13",
+                "iMac21",
+                "MacBookAir10,1",
+                "MacBookPro17",
+                "MacBookPro18",
+                "Macmini9",
+                "iPad13,16",
+                "iPad13,4",
+                "iPad13,8"
+            ],
+            "models": {
+                "default": "openai_whisper-large-v3-v20240930",
+                "supported": [
+                    "openai_whisper-tiny",
+                    "openai_whisper-tiny.en",
+                    "openai_whisper-base",
+                    "openai_whisper-base.en",
+                    "openai_whisper-small",
+                    "openai_whisper-small.en",
+                    "openai_whisper-large-v2",
+                    "openai_whisper-large-v2_949MB",
+                    "openai_whisper-large-v3",
+                    "openai_whisper-large-v3_947MB",
+                    "distil-whisper_distil-large-v3",
+                    "distil-whisper_distil-large-v3_594MB",
+                    "openai_whisper-large-v3-v20240930",
+                    "openai_whisper-large-v3-v20240930_626MB"
+                ]
+            }
+        },
+        {
+            "identifiers": [
+                "Mac14",
+                "Mac15",
+                "Mac16",
+                "iPad14,3",
+                "iPad14,4",
+                "iPad14,5",
+                "iPad14,6",
+                "iPad14,8",
+                "iPad14,9",
+                "iPad14,10",
+                "iPad14,11",
+                "iPad16"
+            ],
+            "models": {
+                "default": "openai_whisper-large-v3-v20240930",
+                "supported": [
+                    "openai_whisper-tiny",
+                    "openai_whisper-tiny.en",
+                    "openai_whisper-base",
+                    "openai_whisper-base.en",
+                    "openai_whisper-small",
+                    "openai_whisper-small.en",
+                    "openai_whisper-large-v2",
+                    "openai_whisper-large-v2_949MB",
+                    "openai_whisper-large-v2_turbo",
+                    "openai_whisper-large-v2_turbo_955MB",
+                    "openai_whisper-large-v3",
+                    "openai_whisper-large-v3_947MB",
+                    "openai_whisper-large-v3_turbo",
+                    "openai_whisper-large-v3_turbo_954MB",
+                    "distil-whisper_distil-large-v3",
+                    "distil-whisper_distil-large-v3_594MB",
+                    "distil-whisper_distil-large-v3_turbo",
+                    "distil-whisper_distil-large-v3_turbo_600MB",
+                    "openai_whisper-large-v3-v20240930",
+                    "openai_whisper-large-v3-v20240930_turbo",
+                    "openai_whisper-large-v3-v20240930_626MB",
+                    "openai_whisper-large-v3-v20240930_turbo_632MB"
+                ]
+            }
+        }
+    ]
+}

dashboard_data/device_map.json ADDED Viewed

	@@ -0,0 +1,14 @@

+{
+    "Mac14,12": "Apple M2 Pro",
+    "Mac14,14": "Apple M2 Ultra",
+    "Mac15,3": "Apple M3",
+    "Mac15,9": "Apple M3 Max",
+    "iPad14,8": "iPad Air 11-inch (M2)",
+    "iPad16,1": "iPad mini (A17 Pro)",
+    "iPad16,3": "iPad Pro 11-inch (M4)",
+    "iPhone12,1": "iPhone 11",
+    "iPhone14,2": "iPhone 13 Pro",
+    "iPhone14,5": "iPhone 13",
+    "iPhone14,7": "iPhone 14",
+    "iPhone17,1": "iPhone 16 Pro"
+}
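
Note on the two files above: `config.json` groups devices by short identifiers such as `Mac14` or `iPad14,3`, while `device_map.json` maps full identifiers such as `Mac14,12` to display names. A minimal sketch of how the two could be combined follows. Prefix matching is an assumption here, not something this commit confirms, and `in_group` and `display_name` are hypothetical helpers.

```python
# Subset of dashboard_data/device_map.json (values copied from the diff above).
DEVICE_MAP = {
    "Mac14,12": "Apple M2 Pro",
    "iPad14,8": "iPad Air 11-inch (M2)",
    "iPhone12,1": "iPhone 11",
}

# Identifier list of the last device group in dashboard_data/config.json.
GROUP_IDENTIFIERS = [
    "Mac14", "Mac15", "Mac16",
    "iPad14,3", "iPad14,4", "iPad14,5", "iPad14,6",
    "iPad14,8", "iPad14,9", "iPad14,10", "iPad14,11",
    "iPad16",
]


def in_group(device_id, identifiers):
    """Return True if device_id starts with any group identifier.

    ASSUMPTION: group identifiers are treated as prefixes of full
    hardware identifiers; the real matching rule is not shown in this diff.
    """
    return any(device_id.startswith(prefix) for prefix in identifiers)


def display_name(device_id):
    """Look up the friendly name, falling back to the raw identifier."""
    return DEVICE_MAP.get(device_id, device_id)


for device_id in DEVICE_MAP:
    status = "in group" if in_group(device_id, GROUP_IDENTIFIERS) else "not in group"
    print(f"{display_name(device_id)} ({device_id}): {status}")
```

Under this assumption, plain prefix matching would also let a short entry such as `iPad14,1` match `iPad14,10`, so a stricter rule may be needed if such identifiers are ever added.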

dashboard_data/diff_checker_data.json ADDED Viewed

File without changes

dashboard_data/multilingual_confusion_matrices.json ADDED Viewed

The diff for this file is too large to render. See raw diff

dashboard_data/multilingual_results.csv ADDED Viewed

	@@ -0,0 +1,17 @@

+Model,Forced Tokens,Average WER,WER_sl,WER_sk,WER_ur,WER_sw,WER_uz,WER_pl,WER_vi,WER_sq,WER_sv,WER_he,WER_mt,WER_hy,WER_am,WER_nn,WER_be,WER_da,WER_mr,WER_kk,WER_mn,WER_ja,WER_el,WER_lv,WER_oc,WER_it,WER_ca,WER_cs,WER_te,WER_ru,WER_tk,WER_ro,WER_yo,WER_yue,WER_yi,WER_pt,WER_ps,WER_zh,WER_uk,WER_sr,WER_pa,WER_ml,WER_mk,WER_ba,WER_ha,WER_ar,WER_gl,WER_hu,WER_nl,WER_bg,WER_bn,WER_ne,WER_af,WER_hi,WER_ka,WER_de,WER_as,WER_az,WER_br,WER_ko,WER_fi,WER_id,WER_fr,WER_es,WER_et,WER_en,WER_fa,WER_lt,WER_cy,WER_eu,WER_lo,WER_tt,WER_ta,WER_th,WER_tr
+openai_whisper-large-v3-v20240930,False,51.57,36.9,29.71,46.48,64.04,110.02,14.74,14.89,69.25,18.09,29.11,86.41,74.32,145.83,50.03,79.08,19.43,67.19,43.57,116.51,26.33,21.75,32.92,73.51,14.39,20.36,14.41,140.14,15.81,112.64,15.2,95.06,51.16,103.7,16.37,111.73,27.24,24.08,62.2,104.28,121.81,48.07,102.63,104.87,40.62,18.12,16.39,11.46,21.95,98.71,86.28,37.8,43.31,137.87,14.01,103.2,38.1,100.68,20.79,16.62,12.28,16.31,5.94,32.21,12.54,60.73,35.6,57.45,42.35,103.14,98.21,44.83,31.0,31.24
+openai_whisper-large-v3-v20240930,True,46.09,27.13,24.61,25.59,61.29,98.84,12.12,16.92,65.69,12.97,26.85,84.04,73.95,128.9,39.97,61.51,17.63,48.26,41.87,97.08,21.97,17.73,30.78,71.01,12.83,18.25,12.85,75.43,13.28,104.35,11.41,89.71,64.28,100.0,14.93,95.78,25.34,19.14,54.07,120.4,112.94,34.52,100.0,96.64,31.45,15.0,15.3,8.91,20.42,79.7,63.89,36.54,26.14,132.26,12.26,105.14,33.33,95.96,20.75,15.42,11.11,15.51,6.1,31.51,12.13,55.96,32.84,54.92,40.65,114.11,98.39,41.54,23.3,24.29
+openai_whisper-tiny,False,105.22,121.79,133.13,113.57,119.78,118.34,103.44,99.27,119.73,82.19,122.66,112.51,132.53,120.31,103.18,115.1,99.88,101.13,125.86,114.12,82.61,130.16,112.75,100.2,89.71,82.85,125.93,113.15,109.31,117.09,118.71,109.16,88.8,120.37,81.94,115.21,79.77,115.73,114.84,103.05,105.04,117.77,116.19,109.58,159.43,78.61,129.6,76.43,122.17,100.32,104.44,118.43,102.01,140.29,79.95,100.86,146.83,110.82,110.13,93.9,124.4,67.17,68.41,113.76,33.14,122.06,133.54,112.59,132.9,106.52,123.61,100.96,110.41,125.91
+openai_whisper-tiny,True,86.1,81.42,92.88,70.33,112.75,122.16,56.82,50.52,99.17,64.45,72.15,103.81,133.47,140.93,102.1,98.9,79.55,102.76,179.77,128.57,53.32,66.33,89.27,93.19,60.12,59.02,81.79,133.22,58.43,124.62,66.43,111.99,90.36,102.78,65.71,105.43,65.2,69.07,80.42,104.57,133.29,83.84,110.69,97.86,97.63,54.0,85.5,54.06,83.5,106.27,103.34,93.39,102.17,140.86,49.53,112.36,90.67,102.29,62.34,72.56,54.08,59.57,34.99,101.75,33.4,130.66,101.05,93.62,97.05,113.15,107.44,80.94,42.42,66.07
+openai_whisper-small,False,96.89,116.31,109.59,106.84,110.05,117.73,69.24,74.63,110.83,52.47,97.58,109.76,138.13,118.51,93.33,113.81,66.04,101.28,127.1,115.44,89.93,120.81,101.47,88.06,67.04,51.78,117.66,120.07,92.0,116.14,86.09,106.41,48.07,105.56,37.94,105.77,117.15,101.7,108.27,101.54,102.57,115.41,118.11,104.62,151.92,65.17,66.19,60.49,118.61,100.46,102.89,99.37,100.76,141.61,49.17,100.37,121.83,107.77,137.18,68.85,104.92,54.88,47.03,96.83,18.2,119.04,121.42,111.48,116.31,118.1,120.3,100.63,115.33,108.8
+openai_whisper-small,True,69.14,49.09,51.74,40.93,96.1,115.21,23.74,25.43,89.94,23.97,43.29,96.2,120.55,130.06,164.5,78.06,37.18,89.86,82.95,262.79,30.25,31.49,62.47,114.14,25.02,30.35,37.7,311.76,26.09,174.55,26.99,161.95,48.61,100.0,35.7,94.66,42.22,40.02,60.5,160.3,115.92,50.81,118.57,97.46,47.01,30.45,44.66,19.94,49.16,129.67,107.33,71.02,45.58,,23.87,131.95,62.3,,34.7,30.07,23.81,27.11,11.94,72.39,17.35,97.5,75.61,67.42,77.08,102.07,103.03,42.18,21.52,33.32
+openai_whisper-large-v3,False,54.77,41.01,32.74,44.39,66.07,110.74,17.82,14.19,64.45,13.59,36.31,96.11,70.79,134.23,52.71,85.41,16.63,60.48,58.79,122.59,33.66,28.76,27.08,78.1,13.55,17.32,20.67,123.88,16.13,107.71,10.99,110.75,53.95,105.56,14.51,103.6,42.68,32.28,64.23,101.2,102.62,68.19,100.46,99.97,36.03,23.57,13.44,12.17,30.2,98.44,101.1,40.79,75.87,149.84,14.75,100.54,35.32,106.08,20.94,15.9,11.86,15.51,6.36,30.06,12.7,62.68,32.37,51.06,45.38,104.29,100.73,81.72,38.07,26.73
+openai_whisper-large-v3,True,34.23,18.87,18.44,21.24,58.02,90.52,10.13,12.32,53.97,9.81,23.79,78.78,54.56,,29.37,45.53,13.89,42.37,48.61,87.75,20.38,12.35,21.06,65.39,11.11,14.69,12.04,61.25,13.0,99.39,5.39,97.25,14.27,101.85,13.75,88.95,25.41,15.59,41.4,57.1,107.34,20.59,99.25,91.39,23.08,13.06,12.44,7.03,17.37,,52.77,36.38,20.33,,9.89,,21.43,86.38,20.37,10.32,9.47,13.67,4.93,28.43,12.21,45.43,27.63,35.05,40.65,102.76,90.45,28.97,6.11,17.88
+openai_whisper-large-v3-v20240930_626MB,False,52.29,39.68,29.99,49.08,66.59,107.43,15.31,15.95,71.18,17.19,32.01,88.37,79.06,135.02,51.08,80.09,20.74,71.26,47.37,105.47,25.78,22.21,34.77,74.12,15.26,20.99,15.98,139.45,16.29,106.18,16.59,95.23,51.42,101.85,16.46,107.76,29.67,27.49,64.5,103.61,115.9,47.55,100.79,103.61,38.22,19.62,17.52,11.63,24.46,98.93,85.04,39.69,47.4,133.75,14.69,104.02,38.49,101.45,22.74,16.62,12.28,17.04,6.02,34.3,13.39,62.2,38.5,60.13,45.51,103.6,98.12,48.42,35.09,31.2
+openai_whisper-large-v3-v20240930_626MB,True,47.64,30.62,25.67,26.93,62.36,97.82,13.11,17.36,67.47,12.72,29.16,84.89,77.31,111.23,39.77,63.57,18.94,50.51,45.76,97.71,22.33,18.71,31.72,72.64,13.16,19.39,14.49,84.78,14.37,102.51,12.24,93.42,66.1,100.0,14.85,95.84,27.18,21.48,56.17,134.5,123.72,36.74,98.12,95.73,32.09,15.48,17.05,9.05,22.25,81.89,63.49,40.0,27.3,128.28,13.21,102.9,34.52,96.28,22.01,15.17,12.66,15.68,5.97,33.98,13.03,56.96,35.97,57.0,43.62,121.47,99.17,42.74,23.09,24.11
+openai_whisper-large-v3-v20240930_547MB,False,61.3,56.47,43.16,61.91,88.9,109.85,24.17,22.93,88.74,26.09,45.97,96.7,107.38,134.76,57.25,85.1,27.18,70.85,71.7,109.68,30.21,32.19,50.95,80.91,20.97,30.91,27.66,137.02,20.13,112.37,27.76,99.08,65.2,116.67,21.24,103.29,38.97,40.09,73.1,103.36,116.51,67.78,109.58,103.61,53.89,27.16,30.42,17.39,39.13,102.58,86.68,51.97,58.32,132.81,19.31,103.2,59.92,103.96,26.14,23.24,18.05,23.09,8.1,47.03,16.56,84.54,55.51,75.58,63.11,111.89,103.54,61.24,58.17,42.66
+openai_whisper-large-v3-v20240930_547MB,True,54.61,40.12,35.54,35.4,78.38,102.05,19.6,25.97,81.39,19.25,38.72,89.86,109.32,146.41,46.71,69.67,25.3,60.25,66.07,101.55,25.79,26.23,45.55,75.4,18.77,27.16,23.73,106.23,18.63,108.87,18.26,97.33,74.97,101.85,18.91,95.65,34.74,30.15,63.01,,120.76,47.96,104.64,100.75,41.63,20.54,27.11,14.18,33.31,96.72,74.91,48.5,38.49,129.01,17.62,101.41,47.22,99.32,26.25,20.31,17.25,22.0,7.84,45.12,16.12,77.07,49.49,71.63,57.09,114.95,101.79,53.56,39.58,33.54
+openai_whisper-large-v2,False,94.09,119.27,112.44,106.95,110.77,122.16,75.3,61.28,112.19,43.62,91.08,112.01,137.54,118.3,90.82,118.11,25.33,100.79,152.88,115.67,79.34,113.99,69.07,91.12,50.45,40.5,112.0,113.84,99.36,123.07,96.59,110.1,52.67,100.0,47.38,106.7,125.66,95.49,,101.13,102.34,118.31,124.76,105.15,143.76,63.7,44.82,48.04,119.44,100.18,102.64,101.1,100.38,154.52,65.08,100.59,85.71,105.79,97.02,48.57,92.39,31.95,46.74,99.3,13.74,116.06,137.91,74.42,107.11,111.27,131.83,100.18,119.35,113.86
+openai_whisper-large-v2,True,47.14,25.76,25.84,25.24,67.14,100.99,12.51,17.69,65.57,12.16,24.01,83.34,62.4,176.79,47.12,49.05,16.72,48.12,58.01,136.6,22.6,15.04,28.69,72.69,14.34,16.2,17.14,165.74,15.11,115.56,7.86,95.93,53.06,105.56,15.23,99.75,36.59,20.95,43.09,105.46,114.5,25.32,107.07,115.23,26.39,16.27,16.72,8.93,21.52,103.9,62.0,47.24,25.92,150.19,11.7,107.19,29.37,106.33,24.84,13.13,12.2,16.21,6.93,35.96,12.7,53.38,38.94,32.85,49.09,103.76,105.51,28.08,8.76,19.55
+openai_whisper-base,False,104.18,125.5,143.16,112.16,113.79,122.43,99.57,99.07,123.02,75.03,98.01,114.12,138.03,114.66,99.56,122.48,81.21,101.78,137.07,115.04,91.47,131.68,117.3,96.63,78.16,69.85,128.78,132.18,103.59,125.21,114.33,106.87,72.69,125.93,59.81,114.59,74.81,113.57,119.31,103.18,105.38,123.9,123.21,109.11,160.77,70.55,110.22,80.49,122.28,100.7,104.89,108.98,101.32,143.37,61.29,100.7,134.52,111.44,136.38,102.17,126.88,58.74,58.79,115.46,25.71,122.69,149.93,110.1,126.44,116.79,132.15,101.37,112.61,122.57
+openai_whisper-base,True,79.92,72.07,76.57,59.1,106.54,171.77,43.44,40.15,100.62,45.53,61.22,102.6,208.4,165.98,83.98,92.45,61.96,103.31,99.05,201.0,42.73,55.22,81.5,82.95,46.45,48.59,67.24,117.99,44.21,151.1,54.19,105.42,70.49,111.11,48.98,98.51,53.88,58.31,76.9,100.38,119.75,74.21,134.5,116.11,72.17,47.63,71.07,37.01,73.54,100.79,101.9,87.4,102.24,117.93,38.09,109.8,84.13,106.76,48.87,56.32,43.04,45.09,24.55,91.31,25.11,104.21,91.07,87.41,98.64,106.13,108.77,60.25,32.91,51.87
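
Note on the CSV above: each model appears twice, once per `Forced Tokens` setting, and some cells are empty (for example `WER_am` for `openai_whisper-large-v3` with `Forced Tokens` set to `True`). A minimal sketch of reading such rows with the standard library, treating empty cells as missing values, might look like this. The sample is a trimmed excerpt of the file and `parse_row` is a hypothetical helper, not part of this commit.

```python
import csv
import io

# Trimmed excerpt of multilingual_results.csv (only three language columns kept).
SAMPLE = """\
Model,Forced Tokens,Average WER,WER_sl,WER_hy,WER_am
openai_whisper-large-v3,False,54.77,41.01,70.79,134.23
openai_whisper-large-v3,True,34.23,18.87,54.56,
"""


def parse_row(row):
    """Convert one CSV row: booleans for the flag, floats for WER, None for gaps."""
    parsed = {
        "Model": row["Model"],
        "Forced Tokens": row["Forced Tokens"] == "True",
    }
    for key, value in row.items():
        if key not in parsed:
            parsed[key] = float(value) if value else None
    return parsed


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
unforced, forced = rows  # row order follows the sample above

# Languages with no value in the forced-token run.
missing = [key for key, value in forced.items() if value is None]
print("missing:", missing)

# Per-language change when tokens are forced, skipping missing cells.
for key in unforced:
    if key.startswith("WER_") and forced[key] is not None:
        print(key, round(forced[key] - unforced[key], 2))
```

Keeping missing cells as `None` rather than `0.0` avoids treating an absent measurement as a perfect score.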

dashboard_data/performance_data.json ADDED Viewed

The diff for this file is too large to render. See raw diff

dashboard_data/quality_data.json ADDED Viewed

	@@ -0,0 +1,23 @@

+{"model": "openai/whisper-large-v3/947MB", "timestamp": "2024-10-18_16:59:10_GMT-0700", "average_wer": 9.74, "dataset_wer": {"librispeech": 2.41, "earnings22-12hours": 17.08}, "qoi": 0.94}
+{"model": "openai/whisper-large-v2/turbo/955MB", "timestamp": "2024-10-18_16:52:35_GMT-0700", "average_wer": 7.27, "dataset_wer": {"librispeech": 2.4, "earnings22-12hours": 12.14}, "qoi": 0.94}
+{"model": "openai/whisper-tiny.en", "timestamp": "2024-10-19_15:40:06_GMT-0700", "average_wer": 12.23, "dataset_wer": {"librispeech": 5.61, "earnings22-12hours": 18.86}, "qoi": 0.63}
+{"model": "distil-whisper/distil-large-v3/594MB", "timestamp": "2024-10-20_13:02:33_GMT-0700", "average_wer": 8.96, "dataset_wer": {"librispeech": 2.87, "earnings22-12hours": 15.06}, "qoi": 0.86}
+{"model": "openai/whisper-large-v2/949MB", "timestamp": "2024-10-18_19:51:30_GMT-0400", "average_wer": 7.88, "dataset_wer": {"librispeech": 2.38, "earnings22-12hours": 13.39}, "qoi": 0.94}
+{"model": "openai/whisper-large-v3/turbo/954MB", "timestamp": "2024-10-20_13:49:26_GMT-0700", "average_wer": 22.75, "dataset_wer": {"librispeech": 2.51, "earnings22-12hours": 43.0}, "qoi": 0.93}
+{"model": "distil-whisper/distil-large-v3", "timestamp": "2024-10-20_20:32:22_GMT-0700", "average_wer": 7.2, "dataset_wer": {"librispeech": 2.38, "earnings22-12hours": 12.02}, "qoi": 0.9}
+{"model": "openai/whisper-large-v3-v20240930", "timestamp": "2024-10-18_18:35:46_GMT-0700", "average_wer": 6.74, "dataset_wer": {"librispeech": 1.93, "earnings22-12hours": 11.55}, "qoi": 0.94}
+{"model": "openai/whisper-tiny", "timestamp": "2024-10-20_20:19:04_GMT-0700", "average_wer": 14.21, "dataset_wer": {"librispeech": 7.46, "earnings22-12hours": 20.97}, "qoi": 0.52}
+{"model": "openai/whisper-large-v3-v20240930/turbo/632MB", "timestamp": "2024-10-18_20:10:30_GMT-0700", "average_wer": 6.86, "dataset_wer": {"librispeech": 1.95, "earnings22-12hours": 11.77}, "qoi": 0.93}
+{"model": "openai/whisper-large-v2/turbo", "timestamp": "2024-10-18_14:58:38_GMT-0700", "average_wer": 7.25, "dataset_wer": {"librispeech": 2.4, "earnings22-12hours": 12.1}, "qoi": 0.96}
+{"model": "openai/whisper-small", "timestamp": "2024-10-18_12:40:03_GMT-0700", "average_wer": 8.11, "dataset_wer": {"librispeech": 3.21, "earnings22-12hours": 13.0}, "qoi": 0.83}
+{"model": "openai/whisper-large-v3-v20240930/turbo", "timestamp": "2024-10-18_19:37:26_GMT-0700", "average_wer": 6.72, "dataset_wer": {"librispeech": 1.92, "earnings22-12hours": 11.52}, "qoi": 0.94}
+{"model": "openai/whisper-large-v3", "timestamp": "2024-10-18_18:01:14_GMT-0400", "average_wer": 6.85, "dataset_wer": {"librispeech": 2.02, "earnings22-12hours": 11.69}, "qoi": 0.95}
+{"model": "openai/whisper-large-v3-v20240930/626MB", "timestamp": "2024-10-18_19:21:06_GMT-0700", "average_wer": 7.15, "dataset_wer": {"librispeech": 1.96, "earnings22-12hours": 12.35}, "qoi": 0.93}
+{"model": "openai/whisper-base.en", "timestamp": "2024-10-20_12:31:44_GMT-0700", "average_wer": 9.59, "dataset_wer": {"librispeech": 3.98, "earnings22-12hours": 15.2}, "qoi": 0.75}
+{"model": "openai/whisper-large-v3-v20240930/547MB", "timestamp": "2024-10-18_21:59:11_GMT-0400", "average_wer": 16.82, "dataset_wer": {"librispeech": 2.16, "earnings22-12hours": 31.49}, "qoi": 0.92}
+{"model": "distil-whisper/distil-large-v3/turbo/600MB", "timestamp": "2024-10-18_17:50:17_GMT-0700", "average_wer": 8.33, "dataset_wer": {"librispeech": 2.8, "earnings22-12hours": 13.87}, "qoi": 0.86}
+{"model": "openai/whisper-large-v2", "timestamp": "2024-10-18_17:07:15_GMT-0400", "average_wer": 7.32, "dataset_wer": {"librispeech": 2.36, "earnings22-12hours": 12.28}, "qoi": 0.97}
+{"model": "openai/whisper-small.en", "timestamp": "2024-10-18_15:39:48_GMT-0400", "average_wer": 7.85, "dataset_wer": {"librispeech": 2.88, "earnings22-12hours": 12.82}, "qoi": 0.86}
+{"model": "distil-whisper/distil-large-v3/turbo", "timestamp": "2024-10-20_12:45:20_GMT-0700", "average_wer": 7.2, "dataset_wer": {"librispeech": 2.35, "earnings22-12hours": 12.05}, "qoi": 0.9}
+{"model": "openai/whisper-base", "timestamp": "2024-10-18_20:25:50_GMT-0700", "average_wer": 10.67, "dataset_wer": {"librispeech": 4.94, "earnings22-12hours": 16.4}, "qoi": 0.67}
+{"model": "openai/whisper-large-v3/turbo", "timestamp": "2024-10-20_16:58:25_GMT-0400", "average_wer": 6.86, "dataset_wer": {"librispeech": 1.97, "earnings22-12hours": 11.74}, "qoi": 0.95}
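
Note on the file above: despite the `.json` extension, it holds one JSON object per line, with a nested `dataset_wer` object and a timestamp in the form `2024-10-18_16:59:10_GMT-0700`. The sketch below shows one way to read it with the standard library. `read_json_lines`, `flatten`, and `parse_timestamp` are hypothetical helpers; the first stands in for `read_json_line_by_line` from `utils.py`, whose body is not shown here. `main.py` flattens the records with `pd.json_normalize`, which by default joins nested keys with a dot; `flatten` imitates that for a single nesting level so that the example needs no third-party packages.

```python
import io
import json
from datetime import datetime

# Two records copied from quality_data.json above.
SAMPLE = """\
{"model": "openai/whisper-tiny", "timestamp": "2024-10-20_20:19:04_GMT-0700", "average_wer": 14.21, "dataset_wer": {"librispeech": 7.46, "earnings22-12hours": 20.97}, "qoi": 0.52}
{"model": "openai/whisper-base", "timestamp": "2024-10-18_20:25:50_GMT-0700", "average_wer": 10.67, "dataset_wer": {"librispeech": 4.94, "earnings22-12hours": 16.4}, "qoi": 0.67}
"""


def read_json_lines(handle):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


def flatten(record, sep="."):
    """Flatten one level of nesting into dotted keys, e.g. dataset_wer.librispeech."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def parse_timestamp(text):
    """Parse timestamps such as 2024-10-20_20:19:04_GMT-0700 into aware datetimes."""
    return datetime.strptime(text, "%Y-%m-%d_%H:%M:%S_GMT%z")


records = [flatten(r) for r in read_json_lines(io.StringIO(SAMPLE))]

# Same column discovery that main.py performs on the normalized DataFrame.
wer_columns = [key for key in records[0] if key.startswith("dataset_wer.")]
datasets = [column.split(".")[-1] for column in wer_columns]
first_timestamp = parse_timestamp(records[0]["timestamp"])

print(datasets)
print(first_timestamp.isoformat())
```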

dashboard_data/support_data.csv ADDED Viewed

	@@ -0,0 +1,23 @@

+,Model,Apple M2 Pro,Apple M2 Ultra,Apple M3,Apple M3 Max,iPad Air 11-inch (M2),iPad mini (A17 Pro),iPad Pro 11-inch (M4),iPhone 11,iPhone 13 Pro,iPhone 13,iPhone 14,iPhone 16 Pro
+distil-whisper_distil-large-v3,distil-whisper_distil-large-v3,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+distil-whisper_distil-large-v3_594MB,distil-whisper_distil-large-v3_594MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+distil-whisper_distil-large-v3_turbo,distil-whisper_distil-large-v3_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+distil-whisper_distil-large-v3_turbo_600MB,distil-whisper_distil-large-v3_turbo_600MB,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-base,openai_whisper-base,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-base.en,openai_whisper-base.en,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v2,openai_whisper-large-v2,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v2_949MB,openai_whisper-large-v2_949MB,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v2_turbo,openai_whisper-large-v2_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v2_turbo_955MB,openai_whisper-large-v2_turbo_955MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v3,openai_whisper-large-v3,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v3-v20240930,openai_whisper-large-v3-v20240930,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad14%2C8_summary_2024-10-25T032747.json>iPadOS 17.6.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v3-v20240930_626MB,openai_whisper-large-v3-v20240930_626MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v3-v20240930_turbo,openai_whisper-large-v3-v20240930_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C1_summary_2024-10-25T054749.json>iPadOS 18.0.1</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPad16%2C3_summary_2024-10-25T032747.json>iPadOS 18.1</a>,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v3-v20240930_turbo_632MB,openai_whisper-large-v3-v20240930_turbo_632MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C7_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v3_947MB,openai_whisper-large-v3_947MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/iPhone14%2C5_summary_2024-10-25T032747.json>iOS 17.3</a>,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-large-v3_turbo,openai_whisper-large-v3_turbo,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported,Not Supported
+openai_whisper-large-v3_turbo_954MB,openai_whisper-large-v3_turbo_954MB,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-small,openai_whisper-small,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-small.en,openai_whisper-small.en,⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href=https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data/2024-10-25T012729_6962d0d/Mac14%2C12_summary_2024-10-25T031359.json>macOS 15.0.1</a>,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-tiny,openai_whisper-tiny,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,Not Supported,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>
+openai_whisper-tiny.en,openai_whisper-tiny.en,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.0.1,✅ macOS 15.1,✅ iPadOS 17.6.1,✅ iPadOS 18.0.1,✅ iPadOS 18.1,✅ iOS 17.6.1,✅ iOS 18.0,✅ iOS 17.3,✅ iOS 17.3,✅ iOS 18.0.1<p>✅ iOS 18.0</p>

main.py ADDED Viewed

	@@ -0,0 +1,1302 @@

+"""
+Main module for the WhisperKit Evaluation Dashboard.
+This module sets up and runs the Gradio interface for the WhisperKit Evaluation Dashboard,
+allowing users to explore and compare speech recognition model performance across different
+devices, operating systems, and datasets.
+"""
+import json
+import os
+import re
+from math import ceil, floor
+import gradio as gr
+import pandas as pd
+from argmax_gradio_components import RangeSlider
+from dotenv import load_dotenv
+from huggingface_hub import login
+# Import custom constants and utility functions
+from constants import (
+    BANNER_TEXT,
+    CITATION_BUTTON_LABEL,
+    CITATION_BUTTON_TEXT,
+    COL_NAMES,
+    HEADER,
+    LANGUAGE_MAP,
+    METHODOLOGY_TEXT,
+    PERFORMANCE_TEXT,
+    QUALITY_TEXT,
+)
+from utils import (
+    add_datasets_to_performance_columns,
+    add_datasets_to_quality_columns,
+    calculate_parity,
+    create_confusion_matrix_plot,
+    create_initial_performance_column_dict,
+    create_initial_quality_column_dict,
+    css,
+    fields,
+    get_os_name_and_version,
+    make_dataset_wer_clickable_link,
+    make_model_name_clickable_link,
+    make_multilingual_model_clickable_link,
+    plot_metric,
+    read_json_line_by_line,
+)
+# Load environment variables
+load_dotenv()
+# Get the Hugging Face token from the environment variable
+HF_TOKEN = os.getenv("HF_TOKEN")
+# Use the token for login
+login(token=HF_TOKEN, add_to_git_credential=True)
+# Define repository and directory information
+repo_id = "argmaxinc/whisperkit-evals-dataset"
+directory = "xcresults/benchmark_results"
+local_dir = ""
+# Load benchmark data from JSON files
+PERFORMANCE_DATA = read_json_line_by_line("dashboard_data/performance_data.json")
+QUALITY_DATA = read_json_line_by_line("dashboard_data/quality_data.json")
+# Convert JSON data to pandas DataFrames
+quality_df = pd.json_normalize(QUALITY_DATA)
+benchmark_df = pd.json_normalize(PERFORMANCE_DATA)
+# Parse timestamps and drop timezone information so they are timezone-naive
+benchmark_df["timestamp"] = pd.to_datetime(benchmark_df["timestamp"]).dt.tz_localize(
+    None
+)
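+# Sort by model-name length, then name, with the newest timestamp first, and keep
+# only the latest row per key; model_len is a temporary sort column that is dropped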
+sorted_quality_df = (
+    quality_df.assign(model_len=quality_df["model"].str.len())
+    .sort_values(
+        by=["model_len", "model", "timestamp"],
+        ascending=[True, True, False],
+    )
+    .drop(columns=["model_len"])
+    .drop_duplicates(subset=["model"], keep="first")
+    .reset_index(drop=True)
+)
+sorted_performance_df = (
+    benchmark_df.assign(model_len=benchmark_df["model"].str.len())
+    .sort_values(
+        by=["model_len", "model", "device", "os", "timestamp"],
+        ascending=[True, True, True, True, False],
+    )
+    .drop(columns=["model_len"])
+    .drop_duplicates(subset=["model", "device", "os"], keep="first")
+    .reset_index(drop=True)
+)
+# Identify dataset-specific columns
+dataset_wer_columns = [
+    col for col in sorted_quality_df.columns if col.startswith("dataset_wer.")
+]
+dataset_speed_columns = [
+    col for col in sorted_performance_df.columns if col.startswith("dataset_speed.")
+]
+dataset_toks_columns = [
+    col
+    for col in sorted_performance_df.columns
+    if col.startswith("dataset_tokens_per_second.")
+]
+# Extract dataset names
+QUALITY_DATASETS = [col.split(".")[-1] for col in dataset_wer_columns]
+PERFORMANCE_DATASETS = [col.split(".")[-1] for col in dataset_speed_columns]
+# Prepare DataFrames for display
+model_df = sorted_quality_df[
+    ["model", "average_wer", "qoi", "timestamp"] + dataset_wer_columns
+]
+performance_df = sorted_performance_df[
+    [
+        "model",
+        "device",
+        "os",
+        "average_wer",
+        "qoi",
+        "speed",
+        "tokens_per_second",
+        "timestamp",
+    ]
+    + dataset_speed_columns
+    + dataset_toks_columns
+].copy()
+# Rename columns for clarity
+performance_df = performance_df.rename(
+    lambda x: COL_NAMES[x] if x in COL_NAMES else x, axis="columns"
+)
+model_df = model_df.rename(
+    lambda x: COL_NAMES[x] if x in COL_NAMES else x, axis="columns"
+)
+# Process dataset-specific columns
+for col in dataset_wer_columns:
+    dataset_name = col.split(".")[-1]
+    model_df = model_df.rename(columns={col: dataset_name})
+    model_df[dataset_name] = model_df.apply(
+        lambda x: make_dataset_wer_clickable_link(x, dataset_name), axis=1
+    )
+for col in dataset_speed_columns:
+    dataset_name = col.split(".")[-1]
+    performance_df = performance_df.rename(
+        columns={
+            col: f"{'Short-Form' if dataset_name == 'librispeech-10mins' else 'Long-Form'} Speed"
+        }
+    )
+for col in dataset_toks_columns:
+    dataset_name = col.split(".")[-1]
+    performance_df = performance_df.rename(
+        columns={
+            col: f"{'Short-Form' if dataset_name == 'librispeech-10mins' else 'Long-Form'} Tok/s"
+        }
+    )
+# Calculate parity with M2 Ultra
+m2_ultra_wer = (
+    performance_df[performance_df["Device"] == "Apple M2 Ultra"]
+    .groupby("Model")["Average WER"]
+    .first()
+)
+performance_df["Parity %"] = performance_df.apply(
+    lambda row: calculate_parity(m2_ultra_wer, row), axis=1
+)
+# Process model names for display
+model_df["model_raw"] = model_df["Model"].copy()
+performance_df["model_raw"] = performance_df["Model"].copy()
+model_df["Model"] = model_df["Model"].apply(make_model_name_clickable_link)
+performance_df["Model"] = performance_df["Model"].apply(make_model_name_clickable_link)
+# Extract unique devices and OS versions
+PERFORMANCE_DEVICES = performance_df["Device"].unique().tolist()
+PERFORMANCE_OS = performance_df["OS"].apply(get_os_name_and_version).unique().tolist()
+PERFORMANCE_OS.sort()
+# Create initial column dictionaries and update with dataset information
+initial_performance_column_dict = create_initial_performance_column_dict()
+initial_quality_column_dict = create_initial_quality_column_dict()
+performance_column_info = add_datasets_to_performance_columns(
+    initial_performance_column_dict, PERFORMANCE_DATASETS
+)
+quality_column_info = add_datasets_to_quality_columns(
+    initial_quality_column_dict, QUALITY_DATASETS
+)
+# Unpack the returned dictionaries
+updated_performance_column_dict = performance_column_info["column_dict"]
+updated_quality_column_dict = quality_column_info["column_dict"]
+PerformanceAutoEvalColumn = performance_column_info["AutoEvalColumn"]
+QualityAutoEvalColumn = quality_column_info["AutoEvalColumn"]
+# Define column sets for different views
+PERFORMANCE_COLS = performance_column_info["COLS"]
+QUALITY_COLS = quality_column_info["COLS"]
+PERFORMANCE_TYPES = performance_column_info["TYPES"]
+QUALITY_TYPES = quality_column_info["TYPES"]
+PERFORMANCE_ALWAYS_HERE_COLS = performance_column_info["ALWAYS_HERE_COLS"]
+QUALITY_ALWAYS_HERE_COLS = quality_column_info["ALWAYS_HERE_COLS"]
+PERFORMANCE_TOGGLE_COLS = performance_column_info["TOGGLE_COLS"]
+QUALITY_TOGGLE_COLS = quality_column_info["TOGGLE_COLS"]
+PERFORMANCE_SELECTED_COLS = performance_column_info["SELECTED_COLS"]
+QUALITY_SELECTED_COLS = quality_column_info["SELECTED_COLS"]
+def performance_filter(
+    df,
+    columns,
+    model_query,
+    exclude_models,
+    devices,
+    os,
+    short_speed_slider,
+    long_speed_slider,
+    short_toks_slider,
+    long_toks_slider,
+):
+    """
+    Filters the performance DataFrame based on specified criteria.
+    :param df: The DataFrame to be filtered.
+    :param columns: The columns to be included in the filtered DataFrame.
+    :param model_query: Semicolon-separated search terms matched against the 'Model' column.
+    :param exclude_models: Semicolon-separated terms; rows whose 'Model' matches any of them are dropped.
+    :param devices: The list of devices to keep in the 'Device' column.
+    :param os: The list of operating systems to keep in the 'OS' column.
+    :param short_speed_slider: The range of values to filter the 'Short-Form Speed' column.
+    :param long_speed_slider: The range of values to filter the 'Long-Form Speed' column.
+    :param short_toks_slider: The range of values to filter the 'Short-Form Tok/s' column.
+    :param long_toks_slider: The range of values to filter the 'Long-Form Tok/s' column.
+    :return: The filtered DataFrame.
+    """
+    # Select columns based on input and always-present columns
+    filtered_df = df[
+        PERFORMANCE_ALWAYS_HERE_COLS
+        + [c for c in PERFORMANCE_COLS if c in df.columns and c in columns]
+    ]
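+    # Filter models based on query; terms are escaped so they match literally,
+    # and empty terms (e.g. from a trailing ';') are ignored
+    if model_query:
+        queries = [re.escape(q.strip()) for q in model_query.split(";") if q.strip()]
+        if queries:
+            filtered_df = filtered_df[
+                filtered_df["Model"].str.contains("|".join(queries), case=False)
+            ]
+    # Exclude specified models
+    if exclude_models:
+        exclude_list = [
+            re.escape(m.strip()) for m in exclude_models.split(";") if m.strip()
+        ]
+        if exclude_list:
+            filtered_df = filtered_df[
+                ~filtered_df["Model"].str.contains("|".join(exclude_list), case=False)
+            ]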
+    # Filter by devices
+    filtered_df = (
+        filtered_df[
+            (
+                filtered_df["Device"].str.contains(
+                    "|".join(re.escape(q.strip()) for q in devices), case=False
+                )
+            )
+        ]
+        if devices
+        else pd.DataFrame(columns=filtered_df.columns)
+    )
+    # Filter by operating systems
+    filtered_df = (
+        filtered_df[
+            (
+                filtered_df["OS"].str.contains(
+                    "|".join(re.escape(q.strip()) for q in os), case=False
+                )
+            )
+        ]
+        if os
+        else pd.DataFrame(columns=filtered_df.columns)
+    )
+    # Apply short-form and long-form speed and tokens per second filters
+    min_short_speed, max_short_speed = short_speed_slider
+    min_long_speed, max_long_speed = long_speed_slider
+    min_short_toks, max_short_toks = short_toks_slider
+    min_long_toks, max_long_toks = long_toks_slider
+    filtered_df = filtered_df[
+        (filtered_df["Short-Form Speed"] >= min_short_speed)
+        & (filtered_df["Short-Form Speed"] <= max_short_speed)
+        & (filtered_df["Long-Form Speed"] >= min_long_speed)
+        & (filtered_df["Long-Form Speed"] <= max_long_speed)
+        & (filtered_df["Short-Form Tok/s"] >= min_short_toks)
+        & (filtered_df["Short-Form Tok/s"] <= max_short_toks)
+        & (filtered_df["Long-Form Tok/s"] >= min_long_toks)
+        & (filtered_df["Long-Form Tok/s"] <= max_long_toks)
+    ]
+    return filtered_df
+def quality_filter(df, columns, model_query, wer_slider, qoi_slider, exclude_models):
+    """
+    Filters the quality DataFrame based on specified criteria.
+    :param df: The DataFrame to be filtered.
+    :param columns: The columns to be included in the filtered DataFrame.
+    :param model_query: Semicolon-separated search terms matched against the 'Model' column.
+    :param wer_slider: The (min, max) range of values to keep in the 'Average WER' column.
+    :param qoi_slider: The (min, max) range of values to keep in the 'QoI' column.
+    :param exclude_models: Semicolon-separated terms; rows whose 'Model' matches any of them are dropped.
+    :return: The filtered DataFrame.
+    """
+    # Select columns based on input and always-present columns
+    filtered_df = df[
+        QUALITY_ALWAYS_HERE_COLS
+        + [c for c in QUALITY_COLS if c in df.columns and c in columns]
+    ]
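+    # Filter models based on query; terms are escaped so they match literally,
+    # and empty terms (e.g. from a trailing ';') are ignored
+    if model_query:
+        queries = [re.escape(q.strip()) for q in model_query.split(";") if q.strip()]
+        if queries:
+            filtered_df = filtered_df[
+                filtered_df["Model"].str.contains("|".join(queries), case=False)
+            ]
+    # Exclude specified models
+    if exclude_models:
+        exclude_list = [
+            re.escape(m.strip()) for m in exclude_models.split(";") if m.strip()
+        ]
+        if exclude_list:
+            filtered_df = filtered_df[
+                ~filtered_df["Model"].str.contains("|".join(exclude_list), case=False)
+            ]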
+    # Apply WER and QoI filters
+    min_wer_slider, max_wer_slider = wer_slider
+    min_qoi_slider, max_qoi_slider = qoi_slider
+    if "Average WER" in filtered_df.columns:
+        filtered_df = filtered_df[
+            (filtered_df["Average WER"] >= min_wer_slider)
+            & (filtered_df["Average WER"] <= max_wer_slider)
+        ]
+    if "QoI" in filtered_df.columns:
+        filtered_df = filtered_df[
+            (filtered_df["QoI"] >= min_qoi_slider)
+            & (filtered_df["QoI"] <= max_qoi_slider)
+        ]
+    return filtered_df
+diff_tab = gr.TabItem("Difference Checker", elem_id="diff_checker", id=2)
+text_diff_elems = []
+tabs = gr.Tabs(elem_id="tab-elems")
+multilingual_df = pd.read_csv("dashboard_data/multilingual_results.csv")
+multilingual_models_df = multilingual_df[["Model"]].drop_duplicates()
+multilingual_models_buttons = []
+for model in multilingual_models_df["Model"]:
+    elem_id = (
+        f"{model}".replace(" ", "_").replace('"', "").replace("'", "").replace(",", "")
+    )
+    multilingual_models_buttons.append(
+        gr.Button(value=model, elem_id=elem_id, visible=False)
+    )
+multilingual_models_df["Model"] = multilingual_models_df["Model"].apply(
+    make_multilingual_model_clickable_link
+)
+with open("dashboard_data/multilingual_confusion_matrices.json", "r") as file:
+    confusion_matrix_map = dict(json.load(file))
+def update_multilingual_results(selected_model):
+    """
+    Updates the multilingual results display based on the selected model.
+    This function processes the multilingual data for the chosen model,
+    calculates average WER for different scenarios (language hinted vs. predicted),
+    and prepares language-specific WER data for display.
+    :param selected_model: The name of the selected model
+    :return: A list containing updated components for the Gradio interface
+    """
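+    # Early exits return the same four-component shape as the normal path
+    hidden = gr.update(visible=False)
+    if selected_model is None:
+        message = "# Select a model from the dropdown to view results."
+        return [gr.update(value=message), hidden, hidden, hidden]
+    # Filter data for the selected model
+    model_data = multilingual_df[multilingual_df["Model"] == selected_model]
+    if model_data.empty:
+        message = f"# No data available for model: {selected_model}"
+        return [gr.update(value=message), hidden, hidden, hidden]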
+    # Separate data for forced and not forced scenarios
+    forced_data = model_data[model_data["Forced Tokens"] == True]
+    not_forced_data = model_data[model_data["Forced Tokens"] == False]
+    result_text = f"# Model: {selected_model}\n\n"
+    # Prepare average WER data
+    average_wer_data = []
+    if not forced_data.empty:
+        average_wer_data.append(
+            {
+                "Scenario": "Language Hinted",
+                "Average WER": forced_data.iloc[0]["Average WER"],
+            }
+        )
+    if not not_forced_data.empty:
+        average_wer_data.append(
+            {
+                "Scenario": "Language Predicted",
+                "Average WER": not_forced_data.iloc[0]["Average WER"],
+            }
+        )
+    average_wer_df = pd.DataFrame(average_wer_data)
+    average_wer_df["Average WER"] = average_wer_df["Average WER"].round(2)
+    # Prepare language-specific WER data
+    lang_columns = [col for col in model_data.columns if col.startswith("WER_")]
+    lang_wer_data = []
+    for column in lang_columns:
+        lang = column.split("_")[1]
+        forced_wer = forced_data[column].iloc[0] if not forced_data.empty else None
+        not_forced_wer = (
+            not_forced_data[column].iloc[0] if not not_forced_data.empty else None
+        )
+        if forced_wer is not None or not_forced_wer is not None:
+            lang_wer_data.append(
+                {
+                    "Language": LANGUAGE_MAP[lang],
+                    "Language Hinted WER": round(forced_wer, 2)
+                    if forced_wer is not None
+                    else "N/A",
+                    "Language Predicted WER": round(not_forced_wer, 2)
+                    if not_forced_wer is not None
+                    else "N/A",
+                }
+            )
+    lang_wer_df = pd.DataFrame(lang_wer_data)
+    lang_wer_df = lang_wer_df.fillna("No Data")
+    # Create confusion matrix plot for unforced scenario
+    unforced_plot = None
+    if selected_model in confusion_matrix_map:
+        if "not_forced" in confusion_matrix_map[selected_model]:
+            unforced_plot = create_confusion_matrix_plot(
+                confusion_matrix_map[selected_model]["not_forced"]["matrix"],
+                confusion_matrix_map[selected_model]["not_forced"]["labels"],
+                False,
+            )
+    # Return updated components for Gradio interface
+    return [
+        gr.update(value=result_text),
+        gr.update(visible=True, value=average_wer_df),
+        gr.update(visible=True, value=lang_wer_df),
+        gr.update(visible=unforced_plot is not None, value=unforced_plot),
+    ]
+font = [
+    "Zwizz Regular",  # Local font
+    "IBM Plex Mono",  # Monospace font
+    "ui-sans-serif",
+    "system-ui",
+    "sans-serif",
+]
+# Define the Gradio interface
+with gr.Blocks(css=css, theme=gr.themes.Base(font=font)) as demo:
+    # Add header and banner to the interface
+    gr.HTML(HEADER)
+    gr.HTML(BANNER_TEXT, elem_classes="markdown-text")
+    # Create tabs for different sections of the dashboard
+    with tabs.render():
+        # Performance Tab
+        with gr.TabItem("Performance", elem_id="benchmark", id=0):
+            with gr.Row():
+                with gr.Column(scale=1):
+                    with gr.Row():
+                        with gr.Column(scale=6, elem_classes="filter_models_column"):
+                            filter_performance_models = gr.Textbox(
+                                placeholder="🔍 Filter Model (separate multiple queries with ';')",
+                                label="Filter Models",
+                            )
+                        with gr.Column(scale=4, elem_classes="exclude_models_column"):
+                            exclude_performance_models = gr.Textbox(
+                                placeholder="🔍 Exclude Model (separate multiple queries with ';')",
+                                label="Exclude Models",
+                            )
+                    with gr.Row():
+                        with gr.Accordion("See All Columns", open=False):
+                            with gr.Row():
+                                with gr.Column(scale=9, elem_id="performance_columns"):
+                                    performance_shown_columns = gr.CheckboxGroup(
+                                        choices=PERFORMANCE_TOGGLE_COLS,
+                                        value=PERFORMANCE_SELECTED_COLS,
+                                        label="Toggle Columns",
+                                        elem_id="column-select",
+                                        interactive=True,
+                                    )
+                                with gr.Column(
+                                    scale=1,
+                                    min_width=200,
+                                    elem_id="performance_select_columns",
+                                ):
+                                    with gr.Row():
+                                        select_all_button = gr.Button(
+                                            "Select All",
+                                            elem_id="select-all-button",
+                                            interactive=True,
+                                        )
+                                        deselect_all_button = gr.Button(
+                                            "Deselect All",
+                                            elem_id="deselect-all-button",
+                                            interactive=True,
+                                        )
+                            def select_all_columns():
+                                return PERFORMANCE_TOGGLE_COLS
+                            def deselect_all_columns():
+                                return []
+                            select_all_button.click(
+                                select_all_columns,
+                                inputs=[],
+                                outputs=performance_shown_columns,
+                            )
+                            deselect_all_button.click(
+                                deselect_all_columns,
+                                inputs=[],
+                                outputs=performance_shown_columns,
+                            )
+                    with gr.Row():
+                        with gr.Accordion("Filter Devices", open=False):
+                            with gr.Row():
+                                with gr.Column(
+                                    scale=9, elem_id="filter_devices_column"
+                                ):
+                                    performance_shown_devices = gr.CheckboxGroup(
+                                        choices=PERFORMANCE_DEVICES,
+                                        value=PERFORMANCE_DEVICES,
+                                        label="Filter Devices",
+                                        interactive=True,
+                                    )
+                                with gr.Column(
+                                    scale=1,
+                                    min_width=200,
+                                    elem_id="filter_select_devices",
+                                ):
+                                    with gr.Row():
+                                        select_all_devices_button = gr.Button(
+                                            "Select All",
+                                            elem_id="select-all-devices-button",
+                                            interactive=True,
+                                        )
+                                        deselect_all_devices_button = gr.Button(
+                                            "Deselect All",
+                                            elem_id="deselect-all-devices-button",
+                                            interactive=True,
+                                        )
+                            def select_all_devices():
+                                return PERFORMANCE_DEVICES
+                            def deselect_all_devices():
+                                return []
+                            select_all_devices_button.click(
+                                select_all_devices,
+                                inputs=[],
+                                outputs=performance_shown_devices,
+                            )
+                            deselect_all_devices_button.click(
+                                deselect_all_devices,
+                                inputs=[],
+                                outputs=performance_shown_devices,
+                            )
+                    with gr.Row():
+                        performance_shown_os = gr.CheckboxGroup(
+                            choices=PERFORMANCE_OS,
+                            value=PERFORMANCE_OS,
+                            label="Filter OS",
+                            interactive=True,
+                        )
+                with gr.Column(scale=1):
+                    with gr.Accordion("See Performance Filters"):
+                        with gr.Row():
+                            with gr.Row():
+                                min_short_speed, max_short_speed = floor(
+                                    min(performance_df["Short-Form Speed"])
+                                ), ceil(max(performance_df["Short-Form Speed"]))
+                                short_speed_slider = RangeSlider(
+                                    value=[min_short_speed, max_short_speed],
+                                    minimum=min_short_speed,
+                                    maximum=max_short_speed,
+                                    step=0.001,
+                                    label="Short-Form Speed",
+                                )
+                            with gr.Row():
+                                min_long_speed, max_long_speed = floor(
+                                    min(performance_df["Long-Form Speed"])
+                                ), ceil(max(performance_df["Long-Form Speed"]))
+                                long_speed_slider = RangeSlider(
+                                    value=[min_long_speed, max_long_speed],
+                                    minimum=min_long_speed,
+                                    maximum=max_long_speed,
+                                    step=0.001,
+                                    label="Long-Form Speed",
+                                )
+                        with gr.Row():
+                            with gr.Row():
+                                min_short_toks, max_short_toks = floor(
+                                    min(performance_df["Short-Form Tok/s"])
+                                ), ceil(max(performance_df["Short-Form Tok/s"]))
+                                short_toks_slider = RangeSlider(
+                                    value=[min_short_toks, max_short_toks],
+                                    minimum=min_short_toks,
+                                    maximum=max_short_toks,
+                                    step=0.001,
+                                    label="Short-Form Tok/s",
+                                )
+                            with gr.Row():
+                                min_long_toks, max_long_toks = floor(
+                                    min(performance_df["Long-Form Tok/s"])
+                                ), ceil(max(performance_df["Long-Form Tok/s"]))
+                                long_toks_slider = RangeSlider(
+                                    value=[min_long_toks, max_long_toks],
+                                    minimum=min_long_toks,
+                                    maximum=max_long_toks,
+                                    step=0.001,
+                                    label="Long-Form Tok/s",
+                                )
+                    with gr.Row():
+                        gr.Markdown(PERFORMANCE_TEXT, elem_classes="markdown-text")
+            with gr.Row():
+                leaderboard_df = gr.components.Dataframe(
+                    value=performance_df[
+                        PERFORMANCE_ALWAYS_HERE_COLS + performance_shown_columns.value
+                    ],
+                    headers=PERFORMANCE_ALWAYS_HERE_COLS
+                    + performance_shown_columns.value,
+                    datatype=[
+                        c.type
+                        for c in fields(PerformanceAutoEvalColumn)
+                        if c.name in PERFORMANCE_COLS
+                    ],
+                    elem_id="leaderboard-table",
+                    elem_classes="large-table",
+                    interactive=False,
+                )
+                # Copy of the leaderboard dataframe to apply filters to
+                hidden_leaderboard_df = gr.components.Dataframe(
+                    value=performance_df,
+                    headers=PERFORMANCE_COLS,
+                    datatype=[
+                        c.type
+                        for c in fields(PerformanceAutoEvalColumn)
+                        if c.name in PERFORMANCE_COLS
+                    ],
+                    visible=False,
+                )
+                # Inputs for the dataframe filter function
+                performance_filter_inputs = [
+                    hidden_leaderboard_df,
+                    performance_shown_columns,
+                    filter_performance_models,
+                    exclude_performance_models,
+                    performance_shown_devices,
+                    performance_shown_os,
+                    short_speed_slider,
+                    long_speed_slider,
+                    short_toks_slider,
+                    long_toks_slider,
+                ]
+                filter_output = leaderboard_df
+                filter_performance_models.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                exclude_performance_models.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                performance_shown_columns.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                performance_shown_devices.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                performance_shown_os.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                short_speed_slider.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                long_speed_slider.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                short_toks_slider.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+                long_toks_slider.change(
+                    performance_filter, performance_filter_inputs, filter_output
+                )
+        # English Quality Tab
+        with gr.TabItem("English Quality", elem_id="timeline", id=1):
+            with gr.Row():
+                with gr.Column(scale=1):
+                    with gr.Row():
+                        with gr.Column(scale=6, elem_classes="filter_models_column"):
+                            filter_quality_models = gr.Textbox(
+                                placeholder="🔍 Filter Model (separate multiple queries with ';')",
+                                label="Filter Models",
+                            )
+                        with gr.Column(scale=4, elem_classes="exclude_models_column"):
+                            exclude_quality_models = gr.Textbox(
+                                placeholder="🔍 Exclude Model (separate multiple models with ';')",
+                                label="Exclude Models",
+                            )
+                    with gr.Row():
+                        with gr.Accordion("See All Columns", open=False):
+                            quality_shown_columns = gr.CheckboxGroup(
+                                choices=QUALITY_TOGGLE_COLS,
+                                value=QUALITY_SELECTED_COLS,
+                                label="Toggle Columns",
+                                elem_id="column-select",
+                                interactive=True,
+                            )
+                with gr.Column(scale=1):
+                    with gr.Accordion("See Quality Filters"):
+                        with gr.Row():
+                            with gr.Row():
+                                quality_min_avg_wer, quality_max_avg_wer = (
+                                    floor(min(model_df["Average WER"])),
+                                    ceil(max(model_df["Average WER"])) + 1,
+                                )
+                                wer_slider = RangeSlider(
+                                    value=[quality_min_avg_wer, quality_max_avg_wer],
+                                    minimum=quality_min_avg_wer,
+                                    maximum=quality_max_avg_wer,
+                                    label="Average WER",
+                                )
+                            with gr.Row():
+                                quality_min_qoi, quality_max_qoi = (
+                                    floor(min(model_df["QoI"])),
+                                    ceil(max(model_df["QoI"])) + 1,
+                                )
+                                qoi_slider = RangeSlider(
+                                    value=[quality_min_qoi, quality_max_qoi],
+                                    minimum=quality_min_qoi,
+                                    maximum=quality_max_qoi,
+                                    label="QoI",
+                                )
+                    with gr.Row():
+                        gr.Markdown(QUALITY_TEXT)
+            with gr.Row():
+                quality_leaderboard_df = gr.components.Dataframe(
+                    value=model_df[
+                        QUALITY_ALWAYS_HERE_COLS + quality_shown_columns.value
+                    ],
+                    headers=QUALITY_ALWAYS_HERE_COLS + quality_shown_columns.value,
+                    datatype=[
+                        c.type
+                        for c in fields(QualityAutoEvalColumn)
+                        if c.name in QUALITY_COLS
+                    ],
+                    elem_id="leaderboard-table",
+                    elem_classes="large-table",
+                    interactive=False,
+                )
+                # Copy of the leaderboard dataframe to apply filters to
+                hidden_quality_leaderboard_df = gr.components.Dataframe(
+                    value=model_df,
+                    headers=QUALITY_COLS,
+                    datatype=[
+                        c.type
+                        for c in fields(QualityAutoEvalColumn)
+                        if c.name in QUALITY_COLS
+                    ],
+                    visible=False,
+                )
+                # Inputs for the dataframe filter function
+                filter_inputs = [
+                    hidden_quality_leaderboard_df,
+                    quality_shown_columns,
+                    filter_quality_models,
+                    wer_slider,
+                    qoi_slider,
+                    exclude_quality_models,
+                ]
+                filter_output = quality_leaderboard_df
+                filter_quality_models.change(
+                    quality_filter, filter_inputs, filter_output
+                )
+                exclude_quality_models.change(
+                    quality_filter, filter_inputs, filter_output
+                )
+                quality_shown_columns.change(
+                    quality_filter, filter_inputs, filter_output
+                )
+                wer_slider.change(quality_filter, filter_inputs, filter_output)
+                qoi_slider.change(quality_filter, filter_inputs, filter_output)
+        # Timeline Tab
+        with gr.TabItem("Timeline", elem_id="timeline", id=4):
+            # Create subtabs for different metrics
+            with gr.Tabs():
+                with gr.TabItem("QoI", id=0):
+                    with gr.Row():
+                        with gr.Column(scale=6):
+                            filter_qoi = gr.Textbox(
+                                placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
+                                label="Filter",
+                            )
+                        with gr.Column(scale=4):
+                            exclude_qoi = gr.Textbox(
+                                placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
+                                label="Exclude",
+                            )
+                    with gr.Row():
+                        with gr.Column():
+                            qoi_plot = gr.Plot(container=True)
+                            demo.load(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "qoi",
+                                    "QoI",
+                                    "QoI Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_qoi,
+                                    exclude_qoi,
+                                ],
+                                qoi_plot,
+                            )
+                            filter_qoi.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "qoi",
+                                    "QoI",
+                                    "QoI Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_qoi,
+                                    exclude_qoi,
+                                ],
+                                qoi_plot,
+                            )
+                            exclude_qoi.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "qoi",
+                                    "QoI",
+                                    "QoI Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_qoi,
+                                    exclude_qoi,
+                                ],
+                                qoi_plot,
+                            )
+                with gr.TabItem("Average WER", id=1):
+                    with gr.Row():
+                        with gr.Column(scale=6):
+                            filter_average_wer = gr.Textbox(
+                                placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
+                                label="Filter",
+                            )
+                        with gr.Column(scale=4):
+                            exclude_average_wer = gr.Textbox(
+                                placeholder="🔍 Exclude Model-Device-OS (separate multiple queries with ';')",
+                                label="Exclude",
+                            )
+                    with gr.Row():
+                        with gr.Column():
+                            average_wer_plot = gr.Plot(container=True)
+                            demo.load(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "average_wer",
+                                    "Average WER",
+                                    "Average WER Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_average_wer,
+                                    exclude_average_wer,
+                                ],
+                                average_wer_plot,
+                            )
+                            filter_average_wer.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "average_wer",
+                                    "Average WER",
+                                    "Average WER Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_average_wer,
+                                    exclude_average_wer,
+                                ],
+                                average_wer_plot,
+                            )
+                            exclude_average_wer.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "average_wer",
+                                    "Average WER",
+                                    "Average WER Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_average_wer,
+                                    exclude_average_wer,
+                                ],
+                                average_wer_plot,
+                            )
+                with gr.TabItem("Speed", id=2):
+                    with gr.Row():
+                        with gr.Column(scale=6):
+                            filter_speed = gr.Textbox(
+                                placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
+                                label="Filter",
+                            )
+                        with gr.Column(scale=4):
+                            exclude_speed = gr.Textbox(
+                                placeholder="🔍 Exclude Model-Device-OS (separate multiple with ';')",
+                                label="Exclude",
+                            )
+                    with gr.Row():
+                        with gr.Column():
+                            speed_plot = gr.Plot(container=True)
+                            demo.load(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "speed",
+                                    "Speed",
+                                    "Speed Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_speed,
+                                    exclude_speed,
+                                ],
+                                speed_plot,
+                            )
+                            filter_speed.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "speed",
+                                    "Speed",
+                                    "Speed Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_speed,
+                                    exclude_speed,
+                                ],
+                                speed_plot,
+                            )
+                            exclude_speed.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "speed",
+                                    "Speed",
+                                    "Speed Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_speed,
+                                    exclude_speed,
+                                ],
+                                speed_plot,
+                            )
+                with gr.TabItem("Tok/s", id=3):
+                    with gr.Row():
+                        with gr.Column(scale=6):
+                            filter_toks = gr.Textbox(
+                                placeholder="🔍 Filter Model-Device-OS (separate multiple queries with ';')",
+                                label="Filter",
+                            )
+                        with gr.Column(scale=4):
+                            exclude_toks = gr.Textbox(
+                                placeholder="🔍 Exclude Model-Device-OS (separate multiple with ';')",
+                                label="Exclude",
+                            )
+                    with gr.Row():
+                        with gr.Column():
+                            toks_plot = gr.Plot(container=True)
+                            demo.load(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "tokens_per_second",
+                                    "Tok/s",
+                                    "Tok/s Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_toks,
+                                    exclude_toks,
+                                ],
+                                toks_plot,
+                            )
+                            filter_toks.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "tokens_per_second",
+                                    "Tok/s",
+                                    "Tok/s Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_toks,
+                                    exclude_toks,
+                                ],
+                                toks_plot,
+                            )
+                            exclude_toks.change(
+                                lambda x, y, z: plot_metric(
+                                    x,
+                                    "tokens_per_second",
+                                    "Tok/s",
+                                    "Tok/s Over Time for Model-Device-OS Combinations",
+                                    y,
+                                    z,
+                                ),
+                                [
+                                    gr.Dataframe(benchmark_df, visible=False),
+                                    filter_toks,
+                                    exclude_toks,
+                                ],
+                                toks_plot,
+                            )
+        # Multilingual Quality Tab
+        with gr.TabItem("Multilingual Quality", elem_id="multilingual", id=5):
+            if multilingual_df is not None:
+                with gr.Row():
+                    with gr.Column(scale=1):
+                        # Display table of multilingual models
+                        model_table = gr.Dataframe(
+                            value=multilingual_models_df,
+                            headers=["Model"],
+                            datatype=["html"],
+                            elem_classes="left-side-table",
+                        )
+                        # Placeholders for confusion matrix plots
+                        with gr.Row():
+                            unforced_confusion_matrix = gr.Plot(visible=False)
+                        with gr.Row():
+                            forced_confusion_matrix = gr.Plot(visible=False)
+                    with gr.Column(scale=1):
+                        # Display area for selected model results
+                        results_markdown = gr.Markdown(
+                            "# Select a model from the table on the left to view results.",
+                            elem_id="multilingual-results",
+                        )
+                        # Tables for displaying average WER and language-specific WER
+                        average_wer_table = gr.Dataframe(
+                            value=None, elem_id="average-wer-table", visible=False
+                        )
+                        language_wer_table = gr.Dataframe(
+                            value=None, elem_id="general-wer-table", visible=False
+                        )
+                    # Set up click event to update results when a model is selected
+                    for button in multilingual_models_buttons:
+                        button.render()
+                        button.click(
+                            fn=lambda x: update_multilingual_results(x),
+                            inputs=[button],
+                            outputs=[
+                                results_markdown,
+                                average_wer_table,
+                                language_wer_table,
+                                unforced_confusion_matrix,
+                            ],
+                        )
+            else:
+                # Display message if no multilingual data is available
+                gr.Markdown("No multilingual benchmark results available.")
+        # Device Support Tab
+        with gr.TabItem("Device Support", elem_id="device_support", id=6):
+            # Load device support data from CSV
+            support_data = pd.read_csv("dashboard_data/support_data.csv")
+            support_data.set_index(support_data.columns[0], inplace=True)
+            support_data["Model"] = support_data["Model"].apply(
+                lambda x: x.replace("_", "/")
+            )
+            support_data["Model"] = support_data["Model"].apply(
+                lambda x: make_model_name_clickable_link(x)
+            )
+            support_data = (
+                support_data.assign(model_len=support_data["Model"].str.len())
+                .sort_values(
+                    by=["model_len"],
+                    ascending=[True],
+                )
+                .drop(columns=["model_len"])
+            )
+            with gr.Row():
+                with gr.Column(scale=1):
+                    with gr.Row():
+                        with gr.Column(scale=6, elem_id="filter_models_column"):
+                            filter_support_models = gr.Textbox(
+                                placeholder="🔍 Filter Model (separate multiple queries with ';')",
+                                label="Filter Models",
+                            )
+                        with gr.Column(scale=4, elem_classes="exclude_models_column"):
+                            exclude_support_models = gr.Textbox(
+                                placeholder="🔍 Exclude Model (separate multiple models with ';')",
+                                label="Exclude Models",
+                            )
+                    with gr.Row():
+                        with gr.Accordion("See All Columns", open=False):
+                            with gr.Row():
+                                with gr.Column(scale=9):
+                                    support_shown_columns = gr.CheckboxGroup(
+                                        choices=support_data.columns.tolist()[
+                                            1:
+                                        ],  # Exclude 'Model' column
+                                        value=support_data.columns.tolist()[1:],
+                                        label="Toggle Columns",
+                                        elem_id="support-column-select",
+                                        interactive=True,
+                                    )
+                                with gr.Column(scale=1, min_width=200):
+                                    with gr.Row():
+                                        select_all_support_button = gr.Button(
+                                            "Select All",
+                                            elem_id="select-all-support-button",
+                                            interactive=True,
+                                        )
+                                        deselect_all_support_button = gr.Button(
+                                            "Deselect All",
+                                            elem_id="deselect-all-support-button",
+                                            interactive=True,
+                                        )
+            with gr.Column():
+                gr.Markdown(
+                    """
+                ### Legend
+                - ✅ Supported: The model is supported and tested on this device.
+                - ⚠️ Failed: Either the model tests failed on this device or the Speed Factor for the test is less than 1.
+                - ? Not Tested: The model is supported on this device but no test information is available.
+                - Not Supported: The model is not supported on this device as per the [WhisperKit configuration](https://huggingface.co/argmaxinc/whisperkit-coreml/blob/main/config.json).
+                """
+                )
+            # Display device support data in a table
+            device_support_table = gr.Dataframe(
+                value=support_data,
+                headers=support_data.columns.tolist(),
+                datatype=["html" for _ in support_data.columns],
+                elem_id="device-support-table",
+                elem_classes="large-table",
+                interactive=False,
+            )
+            # Hidden dataframe to store the original data
+            hidden_support_df = gr.Dataframe(value=support_data, visible=False)
+            def filter_support_data(df, columns, model_query, exclude_models):
+                filtered_df = df.copy()
+                # Filter models based on query (terms are escaped; empty terms are skipped)
+                if model_query:
+                    include_list = [
+                        re.escape(q.strip()) for q in model_query.split(";") if q.strip()
+                    ]
+                    if include_list:
+                        filtered_df = filtered_df[
+                            filtered_df["Model"].str.contains(
+                                "|".join(include_list), case=False, regex=True
+                            )
+                        ]
+                # Exclude specified models (an empty term would otherwise match every row)
+                if exclude_models:
+                    exclude_list = [
+                        re.escape(m.strip()) for m in exclude_models.split(";") if m.strip()
+                    ]
+                    if exclude_list:
+                        filtered_df = filtered_df[
+                            ~filtered_df["Model"].str.contains(
+                                "|".join(exclude_list), case=False, regex=True
+                            )
+                        ]
+                # Select columns
+                selected_columns = ["Model"] + [
+                    col for col in columns if col in df.columns
+                ]
+                filtered_df = filtered_df[selected_columns]
+                return filtered_df
+            def select_all_support_columns():
+                return support_data.columns.tolist()[1:]  # Exclude 'Model' column
+            def deselect_all_support_columns():
+                return []
+            # Connect the filter function to the input components
+            filter_inputs = [
+                hidden_support_df,
+                support_shown_columns,
+                filter_support_models,
+                exclude_support_models,
+            ]
+            filter_support_models.change(
+                filter_support_data, filter_inputs, device_support_table
+            )
+            exclude_support_models.change(
+                filter_support_data, filter_inputs, device_support_table
+            )
+            support_shown_columns.change(
+                filter_support_data, filter_inputs, device_support_table
+            )
+            # Connect select all and deselect all buttons
+            select_all_support_button.click(
+                select_all_support_columns,
+                inputs=[],
+                outputs=support_shown_columns,
+            )
+            deselect_all_support_button.click(
+                deselect_all_support_columns,
+                inputs=[],
+                outputs=support_shown_columns,
+            )
+        # Methodology Tab
+        with gr.TabItem("Methodology", elem_id="methodology", id=7):
+            gr.Markdown(METHODOLOGY_TEXT, elem_id="methodology-text")
+    # Citation section
+    with gr.Accordion("📙 Citation", open=False):
+        citation_button = gr.Textbox(
+            value=CITATION_BUTTON_TEXT,
+            label=CITATION_BUTTON_LABEL,
+            lines=7,
+            elem_id="citation-button",
+            show_copy_button=True,
+        )
+# Launch the Gradio interface
+demo.launch(debug=True, share=True, ssr_mode=False)
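Review note on the Device Support filter in main.py: the include and exclude boxes both take ';'-separated terms and join them into one case-insensitive pattern. The sketch below shows that matching on a plain list of strings, using only the standard library. The names `build_pattern` and `filter_models` are hypothetical and are not part of the commit; the sketch assumes terms should be treated as literal text and that empty terms (for example after a trailing ';') should be ignored.

```python
import re


def build_pattern(query):
    """Compile a case-insensitive alternation from a ';'-separated query.

    Returns None when the query holds no non-empty terms.
    """
    terms = [re.escape(t.strip()) for t in query.split(";") if t.strip()]
    if not terms:
        return None
    return re.compile("|".join(terms), re.IGNORECASE)


def filter_models(models, include="", exclude=""):
    """Keep names matching `include` (if given) and not matching `exclude`."""
    include_re = build_pattern(include)
    exclude_re = build_pattern(exclude)
    kept = []
    for name in models:
        if include_re is not None and not include_re.search(name):
            continue
        if exclude_re is not None and exclude_re.search(name):
            continue
        kept.append(name)
    return kept


models = [
    "openai/whisper-tiny",
    "openai/whisper-base",
    "distil-whisper/distil-large-v3",
]
only_tiny_or_large = filter_models(models, include="tiny; large")
without_distil = filter_models(models, exclude="distil;")
```

Skipping empty terms matters most for the exclude box: an empty alternative in the joined pattern matches every string, so a trailing ';' would otherwise hide every row.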

multilingual_generate.py ADDED Viewed

	@@ -0,0 +1,132 @@

+import json
+import os
+import shutil
+import sys
+from collections import defaultdict
+import numpy as np
+import pandas as pd
+from sklearn.metrics import confusion_matrix
+from utils import compute_average_wer, download_dataset
+def main():
+    """
+    Main function to orchestrate the multilingual data generation process.
+    This function performs the following steps:
+    1. Downloads multilingual evaluation data if requested.
+    2. Processes multilingual evaluation files.
+    3. Calculates and saves results, including Word Error Rate (WER) and
+       language detection confusion matrices.
+    """
+    source_repo = "argmaxinc/whisperkit-evals-multilingual"
+    source_subfolder = "WhisperKit"
+    source_directory = f"{source_repo}/{source_subfolder}"
+    if len(sys.argv) > 1 and sys.argv[1] == "download":
+        try:
+            shutil.rmtree(source_repo)
+        except FileNotFoundError:
+            print("Nothing to remove.")
+        download_dataset(source_repo, source_repo, source_subfolder)
+    results = defaultdict(
+        lambda: {
+            "average_wer": [],
+            "language_wer": defaultdict(list),
+            "language_detection": [],
+        }
+    )
+    confusion_matrices = {}
+    for subdir, _, files in os.walk(source_directory):
+        for filename in files:
+            if not filename.endswith(".json") or "summary" in filename:
+                continue
+            file_path = os.path.join(subdir, filename)
+            with open(file_path, "r") as f:
+                data = json.load(f)
+            subdir_components = subdir.split(os.path.sep)
+            is_forced = "forced" in subdir_components
+            model = subdir_components[-3] if not is_forced else subdir_components[-4]
+            key = f"{model}/{'forced' if is_forced else 'not_forced'}"
+            for item in data["results"]:
+                if "reference_language" not in item:
+                    continue
+                reference_language = item["reference_language"]
+                wer = item["wer"]
+                detected_language = item["predicted_language"]
+                result = {
+                    "reference": item["reference"],
+                    "prediction": item["prediction"],
+                }
+                results[key]["average_wer"].append(result)
+                results[key]["language_wer"][reference_language].append(result)
+                results[key]["language_detection"].append(
+                    (reference_language, detected_language)
+                )
+    calculate_and_save_results(results, confusion_matrices)
+def calculate_and_save_results(results, confusion_matrices):
+    """
+    Calculates final multilingual metrics and saves them to CSV and JSON files.
+    :param results: Dictionary containing raw multilingual evaluation data.
+    :param confusion_matrices: Dictionary to store confusion matrices for language detection.
+    This function processes the raw multilingual data, calculates average metrics,
+    creates confusion matrices for language detection, and saves the results to:
+    1. A CSV file with WER data for each model and language.
+    2. A JSON file with confusion matrices for language detection.
+    """
+    wer_data = []
+    for key, data in results.items():
+        model, forced = key.rsplit("/", 1)
+        row = {
+            "Model": model,
+            "Forced Tokens": forced == "forced",
+            "Average WER": compute_average_wer(data["average_wer"]),
+        }
+        for lang, wers in data["language_wer"].items():
+            row[f"WER_{lang}"] = compute_average_wer(wers)
+        wer_data.append(row)
+        true_languages, detected_languages = zip(*data["language_detection"])
+        unique_languages = sorted(set(true_languages))
+        cm = confusion_matrix(
+            true_languages, detected_languages, labels=unique_languages
+        )
+        row_sums = cm.sum(axis=1)
+        cm_normalized = np.zeros_like(cm, dtype=float)
+        non_zero_rows = row_sums != 0
+        cm_normalized[non_zero_rows] = (
+            cm[non_zero_rows] / row_sums[non_zero_rows, np.newaxis]
+        )
+        if model not in confusion_matrices:
+            confusion_matrices[model] = {}
+        confusion_matrices[model][forced] = {
+            "matrix": cm_normalized.tolist(),
+            "labels": unique_languages,
+        }
+    df = pd.DataFrame(wer_data)
+    df.to_csv("dashboard_data/multilingual_results.csv", index=False)
+    with open("dashboard_data/multilingual_confusion_matrices.json", "w") as f:
+        json.dump(confusion_matrices, f, indent=2)
+if __name__ == "__main__":
+    main()
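Review note on the confusion matrices in multilingual_generate.py: labels are taken from the reference languages only, each row is divided by its own sum, and rows that sum to zero are left as zeros. The sketch below reproduces that normalization with the standard library so the zero-row guard is easy to see. It is an illustrative re-implementation, not the scikit-learn code path the commit uses, and `normalized_confusion` is a hypothetical name. It assumes, as I understand scikit-learn's `labels` argument, that detections outside the label set are dropped from the counts.

```python
def normalized_confusion(pairs):
    """Build a row-normalized confusion matrix from (reference, detected) pairs.

    Labels come from the reference side only. Detected languages that are not
    in the label set are ignored, which can leave a row with no counts at all.
    """
    labels = sorted({reference for reference, _ in pairs})
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for reference, detected in pairs:
        if detected in index:
            counts[index[reference]][index[detected]] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # Guard against division by zero for rows with no in-label detections.
        matrix.append([count / total if total else 0.0 for count in row])
    return labels, matrix


pairs = [("en", "en"), ("en", "de"), ("de", "de"), ("fr", "it")]
labels, matrix = normalized_confusion(pairs)
```

Here every French sample was detected as Italian, which is not a reference language, so the `fr` row stays all zeros instead of raising a division error.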

performance_generate.py ADDED Viewed

	@@ -0,0 +1,465 @@

+import glob
+import json
+import os
+import shutil
+import sys
+import urllib
+from collections import defaultdict
+from datetime import datetime
+from statistics import mean
+import pandas as pd
+import requests
+from constants import BASE_WHISPERKIT_BENCHMARK_URL
+from text_normalizer import text_normalizer
+from utils import compute_average_wer, dir_to_json, download_dataset
+def fetch_evaluation_data(url):
+    """
+    Fetches evaluation data from the given URL.
+    :param url: The URL to fetch the evaluation data from.
+    :returns: The evaluation data as a dictionary.
+    :raises SystemExit: If the request does not return HTTP 200.
+    """
+    response = requests.get(url)
+    if response.status_code == 200:
+        return json.loads(response.text)
+    else:
+        sys.exit(f"Failed to fetch WhisperKit evals: {response.text}")
+def generate_device_map(base_dir):
+    """
+    Generates a mapping of device identifiers to their corresponding device models.
+    This function iterates through all summary files in the specified base directory and its subdirectories,
+    extracting device identifier and device model information. It stores this information in a dictionary,
+    where the keys are device identifiers and the values are device models.
+    :param base_dir: The base directory to search for summary files.
+    :returns: A dictionary mapping device identifiers to device models.
+    """
+    device_map = {}
+    # Find all summary files recursively
+    summary_files = glob.glob(f"{base_dir}/**/*summary*.json", recursive=True)
+    for file_path in summary_files:
+        try:
+            with open(file_path, "r") as f:
+                data = json.load(f)
+            # Extract device information and create simple mapping
+            if "deviceModel" in data and "deviceIdentifier" in data:
+                device_map[data["deviceIdentifier"]] = data["deviceModel"]
+        except json.JSONDecodeError:
+            print(f"Error reading {file_path}")
+        except Exception as e:
+            print(f"Error processing {file_path}: {e}")
+    # Save the device map to dashboard_data/device_map.json
+    output_path = "dashboard_data/device_map.json"
+    with open(output_path, "w") as f:
+        json.dump(device_map, f, indent=4, sort_keys=True)
+    return device_map
+def get_device_name(device):
+    """
+    Looks up the device model for a device identifier in the saved device map.
+    :param device: Device identifier (a key in dashboard_data/device_map.json).
+    :returns: The mapped device model, or the input unchanged if it is not in the map, with spaces replaced by underscores.
+    """
+    with open("dashboard_data/device_map.json", "r") as f:
+        device_map = json.load(f)
+    return device_map.get(device, device).replace(" ", "_")
+def process_benchmark_file(file_path, dataset_dfs, results):
+    """
+    Processes a single benchmark file and updates the results dictionary.
+    :param file_path: Path to the benchmark JSON file.
+    :param dataset_dfs: Dictionary of DataFrames containing dataset information.
+    :param results: Dictionary to store the processed results.
+    This function reads a benchmark JSON file, extracts relevant information,
+    and updates the results dictionary with various metrics including WER,
+    speed, tokens per second, and quality of inference (QoI).
+    :returns: A (key, dataset_name) tuple, or None if the file contains no test results.
+    """
+    with open(file_path, "r") as file:
+        test_results = json.load(file)
+    if len(test_results) == 0:
+        return
+    first_test_result = test_results[0]
+    model = first_test_result["testInfo"]["model"]
+    device = first_test_result["testInfo"]["device"]
+    dataset_dir = first_test_result["testInfo"]["datasetDir"]
+    if "iPhone" in device or "iPad" in device:
+        version_numbers = first_test_result["staticAttributes"]["osVersion"].split(".")
+        if len(version_numbers) == 3 and version_numbers[-1] == "0":
+            version_numbers.pop()
+        os_info = f"""{'iOS' if 'iPhone' in device else 'iPadOS'}_{".".join(version_numbers)}"""
+    else:
+        os_info = f"macOS_{first_test_result['staticAttributes']['osVersion']}"
+    timestamp = first_test_result["testInfo"]["date"]
+    commit_hash_timestamp = file_path.split("/")[-2]
+    commit_timestamp, commit_hash = commit_hash_timestamp.split("_")
+    key = (model, device, os_info, commit_timestamp)
+    dataset_name = dataset_dir
+    for test_result in test_results:
+        test_info = test_result["testInfo"]
+        audio_file_name = test_info["audioFile"]
+        dataset_df = dataset_dfs[dataset_name]
+        wer_entry = {
+            "prediction": text_normalizer(test_info["prediction"]),
+            "reference": text_normalizer(test_info["reference"]),
+        }
+        results[key]["timestamp"] = timestamp
+        results[key]["average_wer"].append(wer_entry)
+        results[key]["dataset_wer"][dataset_name].append(wer_entry)
+        input_audio_seconds = test_info["timings"]["inputAudioSeconds"]
+        full_pipeline = test_info["timings"]["fullPipeline"]
+        total_decoding_loops = test_info["timings"]["totalDecodingLoops"]
+        results[key]["dataset_speed"][dataset_name][
+            "inputAudioSeconds"
+        ] += input_audio_seconds
+        results[key]["dataset_speed"][dataset_name]["fullPipeline"] += full_pipeline
+        results[key]["speed"]["inputAudioSeconds"] += input_audio_seconds
+        results[key]["speed"]["fullPipeline"] += full_pipeline
+        results[key]["commit_hash"] = commit_hash
+        results[key]["commit_timestamp"] = commit_timestamp
+        results[key]["dataset_tokens_per_second"][dataset_name][
+            "totalDecodingLoops"
+        ] += total_decoding_loops
+        results[key]["dataset_tokens_per_second"][dataset_name][
+            "fullPipeline"
+        ] += full_pipeline
+        results[key]["tokens_per_second"]["totalDecodingLoops"] += total_decoding_loops
+        results[key]["tokens_per_second"]["fullPipeline"] += full_pipeline
+        audio = audio_file_name.split(".")[0]
+        if dataset_name == "earnings22-10mins":
+            audio = audio.split("-")[0]
+        dataset_row = dataset_df.loc[dataset_df["file"].str.contains(audio)].iloc[0]
+        reference_wer = dataset_row["wer"]
+        prediction_wer = test_info["wer"]
+        results[key]["qoi"].append(1 if prediction_wer <= reference_wer else 0)
+    return key, dataset_name
+def process_summary_file(file_path, results):
+    """
+    Processes a summary file and updates the results dictionary with device support information.
+    :param file_path: Path to the summary JSON file.
+    :param results: Dictionary to store the processed results.
+    This function reads a summary JSON file, extracts information about supported
+    and failed models for a specific device and OS combination, and updates the
+    results dictionary accordingly.
+    """
+    with open(file_path, "r") as file:
+        summary_data = json.load(file)
+    device = summary_data["deviceIdentifier"]
+    # Named os_info so the local variable does not shadow the imported os module
+    os_info = f"{'iPadOS' if 'iPad' in device else summary_data['osType']} {summary_data['osVersion']}"
+    commit_timestamp = summary_data["commitTimestamp"]
+    key = (device, os_info)
+    if key in results:
+        existing_timestamp = results[key]["commitTimestamp"]
+        existing_dt = datetime.strptime(existing_timestamp, "%Y-%m-%dT%H%M%S")
+        new_dt = datetime.strptime(commit_timestamp, "%Y-%m-%dT%H%M%S")
+        if new_dt <= existing_dt:
+            return
+    else:
+        results[key] = {}
+    supported_models = set(summary_data["modelsTested"])
+    failed_models = set()
+    dataset_count = 2
+    for model, value in summary_data["testResults"].items():
+        if model not in summary_data["failureInfo"]:
+            dataset_count = len(value)
+            break
+    for failed_model in summary_data["failureInfo"]:
+        if (
+            failed_model in summary_data["testResults"]
+            and len(summary_data["testResults"][failed_model]) == dataset_count
+        ):
+            continue
+        supported_models.discard(failed_model)
+        failed_models.add(failed_model)
+    results[key]["supportedModels"] = supported_models
+    results[key]["commitTimestamp"] = commit_timestamp
+    results[key]["failedModels"] = (failed_models, file_path)
+    results["modelsTested"] |= supported_models
+    results["devices"].add(device)
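Review note on process_summary_file: a summary only replaces the stored entry for a (device, OS) pair when its commit timestamp is strictly newer, with timestamps parsed using the format `%Y-%m-%dT%H%M%S`. The sketch below isolates that keep-latest rule. The function names and the `device` and `os` dictionary keys are hypothetical simplifications; only `commitTimestamp` and the format string come from the code above.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H%M%S"  # e.g. "2024-10-25T143000"


def is_newer(candidate, existing):
    """Return True when `candidate` is strictly later than `existing`."""
    return datetime.strptime(candidate, TIMESTAMP_FORMAT) > datetime.strptime(
        existing, TIMESTAMP_FORMAT
    )


def keep_latest(summaries):
    """Keep one summary per (device, os) pair, preferring the newest commit."""
    latest = {}
    for summary in summaries:
        key = (summary["device"], summary["os"])
        stored = latest.get(key)
        if stored is not None and not is_newer(
            summary["commitTimestamp"], stored["commitTimestamp"]
        ):
            continue  # An equal or older summary never overwrites the stored one.
        latest[key] = summary
    return latest


summaries = [
    {"device": "Mac14,6", "os": "macOS 15.0", "commitTimestamp": "2024-10-20T090000"},
    {"device": "Mac14,6", "os": "macOS 15.0", "commitTimestamp": "2024-10-25T143000"},
    {"device": "Mac14,6", "os": "macOS 15.0", "commitTimestamp": "2024-10-22T120000"},
]
latest = keep_latest(summaries)
```

Parsing to `datetime` before comparing keeps the rule independent of how the timestamp string happens to sort.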
+def calculate_and_save_performance_results(
+    performance_results, performance_output_path
+):
+    """
+    Calculates final performance metrics and saves them to a JSON file.
+    :param performance_results: Dictionary containing raw performance data.
+    :param performance_output_path: Path to save the processed performance results.
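+    This function processes the raw performance data, calculates average metrics,
+    and writes one JSON object per line, with each entry representing a unique
+    combination of model, device, OS, and commit timestamp. Combinations whose
+    speed factor is below 1.0 are skipped and returned as a list of
+    (model, device, os_info) tuples.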
+    """
+    not_supported = []
+    with open(performance_output_path, "w") as performance_file:
+        for key, data in performance_results.items():
+            model, device, os_info, timestamp = key
+            speed = round(
+                data["speed"]["inputAudioSeconds"] / data["speed"]["fullPipeline"], 2
+            )
+            if speed < 1.0:
+                not_supported.append((model, device, os_info))
+                continue
+            performance_entry = {
+                "model": model.replace("_", "/"),
+                "device": get_device_name(device).replace("_", " "),
+                "os": os_info.replace("_", " "),
+                "timestamp": data["timestamp"],
+                "speed": speed,
+                "tokens_per_second": round(
+                    data["tokens_per_second"]["totalDecodingLoops"]
+                    / data["tokens_per_second"]["fullPipeline"],
+                    2,
+                ),
+                "dataset_speed": {
+                    dataset: round(
+                        speed_info["inputAudioSeconds"] / speed_info["fullPipeline"], 2
+                    )
+                    for dataset, speed_info in data["dataset_speed"].items()
+                },
+                "dataset_tokens_per_second": {
+                    dataset: round(
+                        tps_info["totalDecodingLoops"] / tps_info["fullPipeline"], 2
+                    )
+                    for dataset, tps_info in data["dataset_tokens_per_second"].items()
+                },
+                "average_wer": compute_average_wer(data["average_wer"]),
+                "dataset_wer": {
+                    dataset: compute_average_wer(wer)
+                    for dataset, wer in data["dataset_wer"].items()
+                },
+                "qoi": round(mean(data["qoi"]), 2),
+                "commit_hash": data["commit_hash"],
+                "commit_timestamp": data["commit_timestamp"],
+            }
+            json.dump(performance_entry, performance_file)
+            performance_file.write("\n")
+    return not_supported
+def calculate_and_save_support_results(
+    support_results, not_supported, support_output_path
+):
+    """
+    Calculates device support results and saves them to a CSV file.
+    :param support_results: Dictionary containing device support information.
+    :param not_supported: List of (model, device, os) tuples that ran slower than real time.
+    :param support_output_path: Path to save the processed support results.
+    This function processes the device support data and creates a CSV file
+    showing which models are supported on different devices and OS versions,
+    using checkmarks, warning signs, question marks, or "Not Supported" to
+    indicate support status.
+    """
+    all_models = sorted(support_results["modelsTested"])
+    all_devices = sorted(set(support_results["devices"]))
+    df = pd.DataFrame(index=all_models, columns=["Model"] + all_devices)
+    for model in all_models:
+        row = {"Model": model}
+        for device in all_devices:
+            row[device] = ""
+        for key, data in support_results.items():
+            if key in ["modelsTested", "devices"]:
+                continue
+            (device, os) = key
+            supported_models = data["supportedModels"]
+            failed_models, file_path = data["failedModels"]
+            directories = file_path.split("/")
+            commit_file, summary_file = directories[-2], directories[-1]
+            url = f"{BASE_WHISPERKIT_BENCHMARK_URL}/{commit_file}/{urllib.parse.quote(summary_file)}"
+            if model in supported_models:
+                current_value = row[device]
+                new_value = (
+                    f"✅ {os}"
+                    if current_value == ""
+                    else f"{current_value}<p>✅ {os}</p>"
+                )
+            elif model in failed_models:
+                current_value = row[device]
+                new_value = (
+                    f"""⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href={url}>{os}</a>"""
+                    if current_value == ""
+                    else f"""{current_value}<p>⚠️ <a style='color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;' href={url}>{os}</a></p>"""
+                )
+            else:
+                current_value = row[device]
+                new_value = (
+                    f"? {os}"
+                    if current_value == ""
+                    else f"{current_value}<p>? {os}</p>"
+                )
+            row[device] = new_value
+        df.loc[model] = row
+    remove_unsupported_cells(df, not_supported)
+    cols = df.columns.tolist()
+    cols = ["Model"] + [
+        get_device_name(col).replace("_", " ") for col in cols if col != "Model"
+    ]
+    df.columns = cols
+    df.to_csv(support_output_path, index=True)
+def remove_unsupported_cells(df, not_supported):
+    """
+    Updates the DataFrame to mark unsupported model-device combinations.
+    This function reads a configuration file to determine which models are supported
+    on which devices. It then iterates over the DataFrame and sets the value to "Not Supported"
+    for any model-device combination that is not supported according to the configuration,
+    and for every combination listed in not_supported.
+    :param df: A Pandas DataFrame where the index represents models and columns represent devices.
+    :param not_supported: Iterable of (model, device, os) tuples to mark as "Not Supported".
+    """
+    with open("dashboard_data/config.json", "r") as file:
+        config_data = json.load(file)
+    device_support = config_data["device_support"]
+    for info in device_support:
+        identifiers = set(info["identifiers"])
+        supported = set(info["models"]["supported"])
+        for model in df.index:
+            for device in df.columns:
+                if (
+                    any(identifier in device for identifier in identifiers)
+                    and model not in supported
+                ):
+                    df.at[model, device] = "Not Supported"
+    for model, device, os in not_supported:
+        df.at[model, device] = "Not Supported"
+def main():
+    """
+    Main function to orchestrate the performance data generation process.
+    This function performs the following steps:
+    1. Downloads benchmark data if requested.
+    2. Fetches evaluation data for various datasets.
+    3. Processes benchmark files and summary files.
+    4. Calculates and saves performance and support results.
+    """
+    source_xcresult_repo = "argmaxinc/whisperkit-evals-dataset"
+    source_xcresult_subfolder = "benchmark_data/"
+    source_xcresult_directory = f"{source_xcresult_repo}/{source_xcresult_subfolder}"
+    if len(sys.argv) > 1 and sys.argv[1] == "download":
+        try:
+            shutil.rmtree(source_xcresult_repo)
+        except FileNotFoundError:
+            print("Nothing to remove.")
+        download_dataset(
+            source_xcresult_repo, source_xcresult_repo, source_xcresult_subfolder
+        )
+    datasets = {
+        "Earnings-22": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "LibriSpeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+        "earnings22-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "librispeech-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+        "earnings22-12hours": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "librispeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+    }
+    dataset_dfs = {}
+    for dataset_name, url in datasets.items():
+        evals = fetch_evaluation_data(url)
+        dataset_dfs[dataset_name] = pd.json_normalize(evals["results"])
+    performance_results = defaultdict(
+        lambda: {
+            "average_wer": [],
+            "dataset_wer": defaultdict(list),
+            "qoi": [],
+            "speed": {"inputAudioSeconds": 0, "fullPipeline": 0},
+            "tokens_per_second": {"totalDecodingLoops": 0, "fullPipeline": 0},
+            "dataset_speed": defaultdict(
+                lambda: {"inputAudioSeconds": 0, "fullPipeline": 0}
+            ),
+            "dataset_tokens_per_second": defaultdict(
+                lambda: {"totalDecodingLoops": 0, "fullPipeline": 0}
+            ),
+            "timestamp": None,
+            "commit_hash": None,
+            "commit_timestamp": None,
+        }
+    )
+    support_results = {"modelsTested": set(), "devices": set()}
+    generate_device_map(source_xcresult_directory)
+    for subdir, _, files in os.walk(source_xcresult_directory):
+        for filename in files:
+            file_path = os.path.join(subdir, filename)
+            if not filename.endswith(".json"):
+                continue
+            elif "summary" in filename:
+                process_summary_file(file_path, support_results)
+            else:
+                process_benchmark_file(file_path, dataset_dfs, performance_results)
+    not_supported = calculate_and_save_performance_results(
+        performance_results, "dashboard_data/performance_data.json"
+    )
+    calculate_and_save_support_results(
+        support_results, not_supported, "dashboard_data/support_data.csv"
+    )
+if __name__ == "__main__":
+    main()
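
The speed check in `calculate_and_save_performance_results` can be illustrated in isolation. The sketch below is not part of the commit. It assumes that per-file audio durations and pipeline times are summed upstream (the zero-initialized `inputAudioSeconds` and `fullPipeline` counters suggest this, but `process_benchmark_file` is not shown here). The sample numbers and the `speed_factor` helper are hypothetical.

```python
def speed_factor(input_audio_seconds, full_pipeline_seconds):
    """Ratio of audio duration to processing time, rounded as in the script."""
    return round(input_audio_seconds / full_pipeline_seconds, 2)


# Hypothetical per-file measurements: (audio seconds, pipeline seconds).
measurements = [(600.0, 20.0), (300.0, 25.0)]

# Accumulate totals first, then divide once (assumed to mirror the upstream step).
totals = {"inputAudioSeconds": 0.0, "fullPipeline": 0.0}
for audio_seconds, pipeline_seconds in measurements:
    totals["inputAudioSeconds"] += audio_seconds
    totals["fullPipeline"] += pipeline_seconds

speed = speed_factor(totals["inputAudioSeconds"], totals["fullPipeline"])

# A speed factor below 1.0 means slower than real time; the script then
# records the (model, device, os) combination as not supported.
is_supported = speed >= 1.0
```

Dividing the totals weights long recordings more heavily than averaging per-file ratios would, which may or may not match the intent of the original aggregation.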

quality_generate.py ADDED Viewed

	@@ -0,0 +1,186 @@

+import json
+import os
+import shutil
+import sys
+from collections import defaultdict
+from statistics import mean
+import pandas as pd
+import requests
+from text_normalizer import text_normalizer
+from utils import compute_average_wer, download_dataset
+def fetch_evaluation_data(url):
+    """
+    Fetches evaluation data from the given URL.
+    :param url: The URL to fetch the evaluation data from.
+    :returns: The evaluation data as a dictionary.
+    :raises: SystemExit if the request fails.
+    """
+    response = requests.get(url)
+    if response.status_code == 200:
+        return json.loads(response.text)
+    else:
+        sys.exit(f"Failed to fetch WhisperKit evals: {response.text}")
+def get_device_name(device):
+    """
+    Gets the device name from the device map if it exists.
+    :param device: String representing the device name.
+    :returns: The device name from the device map if it exists, otherwise the input device name.
+    """
+    with open("dashboard_data/device_map.json", "r") as f:
+        device_map = json.load(f)
+    return device_map.get(device, device).replace(" ", "_")
+def process_quality_file(file_path, dataset_dfs, quality_results):
+    """
+    Processes a single quality file and updates the quality_results dictionary.
+    :param file_path: Path to the quality JSON file.
+    :param dataset_dfs: Dictionary of DataFrames containing dataset information.
+    :param quality_results: Dictionary to store the processed quality results.
+    This function reads a quality JSON file, extracts relevant information,
+    and updates the quality_results dictionary with various metrics including WER
+    and Quality of Inference (QoI) for different datasets.
+    """
+    with open(file_path, "r") as file:
+        test_results = json.load(file)
+    if len(test_results) == 0:
+        return
+    metadata = test_results["metadata"]
+    test_results = test_results["results"]
+    model = file_path.split("/")[-3].replace("_", "/")
+    device = metadata["inference_context"]["device_spec"]["product_name"]
+    device = get_device_name(device)
+    timestamp = file_path.split("/")[-1].split(".")[0]
+    key = model
+    dataset_name = metadata["dataset_name"]
+    for test_result in test_results:
+        audio_file_name = test_result["file"]
+        dataset_key = "Earnings-22" if "earnings22" in dataset_name else "LibriSpeech"
+        dataset_df = dataset_dfs[dataset_key]
+        wer_entry = {
+            "prediction": text_normalizer(test_result["prediction"]),
+            "reference": text_normalizer(test_result["reference"]),
+        }
+        quality_results[key]["timestamp"] = timestamp
+        quality_results[key]["dataset_wer"][dataset_name].append(wer_entry)
+        audio = audio_file_name.split(".")[0]
+        dataset_row = dataset_df.loc[dataset_df["file"].str.contains(audio)].iloc[0]
+        reference_wer = dataset_row["wer"]
+        prediction_wer = test_result["wer"]
+        quality_results[key]["qoi"].append(1 if prediction_wer <= reference_wer else 0)
+def calculate_and_save_quality_results(quality_results, quality_output_path):
+    """
+    Calculates final quality metrics and saves them to a JSON file.
+    :param quality_results: Dictionary containing raw quality data.
+    :param quality_output_path: Path to save the processed quality results.
+    This function processes the raw quality data, calculates average metrics,
+    and writes the final results to a JSON file, with each entry representing
+    a unique model's quality metrics across different datasets, including
+    Word Error Rate (WER) and Quality of Inference (QoI).
+    """
+    with open(quality_output_path, "w") as quality_file:
+        for key, data in quality_results.items():
+            model = key
+            dataset_wers = {
+                dataset: compute_average_wer(wer)
+                for dataset, wer in data["dataset_wer"].items()
+            }
+            average_wer = (
+                sum(dataset_wers.values()) / len(dataset_wers)
+                if len(dataset_wers) != 0
+                else 0
+            )
+            quality_entry = {
+                "model": model.replace("_", "/"),
+                "timestamp": data["timestamp"],
+                "average_wer": round(average_wer, 2),
+                "dataset_wer": dataset_wers,
+                "qoi": round(mean(data["qoi"]), 2),
+            }
+            json.dump(quality_entry, quality_file)
+            quality_file.write("\n")
+def main():
+    """
+    Main function to orchestrate the quality data generation process.
+    This function performs the following steps:
+    1. Downloads quality data if requested.
+    2. Fetches evaluation data for various datasets.
+    3. Processes quality files for specific datasets.
+    4. Calculates and saves quality results, including WER and QoI metrics.
+    """
+    if len(sys.argv) > 1 and sys.argv[1] == "download":
+        try:
+            shutil.rmtree("english")
+        except FileNotFoundError:
+            print("Nothing to remove.")
+        download_dataset("argmaxinc/whisperkit-evals", "english", "WhisperKit")
+    datasets = {
+        "Earnings-22": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "LibriSpeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+        "earnings22-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "librispeech-10mins": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+        "earnings22-12hours": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/earnings22/2024-03-04_13%3A39%3A42_GMT-0800.json",
+        "librispeech": "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/resolve/main/WhisperOpenAIAPI/openai_whisper-large-v2/librispeech/2024-02-28_18%3A45%3A02_GMT-0800.json?download=true",
+    }
+    dataset_dfs = {}
+    for dataset_name, url in datasets.items():
+        evals = fetch_evaluation_data(url)
+        dataset_dfs[dataset_name] = pd.json_normalize(evals["results"])
+    source_quality_directory = "argmaxinc/english/WhisperKit/"
+    quality_results = defaultdict(
+        lambda: {
+            "average_wer": [],
+            "dataset_wer": defaultdict(list),
+            "qoi": [],
+            "timestamp": None,
+        }
+    )
+    for subdir, _, files in os.walk(source_quality_directory):
+        dataset = subdir.split("/")[-1]
+        if dataset not in ["earnings22-12hours", "librispeech"]:
+            continue
+        for filename in files:
+            if not filename.endswith(".json"):
+                continue
+            file_path = os.path.join(subdir, filename)
+            process_quality_file(file_path, dataset_dfs, quality_results)
+    calculate_and_save_quality_results(
+        quality_results, "dashboard_data/quality_data.json"
+    )
+if __name__ == "__main__":
+    main()
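
The Quality of Inference (QoI) value written by `calculate_and_save_quality_results` is the share of audio files whose WER is no worse than the reference WER for the same file. A minimal standalone sketch, not part of the commit, with a hypothetical `qoi` helper and made-up WER values:

```python
from statistics import mean


def qoi(prediction_wers, reference_wers):
    """Fraction of files where the prediction WER does not exceed the reference WER."""
    flags = [
        1 if prediction <= reference else 0
        for prediction, reference in zip(prediction_wers, reference_wers)
    ]
    return round(mean(flags), 2)


# Hypothetical per-file WERs: model under test vs. reference model.
score = qoi([0.10, 0.25, 0.05, 0.30], [0.12, 0.20, 0.05, 0.30])
```

Here three of the four files match or beat the reference, so `score` is 0.75. Ties count in the model's favor because the comparison uses `<=`, as in `process_quality_file`.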

requirements.txt ADDED Viewed

	@@ -0,0 +1,122 @@

+aiofiles
+aiohttp
+aiosignal
+altair
+annotated-types
+anyio
+argmax_gradio_components
+async-timeout
+attrs
+backports.tarfile
+build
+certifi
+cffi
+cfgv
+charset-normalizer
+click
+contourpy
+cycler
+datasets
+dill
+distlib
+dnspython
+docutils
+email_validator
+exceptiongroup
+fastapi
+fastapi-cli
+ffmpy
+filelock
+fonttools
+frozenlist
+fsspec
+gradio==5.0.1
+h11
+httpcore
+httptools
+httpx
+huggingface-hub
+identify
+idna
+importlib_metadata
+importlib_resources
+jaraco.classes
+jaraco.context
+jaraco.functools
+Jinja2
+jsonschema
+jsonschema-specifications
+keyring
+kiwisolver
+markdown-it-py
+MarkupSafe
+matplotlib
+mdurl
+more-itertools
+multidict
+multiprocess
+nh3
+nodeenv
+numpy
+orjson
+packaging
+pandas
+pillow
+pkginfo
+platformdirs
+plotly
+pre-commit
+pyarrow
+pyarrow-hotfix
+pycparser
+pydantic
+pydantic_core
+pydub
+Pygments
+pyparsing
+pyproject_hooks
+python-dateutil
+python-dotenv
+python-multipart
+pytz
+PyYAML
+readme_renderer
+referencing
+requests
+requests-toolbelt
+rfc3986
+rich
+rpds-py
+ruff
+scipy
+scikit-learn
+semantic-version
+shellingham
+six
+sniffio
+soundfile
+starlette
+tenacity
+text_normalizer
+tomli
+tomlkit
+toolz
+tqdm
+twine
+typer
+typing_extensions
+tzdata
+ujson
+urllib3
+uvicorn
+uvloop
+virtualenv
+watchfiles
+websockets
+xxhash
+yarl
+zipp
+iso639-lang
+evaluate
+jiwer
+regex

static/Zwizz-Medium.woff ADDED Viewed

Binary file (28.7 kB). View file

static/Zwizz-Regular.woff ADDED Viewed

Binary file (28.4 kB). View file

static/Zwizz-SemiBold.woff ADDED Viewed

Binary file (28.7 kB). View file

text_normalizer.py ADDED Viewed

	@@ -0,0 +1,2374 @@

+# Copyright 2022 The OpenAI team and The HuggingFace Team. All rights reserved.
+# Most of the code is copy pasted from the original whisper repository
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import re
+import unicodedata
+from fractions import Fraction
+from typing import Iterator, List, Match, Optional, Union
+import regex
+abbr = {
+    "accessorise": "accessorize",
+    "accessorised": "accessorized",
+    "accessorises": "accessorizes",
+    "accessorising": "accessorizing",
+    "acclimatisation": "acclimatization",
+    "acclimatise": "acclimatize",
+    "acclimatised": "acclimatized",
+    "acclimatises": "acclimatizes",
+    "acclimatising": "acclimatizing",
+    "accoutrements": "accouterments",
+    "aeon": "eon",
+    "aeons": "eons",
+    "aerogramme": "aerogram",
+    "aerogrammes": "aerograms",
+    "aeroplane": "airplane",
+    "aeroplanes": "airplanes",
+    "aesthete": "esthete",
+    "aesthetes": "esthetes",
+    "aesthetic": "esthetic",
+    "aesthetically": "esthetically",
+    "aesthetics": "esthetics",
+    "aetiology": "etiology",
+    "ageing": "aging",
+    "aggrandisement": "aggrandizement",
+    "agonise": "agonize",
+    "agonised": "agonized",
+    "agonises": "agonizes",
+    "agonising": "agonizing",
+    "agonisingly": "agonizingly",
+    "almanack": "almanac",
+    "almanacks": "almanacs",
+    "aluminium": "aluminum",
+    "amortisable": "amortizable",
+    "amortisation": "amortization",
+    "amortisations": "amortizations",
+    "amortise": "amortize",
+    "amortised": "amortized",
+    "amortises": "amortizes",
+    "amortising": "amortizing",
+    "amphitheatre": "amphitheater",
+    "amphitheatres": "amphitheaters",
+    "anaemia": "anemia",
+    "anaemic": "anemic",
+    "anaesthesia": "anesthesia",
+    "anaesthetic": "anesthetic",
+    "anaesthetics": "anesthetics",
+    "anaesthetise": "anesthetize",
+    "anaesthetised": "anesthetized",
+    "anaesthetises": "anesthetizes",
+    "anaesthetising": "anesthetizing",
+    "anaesthetist": "anesthetist",
+    "anaesthetists": "anesthetists",
+    "anaesthetize": "anesthetize",
+    "anaesthetized": "anesthetized",
+    "anaesthetizes": "anesthetizes",
+    "anaesthetizing": "anesthetizing",
+    "analogue": "analog",
+    "analogues": "analogs",
+    "analyse": "analyze",
+    "analysed": "analyzed",
+    "analyses": "analyzes",
+    "analysing": "analyzing",
+    "anglicise": "anglicize",
+    "anglicised": "anglicized",
+    "anglicises": "anglicizes",
+    "anglicising": "anglicizing",
+    "annualised": "annualized",
+    "antagonise": "antagonize",
+    "antagonised": "antagonized",
+    "antagonises": "antagonizes",
+    "antagonising": "antagonizing",
+    "apologise": "apologize",
+    "apologised": "apologized",
+    "apologises": "apologizes",
+    "apologising": "apologizing",
+    "appal": "appall",
+    "appals": "appalls",
+    "appetiser": "appetizer",
+    "appetisers": "appetizers",
+    "appetising": "appetizing",
+    "appetisingly": "appetizingly",
+    "arbour": "arbor",
+    "arbours": "arbors",
+    "archaeologically": "archeologically",
+    "archaeologist": "archeologist",
+    "archaeologists": "archeologists",
+    "archaeology": "archeology",
+    "archeological": "archaeological",
+    "ardour": "ardor",
+    "armour": "armor",
+    "armoured": "armored",
+    "armourer": "armorer",
+    "armourers": "armorers",
+    "armouries": "armories",
+    "armoury": "armory",
+    "artefact": "artifact",
+    "artefacts": "artifacts",
+    "authorise": "authorize",
+    "authorised": "authorized",
+    "authorises": "authorizes",
+    "authorising": "authorizing",
+    "axe": "ax",
+    "backpedalled": "backpedaled",
+    "backpedalling": "backpedaling",
+    "bannister": "banister",
+    "bannisters": "banisters",
+    "baptise": "baptize",
+    "baptised": "baptized",
+    "baptises": "baptizes",
+    "baptising": "baptizing",
+    "bastardise": "bastardize",
+    "bastardised": "bastardized",
+    "bastardises": "bastardizes",
+    "bastardising": "bastardizing",
+    "battleax": "battleaxe",
+    "baulk": "balk",
+    "baulked": "balked",
+    "baulking": "balking",
+    "baulks": "balks",
+    "bedevilled": "bedeviled",
+    "bedevilling": "bedeviling",
+    "behaviour": "behavior",
+    "behavioural": "behavioral",
+    "behaviourism": "behaviorism",
+    "behaviourist": "behaviorist",
+    "behaviourists": "behaviorists",
+    "behaviours": "behaviors",
+    "behove": "behoove",
+    "behoved": "behooved",
+    "behoves": "behooves",
+    "bejewelled": "bejeweled",
+    "belabour": "belabor",
+    "belaboured": "belabored",
+    "belabouring": "belaboring",
+    "belabours": "belabors",
+    "bevelled": "beveled",
+    "bevvies": "bevies",
+    "bevvy": "bevy",
+    "biassed": "biased",
+    "biassing": "biasing",
+    "bingeing": "binging",
+    "bougainvillaea": "bougainvillea",
+    "bougainvillaeas": "bougainvilleas",
+    "bowdlerise": "bowdlerize",
+    "bowdlerised": "bowdlerized",
+    "bowdlerises": "bowdlerizes",
+    "bowdlerising": "bowdlerizing",
+    "breathalyse": "breathalyze",
+    "breathalysed": "breathalyzed",
+    "breathalyser": "breathalyzer",
+    "breathalysers": "breathalyzers",
+    "breathalyses": "breathalyzes",
+    "breathalysing": "breathalyzing",
+    "brutalise": "brutalize",
+    "brutalised": "brutalized",
+    "brutalises": "brutalizes",
+    "brutalising": "brutalizing",
+    "busses": "buses",
+    "bussing": "busing",
+    "caesarean": "cesarean",
+    "caesareans": "cesareans",
+    "calibre": "caliber",
+    "calibres": "calibers",
+    "calliper": "caliper",
+    "callipers": "calipers",
+    "callisthenics": "calisthenics",
+    "canalise": "canalize",
+    "canalised": "canalized",
+    "canalises": "canalizes",
+    "canalising": "canalizing",
+    "cancelation": "cancellation",
+    "cancelations": "cancellations",
+    "cancelled": "canceled",
+    "cancelling": "canceling",
+    "candour": "candor",
+    "cannibalise": "cannibalize",
+    "cannibalised": "cannibalized",
+    "cannibalises": "cannibalizes",
+    "cannibalising": "cannibalizing",
+    "canonise": "canonize",
+    "canonised": "canonized",
+    "canonises": "canonizes",
+    "canonising": "canonizing",
+    "capitalise": "capitalize",
+    "capitalised": "capitalized",
+    "capitalises": "capitalizes",
+    "capitalising": "capitalizing",
+    "caramelise": "caramelize",
+    "caramelised": "caramelized",
+    "caramelises": "caramelizes",
+    "caramelising": "caramelizing",
+    "carbonise": "carbonize",
+    "carbonised": "carbonized",
+    "carbonises": "carbonizes",
+    "carbonising": "carbonizing",
+    "carolled": "caroled",
+    "carolling": "caroling",
+    "catalogue": "catalog",
+    "catalogued": "cataloged",
+    "catalogues": "catalogs",
+    "cataloguing": "cataloging",
+    "catalyse": "catalyze",
+    "catalysed": "catalyzed",
+    "catalyses": "catalyzes",
+    "catalysing": "catalyzing",
+    "categorise": "categorize",
+    "categorised": "categorized",
+    "categorises": "categorizes",
+    "categorising": "categorizing",
+    "cauterise": "cauterize",
+    "cauterised": "cauterized",
+    "cauterises": "cauterizes",
+    "cauterising": "cauterizing",
+    "cavilled": "caviled",
+    "cavilling": "caviling",
+    "centigramme": "centigram",
+    "centigrammes": "centigrams",
+    "centilitre": "centiliter",
+    "centilitres": "centiliters",
+    "centimetre": "centimeter",
+    "centimetres": "centimeters",
+    "centralise": "centralize",
+    "centralised": "centralized",
+    "centralises": "centralizes",
+    "centralising": "centralizing",
+    "centre": "center",
+    "centred": "centered",
+    "centrefold": "centerfold",
+    "centrefolds": "centerfolds",
+    "centrepiece": "centerpiece",
+    "centrepieces": "centerpieces",
+    "centres": "centers",
+    "channelled": "channeled",
+    "channelling": "channeling",
+    "characterise": "characterize",
+    "characterised": "characterized",
+    "characterises": "characterizes",
+    "characterising": "characterizing",
+    "cheque": "check",
+    "chequebook": "checkbook",
+    "chequebooks": "checkbooks",
+    "chequered": "checkered",
+    "cheques": "checks",
+    "chilli": "chili",
+    "chimaera": "chimera",
+    "chimaeras": "chimeras",
+    "chiselled": "chiseled",
+    "chiselling": "chiseling",
+    "circularise": "circularize",
+    "circularised": "circularized",
+    "circularises": "circularizes",
+    "circularising": "circularizing",
+    "civilise": "civilize",
+    "civilised": "civilized",
+    "civilises": "civilizes",
+    "civilising": "civilizing",
+    "clamour": "clamor",
+    "clamoured": "clamored",
+    "clamouring": "clamoring",
+    "clamours": "clamors",
+    "clangour": "clangor",
+    "clarinettist": "clarinetist",
+    "clarinettists": "clarinetists",
+    "collectivise": "collectivize",
+    "collectivised": "collectivized",
+    "collectivises": "collectivizes",
+    "collectivising": "collectivizing",
+    "colonisation": "colonization",
+    "colonise": "colonize",
+    "colonised": "colonized",
+    "coloniser": "colonizer",
+    "colonisers": "colonizers",
+    "colonises": "colonizes",
+    "colonising": "colonizing",
+    "colour": "color",
+    "colourant": "colorant",
+    "colourants": "colorants",
+    "coloured": "colored",
+    "coloureds": "coloreds",
+    "colourful": "colorful",
+    "colourfully": "colorfully",
+    "colouring": "coloring",
+    "colourize": "colorize",
+    "colourized": "colorized",
+    "colourizes": "colorizes",
+    "colourizing": "colorizing",
+    "colourless": "colorless",
+    "colours": "colors",
+    "commercialise": "commercialize",
+    "commercialised": "commercialized",
+    "commercialises": "commercializes",
+    "commercialising": "commercializing",
+    "compartmentalise": "compartmentalize",
+    "compartmentalised": "compartmentalized",
+    "compartmentalises": "compartmentalizes",
+    "compartmentalising": "compartmentalizing",
+    "computerise": "computerize",
+    "computerised": "computerized",
+    "computerises": "computerizes",
+    "computerising": "computerizing",
+    "conceptualise": "conceptualize",
+    "conceptualised": "conceptualized",
+    "conceptualises": "conceptualizes",
+    "conceptualising": "conceptualizing",
+    "connexion": "connection",
+    "connexions": "connections",
+    "contextualise": "contextualize",
+    "contextualised": "contextualized",
+    "contextualises": "contextualizes",
+    "contextualising": "contextualizing",
+    "cosier": "cozier",
+    "cosies": "cozies",
+    "cosiest": "coziest",
+    "cosily": "cozily",
+    "cosiness": "coziness",
+    "cosy": "cozy",
+    "councillor": "councilor",
+    "councillors": "councilors",
+    "counselled": "counseled",
+    "counselling": "counseling",
+    "counsellor": "counselor",
+    "counsellors": "counselors",
+    "crenelated": "crenellated",
+    "criminalise": "criminalize",
+    "criminalised": "criminalized",
+    "criminalises": "criminalizes",
+    "criminalising": "criminalizing",
+    "criticise": "criticize",
+    "criticised": "criticized",
+    "criticises": "criticizes",
+    "criticising": "criticizing",
+    "crueller": "crueler",
+    "cruellest": "cruelest",
+    "crystallisation": "crystallization",
+    "crystallise": "crystallize",
+    "crystallised": "crystallized",
+    "crystallises": "crystallizes",
+    "crystallising": "crystallizing",
+    "cudgelled": "cudgeled",
+    "cudgelling": "cudgeling",
+    "customise": "customize",
+    "customised": "customized",
+    "customises": "customizes",
+    "customising": "customizing",
+    "cypher": "cipher",
+    "cyphers": "ciphers",
+    "decentralisation": "decentralization",
+    "decentralise": "decentralize",
+    "decentralised": "decentralized",
+    "decentralises": "decentralizes",
+    "decentralising": "decentralizing",
+    "decriminalisation": "decriminalization",
+    "decriminalise": "decriminalize",
+    "decriminalised": "decriminalized",
+    "decriminalises": "decriminalizes",
+    "decriminalising": "decriminalizing",
+    "defence": "defense",
+    "defenceless": "defenseless",
+    "defences": "defenses",
+    "dehumanisation": "dehumanization",
+    "dehumanise": "dehumanize",
+    "dehumanised": "dehumanized",
+    "dehumanises": "dehumanizes",
+    "dehumanising": "dehumanizing",
+    "demeanour": "demeanor",
+    "demilitarisation": "demilitarization",
+    "demilitarise": "demilitarize",
+    "demilitarised": "demilitarized",
+    "demilitarises": "demilitarizes",
+    "demilitarising": "demilitarizing",
+    "demobilisation": "demobilization",
+    "demobilise": "demobilize",
+    "demobilised": "demobilized",
+    "demobilises": "demobilizes",
+    "demobilising": "demobilizing",
+    "democratisation": "democratization",
+    "democratise": "democratize",
+    "democratised": "democratized",
+    "democratises": "democratizes",
+    "democratising": "democratizing",
+    "demonise": "demonize",
+    "demonised": "demonized",
+    "demonises": "demonizes",
+    "demonising": "demonizing",
+    "demoralisation": "demoralization",
+    "demoralise": "demoralize",
+    "demoralised": "demoralized",
+    "demoralises": "demoralizes",
+    "demoralising": "demoralizing",
+    "denationalisation": "denationalization",
+    "denationalise": "denationalize",
+    "denationalised": "denationalized",
+    "denationalises": "denationalizes",
+    "denationalising": "denationalizing",
+    "deodorise": "deodorize",
+    "deodorised": "deodorized",
+    "deodorises": "deodorizes",
+    "deodorising": "deodorizing",
+    "depersonalise": "depersonalize",
+    "depersonalised": "depersonalized",
+    "depersonalises": "depersonalizes",
+    "depersonalising": "depersonalizing",
+    "deputise": "deputize",
+    "deputised": "deputized",
+    "deputises": "deputizes",
+    "deputising": "deputizing",
+    "desensitisation": "desensitization",
+    "desensitise": "desensitize",
+    "desensitised": "desensitized",
+    "desensitises": "desensitizes",
+    "desensitising": "desensitizing",
+    "destabilisation": "destabilization",
+    "destabilise": "destabilize",
+    "destabilised": "destabilized",
+    "destabilises": "destabilizes",
+    "destabilising": "destabilizing",
+    "dialled": "dialed",
+    "dialling": "dialing",
+    "dialogue": "dialog",
+    "dialogues": "dialogs",
+    "diarrhoea": "diarrhea",
+    "digitise": "digitize",
+    "digitised": "digitized",
+    "digitises": "digitizes",
+    "digitising": "digitizing",
+    "disc": "disk",
+    "discolour": "discolor",
+    "discoloured": "discolored",
+    "discolouring": "discoloring",
+    "discolours": "discolors",
+    "discs": "disks",
+    "disembowelled": "disemboweled",
+    "disembowelling": "disemboweling",
+    "disfavour": "disfavor",
+    "dishevelled": "disheveled",
+    "dishonour": "dishonor",
+    "dishonourable": "dishonorable",
+    "dishonourably": "dishonorably",
+    "dishonoured": "dishonored",
+    "dishonouring": "dishonoring",
+    "dishonours": "dishonors",
+    "disorganisation": "disorganization",
+    "disorganised": "disorganized",
+    "distil": "distill",
+    "distils": "distills",
+    "dramatisation": "dramatization",
+    "dramatisations": "dramatizations",
+    "dramatise": "dramatize",
+    "dramatised": "dramatized",
+    "dramatises": "dramatizes",
+    "dramatising": "dramatizing",
+    "draught": "draft",
+    "draughtboard": "draftboard",
+    "draughtboards": "draftboards",
+    "draughtier": "draftier",
+    "draughtiest": "draftiest",
+    "draughts": "drafts",
+    "draughtsman": "draftsman",
+    "draughtsmanship": "draftsmanship",
+    "draughtsmen": "draftsmen",
+    "draughtswoman": "draftswoman",
+    "draughtswomen": "draftswomen",
+    "draughty": "drafty",
+    "drivelled": "driveled",
+    "drivelling": "driveling",
+    "duelled": "dueled",
+    "duelling": "dueling",
+    "economise": "economize",
+    "economised": "economized",
+    "economises": "economizes",
+    "economising": "economizing",
+    "editorialise": "editorialize",
+    "editorialised": "editorialized",
+    "editorialises": "editorializes",
+    "editorialising": "editorializing",
+    "oedema": "edema",
+    "empathise": "empathize",
+    "empathised": "empathized",
+    "empathises": "empathizes",
+    "empathising": "empathizing",
+    "emphasise": "emphasize",
+    "emphasised": "emphasized",
+    "emphasises": "emphasizes",
+    "emphasising": "emphasizing",
+    "enamelled": "enameled",
+    "enamelling": "enameling",
+    "enamoured": "enamored",
+    "encyclopaedia": "encyclopedia",
+    "encyclopaedias": "encyclopedias",
+    "encyclopaedic": "encyclopedic",
+    "endeavour": "endeavor",
+    "endeavoured": "endeavored",
+    "endeavouring": "endeavoring",
+    "endeavours": "endeavors",
+    "energise": "energize",
+    "energised": "energized",
+    "energises": "energizes",
+    "energising": "energizing",
+    "enrol": "enroll",
+    "enrols": "enrolls",
+    "enthral": "enthrall",
+    "enthrals": "enthralls",
+    "epaulette": "epaulet",
+    "epaulettes": "epaulets",
+    "epicentre": "epicenter",
+    "epicentres": "epicenters",
+    "epilogue": "epilog",
+    "epilogues": "epilogs",
+    "epitomise": "epitomize",
+    "epitomised": "epitomized",
+    "epitomises": "epitomizes",
+    "epitomising": "epitomizing",
+    "equalisation": "equalization",
+    "equalise": "equalize",
+    "equalised": "equalized",
+    "equaliser": "equalizer",
+    "equalisers": "equalizers",
+    "equalises": "equalizes",
+    "equalising": "equalizing",
+    "eulogise": "eulogize",
+    "eulogised": "eulogized",
+    "eulogises": "eulogizes",
+    "eulogising": "eulogizing",
+    "evangelise": "evangelize",
+    "evangelised": "evangelized",
+    "evangelises": "evangelizes",
+    "evangelising": "evangelizing",
+    "exorcise": "exorcize",
+    "exorcised": "exorcized",
+    "exorcises": "exorcizes",
+    "exorcising": "exorcizing",
+    "extemporisation": "extemporization",
+    "extemporise": "extemporize",
+    "extemporised": "extemporized",
+    "extemporises": "extemporizes",
+    "extemporising": "extemporizing",
+    "externalisation": "externalization",
+    "externalisations": "externalizations",
+    "externalise": "externalize",
+    "externalised": "externalized",
+    "externalises": "externalizes",
+    "externalising": "externalizing",
+    "factorise": "factorize",
+    "factorised": "factorized",
+    "factorises": "factorizes",
+    "factorising": "factorizing",
+    "faecal": "fecal",
+    "faeces": "feces",
+    "familiarisation": "familiarization",
+    "familiarise": "familiarize",
+    "familiarised": "familiarized",
+    "familiarises": "familiarizes",
+    "familiarising": "familiarizing",
+    "fantasise": "fantasize",
+    "fantasised": "fantasized",
+    "fantasises": "fantasizes",
+    "fantasising": "fantasizing",
+    "favour": "favor",
+    "favourable": "favorable",
+    "favourably": "favorably",
+    "favoured": "favored",
+    "favouring": "favoring",
+    "favourite": "favorite",
+    "favourites": "favorites",
+    "favouritism": "favoritism",
+    "favours": "favors",
+    "feminise": "feminize",
+    "feminised": "feminized",
+    "feminises": "feminizes",
+    "feminising": "feminizing",
+    "fertilisation": "fertilization",
+    "fertilise": "fertilize",
+    "fertilised": "fertilized",
+    "fertiliser": "fertilizer",
+    "fertilisers": "fertilizers",
+    "fertilises": "fertilizes",
+    "fertilising": "fertilizing",
+    "fervour": "fervor",
+    "fibre": "fiber",
+    "fibreglass": "fiberglass",
+    "fibres": "fibers",
+    "fictionalisation": "fictionalization",
+    "fictionalisations": "fictionalizations",
+    "fictionalise": "fictionalize",
+    "fictionalised": "fictionalized",
+    "fictionalises": "fictionalizes",
+    "fictionalising": "fictionalizing",
+    "fillet": "filet",
+    "filleted": "fileted",
+    "filleting": "fileting",
+    "fillets": "filets",
+    "finalisation": "finalization",
+    "finalise": "finalize",
+    "finalised": "finalized",
+    "finalises": "finalizes",
+    "finalising": "finalizing",
+    "flautist": "flutist",
+    "flautists": "flutists",
+    "flavour": "flavor",
+    "flavoured": "flavored",
+    "flavouring": "flavoring",
+    "flavourings": "flavorings",
+    "flavourless": "flavorless",
+    "flavours": "flavors",
+    "flavoursome": "flavorsome",
+    "flyer / flier": "flier / flyer",
+    "foetal": "fetal",
+    "foetid": "fetid",
+    "foetus": "fetus",
+    "foetuses": "fetuses",
+    "formalisation": "formalization",
+    "formalise": "formalize",
+    "formalised": "formalized",
+    "formalises": "formalizes",
+    "formalising": "formalizing",
+    "fossilisation": "fossilization",
+    "fossilise": "fossilize",
+    "fossilised": "fossilized",
+    "fossilises": "fossilizes",
+    "fossilising": "fossilizing",
+    "fraternisation": "fraternization",
+    "fraternise": "fraternize",
+    "fraternised": "fraternized",
+    "fraternises": "fraternizes",
+    "fraternising": "fraternizing",
+    "fulfil": "fulfill",
+    "fulfilment": "fulfillment",
+    "fulfils": "fulfills",
+    "funnelled": "funneled",
+    "funnelling": "funneling",
+    "gage": "gauge",
+    "gaged": "gauged",
+    "gages": "gauges",
+    "gaging": "gauging",
+    "galvanise": "galvanize",
+    "galvanised": "galvanized",
+    "galvanises": "galvanizes",
+    "galvanising": "galvanizing",
+    "gambolled": "gamboled",
+    "gambolling": "gamboling",
+    "gaol": "jail",
+    "gaolbird": "jailbird",
+    "gaolbirds": "jailbirds",
+    "gaolbreak": "jailbreak",
+    "gaolbreaks": "jailbreaks",
+    "gaoled": "jailed",
+    "gaoler": "jailer",
+    "gaolers": "jailers",
+    "gaoling": "jailing",
+    "gaols": "jails",
+    "gasses": "gases",
+    "generalisation": "generalization",
+    "generalisations": "generalizations",
+    "generalise": "generalize",
+    "generalised": "generalized",
+    "generalises": "generalizes",
+    "generalising": "generalizing",
+    "ghettoise": "ghettoize",
+    "ghettoised": "ghettoized",
+    "ghettoises": "ghettoizes",
+    "ghettoising": "ghettoizing",
+    "gipsies": "gypsies",
+    "glamor": "glamour",
+    "glamorise": "glamorize",
+    "glamorised": "glamorized",
+    "glamorises": "glamorizes",
+    "glamorising": "glamorizing",
+    "globalisation": "globalization",
+    "globalise": "globalize",
+    "globalised": "globalized",
+    "globalises": "globalizes",
+    "globalising": "globalizing",
+    "glueing": "gluing",
+    "goitre": "goiter",
+    "goitres": "goiters",
+    "gonorrhoea": "gonorrhea",
+    "gramme": "gram",
+    "grammes": "grams",
+    "gravelled": "graveled",
+    "grey": "gray",
+    "greyed": "grayed",
+    "greying": "graying",
+    "greyish": "grayish",
+    "greyness": "grayness",
+    "greys": "grays",
+    "grovelled": "groveled",
+    "grovelling": "groveling",
+    "groyne": "groin",
+    "groynes": "groins",
+    "gruelling": "grueling",
+    "gruellingly": "gruelingly",
+    "gryphon": "griffin",
+    "gryphons": "griffins",
+    "gynaecological": "gynecological",
+    "gynaecologist": "gynecologist",
+    "gynaecologists": "gynecologists",
+    "gynaecology": "gynecology",
+    "haematological": "hematological",
+    "haematologist": "hematologist",
+    "haematologists": "hematologists",
+    "haematology": "hematology",
+    "haemoglobin": "hemoglobin",
+    "haemophilia": "hemophilia",
+    "haemophiliac": "hemophiliac",
+    "haemophiliacs": "hemophiliacs",
+    "haemorrhage": "hemorrhage",
+    "haemorrhaged": "hemorrhaged",
+    "haemorrhages": "hemorrhages",
+    "haemorrhaging": "hemorrhaging",
+    "haemorrhoids": "hemorrhoids",
+    "harbour": "harbor",
+    "harboured": "harbored",
+    "harbouring": "harboring",
+    "harbours": "harbors",
+    "harmonisation": "harmonization",
+    "harmonise": "harmonize",
+    "harmonised": "harmonized",
+    "harmonises": "harmonizes",
+    "harmonising": "harmonizing",
+    "homoeopath": "homeopath",
+    "homoeopathic": "homeopathic",
+    "homoeopaths": "homeopaths",
+    "homoeopathy": "homeopathy",
+    "homogenise": "homogenize",
+    "homogenised": "homogenized",
+    "homogenises": "homogenizes",
+    "homogenising": "homogenizing",
+    "honour": "honor",
+    "honourable": "honorable",
+    "honourably": "honorably",
+    "honoured": "honored",
+    "honouring": "honoring",
+    "honours": "honors",
+    "hospitalisation": "hospitalization",
+    "hospitalise": "hospitalize",
+    "hospitalised": "hospitalized",
+    "hospitalises": "hospitalizes",
+    "hospitalising": "hospitalizing",
+    "humanise": "humanize",
+    "humanised": "humanized",
+    "humanises": "humanizes",
+    "humanising": "humanizing",
+    "humour": "humor",
+    "humoured": "humored",
+    "humouring": "humoring",
+    "humourless": "humorless",
+    "humours": "humors",
+    "hybridise": "hybridize",
+    "hybridised": "hybridized",
+    "hybridises": "hybridizes",
+    "hybridising": "hybridizing",
+    "hypnotise": "hypnotize",
+    "hypnotised": "hypnotized",
+    "hypnotises": "hypnotizes",
+    "hypnotising": "hypnotizing",
+    "hypothesise": "hypothesize",
+    "hypothesised": "hypothesized",
+    "hypothesises": "hypothesizes",
+    "hypothesising": "hypothesizing",
+    "idealisation": "idealization",
+    "idealise": "idealize",
+    "idealised": "idealized",
+    "idealises": "idealizes",
+    "idealising": "idealizing",
+    "idolise": "idolize",
+    "idolised": "idolized",
+    "idolises": "idolizes",
+    "idolising": "idolizing",
+    "immobilisation": "immobilization",
+    "immobilise": "immobilize",
+    "immobilised": "immobilized",
+    "immobiliser": "immobilizer",
+    "immobilisers": "immobilizers",
+    "immobilises": "immobilizes",
+    "immobilising": "immobilizing",
+    "immortalise": "immortalize",
+    "immortalised": "immortalized",
+    "immortalises": "immortalizes",
+    "immortalising": "immortalizing",
+    "immunisation": "immunization",
+    "immunise": "immunize",
+    "immunised": "immunized",
+    "immunises": "immunizes",
+    "immunising": "immunizing",
+    "impanelled": "impaneled",
+    "impanelling": "impaneling",
+    "imperilled": "imperiled",
+    "imperilling": "imperiling",
+    "individualise": "individualize",
+    "individualised": "individualized",
+    "individualises": "individualizes",
+    "individualising": "individualizing",
+    "industrialise": "industrialize",
+    "industrialised": "industrialized",
+    "industrialises": "industrializes",
+    "industrialising": "industrializing",
+    "inflexion": "inflection",
+    "inflexions": "inflections",
+    "initialise": "initialize",
+    "initialised": "initialized",
+    "initialises": "initializes",
+    "initialising": "initializing",
+    "initialled": "initialed",
+    "initialling": "initialing",
+    "instal": "install",
+    "instalment": "installment",
+    "instalments": "installments",
+    "instals": "installs",
+    "instil": "instill",
+    "instils": "instills",
+    "institutionalisation": "institutionalization",
+    "institutionalise": "institutionalize",
+    "institutionalised": "institutionalized",
+    "institutionalises": "institutionalizes",
+    "institutionalising": "institutionalizing",
+    "intellectualise": "intellectualize",
+    "intellectualised": "intellectualized",
+    "intellectualises": "intellectualizes",
+    "intellectualising": "intellectualizing",
+    "internalisation": "internalization",
+    "internalise": "internalize",
+    "internalised": "internalized",
+    "internalises": "internalizes",
+    "internalising": "internalizing",
+    "internationalisation": "internationalization",
+    "internationalise": "internationalize",
+    "internationalised": "internationalized",
+    "internationalises": "internationalizes",
+    "internationalising": "internationalizing",
+    "ionisation": "ionization",
+    "ionise": "ionize",
+    "ionised": "ionized",
+    "ioniser": "ionizer",
+    "ionisers": "ionizers",
+    "ionises": "ionizes",
+    "ionising": "ionizing",
+    "italicise": "italicize",
+    "italicised": "italicized",
+    "italicises": "italicizes",
+    "italicising": "italicizing",
+    "itemise": "itemize",
+    "itemised": "itemized",
+    "itemises": "itemizes",
+    "itemising": "itemizing",
+    "jeopardise": "jeopardize",
+    "jeopardised": "jeopardized",
+    "jeopardises": "jeopardizes",
+    "jeopardising": "jeopardizing",
+    "jewelled": "jeweled",
+    "jeweller": "jeweler",
+    "jewellers": "jewelers",
+    "jewellery": "jewelry",
+    "judgement": "judgment",
+    "kilogramme": "kilogram",
+    "kilogrammes": "kilograms",
+    "kilometre": "kilometer",
+    "kilometres": "kilometers",
+    "labelled": "labeled",
+    "labelling": "labeling",
+    "labour": "labor",
+    "laboured": "labored",
+    "labourer": "laborer",
+    "labourers": "laborers",
+    "labouring": "laboring",
+    "labours": "labors",
+    "lacklustre": "lackluster",
+    "legalisation": "legalization",
+    "legalise": "legalize",
+    "legalised": "legalized",
+    "legalises": "legalizes",
+    "legalising": "legalizing",
+    "legitimise": "legitimize",
+    "legitimised": "legitimized",
+    "legitimises": "legitimizes",
+    "legitimising": "legitimizing",
+    "leukaemia": "leukemia",
+    "levelled": "leveled",
+    "leveller": "leveler",
+    "levellers": "levelers",
+    "levelling": "leveling",
+    "libelled": "libeled",
+    "libelling": "libeling",
+    "libellous": "libelous",
+    "liberalisation": "liberalization",
+    "liberalise": "liberalize",
+    "liberalised": "liberalized",
+    "liberalises": "liberalizes",
+    "liberalising": "liberalizing",
+    "licence": "license",
+    "licenced": "licensed",
+    "licences": "licenses",
+    "licencing": "licensing",
+    "likeable": "likable",
+    "lionisation": "lionization",
+    "lionise": "lionize",
+    "lionised": "lionized",
+    "lionises": "lionizes",
+    "lionising": "lionizing",
+    "liquidise": "liquidize",
+    "liquidised": "liquidized",
+    "liquidiser": "liquidizer",
+    "liquidisers": "liquidizers",
+    "liquidises": "liquidizes",
+    "liquidising": "liquidizing",
+    "litre": "liter",
+    "litres": "liters",
+    "localise": "localize",
+    "localised": "localized",
+    "localises": "localizes",
+    "localising": "localizing",
+    "louvre": "louver",
+    "louvred": "louvered",
+    "louvres": "louvers",
+    "lustre": "luster",
+    "magnetise": "magnetize",
+    "magnetised": "magnetized",
+    "magnetises": "magnetizes",
+    "magnetising": "magnetizing",
+    "manoeuvrability": "maneuverability",
+    "manoeuvrable": "maneuverable",
+    "manoeuvre": "maneuver",
+    "manoeuvred": "maneuvered",
+    "manoeuvres": "maneuvers",
+    "manoeuvring": "maneuvering",
+    "manoeuvrings": "maneuverings",
+    "marginalisation": "marginalization",
+    "marginalise": "marginalize",
+    "marginalised": "marginalized",
+    "marginalises": "marginalizes",
+    "marginalising": "marginalizing",
+    "marshalled": "marshaled",
+    "marshalling": "marshaling",
+    "marvelled": "marveled",
+    "marvelling": "marveling",
+    "marvellous": "marvelous",
+    "marvellously": "marvelously",
+    "materialisation": "materialization",
+    "materialise": "materialize",
+    "materialised": "materialized",
+    "materialises": "materializes",
+    "materialising": "materializing",
+    "maximisation": "maximization",
+    "maximise": "maximize",
+    "maximised": "maximized",
+    "maximises": "maximizes",
+    "maximising": "maximizing",
+    "meagre": "meager",
+    "mechanisation": "mechanization",
+    "mechanise": "mechanize",
+    "mechanised": "mechanized",
+    "mechanises": "mechanizes",
+    "mechanising": "mechanizing",
+    "mediaeval": "medieval",
+    "memorialise": "memorialize",
+    "memorialised": "memorialized",
+    "memorialises": "memorializes",
+    "memorialising": "memorializing",
+    "memorise": "memorize",
+    "memorised": "memorized",
+    "memorises": "memorizes",
+    "memorising": "memorizing",
+    "mesmerise": "mesmerize",
+    "mesmerised": "mesmerized",
+    "mesmerises": "mesmerizes",
+    "mesmerising": "mesmerizing",
+    "metabolise": "metabolize",
+    "metabolised": "metabolized",
+    "metabolises": "metabolizes",
+    "metabolising": "metabolizing",
+    "metre": "meter",
+    "metres": "meters",
+    "mhm": "hmm",
+    "micrometre": "micrometer",
+    "micrometres": "micrometers",
+    "militarise": "militarize",
+    "militarised": "militarized",
+    "militarises": "militarizes",
+    "militarising": "militarizing",
+    "milligramme": "milligram",
+    "milligrammes": "milligrams",
+    "millilitre": "milliliter",
+    "millilitres": "milliliters",
+    "millimetre": "millimeter",
+    "millimetres": "millimeters",
+    "miniaturisation": "miniaturization",
+    "miniaturise": "miniaturize",
+    "miniaturised": "miniaturized",
+    "miniaturises": "miniaturizes",
+    "miniaturising": "miniaturizing",
+    "minibusses": "minibuses",
+    "minimise": "minimize",
+    "minimised": "minimized",
+    "minimises": "minimizes",
+    "minimising": "minimizing",
+    "misbehaviour": "misbehavior",
+    "misdemeanour": "misdemeanor",
+    "misdemeanours": "misdemeanors",
+    "misspelt": "misspelled",
+    "mitre": "miter",
+    "mitres": "miters",
+    "mm": "hmm",
+    "mmm": "hmm",
+    "mobilisation": "mobilization",
+    "mobilise": "mobilize",
+    "mobilised": "mobilized",
+    "mobilises": "mobilizes",
+    "mobilising": "mobilizing",
+    "modelled": "modeled",
+    "modeller": "modeler",
+    "modellers": "modelers",
+    "modelling": "modeling",
+    "modernise": "modernize",
+    "modernised": "modernized",
+    "modernises": "modernizes",
+    "modernising": "modernizing",
+    "moisturise": "moisturize",
+    "moisturised": "moisturized",
+    "moisturiser": "moisturizer",
+    "moisturisers": "moisturizers",
+    "moisturises": "moisturizes",
+    "moisturising": "moisturizing",
+    "monologue": "monolog",
+    "monologues": "monologs",
+    "monopolisation": "monopolization",
+    "monopolise": "monopolize",
+    "monopolised": "monopolized",
+    "monopolises": "monopolizes",
+    "monopolising": "monopolizing",
+    "moralise": "moralize",
+    "moralised": "moralized",
+    "moralises": "moralizes",
+    "moralising": "moralizing",
+    "motorised": "motorized",
+    "mould": "mold",
+    "moulded": "molded",
+    "moulder": "molder",
+    "mouldered": "moldered",
+    "mouldering": "moldering",
+    "moulders": "molders",
+    "mouldier": "moldier",
+    "mouldiest": "moldiest",
+    "moulding": "molding",
+    "mouldings": "moldings",
+    "moulds": "molds",
+    "mouldy": "moldy",
+    "moult": "molt",
+    "moulted": "molted",
+    "moulting": "molting",
+    "moults": "molts",
+    "moustache": "mustache",
+    "moustached": "mustached",
+    "moustaches": "mustaches",
+    "moustachioed": "mustachioed",
+    "multicoloured": "multicolored",
+    "nationalisation": "nationalization",
+    "nationalisations": "nationalizations",
+    "nationalise": "nationalize",
+    "nationalised": "nationalized",
+    "nationalises": "nationalizes",
+    "nationalising": "nationalizing",
+    "naturalisation": "naturalization",
+    "naturalise": "naturalize",
+    "naturalised": "naturalized",
+    "naturalises": "naturalizes",
+    "naturalising": "naturalizing",
+    "neighbour": "neighbor",
+    "neighbourhood": "neighborhood",
+    "neighbourhoods": "neighborhoods",
+    "neighbouring": "neighboring",
+    "neighbourliness": "neighborliness",
+    "neighbourly": "neighborly",
+    "neighbours": "neighbors",
+    "neutralisation": "neutralization",
+    "neutralise": "neutralize",
+    "neutralised": "neutralized",
+    "neutralises": "neutralizes",
+    "neutralising": "neutralizing",
+    "normalisation": "normalization",
+    "normalise": "normalize",
+    "normalised": "normalized",
+    "normalises": "normalizes",
+    "normalising": "normalizing",
+    "odour": "odor",
+    "odourless": "odorless",
+    "odours": "odors",
+    "oesophagus": "esophagus",
+    "oesophaguses": "esophaguses",
+    "oestrogen": "estrogen",
+    "offence": "offense",
+    "offences": "offenses",
+    "omelette": "omelet",
+    "omelettes": "omelets",
+    "optimise": "optimize",
+    "optimised": "optimized",
+    "optimises": "optimizes",
+    "optimising": "optimizing",
+    "organisation": "organization",
+    "organisational": "organizational",
+    "organisations": "organizations",
+    "organise": "organize",
+    "organised": "organized",
+    "organiser": "organizer",
+    "organisers": "organizers",
+    "organises": "organizes",
+    "organising": "organizing",
+    "orthopaedic": "orthopedic",
+    "orthopaedics": "orthopedics",
+    "ostracise": "ostracize",
+    "ostracised": "ostracized",
+    "ostracises": "ostracizes",
+    "ostracising": "ostracizing",
+    "outmanoeuvre": "outmaneuver",
+    "outmanoeuvred": "outmaneuvered",
+    "outmanoeuvres": "outmaneuvers",
+    "outmanoeuvring": "outmaneuvering",
+    "overemphasise": "overemphasize",
+    "overemphasised": "overemphasized",
+    "overemphasises": "overemphasizes",
+    "overemphasising": "overemphasizing",
+    "oxidisation": "oxidization",
+    "oxidise": "oxidize",
+    "oxidised": "oxidized",
+    "oxidises": "oxidizes",
+    "oxidising": "oxidizing",
+    "paederast": "pederast",
+    "paederasts": "pederasts",
+    "paediatric": "pediatric",
+    "paediatrician": "pediatrician",
+    "paediatricians": "pediatricians",
+    "paediatrics": "pediatrics",
+    "paedophile": "pedophile",
+    "paedophiles": "pedophiles",
+    "paedophilia": "pedophilia",
+    "palaeolithic": "paleolithic",
+    "palaeontologist": "paleontologist",
+    "palaeontologists": "paleontologists",
+    "palaeontology": "paleontology",
+    "panelled": "paneled",
+    "panelling": "paneling",
+    "panellist": "panelist",
+    "panellists": "panelists",
+    "paralyse": "paralyze",
+    "paralysed": "paralyzed",
+    "paralyses": "paralyzes",
+    "paralysing": "paralyzing",
+    "parcelled": "parceled",
+    "parcelling": "parceling",
+    "parlour": "parlor",
+    "parlours": "parlors",
+    "particularise": "particularize",
+    "particularised": "particularized",
+    "particularises": "particularizes",
+    "particularising": "particularizing",
+    "passivisation": "passivization",
+    "passivise": "passivize",
+    "passivised": "passivized",
+    "passivises": "passivizes",
+    "passivising": "passivizing",
+    "pasteurisation": "pasteurization",
+    "pasteurise": "pasteurize",
+    "pasteurised": "pasteurized",
+    "pasteurises": "pasteurizes",
+    "pasteurising": "pasteurizing",
+    "patronise": "patronize",
+    "patronised": "patronized",
+    "patronises": "patronizes",
+    "patronising": "patronizing",
+    "patronisingly": "patronizingly",
+    "pedalled": "pedaled",
+    "pedalling": "pedaling",
+    "pedestrianisation": "pedestrianization",
+    "pedestrianise": "pedestrianize",
+    "pedestrianised": "pedestrianized",
+    "pedestrianises": "pedestrianizes",
+    "pedestrianising": "pedestrianizing",
+    "penalise": "penalize",
+    "penalised": "penalized",
+    "penalises": "penalizes",
+    "penalising": "penalizing",
+    "pencilled": "penciled",
+    "pencilling": "penciling",
+    "personalise": "personalize",
+    "personalised": "personalized",
+    "personalises": "personalizes",
+    "personalising": "personalizing",
+    "pharmacopoeia": "pharmacopeia",
+    "pharmacopoeias": "pharmacopeias",
+    "philosophise": "philosophize",
+    "philosophised": "philosophized",
+    "philosophises": "philosophizes",
+    "philosophising": "philosophizing",
+    "philtre": "philter",
+    "philtres": "philters",
+    "phoney": "phony",
+    "plagiarise": "plagiarize",
+    "plagiarised": "plagiarized",
+    "plagiarises": "plagiarizes",
+    "plagiarising": "plagiarizing",
+    "plough": "plow",
+    "ploughed": "plowed",
+    "ploughing": "plowing",
+    "ploughman": "plowman",
+    "ploughmen": "plowmen",
+    "ploughs": "plows",
+    "ploughshare": "plowshare",
+    "ploughshares": "plowshares",
+    "polarisation": "polarization",
+    "polarise": "polarize",
+    "polarised": "polarized",
+    "polarises": "polarizes",
+    "polarising": "polarizing",
+    "politicisation": "politicization",
+    "politicise": "politicize",
+    "politicised": "politicized",
+    "politicises": "politicizes",
+    "politicising": "politicizing",
+    "popularisation": "popularization",
+    "popularise": "popularize",
+    "popularised": "popularized",
+    "popularises": "popularizes",
+    "popularising": "popularizing",
+    "pouffe": "pouf",
+    "pouffes": "poufs",
+    "practise": "practice",
+    "practised": "practiced",
+    "practises": "practices",
+    "practising": "practicing",
+    "praesidium": "presidium",
+    "praesidiums": "presidiums",
+    "pressurisation": "pressurization",
+    "pressurise": "pressurize",
+    "pressurised": "pressurized",
+    "pressurises": "pressurizes",
+    "pressurising": "pressurizing",
+    "pretence": "pretense",
+    "pretences": "pretenses",
+    "primaeval": "primeval",
+    "prioritisation": "prioritization",
+    "prioritise": "prioritize",
+    "prioritised": "prioritized",
+    "prioritises": "prioritizes",
+    "prioritising": "prioritizing",
+    "privatisation": "privatization",
+    "privatisations": "privatizations",
+    "privatise": "privatize",
+    "privatised": "privatized",
+    "privatises": "privatizes",
+    "privatising": "privatizing",
+    "professionalisation": "professionalization",
+    "professionalise": "professionalize",
+    "professionalised": "professionalized",
+    "professionalises": "professionalizes",
+    "professionalising": "professionalizing",
+    "programme": "program",
+    "programmes": "programs",
+    "prologue": "prolog",
+    "prologues": "prologs",
+    "propagandise": "propagandize",
+    "propagandised": "propagandized",
+    "propagandises": "propagandizes",
+    "propagandising": "propagandizing",
+    "proselytise": "proselytize",
+    "proselytised": "proselytized",
+    "proselytiser": "proselytizer",
+    "proselytisers": "proselytizers",
+    "proselytises": "proselytizes",
+    "proselytising": "proselytizing",
+    "psychoanalyse": "psychoanalyze",
+    "psychoanalysed": "psychoanalyzed",
+    "psychoanalyses": "psychoanalyzes",
+    "psychoanalysing": "psychoanalyzing",
+    "publicise": "publicize",
+    "publicised": "publicized",
+    "publicises": "publicizes",
+    "publicising": "publicizing",
+    "pulverisation": "pulverization",
+    "pulverise": "pulverize",
+    "pulverised": "pulverized",
+    "pulverises": "pulverizes",
+    "pulverising": "pulverizing",
+    "pummelled": "pummeled",
+    "pummelling": "pummeling",
+    "pyjama": "pajama",
+    "pyjamas": "pajamas",
+    "pzazz": "pizzazz",
+    "quarrelled": "quarreled",
+    "quarrelling": "quarreling",
+    "radicalise": "radicalize",
+    "radicalised": "radicalized",
+    "radicalises": "radicalizes",
+    "radicalising": "radicalizing",
+    "rancour": "rancor",
+    "randomise": "randomize",
+    "randomised": "randomized",
+    "randomises": "randomizes",
+    "randomising": "randomizing",
+    "rationalisation": "rationalization",
+    "rationalisations": "rationalizations",
+    "rationalise": "rationalize",
+    "rationalised": "rationalized",
+    "rationalises": "rationalizes",
+    "rationalising": "rationalizing",
+    "ravelled": "raveled",
+    "ravelling": "raveling",
+    "realisable": "realizable",
+    "realisation": "realization",
+    "realisations": "realizations",
+    "realise": "realize",
+    "realised": "realized",
+    "realises": "realizes",
+    "realising": "realizing",
+    "recognisable": "recognizable",
+    "recognisably": "recognizably",
+    "recognisance": "recognizance",
+    "recognise": "recognize",
+    "recognised": "recognized",
+    "recognises": "recognizes",
+    "recognising": "recognizing",
+    "reconnoitre": "reconnoiter",
+    "reconnoitred": "reconnoitered",
+    "reconnoitres": "reconnoiters",
+    "reconnoitring": "reconnoitering",
+    "refuelled": "refueled",
+    "refuelling": "refueling",
+    "regularisation": "regularization",
+    "regularise": "regularize",
+    "regularised": "regularized",
+    "regularises": "regularizes",
+    "regularising": "regularizing",
+    "remodelled": "remodeled",
+    "remodelling": "remodeling",
+    "remould": "remold",
+    "remoulded": "remolded",
+    "remoulding": "remolding",
+    "remoulds": "remolds",
+    "reorganisation": "reorganization",
+    "reorganisations": "reorganizations",
+    "reorganise": "reorganize",
+    "reorganised": "reorganized",
+    "reorganises": "reorganizes",
+    "reorganising": "reorganizing",
+    "revelled": "reveled",
+    "reveller": "reveler",
+    "revellers": "revelers",
+    "revelling": "reveling",
+    "revitalise": "revitalize",
+    "revitalised": "revitalized",
+    "revitalises": "revitalizes",
+    "revitalising": "revitalizing",
+    "revolutionise": "revolutionize",
+    "revolutionised": "revolutionized",
+    "revolutionises": "revolutionizes",
+    "revolutionising": "revolutionizing",
+    "rhapsodise": "rhapsodize",
+    "rhapsodised": "rhapsodized",
+    "rhapsodises": "rhapsodizes",
+    "rhapsodising": "rhapsodizing",
+    "rigour": "rigor",
+    "rigours": "rigors",
+    "ritualised": "ritualized",
+    "rivalled": "rivaled",
+    "rivalling": "rivaling",
+    "romanticise": "romanticize",
+    "romanticised": "romanticized",
+    "romanticises": "romanticizes",
+    "romanticising": "romanticizing",
+    "rumour": "rumor",
+    "rumoured": "rumored",
+    "rumours": "rumors",
+    "sabre": "saber",
+    "sabres": "sabers",
+    "saltpetre": "saltpeter",
+    "sanitise": "sanitize",
+    "sanitised": "sanitized",
+    "sanitises": "sanitizes",
+    "sanitising": "sanitizing",
+    "satirise": "satirize",
+    "satirised": "satirized",
+    "satirises": "satirizes",
+    "satirising": "satirizing",
+    "saviour": "savior",
+    "saviours": "saviors",
+    "savour": "savor",
+    "savoured": "savored",
+    "savouries": "savories",
+    "savouring": "savoring",
+    "savours": "savors",
+    "savoury": "savory",
+    "scandalise": "scandalize",
+    "scandalised": "scandalized",
+    "scandalises": "scandalizes",
+    "scandalising": "scandalizing",
+    "sceptic": "skeptic",
+    "sceptical": "skeptical",
+    "sceptically": "skeptically",
+    "scepticism": "skepticism",
+    "sceptics": "skeptics",
+    "sceptre": "scepter",
+    "sceptres": "scepters",
+    "scrutinise": "scrutinize",
+    "scrutinised": "scrutinized",
+    "scrutinises": "scrutinizes",
+    "scrutinising": "scrutinizing",
+    "secularisation": "secularization",
+    "secularise": "secularize",
+    "secularised": "secularized",
+    "secularises": "secularizes",
+    "secularising": "secularizing",
+    "sensationalise": "sensationalize",
+    "sensationalised": "sensationalized",
+    "sensationalises": "sensationalizes",
+    "sensationalising": "sensationalizing",
+    "sensitise": "sensitize",
+    "sensitised": "sensitized",
+    "sensitises": "sensitizes",
+    "sensitising": "sensitizing",
+    "sentimentalise": "sentimentalize",
+    "sentimentalised": "sentimentalized",
+    "sentimentalises": "sentimentalizes",
+    "sentimentalising": "sentimentalizing",
+    "sepulchre": "sepulcher",
+    "sepulchres": "sepulchers",
+    "serialisation": "serialization",
+    "serialisations": "serializations",
+    "serialise": "serialize",
+    "serialised": "serialized",
+    "serialises": "serializes",
+    "serialising": "serializing",
+    "sermonise": "sermonize",
+    "sermonised": "sermonized",
+    "sermonises": "sermonizes",
+    "sermonising": "sermonizing",
+    "sheikh": "sheik",
+    "shovelled": "shoveled",
+    "shovelling": "shoveling",
+    "shrivelled": "shriveled",
+    "shrivelling": "shriveling",
+    "signalise": "signalize",
+    "signalised": "signalized",
+    "signalises": "signalizes",
+    "signalising": "signalizing",
+    "signalled": "signaled",
+    "signalling": "signaling",
+    "smoulder": "smolder",
+    "smouldered": "smoldered",
+    "smouldering": "smoldering",
+    "smoulders": "smolders",
+    "snivelled": "sniveled",
+    "snivelling": "sniveling",
+    "snorkelled": "snorkeled",
+    "snorkelling": "snorkeling",
+    "snowplough": "snowplow",
+    "snowploughs": "snowplows",
+    "socialisation": "socialization",
+    "socialise": "socialize",
+    "socialised": "socialized",
+    "socialises": "socializes",
+    "socialising": "socializing",
+    "sodomise": "sodomize",
+    "sodomised": "sodomized",
+    "sodomises": "sodomizes",
+    "sodomising": "sodomizing",
+    "solemnise": "solemnize",
+    "solemnised": "solemnized",
+    "solemnises": "solemnizes",
+    "solemnising": "solemnizing",
+    "sombre": "somber",
+    "specialisation": "specialization",
+    "specialisations": "specializations",
+    "specialise": "specialize",
+    "specialised": "specialized",
+    "specialises": "specializes",
+    "specialising": "specializing",
+    "spectre": "specter",
+    "spectres": "specters",
+    "spiralled": "spiraled",
+    "spiralling": "spiraling",
+    "splendour": "splendor",
+    "splendours": "splendors",
+    "squirrelled": "squirreled",
+    "squirrelling": "squirreling",
+    "stabilisation": "stabilization",
+    "stabilise": "stabilize",
+    "stabilised": "stabilized",
+    "stabiliser": "stabilizer",
+    "stabilisers": "stabilizers",
+    "stabilises": "stabilizes",
+    "stabilising": "stabilizing",
+    "standardisation": "standardization",
+    "standardise": "standardize",
+    "standardised": "standardized",
+    "standardises": "standardizes",
+    "standardising": "standardizing",
+    "stencilled": "stenciled",
+    "stencilling": "stenciling",
+    "sterilisation": "sterilization",
+    "sterilisations": "sterilizations",
+    "sterilise": "sterilize",
+    "sterilised": "sterilized",
+    "steriliser": "sterilizer",
+    "sterilisers": "sterilizers",
+    "sterilises": "sterilizes",
+    "sterilising": "sterilizing",
+    "stigmatisation": "stigmatization",
+    "stigmatise": "stigmatize",
+    "stigmatised": "stigmatized",
+    "stigmatises": "stigmatizes",
+    "stigmatising": "stigmatizing",
+    "storey": "story",
+    "storeys": "stories",
+    "subsidisation": "subsidization",
+    "subsidise": "subsidize",
+    "subsidised": "subsidized",
+    "subsidiser": "subsidizer",
+    "subsidisers": "subsidizers",
+    "subsidises": "subsidizes",
+    "subsidising": "subsidizing",
+    "succour": "succor",
+    "succoured": "succored",
+    "succouring": "succoring",
+    "succours": "succors",
+    "sulphate": "sulfate",
+    "sulphates": "sulfates",
+    "sulphide": "sulfide",
+    "sulphides": "sulfides",
+    "sulphur": "sulfur",
+    "sulphurous": "sulfurous",
+    "summarise": "summarize",
+    "summarised": "summarized",
+    "summarises": "summarizes",
+    "summarising": "summarizing",
+    "swivelled": "swiveled",
+    "swivelling": "swiveling",
+    "symbolise": "symbolize",
+    "symbolised": "symbolized",
+    "symbolises": "symbolizes",
+    "symbolising": "symbolizing",
+    "sympathise": "sympathize",
+    "sympathised": "sympathized",
+    "sympathiser": "sympathizer",
+    "sympathisers": "sympathizers",
+    "sympathises": "sympathizes",
+    "sympathising": "sympathizing",
+    "synchronisation": "synchronization",
+    "synchronise": "synchronize",
+    "synchronised": "synchronized",
+    "synchronises": "synchronizes",
+    "synchronising": "synchronizing",
+    "synthesise": "synthesize",
+    "synthesised": "synthesized",
+    "synthesiser": "synthesizer",
+    "synthesisers": "synthesizers",
+    "synthesises": "synthesizes",
+    "synthesising": "synthesizing",
+    "syphon": "siphon",
+    "syphoned": "siphoned",
+    "syphoning": "siphoning",
+    "syphons": "siphons",
+    "systematisation": "systematization",
+    "systematise": "systematize",
+    "systematised": "systematized",
+    "systematises": "systematizes",
+    "systematising": "systematizing",
+    "tantalise": "tantalize",
+    "tantalised": "tantalized",
+    "tantalises": "tantalizes",
+    "tantalising": "tantalizing",
+    "tantalisingly": "tantalizingly",
+    "tasselled": "tasseled",
+    "technicolour": "technicolor",
+    "temporise": "temporize",
+    "temporised": "temporized",
+    "temporises": "temporizes",
+    "temporising": "temporizing",
+    "tenderise": "tenderize",
+    "tenderised": "tenderized",
+    "tenderises": "tenderizes",
+    "tenderising": "tenderizing",
+    "terrorise": "terrorize",
+    "terrorised": "terrorized",
+    "terrorises": "terrorizes",
+    "terrorising": "terrorizing",
+    "theatre": "theater",
+    "theatregoer": "theatergoer",
+    "theatregoers": "theatergoers",
+    "theatres": "theaters",
+    "theorise": "theorize",
+    "theorised": "theorized",
+    "theorises": "theorizes",
+    "theorising": "theorizing",
+    "tonne": "ton",
+    "tonnes": "tons",
+    "towelled": "toweled",
+    "towelling": "toweling",
+    "toxaemia": "toxemia",
+    "tranquillise": "tranquilize",
+    "tranquillised": "tranquilized",
+    "tranquilliser": "tranquilizer",
+    "tranquillisers": "tranquilizers",
+    "tranquillises": "tranquilizes",
+    "tranquillising": "tranquilizing",
+    "tranquillity": "tranquility",
+    "tranquillize": "tranquilize",
+    "tranquillized": "tranquilized",
+    "tranquillizer": "tranquilizer",
+    "tranquillizers": "tranquilizers",
+    "tranquillizes": "tranquilizes",
+    "tranquillizing": "tranquilizing",
+    "tranquilly": "tranquility",
+    "transistorised": "transistorized",
+    "traumatise": "traumatize",
+    "traumatised": "traumatized",
+    "traumatises": "traumatizes",
+    "traumatising": "traumatizing",
+    "travelled": "traveled",
+    "traveller": "traveler",
+    "travellers": "travelers",
+    "travelling": "traveling",
+    "travelog": "travelogue",
+    "travelogs": "travelogues",
+    "trialled": "trialed",
+    "trialling": "trialing",
+    "tricolour": "tricolor",
+    "tricolours": "tricolors",
+    "trivialise": "trivialize",
+    "trivialised": "trivialized",
+    "trivialises": "trivializes",
+    "trivialising": "trivializing",
+    "tumour": "tumor",
+    "tumours": "tumors",
+    "tunnelled": "tunneled",
+    "tunnelling": "tunneling",
+    "tyrannise": "tyrannize",
+    "tyrannised": "tyrannized",
+    "tyrannises": "tyrannizes",
+    "tyrannising": "tyrannizing",
+    "tyre": "tire",
+    "tyres": "tires",
+    "unauthorised": "unauthorized",
+    "uncivilised": "uncivilized",
+    "underutilised": "underutilized",
+    "unequalled": "unequaled",
+    "unfavourable": "unfavorable",
+    "unfavourably": "unfavorably",
+    "unionisation": "unionization",
+    "unionise": "unionize",
+    "unionised": "unionized",
+    "unionises": "unionizes",
+    "unionising": "unionizing",
+    "unorganised": "unorganized",
+    "unravelled": "unraveled",
+    "unravelling": "unraveling",
+    "unrecognisable": "unrecognizable",
+    "unrecognised": "unrecognized",
+    "unrivalled": "unrivaled",
+    "unsavoury": "unsavory",
+    "untrammelled": "untrammeled",
+    "urbanisation": "urbanization",
+    "urbanise": "urbanize",
+    "urbanised": "urbanized",
+    "urbanises": "urbanizes",
+    "urbanising": "urbanizing",
+    "utilisable": "utilizable",
+    "utilisation": "utilization",
+    "utilise": "utilize",
+    "utilised": "utilized",
+    "utilises": "utilizes",
+    "utilising": "utilizing",
+    "valour": "valor",
+    "vandalise": "vandalize",
+    "vandalised": "vandalized",
+    "vandalises": "vandalizes",
+    "vandalising": "vandalizing",
+    "vaporisation": "vaporization",
+    "vaporise": "vaporize",
+    "vaporised": "vaporized",
+    "vaporises": "vaporizes",
+    "vaporising": "vaporizing",
+    "vapour": "vapor",
+    "vapours": "vapors",
+    "verbalise": "verbalize",
+    "verbalised": "verbalized",
+    "verbalises": "verbalizes",
+    "verbalising": "verbalizing",
+    "victimisation": "victimization",
+    "victimise": "victimize",
+    "victimised": "victimized",
+    "victimises": "victimizes",
+    "victimising": "victimizing",
+    "videodisc": "videodisk",
+    "videodiscs": "videodisks",
+    "vigour": "vigor",
+    "visualisation": "visualization",
+    "visualisations": "visualizations",
+    "visualise": "visualize",
+    "visualised": "visualized",
+    "visualises": "visualizes",
+    "visualising": "visualizing",
+    "vocalisation": "vocalization",
+    "vocalisations": "vocalizations",
+    "vocalise": "vocalize",
+    "vocalised": "vocalized",
+    "vocalises": "vocalizes",
+    "vocalising": "vocalizing",
+    "vulcanised": "vulcanized",
+    "vulgarisation": "vulgarization",
+    "vulgarise": "vulgarize",
+    "vulgarised": "vulgarized",
+    "vulgarises": "vulgarizes",
+    "vulgarising": "vulgarizing",
+    "waggon": "wagon",
+    "waggons": "wagons",
+    "watercolour": "watercolor",
+    "watercolours": "watercolors",
+    "weaselled": "weaseled",
+    "weaselling": "weaseling",
+    "westernisation": "westernization",
+    "westernise": "westernize",
+    "westernised": "westernized",
+    "westernises": "westernizes",
+    "westernising": "westernizing",
+    "womanise": "womanize",
+    "womanised": "womanized",
+    "womaniser": "womanizer",
+    "womanisers": "womanizers",
+    "womanises": "womanizes",
+    "womanising": "womanizing",
+    "woollen": "woolen",
+    "woollens": "woolens",
+    "woollies": "woolies",
+    "woolly": "wooly",
+    "worshipped": "worshiped",
+    "worshipper": "worshiper",
+    "worshipping": "worshiping",
+    "yodelled": "yodeled",
+    "yodelling": "yodeling",
+    "yoghourt": "yogurt",
+    "yoghourts": "yogurts",
+    "yoghurt": "yogurt",
+    "yoghurts": "yogurts",
+}
+# non-ASCII letters that are not separated by "NFKD" normalization
+ADDITIONAL_DIACRITICS = {
+    "œ": "oe",
+    "Œ": "OE",
+    "ø": "o",
+    "Ø": "O",
+    "æ": "ae",
+    "Æ": "AE",
+    "ß": "ss",
+    "ẞ": "SS",
+    "đ": "d",
+    "Đ": "D",
+    "ð": "d",
+    "Ð": "D",
+    "þ": "th",
+    "Þ": "th",
+    "ł": "l",
+    "Ł": "L",
+}
+def remove_symbols_and_diacritics(s: str, keep=""):
+    """
+    Replace any other markers, symbols, and punctuations with a space, and drop any diacritics
+    (category 'Mn' and some manual mappings)
+    """
+    def replace_character(char):
+        if char in keep:
+            return char
+        elif char in ADDITIONAL_DIACRITICS:
+            return ADDITIONAL_DIACRITICS[char]
+        elif unicodedata.category(char) == "Mn":
+            return ""
+        elif unicodedata.category(char)[0] in "MSP":
+            return " "
+        return char
+    return "".join(replace_character(c) for c in unicodedata.normalize("NFKD", s))
+def remove_symbols(s: str):
+    """
+    Replace any other markers, symbols, punctuations with a space, keeping diacritics
+    """
+    return "".join(
+        " " if unicodedata.category(c)[0] in "MSP" else c
+        for c in unicodedata.normalize("NFKC", s)
+    )
+class BasicTextNormalizer:
+    def __init__(self, remove_diacritics: bool = False, split_letters: bool = False):
+        self.clean = (
+            remove_symbols_and_diacritics if remove_diacritics else remove_symbols
+        )
+        self.split_letters = split_letters
+    def __call__(self, s: str):
+        s = s.lower()
+        s = re.sub(r"[<\[][^>\]]*[>\]]", "", s)  # remove words between brackets
+        s = re.sub(r"\(([^)]+?)\)", "", s)  # remove words between parentheses
+        s = self.clean(s).lower()
+        if self.split_letters:
+            s = " ".join(regex.findall(r"\X", s, regex.U))
+        s = re.sub(
+            r"\s+", " ", s
+        )  # replace any successive whitespace characters with a space
+        return s
+class EnglishNumberNormalizer:
+    """
+    Convert any spelled-out numbers into Arabic numerals, while also:
+    - removing any commas
+    - keeping suffixes such as `1960s`, `274th`, `32nd`, etc.
+    - spelling out currency symbols after the number, e.g. `$20 million` -> `20000000 dollars`
+    - spelling out `one` and `ones`
+    - interpreting successive single-digit numbers as nominal: `one oh one` -> `101`
+    """
+    def __init__(self):
+        super().__init__()
+        self.zeros = {"o", "oh", "zero"}
+        # fmt: off
+        self.ones = {
+            name: i
+            for i, name in enumerate(
+                [
+                    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten",
+                    "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
+                    "eighteen", "nineteen"],
+                start=1,
+            )
+        }
+        # fmt: on
+        self.ones_plural = {
+            "sixes" if name == "six" else name + "s": (value, "s")
+            for name, value in self.ones.items()
+        }
+        self.ones_ordinal = {
+            "zeroth": (0, "th"),
+            "first": (1, "st"),
+            "second": (2, "nd"),
+            "third": (3, "rd"),
+            "fifth": (5, "th"),
+            "twelfth": (12, "th"),
+            **{
+                name + ("h" if name.endswith("t") else "th"): (value, "th")
+                for name, value in self.ones.items()
+                if value > 3 and value != 5 and value != 12
+            },
+        }
+        self.ones_suffixed = {**self.ones_plural, **self.ones_ordinal}
+        self.tens = {
+            "twenty": 20,
+            "thirty": 30,
+            "forty": 40,
+            "fifty": 50,
+            "sixty": 60,
+            "seventy": 70,
+            "eighty": 80,
+            "ninety": 90,
+        }
+        self.tens_plural = {
+            name.replace("y", "ies"): (value, "s") for name, value in self.tens.items()
+        }
+        self.tens_ordinal = {
+            name.replace("y", "ieth"): (value, "th")
+            for name, value in self.tens.items()
+        }
+        self.tens_suffixed = {**self.tens_plural, **self.tens_ordinal}
+        self.multipliers = {
+            "hundred": 100,
+            "thousand": 1_000,
+            "million": 1_000_000,
+            "billion": 1_000_000_000,
+            "trillion": 1_000_000_000_000,
+            "quadrillion": 1_000_000_000_000_000,
+            "quintillion": 1_000_000_000_000_000_000,
+            "sextillion": 1_000_000_000_000_000_000_000,
+            "septillion": 1_000_000_000_000_000_000_000_000,
+            "octillion": 1_000_000_000_000_000_000_000_000_000,
+            "nonillion": 1_000_000_000_000_000_000_000_000_000_000,
+            "decillion": 1_000_000_000_000_000_000_000_000_000_000_000,
+        }
+        self.multipliers_plural = {
+            name + "s": (value, "s") for name, value in self.multipliers.items()
+        }
+        self.multipliers_ordinal = {
+            name + "th": (value, "th") for name, value in self.multipliers.items()
+        }
+        self.multipliers_suffixed = {
+            **self.multipliers_plural,
+            **self.multipliers_ordinal,
+        }
+        self.decimals = {*self.ones, *self.tens, *self.zeros}
+        self.preceding_prefixers = {
+            "minus": "-",
+            "negative": "-",
+            "plus": "+",
+            "positive": "+",
+        }
+        self.following_prefixers = {
+            "pound": "£",
+            "pounds": "£",
+            "euro": "€",
+            "euros": "€",
+            "dollar": "$",
+            "dollars": "$",
+            "cent": "¢",
+            "cents": "¢",
+        }
+        self.prefixes = set(
+            list(self.preceding_prefixers.values())
+            + list(self.following_prefixers.values())
+        )
+        self.suffixers = {
+            "per": {"cent": "%"},
+            "percent": "%",
+        }
+        self.specials = {"and", "double", "triple", "point"}
+        self.words = {
+            key
+            for mapping in [
+                self.zeros,
+                self.ones,
+                self.ones_suffixed,
+                self.tens,
+                self.tens_suffixed,
+                self.multipliers,
+                self.multipliers_suffixed,
+                self.preceding_prefixers,
+                self.following_prefixers,
+                self.suffixers,
+                self.specials,
+            ]
+            for key in mapping
+        }
+        self.literal_words = {"one", "ones"}
+    def process_words(self, words: List[str]) -> Iterator[str]:
+        prefix: Optional[str] = None
+        value: Optional[Union[str, int]] = None
+        skip = False
+        def to_fraction(s: str):
+            try:
+                return Fraction(s)
+            except ValueError:
+                return None
+        def output(result: Union[str, int]):
+            nonlocal prefix, value
+            result = str(result)
+            if prefix is not None:
+                result = prefix + result
+            value = None
+            prefix = None
+            return result
+        if len(words) == 0:
+            return
+        for i, current in enumerate(words):
+            prev = words[i - 1] if i != 0 else None
+            next = words[i + 1] if i != len(words) - 1 else None
+            if skip:
+                skip = False
+                continue
+            next_is_numeric = next is not None and re.match(r"^\d+(\.\d+)?$", next)
+            has_prefix = current[0] in self.prefixes
+            current_without_prefix = current[1:] if has_prefix else current
+            if re.match(r"^\d+(\.\d+)?$", current_without_prefix):
+                # arabic numbers (potentially with signs and fractions)
+                f = to_fraction(current_without_prefix)
+                if f is None:
+                    raise ValueError("Converting the fraction failed")
+                if value is not None:
+                    if isinstance(value, str) and value.endswith("."):
+                        # concatenate decimals / ip address components
+                        value = str(value) + str(current)
+                        continue
+                    else:
+                        yield output(value)
+                prefix = current[0] if has_prefix else prefix
+                if f.denominator == 1:
+                    value = f.numerator  # store integers as int
+                else:
+                    value = current_without_prefix
+            elif current not in self.words:
+                # non-numeric words
+                if value is not None:
+                    yield output(value)
+                yield output(current)
+            elif current in self.zeros:
+                value = str(value or "") + "0"
+            elif current in self.ones:
+                ones = self.ones[current]
+                if value is None:
+                    value = ones
+                elif isinstance(value, str) or prev in self.ones:
+                    if (
+                        prev in self.tens and ones < 10
+                    ):  # replace the last zero with the digit
+                        value = value[:-1] + str(ones)
+                    else:
+                        value = str(value) + str(ones)
+                elif ones < 10:
+                    if value % 10 == 0:
+                        value += ones
+                    else:
+                        value = str(value) + str(ones)
+                else:  # eleven to nineteen
+                    if value % 100 == 0:
+                        value += ones
+                    else:
+                        value = str(value) + str(ones)
+            elif current in self.ones_suffixed:
+                # plural or ordinal; yield the number right away
+                ones, suffix = self.ones_suffixed[current]
+                if value is None:
+                    yield output(str(ones) + suffix)
+                elif isinstance(value, str) or prev in self.ones:
+                    if prev in self.tens and ones < 10:
+                        yield output(value[:-1] + str(ones) + suffix)
+                    else:
+                        yield output(str(value) + str(ones) + suffix)
+                elif ones < 10:
+                    if value % 10 == 0:
+                        yield output(str(value + ones) + suffix)
+                    else:
+                        yield output(str(value) + str(ones) + suffix)
+                else:  # eleven to nineteen
+                    if value % 100 == 0:
+                        yield output(str(value + ones) + suffix)
+                    else:
+                        yield output(str(value) + str(ones) + suffix)
+                value = None
+            elif current in self.tens:
+                tens = self.tens[current]
+                if value is None:
+                    value = tens
+                elif isinstance(value, str):
+                    value = str(value) + str(tens)
+                else:
+                    if value % 100 == 0:
+                        value += tens
+                    else:
+                        value = str(value) + str(tens)
+            elif current in self.tens_suffixed:
+                # plural or ordinal; yield the number right away
+                tens, suffix = self.tens_suffixed[current]
+                if value is None:
+                    yield output(str(tens) + suffix)
+                elif isinstance(value, str):
+                    yield output(str(value) + str(tens) + suffix)
+                else:
+                    if value % 100 == 0:
+                        yield output(str(value + tens) + suffix)
+                    else:
+                        yield output(str(value) + str(tens) + suffix)
+            elif current in self.multipliers:
+                multiplier = self.multipliers[current]
+                if value is None:
+                    value = multiplier
+                elif isinstance(value, str) or value == 0:
+                    f = to_fraction(value)
+                    p = f * multiplier if f is not None else None
+                    if f is not None and p.denominator == 1:
+                        value = p.numerator
+                    else:
+                        yield output(value)
+                        value = multiplier
+                else:
+                    before = value // 1000 * 1000
+                    residual = value % 1000
+                    value = before + residual * multiplier
+            elif current in self.multipliers_suffixed:
+                multiplier, suffix = self.multipliers_suffixed[current]
+                if value is None:
+                    yield output(str(multiplier) + suffix)
+                elif isinstance(value, str):
+                    f = to_fraction(value)
+                    p = f * multiplier if f is not None else None
+                    if f is not None and p.denominator == 1:
+                        yield output(str(p.numerator) + suffix)
+                    else:
+                        yield output(value)
+                        yield output(str(multiplier) + suffix)
+                else:  # int
+                    before = value // 1000 * 1000
+                    residual = value % 1000
+                    value = before + residual * multiplier
+                    yield output(str(value) + suffix)
+                value = None
+            elif current in self.preceding_prefixers:
+                # apply prefix (positive, minus, etc.) if it precedes a number
+                if value is not None:
+                    yield output(value)
+                if next in self.words or next_is_numeric:
+                    prefix = self.preceding_prefixers[current]
+                else:
+                    yield output(current)
+            elif current in self.following_prefixers:
+                # apply prefix (dollars, cents, etc.) only after a number
+                if value is not None:
+                    prefix = self.following_prefixers[current]
+                    yield output(value)
+                else:
+                    yield output(current)
+            elif current in self.suffixers:
+                # apply suffix symbols (percent -> '%')
+                if value is not None:
+                    suffix = self.suffixers[current]
+                    if isinstance(suffix, dict):
+                        if next in suffix:
+                            yield output(str(value) + suffix[next])
+                            skip = True
+                        else:
+                            yield output(value)
+                            yield output(current)
+                    else:
+                        yield output(str(value) + suffix)
+                else:
+                    yield output(current)
+            elif current in self.specials:
+                if next not in self.words and not next_is_numeric:
+                    # apply special handling only if the next word can be numeric
+                    if value is not None:
+                        yield output(value)
+                    yield output(current)
+                elif current == "and":
+                    # ignore "and" after hundreds, thousands, etc.
+                    if prev not in self.multipliers:
+                        if value is not None:
+                            yield output(value)
+                        yield output(current)
+                elif current == "double" or current == "triple":
+                    if next in self.ones or next in self.zeros:
+                        repeats = 2 if current == "double" else 3
+                        ones = self.ones.get(next, 0)
+                        value = str(value or "") + str(ones) * repeats
+                        skip = True
+                    else:
+                        if value is not None:
+                            yield output(value)
+                        yield output(current)
+                elif current == "point":
+                    if next in self.decimals or next_is_numeric:
+                        value = str(value or "") + "."
+                else:
+                    # should all have been covered at this point
+                    raise ValueError(f"Unexpected token: {current}")
+            else:
+                # all should have been covered at this point
+                raise ValueError(f"Unexpected token: {current}")
+        if value is not None:
+            yield output(value)
+    def preprocess(self, s: str):
+        # replace "<number> and a half" with "<number> point five"
+        results = []
+        segments = re.split(r"\band\s+a\s+half\b", s)
+        for i, segment in enumerate(segments):
+            if len(segment.strip()) == 0:
+                continue
+            if i == len(segments) - 1:
+                results.append(segment)
+            else:
+                results.append(segment)
+                last_word = segment.rsplit(maxsplit=2)[-1]
+                if last_word in self.decimals or last_word in self.multipliers:
+                    results.append("point five")
+                else:
+                    results.append("and a half")
+        s = " ".join(results)
+        # put a space at number/letter boundary
+        s = re.sub(r"([a-z])([0-9])", r"\1 \2", s)
+        s = re.sub(r"([0-9])([a-z])", r"\1 \2", s)
+        # but remove spaces which could be a suffix
+        s = re.sub(r"([0-9])\s+(st|nd|rd|th|s)\b", r"\1\2", s)
+        return s
+    def postprocess(self, s: str):
+        def combine_cents(m: Match):
+            try:
+                currency = m.group(1)
+                integer = m.group(2)
+                cents = int(m.group(3))
+                return f"{currency}{integer}.{cents:02d}"
+            except ValueError:
+                return m.group(0)  # leave the matched text unchanged
+        def extract_cents(m: Match):
+            try:
+                return f"¢{int(m.group(1))}"
+            except ValueError:
+                return m.group(0)  # leave the matched text unchanged
+        # apply currency postprocessing; "$2 and ¢7" -> "$2.07"
+        s = re.sub(r"([€£$])([0-9]+) (?:and )?¢([0-9]{1,2})\b", combine_cents, s)
+        s = re.sub(r"[€£$]0\.([0-9]{1,2})\b", extract_cents, s)
+        # write "one(s)" instead of "1(s)", just for the readability
+        s = re.sub(r"\b1(s?)\b", r"one\1", s)
+        return s
+    def __call__(self, s: str):
+        s = self.preprocess(s)
+        s = " ".join(word for word in self.process_words(s.split()) if word is not None)
+        s = self.postprocess(s)
+        return s
+class EnglishSpellingNormalizer:
+    """
+    Applies British-American spelling mappings as listed in [1].
+    [1] https://www.tysto.com/uk-us-spelling-list.html
+    """
+    def __init__(self, english_spelling_mapping):
+        self.mapping = english_spelling_mapping
+    def __call__(self, s: str):
+        return " ".join(self.mapping.get(word, word) for word in s.split())
+class EnglishTextNormalizer:
+    def __init__(self, english_spelling_mapping=abbr):
+        self.ignore_patterns = r"\b(hmm|mm|mhm|mmm|uh|um)\b"
+        self.replacers = {
+            # common contractions
+            r"\bwon't\b": "will not",
+            r"\bcan't\b": "can not",
+            r"\blet's\b": "let us",
+            r"\bain't\b": "aint",
+            r"\by'all\b": "you all",
+            r"\bwanna\b": "want to",
+            r"\bgotta\b": "got to",
+            r"\bgonna\b": "going to",
+            r"\bi'ma\b": "i am going to",
+            r"\bimma\b": "i am going to",
+            r"\bwoulda\b": "would have",
+            r"\bcoulda\b": "could have",
+            r"\bshoulda\b": "should have",
+            r"\bma'am\b": "madam",
+            # contractions in titles/prefixes
+            r"\bmr\b": "mister ",
+            r"\bmrs\b": "missus ",
+            r"\bst\b": "saint ",
+            r"\bdr\b": "doctor ",
+            r"\bprof\b": "professor ",
+            r"\bcapt\b": "captain ",
+            r"\bgov\b": "governor ",
+            r"\bald\b": "alderman ",
+            r"\bgen\b": "general ",
+            r"\bsen\b": "senator ",
+            r"\brep\b": "representative ",
+            r"\bpres\b": "president ",
+            r"\brev\b": "reverend ",
+            r"\bhon\b": "honorable ",
+            r"\basst\b": "assistant ",
+            r"\bassoc\b": "associate ",
+            r"\blt\b": "lieutenant ",
+            r"\bcol\b": "colonel ",
+            r"\bjr\b": "junior ",
+            r"\bsr\b": "senior ",
+            r"\besq\b": "esquire ",
+            # perfect tenses; ideally this would cover any past participle, but that is harder
+            r"'d been\b": " had been",
+            r"'s been\b": " has been",
+            r"'d gone\b": " had gone",
+            r"'s gone\b": " has gone",
+            r"'d done\b": " had done",  # "'s done" is ambiguous
+            r"'s got\b": " has got",
+            # general contractions
+            r"n't\b": " not",
+            r"'re\b": " are",
+            r"'s\b": " is",
+            r"'d\b": " would",
+            r"'ll\b": " will",
+            r"'t\b": " not",
+            r"'ve\b": " have",
+            r"'m\b": " am",
+        }
+        self.standardize_numbers = EnglishNumberNormalizer()
+        self.standardize_spellings = EnglishSpellingNormalizer(english_spelling_mapping)
+    def __call__(self, s: str):
+        s = s.lower()
+        s = re.sub(r"[<\[][^>\]]*[>\]]", "", s)  # remove words between brackets
+        s = re.sub(r"\(([^)]+?)\)", "", s)  # remove words between parenthesis
+        s = re.sub(self.ignore_patterns, "", s)
+        s = re.sub(
+            r"\s+'", "'", s
+        )  # standardize when there's a space before an apostrophe
+        for pattern, replacement in self.replacers.items():
+            s = re.sub(pattern, replacement, s)
+        s = re.sub(r"(\d),(\d)", r"\1\2", s)  # remove commas between digits
+        s = re.sub(r"\.([^0-9]|$)", r" \1", s)  # remove periods not followed by numbers
+        s = remove_symbols_and_diacritics(
+            s, keep=".%$¢€£"
+        )  # keep some symbols for numerics
+        s = self.standardize_numbers(s)
+        s = self.standardize_spellings(s)
+        # now remove prefix/suffix symbols that are not preceded/followed by numbers
+        s = re.sub(r"[.$¢€£]([^0-9])", r" \1", s)
+        s = re.sub(r"([^0-9])%", r"\1 ", s)
+        s = re.sub(
+            r"\s+", " ", s
+        )  # replace any successive whitespace characters with a space
+        return s
+text_normalizer = EnglishTextNormalizer()
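
The contraction table in `EnglishTextNormalizer` depends on dictionary insertion order: specific patterns such as `\bwon't\b` are applied before the general `n't\b` rule. A minimal sketch of that cascade, using a hypothetical four-entry subset of the table (the helper name `expand_contractions` is an assumption and is not part of the commit):

```python
import re

# Hypothetical subset of the replacer table. Order matters: dicts preserve
# insertion order and every rule rewrites the string produced by the last one.
REPLACERS = {
    r"\bwon't\b": "will not",  # specific rule, must run first
    r"\bcan't\b": "can not",
    r"n't\b": " not",          # general rule, applied to whatever is left
    r"'re\b": " are",
}


def expand_contractions(text: str) -> str:
    """Lower-case the text and apply each replacement rule in table order."""
    text = text.lower()
    for pattern, replacement in REPLACERS.items():
        text = re.sub(pattern, replacement, text)
    return text


expanded = expand_contractions("We won't go, they're late and don't care")
# Applying only the general rule shows why the specific rule has to come first.
general_only = re.sub(r"n't\b", " not", "won't")
```

Placing the general rule first would turn "won't" into "wo not", which is why the specific entries sit at the top of the table.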

utils.py ADDED

	@@ -0,0 +1,991 @@

+import colorsys
+import json
+import os
+import random
+from concurrent.futures import ThreadPoolExecutor
+from dataclasses import dataclass, make_dataclass
+from datetime import datetime
+from io import BytesIO
+import aiohttp
+import evaluate
+import numpy as np
+import pandas as pd
+import plotly.graph_objects as go
+from huggingface_hub import hf_hub_download, list_repo_files
+from pydub import AudioSegment
+from constants import WHISPER_OPEN_AI_LINK
+# Load the Word Error Rate (WER) metric from the evaluate library
+wer_metric = evaluate.load("wer")
+def compute_average_wer(results):
+    """
+    Compute the average Word Error Rate (WER) for a list of transcription results.
+    :param results: List of dictionaries, each containing 'reference' and 'prediction' keys
+    :return: Average WER as a percentage, rounded to 2 decimal places
+    This function calculates the WER for each reference-prediction pair and returns
+    the average. If no predictions are provided, it returns 100% WER.
+    """
+    references = [result["reference"] for result in results]
+    predictions = [result["prediction"] for result in results]
+    if len(predictions) == 0:
+        return 100.0  # 100% WER, on the same percentage scale as the value returned below
+    return round(
+        wer_metric.compute(references=references, predictions=predictions) * 100.0,
+        2,
+    )
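
To my understanding, the `evaluate` WER metric aggregates over the whole batch (total word errors divided by total reference words) rather than averaging per-utterance WERs. The sketch below reproduces that aggregation in plain Python so the difference is visible; `word_edit_distance` and `corpus_wer_percent` are hypothetical names, and the code is an illustration, not the library's implementation:

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def corpus_wer_percent(results):
    """Total word errors over total reference words, as a percentage."""
    total_ref_words = sum(len(r["reference"].split()) for r in results)
    if not results or total_ref_words == 0:
        return 100.0  # assumption: treat "nothing to score" as 100% WER
    total_errors = sum(
        word_edit_distance(r["reference"].split(), r["prediction"].split())
        for r in results
    )
    return round(total_errors / total_ref_words * 100.0, 2)


sample = [
    {"reference": "the cat sat", "prediction": "the cat sat"},
    {"reference": "hello world", "prediction": "hello"},
]
aggregate_wer = corpus_wer_percent(sample)
```

For this sample the aggregate is 1 error over 5 reference words (20%), whereas the mean of the per-utterance WERs (0% and 50%) would be 25%.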
+def read_json_line_by_line(file_path):
+    """
+    Read a JSON file line by line, parsing each line as a separate JSON object.
+    :param file_path: Path to the JSON file
+    :return: List of parsed JSON objects
+    This function is useful for reading large JSON files that contain one JSON object
+    per line. It handles JSON parsing errors gracefully, skipping invalid lines.
+    """
+    data = []
+    with open(file_path, "r") as f:
+        for line in f:
+            try:
+                item = json.loads(line.strip())
+                data.append(item)
+            except json.JSONDecodeError:
+                print(f"Skipping invalid JSON in {file_path}: {line}")
+    return data
+def group_wer(group):
+    """
+    Calculate the Word Error Rate (WER) for a group of transcriptions.
+    :param group: DataFrame group containing 'normalized_reference' and 'normalized_prediction' columns
+    :return: Average WER for the group
+    This function is typically used with DataFrame groupby operations to calculate
+    WER for specific groups of transcriptions.
+    """
+    return compute_average_wer(
+        group[["normalized_reference", "normalized_prediction"]]
+        .rename(
+            columns={
+                "normalized_reference": "reference",
+                "normalized_prediction": "prediction",
+            }
+        )
+        .to_dict("records")
+    )
+def load_multilingual_results(csv_file):
+    """
+    Load multilingual results from a CSV file into a pandas DataFrame.
+    :param csv_file: Path to the CSV file containing multilingual results
+    :return: DataFrame with the loaded results, or None if the file is not found
+    This function attempts to load a CSV file using pandas, handling potential
+    FileNotFoundError exceptions.
+    """
+    try:
+        df = pd.read_csv(csv_file)
+        return df
+    except FileNotFoundError:
+        return None
+def download_dataset(repo_id, local_dir, remote_dir, path_includes=""):
+    """
+    Download benchmark result files from a specified Hugging Face repository to a local directory.
+    :param repo_id: ID of the Hugging Face repository
+    :param local_dir: Local directory where downloaded files will be saved
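+    :param remote_dir: Remote directory within the repository to download from
+    :param path_includes: Optional substring a file path must contain to be downloaded (default: "", which matches every file)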
+    This function uses the Hugging Face Hub API to list and download files from a
+    specific directory in a repository. It forces the download to ensure up-to-date files.
+    """
+    files = list_repo_files(repo_id, repo_type="dataset")
+    directory_files = [
+        file for file in files if file.startswith(remote_dir) and path_includes in file
+    ]
+    with ThreadPoolExecutor() as executor:
+        executor.map(
+            lambda file: hf_hub_download(
+                repo_id=repo_id,
+                repo_type="dataset",
+                filename=file,
+                local_dir=local_dir,
+                force_download=True,
+            ),
+            directory_files,
+        )
+def process_file(file_path):
+    """
+    Process a file containing JSON objects delimited by new lines.
+    :param file_path: Path to the file to be processed
+    :return: List of dictionaries, each representing a parsed JSON object
+    This function reads the file line by line, parsing each line as a JSON object.
+    It handles potential JSON decoding errors, printing error messages for invalid lines.
+    """
+    data = []
+    with open(file_path, "r") as file:
+        for line in file:
+            line = line.strip()
+            if not line:
+                continue
+            try:
+                json_obj = json.loads(line)
+                data.append(json_obj)
+            except json.JSONDecodeError as e:
+                print(f"Error decoding JSON in line: {line}")
+                print(f"Error message: {str(e)}")
+    return data
+def dir_to_json(root_dir, output_file):
+    """
+    Convert a directory of benchmark result files to a single JSON file.
+    :param root_dir: Root directory containing the benchmark result files
+    :param output_file: Output file where the JSON data will be saved
+    This function walks through the directory structure, processes each file,
+    and writes the combined data to a single JSON file. It extracts metadata
+    from the file path and includes it in the JSON output.
+    """
+    with open(output_file, "w") as outfile:
+        for subdir, _, files in os.walk(root_dir):
+            for file in files:
+                file_path = os.path.join(subdir, file)
+                # ignore .DS_Store and summary files
+                if file_path.endswith(".DS_Store") or "summary" in file_path:
+                    continue
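+                # NOTE: the indices below assume the model directory is the third path
+                # component, i.e. <a>/<b>/<model>/<device>/<os>/<dataset>/<file>.json
+                parts = file_path.split(os.sep)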
+                print(parts)
+                model_version = parts[2]
+                device_name = parts[3].replace("_", " ")
+                os_type_version = parts[4]
+                dataset_name = parts[5]
+                timestamp_commit = parts[6].replace(".json", "")
+                timestamp, commit_hash, commit_timestamp = timestamp_commit.split("_")
+                data_list = process_file(file_path)
+                for data in data_list:
+                    original_entry = {
+                        "model": model_version.replace("_", "/"),
+                        "device": device_name,
+                        "os": os_type_version.replace("_", " "),
+                        "wer": data["wer"],
+                        "dataset_name": dataset_name,
+                        "reference_transcription": data["reference_transcription"],
+                        "prediction_transcription": data["prediction_transcription"],
+                        "difference_transcription": data["difference_transcription"],
+                        "audio_file_url": data["audio_file_url"],
+                        "timestamp": timestamp.replace("-", ":").replace(":", "-", 2),
+                        "commit_hash": commit_hash,
+                        "commit_timestamp": commit_timestamp,
+                    }
+                    outfile.write(json.dumps(original_entry) + "\n")
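
`dir_to_json` recovers three values from each file name and then rewrites the timestamp with two chained `replace` calls. The sketch below isolates that step. The example file name and the `YYYY-MM-DDTHH-MM-SS` layout are assumptions inferred from the replace chain, and `parse_result_filename` is a hypothetical helper:

```python
def parse_result_filename(filename: str) -> dict:
    """Split '<timestamp>_<commit_hash>_<commit_timestamp>.json' into its parts."""
    stem = filename.replace(".json", "")
    timestamp, commit_hash, commit_timestamp = stem.split("_")
    # Step 1: every "-" becomes ":"        -> "2024:10:25T12:30:00"
    # Step 2: the first two ":" (the date) -> "2024-10-25T12:30:00"
    timestamp = timestamp.replace("-", ":").replace(":", "-", 2)
    return {
        "timestamp": timestamp,
        "commit_hash": commit_hash,
        "commit_timestamp": commit_timestamp,
    }


parsed = parse_result_filename("2024-10-25T12-30-00_abc1234_2024-10-20T101500.json")
```

The three-way unpacking raises `ValueError` if a file name contains more or fewer than two underscores, so stray files in the results tree would stop the conversion.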
+async def download_audio_to_ndarray(url):
+    """
+    Downloads an audio file from a URL and converts it to a NumPy array.
+    :param url: The URL of the audio file to download
+    :return: A tuple containing the sample rate and audio data as a NumPy array
+    This asynchronous function uses aiohttp to download the audio file,
+    converts it to an AudioSegment, and then to a NumPy array. It handles
+    both mono and stereo audio files.
+    """
+    async with aiohttp.ClientSession() as session:
+        async with session.get(url) as response:
+            if response.status == 200:
+                audio_bytes = BytesIO(await response.read())
+                audio = AudioSegment.from_file(audio_bytes, format="mp3")
+                audio_data = np.array(audio.get_array_of_samples())
+                if audio.channels == 2:
+                    audio_data = audio_data.reshape((-1, 2))
+                return audio.frame_rate, audio_data
+            else:
+                return None, None
+async def play_audio(url):
+    """
+    Wrapper function for Gradio to play audio from a URL.
+    :param url: The URL of the audio file to play
+    :return: A tuple of sample rate and audio data, or an error message
+    This function uses download_audio_to_ndarray to get the audio data
+    and returns it in a format suitable for Gradio's audio player.
+    """
+    sample_rate, audio_data = await download_audio_to_ndarray(url)
+    if audio_data is None:
+        return "Error downloading the file"
+    else:
+        return sample_rate, audio_data
+def get_filter_cond(df, model, device, os, dataset, timestamp=None):
+    """
+    Creates a filter condition for a DataFrame based on specified parameters.
+    :param df: DataFrame containing the transcription data
+    :param model: String representing the model name
+    :param device: String representing the device name
+    :param os: String representing the OS name
+    :param dataset: String representing the dataset name
+    :param timestamp: Optional timestamp for filtering (default: None)
+    :return: A boolean mask for filtering the DataFrame
+    This function constructs a complex boolean condition for filtering
+    the DataFrame based on the provided parameters.
+    """
+    filter_cond = (
+        (df["model"] == model)
+        & (df["device"] == device)
+        & (df["os"] == os)
+        & (df["dataset_name"] == dataset)
+    )
+    return filter_cond & (df["timestamp"] == timestamp) if timestamp else filter_cond
+def get_filtered_transcript(df, model, device, os, dataset, timestamp):
+    """
+    Retrieves filtered transcription data from a DataFrame.
+    :param df: DataFrame containing the transcription data
+    :param model: String representing the model name
+    :param device: String representing the device name
+    :param os: String representing the OS name
+    :param dataset: String representing the dataset name
+    :param timestamp: String representing the timestamp
+    :return: A filtered DataFrame with transcription data
+    This function applies a filter to the input DataFrame and returns
+    relevant columns for transcription analysis.
+    """
+    filter_cond = get_filter_cond(df, model, device, os, dataset, timestamp)
+    df = df[filter_cond][
+        [
+            "reference_transcription",
+            "prediction_transcription",
+            "difference_transcription",
+            "audio_file_url",
+        ]
+    ]
+    return df
+def get_filtered_timestamps(df, model, device, os, dataset):
+    """
+    Retrieves unique timestamps for a specific model, device, OS, and dataset combination.
+    :param df: DataFrame containing the transcription data
+    :param model: String representing the model name
+    :param device: String representing the device name
+    :param os: String representing the OS name
+    :param dataset: String representing the dataset name
+    :return: A filtered DataFrame containing unique timestamps
+    This function is useful for getting a list of available timestamps
+    for a specific configuration, which can be used for further analysis or UI elements.
+    """
+    filter_cond = get_filter_cond(df, model, device, os, dataset)
+    df = df[filter_cond][["timestamp"]].drop_duplicates()
+    return df
+def make_model_name_clickable_link(model):
+    """
+    Creates an HTML link to the Hugging Face model page.
+    :param model: String representing the model name
+    :return: An HTML string containing a clickable link to the model page
+    This function generates a formatted HTML link that can be used in
+    web interfaces to provide direct access to the model's page on Hugging Face.
+    """
+    return f"""<a style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" href="https://huggingface.co/argmaxinc/whisperkit-coreml/tree/main/{model.replace('/', '_')}" target="_blank">{model}</a>"""
+def make_dataset_wer_clickable_link(row, dataset):
+    """
+    Creates a clickable link for the WER value of a dataset.
+    :param row: Row containing the dataset WER value
+    :param dataset: String representing the dataset name
+    :return: An HTML string containing a clickable link to the dataset's WER details
+    This function generates a formatted HTML link that can be used in
+    web interfaces to provide access to detailed WER information for a specific dataset.
+    """
+    dataset_column = f"{dataset}"
+    href = WHISPER_OPEN_AI_LINK.format(
+        row["Model"].replace("/", "_"),
+        dataset,
+    )
+    return f'<a style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" href="{href}">{row[dataset_column]}</a>'
+def make_timestamp_clickable_link(model, dataset, timestamp):
+    """
+    Creates a clickable link for a timestamp.
+    :param model: String representing the model name
+    :param dataset: String representing the dataset name
+    :param timestamp: Timestamp to be displayed and used in the link
+    :return: An HTML string containing a clickable div for the timestamp
+    This function generates a formatted HTML div that can be used as a clickable
+    element in web interfaces, typically for displaying and interacting with specific timestamps.
+    """
+    elem_id = (
+        f"{dataset}-{model}-{timestamp}".replace(" ", "_")
+        .replace('"', "")
+        .replace("'", "")
+        .replace(",", "")
+    )
+    onclick = f"onclick=\"document.getElementById('{elem_id}').click();\""
+    return f'<div style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" {onclick} href="#">{timestamp}</div>'
+def make_multilingual_model_clickable_link(model):
+    """
+    Creates a clickable link for a multilingual model name.
+    :param model: String representing the model name
+    :return: An HTML string containing a clickable div for the model name
+    This function generates a formatted HTML div that can be used as a clickable
+    element in web interfaces, typically for displaying and interacting with multilingual model names.
+    """
+    elem_id = (
+        f"{model}".replace(" ", "_").replace('"', "").replace("'", "").replace(",", "")
+    )
+    onclick = f"onclick=\"document.getElementById('{elem_id}').click();\""
+    return f'<div style="color: #3B82F6; text-decoration: underline; text-decoration-style: dotted;" {onclick} href="#">{model}</div>'
+def plot_metric(
+    df, y_axis_col, y_axis_title, fig_title, filter_input=None, exclude_input=None
+):
+    """
+    Plots a metric for each model-device-OS group in a DataFrame.
+    :param df: DataFrame containing the benchmark data
+    :param y_axis_col: DataFrame column to use as the y-axis
+    :param y_axis_title: Display name for the y-axis
+    :param fig_title: Display title for the figure
+    :param filter_input: Optional string to filter the model-device-OS combinations
+    :param exclude_input: Optional string to exclude model-device-OS combinations
+    :return: A Plotly figure object
+    """
+    grouped = df.groupby(["model", "device", "os"])
+    sorted_groups = [group.sort_values("commit_timestamp") for _, group in grouped]
+    if filter_input:
+        filters = [f.strip().lower() for f in filter_input.split(";")]
+        sorted_groups = [
+            group
+            for group in sorted_groups
+            if any(
+                f
+                in f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}".lower()
+                for f in filters
+            )
+        ]
+    if exclude_input:
+        excludes = [e.strip().lower() for e in exclude_input.split(";")]
+        sorted_groups = [
+            group
+            for group in sorted_groups
+            if not any(
+                e
+                in f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}".lower()
+                for e in excludes
+            )
+        ]
+    base_colors = ["#4542f4", "#0e0c06", "#ccf0a7", "#ff7f4e", "#ffd15a"]
+    num_colors = len(sorted_groups)
+    random_colors = generate_random_colors(base_colors, num_colors)
+    fig = go.Figure()
+    for i, group in enumerate(sorted_groups):
+        model_device_os = (
+            f"{group['model'].iloc[0]}-{group['device'].iloc[0]}-{group['os'].iloc[0]}"
+        )
+        fig.add_trace(
+            go.Scatter(
+                x=group["commit_timestamp"].apply(
+                    lambda x: datetime.strptime(x, "%Y-%m-%dT%H%M%S").strftime(
+                        "%Y-%m-%d %H:%M:%S"
+                    )
+                ),
+                y=group[y_axis_col],
+                mode="lines+markers",
+                name=model_device_os,
+                line=dict(color=random_colors[i % len(random_colors)]),
+                marker=dict(color=random_colors[i % len(random_colors)]),
+                hovertemplate=(
+                    f"<b>{model_device_os}</b><br>"
+                    "Timestamp: %{x}<br>"
+                    f"{y_axis_title}: %{{y:.2f}}<br>"
+                    "<extra></extra>"
+                ),
+            )
+        )
+    fig.update_layout(
+        title=fig_title,
+        xaxis_title="Commit Timestamp",
+        yaxis_title=y_axis_title,
+        legend_title="Model-Device-OS",
+        width=1100,
+        height=600,
+        plot_bgcolor="rgb(250,249,244)",
+    )
+    return fig
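
The `filter_input` and `exclude_input` handling in `plot_metric` is a case-insensitive substring match over semicolon-separated terms. A sketch of the same selection logic on plain label strings (the function name `select_labels` and the sample labels are hypothetical):

```python
def select_labels(labels, filter_input=None, exclude_input=None):
    """Keep labels matching any filter term, then drop labels matching any exclude term."""
    selected = list(labels)
    if filter_input:
        filters = [f.strip().lower() for f in filter_input.split(";")]
        selected = [
            label for label in selected if any(f in label.lower() for f in filters)
        ]
    if exclude_input:
        excludes = [e.strip().lower() for e in exclude_input.split(";")]
        selected = [
            label for label in selected if not any(e in label.lower() for e in excludes)
        ]
    return selected


labels = [
    "openai/whisper-tiny-Apple M2-macOS 15",
    "openai/whisper-large-v3-Apple M2-macOS 15",
    "openai/whisper-tiny-iPhone 15-iOS 18",
]
tiny_only = select_labels(labels, filter_input="tiny")
no_iphone = select_labels(labels, filter_input="tiny; large", exclude_input="iphone")
trailing_semicolon = select_labels(labels, filter_input="tiny;")
```

One consequence of this matching rule: a trailing `;` produces an empty term, and the empty string is a substring of every label, so `"tiny;"` keeps all groups and an exclude value such as `"iphone;"` would drop all of them.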
+def fields(raw_class):
+    """
+    Returns the fields of a dataclass.
+    :param raw_class: The dataclass to inspect
+    :return: List of fields in the dataclass
+    This utility function extracts and returns all the fields defined in a dataclass,
+    excluding special methods and attributes.
+    """
+    return [
+        v for k, v in raw_class.__dict__.items() if k[:2] != "__" and k[-2:] != "__"
+    ]
+def get_os_name_and_version(os_string):
+    """
+    Extracts the OS name and major version from a string.
+    :param os_string: String representing the OS name and version
+    :return: Formatted string with OS name and major version
+    This function splits the input string into OS name and version,
+    then returns a formatted string with just the major version number.
+    """
+    os_name, os_version = os_string.split()
+    os_version = os_version.split(".")[0]
+    return f"{os_name} {os_version}"
+def create_initial_quality_column_dict():
+    """
+    Creates the initial column dictionary for the quality table.
+    :return: A list of column dictionaries
+    This function defines the basic structure of the quality table,
+    including columns for model, average WER, and QoI (Quality of Inference).
+    """
+    return [
+        [
+            "model",
+            ColumnContent,
+            ColumnContent("Model", "html", True, never_hidden=True),
+        ],
+        ["average_wer", ColumnContent, ColumnContent("Average WER", "html", True)],
+        ["qoi", ColumnContent, ColumnContent("QoI", "html", True)],
+    ]
+def calculate_parity(m2_ultra_wer, row):
+    """
+    Calculates the WER parity between M2 Ultra and the current model.
+    :param m2_ultra_wer: Series-like object of M2 Ultra WER values, indexed by model name
+    :param row: Current row being processed
+    :return: WER difference between M2 Ultra and current model, or None if not applicable
+    This function subtracts the current row's average WER from the M2 Ultra WER for the
+    same model. The result is a difference in WER percentage points, not a relative percentage.
+    """
+    if row["Model"] in m2_ultra_wer.index:
+        return round(m2_ultra_wer[row["Model"]] - row["Average WER"], 2)
+    return None
+def create_initial_performance_column_dict():
+    """
+    Creates the initial column dictionary for the performance table.
+    :return: A list of column dictionaries
+    This function defines the basic structure of the performance table,
+    including columns for model, device, OS, parity, average WER, QoI, speed, and tokens per second.
+    """
+    return [
+        [
+            "model",
+            ColumnContent,
+            ColumnContent("Model", "html", True, never_hidden=True),
+        ],
+        [
+            "device",
+            ColumnContent,
+            ColumnContent("Device", "html", True, never_hidden=True),
+        ],
+        ["os", ColumnContent, ColumnContent("OS", "html", True, never_hidden=True)],
+        ["parity", ColumnContent, ColumnContent("Parity %", "html", False)],
+        ["average_wer", ColumnContent, ColumnContent("Average WER", "html", False)],
+        ["qoi", ColumnContent, ColumnContent("QoI", "html", False)],
+        ["speed", ColumnContent, ColumnContent("Speed", "html", False)],
+        ["toks", ColumnContent, ColumnContent("Tok / s", "html", False)],
+    ]
+def add_datasets_to_quality_columns(column_dict, datasets):
+    """
+    Adds dataset-specific columns to the quality table column dictionary.
+    :param column_dict: The initial column dictionary
+    :param datasets: List of dataset names to add
+    :return: A dictionary containing the updated column dictionary and related metadata
+    This function extends the quality table structure with columns for each dataset,
+    and creates a dataclass to represent the table structure. It also generates
+    metadata about the columns for use in the UI.
+    """
+    updated_column_dict = column_dict.copy()
+    for dataset in datasets:
+        field_name = dataset.replace("-", "")
+        updated_column_dict.append(
+            [field_name, ColumnContent, ColumnContent(dataset, "html", True)]
+        )
+    AutoEvalColumn = make_dataclass("AutoEvalColumn", updated_column_dict, frozen=True)
+    COLS = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
+    TYPES = [c.type for c in fields(AutoEvalColumn) if not c.hidden]
+    ALWAYS_HERE_COLS = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
+    TOGGLE_COLS = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
+    SELECTED_COLS = [
+        c.name
+        for c in fields(AutoEvalColumn)
+        if not c.never_hidden and c.displayed_by_default
+    ]
+    return {
+        "column_dict": updated_column_dict,
+        "AutoEvalColumn": AutoEvalColumn,
+        "COLS": COLS,
+        "TYPES": TYPES,
+        "ALWAYS_HERE_COLS": ALWAYS_HERE_COLS,
+        "TOGGLE_COLS": TOGGLE_COLS,
+        "SELECTED_COLS": SELECTED_COLS,
+    }
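
The column helpers combine `make_dataclass` with the custom `fields` function, which reads the class `__dict__` and therefore returns the default `ColumnContent` instances rather than `dataclasses.Field` objects. `ColumnContent` is not defined in this part of the diff, so the definition below is a hypothetical reconstruction from how it is called; it is declared `frozen=True` here because recent Python versions reject unhashable dataclass instances as field defaults. The dataset names are placeholders.

```python
from dataclasses import dataclass, make_dataclass


@dataclass(frozen=True)  # assumption: frozen, so instances are hashable defaults
class ColumnContent:
    name: str
    type: str
    displayed_by_default: bool
    hidden: bool = False
    never_hidden: bool = False


def fields(raw_class):
    """Return the non-dunder class attributes, i.e. the ColumnContent defaults."""
    return [
        v for k, v in raw_class.__dict__.items() if k[:2] != "__" and k[-2:] != "__"
    ]


column_dict = [
    ("model", ColumnContent, ColumnContent("Model", "html", True, never_hidden=True)),
    ("average_wer", ColumnContent, ColumnContent("Average WER", "html", True)),
]
for dataset in ["librispeech", "earnings22-12hours"]:  # placeholder dataset names
    # Hyphens are stripped because field names must be valid identifiers.
    field_name = dataset.replace("-", "")
    column_dict.append((field_name, ColumnContent, ColumnContent(dataset, "html", True)))

AutoEvalColumn = make_dataclass("AutoEvalColumn", column_dict, frozen=True)
cols = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
always_here_cols = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
toggle_cols = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
```

The display name keeps the original dataset string (including hyphens); only the generated attribute name is sanitized.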
+def add_datasets_to_performance_columns(column_dict, datasets):
+    """
+    Adds dataset-specific columns to the performance table column dictionary.
+    :param column_dict: The initial column dictionary
+    :param datasets: List of dataset names to add
+    :return: A dictionary containing the updated column dictionary and related metadata
+    This function extends the performance table structure with columns for each dataset,
+    adding both speed and tokens per second metrics. It also creates a dataclass to
+    represent the table structure and generates metadata about the columns for use in the UI.
+    """
+    updated_column_dict = column_dict.copy()
+    for dataset in datasets:
+        field_name = dataset.replace("-", "")
+        updated_column_dict.append(
+            [
+                f"{field_name}_speed",
+                ColumnContent,
+                ColumnContent(
+                    f"{'Short-Form' if dataset == 'librispeech-10mins' else 'Long-Form'} Speed",
+                    "html",
+                    True,
+                ),
+            ]
+        )
+        updated_column_dict.append(
+            [
+                f"{field_name}_toks",
+                ColumnContent,
+                ColumnContent(
+                    f"{'Short-Form' if dataset == 'librispeech-10mins' else 'Long-Form'} Tok/s",
+                    "html",
+                    True,
+                ),
+            ]
+        )
+    AutoEvalColumn = make_dataclass("AutoEvalColumn", updated_column_dict, frozen=True)
+    COLS = [c.name for c in fields(AutoEvalColumn) if not c.hidden]
+    TYPES = [c.type for c in fields(AutoEvalColumn) if not c.hidden]
+    ALWAYS_HERE_COLS = [c.name for c in fields(AutoEvalColumn) if c.never_hidden]
+    TOGGLE_COLS = [c.name for c in fields(AutoEvalColumn) if not c.never_hidden]
+    SELECTED_COLS = [
+        c.name
+        for c in fields(AutoEvalColumn)
+        if not c.never_hidden and c.displayed_by_default
+    ]
+    return {
+        "column_dict": updated_column_dict,
+        "AutoEvalColumn": AutoEvalColumn,
+        "COLS": COLS,
+        "TYPES": TYPES,
+        "ALWAYS_HERE_COLS": ALWAYS_HERE_COLS,
+        "TOGGLE_COLS": TOGGLE_COLS,
+        "SELECTED_COLS": SELECTED_COLS,
+    }
+def create_confusion_matrix_plot(matrix, labels, is_forced):
+    """
+    Creates a confusion matrix plot for language detection.
+    :param matrix: 2D numpy array representing the confusion matrix
+    :param labels: List of language labels
+    :param is_forced: Boolean indicating whether language hint was used
+    :return: A Plotly figure object representing the confusion matrix
+    This function generates a heatmap visualization of the confusion matrix
+    for language detection, with customized layout and hover information.
+    """
+    fig = go.Figure(
+        data=go.Heatmap(
+            z=matrix,
+            x=labels,
+            y=labels,
+            colorscale=[
+                [0, "rgb(250,249,244)"],
+                [0.5, "rgb(69,66,244)"],
+                [1.0, "rgb(14,12,6)"],
+            ],
+            hoverongaps=False,
+            hovertemplate="True: %{y}<br>Predicted: %{x}<br>Value: %{z}<extra></extra>",
+        )
+    )
+    fig.update_layout(
+        title=f'Language Detection Confusion Matrix with {"Language Hint" if is_forced else "Language Prediction by Model"}',
+        xaxis_title="Predicted Language",
+        yaxis_title="True Language",
+        xaxis=dict(tickangle=-45),
+        width=600,
+        height=600,
+        margin=dict(l=50, r=50, t=50, b=50),
+    )
+    return fig
+def hex_to_rgb(hex_color):
+    """
+    Converts a hexadecimal color code to RGB values.
+    :param hex_color: String representing a color in hexadecimal format
+    :return: Tuple of three integers representing RGB values
+    This function takes a hex color code and returns the corresponding
+    RGB values as a tuple of integers.
+    """
+    hex_color = hex_color.lstrip("#")
+    return tuple(int(hex_color[i : i + 2], 16) for i in (0, 2, 4))
+def rgb_to_hex(rgb):
+    """
+    Converts RGB values to a hexadecimal color code.
+    :param rgb: Tuple of three integers representing RGB values
+    :return: String representing the color in hexadecimal format
+    This function takes RGB values as a tuple and returns the corresponding
+    hex color code as a string.
+    """
+    return "#{:02x}{:02x}{:02x}".format(*rgb)
+def interpolate_colors(color1, color2, factor):
+    """
+    Interpolates between two colors in HSV space.
+    :param color1: First color in hexadecimal format
+    :param color2: Second color in hexadecimal format
+    :param factor: Float between 0 and 1, representing the interpolation factor
+    :return: Interpolated color in hexadecimal format
+    Interpolating in HSV space can produce more visually pleasing results than
+    simple RGB interpolation. Hue is interpolated linearly between the two
+    values, not along the shorter arc of the hue circle.
+    """
+    rgb1 = hex_to_rgb(color1)
+    rgb2 = hex_to_rgb(color2)
+    hsv1 = colorsys.rgb_to_hsv(*[x / 255.0 for x in rgb1])
+    hsv2 = colorsys.rgb_to_hsv(*[x / 255.0 for x in rgb2])
+    h = (hsv1[0] + factor * (hsv2[0] - hsv1[0])) % 1.0
+    s = hsv1[1] + factor * (hsv2[1] - hsv1[1])
+    v = hsv1[2] + factor * (hsv2[2] - hsv1[2])
+    rgb = colorsys.hsv_to_rgb(h, s, v)
+    # Round instead of truncating so floating-point error cannot shift a channel down by one.
+    return rgb_to_hex(tuple(round(x * 255) for x in rgb))
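Because `interpolate_colors` interpolates hue linearly, two hues that sit on either side of the wrap-around point (for example two reds) blend through the opposite side of the color wheel. The sketch below shows one possible alternative that follows the shorter arc. `interpolate_hue_shortest` is a hypothetical helper, not part of this commit, and the hue values are illustrative.

```python
import colorsys


def interpolate_hue_shortest(h1, h2, factor):
    """Hypothetical helper: interpolate two hues (0-1) along the shorter arc."""
    # Signed hue difference wrapped into [-0.5, 0.5).
    delta = (h2 - h1 + 0.5) % 1.0 - 0.5
    return (h1 + factor * delta) % 1.0


# Two reddish hues on either side of the wrap-around point (illustrative values).
h1, h2 = 0.95, 0.05
linear = h1 + 0.5 * (h2 - h1)                      # lands near 0.5 (cyan)
shortest = interpolate_hue_shortest(h1, h2, 0.5)   # lands near 0.0 / 1.0 (red)

print(colorsys.hsv_to_rgb(linear, 1.0, 1.0))
print(colorsys.hsv_to_rgb(shortest, 1.0, 1.0))
```

Whether the shorter arc is preferable depends on the palette; for base colors with similar hues the two approaches give the same result.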
+def color_distance(color1, color2):
+    """
+    Calculates the Euclidean distance between two colors in RGB space.
+    :param color1: First color in hexadecimal format
+    :param color2: Second color in hexadecimal format
+    :return: Float representing the distance between the two colors
+    This function computes the Euclidean distance between two colors in RGB space,
+    which can be used as a measure of color similarity.
+    """
+    rgb1 = hex_to_rgb(color1)
+    rgb2 = hex_to_rgb(color2)
+    return sum((a - b) ** 2 for a, b in zip(rgb1, rgb2)) ** 0.5
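To get a feel for the scale of this distance, the sketch below (function bodies copied so it runs standalone) compares the largest possible distance with a pair of near-identical blues. The second pair falls below the default `min_distance` of 30 used by `generate_random_colors`, so it would be treated as too similar.

```python
def hex_to_rgb(hex_color):
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i : i + 2], 16) for i in (0, 2, 4))


def color_distance(color1, color2):
    # Euclidean distance between the two colors in RGB space.
    rgb1 = hex_to_rgb(color1)
    rgb2 = hex_to_rgb(color2)
    return sum((a - b) ** 2 for a, b in zip(rgb1, rgb2)) ** 0.5


far = color_distance("#000000", "#ffffff")   # about 441.67, the maximum
near = color_distance("#4542f4", "#4542e0")  # 20.0, only the blue channel differs
print(far, near)
```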
+def generate_random_colors(base_colors, num_colors, min_distance=30):
+    """
+    Generates a list of random colors based on a set of base colors.
+    :param base_colors: List of base colors in hexadecimal format
+    :param num_colors: Number of colors to generate
+    :param min_distance: Minimum distance between generated colors (default: 30)
+    :return: List of generated colors in hexadecimal format
+    This function creates a list of random colors by interpolating between
+    the provided base colors. It attempts to maintain a minimum distance
+    between colors to ensure visual distinctiveness.
+    """
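+    generated_colors = []
+    # Counts consecutive rejected candidates; reset whenever a color is accepted.
+    attempts = 0
+    max_attempts = 1000
+    # The result can be shorter than num_colors if attempts reaches max_attempts.
+    while len(generated_colors) < num_colors and attempts < max_attempts:
+        # random.sample raises ValueError if base_colors has fewer than two entries.
+        color1, color2 = random.sample(base_colors, 2)
+        factor = random.random()
+        new_color = interpolate_colors(color1, color2, factor)
+        if all(color_distance(new_color, c) >= min_distance for c in generated_colors):
+            generated_colors.append(new_color)
+            attempts = 0
+        else:
+            attempts += 1
+        # Fallback: after more than 100 consecutive rejections, accept the current
+        # candidate with 10% probability even though it violates min_distance.
+        if attempts > 100:
+            if random.random() < 0.1:
+                generated_colors.append(new_color)
+                attempts = 0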
+    return generated_colors
+@dataclass
+class Task:
+    """
+    Dataclass representing a benchmark task.
+    :param benchmark: String representing the benchmark name
+    :param metric: String representing the metric used for evaluation
+    :param col_name: String representing the column name in the results DataFrame
+    """
+    benchmark: str
+    metric: str
+    col_name: str
+@dataclass(frozen=True)
+class ColumnContent:
+    """
+    Dataclass representing a column in the results table.
+    :param name: String representing the column name
+    :param type: String representing the data type of the column
+    :param displayed_by_default: Boolean indicating if the column should be displayed by default
+    :param hidden: Boolean indicating if the column should be hidden (default: False)
+    :param never_hidden: Boolean indicating if the column should never be hidden (default: False)
+    :param dummy: Boolean indicating if this is a dummy column (default: False)
+    """
+    name: str
+    type: str
+    displayed_by_default: bool
+    hidden: bool = False
+    never_hidden: bool = False
+    dummy: bool = False
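`ColumnContent` is declared with `frozen=True`, so instances cannot be modified after creation. A minimal sketch of what that means in practice; the column name and type used here are made-up values for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ColumnContent:
    name: str
    type: str
    displayed_by_default: bool
    hidden: bool = False
    never_hidden: bool = False
    dummy: bool = False


# Hypothetical column definition; the values are illustrative only.
col = ColumnContent("Model", "str", True, never_hidden=True)

try:
    col.hidden = True  # assignment on a frozen dataclass raises
except FrozenInstanceError:
    print("ColumnContent is immutable")
```

Frozen dataclasses are also hashable by default, which allows instances to be used as dictionary keys or set members.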
+css = """
+/* @import rules must come before all other rules (except @charset and @layer), or browsers ignore them */
+@import url('https://fonts.googleapis.com/css2?family=Noto+Color+Emoji&display=swap');
+@import url('https://fonts.googleapis.com/css2?family=Sora:wght@300..400&display=swap');
+@font-face {
+    font-family: 'Zwizz Regular';
+    font-style: normal;
+    font-weight: normal;
+    src: local('Zwizz Regular'), url('static/Zwizz-Regular.woff') format('woff');
+}
+@font-face {
+    font-family: 'Zwizz Medium';
+    font-style: normal;
+    font-weight: normal;
+    src: local('Zwizz Medium'), url('static/Zwizz-Medium.woff') format('woff');
+}
+@font-face {
+    font-family: 'Zwizz SemiBold';
+    font-style: normal;
+    font-weight: normal;
+    src: local('Zwizz SemiBold'), url('static/Zwizz-SemiBold.woff') format('woff');
+}
+/* Typography Scale */
+h1, .h1 {
+    font-family: 'Sora', sans-serif;
+    font-weight: 300;
+    font-size: 2em;
+    letter-spacing: -0.05em;
+}
+h2, .h2 {
+    font-family: 'Sora', sans-serif;
+    font-weight: 400;
+    letter-spacing: -0.05em;
+}
+h3, h4, h5, .h3, .h4, .h5 {
+    font-family: 'Sora', sans-serif;
+    font-weight: 400;
+    letter-spacing: -0.05em;
+}
+h6, .h6, pre, code, .monospace {
+    font-family: 'IBM Plex Mono', monospace;
+    font-weight: 400;
+    letter-spacing: 0.01em;
+}
+/* Add strong tag styling */
+strong, b {
+    font-family: 'Zwizz SemiBold', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
+    letter-spacing: -0.02em;
+}
+/* Global Zwizz styles */
+:root {
+    --zwizz-spacing: -0.02em;
+}
+/* All Gradio elements should have Zwizz spacing */
+.gradio-container * {
+    letter-spacing: var(--zwizz-spacing);
+    line-height: 1.7;
+}
+/* UI Elements */
+.tab-buttons button, #models-to-add-text, .gradio-button {
+    font-family: 'Sora', sans-serif;
+    font-weight: 400;
+    letter-spacing: -0.05em;
+}
+/* Specific Table Styling */
+table, .table, th, td {
+    font-family: 'IBM Plex Mono', 'Noto Color Emoji', sans-serif, monospace !important;
+    font-weight: 400;
+    letter-spacing: 0.01em;
+}
+/* Technical/Code Elements */
+.code-block, .technical-text {
+    font-family: 'IBM Plex Mono', monospace;
+    font-weight: 400;
+    letter-spacing: 0.01em;
+}
+/* Additional Elements */
+#methodology-text p, #methodology-text li, .markdown-text {
+    font-family: 'Zwizz Regular', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
+    font-size: 16px !important;
+    letter-spacing: var(--zwizz-spacing);
+    line-height: 1.7;
+}
+/* Font weight utilities */
+.zwizz-medium {
+    font-family: 'Zwizz Medium', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
+}
+.zwizz-semibold {
+    font-family: 'Zwizz SemiBold', -apple-system, BlinkMacSystemFont, system-ui, sans-serif;
+}
+/* Maintaining Original Layout Rules */
+.gradio-container {
+    max-width: 95% !important;
+}
+/* Table Layouts */
+.large-table,
+.large-table .table-wrap,
+#multilingual-model-table .table-wrap,
+#lookup-table .table-wrap {
+    height: 35em !important;
+    overflow-y: scroll !important;
+}
+/* SVG Container Rules */
+.svg-container,
+.main-svg {
+    width: 100% !important;
+}
+.left-side-table .table-wrap {
+    height: 15em !important;
+    overflow-y: scroll !important;
+}
+#average-wer-table .table-wrap {
+    height: 8em !important;
+    overflow-y: scroll !important;
+}
+#general-wer-table .table-wrap {
+    height: 35em !important;
+    overflow-y: scroll !important;
+}
+"""