<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Speech-to-Speech Model Comparison</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
<style>
body {
background-color: #f0f8ff;
font-family: 'Arial', sans-serif;
}
.container {
background-color: #fff;
border-radius: 15px;
box-shadow: 0 6px 15px rgba(0, 0, 0, 0.15);
padding: 40px;
max-width: 800px;
margin: 30px auto;
}
h3 {
font-size: 2rem;
font-weight: bold;
color: #333;
text-align: center;
margin-bottom: 20px;
}
p {
color: #555;
font-size: 1rem;
line-height: 1.8;
}
.btn {
border-radius: 25px;
font-size: 1.1rem;
padding: 12px 25px;
font-weight: bold;
transition: background-color 0.3s ease, transform 0.2s ease;
}
.btn-primary {
background-color: #007bff;
border: none;
}
.btn-primary:hover {
background-color: #0056b3;
transform: scale(1.05);
}
.icon {
color: #f39c12;
margin-right: 5px;
}
.section-title {
font-size: 1.2rem;
font-weight: bold;
color: #007bff;
display: flex;
align-items: center;
margin-top: 20px;
}
.section-title .fas {
margin-right: 10px;
}
.audio-container {
text-align: center;
margin-top: 20px;
}
.audio-container .audio-item {
display: flex;
justify-content: center;
align-items: center;
margin-bottom: 15px;
}
.audio-container .audio-item span {
margin-right: 10px;
font-weight: bold;
}
audio {
display: inline-block;
}
</style>
</head>
<body>
<div class="container py-5">
<h3 class="mb-4">⚖️ Speech-to-Speech Model Comparison</h3>
<div id="evaluation-info" class="mb-5">
<p class="text-start">
<span class="section-title"><i class="fas fa-info-circle"></i> Welcome to the Speech-to-Speech (S2S)
Model Evaluation! 👏</span>
In this evaluation, you will assess the performance of different S2S models, such as
<strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>,
<strong>Mini-Omni</strong>, <strong>Cascade</strong>, and <strong>LLaMA-Omni</strong>.
<br>
<span>🎯 <strong>Goal:</strong> Test how well these models handle speech tasks across different domains.</span>
<span class="section-title"><i class="fas fa-tasks"></i> How It Works</span>
Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm
Control</em>),
you will proceed to the evaluation stage. In each round, you will be presented with an audio input.
<br>
<span><strong>🌰 Example:</strong></span>
<div class="audio-container">
<div class="audio-item">
<span>Audio Sample:</span>
<audio controls>
<source src="/static/audio/sample/input_audio.wav" type="audio/wav">
</audio>
</div>
</div>
The corresponding text is:
<em>"Say the following sentence at my speed first, then say it again very slowly:
'Artificial intelligence is changing the world in many ways.'"</em> 🧠
<small>(Note: the audio plays at 1.5x the normal speed.)</small>
<span class="section-title"><i class="fas fa-star"></i> Model Performance</span>
<div class="audio-container">
<div class="audio-item">
<span>ChatGPT-4o:</span>
<audio controls>
<source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
</audio>
</div>
<p style="margin: 0; text-align: left;">
🎙️ <strong>Speech:</strong> Partially followed the instruction on speed.
</p>
<p style="margin: 0; text-align: left;">
🧾 <strong>Semantics:</strong> Accurately followed the instruction, with no semantic deviation or
missing
information.
</p>
<br>
<div class="audio-item">
<span>FunAudioLLM:</span>
<audio controls>
<source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
</audio>
</div>
<p style="margin: 0; text-align: left;">
🎙️ <strong>Speech:</strong> Partially followed the instruction on speed.
</p>
<p style="margin: 0; text-align: left;">
🧾 <strong>Semantics:</strong> Accurately followed the instruction, with no semantic deviation or
missing
information.
</p>
<br>
<div class="audio-item">
<span>SpeechGPT:</span>
<audio controls>
<source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
</audio>
</div>
<p style="margin: 0; text-align: left;">
🎙️ <strong>Speech:</strong> Did not follow the instruction on speed.
</p>
<p style="margin: 0; text-align: left;">
🧾 <strong>Semantics:</strong> Partially followed the instruction, with minor semantic deviation and
missing information.
</p>
<br>
<div class="audio-item">
<span>Mini-Omni:</span>
<audio controls>
<source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
</audio>
</div>
<p style="margin: 0; text-align: left;">
🎙️ <strong>Speech:</strong> Did not follow the instruction on speed.
</p>
<p style="margin: 0; text-align: left;">
🧾 <strong>Semantics:</strong> Did not follow the instruction, with significant semantic deviation
and missing information.
</p>
</div>
<p class="text-start">
After making your choice, you'll proceed to the next round. 🔄
</p>
<p class="text-start">
<strong>Click the button below to start the evaluation! 🚀</strong>
</p>
</div>
<div class="text-center">
<a href="http://71.132.14.167:6002/" target="_blank" rel="noopener noreferrer" class="btn btn-primary"><i class="fas fa-play"></i>
Start Evaluation</a>
</div>
</div>
</body>
</html>