---
title: KITT
emoji: 🦀
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.36.1
app_file: space.py
pinned: false
license: mit
---
# KITT: Knowledge-based Intelligence for Transportation Technologies
Presented at [IEEE VNC 2024](https://ieee-vnc.org/2024/) (Vehicular Networking Conference) as a demo titled "Demo: Towards a Conversational LLM-Based Voice Assistant for Transportation Applications".
## Abstract
Conversational assistants based on large language models (LLMs) have spread widely across many domains, and the automotive industry is keen to follow suit.
However, current LLMs lack sufficient understanding of geospatial data; in addition, timely information, such as weather and traffic conditions, is inaccessible to LLMs.
In this demo, we present an in-car assistant that communicates verbally with the driver and, by utilizing external APIs, answers questions about routing and points of interest while remaining aware of local weather and traffic conditions.
The assistant, including a customizable speech synthesizer, is accessible through a graphical user interface that facilitates experimentation by simulating the change in time, origin, destination, and location of the car.
## Description
This project brings speech-to-text and text-to-speech to a car's infotainment system and uses the LLaMA 3 model to process and answer spoken queries. It uses Gradio for the user interface, NexusRaven for function calling, and several external APIs to fetch real-time information. Together these components form a responsive, interactive car assistant.
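The flow described above (spoken query in, spoken answer out) can be pictured as three chained stages. The following is only a minimal sketch under stated assumptions: `AssistantPipeline`, its field names, and the `fake_*` stand-ins are hypothetical and are not taken from the KITT code base; real stages would wrap a speech recognizer, the LLM with function calling, and a speech synthesizer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AssistantPipeline:
    """Hypothetical wiring of the three stages; each stage is a plain callable."""

    transcribe: Callable[[bytes], str]   # speech-to-text
    respond: Callable[[str], str]        # LLM plus function calling
    synthesize: Callable[[str], bytes]   # text-to-speech

    def handle(self, audio: bytes) -> bytes:
        # Run one voice turn: audio in, audio out.
        query = self.transcribe(audio)
        answer = self.respond(query)
        return self.synthesize(answer)


# Stand-in stages so the sketch runs without any model or audio hardware.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(query: str) -> str:
    return f"You asked: {query}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = AssistantPipeline(fake_transcribe, fake_respond, fake_synthesize)
reply_audio = pipeline.handle(b"What is the weather in Luxembourg?")
print(reply_audio.decode("utf-8"))
```

Injecting the stages as callables keeps each one replaceable, so a different recognizer or synthesizer can be tried without touching the rest of the flow.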
## Features
- Speech-to-Text and Text-to-Speech: Lets the car assistant listen to spoken questions and respond audibly, providing a hands-free experience for drivers and passengers.
- Intelligent Function Calling with NexusRaven: Executes commands and retrieves information based on user queries, drawing on the LLaMA 3 model's capabilities.
- Dynamic Model Integration: Incorporates multiple models for language recognition, speech processing, and text generation.
- User-Friendly Gradio Interface: An easy-to-use interface for testing and deploying the voice assistant within the car's infotainment system.
- Real-Time Information Retrieval: Integrates with various APIs to provide up-to-date information on weather, routes, points of interest, and more.
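To illustrate how function calling and information retrieval might fit together, here is a rough sketch that assumes the function-calling model returns its choice as a Python-style call string. The tool names (`get_weather`, `find_route`), the `TOOLS` registry, and `dispatch` are hypothetical placeholders rather than KITT's actual functions, and the tools return canned strings instead of querying a real API.

```python
import ast
from typing import Callable, Dict


def get_weather(location: str) -> str:
    """Hypothetical tool; a real one would query a weather API."""
    return f"(placeholder) weather report for {location}"


def find_route(origin: str, destination: str) -> str:
    """Hypothetical tool; a real one would query a routing API."""
    return f"(placeholder) route from {origin} to {destination}"


# Only functions listed here can be invoked by the model's output.
TOOLS: Dict[str, Callable[..., str]] = {
    "get_weather": get_weather,
    "find_route": find_route,
}


def dispatch(call_text: str) -> str:
    """Parse one call such as "get_weather(location='Paris')" and run the tool."""
    node = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a single call to a named function")
    name = node.func.id
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    if any(kw.arg is None for kw in node.keywords):
        raise ValueError("**kwargs unpacking is not supported")
    # literal_eval accepts only literals, so no arbitrary code is executed.
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return TOOLS[name](*args, **kwargs)


print(dispatch("get_weather(location='Luxembourg')"))
```

Parsing with `ast` rather than calling `eval` restricts the model's output to whitelisted tools with literal arguments, which is a sensible precaution whenever generated text drives code execution.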
## Authors
- Sasan Jafarnejad
- Abigail Berthe--Pardo
## License
KITT is released under the [MIT License](https://opensource.org/licenses/MIT).