---
title: KITT
emoji: 🦀
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.36.1
app_file: space.py
pinned: false
license: mit
---

# KITT: Knowledge-based Intelligence for Transportation Technologies

Presented at [IEEE VNC 2024](https://ieee-vnc.org/2024/) (Vehicle Networking Conference) as a demo titled "Demo: Towards a Conversational LLM-Based Voice Assistant for Transportation Applications".

## Abstract

Conversational assistants based on large language models (LLMs) have spread widely across many domains, and the automotive industry is keen to follow suit.  
However, current LLMs lack sufficient understanding of geospatial data; in addition, timely information, such as weather and traffic conditions, is inaccessible to LLMs.
In this demo, we present an in-car assistant that communicates verbally with the driver. By using external APIs, it can answer questions about routing and points of interest, and it is aware of local weather and traffic conditions.
The assistant, including a customizable speech synthesizer, is accessible through a graphical user interface that facilitates experimentation by simulating the change in time, origin, destination, and location of the car.


## Description

This project integrates speech-to-text and text-to-speech functionalities into a car's infotainment system, using the LLaMA 3 model to process and respond to vocal queries from users. It employs Gradio for user interface creation, NexusRaven for function calling, and integrates various APIs to fetch real-time information, making it a comprehensive solution for creating a responsive and interactive car assistant.

## Features

- Speech-to-Text and Text-to-Speech: Enables the car assistant to listen to spoken questions and respond audibly, providing a hands-free experience for drivers and passengers.
- Intelligent Function Calling with NexusRaven: Executes commands and retrieves information based on user queries, using the LLaMA 3 model's capabilities.
- Dynamic Model Integration: Incorporates multiple models for language recognition, speech processing, and text generation.
- User-Friendly Gradio Interface: An easy-to-use interface for testing and deploying the speaking assistant within the car's infotainment system.
- Real-Time Information Retrieval: Integrates with various APIs to provide up-to-date information on weather, routes, points of interest, and more.
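
The function-calling feature can be illustrated with a minimal sketch. It assumes the function-calling model returns a single Python-style call as text, for example `get_weather(location='Luxembourg')`. The tool names `get_weather` and `find_route`, the `TOOLS` registry, and the `dispatch` helper are hypothetical and are not taken from the KITT code base; the stubs stand in for the external API calls.

```python
import ast
from typing import Any, Callable, Dict


# Hypothetical tool stubs: a real assistant would query external APIs here.
def get_weather(location: str) -> str:
    return f"Weather lookup requested for {location}"


def find_route(origin: str, destination: str) -> str:
    return f"Route requested from {origin} to {destination}"


# Only functions listed here may be invoked by the model's output.
TOOLS: Dict[str, Callable[..., Any]] = {
    "get_weather": get_weather,
    "find_route": find_route,
}


def dispatch(call_text: str) -> Any:
    """Parse one call such as "get_weather(location='Paris')" and run it.

    The text is parsed with `ast` instead of `eval`, so only registered
    tools with literal arguments are executed.
    """
    expr = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(expr, ast.Call) or not isinstance(expr.func, ast.Name):
        raise ValueError("expected a single call to a named function")
    name = expr.func.id
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    if any(kw.arg is None for kw in expr.keywords):
        raise ValueError("**kwargs unpacking is not supported")
    # literal_eval rejects anything that is not a plain literal.
    args = [ast.literal_eval(arg) for arg in expr.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in expr.keywords}
    return TOOLS[name](*args, **kwargs)


if __name__ == "__main__":
    print(dispatch("get_weather(location='Luxembourg')"))
    print(dispatch("find_route('Esch-sur-Alzette', destination='Luxembourg')"))
```

Restricting execution to an explicit registry keeps model output from reaching arbitrary Python code, which matters when the generated text is derived from spoken user input.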


## Authors

- Sasan Jafarnejad
- Abigail Berthe--Pardo


## License

KITT is released under the [MIT License](https://opensource.org/licenses/MIT).