sasan committed on
Commit
5c3722f
·
1 Parent(s): a6f6ce8

chore: Update LLaMA model to version 3 and refactor speech-to-text functionality

Files changed (1)
  1. README.md +13 -23
README.md CHANGED
@@ -1,41 +1,31 @@
1
- # Project Title: Talking car
2
 
3
- A speaking assistant designed for in-car use, leveraging the LLaMA 2 model to facilitate vocal interactions between the car and its users. This notebook provides the foundation for a speech-enabled interface that can understand spoken questions and respond verbally, enhancing the driving experience with intelligent assistance.
4
 
5
  ## Description
6
 
7
- This project integrates speech-to-text and text-to-speech functionalities into a car's infotainment system, using the LLaMA 2 model to process and respond to vocal queries from users. It uses Gradio for the user interface and NexusRaven for function calling, and it integrates various APIs to fetch real-time information, making it a responsive and interactive car assistant.
8
 
9
  ## Features
10
 
11
  • Speech-to-Text and Text-to-Speech: Enables the car assistant to listen to spoken questions and respond audibly, providing a hands-free experience for drivers and passengers.
12
- • Intelligent Function Calling with NexusRaven: Implements a sophisticated system for executing commands and retrieving information based on user queries, using the LLaMA 2 model's capabilities.
13
  • Dynamic Model Integration: Incorporates multiple models for language recognition, speech processing, and text generation.
14
  • User-Friendly Gradio Interface: Provides an easy-to-use interface for testing and deploying the speaking assistant within the car's infotainment system.
15
  • Real-Time Information Retrieval: Capable of integrating with various APIs to provide up-to-date information on weather, routes, points of interest, and more.
16
 
17
- ## Requirements
18
-
19
- • Gradio for creating interactive interfaces
20
- • Hugging Face Transformers and additional ML models for speech and language processing
21
- • NexusRaven for complex function execution
22
- All required libraries and packages are directly loaded inside the notebook.
23
-
24
- ## Installation
25
-
26
- To set up the speaking assistant in your car's system, follow these steps:
27
- 1. Run all the cells until the “Interfaces (text and audio)” section.
28
- 2. Choose which interface to run: audio-to-audio or text-to-text.
29
-
30
- ### Usage
31
- 1. Model Setup: Begin by loading the necessary models for speech recognition, language processing, and text-to-speech conversion as detailed in the "Models loads" section.
32
- 2. Function Definition: Customize the assistant's responses and capabilities by defining functions in the "Function calling with NexusRaven" section.
33
- 3. Interface Configuration: Choose the Gradio interface that suits your in-car system, following setup instructions in the "Interfaces (text and audio)" section.
34
- 4. Activation: Run one of the interfaces to start the speaking assistant, enabling vocal interactions within the car.
35
 
36
  ## Authors
37
 
38
- Sasan Jafarnejad
39
  Abigail Berthe--Pardo
40
 
41
 
 
1
+ # KITT: Knowledge-based Intelligence for Transportation Technologies
2
+
3
+ Presented at [IEEE VNC 2024](https://ieee-vnc.org/2024/) (Vehicle Networking Conference) as a demo titled "Demo: Towards a Conversational LLM-Based Voice Assistant for Transportation Applications".
4
+
5
+ ## Abstract
6
+
7
+ Conversational assistants based on large language models (LLMs) have spread widely across many domains, and the automotive industry is keen to follow suit.
8
+ However, current LLMs lack sufficient understanding of geospatial data; in addition, timely information, such as weather and traffic conditions, is inaccessible to LLMs.
9
+ In this demo, we present an in-car assistant that communicates verbally with the driver and, by using external APIs, answers questions about routing and points of interest while staying aware of local weather and traffic conditions.
10
+ The assistant, including a customizable speech synthesizer, is accessible through a graphical user interface that facilitates experimentation by simulating the change in time, origin, destination, and location of the car.
11
 
 
12
 
13
  ## Description
14
 
15
+ This project integrates speech-to-text and text-to-speech functionalities into a car's infotainment system, using the LLaMA 3 model to process and respond to vocal queries from users. It uses Gradio for the user interface and NexusRaven for function calling, and it integrates various APIs to fetch real-time information, making it a responsive and interactive car assistant.
16
 
17
  ## Features
18
 
19
  • Speech-to-Text and Text-to-Speech: Enables the car assistant to listen to spoken questions and respond audibly, providing a hands-free experience for drivers and passengers.
20
+ • Intelligent Function Calling with NexusRaven: Implements a sophisticated system for executing commands and retrieving information based on user queries, using the LLaMA 3 model's capabilities.
21
  • Dynamic Model Integration: Incorporates multiple models for language recognition, speech processing, and text generation.
22
  • User-Friendly Gradio Interface: Provides an easy-to-use interface for testing and deploying the speaking assistant within the car's infotainment system.
23
  • Real-Time Information Retrieval: Capable of integrating with various APIs to provide up-to-date information on weather, routes, points of interest, and more.
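
The function-calling feature listed above can be illustrated with a minimal sketch. It assumes the function-calling model returns a single Python-style call string such as `get_weather(location='Luxembourg')`. That output format, the tool names `get_weather` and `find_route`, and the `dispatch` helper are hypothetical and are not taken from this repository. The stubs return placeholder strings where a real assistant would query an external API.

```python
import ast


def get_weather(location: str) -> str:
    """Hypothetical tool: a real version would query a weather API."""
    return f"(stub) weather report for {location}"


def find_route(origin: str, destination: str) -> str:
    """Hypothetical tool: a real version would query a routing API."""
    return f"(stub) route from {origin} to {destination}"


# Registry of the functions the model is allowed to call (names are assumptions).
TOOLS = {"get_weather": get_weather, "find_route": find_route}


def dispatch(call_text: str) -> str:
    """Parse one call string emitted by the model and run the matching tool.

    The text is parsed with `ast` instead of `eval`, so only registered
    functions with literal keyword arguments can be executed.
    """
    node = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a single function call")
    if node.func.id not in TOOLS:
        raise ValueError(f"unknown tool: {node.func.id}")
    if node.args or any(kw.arg is None for kw in node.keywords):
        raise ValueError("only plain keyword arguments are supported")
    # literal_eval accepts AST nodes and rejects anything that is not a literal.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return TOOLS[node.func.id](**kwargs)


if __name__ == "__main__":
    print(dispatch("get_weather(location='Luxembourg')"))
```

Keeping an explicit registry means the language model can only trigger functions the developer has listed, which matters when the call text comes from transcribed speech.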
24
 
25
 
26
  ## Authors
27
 
28
+ Sasan Jafarnejad
29
  Abigail Berthe--Pardo
30
 
31