StevenChen16 committed
Commit 786f9d6
1 Parent(s): 1c274b4
Completely refactor the code
app.py
CHANGED
@@ -1,196 +1,24 @@
|
|
1 |
import gradio as gr
|
2 |
import os
|
3 |
import spaces
|
|
|
4 |
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
|
5 |
from threading import Thread
|
6 |
from langchain_community.vectorstores.faiss import FAISS
|
7 |
from langchain_huggingface import HuggingFaceEmbeddings
|
8 |
from huggingface_hub import snapshot_download
|
9 |
|
10 |
-
|
11 |
-
|
12 |
-
|
13 |
-
|
14 |
-
|
15 |
-
|
16 |
|
17 |
-
@spaces.GPU(duration=120)
|
18 |
-
class RAGChatbot:
|
19 |
-
def __init__(self):
|
20 |
-
# First create embeddings directly
|
21 |
-
self.embeddings = create_embedding_model('intfloat/multilingual-e5-large-instruct')
|
22 |
-
# Then initialize other models
|
23 |
-
self.init_models()
|
24 |
-
# Finally initialize vector store
|
25 |
-
self.init_vector_store()
|
26 |
-
|
27 |
-
self.background_prompt = '''
|
28 |
-
As an AI legal assistant, you are a highly trained expert in U.S. and Canadian law. Your purpose is to provide accurate, comprehensive, and professional legal information to assist users with a wide range of legal questions. When answering questions, you should actively ask questions to obtain more information, analyze from different perspectives, and explain your reasoning process to the user.
|
29 |
-
|
30 |
-
In addition to providing general legal advice and analysis, you are also capable of assisting clients with drafting and reviewing standardized contracts and legal documents. However, your primary role is still to provide personalized legal guidance through interactive conversations with clients.
|
31 |
-
|
32 |
-
Please adhere to the following guidelines:
|
33 |
-
|
34 |
-
1. Clarify the question:
|
35 |
-
|
36 |
-
- Ask questions to clarify the user's specific situation and needs to provide the most relevant and targeted advice.
|
37 |
-
- However, if the user has already provided sufficient background information, avoid excessively pressing for details. Focus on understanding the core of the issue, rather than unnecessary minutiae.
|
38 |
-
|
39 |
-
2. Gather information:
|
40 |
-
|
41 |
-
- Identify the key information needed to answer the question and proactively ask the user for these details.
|
42 |
-
- When gathering information, be sure to identify which details are directly relevant to the legal analysis of the case. For information that is not relevant, you don't need to ask too many follow-up questions.
|
43 |
-
- If the user indicates that they have provided all relevant information, accept this and do not continue to demand more details.
|
44 |
-
|
45 |
-
3. Multi-perspective analysis:
|
46 |
-
|
47 |
-
- Evaluate legal issues from different viewpoints, considering various possible interpretations and applications.
|
48 |
-
- Present arguments supporting and opposing specific perspectives to comprehensively clarify complex issues.
|
49 |
-
- In your analysis, strive to balance comprehensiveness and conciseness. Provide thorough analysis, but also ensure that the user can easily understand and absorb the information.
|
50 |
-
|
51 |
-
4. Explain reasoning:
|
52 |
-
|
53 |
-
- Explain the main legal principles, regulations, and case law you consider when analyzing the issue.
|
54 |
-
- Clarify how you apply legal knowledge to the user's specific situation and the logic behind your conclusions.
|
55 |
-
- When explaining your reasoning, use clear and concise language, avoiding excessive length or repetition.
|
56 |
-
|
57 |
-
5. Interactive dialogue:
|
58 |
-
|
59 |
-
- Encourage the user to participate in the discussion, ask follow-up questions, and share their thoughts and concerns.
|
60 |
-
- Dynamically adjust your analysis and recommendations based on new information obtained in the conversation.
|
61 |
-
- In your interactions, be attentive to the user's needs and concerns. If they express satisfaction or indicate that they don't require more information, respect their wishes.
|
62 |
-
|
63 |
-
6. Professional advice:
|
64 |
-
|
65 |
-
- Provide clear, actionable legal advice, but also emphasize the necessity of consulting a professional lawyer before making a final decision.
|
66 |
-
- If clients wish to speak with a lawyer, you can introduce our team (WealthWizards), which consists of lawyers with different specializations and orientations.
|
67 |
-
- When providing advice, use language that is easy to understand and communicate with a tone of empathy and care. Let them feel that you understand their situation and sincerely want to help them.
|
68 |
-
|
69 |
-
7. Assistance with standardized contracts and legal documents:
|
70 |
-
|
71 |
-
- When clients request assistance with drafting or reviewing standardized contracts and legal documents, provide guidance and support to the best of your abilities.
|
72 |
-
- Analyze the client's needs and requirements, and offer suggestions on appropriate contract templates or clauses to include.
|
73 |
-
- Review drafted documents for potential legal issues or areas that may need improvement, and provide constructive feedback.
|
74 |
-
- However, always remind clients that while you can assist with drafting and review, final documents should still be reviewed and approved by a licensed attorney.
|
75 |
-
|
76 |
-
Please remember that your role is to provide general legal information and analysis, but also to actively guide and interact with the user during the conversation in a personalized and professional manner. If you feel that necessary information is missing to provide targeted analysis and advice, take the initiative to ask until you believe you have sufficient details. However, also be mindful to avoid over-inquiring or disregarding the user's needs and concerns.
|
77 |
-
|
78 |
-
When assisting with standardized contracts and documents, aim to provide value-added services while still maintaining the importance of attorney review. Your contract assistance should be a supplement to, not a replacement for, the interactive legal guidance that is your primary function.
|
79 |
-
|
80 |
-
Now, please guide me step by step to describe the legal issues I am facing, according to the above requirements.
|
81 |
-
'''
|
82 |
-
|
83 |
-
# @spaces.GPU
|
84 |
-
def init_models(self):
|
85 |
-
"""Initialize the LLM model"""
|
86 |
-
print("Initializing LLM model...")
|
87 |
-
self.llm_model_name = 'StevenChen16/llama3-8b-Lawyer'
|
88 |
-
self.tokenizer = AutoTokenizer.from_pretrained(self.llm_model_name)
|
89 |
-
self.model = AutoModelForCausalLM.from_pretrained(
|
90 |
-
self.llm_model_name,
|
91 |
-
device_map="auto"
|
92 |
-
)
|
93 |
-
self.terminators = [
|
94 |
-
self.tokenizer.eos_token_id,
|
95 |
-
self.tokenizer.convert_tokens_to_ids("<|eot_id|>")
|
96 |
-
]
|
97 |
-
print("LLM model initialized successfully")
|
98 |
-
|
99 |
-
def init_vector_store(self):
|
100 |
-
"""Load vector store from HuggingFace Hub"""
|
101 |
-
try:
|
102 |
-
print("Downloading vector store from HuggingFace Hub...")
|
103 |
-
# Download FAISS files from HuggingFace Hub
|
104 |
-
repo_path = snapshot_download(
|
105 |
-
repo_id="StevenChen16/laws.faiss",
|
106 |
-
repo_type="model"
|
107 |
-
)
|
108 |
-
|
109 |
-
print("Loading vector store...")
|
110 |
-
# Load the vector store from downloaded files
|
111 |
-
self.vector_store = FAISS.load_local(
|
112 |
-
folder_path=repo_path,
|
113 |
-
embeddings=self.embeddings,
|
114 |
-
allow_dangerous_deserialization=True
|
115 |
-
)
|
116 |
-
print("Vector store loaded successfully")
|
117 |
-
|
118 |
-
except Exception as e:
|
119 |
-
raise RuntimeError(f"Failed to load vector store from HuggingFace Hub: {str(e)}")
|
120 |
-
|
121 |
-
def get_relevant_context(self, query, k=4):
|
122 |
-
"""Retrieve relevant context from vector store"""
|
123 |
-
retriever = self.vector_store.as_retriever(
|
124 |
-
search_type="similarity_score_threshold",
|
125 |
-
search_kwargs={
|
126 |
-
"score_threshold": 0.7,
|
127 |
-
"k": k
|
128 |
-
}
|
129 |
-
)
|
130 |
-
docs = retriever.invoke(query)
|
131 |
-
return "\n".join(doc.page_content for doc in docs)
|
132 |
-
|
133 |
-
# @spaces.GPU(duration=120)
|
134 |
-
def generate_response(self, message, history, temperature=0.6, max_new_tokens=4096):
|
135 |
-
"""Generate streaming response with RAG context"""
|
136 |
-
# Get relevant context
|
137 |
-
context = self.get_relevant_context(message)
|
138 |
-
|
139 |
-
# Build conversation history
|
140 |
-
conversation = []
|
141 |
-
for user, assistant in history:
|
142 |
-
conversation.extend([
|
143 |
-
{"role": "user", "content": user},
|
144 |
-
{"role": "assistant", "content": assistant}
|
145 |
-
])
|
146 |
-
|
147 |
-
# Add context and background prompt to message
|
148 |
-
if context:
|
149 |
-
enhanced_message = f"""Based on the following legal context:
|
150 |
-
{context}
|
151 |
-
|
152 |
-
{self.background_prompt}
|
153 |
-
|
154 |
-
User question: {message}"""
|
155 |
-
else:
|
156 |
-
enhanced_message = message + self.background_prompt
|
157 |
-
|
158 |
-
conversation.append({"role": "user", "content": enhanced_message})
|
159 |
-
|
160 |
-
# Generate response
|
161 |
-
input_ids = self.tokenizer.apply_chat_template(
|
162 |
-
conversation,
|
163 |
-
return_tensors="pt"
|
164 |
-
).to(self.model.device)
|
165 |
-
|
166 |
-
streamer = TextIteratorStreamer(
|
167 |
-
self.tokenizer,
|
168 |
-
timeout=10.0,
|
169 |
-
skip_prompt=True,
|
170 |
-
skip_special_tokens=True
|
171 |
-
)
|
172 |
-
|
173 |
-
generate_kwargs = dict(
|
174 |
-
input_ids=input_ids,
|
175 |
-
streamer=streamer,
|
176 |
-
max_new_tokens=max_new_tokens,
|
177 |
-
do_sample=True,
|
178 |
-
temperature=temperature,
|
179 |
-
eos_token_id=self.terminators,
|
180 |
-
)
|
181 |
-
|
182 |
-
if temperature == 0:
|
183 |
-
generate_kwargs['do_sample'] = False
|
184 |
-
|
185 |
-
t = Thread(target=self.model.generate, kwargs=generate_kwargs)
|
186 |
-
t.start()
|
187 |
-
|
188 |
-
outputs = []
|
189 |
-
for text in streamer:
|
190 |
-
outputs.append(text)
|
191 |
-
yield "".join(outputs)
|
192 |
-
|
193 |
-
# Gradio interface constants
|
194 |
DESCRIPTION = '''
|
195 |
<div style="display: flex; align-items: center; justify-content: center; text-align: center;">
|
196 |
<a href="https://wealthwizards.org/" target="_blank">
|
@@ -218,6 +46,7 @@ PLACEHOLDER = """
|
|
218 |
</div>
|
219 |
"""
|
220 |
|
|
|
221 |
css = """
|
222 |
h1 {
|
223 |
text-align: center;
|
@@ -231,16 +60,179 @@ h1 {
|
|
231 |
}
|
232 |
"""
|
233 |
|
234 |
-
#
|
235 |
-
|
236 |
|
237 |
-
|
238 |
-
|
239 |
-
|
240 |
|
|
|
241 |
gr.ChatInterface(
|
242 |
-
fn=
|
243 |
-
chatbot=
|
244 |
fill_height=True,
|
245 |
examples=[
|
246 |
['What are the key differences between a sole proprietorship and a partnership?'],
|
@@ -250,9 +242,9 @@ with gr.Blocks(css=css) as demo:
|
|
250 |
['How can I protect my intellectual property when sharing my idea with potential investors?']
|
251 |
],
|
252 |
cache_examples=False,
|
253 |
-
|
254 |
|
255 |
gr.Markdown(LICENSE)
|
256 |
-
|
257 |
if __name__ == "__main__":
|
258 |
demo.launch()
|
|
|
1 |
import gradio as gr
|
2 |
import os
|
3 |
import spaces
|
4 |
+
|
5 |
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
|
6 |
from threading import Thread
|
7 |
from langchain_community.vectorstores.faiss import FAISS
|
8 |
from langchain_huggingface import HuggingFaceEmbeddings
|
9 |
from huggingface_hub import snapshot_download
|
10 |
|
11 |
+
# Read the Hugging Face access token from the environment (None if unset)
|
12 |
+
HF_TOKEN = os.environ.get("HF_TOKEN", None)
|
13 |
+
|
14 |
+
|
15 |
+
|
16 |
+
MODEL_NAME_OR_PATH = 'StevenChen16/llama3-8b-Lawyer'
|
17 |
+
# MODEL_NAME_OR_PATH = 'nvidia/Llama3-ChatQA-1.5-8B'
|
18 |
+
|
19 |
+
|
20 |
+
|
21 |
|
22 |
DESCRIPTION = '''
|
23 |
<div style="display: flex; align-items: center; justify-content: center; text-align: center;">
|
24 |
<a href="https://wealthwizards.org/" target="_blank">
|
|
|
46 |
</div>
|
47 |
"""
|
48 |
|
49 |
+
|
50 |
css = """
|
51 |
h1 {
|
52 |
text-align: center;
|
|
|
60 |
}
|
61 |
"""
|
62 |
|
63 |
+
# Load the tokenizer and model
|
64 |
+
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME_OR_PATH)
|
65 |
+
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME_OR_PATH, device_map="auto") # to("cuda:0")
|
66 |
+
terminators = [
|
67 |
+
tokenizer.eos_token_id,
|
68 |
+
tokenizer.convert_tokens_to_ids("<|eot_id|>")
|
69 |
+
]
|
70 |
|
71 |
+
def create_embedding_model(model_name):
|
72 |
+
"""Create embedding model instance"""
|
73 |
+
return HuggingFaceEmbeddings(
|
74 |
+
model_name=model_name,
|
75 |
+
model_kwargs={'trust_remote_code': True}
|
76 |
+
)
|
77 |
+
embedding_model = create_embedding_model('intfloat/multilingual-e5-large-instruct')
|
78 |
+
try:
|
79 |
+
print("Downloading vector store from HuggingFace Hub...")
|
80 |
+
# Download FAISS files from HuggingFace Hub
|
81 |
+
repo_path = snapshot_download(
|
82 |
+
repo_id="StevenChen16/laws.faiss",
|
83 |
+
repo_type="model"
|
84 |
+
)
|
85 |
+
|
86 |
+
print("Loading vector store...")
|
87 |
+
# Load the vector store from downloaded files
|
88 |
+
vector_store = FAISS.load_local(
|
89 |
+
folder_path=repo_path,
|
90 |
+
embeddings=embedding_model,
|
91 |
+
allow_dangerous_deserialization=True
|
92 |
+
)
|
93 |
+
print("Vector store loaded successfully")
|
94 |
+
|
95 |
+
except Exception as e:
|
96 |
+
raise RuntimeError(f"Failed to load vector store from HuggingFace Hub: {str(e)}")
|
97 |
+
|
98 |
+
|
99 |
+
|
100 |
+
background_prompt = '''
|
101 |
+
As an AI legal assistant, you are a highly trained expert in U.S. and Canadian law. Your purpose is to provide accurate, comprehensive, and professional legal information to assist users with a wide range of legal questions. When answering questions, you should actively ask questions to obtain more information, analyze from different perspectives, and explain your reasoning process to the user.
|
102 |
+
|
103 |
+
In addition to providing general legal advice and analysis, you are also capable of assisting clients with drafting and reviewing standardized contracts and legal documents. However, your primary role is still to provide personalized legal guidance through interactive conversations with clients.
|
104 |
+
|
105 |
+
Please adhere to the following guidelines:
|
106 |
+
|
107 |
+
1. Clarify the question:
|
108 |
+
|
109 |
+
- Ask questions to clarify the user's specific situation and needs to provide the most relevant and targeted advice.
|
110 |
+
- However, if the user has already provided sufficient background information, avoid excessively pressing for details. Focus on understanding the core of the issue, rather than unnecessary minutiae.
|
111 |
+
|
112 |
+
2. Gather information:
|
113 |
+
|
114 |
+
- Identify the key information needed to answer the question and proactively ask the user for these details.
|
115 |
+
- When gathering information, be sure to identify which details are directly relevant to the legal analysis of the case. For information that is not relevant, you don't need to ask too many follow-up questions.
|
116 |
+
- If the user indicates that they have provided all relevant information, accept this and do not continue to demand more details.
|
117 |
+
|
118 |
+
3. Multi-perspective analysis:
|
119 |
+
|
120 |
+
- Evaluate legal issues from different viewpoints, considering various possible interpretations and applications.
|
121 |
+
- Present arguments supporting and opposing specific perspectives to comprehensively clarify complex issues.
|
122 |
+
- In your analysis, strive to balance comprehensiveness and conciseness. Provide thorough analysis, but also ensure that the user can easily understand and absorb the information.
|
123 |
+
|
124 |
+
4. Explain reasoning:
|
125 |
+
|
126 |
+
- Explain the main legal principles, regulations, and case law you consider when analyzing the issue.
|
127 |
+
- Clarify how you apply legal knowledge to the user's specific situation and the logic behind your conclusions.
|
128 |
+
- When explaining your reasoning, use clear and concise language, avoiding excessive length or repetition.
|
129 |
+
|
130 |
+
5. Interactive dialogue:
|
131 |
+
|
132 |
+
- Encourage the user to participate in the discussion, ask follow-up questions, and share their thoughts and concerns.
|
133 |
+
- Dynamically adjust your analysis and recommendations based on new information obtained in the conversation.
|
134 |
+
- In your interactions, be attentive to the user's needs and concerns. If they express satisfaction or indicate that they don't require more information, respect their wishes.
|
135 |
+
|
136 |
+
6. Professional advice:
|
137 |
+
|
138 |
+
- Provide clear, actionable legal advice, but also emphasize the necessity of consulting a professional lawyer before making a final decision.
|
139 |
+
- If clients wish to speak with a lawyer, you can introduce our team (WealthWizards), which consists of lawyers with different specializations and orientations.
|
140 |
+
- When providing advice, use language that is easy to understand and communicate with a tone of empathy and care. Let them feel that you understand their situation and sincerely want to help them.
|
141 |
+
|
142 |
+
7. Assistance with standardized contracts and legal documents:
|
143 |
+
|
144 |
+
- When clients request assistance with drafting or reviewing standardized contracts and legal documents, provide guidance and support to the best of your abilities.
|
145 |
+
- Analyze the client's needs and requirements, and offer suggestions on appropriate contract templates or clauses to include.
|
146 |
+
- Review drafted documents for potential legal issues or areas that may need improvement, and provide constructive feedback.
|
147 |
+
- However, always remind clients that while you can assist with drafting and review, final documents should still be reviewed and approved by a licensed attorney.
|
148 |
+
|
149 |
+
Please remember that your role is to provide general legal information and analysis, but also to actively guide and interact with the user during the conversation in a personalized and professional manner. If you feel that necessary information is missing to provide targeted analysis and advice, take the initiative to ask until you believe you have sufficient details. However, also be mindful to avoid over-inquiring or disregarding the user's needs and concerns.
|
150 |
+
|
151 |
+
When assisting with standardized contracts and documents, aim to provide value-added services while still maintaining the importance of attorney review. Your contract assistance should be a supplement to, not a replacement for, the interactive legal guidance that is your primary function.
|
152 |
+
|
153 |
+
Now, please guide me step by step to describe the legal issues I am facing, according to the above requirements.
|
154 |
+
'''
|
155 |
+
|
156 |
+
def query_vector_store(vector_store: FAISS, query, k=4, relevance_threshold=0.8):
|
157 |
+
"""
|
158 |
+
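Query the vector store for similar documents.
|
159 |
+
Args:
|
160 |
+
vector_store (FAISS): The vector store instance.
|
161 |
+
query (str): The query text.
|
162 |
+
k (int): The number of documents to return.
|
163 |
+
relevance_threshold (float): The relevance threshold.
|
164 |
+
Returns:
|
165 |
+
context (list): The retrieved context passages.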
|
166 |
+
"""
|
167 |
+
retriever = vector_store.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": relevance_threshold, "k": k})
|
168 |
+
similar_docs = retriever.invoke(query)
|
169 |
+
context = [doc.page_content for doc in similar_docs]
|
170 |
+
return context
|
171 |
+
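The retrieval step in `query_vector_store` keeps at most `k` documents whose relevance score clears a threshold. A minimal, library-free sketch of that filtering logic follows. It assumes scores are already normalized to [0, 1] with higher meaning more relevant; `filter_by_relevance` and the sample data are hypothetical illustrations, not part of this Space or of LangChain.

```python
from typing import List, Tuple


def filter_by_relevance(
    scored_docs: List[Tuple[str, float]],
    k: int = 4,
    relevance_threshold: float = 0.8,
) -> List[str]:
    """Return up to k document texts whose score is >= relevance_threshold.

    Hypothetical helper: scores are assumed to lie in [0, 1], higher is better.
    """
    # Rank by descending score so the best matches come first.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    # Keep the top-k candidates, then drop any that fall below the threshold.
    return [text for text, score in ranked[:k] if score >= relevance_threshold]


# Made-up sample data for illustration only.
sample = [
    ("Partnership Act s. 2", 0.91),
    ("Unrelated tax bulletin", 0.42),
    ("Sole proprietorship liability", 0.83),
]
context = filter_by_relevance(sample, k=4, relevance_threshold=0.8)
```

Because the result can be an empty list, callers should test it for truthiness rather than comparing it with `None`.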
|
172 |
+
@spaces.GPU(duration=120)
|
173 |
+
def chat_llama3_8b(message: str,
|
174 |
+
history: list,
|
175 |
+
temperature=0.6,
|
176 |
+
max_new_tokens=4096
|
177 |
+
) -> str:
|
178 |
+
"""
|
179 |
+
Generate a streaming response using the llama3-8b model.
|
180 |
+
Args:
|
181 |
+
message (str): The input message.
|
182 |
+
history (list): The conversation history used by ChatInterface.
|
183 |
+
temperature (float): The temperature for generating the response.
|
184 |
+
max_new_tokens (int): The maximum number of new tokens to generate.
|
185 |
+
Returns:
|
186 |
+
str: The generated response.
|
187 |
+
"""
|
188 |
+
citation = query_vector_store(vector_store, message, 4, 0.7)
|
189 |
+
if citation:
|
190 |
+
context = "Based on these citations:\n" + "\n".join(citation) + "\nPlease answer the question: "
|
191 |
+
conversation = []
|
192 |
+
for user, assistant in history:
|
193 |
+
# content = background_prompt + user
|
194 |
+
conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
|
195 |
+
if citation:
|
196 |
+
message = background_prompt + context + message
|
197 |
+
else:
|
198 |
+
message = background_prompt + message
|
199 |
+
conversation.append({"role": "user", "content": message})
|
200 |
+
|
201 |
+
input_ids = tokenizer.apply_chat_template(conversation, return_tensors="pt").to(model.device)
|
202 |
+
|
203 |
+
streamer = TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)
|
204 |
+
|
205 |
+
generate_kwargs = dict(
|
206 |
+
input_ids= input_ids,
|
207 |
+
streamer=streamer,
|
208 |
+
max_new_tokens=max_new_tokens,
|
209 |
+
do_sample=True,
|
210 |
+
temperature=temperature,
|
211 |
+
eos_token_id=terminators,
|
212 |
+
)
|
213 |
+
# Use greedy decoding (do_sample=False) when temperature is 0, since sampling requires a strictly positive temperature.
|
214 |
+
if temperature == 0:
|
215 |
+
generate_kwargs['do_sample'] = False
|
216 |
+
|
217 |
+
t = Thread(target=model.generate, kwargs=generate_kwargs)
|
218 |
+
t.start()
|
219 |
+
|
220 |
+
outputs = []
|
221 |
+
for text in streamer:
|
222 |
+
outputs.append(text)
|
223 |
+
#print(outputs)
|
224 |
+
yield "".join(outputs)
|
225 |
+
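The streaming loop in `chat_llama3_8b` runs generation on a worker thread and yields the accumulated text each time a new chunk arrives. The sketch below reproduces that producer/consumer pattern with the standard library only. `fake_generate` stands in for `model.generate`, and the queue with a sentinel stands in for the streamer; all names here are hypothetical, and this is not how `TextIteratorStreamer` is necessarily implemented.

```python
import queue
import threading
from typing import Iterator, List

_DONE = object()  # sentinel marking the end of the stream


def fake_generate(tokens: List[str], out: "queue.Queue") -> None:
    """Stand-in for model.generate: push each token, then the sentinel."""
    for tok in tokens:
        out.put(tok)
    out.put(_DONE)


def stream_response(tokens: List[str], timeout: float = 10.0) -> Iterator[str]:
    """Yield the accumulated text after each token arrives."""
    out: "queue.Queue" = queue.Queue()
    worker = threading.Thread(target=fake_generate, args=(tokens, out))
    worker.start()
    pieces: List[str] = []
    while True:
        # Raises queue.Empty if the producer stalls longer than `timeout`.
        item = out.get(timeout=timeout)
        if item is _DONE:
            break
        pieces.append(item)
        yield "".join(pieces)
    worker.join()


snapshots = list(stream_response(["A ", "partnership ", "has..."]))
```

Yielding the full accumulated string, rather than only the newest chunk, matches what the chat function does so that the UI can redraw the whole message on every update.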
|
226 |
+
|
227 |
+
# Gradio block
|
228 |
+
chatbot=gr.Chatbot(height=600, placeholder=PLACEHOLDER, label='Gradio ChatInterface')
|
229 |
+
|
230 |
+
with gr.Blocks(fill_height=True, css=css) as demo:
|
231 |
|
232 |
+
gr.Markdown(DESCRIPTION)
|
233 |
gr.ChatInterface(
|
234 |
+
fn=chat_llama3_8b,
|
235 |
+
chatbot=chatbot,
|
236 |
fill_height=True,
|
237 |
examples=[
|
238 |
['What are the key differences between a sole proprietorship and a partnership?'],
|
|
|
242 |
['How can I protect my intellectual property when sharing my idea with potential investors?']
|
243 |
],
|
244 |
cache_examples=False,
|
245 |
+
)
|
246 |
|
247 |
gr.Markdown(LICENSE)
|
248 |
+
|
249 |
if __name__ == "__main__":
|
250 |
demo.launch()
|
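The prompt assembly in `chat_llama3_8b` can be sketched in isolation: prior turns become alternating user/assistant messages, and the final user message is prefixed with the background prompt plus any retrieved citations. `build_conversation` is a hypothetical helper written for illustration, and it assumes `history` is a list of `(user, assistant)` pairs as in the function above.

```python
from typing import Dict, List, Tuple


def build_conversation(
    history: List[Tuple[str, str]],
    message: str,
    background_prompt: str,
    citations: List[str],
) -> List[Dict[str, str]]:
    """Build a chat-template-ready message list (hypothetical helper)."""
    conversation: List[Dict[str, str]] = []
    # Replay earlier turns as alternating user/assistant messages.
    for user, assistant in history:
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    # Only mention citations when retrieval actually returned something.
    if citations:
        context = (
            "Based on these citations:\n"
            + "\n".join(citations)
            + "\nPlease answer the question: "
        )
        content = background_prompt + context + message
    else:
        content = background_prompt + message
    conversation.append({"role": "user", "content": content})
    return conversation


demo_conversation = build_conversation(
    history=[("Hi", "Hello, how can I help?")],
    message="What is a partnership?",
    background_prompt="You are a legal assistant.\n",
    citations=["Partnership Act s. 2"],
)
```

Joining the citation list into a single string before concatenation avoids the `TypeError` that arises from adding a `str` and a `list`.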