PREFIX = """You are an Internet Search Scraper with access to an external set of tools.
Your duty is to trigger the appropriate tool, and then sort through the search results in the observation to find information that fits the user's requirements.
Deny the user's request to perform any search that can be considered dangerous, harmful, illegal, or potentially illegal.

Make sure your information is current
Current Date and Time is:
{timestamp}

You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH_ENGINE action_input=SEARCH_ENGINE_URL/?q=SEARCH_QUERY
- action: SCRAPE_WEBSITE action_input=WEBSITE_URL
- action: COMPLETE

Search Purpose:
{purpose}
"""
FINDER = """

Instructions
- Use the provided tool to find a website to scrape
- Use the provided tool to scrape the text from the website URL
- Find the pertinent information in the text that you scrape
- If you find conflicting data, decide which source is more accurate
- When you are finished, return with\naction: COMPLETE

Use the following format:
task: choose the next action from your available tools
action: the action to take (should be one of [UPDATE-TASK, SEARCH_ENGINE, SCRAPE_WEBSITE, COMPLETE]) action_input=XXX

Example:
User command: Find me the breaking news from today
action: SEARCH_ENGINE action_input=https://www.google.com/search?q=todays+breaking+news


Progress:
{history}"""
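
# Illustrative sketch, not part of the original module: one way a caller might
# pull the chosen tool out of a model reply that follows the
# "action: NAME action_input=VALUE" format described in FINDER above.
# parse_action and _ACTION_RE are hypothetical names introduced here.
import re

# The action_input part is optional because COMPLETE takes no input.
_ACTION_RE = re.compile(
    r"^action:[ \t]*([A-Z_-]+)(?:[ \t]+action_input=(.*))?[ \t]*$",
    re.MULTILINE,
)


def parse_action(response):
    """Return (action_name, action_input) for the first action line, or None."""
    match = _ACTION_RE.search(response)
    if match is None:
        return None
    # A missing action_input is reported as an empty string.
    return match.group(1), (match.group(2) or "").strip()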

FINDER_OG = """

Instructions
- Use the provided tool to find a website to scrape
- Use the provided tool to scrape the text from the website URL
- Find the pertinent information in the text that you scrape
- If you find conflicting data, decide which source is more accurate
- When you are finished, return with\naction: COMPLETE

Use the following format:
task: choose the next action from your available tools
action: the action to take (should be one of [UPDATE-TASK, SEARCH_ENGINE, SCRAPE_WEBSITE, COMPLETE]) action_input=XXX

Example:
User command: Find me the breaking news from today
action: SEARCH_ENGINE action_input=https://www.google.com/search?q=todays+breaking+news


Progress:
{history}"""
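
# Illustrative sketch, not part of the original module: one way a prefix
# template and a finder template could be joined and filled in, assuming the
# caller renders them with str.format. build_prompt and its parameters are
# hypothetical names; PREFIX and FINDER would be passed as the two templates.
from datetime import datetime, timezone


def build_prompt(prefix, body, purpose, history, now=None):
    """Fill {timestamp}, {purpose}, and {history} in the joined templates."""
    # Default to the current UTC time so the model sees a fresh timestamp.
    now = now or datetime.now(timezone.utc)
    return (prefix + body).format(
        timestamp=now.isoformat(), purpose=purpose, history=history
    )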

MODEL_FINDER_PRE = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model from these options: 
>{TASKS}
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
- When you are finished, return with  action: COMPLETE
Use the following format:
task: the input task you must complete
thought: you should always think about what to do
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
thought: you should always think after an observation
action: SEARCH action_input='text-generation'
... (thought/action/observation/thought can repeat N times)
Example:
***************************
User command: Find me a text generation model with fewer than 100M parameters.
thought: I will use the option 'text-generation'
action: SEARCH action_input=text-generation
--- pause and wait for data to be returned ---
Response:
Assistant: I found the 'distilgpt2' model, which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face.
action: COMPLETE
***************************
You are attempting to complete the task
task: {task}
{history}"""


ACTION_PROMPT = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
Use the following format:
task: the input task you must complete
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
action: SEARCH action_input='text generation'
You are attempting to complete the task
task: {task}
{history}"""

ACTION_PROMPT_PRE = """
You have access to the following tools:
- action: UPDATE-TASK action_input=NEW_TASK
- action: SEARCH action_input=SEARCH_QUERY
- action: COMPLETE
Instructions
- Generate a search query for the requested model
- Return the search query using the search tool
- Wait for the search to return a result
- After observing the search result, choose a model
- Return the name of the repo and model ("repo/model")
Use the following format:
task: the input task you must complete
thought: you should always think about what to do
action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX
observation: the result of the action
thought: you should always think after an observation
action: SEARCH action_input='text generation'
... (thought/action/observation/thought can repeat N times)
You are attempting to complete the task
task: {task}
{history}"""

TASK_PROMPT = """
You are attempting to complete the task
task: {task}
Progress:
{history}
Tasks should be small, isolated, and independent
To start a search use the format:
action: SEARCH_ENGINE action_input=URL/?q='SEARCH_QUERY'
What should the task be for us to achieve the purpose?
task: """
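
# Illustrative sketch, not part of the original module: building the
# SEARCH_ENGINE action line that TASK_PROMPT asks for, with the query
# URL-encoded. build_search_action is a hypothetical helper, and the engine
# URL is whatever the caller chooses.
from urllib.parse import urlencode


def build_search_action(engine_url, query):
    """Return an 'action: SEARCH_ENGINE action_input=...' line for a query."""
    # urlencode escapes reserved characters and turns spaces into '+'.
    query_string = urlencode({"q": query})
    return f"action: SEARCH_ENGINE action_input={engine_url}?{query_string}"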


COMPRESS_DATA_PROMPT_SMALL = """
You are attempting to complete the task
task: {task}
Current data:
{knowledge}
New data:
{history}
Compress the data above into a concise data presentation of relevant data
Include datapoints that will provide greater accuracy in completing the task
Return the data in JSON format to save space
"""

SAVE_MEMORY = """
You are attempting to complete the task
task: {task}
Current data:
{knowledge}
New data:
{history}
Instructions:
Compile and categorize the data above into a JSON dictionary string
Include ALL datapoints, titles, descriptions, and source urls indexed into an easy to search JSON format
Your final response should be only the final formatted JSON string enclosed in brackets, and nothing else.
Required keys:
"keywords":["short", "list", "of", "keywords", "relevant", "to", "this", "entry"]
"title":"title of entry"
"description":"description of entry"
"content":"full content of data about entry"
"url":"https://url.source"

"""
UNUSED = """
Example Response:
{"results": [ { "title": "Current weather - Florida - AccuWeather Forecast for Today & Tomorrow", "description": "Get the current weather in Florida, including temperature, wind speed, and humidity. See the 10-day forecast for Florida.", "url": "https://www.accuweather.com/en/us/florida/weather-forecast/351103", "datapoints": { "date_time": "2024-01-30 07:55:43.207290", "temperature": { "value": 27, "unit": "C" }, "humidity": { "value": 55, "unit": "%" }, "wind_speed": { "value": 10, "unit": "km/h" } } } ] }
"""

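# Illustrative sketch, not part of the original module: checking that a reply
# to the SAVE_MEMORY prompt carries every key the prompt lists as required.
# It assumes the reply is a single JSON object; REQUIRED_MEMORY_KEYS and
# missing_memory_keys are hypothetical names introduced here.
import json

REQUIRED_MEMORY_KEYS = ("keywords", "title", "description", "content", "url")


def missing_memory_keys(raw_response):
    """Return the required keys that are absent from a JSON object reply."""
    entry = json.loads(raw_response)  # raises json.JSONDecodeError if malformed
    if not isinstance(entry, dict):
        raise ValueError("expected a JSON object at the top level")
    return [key for key in REQUIRED_MEMORY_KEYS if key not in entry]
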
COMPRESS_DATA_PROMPT = """
You are attempting to complete the task
task: {task}
Current data:
{knowledge}
New data:
{history}
Compress the data above into a concise data presentation of relevant data
Include all datapoints and source urls that will provide greater accuracy in completing the task
"""

COMPRESS_HISTORY_PROMPT = """
You are attempting to complete the task
task: {task}
Progress:
{history}
Compress the timeline of progress above into a single summary (as a paragraph)
Include all important milestones, the current challenges, and implementation details necessary to proceed
"""

LOG_PROMPT = """
PROMPT
**************************************
{}
**************************************
"""

LOG_RESPONSE = """
RESPONSE
**************************************
{}
**************************************
"""


FINDER1 = """
Example Response 1:
User command: Find me a text generation model with fewer than 100M parameters.
Query: text generation
--- pause and wait for data to be returned ---
Assistant: I found the 'distilgpt2' model which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face. Here's how to load it:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
```
Example Response 2:
User command: Help me locate a multilingual Named Entity Recognition model.
Query: named entity recognition
--- pause and wait for data to be returned ---
Assistant: I discovered the 'dbmdz/bert-base-multilingual-cased' model, which supports named entity recognition across multiple languages. Here's how to load it:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-base-multilingual-cased")
```
Example Response 3:
User command: Search for a question-answering model fine-tuned on the SQuAD v2 dataset with more than 90% accuracy.
Query: question answering
--- pause and wait for data to be returned ---
Assistant: I found the 'pranavkv/roberta-base-squad2' model, which was fine-tuned on the SQuAD v2 dataset and achieves approximately 91% accuracy. Below is the way to load it:
```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
tokenizer = AutoTokenizer.from_pretrained("pranavkv/roberta-base-squad2")
model = AutoModelForQuestionAnswering.from_pretrained("pranavkv/roberta-base-squad2")
```
"""