doberst committed on
Commit
9288fbc
1 Parent(s): bcadd8b

Upload README.md

  1. README.md +87 -0
README.md CHANGED
@@ -1,3 +1,90 @@
  ---
  license: cc-by-sa-4.0
+ inference: false
  ---
+
+ # SLIM-SA-NER-3B
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ **slim-sa-ner-3b** combines two of the most popular traditional classifier functions (**Sentiment Analysis** and **Named Entity Recognition**) and reimagines them as function calls on a specialized decoder-based LLM. The model generates a Python dictionary whose keys correspond to sentiment and to NER identifiers such as people, organization, and place, e.g.:
+
+ &nbsp;&nbsp;&nbsp;&nbsp;`{'sentiment': ['positive'], 'people': ['..'], 'organization': ['..'], 'place': ['..']}`
+
+ This 'combo' model is designed to illustrate the potential of function calls on small, specialized models: a single decoder model combines the capabilities of what were traditionally two separate encoder-based classifier models.
+
+ The intent of SLIMs is to forge a middle ground between traditional encoder-based classifiers and open-ended API-based LLMs, providing an intuitive, flexible natural language response, without complex prompting, and with improved generalization and the ability to fine-tune to a specific domain use case.
+
+
+ This model is fine-tuned on top of [**llmware/bling-stable-lm-3b-4e1t-v0**](https://huggingface.co/llmware/bling-stable-lm-3b-4e1t-v0), which in turn is a fine-tune of stabilityai/stablelm-3b-4e1t.
+
+ Each slim model has a 'quantized tool' version, e.g., [**'slim-sa-ner-3b-tool'**](https://huggingface.co/llmware/slim-sa-ner-3b-tool).
+
+
+ ## Prompt format:
+
+ `function = "classify"`
+ `params = "sentiment, people, organization, place"`
+ `prompt = "<human>: " + {text} + "\n" + `
+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp; &nbsp;`"<{function}> " + {params} + " </{function}>" + "\n<bot>:"`
+
+
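A minimal sketch of assembling this prompt in Python, assuming the same `<human>: ... <bot>:` wrapper used in the Transformers script in this card. The helper name `build_slim_prompt` is hypothetical (it is not part of transformers or llmware), and the default parameter string mirrors the keys passed in the LLMWare function call.

```python
def build_slim_prompt(text: str,
                      function: str = "classify",
                      params: str = "sentiment, people, organization, place") -> str:
    """Wrap the input text in the human/bot template with a function-call tag.

    Hypothetical helper for illustration; the template is an assumption
    taken from the Transformers script in this model card.
    """
    return f"<human>: {text}\n<{function}> {params} </{function}>\n<bot>:"


prompt = build_slim_prompt("Tesla stock declined yesterday 8% in premarket trading.")
print(prompt)
```

Keeping the template in one function makes it harder for the wrapper tokens and the parameter list to drift apart between scripts.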
+ <details>
+ <summary>Transformers Script</summary>
+
+     import ast
+
+     from transformers import AutoModelForCausalLM, AutoTokenizer
+
+     model = AutoModelForCausalLM.from_pretrained("llmware/slim-sa-ner-3b")
+     tokenizer = AutoTokenizer.from_pretrained("llmware/slim-sa-ner-3b")
+
+     function = "classify"
+     params = "sentiment, people, organization, place"
+
+     text = "Tesla stock declined yesterday 8% in premarket trading after a poorly-received event in San Francisco yesterday, in which the company indicated a likely shortfall in revenue."
+
+     prompt = "<human>: " + text + "\n" + f"<{function}> {params} </{function}>\n<bot>:"
+
+     inputs = tokenizer(prompt, return_tensors="pt")
+     # remember the prompt length so that only newly generated tokens are decoded
+     start_of_input = len(inputs.input_ids[0])
+
+     outputs = model.generate(
+         inputs.input_ids.to('cpu'),
+         eos_token_id=tokenizer.eos_token_id,
+         pad_token_id=tokenizer.eos_token_id,
+         do_sample=True,
+         temperature=0.3,
+         max_new_tokens=100
+     )
+
+     output_only = tokenizer.decode(outputs[0][start_of_input:], skip_special_tokens=True)
+
+     print("output only: ", output_only)
+
+     # here's the fun part: convert the string output into a python dictionary
+     try:
+         output_dict = ast.literal_eval(output_only)
+         print("success - converted to python dictionary automatically: ", output_dict)
+     except (ValueError, SyntaxError):
+         print("fail - could not convert to python dictionary automatically - ", output_only)
+
+ </details>
+
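The conversion step in the script above can be made somewhat more defensive. The following is a minimal sketch, assuming the output follows the dictionary shape shown at the top of this card; `parse_slim_output` and `EXPECTED_KEYS` are hypothetical names for illustration, and the sample string is made up rather than real model output.

```python
import ast

# Keys assumed from the example output in this model card.
EXPECTED_KEYS = ("sentiment", "people", "organization", "place")


def parse_slim_output(raw: str) -> dict:
    """Parse raw model output into a dict of lists.

    Returns an empty dict when the text is not a valid dictionary literal.
    """
    try:
        parsed = ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError, TypeError):
        return {}
    if not isinstance(parsed, dict):
        return {}
    # Keep only the expected keys and normalize every value to a list.
    result = {}
    for key in EXPECTED_KEYS:
        value = parsed.get(key, [])
        result[key] = list(value) if isinstance(value, (list, tuple)) else [value]
    return result


# Made-up sample string in the documented output shape (not real model output).
sample = "{'sentiment': ['negative'], 'people': [], 'organization': ['Tesla'], 'place': ['San Francisco']}"
print(parse_slim_output(sample))
print(parse_slim_output("not a dictionary"))
```

Returning an empty dictionary on failure is one possible design choice; callers that need to distinguish a malformed response from an empty result may prefer to raise or log instead.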
+ <details>
+ <summary>Using as Function Call in LLMWare</summary>
+
+     from llmware.models import ModelCatalog
+
+     # example input text (same passage as in the Transformers script)
+     text = "Tesla stock declined yesterday 8% in premarket trading after a poorly-received event in San Francisco yesterday, in which the company indicated a likely shortfall in revenue."
+
+     slim_model = ModelCatalog().load_model("llmware/slim-sa-ner-3b")
+     response = slim_model.function_call(text, params=["sentiment", "people", "organization", "place"], function="classify")
+
+     print("llmware - llm_response: ", response)
+
+ </details>
+
+
+ ## Model Card Contact
+
+ Darren Oberst & llmware team
+
+ [Join us on Discord](https://discord.gg/MhZn5Nc39h)