Ericu950 committed 08552f6 (1 parent: 4e9b1dc)

Update README.md

Files changed (1): README.md (+94, -0)

README.md CHANGED
@@ -15,6 +15,10 @@ tags:
  This is a fine-tuned version of the Llama-3.1-8B-Instruct model, specialized in assigning a date to Greek documentary papyri. On a test set of 1,856 unseen papyri, its predictions were, on average, 21.7 years away from the actual date spans.
 
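A note on the headline metric: a prediction that falls inside a document's attested date span presumably counts as an error of zero, and otherwise as the gap to the nearer endpoint. A minimal sketch of such a span-distance metric (an assumed reconstruction for illustration, not the evaluation code actually used):

```python
def years_from_span(prediction: int, span_start: int, span_end: int) -> int:
    """Distance in years from a predicted year to a date span.

    Zero if the prediction falls inside the span, otherwise the gap to
    the nearer endpoint. (Hypothetical reconstruction of the metric.)
    """
    if span_start <= prediction <= span_end:
        return 0
    return min(abs(prediction - span_start), abs(prediction - span_end))

# Example: the papyrus used later in this card is dated 71-72 AD.
print(years_from_span(71, 71, 72))  # → 0 (inside the span)
print(years_from_span(50, 71, 72))  # → 21 (years before the span)
```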
  ## Usage
 
  To run the model, use the following code:
@@ -100,4 +104,94 @@ You should get this output:
  Year: 71 or 72 AD
  Suggestion 1: 71
  ```
+ ## Dataset
+
+ This model is fine-tuned on the Ericu950/Papyri_1 dataset, which consists of Greek documentary papyri texts and their corresponding dates, sourced from the amazing Papyri.info.
+
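The transcriptions in this card appear as lowercase Greek with diacritics stripped (see the example fragment further down). If you want to date your own editions, a normalization along these lines may help; this is a sketch under that assumption, not a documented preprocessing step of the dataset:

```python
import unicodedata

def normalize_greek(text: str) -> str:
    """Lowercase Greek text and strip combining diacritics.

    NFD-decompose so accents/breathings become combining marks, then drop them.
    (Assumed input format; does not convert final sigma to medial sigma.)
    """
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(normalize_greek("Ὁμολογεῖ Παυσιρίων"))  # ομολογει παυσιριων
```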
+ ## Usage on free tier in Google Colab
+
+ If you don’t have access to larger GPUs but want to try the model out, you can run it in a quantized format in Google Colab. **The quality of the responses might deteriorate significantly.** Follow these steps:
+
+ ### Step 1: Install Dependencies
+ ```
+ !pip install -U bitsandbytes
+ ```
+ After installing, **restart the runtime**.
+
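As a rough sanity check that 4-bit quantization fits on Colab's free-tier GPU (about 15 GB of memory), here is a back-of-envelope estimate of the weights-only footprint; the parameter count (~8.03 billion for Llama-3.1-8B) is an approximation, and activation and double-quantization overhead come on top:

```python
# Weights-only memory estimate for an ~8B-parameter model in 4-bit NF4.
params = 8.03e9          # approximate parameter count (assumption)
bytes_per_param = 0.5    # 4 bits per weight
weights_gb = params * bytes_per_param / 1024**3
print(f"~{weights_gb:.1f} GB for weights alone")
```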
+ ### Step 2: Run the model
+
+ ```
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
+ import torch
+
+ # 4-bit NF4 quantization so the 8B model fits in free-tier GPU memory
+ quant_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Ericu950/Papy_1_Llama-3.1-8B-Instruct_date",
+     device_map="auto",
+     quantization_config=quant_config,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Ericu950/Papy_1_Llama-3.1-8B-Instruct_date")
+
+ generation_pipeline = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     device_map="auto",
+ )
+
+ papyrus_edition = """
+ ετουσ τεταρτου αυτοκρατοροσ καισαροσ ουεσπασιανου σεβαστου ------------------
+ ομολογει παυσιριων απολλωνιου του παυσιριωνοσ μητροσ ---------------τωι γεγονοτι αυτωι
+ εκ τησ γενομενησ και μετηλλαχυιασ αυτου γυναικοσ -------------------------
+ απο τησ αυτησ πολεωσ εν αγυιαι συγχωρειν ειναι ----------------------------------
+ --------------------σ αυτωι εξ ησ συνεστιν ------------------------------------
+ ----τησ αυτησ γενεασ την υπαρχουσαν αυτωι οικιαν ------------
+ ------------------ ---------καὶ αιθριον και αυλη απερ ο υιοσ διοκοροσ --------------------------
+ --------εγραψεν του δ αυτου διοσκορου ειναι ------------------------------------
+ ---------- και προ κατενγεγυηται τα δικαια --------------------------------------
+ νησ κατα τουσ τησ χωρασ νομουσ· εαν δε μη ---------------------------------------
+ υπ αυτου τηι του διοσκορου σημαινομενηι -----------------------------------ενοικισμωι του
+ ημισουσ μερουσ τησ προκειμενησ οικιασ --------------------------------- διοσκοροσ την τουτων αποχην
+ ---------------------------------------------μηδ υπεναντιον τουτοισ επιτελειν μηδε
+ ------------------------------------------------ ανασκευηι κατ αυτησ τιθεσθαι ομολογιαν μηδε
+ ----------------------------------- επιτελεσαι η χωρισ του κυρια ειναι τα διομολογημενα
+ παραβαινειν, εκτεινειν δε τον παραβησομενον τωι υιωι διοσκορωι η τοισ παρ αυτου καθ εκαστην
+ εφοδον το τε βλαβοσ και επιτιμον αργυριου δραχμασ 0 και εισ το δημοσιον τασ ισασ και μηθεν
+ ησσον· δ -----ιων ομολογιαν συνεχωρησεν·"""
+
+ system_prompt = "Date this papyrus fragment to an exact year!"
+
+ input_messages = [
+     {"role": "system", "content": system_prompt},
+     {"role": "user", "content": papyrus_edition},
+ ]
+
+ # Beam search over a few new tokens; each returned sequence is a year suggestion
+ outputs = generation_pipeline(
+     input_messages,
+     max_new_tokens=4,
+     num_beams=10,
+     num_return_sequences=1,
+     early_stopping=True,
+ )
+
+ # Collect the assistant replies from the returned chat transcripts
+ beam_contents = []
+ for output in outputs:
+     generated_text = output.get('generated_text', [])
+     for item in generated_text:
+         if item.get('role') == 'assistant':
+             beam_contents.append(item.get('content'))
+
+ real_response = "71 or 72 AD"
+
+ print(f"Year: {real_response}")
+ for i, content in enumerate(beam_contents, start=1):
+     print(f"Suggestion {i}: {content}")
+ ```
+ ### Expected Output:
+ ```
+ Year: 71 or 72 AD
+ Suggestion 1: 71
+ ```
197