Ericu950 committed 08552f6 (1 parent: 4e9b1dc)

Update README.md

Files changed (1): README.md (+94, -0)

README.md CHANGED
@@ -15,6 +15,10 @@ tags:
  This is a fine-tuned version of the Llama-3.1-8B-Instruct model, specialized in assigning a date to Greek documentary papyri. On a test set of 1,856 unseen papyri, its predictions were, on average, 21.7 years away from the actual date spans.
 
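A note on the headline metric: a prediction that falls inside a document's attested date span presumably counts as an error of zero, and otherwise as the gap to the nearer endpoint. A minimal sketch of such a span-distance metric (an assumed reconstruction for illustration, not the evaluation code actually used):

```python
def years_from_span(prediction: int, span_start: int, span_end: int) -> int:
    """Distance in years from a predicted year to a date span.

    Zero if the prediction falls inside the span, otherwise the gap to
    the nearer endpoint. (Hypothetical reconstruction of the metric.)
    """
    if span_start <= prediction <= span_end:
        return 0
    return min(abs(prediction - span_start), abs(prediction - span_end))

# Example: the papyrus used later in this card is dated 71-72 AD.
print(years_from_span(71, 71, 72))  # → 0 (inside the span)
print(years_from_span(50, 71, 72))  # → 21 (years before the span)
```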
  ## Usage
 
  To run the model, use the following code:
@@ -100,4 +104,94 @@ You should get this output:
  Year: 71 or 72 AD
  Suggestion 1: 71
  ```
+ ## Dataset
+
+ This model is fine-tuned on the Ericu950/Papyri_1 dataset, which consists of Greek documentary papyri texts and their corresponding dates, sourced from the amazing Papyri.info.
+
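The transcriptions in this card appear as lowercase Greek with diacritics stripped (see the example fragment further down). If you want to date your own editions, a normalization along these lines may help; this is a sketch under that assumption, not a documented preprocessing step of the dataset:

```python
import unicodedata

def normalize_greek(text: str) -> str:
    """Lowercase Greek text and strip combining diacritics.

    NFD-decompose so accents/breathings become combining marks, then drop them.
    (Assumed input format; does not convert final sigma to medial sigma.)
    """
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(normalize_greek("Ὁμολογεῖ Παυσιρίων"))  # ομολογει παυσιριων
```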
+ ## Usage on free tier in Google Colab
+
+ If you don’t have access to larger GPUs but want to try the model out, you can run it in a quantized format in Google Colab. **The quality of the responses might deteriorate significantly.** Follow these steps:
+
+ ### Step 1: Install Dependencies
+ ```
+ !pip install -U bitsandbytes
+ ```
+ After installing, **restart the runtime**.
+
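As a rough sanity check that 4-bit quantization fits on Colab's free-tier GPU (about 15 GB of memory), here is a back-of-envelope estimate of the weights-only footprint; the parameter count (~8.03 billion for Llama-3.1-8B) is an approximation, and activation and double-quantization overhead come on top:

```python
# Weights-only memory estimate for an ~8B-parameter model in 4-bit NF4.
params = 8.03e9          # approximate parameter count (assumption)
bytes_per_param = 0.5    # 4 bits per weight
weights_gb = params * bytes_per_param / 1024**3
print(f"~{weights_gb:.1f} GB for weights alone")
```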
+ ### Step 2: Run the model
+
+ ```
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
+ import torch
+
+ # 4-bit NF4 quantization so the 8B model fits in free-tier GPU memory
+ quant_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Ericu950/Papy_1_Llama-3.1-8B-Instruct_date",
+     device_map="auto",
+     quantization_config=quant_config,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Ericu950/Papy_1_Llama-3.1-8B-Instruct_date")
+
+ generation_pipeline = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     device_map="auto",
+ )
+
+ papyrus_edition = """
+ ετουσ τεταρτου αυτοκρατοροσ καισαροσ ουεσπασιανου σεβαστου ------------------
+ ομολογει παυσιριων απολλωνιου του παυσιριωνοσ μητροσ ---------------τωι γεγονοτι αυτωι
+ εκ τησ γενομενησ και μετηλλαχυιασ αυτου γυναικοσ -------------------------
+ απο τησ αυτησ πολεωσ εν αγυιαι συγχωρειν ειναι ----------------------------------
+ --------------------σ αυτωι εξ ησ συνεστιν ------------------------------------
+ ----τησ αυτησ γενεασ την υπαρχουσαν αυτωι οικιαν ------------
+ ------------------ ---------καὶ αιθριον και αυλη απερ ο υιοσ διοκοροσ --------------------------
+ --------εγραψεν του δ αυτου διοσκορου ειναι ------------------------------------
+ ---------- και προ κατενγεγυηται τα δικαια --------------------------------------
+ νησ κατα τουσ τησ χωρασ νομουσ· εαν δε μη ---------------------------------------
+ υπ αυτου τηι του διοσκορου σημαινομενηι -----------------------------------ενοικισμωι του
+ ημισουσ μερουσ τησ προκειμενησ οικιασ --------------------------------- διοσκοροσ την τουτων αποχην
+ ---------------------------------------------μηδ υπεναντιον τουτοισ επιτελειν μηδε
+ ------------------------------------------------ ανασκευηι κατ αυτησ τιθεσθαι ομολογιαν μηδε
+ ----------------------------------- επιτελεσαι η χωρισ του κυρια ειναι τα διομολογημενα
+ παραβαινειν, εκτεινειν δε τον παραβησομενον τωι υιωι διοσκορωι η τοισ παρ αυτου καθ εκαστην
+ εφοδον το τε βλαβοσ και επιτιμον αργυριου δραχμασ 0 και εισ το δημοσιον τασ ισασ και μηθεν
+ ησσον· δ -----ιων ομολογιαν συνεχωρησεν·"""
+
+ system_prompt = "Date this papyrus fragment to an exact year!"
+
+ input_messages = [
+     {"role": "system", "content": system_prompt},
+     {"role": "user", "content": papyrus_edition},
+ ]
+
+ # Beam search over a few new tokens; each returned sequence is a year suggestion
+ outputs = generation_pipeline(
+     input_messages,
+     max_new_tokens=4,
+     num_beams=10,
+     num_return_sequences=1,
+     early_stopping=True,
+ )
+
+ # Collect the assistant replies from the returned chat transcripts
+ beam_contents = []
+ for output in outputs:
+     generated_text = output.get('generated_text', [])
+     for item in generated_text:
+         if item.get('role') == 'assistant':
+             beam_contents.append(item.get('content'))
+
+ real_response = "71 or 72 AD"
+
+ print(f"Year: {real_response}")
+ for i, content in enumerate(beam_contents, start=1):
+     print(f"Suggestion {i}: {content}")
+ ```
+ ### Expected Output:
+ ```
+ Year: 71 or 72 AD
+ Suggestion 1: 71
+ ```
197