Update README.md
This model is fine-tuned on the Ericu950/Papyri_1 dataset, which consists of Greek papyri.
## Usage

To run the model on a GPU with sufficient memory, follow these steps:

### 1. Download and load the model

```python
import json
from transformers import pipeline, AutoTokenizer, LlamaForCausalLM
import torch

model_id = "Ericu950/Papy_1_Llama-3.1-8B-Instruct_date"

model = LlamaForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
)
```
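If the full-precision weights do not fit on your card, `from_pretrained` also takes a `torch_dtype` argument. The half-precision load below is not part of the original instructions, just a common way to roughly halve the memory footprint:

```python
# Optional variant (assumption: your GPU supports bfloat16): load the weights in half precision.
model = LlamaForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```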

### 2. Run inference on a papyrus fragment of your choice

```python
# This is a rough transcription of Pap.Ups. 106
papyrus_edition = """
ετουσ τεταρτου αυτοκρατοροσ καισαροσ ουεσπασιανου σεβαστου ------------------
ομολογει παυσιριων απολλωνιου του παuσιριωνοσ μητροσ ---------------τωι γεγονοτι αυτωι
...
ησσον· δ -----ιων ομολογιαν συνεχωρησεν·
"""
system_prompt = "Date this papyrus fragment to an exact year!"
input_messages = [
    # ... (messages omitted here; see the sketch below for one possible layout) ...
]
outputs = generation_pipeline(
    input_messages,
    max_new_tokens=4,
    num_beams=45,  # Set this as high as your memory will allow!
    num_return_sequences=1,
    early_stopping=True,
)

# ... (extraction of real_response and beam_contents omitted; see the sketch below)
for i, content in enumerate(beam_contents, start=1):
    print(f"Suggestion {i}: {content}")
```
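The block above omits how `input_messages` is assembled and how the beam outputs are turned into `real_response` and `beam_contents`. A minimal sketch of that glue, assuming the standard chat-message format for Llama 3.1 Instruct models and the list-of-messages output of the `text-generation` pipeline (the original post-processing may differ):

```python
# Hypothetical glue code; names match the snippet above.
input_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": papyrus_edition},
]

outputs = generation_pipeline(
    input_messages,
    max_new_tokens=4,
    num_beams=45,
    num_return_sequences=1,
    early_stopping=True,
)

# Each returned sequence is the chat history with the assistant's answer appended last.
beam_contents = [out["generated_text"][-1]["content"] for out in outputs]
real_response = beam_contents[0]

print(f"Year: {real_response}")
for i, content in enumerate(beam_contents, start=1):
    print(f"Suggestion {i}: {content}")
```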

### Expected Output:
```
Year: 71 or 72 AD
Suggestion 1: 71
```
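The loop prints one suggestion per returned beam. With `num_return_sequences=1` you only get the single best beam; to see several candidate years, raise it (it must not exceed `num_beams`). For example:

```python
# Ask beam search for its five best candidate datings instead of only the top one.
outputs = generation_pipeline(
    input_messages,
    max_new_tokens=4,
    num_beams=45,
    num_return_sequences=5,  # must be <= num_beams
    early_stopping=True,
)
```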
## Usage on free tier in Google Colab
If you don’t have access to a larger GPU but want to try the model out, you can run it in a quantized format in Google Colab. **The quality of the responses might deteriorate significantly.** Follow these steps:
### Step 1: Install Dependencies
```
!pip install -U bitsandbytes
import os
os._exit(00)  # hard-exits the Python process so Colab restarts the runtime with the freshly installed package
```

### Step 2: Download and quantize the model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
import torch

# ... (the BitsAndBytesConfig setup and the quantized model/tokenizer loading are
#      omitted here; see the sketch after this block) ...

generation_pipeline = pipeline(
    # ...
    tokenizer=tokenizer,
    device_map="auto",
)
```
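The middle of the block above (the quantization settings and the quantized model and tokenizer loading) is not shown. A minimal sketch of what it could look like, assuming 4-bit NF4 quantization via bitsandbytes; the exact settings used by the model author may differ:

```python
# Hypothetical 4-bit loading; the parameter choices are illustrative.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "Ericu950/Papy_1_Llama-3.1-8B-Instruct_date"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```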

### Step 3: Run inference on a papyrus fragment of your choice

```python
# This is a rough transcription of Pap.Ups. 106
papyrus_edition = """
ετουσ τεταρτου αυτοκρατοροσ καισαροσ ουεσπασιανου σεβαστου ------------------
ομολογει παυσιριων απολλωνιου του παuσιριωνοσ μητροσ ---------------τωι γεγονοτι αυτωι