Sandiago21/falcon-7b-prompt-answering

@@ -5,8 +5,8 @@ language:
 library_name: transformers
 pipeline_tag: text-generation
 tags:
-- llama
-- decapoda-research-7b-hf
 - prompt answering
 - peft
 ---
@@ -88,9 +88,11 @@ Use the code below to get started with the model.
 ```python
 import torch
 from peft import PeftConfig, PeftModel
-from transformers import GenerationConfig, AutoTokenizer, AutoModelForCausalLM
-MODEL_NAME = "Sandiago21/falcon-7b-prompt-answering"
 compute_dtype = getattr(torch, "float16")
@@ -112,13 +114,13 @@ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
 model = PeftModel.from_pretrained(model, MODEL_NAME)
-generation_config = GenerationConfig(
-    temperature=0.2,
-    top_p=0.75,
-    top_k=40,
-    num_beams=4,
-    max_new_tokens=32,
-)
 model.eval()
 if torch.__version__ >= "2":
@@ -144,7 +146,7 @@ with torch.no_grad():
 response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
 print(response)
->>> The capital city of Greece is Athens and it borders Turkey, Bulgaria, Macedonia, Albania, and the Aegean Sea.
 ```
 2. You can also directly call the model from HuggingFace using the following code snippet:
@@ -152,7 +154,7 @@ print(response)
 ```python
 import torch
 from peft import PeftConfig, PeftModel
-from transformers import GenerationConfig, AutoTokenizer, AutoModelForCausalLM
 MODEL_NAME = "Sandiago21/falcon-7b-prompt-answering"
 BASE_MODEL = "tiiuae/falcon-7b"
@@ -166,8 +168,6 @@ bnb_config = BitsAndBytesConfig(
     bnb_4bit_use_double_quant=True,
 )
-MODEL_NAME = "Sandiago21/falcon-7b-prompt-answering"
 model = AutoModelForCausalLM.from_pretrained(
     BASE_MODEL,
     quantization_config=bnb_config,
@@ -179,13 +179,13 @@ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
 model = PeftModel.from_pretrained(model, MODEL_NAME)
-generation_config = GenerationConfig(
-    temperature=0.2,
-    top_p=0.75,
-    top_k=40,
-    num_beams=4,
-    max_new_tokens=32,
-)
 model.eval()
 if torch.__version__ >= "2":
@@ -212,7 +212,7 @@ with torch.no_grad():
 response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
 print(response)
->>> The capital city of Greece is Athens and it borders Turkey, Bulgaria, Macedonia, Albania, and the Aegean Sea.
 ```
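The `torch.__version__ >= "2"` guard in the snippets above compares strings lexicographically. That works for 1.x and 2.x releases, but it would misorder any version whose major number has two digits. A minimal sketch of a numeric check follows, assuming version strings shaped like `2.1.0+cu118`; the helper name `major_version` is hypothetical and not part of the model card.

```python
import re


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.1.0+cu118'."""
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(f"no leading version number in {version!r}")
    return int(match.group())


# String comparison is lexicographic, so "10.0.0" sorts before "2".
print("10.0.0" >= "2")                     # False
# Comparing the parsed major number gives the intended ordering.
print(major_version("10.0.0") >= 2)        # True
print(major_version("2.1.0+cu118") >= 2)   # True
print(major_version("1.13.1") >= 2)        # False
```

In the snippets, the guard could then read `if major_version(torch.__version__) >= 2:`, leaving the body of the `if` unchanged.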
 ## Training Details
@@ -245,12 +245,10 @@ The following hyperparameters were used during training:
 The tiiuae/falcon-7b model was fine-tuned on conversation and question-answering data.
 ### Training Procedure
 The tiiuae/falcon-7b model was further fine-tuned on question-answering and prompt data for 1 epoch (approximately 10 hours of training on a single GPU).
 ## Model Architecture and Objective
 The model is based on tiiuae/falcon-7b, with adapters fine-tuned on top of the base model using conversation and question-answering data.

 library_name: transformers
 pipeline_tag: text-generation
 tags:
+- falcon
+- falcon-7b
 - prompt answering
 - peft
 ---
 ```python
 import torch
 from peft import PeftConfig, PeftModel
+from transformers import GenerationConfig, AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+MODEL_NAME = "."
+config = PeftConfig.from_pretrained(MODEL_NAME)
 compute_dtype = getattr(torch, "float16")
 model = PeftModel.from_pretrained(model, MODEL_NAME)
+generation_config = model.generation_config
+generation_config.top_p = 0.7
+generation_config.num_return_sequences = 1
+generation_config.max_new_tokens = 32
+generation_config.use_cache = False
+generation_config.pad_token_id = tokenizer.eos_token_id
+generation_config.eos_token_id = tokenizer.eos_token_id
 model.eval()
 if torch.__version__ >= "2":
 response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
 print(response)
+>>> The capital city of Greece is Athens and it borders Albania, Bulgaria, Macedonia, and Turkey.
 ```
 2. You can also directly call the model from HuggingFace using the following code snippet:
 ```python
 import torch
 from peft import PeftConfig, PeftModel
+from transformers import GenerationConfig, AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
 MODEL_NAME = "Sandiago21/falcon-7b-prompt-answering"
 BASE_MODEL = "tiiuae/falcon-7b"
     bnb_4bit_use_double_quant=True,
 )
 model = AutoModelForCausalLM.from_pretrained(
     BASE_MODEL,
     quantization_config=bnb_config,
 model = PeftModel.from_pretrained(model, MODEL_NAME)
+generation_config = model.generation_config
+generation_config.top_p = 0.7
+generation_config.num_return_sequences = 1
+generation_config.max_new_tokens = 32
+generation_config.use_cache = False
+generation_config.pad_token_id = tokenizer.eos_token_id
+generation_config.eos_token_id = tokenizer.eos_token_id
 model.eval()
 if torch.__version__ >= "2":
 response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
 print(response)
+>>> The capital city of Greece is Athens and it borders Albania, Bulgaria, Macedonia, and Turkey.
 ```
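The updated snippets set `top_p = 0.7` on the generation config. In `transformers`, `top_p` only takes effect when sampling is enabled (`do_sample=True`); the `generate` call itself is not visible in this diff. As a rough illustration of what nucleus (top-p) filtering does, here is a minimal pure-Python sketch of the general technique. It is not the `transformers` implementation, and the function name `top_p_filter` and the toy probabilities are made up for the example.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most-likely tokens whose total probability
    reaches top_p, then renormalize the kept probabilities to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy next-token distribution (hypothetical values, not model output).
toy_probs = {"Athens": 0.5, "Sparta": 0.25, "Rome": 0.15, "Paris": 0.1}
filtered = top_p_filter(toy_probs, top_p=0.7)
print(sorted(filtered))  # ['Athens', 'Sparta']
```

With `top_p=0.7`, the two most likely tokens already cover 0.75 of the probability mass, so the remaining tokens are dropped before sampling.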
 ## Training Details
 The tiiuae/falcon-7b model was fine-tuned on conversation and question-answering data.
 ### Training Procedure
 The tiiuae/falcon-7b model was further fine-tuned on question-answering and prompt data for 1 epoch (approximately 10 hours of training on a single GPU).
 ## Model Architecture and Objective
 The model is based on tiiuae/falcon-7b, with adapters fine-tuned on top of the base model using conversation and question-answering data.

notebooks/HuggingFace-Inference-Falcon.ipynb CHANGED

@@ -101,7 +101,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "6072bb1e",
    "metadata": {},
    "outputs": [
     {
@@ -164,7 +164,7 @@
   {
    "cell_type": "code",
    "execution_count": 6,
-   "id": "af8527bd",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -665,7 +665,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "61ec99a8",
    "metadata": {},
    "outputs": [],
    "source": []

   {
    "cell_type": "code",
    "execution_count": 4,
+   "id": "fd681dd1",
    "metadata": {},
    "outputs": [
     {
   {
    "cell_type": "code",
    "execution_count": 6,
+   "id": "78a786cc",
    "metadata": {},
    "outputs": [],
    "source": [
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "b061a441",
    "metadata": {},
    "outputs": [],
    "source": []
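The notebook hunks shown above change only each cell's `id` field. A minimal sketch of one way to confirm that two notebook versions differ only in cell ids, assuming the nbformat 4.5 JSON layout (a top-level `cells` list whose entries may carry an `id` key). The helper names `strip_cell_ids` and `same_apart_from_ids` are hypothetical, and the inline JSON mirrors the last cell in the diff rather than the full notebook.

```python
import copy
import json


def strip_cell_ids(notebook: dict) -> dict:
    """Return a deep copy of a notebook dict with every cell's 'id' removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        cell.pop("id", None)
    return cleaned


def same_apart_from_ids(old: dict, new: dict) -> bool:
    """True when the two notebooks are identical once cell ids are ignored."""
    return strip_cell_ids(old) == strip_cell_ids(new)


# The empty trailing cell before and after the change, as shown in the diff.
old_nb = json.loads(
    '{"cells": [{"cell_type": "code", "execution_count": null,'
    ' "id": "61ec99a8", "metadata": {}, "outputs": [], "source": []}]}'
)
new_nb = json.loads(
    '{"cells": [{"cell_type": "code", "execution_count": null,'
    ' "id": "b061a441", "metadata": {}, "outputs": [], "source": []}]}'
)

print(old_nb == new_nb)                     # False: the ids differ
print(same_apart_from_ids(old_nb, new_nb))  # True: nothing else changed
```

For real files, the two dicts would come from `json.load` on the old and new `.ipynb` files instead of the inline strings.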