ttronrud committed
Commit 72bc16a
1 Parent(s): fe6de98

Updated model card to reflect changes to model.
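
For context on the model change this model-card update documents: the diff lowers `LORA_R` from 16 to 8 while `LORA_ALPHA` stays at 16. The sketch below is an illustration only, not code from the repository. It assumes the standard LoRA convention, which PEFT also uses by default, of scaling the low-rank update by `alpha / r`. It also assumes a 4096 x 4096 projection matrix, as found in LLaMA-7B-style attention layers; the actual target modules are not visible in this diff. The helper names `lora_scaling` and `lora_params` are hypothetical.

```python
# Illustrative only: r and alpha are taken from the diff; the layer shape
# (4096 x 4096) is an assumption, and both helper names are hypothetical.

def lora_scaling(alpha: int, r: int) -> float:
    """Scaling factor applied to the low-rank update B @ A (alpha / r)."""
    return alpha / r


def lora_params(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


OLD_R, NEW_R, ALPHA = 16, 8, 16
D_IN = D_OUT = 4096  # assumed layer shape, not stated in the diff

for r in (OLD_R, NEW_R):
    print(
        f"r={r}: scaling={lora_scaling(ALPHA, r)}, "
        f"params per adapted layer={lora_params(r, D_IN, D_OUT)}"
    )
```

Under these assumptions, halving the rank halves the adapter parameters per adapted layer and doubles the scaling factor, because alpha was left unchanged.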

Files changed (1)
  1. README.md +18 -3
README.md CHANGED
@@ -17,7 +17,7 @@ the case with the baseline.
 
  The architecture of this LoRA model follows that of the LLaMA-7b Alpaca-LoRA with the hyper-parameters:
  ```
- LORA_R = 16
  LORA_ALPHA = 16
  LORA_DROPOUT= 0.05
  LORA_TARGET_MODULES = [
@@ -28,8 +28,24 @@ LORA_TARGET_MODULES = [
  ]
  ```
  The model was trained using PEFT for up to 3 epochs, with <code>load_best_model_at_end=True</code> set.
 
- It can be recombined with the baseline model to generate text:
  ```
  BASE_MODEL = "openlm-research/open_llama_7b_700bt_preview"
 
@@ -39,7 +55,6 @@ bmodel = LlamaForCausalLM.from_pretrained(
  device_map="sequential"
  )
 
-
  peft_model_id = "starfishmedical/SFDocumentOracle-open_llama_7b_lora"
  tokenizer = LlamaTokenizer.from_pretrained(peft_model_id)
 
@@ -17,7 +17,7 @@ the case with the baseline.
 
  The architecture of this LoRA model follows that of the LLaMA-7b Alpaca-LoRA with the hyper-parameters:
  ```
+ LORA_R = 8
  LORA_ALPHA = 16
  LORA_DROPOUT= 0.05
  LORA_TARGET_MODULES = [
@@ -28,8 +28,24 @@ LORA_TARGET_MODULES = [
  ]
  ```
  The model was trained using PEFT for up to 3 epochs, with <code>load_best_model_at_end=True</code> set.
+ The learning rate was set to 5e-5, so the minimal validation loss occurred very near to the end of training.
+
+ Both the combined model and adapter weights are available.
+
+ The combined model can be loaded and used right out of the box:
+ ```
+ BASE_MODEL = "StarFish-DocOracle"
+
+ model = LlamaForCausalLM.from_pretrained(
+ BASE_MODEL,
+ torch_dtype=torch.float16,
+ device_map="sequential"
+ )
+ tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
+ ```
+
+ The adapter can be recombined with the baseline model to generate text:
 
  ```
  BASE_MODEL = "openlm-research/open_llama_7b_700bt_preview"
 
@@ -39,7 +55,6 @@ bmodel = LlamaForCausalLM.from_pretrained(
  device_map="sequential"
  )
 
  peft_model_id = "starfishmedical/SFDocumentOracle-open_llama_7b_lora"
  tokenizer = LlamaTokenizer.from_pretrained(peft_model_id)
 