Commit dc835fd by munish0838 (1 parent: 389ea64)

Create README.md

Files changed (1): README.md (+86, -0)
README.md ADDED
---
datasets:
- EleutherAI/pile
- Open-Orca/OpenOrca
- GAIR/lima
- WizardLM/WizardLM_evol_instruct_V2_196k
language:
- en
license: llama3
tags:
- biology
- medical
pipeline_tag: text-generation
base_model: instruction-pretrain/medicine-Llama3-8B
---

# QuantFactory/medicine-Llama3-8B-GGUF
This is a quantized version of [instruction-pretrain/medicine-Llama3-8B](https://huggingface.co/instruction-pretrain/medicine-Llama3-8B), created using llama.cpp.

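To run a GGUF file from this repo, a minimal sketch with the `llama-cpp-python` bindings could look like the following. The filename, context size, and generation settings below are assumptions; substitute the quantization variant you actually downloaded.

```python
# Minimal sketch: load a GGUF quant of this model with llama-cpp-python.
# The filename below is hypothetical; use the quant file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./medicine-Llama3-8B.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

prompt = (
    "Question: Which of the following is an example of monosomy?\n"
    "Options:\n- 46,XX\n- 47,XXX\n- 69,XYY\n- 45,X\n\n"
    "Please provide your choice first and then provide explanations if possible."
)

# As with the original model, no prompt template is required.
output = llm(prompt, max_tokens=400)
print(output["choices"][0]["text"])
```
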
# Model Description
## Instruction Pre-Training: Language Models are Supervised Multitask Learners
This repo contains the **biomedicine model developed from Llama3-8B** in our paper [Instruction Pre-Training: Language Models are Supervised Multitask Learners](https://huggingface.co/papers/2406.14491).

We explore supervised multitask pre-training by proposing ***Instruction Pre-Training***, a framework that scalably augments massive raw corpora with instruction-response pairs to pre-train language models. The instruction-response pairs are generated by an efficient instruction synthesizer built on open-source models. ***Instruction Pre-Training* outperforms *Vanilla Pre-training* in both general pre-training from scratch and domain-adaptive continual pre-training.** In pre-training from scratch, *Instruction Pre-Training* not only improves pre-trained base models but also benefits more from further instruction tuning. **In continual pre-training, *Instruction Pre-Training* enables Llama3-8B to be comparable to or even outperform Llama3-70B.**

<p align='center'>
<img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/vRdsFIVQptbNaGiZ18Lih.png" width="400">
</p>

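To make the idea above concrete, here is a purely illustrative sketch of what an instruction-augmented pre-training example might look like. The field names and wording are invented for illustration only; the actual format is defined by the synthesizer's fine-tuning data collection listed under Resources below.

```python
# Purely illustrative: the general shape of a raw-text passage augmented with
# synthesized instruction-response pairs. Field names here are invented; see
# the ft-instruction-synthesizer-collection dataset for the real schema.
augmented_example = {
    "raw_text": (
        "Monosomy is a form of aneuploidy in which one chromosome of a pair "
        "is missing, as in the 45,X karyotype."
    ),
    "instruction_response_pairs": [
        {
            "instruction": "Which karyotype is an example of monosomy?",
            "response": "45,X, because one sex chromosome is missing.",
        },
    ],
}

# During continual pre-training, the raw text and its synthesized pairs are
# combined into a single training sequence for the language model.
print(augmented_example["instruction_response_pairs"][0]["instruction"])
```
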
## Resources
**🤗 We share our data and models with example usages; feel free to open any issues or discussions! 🤗**

- Context-Based Instruction Synthesizer: [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
- Fine-Tuning Data for the Synthesizer: [ft-instruction-synthesizer-collection](https://huggingface.co/datasets/instruction-pretrain/ft-instruction-synthesizer-collection)
- General Models Pre-Trained from Scratch:
  - [InstructLM-500M](https://huggingface.co/instruction-pretrain/InstructLM-500M)
  - [InstructLM-1.3B](https://huggingface.co/instruction-pretrain/InstructLM-1.3B)
- Domain-Specific Models Pre-Trained from Llama3-8B:
  - [Finance-Llama3-8B](https://huggingface.co/instruction-pretrain/finance-Llama3-8B)
  - [Biomedicine-Llama3-8B](https://huggingface.co/instruction-pretrain/medicine-Llama3-8B)

## Domain-Adaptive Continued Pre-Training
Following [AdaptLLM](https://huggingface.co/AdaptLLM/medicine-chat), we augment the domain-specific raw corpora with instruction-response pairs generated by our [context-based instruction synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer).

For example, to chat with the biomedicine-Llama3-8B model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("instruction-pretrain/medicine-Llama3-8B")
tokenizer = AutoTokenizer.from_pretrained("instruction-pretrain/medicine-Llama3-8B")

# Put your input here; NO prompt template is required
user_input = '''Question: Which of the following is an example of monosomy?
Options:
- 46,XX
- 47,XXX
- 69,XYY
- 45,X

Please provide your choice first and then provide explanations if possible.'''

inputs = tokenizer(user_input, return_tensors="pt", add_special_tokens=True).input_ids.to(model.device)
outputs = model.generate(input_ids=inputs, max_new_tokens=400)[0]

# Decode only the newly generated tokens, skipping the prompt
answer_start = int(inputs.shape[-1])
pred = tokenizer.decode(outputs[answer_start:], skip_special_tokens=True)

print(pred)
```
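
Alternatively, the same query can go through the `transformers` text-generation pipeline. This is a minimal sketch rather than the authors' setup; the `device_map` and generation settings are assumptions.

```python
# Minimal sketch: the same no-template query via the text-generation pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="instruction-pretrain/medicine-Llama3-8B",
    device_map="auto",  # assumes accelerate is installed; omit to stay on CPU
)

user_input = (
    "Question: Which of the following is an example of monosomy?\n"
    "Options:\n- 46,XX\n- 47,XXX\n- 69,XYY\n- 45,X\n\n"
    "Please provide your choice first and then provide explanations if possible."
)

result = generator(user_input, max_new_tokens=400, return_full_text=False)
print(result[0]["generated_text"])
```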

## Model Citation
If you find our work helpful, please cite us:

[AdaptLLM](https://huggingface.co/papers/2309.09530)
```bibtex
@inproceedings{cheng2024adapting,
  title={Adapting Large Language Models via Reading Comprehension},
  author={Daixuan Cheng and Shaohan Huang and Furu Wei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=y886UXPEZ0}
}
```