ahujachirag committed on
Commit
b235bfe
1 Parent(s): 6fe45dc

Updated model

README.md CHANGED
@@ -1,3 +1,75 @@
---
language:
- en
inference: true
widget:
- text: "What are the duties of the President of India as per the Constitution?"
  example_title: "Duties of President"
- text: "Can you analyze the legal implications of the Ayodhya Verdict by the Supreme Court of India?"
  example_title: "Implications of Ayodhya Verdict"
- text: "Can you summarize the main provisions of the Hindu Succession Act, 1956?"
  example_title: "Summarize Hindu Succession Act"
- text: "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nDevelop a legal strategy for a client based on the facts of the provided case.\n\n### Input:\nThe client in question is a government company that terminated the services of a permanent employee without providing any justification. The termination was carried out by invoking a rule similar to Rule 9(i) in the Central Inland Water Transport Corporation Ltd. vs Brojo Nath Ganguly & Anr. case. The employee who was terminated has taken legal action by challenging both the termination order and the validity of the rule in the High Court under Article 226.\n\n### Response:\n"
  example_title: "Create Legal Strategy 1"
- text: "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nDevelop a legal strategy for a hypothetical client based on the facts of the provided case.\n\n### Input:\nThe individual seeking assistance is a research scientist employed at a government-funded research institute, comparable to CSIR. They have been unjustly dismissed from their position and seek to contest the termination through legal means. The individual contends that the institute, being government-funded, qualifies as a 'State' as per Article 12 of the Constitution. Consequently, they believe they should have the right to file a writ petition against the institute.\n\n### Response:\n"
  example_title: "Create Legal Strategy 2"
- text: "What is the DV Act?"
  example_title: "Understand Act"
license: cc-by-4.0
---
# LokPalAI: Bridging the Gap to Legal Empowerment

LokPalAI is a language model fine-tuned for Indian scenarios, designed to bridge the gap between individuals and legal empowerment. With LokPalAI, users can interact with a query box to seek information and guidance related to Indian law.

## Features:
1. **Interact with LokPalAI's query box:** LokPalAI provides a user-friendly query box where users can enter their legal questions and receive relevant responses, whether about a specific law, a legal procedure, or any other legal matter.
2. **Enhanced with guardrails:** To improve accuracy and reliability, LokPalAI incorporates guardrails that help prevent the generation of misleading or incorrect legal advice. We understand the importance of reliable legal information, and these safeguards are designed to maintain high standards of accuracy.
3. **Real-time responses using RAG:** LokPalAI leverages Retrieval-Augmented Generation (RAG), which combines retrieval-based and generation-based models so that answers are both contextually relevant and up to date.
4. **Thorough testing and maintenance:** LokPalAI undergoes extensive testing to ensure performance and reliability. We continuously monitor and update the model to account for changes in Indian law.

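The retrieve-then-generate flow behind feature 3 can be sketched in a few lines. The corpus, overlap-based scoring, and prompt layout below are illustrative stand-ins, not LokPalAI's actual retriever:

```python
# Minimal retrieve-then-generate sketch (illustrative, not LokPalAI's pipeline).
def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query and return the top k."""
    q_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved context so the generator answers from current sources."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The Hindu Succession Act, 1956 governs intestate succession among Hindus.",
    "Article 226 empowers High Courts to issue writs.",
]
prompt = build_prompt("What does the Hindu Succession Act cover?", corpus)
```

The retrieved passage is injected ahead of the question, so the generator can ground its answer in text that can be refreshed without retraining the model.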
# ✨ LokpalGPT-Instruct-Falcon-7b

## Dataset
The dataset is curated from judgements available on IndianKanoon.com. You can refer to the whole process here. We will soon release the dataset and document the training process.

## How to Use for Inference?

💥 **Falcon LLMs require PyTorch 2.0 for use with `transformers`!**

For fast inference with Falcon, check out [Text Generation Inference](https://github.com/huggingface/text-generation-inference)! Read more in this [blog post](https://huggingface.co/blog/falcon).

You will need **at least 16GB of memory** to swiftly run inference with LokpalGPT-Instruct-Falcon-7b.

```python
import transformers
import torch
from transformers import AutoTokenizer

model = "lokpalai/lokpalgpt-falcon-7b-lora-4.5"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # halves memory use vs. float32
    trust_remote_code=True,      # Falcon's custom modelling code lives in the repo
    device_map="auto",           # spread weights across available devices
)
sequences = pipeline(
    "Can you analyze the legal implications of the Ayodhya Verdict by the Supreme Court of India?",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    temperature=0.5,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
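The longer widget examples in the model-card metadata follow an Alpaca-style instruction template. A small helper for building such prompts (the helper and constant names are our illustration, not part of the model repo):

```python
# Alpaca-style template, matching the "Create Legal Strategy" widget examples.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)

def build_instruction_prompt(instruction: str, context: str) -> str:
    """Format a query the same way as the widget's legal-strategy examples."""
    return ALPACA_TEMPLATE.format(instruction=instruction, context=context)

prompt = build_instruction_prompt(
    "Develop a legal strategy for a client based on the facts of the provided case.",
    "The client is a government company that terminated a permanent employee "
    "without providing any justification.",
)
```

Ending the prompt at `### Response:` leaves the model to fill in only the answer, which is how instruction-tuned checkpoints of this style are typically queried.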
config.json ADDED
@@ -0,0 +1,29 @@
{
  "_name_or_path": "tiiuae/falcon-7b-instruct",
  "alibi": false,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "RWForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "tiiuae/falcon-7b-instruct--configuration_RW.RWConfig",
    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modelling_RW.RWForCausalLM"
  },
  "bias": false,
  "bos_token_id": 11,
  "eos_token_id": 11,
  "hidden_dropout": 0.0,
  "hidden_size": 4544,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "RefinedWebModel",
  "multi_query": true,
  "n_head": 71,
  "n_layer": 32,
  "parallel_attn": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.30.0.dev0",
  "use_cache": true,
  "vocab_size": 65024
}
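A quick sanity check of the attention geometry declared in this config (the arithmetic is ours, not part of the repository files):

```python
# Values taken from config.json above.
hidden_size, n_head, n_layer = 4544, 71, 32

# The hidden size divides evenly into 71 heads of width 64.
head_dim = hidden_size // n_head
assert head_dim * n_head == hidden_size

# multi_query: true means one shared key/value head serves all 71 query
# heads, shrinking the KV cache roughly 71x versus full multi-head attention.
print(head_dim)  # 64
```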
generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.30.0.dev0"
}
pytorch_model-00001-of-00002.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5b2eba3c43780dabd8eb868c6956ad252b720219e8166db482bd8d7cb7efcb66
size 9951028257
pytorch_model-00002-of-00002.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f85f71110d906f7f2db5d6de45bb3a34b031e00160049bbcfff88fafc6ef1dee
size 3892483153
pytorch_model.bin.index.json ADDED
@@ -0,0 +1,203 @@
{
  "metadata": {
    "total_size": 13843441408
  },
  "weight_map": {
    "lm_head.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.0.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.1.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.10.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.11.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.12.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.13.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.14.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.15.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.16.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.17.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.18.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.19.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.2.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.20.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.21.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.22.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.22.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.22.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.22.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.22.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.22.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.23.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.23.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.23.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.23.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.23.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.23.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.24.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.25.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.26.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.27.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.28.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.29.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.3.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.3.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.3.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.3.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.3.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.30.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.30.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.30.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.30.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.30.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.30.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.31.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.h.4.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.4.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.4.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.4.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.4.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.5.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.6.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.7.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.8.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.h.9.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
    "transformer.ln_f.bias": "pytorch_model-00002-of-00002.bin",
    "transformer.ln_f.weight": "pytorch_model-00002-of-00002.bin",
    "transformer.word_embeddings.weight": "pytorch_model-00001-of-00002.bin"
  }
}
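The index metadata above allows a rough parameter-count check: with bfloat16 weights at 2 bytes each, `total_size` implies about 6.9 billion parameters, consistent with a Falcon-7B-class model (the arithmetic is ours, not part of the repository files):

```python
# total_size comes from the weight-map metadata above; torch_dtype is bfloat16.
total_size = 13_843_441_408      # bytes across both shards
bytes_per_param = 2              # bfloat16 = 16 bits
params = total_size / bytes_per_param
print(f"{params / 1e9:.2f}B parameters")  # 6.92B
```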
special_tokens_map.json ADDED
@@ -0,0 +1,16 @@
{
  "additional_special_tokens": [
    ">>TITLE<<",
    ">>ABSTRACT<<",
    ">>INTRODUCTION<<",
    ">>SUMMARY<<",
    ">>COMMENT<<",
    ">>ANSWER<<",
    ">>QUESTION<<",
    ">>DOMAIN<<",
    ">>PREFIX<<",
    ">>SUFFIX<<",
    ">>MIDDLE<<"
  ],
  "eos_token": "<|endoftext|>"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,7 @@
{
  "add_prefix_space": false,
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 2048,
  "tokenizer_class": "PreTrainedTokenizerFast"
}