Chanukya Patnaik committed
Commit 9d481ad
Parent(s): 18545f0
Update README.md

README.md CHANGED
---
# Model card for aiplanet/effi-13b

effi-13b is a 13B-parameter causal decoder-only model built by AI Planet, based on Llama-2-13b-chat-hf and fine-tuned using the CoT dataset available on Hugging Face Datasets. It is made available under the Apache 2.0 license.

This model card aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Why use effi-13B-Instruct?

- This is a ready-to-use chat/instruct model based on Llama-2-13b-chat-hf, which provides a rationale for the context provided.
- Llama-2 is among the strongest open-source models available.

This is an instruct model, which may not be ideal for further fine-tuning. If you are interested in building your own instruct/chat model, we recommend starting from **Llama-2-13b-chat-hf**.

You will need at least **85-100GB of memory** to swiftly run inference with effi-13b.

## Model Details

### Model Description

This model has been fine-tuned on Chain of Thought datasets, which contain context from mixed sources with corresponding rationales. The resulting fine-tuned Large Language Model (LLM) has shown enhanced capability on novel tasks by providing a rationale for its answers.

- **Developed by:** AI Planet
- **Model type:** Causal decoder-only
- **Language(s) (NLP):** English
- **License:** Apache 2.0
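The memory figure quoted above can be sanity-checked with a back-of-the-envelope sketch. This is illustrative only, assuming 13B parameters and standard per-parameter byte sizes; actual usage also depends on activations, KV cache, and batch size:

```python
# Rough memory estimate for holding a 13B-parameter model's weights.
# Illustrative only: real inference adds activation and KV-cache overhead.
PARAMS = 13_000_000_000

def weights_gb(num_params: int, bytes_per_param: int) -> float:
    """GB needed just to store the weights at a given precision."""
    return num_params * bytes_per_param / 1e9

fp32 = weights_gb(PARAMS, 4)  # full precision: ~52 GB
fp16 = weights_gb(PARAMS, 2)  # half precision: ~26 GB

print(f"fp32 weights: ~{fp32:.0f} GB, fp16 weights: ~{fp16:.0f} GB")
```

With fp32 weights (~52 GB) plus runtime overhead, the 85-100GB figure is plausible; loading in fp16 or 8-bit reduces the footprint substantially.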
### Direct Use

effi-13b has been fine-tuned on a Chain of Thought dataset.

### Out-of-Scope Use
### Recommendations

We recommend that users of effi-13b develop guardrails and take appropriate precautions for any production use.

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model
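A minimal sketch of the Llama-2-chat prompt shape this model expects, and of recovering only the model's reply from generated text. The `[INST]`/`<<SYS>>` template is the standard Llama-2-chat format; `build_prompt` and `extract_reply` are hypothetical helper names, and the pipeline output here is simulated rather than produced by the model:

```python
# Llama-2-chat style prompt formatting and reply extraction (sketch).
def build_prompt(system: str, user: str) -> str:
    # Standard Llama-2-chat instruction template.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def extract_reply(generated_text: str) -> str:
    # Generation pipelines echo the prompt; keep only what follows
    # the final [/INST] marker.
    return generated_text.strip().split("[/INST]")[-1].strip()

prompt = build_prompt("You are a helpful assistant.", "Why is the sky blue?")
# Simulated pipeline output: the prompt echoed back plus a reply.
fake_output = prompt + " Because of Rayleigh scattering."
print(extract_reply(fake_output))  # prints "Because of Rayleigh scattering."
```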
### Training Data

effi-13b has been fine-tuned on https://huggingface.co/datasets/kaist-ai/CoT-Collection

The data was tokenized with the **meta-llama/Llama-2-13b-chat-hf** tokenizer.
## Citation

```bibtex
@article{effi-13b,
  title={{effi-13b}: an open large language model with state-of-the-art performance},
  author={aiplanet},
  year={2023}
}
```