Commit e3c1318 by dthulke
1 Parent(s): 2276ebf

update model card

Files changed (1)
  1. README.md +69 -26
README.md CHANGED
@@ -1,47 +1,90 @@
  ---
- base_model: meta-llama/Llama-2-70b-hf
  language:
  - en
  ---

- # Model Card for climategpt/climategpt-70b
- - This model is the 70B parameter variant of the ClimateGPT model release.

- ## Overview
- - **Developed by:** AppTek, Eqtylab, Erasmus AI
  - **Model type:** decoder-only Transformer
- - **Language(s) (NLP):** natively supported: English; supported via cascaded MT on web interface: Arabic, Bangla, Chinese (simplified), Dutch, Finnougoric, French, Germanic, Greek, Hebrew, Indonesian, Japenese, Korean, Lithuanian, Pashto, Persian, Portuguese, Russian, Spanish, Thai, Turkish, Vietnamese,
  - **License:** TO BE ADDED
- - **Finetuned from model:** Llama2 70B
- - **Repository:** https://huggingface.co/climategpt/climategpt-70b
- - **Paper:** TO BE ADDED
- - **Demo:** TO BE ADDED

  ## Uses
  - This model is intended to be directly used as a question answering model that is specialized in the climate domain.
- - The model is aimed at providing useful feedback for decision makers, scientists and jounalists involved in climate discussions.
- - The model can also be used as a starting point for interested developers for further finetuning.
  - The model is NOT intended to be a general-purpose chatbot (although it has chat capabilities).
  - For the full system including cascaded MT, RAG, etc., we recommend the user to go to our demo website: TO BE ADDED.
- - For hands-on finetuning deployment and inference, we recommend the user to directly use the Huggingface helpers.
- - For in-depth model conversion and finetuning, we recommend the user to use https://github.com/epfLLM/Megatron-LLM/.
- - **Despite the efforts from the development team to elimite them, as every other chat-capable LLMs, this model may generate biased, offensive, inaccurate responses.**

- ## How to Get Started with the Model
- After downloading the HF formatted model, the HF helpers should work out-of-the-box.
- It is also possible to evaluate the model with https://github.com/EleutherAI/lm-evaluation-harness by plugging in the model identifier ```--model_args pretrained=climategpt/climategpt-70b```.

  ## Training
  - For the Llama2 training data, we refer the user to https://huggingface.co/meta-llama/Llama-2-70b-hf.
- - For continued pretraining, 4.2B climate domain tokens (tokenized by the Llama tokenizer) are used.
- - For instruction finetuning, about 272K instruction-completion pairs (both in the climate domain but also general domain) are used.

  ## Environmental Impact
- - **Hardware Type:** H100
- - **Hours used:** 2300 hrs
- - **Cloud Provider:** TO BE ADDED
- - **Compute Region:** TO BE ADDED
- - **Carbon Emitted:** TO BE ADDED

  ## Citation
- **BibTeX:** TO BE ADDED
 
  ---
  language:
  - en
+ datasets:
+ - OpenAssistant/oasst1
+ - databricks/databricks-dolly-15k
+ base_model: meta-llama/Llama-2-70b-hf
+ tags:
+ - climate
+ co2_eq_emissions:
+   emissions: 40600
+   training_type: "pre-training"
+   geographical_location: "Washington, USA"
+   hardware_used: "8x NVIDIA H100 HBM"
  ---
+ # ClimateGPT 70B

+ ClimateGPT is an ensemble of AI models designed to augment human decisions in the fast-moving field of climate change.
+ ClimateGPT 70B is a 70-billion-parameter transformer decoder model that was adapted from Llama 2 to the domain of climate science using continued pre-training on a collection of 4.2B tokens from curated climate documents.
+ The model is further instruction fine-tuned on a dataset of instruction-completion pairs manually collected by AppTek in cooperation with climate scientists.
+ [ClimateGPT 7B](https://huggingface.co/eci-io/climategpt-7b) outperforms Llama 2 70B Chat on our climate-specific benchmarks.
+ The model is designed to be used together with retrieval augmentation, to extend its knowledge and increase its factuality, and with cascaded machine translation, to increase its language coverage.

+ <blockquote style="padding: 10px; margin: 0 0 10px; border-left: 5px solid #ddd;">
+ A paper describing our approach will be released soon.
+ </blockquote>
+
+ ## Model Details
+ - **Trained by:** [AppTek](https://apptek.com)
+ - **Powered by:** [Erasmus AI](https://erasmus.ai)
+ - **Verified by:** [EQTYLab](https://eqtylab.io)
  - **Model type:** decoder-only Transformer
+ - **Language(s) (NLP):** English
  - **License:** TO BE ADDED
+ - **Continued pre-trained from:** Llama 2 70B
+ - **Context length:** 4K tokens
+ - **Input:** Text-only data
+ - **Output:** Model generates text only
+ - **Paper:** The paper will be released soon.
+ - **Website:** [eci.io](https://eci.io)

  ## Uses
  - This model is intended to be directly used as a question answering model that is specialized in the climate domain.
+ - The model is aimed at providing useful feedback for decision makers, scientists and journalists involved in climate discussions.
+ - The model can also be used as a starting point for interested developers for further fine-tuning.
  - The model is NOT intended to be a general-purpose chatbot (although it has chat capabilities).
  - For the full system including cascaded MT, RAG, etc., we recommend the user to go to our demo website: TO BE ADDED.
+ - **Despite the efforts of the development team to eliminate them, like every other chat-capable LLM, this model may generate biased, offensive or inaccurate responses.**
+
+ ## Downstream Use

+ ClimateGPT 70B is an instruction-tuned model that can be directly used for climate-specific question-answering applications.
+ It was trained to perform well with retrieval augmentation and supports up to 5 references in context.
+
+ The model was trained using ChatML, so the following format should be followed when prompting, including the `<|im_start|>` and `<|im_end|>` tags, the `system`, `user`, `context` and `assistant` identifiers, and the `[[0]]`, `[[1]]`, etc. tokens that indicate references.
+
+ """
+ <|im_start|>system
+ {system_message}<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>context
+ [[0]] "{reference1_title}", {reference1_year}
+ {reference1_text}
+ [[1]] "{reference2_title}", {reference2_year}
+ {reference2_text}
+ [...]<|im_end|>
+ <|im_start|>assistant
+ """

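The prompt layout added in the Downstream Use section can be assembled programmatically. The following is a minimal sketch, not part of the committed README: the `Reference` class and the `build_prompt` function are hypothetical names introduced for illustration, and omitting the `context` turn when no references are available is an assumption that the card does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """One retrieved document (hypothetical container, not defined by the model card)."""
    title: str
    year: int
    text: str


def build_prompt(system_message: str, prompt: str, references: list[Reference]) -> str:
    """Assemble a ChatML prompt following the layout shown in the card."""
    if len(references) > 5:
        # The card states that up to 5 in-context references are supported.
        raise ValueError("at most 5 references are supported")
    parts = [
        f"<|im_start|>system\n{system_message}<|im_end|>",
        f"<|im_start|>user\n{prompt}<|im_end|>",
    ]
    if references:
        # References are numbered from 0 with [[i]] markers, one title line
        # followed by the reference text.
        entries = [
            f'[[{i}]] "{ref.title}", {ref.year}\n{ref.text}'
            for i, ref in enumerate(references)
        ]
        parts.append("<|im_start|>context\n" + "\n".join(entries) + "<|im_end|>")
    # Assumption: with no references, the context turn is simply left out.
    # The assistant turn stays open so the model generates the answer.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)
```

A call such as `build_prompt(system_message, question, retrieved_refs)` returns a single string that can be passed to a tokenizer. Whether the tokenizer registers `<|im_start|>` and `<|im_end|>` as special tokens should be checked against the released model files.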
  ## Training
  - For the Llama2 training data, we refer the user to https://huggingface.co/meta-llama/Llama-2-70b-hf.
+ - For continued pre-training, 4.2B climate domain tokens (tokenized by the Llama tokenizer) are used.
+ - For instruction fine-tuning, about 272K instruction-completion pairs (from both the climate domain and the general domain) are used.
+
+ ## Evaluation
+
+ Detailed evaluation results are presented on our model card website: [eci.io/model-card](https://eci.io/model-card)

  ## Environmental Impact
+ - **Hardware Type:** 8x NVIDIA H100 HBM
+ - **Power Consumption per GPU:** 775W
+ - **Hours used:** 2,182 hrs
+ - **Cloud Provider:** MLFoundry
+ - **Compute Region:** Washington, USA
+ - **Energy Mix:** 100% Hydro Power (24g CO2eq/kWh according to IPCC 2014)
+ - **Carbon Emitted:** 40.6kg CO2eq

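The carbon figure in the Environmental Impact list can be reproduced from the other listed values. The sketch below is not part of the committed README; it assumes that "Hours used" means aggregate GPU-hours rather than wall-clock hours on the 8-GPU node. That reading is an inference: it is the one that matches both the reported 40.6 kg and the `emissions: 40600` metadata value (in grams), whereas wall-clock hours on all 8 GPUs would give roughly eight times as much.

```python
# Values as listed under "Environmental Impact".
GPU_POWER_W = 775              # power consumption per GPU, in watts
GPU_HOURS = 2182               # "Hours used", assumed to be total GPU-hours
GRID_INTENSITY_G_PER_KWH = 24  # hydro power, per IPCC 2014


def training_emissions_g(power_w: float, gpu_hours: float, g_per_kwh: float) -> float:
    """Return the estimated emissions in grams of CO2eq."""
    energy_kwh = power_w / 1000 * gpu_hours  # 0.775 kW * 2182 h = 1691.05 kWh
    return energy_kwh * g_per_kwh


emissions_g = training_emissions_g(GPU_POWER_W, GPU_HOURS, GRID_INTENSITY_G_PER_KWH)
print(f"{emissions_g / 1000:.1f} kg CO2eq")  # prints "40.6 kg CO2eq"
```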
  ## Citation
+ **BibTeX:** Paper will be released soon.