FemkeBakker
/

AmsterdamDocClassificationLlama200T3Epochs

@@ -8,6 +8,10 @@ tags:
 model-index:
 - name: AmsterdamDocClassificationLlama200T3Epochs
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,24 +19,24 @@ should probably proofread and complete it, then remove this comment. -->
 # AmsterdamDocClassificationLlama200T3Epochs
-This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.8116
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -67,6 +71,8 @@ The following hyperparameters were used during training:
 | 0.9744        | 2.7855 | 1722 | 0.8116          |
 | 1.0399        | 2.9842 | 1845 | 0.8116          |
 ### Framework versions
@@ -74,3 +80,8 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 model-index:
 - name: AmsterdamDocClassificationLlama200T3Epochs
   results: []
+datasets:
+- FemkeBakker/AmsterdamBalancedFirst200Tokens
+language:
+- nl
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # AmsterdamDocClassificationLlama200T3Epochs
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and has been fine-tuned for three epochs.
 It achieves the following results on the evaluation set:
 - Loss: 0.8116
 ## Training and evaluation data
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 ## Training procedure
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | 0.9744        | 2.7855 | 1722 | 0.8116          |
 | 1.0399        | 2.9842 | 1845 | 0.8116          |
+Training time: in total it took 2 hours and 3 minutes to fine-tune the model for three epochs.
 ### Framework versions
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.