Commit 860936c (verified) by ifisch · 1 Parent(s): e82503d

Update README.md

Files changed (1):
  1. README.md +9 -1

README.md CHANGED
@@ -106,7 +106,15 @@ Seed: 38
 
 #### 3.3.3 Training
 
-The training process involved feeding the cleaned and prepared dataset into the GPT-2 model. We used a combination of supervised learning and transfer learning techniques to fine-tune the model effectively.
+The training process involved feeding the cleaned and prepared dataset into the GPT-2 model. We used supervised learning techniques to fine-tune the model effectively.
+We trained the model using the Hugging Face Trainer, which takes the training parameters as input. We opted for it because it is optimized for transformer models and comes from the same framework.
+During training, we used the WANDB API to track each party's training run and collect metrics.
+We ran the training on Kaggle, which provides two T4 GPUs, so we had capable hardware at no cost. Another advantage was that we could run the training on CUDA.
+Training took between 2 and 10 hours, depending on the number of tweets available for each party. We go into this in more detail in the evaluation.
 
 #### 3.3.4 Generation and Deployment
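
The added paragraph describes fine-tuning GPT-2 with the Hugging Face Trainer, streaming metrics to WANDB, and running on Kaggle's two T4 GPUs. A minimal sketch of such a setup is shown below; it is not the committed training script, and the checkpoint name, hyperparameters, output paths, and the placeholder tweet dataset are assumptions for illustration only.

```python
# Sketch: fine-tune GPT-2 with the Hugging Face Trainer and log to WANDB.
# Requires `transformers`, `datasets`, and `wandb` (with WANDB_API_KEY set).
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # hypothetical; the party-specific checkpoint is not named in the commit
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a padding token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tiny placeholder corpus standing in for one party's cleaned tweets.
tweets = Dataset.from_dict({"text": ["Example tweet one.", "Example tweet two."]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized_tweets = tweets.map(tokenize, batched=True, remove_columns=["text"])

# Causal language modeling: the collator derives labels from the input ids (mlm=False).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="gpt2-party-tweets",   # hypothetical output directory
    num_train_epochs=3,               # assumed; epochs are not stated in the diff
    per_device_train_batch_size=8,    # per GPU; Kaggle provides two T4s
    fp16=True,                        # mixed precision on the T4 GPUs (CUDA)
    logging_steps=100,
    save_strategy="epoch",
    report_to="wandb",                # stream metrics to Weights & Biases
    run_name="gpt2-party-tweets",     # hypothetical WANDB run name
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_tweets,
    data_collator=data_collator,
)

trainer.train()
```

With `report_to="wandb"`, the Trainer logs loss and learning-rate curves to Weights & Biases during training, which corresponds to the per-party metric tracking described in the added paragraph.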