Horus7 committed
Commit d3ac06b
1 Parent(s): b4cf4fd

Training in progress epoch 0

Files changed (3):
  1. README.md +22 -22
  2. config.json +18 -24
  3. tf_model.h5 +3 -0
README.md CHANGED
@@ -1,21 +1,27 @@
  ---
- license: mit
- base_model: Jean-Baptiste/camembert-ner
+ license: apache-2.0
+ base_model: distilbert-base-uncased
  tags:
- - generated_from_trainer
+ - generated_from_keras_callback
  model-index:
- - name: my_awesome_wnut_model
+ - name: Horus7/my_awesome_wnut_model
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
+ <!-- This model card has been generated automatically according to the information Keras had access to. You should
+ probably proofread and complete it, then remove this comment. -->

- # my_awesome_wnut_model
+ # Horus7/my_awesome_wnut_model

- This model is a fine-tuned version of [Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner) on an unknown dataset.
+ This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: nan
+ - Train Loss: 1.4529
+ - Validation Loss: 1.3140
+ - Train Precision: 0.0
+ - Train Recall: 0.0
+ - Train F1: 0.0
+ - Train Accuracy: 0.6667
+ - Epoch: 0

  ## Model description

@@ -34,25 +40,19 @@
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 16
- - eval_batch_size: 16
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 2
+ - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 3, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
+ - training_precision: float32

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | No log        | 1.0   | 2    | nan             |
- | No log        | 2.0   | 4    | nan             |
+ | Train Loss | Validation Loss | Train Precision | Train Recall | Train F1 | Train Accuracy | Epoch |
+ |:----------:|:---------------:|:---------------:|:------------:|:--------:|:--------------:|:-----:|
+ | 1.4529     | 1.3140          | 0.0             | 0.0          | 0.0      | 0.6667         | 0     |


  ### Framework versions

- - Transformers 4.34.1
- - Pytorch 2.1.0+cu118
+ - Transformers 4.35.0
+ - TensorFlow 2.14.0
  - Datasets 2.14.6
  - Tokenizers 0.14.1
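
The serialized optimizer in the new card is the form an AdamWeightDecay + linear PolynomialDecay schedule takes when built with transformers' TF `create_optimizer` helper. A minimal sketch of the equivalent call, assuming the values recorded in the card (initial learning rate 2e-05, 3 decay steps, weight decay 0.01); the actual training script is not part of this commit:

```python
from transformers import create_optimizer

# Sketch only: reproduces the AdamWeightDecay optimizer with a linear
# (PolynomialDecay, power=1.0) schedule serialized in the card above.
# num_train_steps matches decay_steps=3; the warmup value is an assumption,
# since no warmup appears in the serialized config.
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,            # initial_learning_rate in the schedule config
    num_train_steps=3,       # decay_steps: total optimizer steps for this run
    num_warmup_steps=0,      # assumed: not recorded in the card
    weight_decay_rate=0.01,  # weight_decay_rate in the AdamWeightDecay config
)
```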
config.json CHANGED
@@ -1,16 +1,13 @@
  {
- "_name_or_path": "Jean-Baptiste/camembert-ner",
+ "_name_or_path": "distilbert-base-uncased",
+ "activation": "gelu",
  "architectures": [
- "CamembertForTokenClassification"
+ "DistilBertForTokenClassification"
  ],
- "attention_probs_dropout_prob": 0.1,
- "bos_token_id": 5,
- "classifier_dropout": null,
- "eos_token_id": 6,
- "gradient_checkpointing": false,
- "hidden_act": "gelu",
- "hidden_dropout_prob": 0.1,
- "hidden_size": 768,
+ "attention_dropout": 0.1,
+ "dim": 768,
+ "dropout": 0.1,
+ "hidden_dim": 3072,
  "id2label": {
    "0": "O",
    "1": "B-depart",
@@ -19,7 +16,6 @@
    "4": "I-arrive"
  },
  "initializer_range": 0.02,
- "intermediate_size": 3072,
  "label2id": {
    "B-arrive": 3,
    "B-depart": 1,
@@ -27,17 +23,15 @@
    "I-depart": 2,
    "O": 0
  },
- "layer_norm_eps": 1e-05,
- "max_position_embeddings": 514,
- "model_type": "camembert",
- "num_attention_heads": 12,
- "num_hidden_layers": 12,
- "output_past": true,
- "pad_token_id": 1,
- "position_embedding_type": "absolute",
- "torch_dtype": "float32",
- "transformers_version": "4.34.1",
- "type_vocab_size": 1,
- "use_cache": true,
- "vocab_size": 32005
+ "max_position_embeddings": 512,
+ "model_type": "distilbert",
+ "n_heads": 12,
+ "n_layers": 6,
+ "pad_token_id": 0,
+ "qa_dropout": 0.1,
+ "seq_classif_dropout": 0.2,
+ "sinusoidal_pos_embds": false,
+ "tie_weights_": true,
+ "transformers_version": "4.35.0",
+ "vocab_size": 30522
  }
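
The new config swaps the CamemBERT backbone for DistilBERT while keeping the same five BIO labels (the `I-arrive` entry of `label2id` sits in the collapsed diff context). A hedged sketch of how a config with these label maps is typically produced before fine-tuning; the exact call used for this checkpoint is an assumption, not part of this commit:

```python
from transformers import TFAutoModelForTokenClassification

# Sketch only: builds a DistilBERT token-classification head carrying the same
# id2label / label2id maps as config.json above.
id2label = {0: "O", 1: "B-depart", 2: "I-depart", 3: "B-arrive", 4: "I-arrive"}
label2id = {label: idx for idx, label in id2label.items()}

model = TFAutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(id2label),
    id2label=id2label,
    label2id=label2id,
)
```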
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e7649e3d65b58f84289c1164dd805b473dea703c553e2caa3a12444490b6bfb
+ size 265594128
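
What the commit stores for tf_model.h5 is only a Git LFS pointer; the ~265 MB weight file itself lives in LFS storage. A sketch of fetching the resolved file with huggingface_hub, assuming the repo id matches the model-index name above (d3ac06b is the abbreviated hash shown on this page; the full 40-character hash may be required):

```python
from huggingface_hub import hf_hub_download

# Downloads the actual tf_model.h5 that the LFS pointer above resolves to.
weights_path = hf_hub_download(
    repo_id="Horus7/my_awesome_wnut_model",  # assumed from the model-index name
    filename="tf_model.h5",
    revision="d3ac06b",  # abbreviated commit hash from this page
)
print(weights_path)
```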