luismsgomes committed
Commit fbb7c63
1 parent: c88d963

added trained model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1536,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md CHANGED
@@ -1,3 +1,132 @@
  ---
+ language: pt
  license: mit
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+
  ---
+
+ # Serafim 900m Portuguese (PT) Sentence Encoder
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 1536-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model, then you apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean pooling - take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
+
+
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+ ## Training
+ The model was trained with the following parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 1183 with parameters:
+ ```
+ {'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.CoSENTLoss.CoSENTLoss` with parameters:
+ ```
+ {'scale': 20.0, 'similarity_fct': 'pairwise_cos_sim'}
+ ```
+
+ Parameters of the `fit()` method:
+ ```
+ {
+     "epochs": 10,
+     "evaluation_steps": 119,
+     "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+     "optimizer_params": {
+         "lr": 1e-06
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": 1183,
+     "warmup_steps": 1183,
+     "weight_decay": 0.01
+ }
+ ```
+
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DebertaV2Model
+   (1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
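The model card's pipeline tag is sentence-similarity, but both usage examples above stop at printing raw embeddings. A minimal scoring sketch may help; it is not part of the commit, reuses the card's '{MODEL_NAME}' placeholder, relies only on the `sentence_transformers.util.cos_sim` helper, and the Portuguese example sentences are purely illustrative.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('{MODEL_NAME}')  # same placeholder as in the card

sentences = [
    "O gato dorme no sofá.",        # "The cat sleeps on the couch."
    "Um felino descansa no sofá.",  # "A feline rests on the couch."
    "O mercado fechou em alta.",    # "The market closed higher."
]
embeddings = model.encode(sentences)

# Pairwise cosine similarities; the first two sentences should score
# noticeably higher with each other than with the third.
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```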
config.json ADDED
@@ -0,0 +1,38 @@
+ {
+   "_name_or_path": "models/albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-v1",
+   "architectures": [
+     "DebertaV2Model"
+   ],
+   "attention_head_size": 64,
+   "attention_probs_dropout_prob": 0.1,
+   "conv_act": "gelu",
+   "conv_kernel_size": 3,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1536,
+   "initializer_range": 0.02,
+   "intermediate_size": 6144,
+   "layer_norm_eps": 1e-07,
+   "max_position_embeddings": 512,
+   "max_relative_positions": -1,
+   "model_type": "deberta-v2",
+   "norm_rel_ebd": "layer_norm",
+   "num_attention_heads": 24,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "pooler_dropout": 0,
+   "pooler_hidden_act": "gelu",
+   "pooler_hidden_size": 1536,
+   "pos_att_type": [
+     "p2c",
+     "c2p"
+   ],
+   "position_biased_input": false,
+   "position_buckets": 256,
+   "relative_attention": true,
+   "share_att_key": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.39.3",
+   "type_vocab_size": 0,
+   "vocab_size": 128100
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.6.1",
+     "transformers": "4.39.3",
+     "pytorch": "2.2.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null
+ }
eval/similarity_evaluation_assin-ptbr-test_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ -1,-1,0.8075227156584123,0.7883413099648385,0.8180576858475002,0.7927229847323022,0.8174669419480555,0.7923730715175294,0.7051105842918381,0.689291436431814
eval/similarity_evaluation_assin-ptpt-test_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ -1,-1,0.8398807243271513,0.830669533954273,0.8471996412591254,0.8330789353909063,0.8469094638091792,0.8326100102419636,0.7494520951197198,0.7411218421717549
eval/similarity_evaluation_assin2-test_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ -1,-1,0.8548813206256356,0.8266509159141969,0.8373929869094534,0.8254115917892378,0.8374704310841169,0.8252176122026239,0.770906756889177,0.7278639880863395
eval/similarity_evaluation_iris-sts-test_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ -1,-1,0.8279941050190001,0.823119444971123,0.8041795163558138,0.8097763184725875,0.8051128154098245,0.8105488384195639,0.8057909117487693,0.8149271728899469
eval/similarity_evaluation_stsb-multi-mt-pt-test_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ -1,-1,0.8474849678461291,0.8570183131973625,0.8452275645972888,0.8568781860925376,0.8452748867543359,0.8571309389539689,0.7492848737757264,0.7435566736867906
eval/similarity_evaluation_validation_results.csv ADDED
@@ -0,0 +1,101 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ 0,119,0.8346352333520197,0.8422258392177346,0.8197328311746798,0.8367438852730853,0.8196068984821245,0.8366298806343659,0.7999211730641367,0.8053265672590242
+ 0,238,0.8415745890153112,0.8496095291374381,0.8274763650212377,0.8454146873497612,0.8274548823651928,0.8453744878235535,0.7969709739096635,0.8028039583453915
+ 0,357,0.8471888539359285,0.8555908909768408,0.8352050721932037,0.8528823194369696,0.835280067861715,0.8530515634727752,0.780989840038815,0.7879134941576926
+ 0,476,0.8507833456286397,0.8585549655097531,0.840200684023137,0.8575144771585445,0.8403271432910826,0.8576170544545164,0.762348014249639,0.7703305106263088
+ 0,595,0.853728637707974,0.8611992964914437,0.8420733708416612,0.8600249158181528,0.842197972954118,0.8601328503801203,0.7594614594636205,0.7688144711990734
+ 0,714,0.8581341835739743,0.8649379219352153,0.8461227843405976,0.8635347248304177,0.8462742004065475,0.8636965138901219,0.7592863294609045,0.7694674305360168
+ 0,833,0.8617022624175867,0.8675011960829289,0.8490680130964816,0.8656953192577258,0.8492312160377112,0.8658860896842852,0.7636658012413412,0.7730869159500267
+ 0,952,0.8643163567507588,0.8706952625463907,0.8519765861516742,0.8689336743383054,0.8521915507762666,0.8691745592674796,0.7592215659881358,0.769186321249198
+ 0,1071,0.8611884373197293,0.8689202700885262,0.8487978530766281,0.8669072933524347,0.8489373361025635,0.8671607146801986,0.7582760453988152,0.7713158768684139
+ 0,-1,0.8625732361590839,0.8719521910950583,0.8505907673710995,0.8687591440294259,0.8507990574257344,0.8690258019410102,0.7733986406344333,0.7851179681140572
+ 1,119,0.8637581854219967,0.8727700044319258,0.8516982658195925,0.8684338468122845,0.8518921945540673,0.8686887630550413,0.7803127775757297,0.7901394351735449
+ 1,238,0.8682053463581425,0.8765962894990316,0.8558395863225453,0.8718997524767961,0.8560405579464444,0.872161713939156,0.7903410027509883,0.7983628762795825
+ 1,357,0.8708807373918951,0.8784204184626634,0.8591423211215432,0.8746097281513557,0.8594010444976521,0.8749954074656185,0.7839380734149561,0.7914107028700469
+ 1,476,0.8741631123660174,0.8793750571771543,0.861003977756639,0.8770255475070269,0.8612162538427957,0.8772978152802219,0.7791966451332369,0.7878095913800425
+ 1,595,0.8756930592886156,0.8803242602839599,0.8618262658846538,0.877805489760628,0.8620093635714191,0.8780396434712808,0.7812351760461747,0.7893244160597023
+ 1,714,0.8758144408693579,0.8808607952554944,0.8605710330861767,0.8772149578087245,0.8607473511262962,0.8774595099086204,0.7860915777429817,0.795473981985746
+ 1,833,0.8776072052673576,0.881278524199399,0.8624933340459493,0.8787030385135469,0.8626351249862101,0.8789840724569133,0.7766840101481022,0.7871081705935022
+ 1,952,0.8769741665043229,0.8824471439922139,0.8619313234505173,0.8776593891366289,0.8621291750253715,0.8779535378381056,0.7922799721222221,0.8009354162292346
+ 1,1071,0.8775980441756146,0.8827541900015038,0.8625238402655603,0.8797365984212887,0.862664721843668,0.8800151030124124,0.7770746783577623,0.7889645158344328
+ 1,-1,0.877099109875496,0.8833609213732116,0.8609816375695695,0.878285892053751,0.8610893950283386,0.8784517206813098,0.7801227547503152,0.7937268453513722
+ 2,119,0.8797123417342431,0.883311542932101,0.8625163465488785,0.8793169305755694,0.8625877386285254,0.8795218643704542,0.7769360986154548,0.7894635798554434
+ 2,238,0.8774215776264491,0.8839346833608172,0.8620022096288248,0.8797857728271842,0.8621243517355406,0.8800591535743469,0.7793757375983782,0.7921988557357168
+ 2,357,0.8774541296526093,0.8836527851793334,0.8596102951576453,0.877332053722741,0.8597705124184118,0.8775777323366196,0.791813183311673,0.8034857727347607
+ 2,476,0.8754756764039029,0.884149068654327,0.8609488577232387,0.8797482487391428,0.8610905041763258,0.8800249949037978,0.7760560198605742,0.7910944848244663
+ 2,595,0.8777657495486987,0.8851085015033453,0.8621205431833266,0.8809319597588805,0.8622656419255181,0.8811721520501965,0.7777457906056856,0.7914322148177159
+ 2,714,0.8785240118960488,0.8846726918792049,0.8600507271518938,0.8791553550801946,0.8601082907322292,0.879325222267365,0.7875005659126604,0.7997218082510967
+ 2,833,0.8791941692079219,0.8856305505816707,0.8618978820190925,0.8806042705185254,0.8619686380064013,0.8808193890646552,0.7844849303036342,0.7967633009057592
+ 2,952,0.8803283497877149,0.8857125268640175,0.8627878619812928,0.8813356706792251,0.8628858761188877,0.8815478540953914,0.7831892307080908,0.7951059592833541
+ 2,1071,0.8795404929338702,0.8844312344596341,0.8599202912664154,0.8789285212854102,0.8598858754843276,0.8790520758238188,0.7922701916057839,0.8042356729678226
+ 2,-1,0.880508979314413,0.885095687429474,0.8616143053434878,0.8809830411647824,0.8616187501402122,0.8811695572916177,0.7874118084250354,0.7997450242457712
+ 3,119,0.8816533382802142,0.8851243056867188,0.8628425007521099,0.880829025420507,0.8628125648500715,0.8809692507712359,0.7869744646248878,0.7979916755416128
+ 3,238,0.8807309692032356,0.8850163773393138,0.8624790157595976,0.8799381539498488,0.8624312380559872,0.8800304126407217,0.7945512813551645,0.8046794677197014
+ 3,357,0.8826232640116277,0.8857770158049765,0.8639310978192669,0.8817881834140852,0.86387934916352,0.8818851073711856,0.7906760859386978,0.8015165708711791
+ 3,476,0.8809344351088982,0.8857802143851341,0.8597534631974171,0.879375764731323,0.8596971458337169,0.8794561503894729,0.8028357247959041,0.8144060060713446
+ 3,595,0.8820522308796299,0.8859536709804566,0.8620460165163851,0.8823521967598943,0.8620117777847519,0.8824593546641254,0.7994128547215231,0.8093649925971239
+ 3,714,0.8826480264525399,0.8865490022745225,0.863193631181805,0.8829999412689435,0.8631559253089283,0.883053446050108,0.7961183141584264,0.8057997663159502
+ 3,833,0.882009894592221,0.8864467897201691,0.8610185339633059,0.8816839405077725,0.8609406967187891,0.8817475524124645,0.800585953549725,0.8119971965609749
+ 3,952,0.8823437458230885,0.8873805139753449,0.8614310414878082,0.8815176517993175,0.8613949983418825,0.8816575030505889,0.7936419123889158,0.8076496676030118
+ 3,1071,0.8826260168164414,0.8876498625323778,0.8622013126142408,0.8815402460589877,0.8622150789268221,0.8817652464424495,0.7940618231781008,0.8071859811738558
+ 3,-1,0.8836393190295158,0.888473308146647,0.8633960055371852,0.8827903189384879,0.8634147656281557,0.8829850701103715,0.7890017808734735,0.8029192274151917
+ 4,119,0.8823269095622938,0.8866172371904456,0.859286368297807,0.8789092916130744,0.8592201961602011,0.8790749602279632,0.7991227979721551,0.8119059356362098
+ 4,238,0.8826294305368632,0.887586021684644,0.8607354869574075,0.880994037596667,0.8606673790052873,0.8811584876478844,0.7981880899663201,0.8115542102434757
+ 4,357,0.8832933726454343,0.8882327497282719,0.8622748213094523,0.8822333902142013,0.8621851085676328,0.882355549696812,0.7988963332996472,0.8124587315554166
+ 4,476,0.881840606772165,0.8871740242898601,0.8597749785578143,0.8801527492219774,0.8596811583195798,0.8801590286792192,0.7969672061105048,0.8117635173621588
+ 4,595,0.8825416511317475,0.8880577247333887,0.8607539589011415,0.8814134240194618,0.8607086725511335,0.8814996409846403,0.7961649450250078,0.8103670019003154
+ 4,714,0.8827719191295197,0.8881860868030793,0.8611783953057693,0.8825743918676177,0.8611376073518966,0.8827259527265117,0.7927346101697337,0.8078758159577673
+ 4,833,0.8832817154391954,0.8873959595747313,0.859775482425091,0.879869875494685,0.8596985017910792,0.8799332015315074,0.8015401529412554,0.8149507340611569
+ 4,952,0.8839285784973908,0.8882458968476794,0.8617608864496966,0.8819075612947938,0.8616972691840576,0.8820104514776882,0.7993808818218207,0.812585045222275
+ 4,1071,0.8842443693402884,0.8884481320355871,0.861411652254568,0.8819965111408936,0.8613699357899787,0.8821318238594213,0.8010105017393705,0.8142833242064953
+ 4,-1,0.8835761328285844,0.8875401487658368,0.8607692480192544,0.8816097204116775,0.8606935887981578,0.8817094983258242,0.8029179148892821,0.8163325954535507
+ 5,119,0.8819248884341266,0.8868404163950547,0.8570311377998923,0.8787894475696096,0.8569700814480326,0.8789334641619787,0.809271447266343,0.8233360073648849
+ 5,238,0.8828858577177023,0.8873538955526303,0.8585903109915216,0.8803086938842148,0.8585284373368823,0.8803969069408362,0.8057578109521051,0.8196105353498617
+ 5,357,0.8832909519288645,0.8872992374274631,0.8583917444932727,0.8797038148166145,0.8583140564407189,0.8798148754120191,0.8055051212877938,0.8196555967170814
+ 5,476,0.8826094514670383,0.8867466022826724,0.8573749245221443,0.8788982614268408,0.8572875550370438,0.878961852484204,0.8067103466288865,0.820547481298486
+ 5,595,0.8824653848000071,0.8868049693628222,0.8576239395487223,0.8789344326737744,0.8575386219495266,0.8789623101275904,0.80491169635693,0.8192953636418171
+ 5,714,0.883827984557394,0.887059438655981,0.8593328578430289,0.8802883691732354,0.8592383748304541,0.8803542548868963,0.8016058901768601,0.815977015849128
+ 5,833,0.8844118179914054,0.8877861807360763,0.8603227262765712,0.8803764558810377,0.8602683815895195,0.880439845012194,0.8035038719008838,0.8167024769486264
+ 5,952,0.8841082371446042,0.8878693971567,0.8597206326772422,0.8795947385235381,0.8597097120275098,0.8797125141547529,0.806073907418141,0.8188202620523802
+ 5,1071,0.883779381056941,0.8879402321779829,0.8594641306620686,0.879248078430363,0.8594511971950967,0.8793885937042077,0.8113494252090672,0.8235114341956302
+ 5,-1,0.8847414960923969,0.8883022169509511,0.8606873758387314,0.8820278066203547,0.8606306678543,0.8820793670697799,0.8017729510049806,0.8160117533648213
+ 6,119,0.8838288422287383,0.8873597927326926,0.8583880856333336,0.8789969305127278,0.8583360432437586,0.8791727559910513,0.809000328260038,0.821975775738682
+ 6,238,0.8836114667450747,0.88790223343508,0.8589272216695568,0.8798240396096403,0.8588412675704384,0.8799039937040818,0.8080929006253283,0.821721847487454
+ 6,357,0.883778420023108,0.8882032634209196,0.8596298895772982,0.880196829766678,0.8595364536914696,0.8802463482974427,0.8073354999592192,0.8207415217726155
+ 6,476,0.8841043256772667,0.888041408229941,0.859810104100427,0.8801869754264525,0.8596809935676518,0.8801790758462332,0.8049252203380058,0.818654908198364
+ 6,595,0.8838454589666223,0.8872387446887711,0.8584521556400968,0.8792383996269925,0.8582673995747374,0.8791877355608364,0.8058480323319779,0.819487011995362
+ 6,714,0.8833172986319796,0.8876791168185859,0.8581522916922523,0.8797672422330824,0.8580179727965334,0.8797845301675117,0.8036633399921781,0.818298359326293
+ 6,833,0.8834612655017161,0.8882526297591219,0.8592653110475658,0.8802309563347771,0.8591876346502277,0.8802985138290014,0.8027752682307142,0.816891303715587
+ 6,952,0.8835073686420667,0.8876633606655318,0.8577806834391506,0.8791748627714069,0.8577173082663588,0.8792212806970685,0.8079631141408513,0.8217472929064212
+ 6,1071,0.8842506598446916,0.8883416521022558,0.8590551448577368,0.8807189197133856,0.8589967741515393,0.8807289235962532,0.8063848226510055,0.8202184974150694
+ 6,-1,0.8844600757143403,0.8881849263171692,0.8587697220019439,0.8806486650130695,0.858697821712032,0.8807337831296924,0.8054941978241436,0.8199172072087788
+ 7,119,0.8843233564651158,0.8878668561914071,0.8583754601948382,0.8794236243971868,0.8582777784974844,0.8794019523685322,0.8067345312512814,0.8208734165053734
+ 7,238,0.8842069123767062,0.8876958700011693,0.8576908275191834,0.8786038773476983,0.8576162374293321,0.8785831681578846,0.8112058763968272,0.8248139182215718
+ 7,357,0.8843568838718368,0.8874532187170571,0.8576078949060202,0.8789180326460854,0.857505217930445,0.8789497578711735,0.809849416186635,0.8235332397101185
+ 7,476,0.8840960442473929,0.88758882230927,0.8577984557372981,0.8789836071140122,0.8577025219199634,0.8790212363859548,0.8109781202622401,0.8244100257497375
+ 7,595,0.8839522818886445,0.8878792664378807,0.8577508869451539,0.8789612189619493,0.8576695096590751,0.8790398088348405,0.8114132404949927,0.8251376104265605
+ 7,714,0.8845789341200696,0.8878316390152882,0.8577039062047612,0.8795504665148022,0.8576005066352508,0.8795613353697498,0.8101185439459966,0.8243093738847095
+ 7,833,0.8844863367228244,0.8877441555281138,0.857209933409016,0.8786124252989105,0.857133630519639,0.8787347231438426,0.81327057733651,0.8268425028082523
+ 7,952,0.884990469341495,0.8883032929352225,0.8585746339320629,0.8801830536048518,0.8584889202814387,0.8801856656920507,0.8103449022001461,0.8239891033797256
+ 7,1071,0.8850047552231747,0.8882237708987581,0.8584780288032094,0.8797523045928094,0.8584021942420996,0.8797210860496824,0.8128707366250155,0.8260361309332872
+ 7,-1,0.8850130416456771,0.8882085392118093,0.8583691365371056,0.8795345430182983,0.8582834204985644,0.879553317951403,0.8147244503581379,0.8276986779576448
+ 8,119,0.8855986538452559,0.8890111343134921,0.859481644376079,0.8807434561263918,0.8594233704068954,0.8807969959149904,0.8109033875801263,0.8243871522369683
+ 8,238,0.8848267279333828,0.8884737905882877,0.8581581033252191,0.8793356550120777,0.858088377724103,0.8793829772364761,0.813149448497148,0.8267203223021778
+ 8,357,0.884131504672114,0.8881623430585593,0.8571323348237995,0.8786245273869394,0.857058528214968,0.8786646439724858,0.8137684876575587,0.8277266315676831
+ 8,476,0.8844227251532278,0.8881091748831389,0.8570469050793121,0.8788083094401644,0.8569683400073297,0.8788021063771873,0.8134265280692008,0.8275774171869791
+ 8,595,0.8849894433902971,0.8884472111574989,0.8579427362418912,0.8796785160873225,0.8578700389859438,0.8797504118833454,0.8125462042885104,0.8266702584294826
+ 8,714,0.885436741163783,0.8885093734904793,0.8584871714459472,0.8798006293770824,0.8584068641570854,0.8798711497148297,0.8127096678528773,0.8265193950775724
+ 8,833,0.8849984170629683,0.8882580609531054,0.8577326127814072,0.8794075516493508,0.8576477441659175,0.8794500064166123,0.8132890048973692,0.8274101791184788
+ 8,952,0.8845661125214329,0.888062058103838,0.8572723544827137,0.8792775550907799,0.8571972402011276,0.8792914426273238,0.813280453345221,0.8276583986861932
+ 8,1071,0.8847988095756709,0.8881979938737582,0.8576123917582051,0.8792309837031971,0.8575387320390747,0.8792823921909539,0.8144436921534146,0.828347426711952
+ 8,-1,0.885035579557929,0.8883907300155452,0.858015997657079,0.8794273474961103,0.857943885268162,0.8794874571401071,0.8133686556708982,0.8272294589572149
+ 9,119,0.8851013154381319,0.8883719988726287,0.8577841020119,0.8789955702633683,0.8577237377708655,0.8790972244681775,0.8149674818850363,0.8285548551734055
+ 9,238,0.8851983236326592,0.8885050681662259,0.8579398453081819,0.8792140615845516,0.8578933578605225,0.8793023945506666,0.8153075127117759,0.8288012065142186
+ 9,357,0.8851902701287351,0.8886245192451891,0.8580723149102902,0.8794392667136579,0.8580226964018515,0.8795110527045215,0.8147058523539643,0.8283904352688051
+ 9,476,0.8852587044988647,0.8885275318024204,0.8580862697499411,0.8795488907424688,0.8580276149758482,0.8796409828051247,0.8145297144961017,0.8282917085555715
+ 9,595,0.8853092023594366,0.8885406616358956,0.8582555040090838,0.8799553935316201,0.8581934604223546,0.880051702411222,0.8133260547126236,0.8273292862167865
+ 9,714,0.885259239382198,0.8885171080665215,0.8581607689075803,0.8799298029678154,0.8581027687689353,0.8799861700994119,0.8135668662502087,0.8275243398979024
+ 9,833,0.8851742516515474,0.888485349797145,0.8579674650502147,0.8796152756792894,0.8579118189362843,0.8797145710793993,0.8144297643257481,0.8283031321447352
+ 9,952,0.8851121365078664,0.8884794837703002,0.85783832539016,0.8794322215708568,0.8577845693011349,0.8795155229614361,0.8147876641430668,0.8285994550694681
+ 9,1071,0.8851464311449804,0.8884921129348036,0.8578738562486644,0.8794391359760069,0.8578212878659055,0.8795307830357721,0.8148702579699264,0.8286361820023991
+ 9,-1,0.8851496072546251,0.8884768692344156,0.857861213926136,0.8794319947766976,0.8578082485108272,0.8795180610062464,0.8148881308081785,0.8286586758055441
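The per-dataset test CSVs above share this same header, so the same loading code applies to all of them. As a minimal sketch (assuming pandas is installed and the repository is checked out locally), here is one way to find the validation checkpoint with the best cosine Spearman correlation, the score commonly used for model selection with this evaluator:

```python
import pandas as pd

# Path mirrors the file added in this commit; assumes a local checkout.
df = pd.read_csv("eval/similarity_evaluation_validation_results.csv")

# steps == -1 marks the evaluation run at the end of an epoch.
best = df.loc[df["cosine_spearman"].idxmax()]
print(f"best cosine_spearman={best['cosine_spearman']:.4f} "
      f"(epoch={int(best['epoch'])}, steps={int(best['steps'])})")
```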
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f805cc49fd4e2c2a9dd448b842929d9fc889b72feb934bde8cec25849750803f
+ size 3538419000
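As a sanity check on the "900m" in the model name, the LFS pointer's size is consistent with the config.json above: 3,538,419,000 bytes at 4 bytes per float32 parameter is about 884.6M parameters. A rough back-of-envelope count from the config (deliberately ignoring biases, layer norms, the conv layer, and DeBERTa-v2's relative-position extras, so it slightly undercounts) already lands close:

```python
# Values taken from config.json above.
vocab_size, hidden, ffn, layers = 128100, 1536, 6144, 24

embeddings = vocab_size * hidden              # ~196.8M
attention = 4 * hidden * hidden               # Q, K, V, O projections
feedforward = 2 * hidden * ffn                # up + down projections
per_layer = attention + feedforward           # ~28.3M per layer
total = embeddings + layers * per_layer       # ~876M parameters

print(f"~{total / 1e6:.0f}M params, ~{total * 4 / 1e9:.2f} GB at float32")
# Actual file: 3,538,419,000 bytes (~884.6M params); the gap is the omitted terms.
```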
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
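modules.json declares the two-stage pipeline: the DeBERTa-v2 transformer followed by the mean pooling configured in 1_Pooling/config.json. Loading via `SentenceTransformer('{MODEL_NAME}')` reads these files directly; purely as an illustration of what they encode, here is a hedged sketch of the equivalent explicit construction with the sentence-transformers modules API (again using the card's '{MODEL_NAME}' placeholder):

```python
from sentence_transformers import SentenceTransformer, models

# Stage 0: the DeBERTa-v2 backbone, truncating inputs at 128 tokens
# as in sentence_bert_config.json below.
word_embedding_model = models.Transformer('{MODEL_NAME}', max_seq_length=128)

# Stage 1: mean pooling over token embeddings, mirroring 1_Pooling/config.json
# (word_embedding_dimension 1536, pooling_mode_mean_tokens true).
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```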
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render.
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128000": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "[CLS]",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "eos_token": "[SEP]",
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "sp_model_kwargs": {},
+   "split_by_punct": false,
+   "stride": 0,
+   "tokenizer_class": "DebertaV2Tokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]",
+   "vocab_type": "spm"
+ }
train-config.yaml ADDED
@@ -0,0 +1,21 @@
+ trainer: "sts"
+ model_name: "albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-sts-cosent20-v1"
+ base_model_name: "albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-v1"
+ loss_function: "cosent"
+ seed: 1
+ learning_rate: 1e-6
+ warmup_ratio: 0.1
+ weight_decay: 0.01
+ batch_size: 16
+ use_amp: True
+ epochs: 10
+ validations_per_epoch: 10
+
+ # HPs used by JRodrigues to train albertina-100m-portuguese-ptpt-encoder:
+ # learning_rate 1e-5
+ # lr_scheduler_type linear
+ # weight_decay 0.01
+ # per_device_train_batch_size 192
+ # gradient_accumulation_steps 1
+ # num_train_epochs 150
+ # num_warmup_steps 10000
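These YAML values line up with the `fit()` parameters reported in the README. A quick sketch of the arithmetic (assuming `warmup_ratio` is taken over the total step count and the evaluation interval is rounded up):

```python
import math

# From the README's Training section: a DataLoader of length 1183.
steps_per_epoch = 1183
epochs = 10                 # epochs: 10 above
warmup_ratio = 0.1          # warmup_ratio: 0.1 above
validations_per_epoch = 10  # validations_per_epoch: 10 above

total_steps = steps_per_epoch * epochs          # 11830
warmup_steps = int(warmup_ratio * total_steps)  # 1183, matching fit()'s warmup_steps
eval_steps = math.ceil(steps_per_epoch / validations_per_epoch)  # 119, matching evaluation_steps

print(total_steps, warmup_steps, eval_steps)
```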