SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct
This is a sentence-transformers model finetuned from Qwen/Qwen2.5-0.5B-Instruct. It maps sentences and paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Qwen/Qwen2.5-0.5B-Instruct
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 896 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: Qwen2Model
  (1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
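The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is the average of the token vectors. The following is a minimal NumPy sketch of that idea, not the library's own code: the function name `mean_pool` and the toy shapes are illustrative assumptions, and padding positions are assumed to be marked with 0 in the attention mask.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

# Toy batch: 2 sequences, 3 token positions, 4-dim vectors (the real model uses 896).
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])  # last token of the first sequence is padding
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

Masking matters because padded positions would otherwise pull the average toward whatever vectors the model emits for padding tokens.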
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("AlexWortega/qwen3k")
# Run inference
sentences = [
'When was ABC formed?',
"American Broadcasting Company\nABC launched as a radio network on October 12, 1943, serving as the successor to the NBC Blue Network, which had been purchased by Edward J. Noble. It extended its operations to television in 1948, following in the footsteps of established broadcast networks CBS and NBC. In the mid-1950s, ABC merged with United Paramount Theatres, a chain of movie theaters that formerly operated as a subsidiary of Paramount Pictures. Leonard Goldenson, who had been the head of UPT, made the new television network profitable by helping develop and greenlight many successful series. In the 1980s, after purchasing an 80% interest in cable sports channel ESPN, the network's corporate parent, American Broadcasting Companies, Inc., merged with Capital Cities Communications, owner of several print publications, and television and radio stations. In 1996, most of Capital Cities/ABC's assets were purchased by The Walt Disney Company.",
'Americans Battling Communism\nAmericans Battling Communism, Inc. (ABC) was an anti-communist organization created following an October 1947 speech by Pennsylvania Judge Blair Gunther that called for an "ABC movement" to educate America about communism. Chartered in November 1947 by Harry Alan Sherman, a local lawyer active in various anti-communist organizations, the group took part in such activities as blacklisting by disclosing the names of people suspected of being communists. Its members included local judges and lawyers active in the McCarthy-era prosecution of communists.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 896]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
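In the snippet above, the first sentence is a question and the other two are candidate passages, which is the usual semantic-search setup. The sketch below shows how such candidates could be ranked by cosine similarity. It uses toy 3-dimensional vectors in place of real embeddings so that it runs without downloading the model; `rank_passages` is a hypothetical helper, not part of the Sentence Transformers API.

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

def rank_passages(query_vec: np.ndarray, passage_vecs: np.ndarray):
    """Return (passage_index, score) pairs, best match first."""
    scores = cosine_similarity_matrix(query_vec[None, :], passage_vecs)[0]
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for embeddings; with the real model one would pass
# embeddings[0] as the query and embeddings[1:] as the passages.
query = np.array([1.0, 0.0, 0.0])
passages = np.array([
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
])
ranking = rank_passages(query, passages)
print(ranking)
```

Because cosine similarity ignores vector length, the second passage ranks first even though its norm differs from the query's.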
Evaluation
Metrics
Semantic Similarity
- Datasets: sts-dev-896 and sts-dev-768
- Evaluated with EmbeddingSimilarityEvaluator
Metric | sts-dev-896 | sts-dev-768 |
---|---|---|
pearson_cosine | 0.7513 | 0.7504 |
spearman_cosine | 0.7603 | 0.759 |
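The two metrics in this table compare the model's cosine similarity scores with human similarity labels: Pearson measures linear agreement, Spearman measures agreement of the rankings. The sketch below illustrates the difference on made-up numbers. It is not the evaluator's implementation, the `gold` and `predicted` values are invented for illustration, and the rank helper assumes there are no tied values.

```python
import numpy as np

def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(values) -> np.ndarray:
    """Ordinal ranks (0 = smallest). Assumes no ties, unlike a full implementation."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    result = np.empty(len(values))
    result[order] = np.arange(len(values))
    return result

def spearman(x, y) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

gold = [0.0, 1.0, 2.0, 3.0]            # invented human similarity labels
predicted = [0.10, 0.20, 0.40, 0.90]   # invented cosine scores: same order, not linear
pearson_score = pearson(gold, predicted)
spearman_score = spearman(gold, predicted)
print(pearson_score, spearman_score)
```

Here the predicted scores preserve the gold ordering exactly, so Spearman is 1 while Pearson is lower. That is why the Spearman figure is usually the one reported for embedding models, where only the ordering of scores matters for retrieval.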
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,077,240 training samples
- Columns: query, response, and negative
- Approximate statistics based on the first 1000 samples:
Property | query | response | negative |
---|---|---|---|
type | string | string | string |
min | 4 tokens | 23 tokens | 4 tokens |
mean | 8.76 tokens | 141.88 tokens | 134.02 tokens |
max | 26 tokens | 532 tokens | 472 tokens |
- Samples:
Sample 1
- query: Was there a year 0?
- response: Year zero
Year zero does not exist in the anno Domini system usually used to number years in the Gregorian calendar and in its predecessor, the Julian calendar. In this system, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars.
- negative: 504
Year 504 (DIV) was a leap year starting on Thursday (link will display the full calendar) of the Julian calendar. At the time, it was known as the Year of the Consulship of Nicomachus without colleague (or, less frequently, year 1257 "Ab urbe condita"). The denomination 504 for this year has been used since the early medieval period, when the Anno Domini calendar era became the prevalent method in Europe for naming years.
Sample 2
- query: When is the dialectical method used?
- response: Dialectic
Dialectic or dialectics (Greek: διαλεκτική, dialektikḗ; related to dialogue), also known as the dialectical method, is at base a discourse between two or more people holding different points of view about a subject but wishing to establish the truth through reasoned arguments. Dialectic resembles debate, but the concept excludes subjective elements such as emotional appeal and the modern pejorative sense of rhetoric.[1][2] Dialectic may be contrasted with the didactic method, wherein one side of the conversation teaches the other. Dialectic is alternatively known as minor logic, as opposed to major logic or critique.
- negative: Derek Bentley case
Another factor in the posthumous defence was that a "confession" recorded by Bentley, which was claimed by the prosecution to be a "verbatim record of dictated monologue", was shown by forensic linguistics methods to have been largely edited by policemen. Linguist Malcolm Coulthard showed that certain patterns, such as the frequency of the word "then" and the grammatical use of "then" after the grammatical subject ("I then" rather than "then I"), were not consistent with Bentley's use of language (his idiolect), as evidenced in court testimony. These patterns fit better the recorded testimony of the policemen involved. This is one of the earliest uses of forensic linguistics on record.
Sample 3
- query: What do Grasshoppers eat?
- response: Grasshopper
Grasshoppers are plant-eaters, with a few species at times becoming serious pests of cereals, vegetables and pasture, especially when they swarm in their millions as locusts and destroy crops over wide areas. They protect themselves from predators by camouflage; when detected, many species attempt to startle the predator with a brilliantly-coloured wing-flash while jumping and (if adult) launching themselves into the air, usually flying for only a short distance. Other species such as the rainbow grasshopper have warning coloration which deters predators. Grasshoppers are affected by parasites and various diseases, and many predatory creatures feed on both nymphs and adults. The eggs are the subject of attack by parasitoids and predators.
- negative: Groundhog
Very often the dens of groundhogs provide homes for other animals including skunks, red foxes, and cottontail rabbits. The fox and skunk feed upon field mice, grasshoppers, beetles and other creatures that destroy farm crops. In aiding these animals, the groundhog indirectly helps the farmer. In addition to providing homes for itself and other animals, the groundhog aids in soil improvement by bringing subsoil to the surface. The groundhog is also a valuable game animal and is considered a difficult sport when hunted in a fair manner. In some parts of Appalachia, they are eaten.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
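MultipleNegativesRankingLoss treats each query's own response as the correct answer and every other response and negative in the batch as a distractor, then applies cross-entropy over cosine similarities multiplied by the scale (20.0 here). The following NumPy sketch shows that computation on toy vectors. It is a simplified illustration rather than the library's code, and the function name `mnrl_loss` and the example vectors are assumptions made for the demonstration.

```python
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def mnrl_loss(queries, positives, negatives, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives.

    For query i the correct candidate is positives[i]; all other positives
    and all negatives in the batch serve as incorrect candidates.
    """
    candidates = np.concatenate([positives, negatives], axis=0)      # (2B, dim)
    logits = scale * normalize(queries) @ normalize(candidates).T    # (B, 2B)
    logits = logits - logits.max(axis=1, keepdims=True)              # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = np.arange(len(queries))
    return float(-log_probs[targets, targets].mean())

# Toy 4-dim vectors: two queries, their matching responses, and two hard negatives.
queries = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
positives = queries.copy()
negatives = np.array([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]])

aligned_loss = mnrl_loss(queries, positives, negatives)
swapped_loss = mnrl_loss(queries, positives[::-1], negatives)  # responses mismatched
print(aligned_loss, swapped_loss)
```

When each query matches its own response the loss is close to zero, and when the responses are swapped it rises to roughly the scale value. Since every other row in the batch acts as a negative, a duplicated response in one batch would be scored as a wrong answer, which is the usual reason for pairing this loss with the no_duplicates batch sampler.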
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 12
- per_device_eval_batch_size: 12
- gradient_accumulation_steps: 4
- num_train_epochs: 1
- warmup_ratio: 0.3
- bf16: True
- batch_sampler: no_duplicates
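These settings combine into an effective batch size and a warmup length. The arithmetic below is a rough sketch under two assumptions that the card does not state: training ran on a single device, and the number of optimizer steps is the sample count divided by the effective batch size, rounded up. The trainer's exact step count can differ slightly. `schedule_summary` is a hypothetical helper.

```python
import math

def schedule_summary(num_samples: int, per_device_batch: int, grad_accum: int,
                     warmup_ratio: float, num_devices: int = 1) -> dict:
    """Approximate optimizer-step bookkeeping for one epoch."""
    # Samples consumed per optimizer step.
    effective_batch = per_device_batch * grad_accum * num_devices
    # Approximation: the trainer may round slightly differently.
    total_steps = math.ceil(num_samples / effective_batch)
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }

# 1,077,240 training samples, batch 12, accumulation 4, warmup ratio 0.3;
# num_devices=1 is an assumption.
summary = schedule_summary(1_077_240, 12, 4, 0.3)
print(summary)
```

Under these assumptions one optimizer step covers 48 samples, a full epoch is about 22,443 steps, and the learning rate warms up over roughly the first 6,733 of them.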
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 12
- per_device_eval_batch_size: 12
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.3
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | sts-dev-896_spearman_cosine | sts-dev-768_spearman_cosine |
---|---|---|---|---|
0.0004 | 10 | 2.2049 | - | - |
0.0009 | 20 | 2.3168 | - | - |
0.0013 | 30 | 2.3544 | - | - |
0.0018 | 40 | 2.2519 | - | - |
0.0022 | 50 | 2.1809 | - | - |
0.0027 | 60 | 2.1572 | - | - |
0.0031 | 70 | 2.1855 | - | - |
0.0036 | 80 | 2.5887 | - | - |
0.0040 | 90 | 2.883 | - | - |
0.0045 | 100 | 2.8557 | - | - |
0.0049 | 110 | 2.9356 | - | - |
0.0053 | 120 | 2.8833 | - | - |
0.0058 | 130 | 2.8394 | - | - |
0.0062 | 140 | 2.923 | - | - |
0.0067 | 150 | 2.8191 | - | - |
0.0071 | 160 | 2.8658 | - | - |
0.0076 | 170 | 2.8252 | - | - |
0.0080 | 180 | 2.8312 | - | - |
0.0085 | 190 | 2.7761 | - | - |
0.0089 | 200 | 2.7193 | - | - |
0.0094 | 210 | 2.724 | - | - |
0.0098 | 220 | 2.7484 | - | - |
0.0102 | 230 | 2.7262 | - | - |
0.0107 | 240 | 2.6964 | - | - |
0.0111 | 250 | 2.6676 | - | - |
0.0116 | 260 | 2.6715 | - | - |
0.0120 | 270 | 2.6145 | - | - |
0.0125 | 280 | 2.6191 | - | - |
0.0129 | 290 | 1.9812 | - | - |
0.0134 | 300 | 1.6413 | - | - |
0.0138 | 310 | 1.6126 | - | - |
0.0143 | 320 | 1.3599 | - | - |
0.0147 | 330 | 1.2996 | - | - |
0.0151 | 340 | 1.2654 | - | - |
0.0156 | 350 | 1.9409 | - | - |
0.0160 | 360 | 2.1287 | - | - |
0.0165 | 370 | 1.8442 | - | - |
0.0169 | 380 | 1.6837 | - | - |
0.0174 | 390 | 1.5489 | - | - |
0.0178 | 400 | 1.4382 | - | - |
0.0183 | 410 | 1.4848 | - | - |
0.0187 | 420 | 1.3481 | - | - |
0.0192 | 430 | 1.3467 | - | - |
0.0196 | 440 | 1.3977 | - | - |
0.0201 | 450 | 1.26 | - | - |
0.0205 | 460 | 1.2412 | - | - |
0.0209 | 470 | 1.316 | - | - |
0.0214 | 480 | 1.3501 | - | - |
0.0218 | 490 | 1.2246 | - | - |
0.0223 | 500 | 1.2271 | - | - |
0.0227 | 510 | 1.1871 | - | - |
0.0232 | 520 | 1.1685 | - | - |
0.0236 | 530 | 1.1624 | - | - |
0.0241 | 540 | 1.1911 | - | - |
0.0245 | 550 | 1.1978 | - | - |
0.0250 | 560 | 1.1228 | - | - |
0.0254 | 570 | 1.1091 | - | - |
0.0258 | 580 | 1.1433 | - | - |
0.0263 | 590 | 1.0638 | - | - |
0.0267 | 600 | 1.0515 | - | - |
0.0272 | 610 | 1.175 | - | - |
0.0276 | 620 | 1.0943 | - | - |
0.0281 | 630 | 1.1226 | - | - |
0.0285 | 640 | 0.9871 | - | - |
0.0290 | 650 | 1.0171 | - | - |
0.0294 | 660 | 1.0169 | - | - |
0.0299 | 670 | 0.9643 | - | - |
0.0303 | 680 | 0.9563 | - | - |
0.0307 | 690 | 0.9841 | - | - |
0.0312 | 700 | 1.0349 | - | - |
0.0316 | 710 | 0.8958 | - | - |
0.0321 | 720 | 0.9225 | - | - |
0.0325 | 730 | 0.842 | - | - |
0.0330 | 740 | 0.9104 | - | - |
0.0334 | 750 | 0.8927 | - | - |
0.0339 | 760 | 0.8508 | - | - |
0.0343 | 770 | 0.8835 | - | - |
0.0348 | 780 | 0.9531 | - | - |
0.0352 | 790 | 0.926 | - | - |
0.0356 | 800 | 0.8718 | - | - |
0.0361 | 810 | 0.8261 | - | - |
0.0365 | 820 | 0.8169 | - | - |
0.0370 | 830 | 0.8525 | - | - |
0.0374 | 840 | 0.8504 | - | - |
0.0379 | 850 | 0.7625 | - | - |
0.0383 | 860 | 0.8259 | - | - |
0.0388 | 870 | 0.7558 | - | - |
0.0392 | 880 | 0.7898 | - | - |
0.0397 | 890 | 0.7694 | - | - |
0.0401 | 900 | 0.7429 | - | - |
0.0405 | 910 | 0.6666 | - | - |
0.0410 | 920 | 0.7407 | - | - |
0.0414 | 930 | 0.6665 | - | - |
0.0419 | 940 | 0.7597 | - | - |
0.0423 | 950 | 0.7035 | - | - |
0.0428 | 960 | 0.7166 | - | - |
0.0432 | 970 | 0.6889 | - | - |
0.0437 | 980 | 0.7541 | - | - |
0.0441 | 990 | 0.7175 | - | - |
0.0446 | 1000 | 0.7389 | 0.6420 | 0.6403 |
0.0450 | 1010 | 0.7142 | - | - |
0.0454 | 1020 | 0.7301 | - | - |
0.0459 | 1030 | 0.7299 | - | - |
0.0463 | 1040 | 0.6759 | - | - |
0.0468 | 1050 | 0.7036 | - | - |
0.0472 | 1060 | 0.6286 | - | - |
0.0477 | 1070 | 0.595 | - | - |
0.0481 | 1080 | 0.6099 | - | - |
0.0486 | 1090 | 0.6377 | - | - |
0.0490 | 1100 | 0.6309 | - | - |
0.0495 | 1110 | 0.6306 | - | - |
0.0499 | 1120 | 0.557 | - | - |
0.0504 | 1130 | 0.5898 | - | - |
0.0508 | 1140 | 0.5896 | - | - |
0.0512 | 1150 | 0.6399 | - | - |
0.0517 | 1160 | 0.5923 | - | - |
0.0521 | 1170 | 0.5787 | - | - |
0.0526 | 1180 | 0.591 | - | - |
0.0530 | 1190 | 0.5714 | - | - |
0.0535 | 1200 | 0.6047 | - | - |
0.0539 | 1210 | 0.5904 | - | - |
0.0544 | 1220 | 0.543 | - | - |
0.0548 | 1230 | 0.6033 | - | - |
0.0553 | 1240 | 0.5445 | - | - |
0.0557 | 1250 | 0.5217 | - | - |
0.0561 | 1260 | 0.5835 | - | - |
0.0566 | 1270 | 0.5353 | - | - |
0.0570 | 1280 | 0.5887 | - | - |
0.0575 | 1290 | 0.5967 | - | - |
0.0579 | 1300 | 0.5036 | - | - |
0.0584 | 1310 | 0.5915 | - | - |
0.0588 | 1320 | 0.5719 | - | - |
0.0593 | 1330 | 0.5238 | - | - |
0.0597 | 1340 | 0.5647 | - | - |
0.0602 | 1350 | 0.538 | - | - |
0.0606 | 1360 | 0.5457 | - | - |
0.0610 | 1370 | 0.5169 | - | - |
0.0615 | 1380 | 0.4967 | - | - |
0.0619 | 1390 | 0.4864 | - | - |
0.0624 | 1400 | 0.5133 | - | - |
0.0628 | 1410 | 0.5587 | - | - |
0.0633 | 1420 | 0.4691 | - | - |
0.0637 | 1430 | 0.5186 | - | - |
0.0642 | 1440 | 0.4907 | - | - |
0.0646 | 1450 | 0.5281 | - | - |
0.0651 | 1460 | 0.4741 | - | - |
0.0655 | 1470 | 0.4452 | - | - |
0.0659 | 1480 | 0.4771 | - | - |
0.0664 | 1490 | 0.4289 | - | - |
0.0668 | 1500 | 0.4551 | - | - |
0.0673 | 1510 | 0.4558 | - | - |
0.0677 | 1520 | 0.5159 | - | - |
0.0682 | 1530 | 0.4296 | - | - |
0.0686 | 1540 | 0.4548 | - | - |
0.0691 | 1550 | 0.4439 | - | - |
0.0695 | 1560 | 0.4295 | - | - |
0.0700 | 1570 | 0.4466 | - | - |
0.0704 | 1580 | 0.4717 | - | - |
0.0708 | 1590 | 0.492 | - | - |
0.0713 | 1600 | 0.4566 | - | - |
0.0717 | 1610 | 0.4451 | - | - |
0.0722 | 1620 | 0.4715 | - | - |
0.0726 | 1630 | 0.4573 | - | - |
0.0731 | 1640 | 0.3972 | - | - |
0.0735 | 1650 | 0.5212 | - | - |
0.0740 | 1660 | 0.4381 | - | - |
0.0744 | 1670 | 0.4552 | - | - |
0.0749 | 1680 | 0.4767 | - | - |
0.0753 | 1690 | 0.4398 | - | - |
0.0757 | 1700 | 0.4801 | - | - |
0.0762 | 1710 | 0.3751 | - | - |
0.0766 | 1720 | 0.4407 | - | - |
0.0771 | 1730 | 0.4305 | - | - |
0.0775 | 1740 | 0.3938 | - | - |
0.0780 | 1750 | 0.4748 | - | - |
0.0784 | 1760 | 0.428 | - | - |
0.0789 | 1770 | 0.404 | - | - |
0.0793 | 1780 | 0.4261 | - | - |
0.0798 | 1790 | 0.359 | - | - |
0.0802 | 1800 | 0.4422 | - | - |
0.0807 | 1810 | 0.4748 | - | - |
0.0811 | 1820 | 0.4352 | - | - |
0.0815 | 1830 | 0.4032 | - | - |
0.0820 | 1840 | 0.4124 | - | - |
0.0824 | 1850 | 0.4486 | - | - |
0.0829 | 1860 | 0.429 | - | - |
0.0833 | 1870 | 0.4189 | - | - |
0.0838 | 1880 | 0.3658 | - | - |
0.0842 | 1890 | 0.4297 | - | - |
0.0847 | 1900 | 0.4215 | - | - |
0.0851 | 1910 | 0.3726 | - | - |
0.0856 | 1920 | 0.3736 | - | - |
0.0860 | 1930 | 0.4287 | - | - |
0.0864 | 1940 | 0.4402 | - | - |
0.0869 | 1950 | 0.4353 | - | - |
0.0873 | 1960 | 0.3622 | - | - |
0.0878 | 1970 | 0.3557 | - | - |
0.0882 | 1980 | 0.4107 | - | - |
0.0887 | 1990 | 0.3982 | - | - |
0.0891 | 2000 | 0.453 | 0.7292 | 0.7261 |
0.0896 | 2010 | 0.3971 | - | - |
0.0900 | 2020 | 0.4374 | - | - |
0.0905 | 2030 | 0.4322 | - | - |
0.0909 | 2040 | 0.3945 | - | - |
0.0913 | 2050 | 0.356 | - | - |
0.0918 | 2060 | 0.4182 | - | - |
0.0922 | 2070 | 0.3694 | - | - |
0.0927 | 2080 | 0.3989 | - | - |
0.0931 | 2090 | 0.4237 | - | - |
0.0936 | 2100 | 0.3961 | - | - |
0.0940 | 2110 | 0.4264 | - | - |
0.0945 | 2120 | 0.3609 | - | - |
0.0949 | 2130 | 0.4154 | - | - |
0.0954 | 2140 | 0.3661 | - | - |
0.0958 | 2150 | 0.3328 | - | - |
0.0962 | 2160 | 0.3456 | - | - |
0.0967 | 2170 | 0.3478 | - | - |
0.0971 | 2180 | 0.3339 | - | - |
0.0976 | 2190 | 0.3833 | - | - |
0.0980 | 2200 | 0.3238 | - | - |
0.0985 | 2210 | 0.3871 | - | - |
0.0989 | 2220 | 0.4009 | - | - |
0.0994 | 2230 | 0.4115 | - | - |
0.0998 | 2240 | 0.4024 | - | - |
0.1003 | 2250 | 0.35 | - | - |
0.1007 | 2260 | 0.3649 | - | - |
0.1011 | 2270 | 0.3615 | - | - |
0.1016 | 2280 | 0.3898 | - | - |
0.1020 | 2290 | 0.3866 | - | - |
0.1025 | 2300 | 0.3904 | - | - |
0.1029 | 2310 | 0.3321 | - | - |
0.1034 | 2320 | 0.3803 | - | - |
0.1038 | 2330 | 0.3831 | - | - |
0.1043 | 2340 | 0.403 | - | - |
0.1047 | 2350 | 0.3803 | - | - |
0.1052 | 2360 | 0.3463 | - | - |
0.1056 | 2370 | 0.3987 | - | - |
0.1060 | 2380 | 0.3731 | - | - |
0.1065 | 2390 | 0.353 | - | - |
0.1069 | 2400 | 0.3166 | - | - |
0.1074 | 2410 | 0.3895 | - | - |
0.1078 | 2420 | 0.4025 | - | - |
0.1083 | 2430 | 0.3798 | - | - |
0.1087 | 2440 | 0.2991 | - | - |
0.1092 | 2450 | 0.3094 | - | - |
0.1096 | 2460 | 0.3669 | - | - |
0.1101 | 2470 | 0.3412 | - | - |
0.1105 | 2480 | 0.3697 | - | - |
0.1110 | 2490 | 0.369 | - | - |
0.1114 | 2500 | 0.3393 | - | - |
0.1118 | 2510 | 0.4232 | - | - |
0.1123 | 2520 | 0.3445 | - | - |
0.1127 | 2530 | 0.4165 | - | - |
0.1132 | 2540 | 0.3721 | - | - |
0.1136 | 2550 | 0.3476 | - | - |
0.1141 | 2560 | 0.2847 | - | - |
0.1145 | 2570 | 0.3609 | - | - |
0.1150 | 2580 | 0.3017 | - | - |
0.1154 | 2590 | 0.374 | - | - |
0.1159 | 2600 | 0.3365 | - | - |
0.1163 | 2610 | 0.393 | - | - |
0.1167 | 2620 | 0.3623 | - | - |
0.1172 | 2630 | 0.3538 | - | - |
0.1176 | 2640 | 0.3206 | - | - |
0.1181 | 2650 | 0.3962 | - | - |
0.1185 | 2660 | 0.3087 | - | - |
0.1190 | 2670 | 0.3482 | - | - |
0.1194 | 2680 | 0.3616 | - | - |
0.1199 | 2690 | 0.3955 | - | - |
0.1203 | 2700 | 0.3915 | - | - |
0.1208 | 2710 | 0.3782 | - | - |
0.1212 | 2720 | 0.3576 | - | - |
0.1216 | 2730 | 0.3544 | - | - |
0.1221 | 2740 | 0.3572 | - | - |
0.1225 | 2750 | 0.3107 | - | - |
0.1230 | 2760 | 0.3579 | - | - |
0.1234 | 2770 | 0.3571 | - | - |
0.1239 | 2780 | 0.3694 | - | - |
0.1243 | 2790 | 0.3674 | - | - |
0.1248 | 2800 | 0.3373 | - | - |
0.1252 | 2810 | 0.3362 | - | - |
0.1257 | 2820 | 0.3225 | - | - |
0.1261 | 2830 | 0.3609 | - | - |
0.1265 | 2840 | 0.3681 | - | - |
0.1270 | 2850 | 0.4059 | - | - |
0.1274 | 2860 | 0.3047 | - | - |
0.1279 | 2870 | 0.3446 | - | - |
0.1283 | 2880 | 0.3507 | - | - |
0.1288 | 2890 | 0.3124 | - | - |
0.1292 | 2900 | 0.3712 | - | - |
0.1297 | 2910 | 0.3394 | - | - |
0.1301 | 2920 | 0.3869 | - | - |
0.1306 | 2930 | 0.3449 | - | - |
0.1310 | 2940 | 0.3752 | - | - |
0.1314 | 2950 | 0.3341 | - | - |
0.1319 | 2960 | 0.3329 | - | - |
0.1323 | 2970 | 0.36 | - | - |
0.1328 | 2980 | 0.3788 | - | - |
0.1332 | 2990 | 0.3834 | - | - |
0.1337 | 3000 | 0.3426 | 0.7603 | 0.7590 |
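The Epoch column can be reproduced from the step number, which also shows how much of the data the logged run covered. The sketch below assumes 48 samples per optimizer step (batch size 12 times 4 accumulation steps on a single device, which the card does not state explicitly) and uses the 1,077,240-sample training set size given above. `epoch_at_step` is a hypothetical helper.

```python
def epoch_at_step(step: int, num_samples: int = 1_077_240,
                  effective_batch: int = 48) -> float:
    """Fraction of one epoch completed after `step` optimizer steps.

    effective_batch=48 assumes a single device (12 per device x 4 accumulation).
    """
    return step * effective_batch / num_samples

# First logged step and the three evaluation steps from the table above.
logged = {step: round(epoch_at_step(step), 4) for step in (10, 1000, 2000, 3000)}
samples_seen = 3000 * 48
print(logged, samples_seen)
```

The computed fractions (0.0004, 0.0446, 0.0891, 0.1337) match the logged Epoch values, which is consistent with the single-device assumption. By the last logged step the model had seen about 144,000 samples, roughly 13% of the training set.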
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.0
- Transformers: 4.46.2
- PyTorch: 2.1.0+cu118
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}