metadata
base_model: sentence-transformers/all-mpnet-base-v2
datasets: []
language:
  - en
library_name: sentence-transformers
license: apache-2.0
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:4012
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Extensive messenger RNA editing generates transcript and protein diversity
      in genes involved in neural excitability, as previously described, as well
      as in genes participating in a broad range of other cellular functions. 
    sentences:
      - Do cephalopods use RNA editing less frequently than other species?
      - GV1001 vaccine targets which enzyme?
      - Which event results in the acetylation of S6K1?
  - source_sentence: >-
      Yes, exposure to household furry pets influences the gut microbiota of
      infants.
    sentences:
      - Can pets affect infant microbiomed?
      - What is the mode of action of Thiazovivin?
      - What are the effects of CAMK4 inhibition?
  - source_sentence: >-
      In children with heart failure evidence of the effect of enalapril is
      empirical. Enalapril was clinically safe and effective in 50% to 80% of
      for children with cardiac failure secondary to congenital heart
      malformations before and after cardiac surgery,  impaired ventricular
      function , valvar regurgitation,  congestive cardiomyopathy,  , arterial
      hypertension, life-threatening arrhythmias coexisting with circulatory
      insufficiency.   

      ACE inhibitors have shown a transient beneficial effect on heart failure
      due to anticancer drugs and possibly a beneficial effect in muscular
      dystrophy-associated cardiomyopathy, which deserves further studies.
    sentences:
      - Which receptors can be evaluated with the [18F]altanserin?
      - >-
        In what proportion of children with heart failure has Enalapril been
        shown to be safe and effective?
      - Which major signaling pathways are regulated by RIP1?
  - source_sentence: >-
      Cellular senescence-associated heterochromatic foci (SAHFS) are a novel
      type of chromatin condensation involving alterations of linker histone H1
      and linker DNA-binding proteins. SAHFS can be formed by a variety of cell
      types, but their mechanism of action remains unclear.
    sentences:
      - >-
        What is the relationship between the X chromosome and a  neutrophil
        drumstick?
      - Which microRNAs are involved in exercise adaptation?
      - How are SAHFS created?
  - source_sentence: >-
      Multicluster Pcdh diversity is required for mouse olfactory neural circuit
      assembly. The vertebrate clustered protocadherin (Pcdh) cell surface
      proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ,
      and Pcdhγ). Although deletion of individual Pcdh clusters had subtle
      phenotypic consequences, the loss of all three clusters (tricluster
      deletion) led to a severe axonal arborization defect and loss of
      self-avoidance.
    sentences:
      - >-
        What are the effects of the deletion of all three Pcdh clusters
        (tricluster deletion) in mice?
      - what is the role of MEF-2 in cardiomyocyte differentiation?
      - >-
        How many periods of regulatory innovation led to the evolution of
        vertebrates?
model-index:
  - name: BGE base Financial Matryoshka
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.8373408769448374
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9306930693069307
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9448373408769448
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.958981612446959
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8373408769448374
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31023102310231027
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.18896746817538893
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09589816124469587
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8373408769448374
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9306930693069307
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9448373408769448
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.958981612446959
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9038566618329213
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8855380436002787
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8867903631779396
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.8373408769448374
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9335219236209336
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9462517680339463
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9603960396039604
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8373408769448374
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31117397454031115
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.18925035360678924
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09603960396039603
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8373408769448374
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9335219236209336
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9462517680339463
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9603960396039604
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9045496377971035
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8860549830493253
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8870969130410834
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.8288543140028288
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9222065063649222
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.942008486562942
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9533239038189534
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8288543140028288
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3074021687883074
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.18840169731258838
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09533239038189532
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8288543140028288
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9222065063649222
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.942008486562942
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9533239038189534
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8963408137245359
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8774370804427385
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8786914503856871
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.809052333804809
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8995756718528995
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9207920792079208
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9405940594059405
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.809052333804809
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.29985855728429983
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.18415841584158416
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09405940594059406
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.809052333804809
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8995756718528995
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9207920792079208
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9405940594059405
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8794609712523561
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8593930311398488
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8608652296821839
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.7694483734087695
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8613861386138614
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.8868458274398868
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9080622347949081
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.7694483734087695
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2871287128712871
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.17736916548797735
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09080622347949079
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.7694483734087695
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8613861386138614
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.8868458274398868
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9080622347949081
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.841605620432732
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8200012348173592
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8223782042287946
            name: Cosine Map@100

BGE base Financial Matryoshka

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
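The Pooling and Normalize stages listed above can be illustrated with a minimal NumPy sketch. It assumes the transformer has already produced per-token vectors; the random `tokens` array and the function name `mean_pool_and_normalize` are illustrative stand-ins, not part of the library.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean pooling followed by L2 normalization.

    token_embeddings: (batch, seq_len, hidden) array of per-token vectors.
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)       # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)       # avoid division by zero
    pooled = summed / counts                             # pooling_mode_mean_tokens
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)          # Normalize()

# Toy input: 2 sentences, 4 token slots, hidden size 768.
# Random values stand in for real MPNet outputs (assumption for illustration).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
sentence_embeddings = mean_pool_and_normalize(tokens, mask)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, the dot product of two embeddings equals their cosine similarity.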

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("juanpablomesa/all-mpnet-base-v2-bioasq-matryoshka")
# Run inference
sentences = [
    'Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss of all three clusters (tricluster deletion) led to a severe axonal arborization defect and loss of self-avoidance.',
    'What are the effects of the deletion of all three Pcdh clusters (tricluster deletion) in mice?',
    'How many periods of regulatory innovation led to the evolution of vertebrates?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
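Since the model was trained with MatryoshkaLoss at 768, 512, 256, 128, and 64 dimensions, embeddings can be shortened to one of those sizes after encoding. The sketch below shows the idea with NumPy on random stand-in vectors; the helper names are hypothetical. Recent releases of Sentence Transformers also accept a `truncate_dim` argument on the `SentenceTransformer` constructor, which is worth checking against your installed version.

```python
import numpy as np

def truncate_and_normalize(embeddings, dim):
    """Keep the first `dim` components and rescale each row to unit length."""
    truncated = np.asarray(embeddings, dtype=np.float64)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity for rows that are already unit length."""
    return embeddings @ embeddings.T

# Random unit vectors stand in for the output of model.encode(sentences).
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

for dim in (768, 512, 256, 128, 64):
    small = truncate_and_normalize(full, dim)
    sims = cosine_similarity_matrix(small)
    print(dim, small.shape, sims.shape)
```

Re-normalizing after truncation matters: the first `dim` components of a unit vector are generally not unit length themselves, so a plain dot product would no longer be a cosine similarity.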

Evaluation

Metrics

Information Retrieval (dataset: dim_768)

Metric Value
cosine_accuracy@1 0.8373
cosine_accuracy@3 0.9307
cosine_accuracy@5 0.9448
cosine_accuracy@10 0.959
cosine_precision@1 0.8373
cosine_precision@3 0.3102
cosine_precision@5 0.189
cosine_precision@10 0.0959
cosine_recall@1 0.8373
cosine_recall@3 0.9307
cosine_recall@5 0.9448
cosine_recall@10 0.959
cosine_ndcg@10 0.9039
cosine_mrr@10 0.8855
cosine_map@100 0.8868

Information Retrieval (dataset: dim_512)

Metric Value
cosine_accuracy@1 0.8373
cosine_accuracy@3 0.9335
cosine_accuracy@5 0.9463
cosine_accuracy@10 0.9604
cosine_precision@1 0.8373
cosine_precision@3 0.3112
cosine_precision@5 0.1893
cosine_precision@10 0.096
cosine_recall@1 0.8373
cosine_recall@3 0.9335
cosine_recall@5 0.9463
cosine_recall@10 0.9604
cosine_ndcg@10 0.9045
cosine_mrr@10 0.8861
cosine_map@100 0.8871

Information Retrieval (dataset: dim_256)

Metric Value
cosine_accuracy@1 0.8289
cosine_accuracy@3 0.9222
cosine_accuracy@5 0.942
cosine_accuracy@10 0.9533
cosine_precision@1 0.8289
cosine_precision@3 0.3074
cosine_precision@5 0.1884
cosine_precision@10 0.0953
cosine_recall@1 0.8289
cosine_recall@3 0.9222
cosine_recall@5 0.942
cosine_recall@10 0.9533
cosine_ndcg@10 0.8963
cosine_mrr@10 0.8774
cosine_map@100 0.8787

Information Retrieval (dataset: dim_128)

Metric Value
cosine_accuracy@1 0.8091
cosine_accuracy@3 0.8996
cosine_accuracy@5 0.9208
cosine_accuracy@10 0.9406
cosine_precision@1 0.8091
cosine_precision@3 0.2999
cosine_precision@5 0.1842
cosine_precision@10 0.0941
cosine_recall@1 0.8091
cosine_recall@3 0.8996
cosine_recall@5 0.9208
cosine_recall@10 0.9406
cosine_ndcg@10 0.8795
cosine_mrr@10 0.8594
cosine_map@100 0.8609

Information Retrieval (dataset: dim_64)

Metric Value
cosine_accuracy@1 0.7694
cosine_accuracy@3 0.8614
cosine_accuracy@5 0.8868
cosine_accuracy@10 0.9081
cosine_precision@1 0.7694
cosine_precision@3 0.2871
cosine_precision@5 0.1774
cosine_precision@10 0.0908
cosine_recall@1 0.7694
cosine_recall@3 0.8614
cosine_recall@5 0.8868
cosine_recall@10 0.9081
cosine_ndcg@10 0.8416
cosine_mrr@10 0.82
cosine_map@100 0.8224
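In the tables above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.9307 / 3 ≈ 0.3102), which is what one would expect if each query has exactly one relevant passage. The following sketch computes these metrics from ranked result lists under that reading; the function names and the toy document ids are made up for illustration and are not the evaluator's actual code.

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Per-query accuracy, precision, recall and reciprocal rank within the top k."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    accuracy = 1.0 if hits else 0.0              # at least one relevant doc retrieved
    precision = len(hits) / k                    # fraction of the top k that is relevant
    recall = len(hits) / len(relevant_ids)       # fraction of relevant docs retrieved
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank         # rank of the first relevant doc
            break
    return accuracy, precision, recall, reciprocal_rank

def mean_metrics(queries, k):
    """Average the per-query metrics over all (ranking, relevant set) pairs."""
    rows = [metrics_at_k(ranked, relevant, k) for ranked, relevant in queries]
    names = ("accuracy", "precision", "recall", "mrr")
    return {f"{name}@{k}": sum(row[i] for row in rows) / len(rows)
            for i, name in enumerate(names)}

# Toy rankings, each sorted by descending cosine similarity (hypothetical ids).
toy_queries = [
    (["d1", "d2", "d3", "d4", "d5"], {"d1"}),  # relevant doc at rank 1
    (["d2", "d1", "d3", "d4", "d5"], {"d1"}),  # relevant doc at rank 2
    (["d7", "d8", "d9", "d4", "d5"], {"d7"}),  # relevant doc at rank 1
    (["d2", "d3", "d4", "d5", "d6"], {"d6"}),  # relevant doc at rank 5, outside top 3
]
result = mean_metrics(toy_queries, k=3)
print(result)
```

With one relevant passage per query, precision@k can never exceed 1/k, so the low precision@10 values in the tables reflect the evaluation setup rather than poor ranking.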

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,012 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
    Column    Type    Min       Mean          Max
    positive  string  3 tokens  63.14 tokens  384 tokens
    anchor    string  5 tokens  16.13 tokens  49 tokens
  • Samples:
    Sample 1
    positive: Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma.
    anchor: What is the implication of histone lysine methylation in medulloblastoma?
    Sample 2
    positive: STAG1/STAG2 proteins are tumour suppressor proteins that suppress cell proliferation and are essential for differentiation.
    anchor: What is the role of STAG1/STAG2 proteins in differentiation?
    Sample 3
    positive: The association between cell phone use and incident glioblastoma remains unclear. Some studies have reported that cell phone use was associated with incident glioblastoma, and with reduced survival of patients diagnosed with glioblastoma. However, other studies have repeatedly replicated to find an association between cell phone use and glioblastoma.
    anchor: What is the association between cell phone use and glioblastoma?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
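The configuration above wraps MultipleNegativesRankingLoss in MatryoshkaLoss with equal weights; `n_dims_per_step: -1` means every listed dimension contributes at each step. A rough NumPy sketch of that combination follows. It is a simplified illustration, not the library's implementation: the similarity scale of 20.0 is an assumption (the card does not state it), and the random vectors stand in for model outputs.

```python
import numpy as np

def l2_normalize(x):
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: positives[i] is the match for anchors[i];
    every other positive in the batch acts as a negative."""
    scores = scale * (l2_normalize(anchors) @ l2_normalize(positives).T)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))            # cross-entropy on the diagonal

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the inner loss over truncated copies of the embeddings."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

# Toy batch of 8 pairs (random stand-ins for real embeddings).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))     # well-aligned pairs
loss_matched = matryoshka_loss(anchors, positives)
loss_shuffled = matryoshka_loss(anchors, np.roll(positives, 1, axis=0))
print(loss_matched, loss_shuffled)
```

Aligned pairs give a much smaller loss than mismatched ones at every truncation length, which is the pressure that pushes useful information into the leading dimensions.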
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
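As a worked example of how the batch settings above interact, the arithmetic below derives the effective batch size and the number of optimizer steps per epoch from the 4,012 training samples reported in this card. It assumes a single training device, which the card does not state; under that assumption the computed epoch fractions line up with the Epoch and Step columns in the Training Logs section.

```python
import math

dataset_size = 4012                # training samples reported in this card
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1                    # assumption: single-device training

# Samples consumed per optimizer step.
effective_batch_size = (per_device_train_batch_size * num_devices
                        * gradient_accumulation_steps)

# Micro-batches per epoch (the last partial batch is kept, matching
# dataloader_drop_last: False), then optimizer steps per epoch.
micro_batches_per_epoch = math.ceil(
    dataset_size / (per_device_train_batch_size * num_devices))
steps_per_epoch = micro_batches_per_epoch / gradient_accumulation_steps

def epoch_at_step(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(effective_batch_size, micro_batches_per_epoch, steps_per_epoch)
for step in (7, 10, 15, 20, 23, 28):
    print(step, epoch_at_step(step))
```

With roughly eight optimizer steps per epoch, the whole run amounts to only a few dozen weight updates.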

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.8889 7 - 0.8540 0.8752 0.8825 0.8050 0.8864
1.2698 10 1.2032 - - - - -
1.9048 15 - 0.8569 0.8775 0.8850 0.8169 0.8840
2.5397 20 0.5051 - - - - -
2.9206 23 - 0.861 0.8794 0.8866 0.8242 0.8858
3.5556 28 - 0.8609 0.8787 0.8871 0.8224 0.8868
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.5
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}