WadoodAbdul committed
Commit 4e35351 · 1 Parent(s): eb6e73c

added dataset links and reproducibility steps

Files changed (1)
  1. src/about.py +9 -7
src/about.py CHANGED
@@ -55,7 +55,9 @@ The evaluation metrics used in this leaderboard focus primarily on the F1-score,
 # Which evaluations are you running? how can people reproduce what you have?
 LLM_BENCHMARKS_TEXT = f"""
 
-Note: It is important to note that the purpose of this evaluation is purely academic and exploratory. The models assessed here have not been approved for clinical use, and their results should not be interpreted as clinically validated. The leaderboard serves as a platform for researchers to compare models, understand their strengths and limitations, and drive further advancements in the field of clinical NLP.
+#### Disclaimer & Advisory
+
+It is important to note that the purpose of this evaluation is purely academic and exploratory. The models assessed here have not been approved for clinical use, and their results should not be interpreted as clinically validated. The leaderboard serves as a platform for researchers to compare models, understand their strengths and limitations, and drive further advancements in the field of clinical NLP.
 
 ## About
 The Named Clinical Entity Recognition Leaderboard is aimed at advancing the field of natural language processing in healthcare. It provides a standardized platform for evaluating and comparing the performance of various language models in recognizing named clinical entities, a critical task for applications such as clinical documentation, decision support, and information extraction. By fostering transparency and facilitating benchmarking, the leaderboard's goal is to drive innovation and improvement in NLP models. It also helps researchers identify the strengths and weaknesses of different approaches, ultimately contributing to the development of more accurate and reliable tools for clinical use. Despite its exploratory nature, the leaderboard aims to play a role in guiding research and ensuring that advancements are grounded in rigorous and comprehensive evaluations.
@@ -64,22 +66,22 @@ The Named Clinical Entity Recognition Leaderboard is aimed at advancing the fiel
 
 ### Datasets
 📈 We evaluate the models on 4 datasets, encompassing 6 entity types:
-- NCBI
-- CHIA
-- BIORED
-- BC5CDR
+- [NCBI](https://huggingface.co/datasets/m42-health/m2_ncbi)
+- [CHIA](https://huggingface.co/datasets/m42-health/m2_chia)
+- [BIORED](https://huggingface.co/datasets/m42-health/m2_biored)
+- [BC5CDR](https://huggingface.co/datasets/m42-health/m2_bc5cdr)
 
 ### Evaluation Metrics
 We perceive NER objects as spans (with character offsets) instead of token-level artifacts. This enables us to expand easily to nested NER scenarios.
 
 
 ## Reproducibility
-To reproduce our results, here are the commands you can run:
+To reproduce our results, follow the steps detailed [here](https://github.com/WadoodAbdul/medics_ner/blob/master/docs/reproducing_results.md).
 
 """
 
 EVALUATION_QUEUE_TEXT = """
-Follow the steps detailed in the [medics_ner](https://github.com/WadoodAbdul/medics_ner/blob/3b415e9c4c9561ce5168374813072bde36658ff4/docs/submit_to_leaderboard.md) repo to upload your model to the leaderboard.
+Follow the steps detailed in the [medics_ner](https://github.com/WadoodAbdul/medics_ner/blob/master/docs/submit_to_leaderboard.md) repo to upload your model to the leaderboard.
 """
 
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
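
The datasets linked in this commit live on the Hugging Face Hub, so they can be pulled directly with the `datasets` library. A minimal sketch using the standard `load_dataset` API; the split name and record layout are assumptions here, so check each dataset card for the actual schema:

```python
from datasets import load_dataset

# Repo id taken from the links added in this commit; the split name
# ("test") and the field layout are assumptions, not confirmed here.
ncbi = load_dataset("m42-health/m2_ncbi", split="test")
print(ncbi[0])  # inspect one record to see the annotation schema
```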
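The span-based framing mentioned under "Evaluation Metrics" can be made concrete with a small example. This is a minimal sketch of exact-match, offset-level F1, not the leaderboard's actual evaluation code, assuming entities are represented as `(start_char, end_char, label)` triples:

```python
def span_f1(gold: set[tuple[int, int, str]],
            pred: set[tuple[int, int, str]]) -> float:
    """Exact-match F1 over character-offset entity spans."""
    tp = len(gold & pred)  # spans matching on offsets and label
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if tp else 0.0

# Spans carry their own character offsets, so nested entities can
# coexist in one set without token-level BIO tagging conflicts.
gold = {(0, 27, "CONDITION"), (0, 16, "ANATOMY")}  # hypothetical labels
pred = {(0, 27, "CONDITION")}
print(span_f1(gold, pred))  # 2 * 1.0 * 0.5 / 1.5 ≈ 0.667
```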