Update README.md
README.md
@@ -3,7 +3,9 @@ license: apache-2.0
 ---
 
 
-`clinitokenizer` is a sentence tokenizer for clinical text to split unstructured text from clinical text (such as Electronic Medical Records) into individual sentences.
+`clinitokenizer` is a sentence tokenizer for clinical text to split unstructured text from clinical text (such as Electronic Medical Records) into individual sentences.
+
+To use this model, see the [clinitokenizer repository](https://github.com/clinisift/clinitokenizer).
 
 General English sentence tokenizers are often unable to correctly parse medical abbreviations, jargon, and other conventions often used in medical records (see "Motivating Examples" section below). clinitokenizer is specifically trained on medical record data and can perform better in these situations (conversely, for non-domain specific use, using more general sentence tokenizers may yield better results).
 