Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ The model is an improvement of the MiniCheck model proposed in the following pap
 
 The model takes as input a document and **a sentence** and determines whether the sentence is supported by the document: **MiniCheck-Model(document, claim) -> {0, 1}**
 
-**In order to fact-check a multi-sentence claim, the claim should first be broken up into sentences.** The document does not need to be chunked unless it exceeds `32K` tokens.
+**In order to fact-check a multi-sentence claim, the claim should first be broken up into sentences.** The document does not need to be chunked unless it exceeds `32K` tokens. Depending on use cases, adjusting chunk size may yield better performance.
 
 `Llama-3.1-Bespoke-MiniCheck-7B` is finetuned from `internlm/internlm2_5-7b-chat` ([Cai et al., 2024](https://arxiv.org/pdf/2403.17297))
 on the combination of 35K data points only:
@@ -68,7 +68,7 @@ claim_2 = "The students are on vacation."
 # model_name can be one of:
 # ['roberta-large', 'deberta-v3-large', 'flan-t5-large', 'Bespoke-MiniCheck-7B']
 scorer = MiniCheck(model_name='Bespoke-MiniCheck-7B', enable_prefix_caching=False, cache_dir='./ckpts')
-pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2])
+pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2]) # can set `chunk_size=your-specified-value` here, default to 32K chunk size.
 
 print(pred_label) # [1, 0]
 print(raw_prob) # [0.9840446675150499, 0.010986349594852094]
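The README text in this change says a multi-sentence claim should be broken into sentences before checking. A minimal sketch of that step might look like the following. Everything here is an illustrative assumption rather than part of the MiniCheck package: the helper names `split_into_sentences` and `check_claim`, the naive regex splitter, and the rule that a claim counts as supported only if every sentence is labeled 1.

```python
import re
from typing import Callable, List, Sequence, Tuple

# Assumed signature for illustration: takes parallel lists of documents and
# claims and returns one 0/1 label per (document, claim) pair.
LabelFn = Callable[[Sequence[str], Sequence[str]], List[int]]


def split_into_sentences(claim: str) -> List[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", claim.strip())
    return [p for p in parts if p]


def check_claim(
    document: str, claim: str, label_fn: LabelFn
) -> Tuple[bool, List[Tuple[str, int]]]:
    """Label each sentence of `claim` against `document`.

    Returns (supported, per_sentence), where `supported` is True only if the
    claim has at least one sentence and every sentence is labeled 1. This
    aggregation rule is an assumption, not something the README specifies.
    """
    sentences = split_into_sentences(claim)
    # Pair the same document with every sentence of the claim.
    labels = label_fn([document] * len(sentences), sentences)
    per_sentence = list(zip(sentences, labels))
    supported = bool(sentences) and all(label == 1 for label in labels)
    return supported, per_sentence
```

With the `scorer` from the README snippet, `label_fn` could plausibly be `lambda docs, claims: scorer.score(docs=docs, claims=claims)[0]`, since `score` returns the predicted labels as its first value. A regex splitter will mishandle abbreviations such as "Dr." or "e.g.", so a dedicated sentence tokenizer may be a better fit for real inputs.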