joe32140 committed on
Commit
59fde6d
·
verified ·
1 Parent(s): 0a96982

add model card

---
license: apache-2.0
datasets:
- joe32140/chime-claim-category
language:
- en
metrics:
- f1
- recall
- precision
tags:
- medical
---

# Flan-T5 Large for Claim Category Classification from [CHIME](https://github.com/allenai/chime)

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) that classifies whether a given claim belongs to a specific category, as defined in the CHIME paper.

## Model description

The model is based on the Flan-T5 Large architecture and has been fine-tuned on a custom dataset for claim category classification. It takes a claim and a category as input and predicts whether the claim belongs to that category (1) or not (0).

## Intended uses & limitations

This model is designed for binary classification of claims into categories. It can be used to determine if a given claim belongs to a specific category. The model's performance may vary depending on the domain and complexity of the claims and categories.

## How to use

Here's how to use the model for prediction:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the model and tokenizer
model_name = "joe32140/flan-t5-large-claim-category"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Prepare the input
claim = "The Earth is flat."
category = "Astronomy"
prefix = "Please answer this question: Does the claim belong to the category?"
input_text = f"{prefix} Claim: {claim} Category: {category}"

# Tokenize and generate prediction
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=8)
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Convert prediction to integer (0 or 1)
result = int(prediction.strip())

print(f"Claim: {claim}")
print(f"Category: {category}")
print(f"Belongs to category: {result}")
```
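Note that the `int(prediction.strip())` cast assumes the model emits exactly `0` or `1`; if decoding ever yields stray tokens, it raises `ValueError`. A small defensive parser (a sketch, not part of the released code; `parse_prediction` is a hypothetical name) could guard against that:

```python
def parse_prediction(text: str) -> int:
    """Map a decoded model output to 0 or 1, tolerating whitespace and extra tokens."""
    stripped = text.strip()
    token = stripped.split()[0] if stripped else ""
    if token in {"1", "yes", "Yes"}:
        return 1
    if token in {"0", "no", "No"}:
        return 0
    raise ValueError(f"Unexpected model output: {text!r}")
```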

## Training data

The model was trained on a custom dataset containing claims and categories. The dataset is publicly available at [CHIME: claim/category](https://huggingface.co/datasets/joe32140/chime-claim-category).
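The card does not show the exact preprocessing, but a formatting step consistent with the inference prompt above would serialize each (claim, category, label) example like this (a sketch; `format_example` and `format_target` are hypothetical names):

```python
def format_example(claim: str, category: str) -> str:
    # Mirror the inference-time prompt so training and prediction inputs match.
    prefix = "Please answer this question: Does the claim belong to the category?"
    return f"{prefix} Claim: {claim} Category: {category}"

def format_target(label: int) -> str:
    # T5 is text-to-text, so the binary label becomes the string "0" or "1".
    return str(label)
```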

## Training procedure

The model was fine-tuned with the Seq2SeqTrainer from the Transformers library, using the following hyperparameters:
- Learning rate: 3e-4
- Batch size: 16
- Number of epochs: 2
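Assuming a standard Transformers setup, the hyperparameters listed above would map onto `Seq2SeqTrainingArguments` roughly as follows (a configuration sketch, not the author's actual script; `output_dir` is a placeholder and the dataset/collator wiring is omitted):

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters taken from the list above; everything else is left at defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-claim-category",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    num_train_epochs=2,
    predict_with_generate=True,  # decode generated text during evaluation
)
```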

## Limitations and bias

As with any machine learning model, this model may reflect biases present in its training data. Users should be aware of these potential biases and evaluate the model's performance on their specific use case.