ClinicalMetaScience committed on
Commit
1ef071e
1 Parent(s): 128988d

Update README.md

  1. README.md +31 -2
README.md CHANGED
@@ -23,8 +23,37 @@ Click 'Compute' to predict the class labels for an example abstract or an abstra
  The class label 'positive' corresponds to 'positive results only', while 'negative' represents 'mixed and negative results'.
 
  ## Using the model for larger data
- Use this [script](https://github.com/PsyCapsLock/PubBiasDetect/blob/main/Scripts/Predict_Example_Abstracts_using_NegativeResultDetector.ipynb)
- from our [GitHub repository](https://github.com/PsyCapsLock/PubBiasDetect) to analyze your own or our example data.
+ ```
+ from transformers import AutoTokenizer, Trainer, AutoModelForSequenceClassification
+ # 1. Load tokenizer
+ tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
+
+ # 2. Apply preprocess function to data (`dataset` is your own data, e.g. a datasets.DatasetDict; step 5 expects an "inference" split)
+ # Make sure your text column is named 'text'. Otherwise replace 'text' with the name of your text column.
+ def preprocess_function(examples):
+     return tokenizer(examples["text"],
+                      truncation=True,
+                      max_length=512,
+                      padding='max_length'
+                      )
+ tokenized_data = dataset.map(preprocess_function, batched=True)
+
+ # 3. Load model
+ NegativeResultDetector = AutoModelForSequenceClassification.from_pretrained("ClinicalMetaScience/NegativeResultDetector")
+
+ # 4. Initialize the trainer with the model and tokenizer
+ trainer = Trainer(
+     model=NegativeResultDetector,
+     tokenizer=tokenizer,
+ )
+
+ # 5. Apply NegativeResultDetector for prediction on inference data
+ predict_test = trainer.predict(tokenized_data["inference"])
+
+ ```
+
+ Further information on analyzing your own or our example data can be found in this [script](https://github.com/PsyCapsLock/PubBiasDetect/blob/main/Scripts/Predict_Example_Abstracts_using_NegativeResultDetector.ipynb)
+ from our [GitHub repository](https://github.com/PsyCapsLock/PubBiasDetect).
 
  ## Disclaimer
  This tool is developed to analyze and predict the prevalence of positive and negative results in scientific abstracts based on the SciBERT model. While publication bias is a plausible explanation for certain patterns of results observed in scientific literature, the analyses conducted by this tool do not conclusively establish the presence of publication bias or any other underlying factors. It's essential to understand that this tool evaluates data but does not delve into the underlying reasons for the observed trends.
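
The snippet added to the README ends at `trainer.predict`, which returns raw logits rather than the 'positive'/'negative' class labels the README describes. Below is a minimal sketch of that last step, assuming the prediction object exposes one row of logits per abstract in `predict_test.predictions`. The helper `logits_to_labels` and the index order in `ID2LABEL` are hypothetical and not taken from the repository; the real mapping should be checked against `NegativeResultDetector.config.id2label`.

```python
import math

# Hypothetical index-to-label mapping (an assumption, not confirmed by the README).
# Verify it against NegativeResultDetector.config.id2label before relying on it.
ID2LABEL = {0: "negative", 1: "positive"}


def logits_to_labels(logits, id2label=ID2LABEL):
    """Return one (label, probability) pair per row of logits.

    Applies a numerically stable softmax to each row and picks the
    class with the highest probability.
    """
    results = []
    for row in logits:
        row = [float(x) for x in row]
        shift = max(row)  # subtract the max so exp() cannot overflow
        exps = [math.exp(x - shift) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results
```

With the README snippet, this would be called as `logits_to_labels(predict_test.predictions)`. The `dataset` object that the snippet expects could be built, for example, with `DatasetDict({"inference": Dataset.from_dict({"text": abstracts})})` from the `datasets` library, where `abstracts` is a list of strings.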