jrazi committed on
Commit
3444dbc
1 Parent(s): 51bad3d

initial update to README.md

Files changed (1)
  1. README.md +70 -0
README.md CHANGED
@@ -1,3 +1,73 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+ language:
4
+ - fa
5
+ metrics:
6
+ - f1
7
+ - precision
8
+ - accuracy
9
+ library_name: transformers
10
+ pipeline_tag: text-classification
11
  ---
12
+
13
+ ---
14
+
15
+ # Persian Poem Classifier Based on ParsBERT
16
+
17
+ ## Model Description
18
+
19
+ The "Persian Poem Classifier" is a ParsBERT-based model fine-tuned to classify Persian poems. Given a piece of text, it evaluates three things: whether the text is poetic, whether it adheres to a valid poetic structure, and whether it captures the style of a specific poet.
20
+
21
+ ### Features
22
+
23
+ - **Multi-task Classification**: Determines whether the text is poetic, whether it is a valid poem, and whether it conforms to a certain poet's style.
+ - **Language Support**: Specialized for Persian-language text.
+ - **Fine-tuned on Persian Poetry**: Trained on a diverse dataset of Persian poems; accuracy ranges from 0.64 to 0.85 across the three tasks (see Evaluation Metrics).
26
+
27
+ ## Intended Use
28
+
29
+ This model is intended for researchers, poets, and NLP enthusiasts interested in the automated analysis of Persian poetry. Possible applications range from educational platforms to poetry-generation systems.
30
+
31
+ ## Limitations
32
+
33
+ - The model has been trained on a specific set of poets and may not generalize well to other styles.
+ - It assumes that the input text is in Persian and follows the poetic structures seen during training.
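Because the model assumes Persian input, it may help to screen text before classification. The following is a minimal sketch and not part of the model: `persian_script_ratio`, `looks_persian`, and the 0.8 threshold are hypothetical names and values chosen for illustration. The check only looks at the Unicode Arabic block (U+0600 to U+06FF), so it cannot tell Persian apart from Arabic or Urdu.

```python
def persian_script_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that fall in the
    Unicode Arabic block (U+0600 to U+06FF), which contains the Persian letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    in_block = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return in_block / len(letters)


def looks_persian(text: str, threshold: float = 0.8) -> bool:
    """Heuristic gate: True when most letters are in the Arabic block.
    The threshold is an assumption, not a value taken from the model."""
    return persian_script_ratio(text) >= threshold
```

Text that fails the check could be skipped or flagged instead of being passed to the classifier, since predictions on non-Persian input are unlikely to be meaningful.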
35
+
36
+ ## Installation & Usage
37
+
38
+ Install the Hugging Face `transformers` library along with PyTorch, which the example below relies on for `return_tensors="pt"`:
+
+ ```bash
+ pip install transformers torch
+ ```
43
+
44
+ To classify a poem, you can use the following code snippet:
45
+
46
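+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
+ tokenizer = AutoTokenizer.from_pretrained("jrazi/persian-poem-classifier")
+ model = AutoModelForSequenceClassification.from_pretrained("jrazi/persian-poem-classifier")
+ model.eval()  # disable dropout for inference
+
+ text = "Your Persian poem here"
+ # Truncate long inputs to the model's maximum sequence length
+ inputs = tokenizer(text, return_tensors="pt", truncation=True)
+
+ # Run a forward pass without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+
+ logits = outputs.logits  # raw, unnormalized scores, one per output label
+ ```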
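The snippet above stops at the raw model output. As a rough sketch of one way to turn it into per-task decisions, the code below applies a sigmoid to each logit and thresholds the result. It rests on assumptions that this README does not confirm: that the classification head emits exactly three logits, that they are ordered as in `TASKS`, and that each task is scored independently (multi-label). The `TASKS` names, `decode_logits`, and the 0.5 threshold are hypothetical; `model.config.id2label` should be checked for the actual label mapping.

```python
import math

# Assumed label order; verify against model.config.id2label before relying on it.
TASKS = ["is_poetic", "is_valid_poem", "has_poet_style"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_logits(logits, threshold: float = 0.5):
    """Map one logit per task to a (probability, decision) pair."""
    if len(logits) != len(TASKS):
        raise ValueError(f"expected {len(TASKS)} logits, got {len(logits)}")
    result = {}
    for name, logit in zip(TASKS, logits):
        prob = sigmoid(logit)
        result[name] = (prob, prob >= threshold)
    return result


# Illustrative logits only; these are not real model outputs.
example = decode_logits([2.0, 0.0, -1.5])
```

With the usage snippet above, the input would be `outputs.logits[0].tolist()`. If the head turns out to be a single softmax over mutually exclusive classes, a softmax followed by an argmax would be the appropriate decoding instead.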
57
+
58
+ ## Data Source
59
+
60
+ The model is fine-tuned on a curated dataset of Persian poems by various poets. The dataset contains multi-label annotations covering the poetic nature, structure, and style conformity of each text. Negative examples were drawn from publicly available Persian text corpora. In addition, we used data augmentation techniques to diversify the training data and help the model generalize better.
61
+
62
+ ## Evaluation Metrics
63
+
64
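+ The model has been evaluated using standard classification metrics such as accuracy, F1-score, and ROC AUC for each of the multi-task objectives. The table below reports F1-score, precision, and accuracy per task; ROC AUC values are not listed.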
65
+
66
+ | Metric    | Is Poetic | Is Valid Poem | Has Poet Style |
+ | --------- | --------- | ------------- | -------------- |
+ | F1        | 0.66      | 0.66          | 0.59           |
+ | Precision | 0.81      | 0.77          | 0.71           |
+ | Accuracy  | 0.85      | 0.84          | 0.64           |
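The table lists F1 and precision but not recall. If the reported F1 is the usual harmonic mean of precision and recall for a single positive class (an assumption; the relation does not hold for macro-averaged scores), recall can be recovered by solving F1 = 2PR / (P + R) for R. The helper name `implied_recall` is hypothetical, and the values it produces are derived estimates, not reported results.

```python
def implied_recall(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for this F1/precision pair")
    return f1 * precision / denom


# (F1, precision) pairs copied from the table above.
reported = {
    "Is Poetic": (0.66, 0.81),
    "Is Valid Poem": (0.66, 0.77),
    "Has Poet Style": (0.59, 0.71),
}

for task, (f1, precision) in reported.items():
    print(f"{task}: implied recall = {implied_recall(f1, precision):.3f}")
```

Under that assumption, the implied recall falls between roughly 0.50 and 0.58 for all three tasks, which suggests the model misses a sizable share of positive examples even where precision is comparatively high.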
71
+
72
+
73
+ ---