NHLOCAL committed on
Commit
58d4ef5
·
1 Parent(s): 3957c60

first create

LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2023 NHLOCAL
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,13 +1,36 @@
- ---
- title: Trying1
- emoji: 🐢
- colorFrom: blue
- colorTo: pink
- sdk: gradio
- sdk_version: 4.21.0
- app_file: app.py
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # is this a bible?
+ An AI model that detects whether a given verse is from the Bible or not.
+
+ The model reaches a very high recognition level for Hebrew text.
+ The complete dataset on which the model was trained is stored in the `bible_data.csv` file.
+
+ You can easily try the model by downloading the release file from here: https://github.com/NHLOCAL/is-this-bible/releases/download/v1.0/is-this-bible.zip
+
+ **To run the model, install the following libraries using pip**:
+
+ `nltk`, `joblib`.
+
+ -----
+
+ **Example:**
+
+ Negative input:
+ ```shell
+ try_model.py "בגיטהאב ניתן להעלות מערכות קוד פתוח"
+ ```
+ Output:
+
+
+ ```shell
+ Text: בגיטהאב ניתן להעלות מערכות קוד פתוח | Prediction: Other | Confidence Score: 0.0340
+ ```
+ Positive input:
+
+ ```shell
+ try_model.py "עניה סערה לא נחמה הנה אנכי מרביץ בפוך אבניך"
+ ```
+ Output:
+
+ ```shell
+ Text: עניה סערה לא נחמה הנה אנכי מרביץ בפוך אבניך ויסדתיך בספירים | Prediction: Bible | Confidence Score: 1.0000
+ ```
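The output lines in the README example share one layout: the input text, a predicted label, and a score printed to four decimals. The sketch below reproduces that layout only. `format_prediction` and the 0.5 threshold are hypothetical, since `try_model.py` is not part of this diff, and the score is assumed to be the model's probability for the Bible class.

```python
def format_prediction(text: str, bible_probability: float, threshold: float = 0.5) -> str:
    """Render one result line in the layout shown in the README example.

    The threshold is an assumption; the real script may decide differently.
    """
    label = "Bible" if bible_probability >= threshold else "Other"
    return f"Text: {text} | Prediction: {label} | Confidence Score: {bible_probability:.4f}"


if __name__ == "__main__":
    print(format_prediction("example sentence", 0.034))
    print(format_prediction("another sentence", 1.0))
```

Keeping the formatting in one small function makes it easy to reuse the same line layout for batch runs.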
data_creation/bible_data.csv ADDED
The diff for this file is too large to render. See raw diff
 
data_creation/bible_talmud_data.csv ADDED
The diff for this file is too large to render. See raw diff
 
data_creation/collect_talmud_data.py ADDED
@@ -0,0 +1,69 @@
+ # -*- coding: utf-8 -*-
+ import re
+
+ def divide_into_sentences(text):
+     text = remove_headers(text)
+     # Split the text into words
+     words = text.split()
+     # Group every 12 consecutive words into one chunk (punctuation is ignored)
+     sentences = [' '.join(words[i:i+12]) for i in range(0, len(words), 12)]
+     # Remove empty chunks
+     sentences = [sentence.strip() for sentence in sentences if sentence.strip()]
+     # A non-empty input ends with a chunk of fewer than 12 words unless
+     # its word count is a multiple of 12
+     return sentences
+
+ def remove_headers(text):
+     pattern = r"(דף )(.*?)( גמרא )"
+     text = re.sub(pattern, "", text)
+     pattern = r"(דף )(.*?)( משנה )"
+     result = re.sub(pattern, "", text)
+     return result
+
+
+ def write_sentences_to_file(sentences, output_filename):
+     modified_sentences = [f"{sentence},2" for sentence in sentences]
+     with open(output_filename, 'a', encoding='utf-8') as f:
+         f.write('\n'.join(modified_sentences) + '\n')  # trailing newline keeps appended runs on separate rows
+
+ # Example passage
+ example_passage = """
+ ื“ืฃ ืžื—,ื ื’ืžืจื ืืข"ืค ืฉืื™ ืืคืฉืจ ื•ื”ืœื ื‘ื ื‘ื ืœืจ' ืžืื™ืจ ืืข"ืค ืฉืื™ ืืคืฉืจ ืœืจื‘ื ืŸ ื•ืœืชื ื™ ื‘ื ื”ืขืœื™ื•ืŸ ืจ"ืž ืื•ืžืจ ืœื ื—ื•ืœืฆืช ื•ืœื ืžืชื™ื‘ืžืช ื•ื—ื›"ื ืื• ื—ื•ืœืฆืช ืื• ืžืชื™ื‘ืžืช ื•ืื ื ื™ื“ืขื ื ืžืฉื•ื ื“ืื™ ืืคืฉืจ ื”ื•ื ืื™ ืœื ืชื ื ืืข"ืค ืฉืื™ ืืคืฉืจ ื”ื•ื” ืืžื™ื ื ืจื•ื‘ ื ืฉื™ื ืชื—ืชื•ืŸ ืืชื™ ื‘ืจื™ืฉื ื•ืžื™ืขื•ื˜ ืขืœื™ื•ืŸ ืืชื™ ื‘ืจื™ืฉื ื•ืจื‘ื™ ืžืื™ืจ ืœื˜ืขืžื™ื” ื“ื—ื™ื™ืฉ ืœืžื™ืขื•ื˜ื ื•ืจื‘ื ืŸ ืœื˜ืขืžื™ื™ื”ื• ื“ืœื ื—ื™ื™ืฉื™ ืœืžื™ืขื•ื˜ื ื•ื”ื ื™ ืžื™ืœื™ ื‘ืกืชืžื ืื‘ืœ ื”ื™ื›ื ื“ื‘ื“ืงืŸ ื•ืœื ืืฉื›ื—ืŸ ืื™ืžืจ ืžื•ื“ื• ืœื™ื” ืจื‘ื ืŸ ืœืจ"ืž ื“ืขืœื™ื•ืŸ ืงื“ื™ื ืงืž"ืœ ื“ืื™ ืืคืฉืจ ื•ื“ืื™ ืืชื™ ื•ืžื ืชืจ ื”ื•ื ื“ื ืชืจ ื‘ืฉืœืžื ืœืจ"ืž ื”ื™ื™ื ื• ื“ื›ืชื™ื‘ (ื™ื—ื–ืงืืœ ื˜ื–) ืฉื“ื™ื ื ื›ื•ื ื• ื•ืฉืขืจืš ืฆืžื— ืืœื ืœืจื‘ื ืŸ ืื™ืคื›ื ืžื‘ืขื™ ืœื™ื” ื”"ืง ื›ื™ื•ืŸ ืฉืฉื“ื™ื ื ื›ื•ื ื• ื‘ื™ื“ื•ืข ืฉืฉืขืจืš ืฆืžื— ื‘ืฉืœืžื ืœืจ"ืž ื”ื™ื™ื ื• ื“ื›ืชื™ื‘ (ื™ื—ื–ืงืืœ ื›ื’) ื‘ืขืฉื•ืช ืžืžืฆืจื™ื ื“ื“ื™ืš ืœืžืขืŸ ืฉื“ื™ ื ืขื•ืจื™ืš ืืœื ืœืจื‘ื ืŸ ืื™ืคื›ื ืžื‘ืขื™ ืœื™ื” ื”"ืง ื›ื™ื•ืŸ ืฉื‘ืื• ื“ื“ื™ืš ื‘ื™ื“ื•ืข ืฉื‘ืื• ื ืขื•ืจื™ืš ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืื™ ืฉื“ื™ ื›ื•ืœื” ื‘ื“ื“ื™ ื›ืชื™ื‘ ื•ื”"ืง ื”ืงื‘"ื” ืœื™ืฉืจืืœ
+
+ ื“ืฃ ืžื—,ื‘ ื’ืžืจื ืื™ื›ืจืคื• ื“ื“ื™ืš ืœื ื”ื“ืจืช ื‘ืš ืื™ืฉืชื“ื• ื“ื“ื™ืš ื ืžื™ ืœื ื”ื“ืจืช ื‘ืš ื“ื›ื•ืœื™ ืขืœืžื ืžื™ื”ื ืืชื—ืชื•ืŸ ืกืžื›ื™ื ืŸ ืžื ืœืŸ ืืžืจ ืจื‘ ื™ื”ื•ื“ื” ืืžืจ ืจื‘ ื•ื›ืŸ ืชื ื ื“ื‘ื™ ืจ' ื™ืฉืžืขืืœ ืืžืจ ืงืจื (ื‘ืžื“ื‘ืจ ื”) ืื™ืฉ ืื• ืืฉื” ื›ื™ ื™ืขืฉื• ืžื›ืœ ื—ื˜ืื•ืช ื”ืื“ื ื”ืฉื•ื” ื”ื›ืชื•ื‘ ืืฉื” ืœืื™ืฉ ืœื›ืœ ืขื•ื ืฉื™ืŸ ืฉื‘ืชื•ืจื” ืžื” ืื™ืฉ ื‘ืกื™ืžืŸ ืื—ื“ ืืฃ ืืฉื” ื‘ืกื™ืžืŸ ืื—ื“ ื•ืื™ืžื ืื• ื”ืื™ ืื• ื”ืื™ ื›ืื™ืฉ ืžื” ืื™ืฉ ืชื—ืชื•ืŸ ื•ืœื ืขืœื™ื•ืŸ ืืฃ ืืฉื” ืชื—ืชื•ืŸ ื•ืœื ืขืœื™ื•ืŸ ืชื ื™ื ื ืžื™ ื”ื›ื™ ื"ืจ ืืœื™ืขื–ืจ ื‘ืจ' ืฆื“ื•ืง ื›ืš ื”ื™ื• ืžืคืจืฉื™ืŸ ื‘ื™ื‘ื ื” ื•ืืžืจื• ื›ื™ื•ืŸ ืฉื‘ื ืชื—ืชื•ืŸ ืฉื•ื‘ ืื™ืŸ ืžืฉื’ื™ื—ื™ืŸ ืขืœ ืขืœื™ื•ืŸ ืชื ื™ื ืจืฉื‘"ื’ ืื•ืžืจ ื‘ื ื•ืช ื›ืจื›ื™ื ืชื—ืชื•ืŸ ืžืžื”ืจ ืœื‘ื ืžืคื ื™ ืฉืจื’ื™ืœื•ืช ื‘ืžืจื—ืฆืื•ืช ื‘ื ื•ืช ื›ืคืจื™ื ืขืœื™ื•ืŸ ืžืžื”ืจ ืœื‘ื ืžืคื ื™ ืฉื˜ื•ื—ื ื•ืช ื‘ืจื—ื™ื ืจ"ืฉ ื‘ืŸ ืืœืขื–ืจ ืื•ืžืจ ื‘ื ื•ืช ืขืฉื™ืจื™ื ืฆื“ ื™ืžื™ืŸ ืžืžื”ืจ ืœื‘ื ืฉื ื™ืฉื•ืฃ ื‘ืืคืงืจื™ืกื•ืชืŸ ื‘ื ื•ืช ืขื ื™ื™ื ืฆื“ ืฉืžืืœ ืžืžื”ืจ ืœื‘ื ืžืคื ื™ ืฉืฉื•ืื‘ื•ืช ื›ื“ื™ ืžื™ื ืขืœื™ื”ืŸ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืคื ื™ ืฉื ื•ืฉืื™ืŸ ืื—ื™ื”ืŸ ืขืœ ื’ืกืกื™ื”ืŸ ืช"ืจ ืฆื“ ืฉืžืืœ ืงื•ื“ื ืœืฆื“ ื™ืžื™ืŸ ืจื‘ื™ ื—ื ื™ื ื ื‘ืŸ ืื—ื™ ืจ' ื™ื”ื•ืฉืข ืื•ืžืจ ืžืขื•ืœื ืœื ืงื“ื ืฆื“ ืฉืžืืœ ืœืฆื“ ื™ืžื™ืŸ ื—ื•ืฅ ืžืื—ืช ืฉื”ื™ืชื” ื‘ืฉื›ื•ื ืชื™ ืฉืงื“ื ืฆื“ ืฉืžืืœ ืœืฆื“ ื™ืžื™ืŸ ื•ื—ื–ืจ ืœืื™ืชื ื• ืช"ืจ ื›ืœ ื”ื ื‘ื“ืงื•ืช ื ื‘ื“ืงื•ืช ืขืœ ืคื™ ื ืฉื™ื ื•ื›ืŸ ื”ื™ื” ืจื‘ื™ ืืœื™ืขื–ืจ ืžื•ืกืจ ืœืืฉืชื• ื•ืจื‘ื™ ื™ืฉืžืขืืœ ืžื•ืกืจ ืœืืžื• ืจื‘ื™ ื™ื”ื•ื“ื” ืื•ืžืจ ืœืคื ื™ ื”ืคืจืง ื•ืœืื—ืจ ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืชื•ืš ื”ืคืจืง ืื™ืŸ ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืฉืื™ืŸ ืžืฉื™ืื™ืŸ ืกืคืงื•ืช ืขืœ ืคื™ ื ืฉื™ื ืจ"ืฉ ืื•ืžืจ ืืฃ ืชื•ืš ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ื•ื ืืžื ืช ืืฉื” ืœื”ื—ืžื™ืจ ืื‘ืœ ืœื ืœื”ืงืœ 
ื›ื™ืฆื“ ื’ื“ื•ืœื” ื”ื™ื ืฉืœื ืชืžืืŸ ืงื˜ื ื” ื”ื™ื ืฉืœื ืชื—ืœื•ืฅ ืื‘ืœ ืื™ืŸ ื ืืžื ืช ืœื•ืžืจ ืงื˜ื ื” ื”ื™ื ืฉืชืžืืŸ ื•ื’ื“ื•ืœื” ื”ื™ื ืฉืชื—ืœื•ืฅ ืืžืจ ืžืจ ืจื‘ื™ ื™ื”ื•ื“ื” ืื•ืžืจ ืœืคื ื™ ื”ืคืจืง ื•ืœืื—ืจ ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ื‘ืฉืœืžื ืœืคื ื™ ื”ืคืจืง ื‘ืขื™ ื‘ื“ื™ืงื” ื“ืื™ ืžืฉืชื›ื—ื™ ืœืื—ืจ ื”ืคืจืง ืฉื•ืžื ื ื™ื ื”ื• ืืœื ืœืื—ืจ ื”ืคืจืง ืœืžื” ืœื™ ื‘ื“ื™ืงื” ื•ื”ืืžืจ ืจื‘ื ืงื˜ื ื” ืฉื”ื’ื™ืขื” ืœื›ืœืœ ืฉื ื•ืชื™ื” ืื™ื ื” ืฆืจื™ื›ื” ื‘ื“ื™ืงื” ื—ื–ืงื” ื”ื‘ื™ืื” ืกื™ืžื ื™ืŸ ื›ื™ ืืžืจ ืจื‘ื ื—ื–ืงื” ืœืžื™ืื•ืŸ ืื‘ืœ ืœื—ืœื™ืฆื” ื‘ืขื™ื ื‘ื“ื™ืงื” ืชื•ืš ื”ืคืจืง ืื™ืŸ ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืงืกื‘ืจ ืชื•ืš ื”ืคืจืง ื›ืœืื—ืจ ื”๏ฟฝ๏ฟฝืจืง <ื“ืžื™> ื•ืœืื—ืจ ื”ืคืจืง ื“ืื™ื›ื ื—ื–ืงื” ื“ืจื‘ื ืกืžื›ื™ื ืŸ ืื ืฉื™ื ื•ื‘ื“ืงื™ ืชื•ืš ื”ืคืจืง ื“ืœื™ื›ื ื—ื–ืงื” ื“ืจื‘ื ืœื ืกืžื›ื™ื ืŸ ืื ืฉื™ื ื•ืœื ื‘ื“ืงื™ ื ืฉื™ื ืจ"ืฉ ืื•ืžืจ ืืฃ ืชื•ืš ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืงืกื‘ืจ ืชื•ืš ื”ืคืจืง ื›ืœืคื ื™ ื”ืคืจืง ื•ื‘ืขื™ื ื‘ื“ื™ืงื” ื“ืื™ ืžืฉืชื›ื—ื™ ืœืื—ืจ ื”ืคืจืง ืฉื•ืžื ื ื™ื ื”ื• ื•ื ืืžื ืช ืืฉื” ืœื”ื—ืžื™ืจ ืื‘ืœ ืœื ืœื”ืงืœ ื”ืื™ ืžืืŸ ืงืชื ื™ ืœื” ืื™ื‘ืขื™ืช ืื™ืžื ืจื‘ื™ ื™ื”ื•ื“ื” ื•ืืชื•ืš ื”ืคืจืง
+
+ ื“ืฃ ืžื˜,ื ื’ืžืจื ื•ืื™ื‘ืขื™ืช ืื™ืžื ืจื‘ื™ ืฉืžืขื•ืŸ ื•ืœืื—ืจ ื”ืคืจืง ื•ืœื™ืช ืœื™ื” ื—ื–ืงื” ื“ืจื‘ื: ืžืคื ื™ ืฉืืžืจื• ืืคืฉืจ ื›ื•': ื”ื ืชื• ืœืžื” ืœื™ ื”ื ืชื ื ืœื™ื” ืจื™ืฉื ื•ื›ื™ ืชื™ืžื ืžืฉื•ื ื“ืงื ื‘ืขื™ ืœืžืกืชืžื” ื›ืจื‘ื ืŸ ืคืฉื™ื˜ื ื™ื—ื™ื“ ื•ืจื‘ื™ื ื”ืœื›ื” ื›ืจื‘ื™ื ืžื”ื• ื“ืชื™ืžื ืžืกืชื‘ืจื ื˜ืขืžื ื“ืจ"ืž ื“ืงื ืžืกื™ื™ืข ืœื™ื” ืงืจืื™ ืงืž"ืœ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืฉื•ื ื“ืงื ื‘ืขื™ ืœืžืชื ื™ ื›ื™ื•ืฆื ื‘ื•:
+
+ ื“ืฃ ืžื˜,ื ืžืฉื ื” ื›ื™ื•ืฆื ื‘ื• ื›ืœ ื›ืœื™ ื—ืจืก ืฉื”ื•ื ืžื›ื ื™ืก ืžื•ืฆื™ื ื•ื™ืฉ ืฉืžื•ืฆื™ื ื•ืื™ื ื• ืžื›ื ื™ืก ื›ืœ ืื‘ืจ ืฉื™ืฉ ื‘ื• ืฆืคื•ืจืŸ ื™ืฉ ื‘ื• ืขืฆื ื•ื™ืฉ ืฉื™ืฉ ื‘ื• ืขืฆื ื•ืื™ืŸ ื‘ื• ืฆืคื•ืจืŸ ื›ืœ ื”ืžื˜ืžื ืžื“ืจืก ืžื˜ืžื ื˜ืžื ืžืช ื•ื™ืฉ ืฉืžื˜ืžื ื˜ืžื ืžืช ื•ืื™ื ื• ืžื˜ืžื ืžื“ืจืก:
+
+ ื“ืฃ ืžื˜,ื ื’ืžืจื ืžื›ื ื™ืก ืคืกื•ืœ ืœืžื™ ื—ื˜ืืช ื•ืคืกื•ืœ ืžืฉื•ื ื’ืกื˜ืจื ืžื•ืฆื™ื ื›ืฉืจ ืœืžื™ ื—ื˜ืืช ื•ืคืกื•ืœ ืžืฉื•ื ื’ืกื˜ืจื ืืžืจ ืจื‘ ืืกื™ ืฉื•ื ื™ืŸ ื›ืœื™ ื—ืจืก ืฉื™ืขื•ืจื• ื‘ื›ื•ื ืก ืžืฉืงื” ื•ืœื ืืžืจื• ืžื•ืฆื™ื ืžืฉืงื” ืืœื ืœืขื ื™ืŸ ื’ืกื˜ืจื ื‘ืœื‘ื“ ืžืื™ ื˜ืขืžื ืืžืจ ืžืจ ื–ื•ื˜ืจื ื‘ืจื™ื” ื“ืจื‘ ื ื—ืžืŸ ืœืคื™ ืฉืื™ืŸ ืื•ืžืจื™ื ื”ื‘ื ื’ืกื˜ืจื ืœื’ืกื˜ืจื ืชื ื• ืจื‘ื ืŸ ื›ื™ืฆื“ ื‘ื•ื“ืงื™ืŸ ื›ืœื™ ื—ืจืก ืœื™ื“ืข ืื ื ื™ืงื‘ ื‘ื›ื•ื ืก ืžืฉืงื” ืื ืœืื• ื™ื‘ื™ื ืขืจื™ื‘ื” ืžืœืื” ืžื™ื ื•ื ื•ืชืŸ ืงื“ืจื” ืœืชื•ื›ื” ืื ื›ื ืกื” ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื”
+
+ ื“ืฃ ืžื˜,ื‘ ื’ืžืจื ืจื‘ื™ ื™ื”ื•ื“ื” ืื•ืžืจ ื›ื•ืคืฃ ืื–ื ื™ ืงื“ืจื” ืœืชื•ื›ื” ื•ืžืฆื™ืฃ ืขืœื™ื” ืžื™ื ื•ืื ื›ื•ื ืก ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื” ืื• ืฉื•ืคืชื” ืขืœ ื’ื‘ื™ ื”ืื•ืจ ืื ื”ืื•ืจ ืžืขืžื™ื“ื” ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื›ื ื™ืก ืžืฉืงื” ืจ' ื™ื•ืกื™ ืื•ืžืจ ืืฃ ืœื ืฉื•ืคืชื” ืขืœ ื’ื‘ื™ ื”ืื•ืจ ืžืคื ื™ ืฉื”ืื•ืจ ืžืขืžื™ื“ื” ืืœื ืฉื•ืคืชื” ืขืœ ื’ื‘ื™ ื”ืจืžืฅ ืื ืจืžืฅ ืžืขืžื™ื“ื” ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ื”ื™ื” ื˜ื•ืจื“ ื˜ื™ืคื” ืื—ืจ ื˜ื™ืคื” ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ืžืื™ ืื™ื›ื ื‘ื™ืŸ ืช"ืง ืœืจ' ื™ื”ื•ื“ื” ืืžืจ ืขื•ืœื ื›ื™ื ื•ืก ืขืœ ื™ื“ื™ ื”ื“ื—ืง ืื™ื›ื ื‘ื™ื ื™ื™ื”ื•: ื›ืœ ืื‘ืจ ืฉื™ืฉ ื‘ื• ืฆืคื•ืจืŸ ื•ื›ื•': ื™ืฉ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ื‘ืื”ืœ ื™ืฉ ื‘ื• ืขืฆื ื•ืื™ืŸ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ืื™ื ื• ืžื˜ืžื ื‘ืื”ืœ ืืžืจ ืจื‘ ื—ืกื“ื ื“ื‘ืจ ื–ื” ืจื‘ื™ื ื• ื”ื’ื“ื•ืœ ืืžืจื• ื”ืžืงื•ื ื™ื”ื™ื” ื‘ืขื–ืจื• ืืฆื‘ืข ื™ืชืจื” ืฉื™ืฉ ื‘ื• ืขืฆื ื•ืื™ืŸ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ืื™ื ื• ืžื˜ืžื ื‘ืื”ืœ ืืžืจ ืจื‘ื” ื‘ืจ ื‘ืจ ื—ื ื” ื"ืจ ื™ื•ื—ื ืŸ ื•ื›ืฉืื™ื ื” ื ืกืคืจืช ืขืœ ื’ื‘ ื”ื™ื“: ื›ืœ ื”ืžื˜ืžื ืžื“ืจืก ื•ื›ื•': ื›ืœ ื“ื—ื–ื™ ืœืžื“ืจืก ืžื˜ืžื ื˜ืžื ืžืช ื•ื™ืฉ ืฉืžื˜ืžื ื˜ืžื ืžืช ื•ืื™ืŸ ืžื˜ืžื ืžื“ืจืก ืœืืชื•ื™ื™ ืžืื™ ืœืืชื•ื™ื™ ืกืื” ื•ืชืจืงื‘ ื“ืชื ื™ื (ื•ื™ืงืจื ื˜ื•) ื•ื”ื™ื•ืฉื‘ ืขืœ ื”ื›ืœื™ ื™ื›ื•ืœ ื›ืคื” ืกืื” ื•ื™ืฉื‘ ืขืœื™ื” ืื• ืชืจืงื‘ ื•ื™ืฉื‘ ืขืœื™ื• ื™ื”ื ื˜ืžื ืช"ืœ (ื•ื™ืงืจื ื˜ื•) ืืฉืจ ื™ืฉื‘ ืขืœื™ื• ื”ื–ื‘ ืžื™ ืฉืžื™ื•ื—ื“ ืœื™ืฉื™ื‘ื” ื™ืฆื ื–ื” ืฉืื•ืžืจื™ื ืœื• ืขืžื•ื“ ื•ื ืขืฉื” ืžืœืื›ืชื ื•:
+
+ ื“ืฃ ืžื˜,ื‘ ืžืฉื ื” ื›ืœ ื”ืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช ืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื•ื™ืฉ ืฉืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื•ืื™ื ื• ืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช:
+
+ ื“ืฃ ืžื˜,ื‘ ื’ืžืจื ืืžืจ ืจื‘ ื™ื”ื•ื“ื” ืœืืชื•ื™ื™ ืžืžื–ืจ ืชื ื™ื ื ื—ื“ื ื–ื™ืžื ื ื”ื›ืœ ื›ืฉืจื™ืŸ ืœื“ื•ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื•ืื™ืŸ ื”ื›ืœ ื›ืฉืจื™ืŸ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช ื•ื”ื•ื™ื ืŸ ื‘ื” ืœืืชื•ื™ื™ ืžืื™ ื•ืืžืจ ืจื‘ ื™ื”ื•ื“ื” ืœืืชื•ื™ื™ ืžืžื–ืจ ื—ื“ื ืœืืชื•ื™ื™ ื’ืจ ื•ื—ื“ื ืœืืชื•ื™ื™ ืžืžื–ืจ ื•ืฆืจื™ื›ื™ ื“ืื™ ืืฉืžืขื™ื ืŸ ื’ืจ ืžืฉื•ื ื“ืจืื•ื™ ืœื‘ื ื‘ืงื”ืœ ืื‘ืœ ืžืžื–ืจ ื“ืื™ืŸ ืจืื•ื™ ืœื‘ื ื‘ืงื”ืœ ืื™ืžื ืœื ื•ืื™ ืืฉืžืขื™ื ืŸ ืžืžื–ืจ ืžืฉื•ื ื“ืงืืชื™ ืžื˜ืคื” ื›ืฉืจื” ืื‘ืœ ื’ืจ ื“ืงืืชื™ ืžื˜ืคื” ืคืกื•ืœื” ืื™ืžื ืœื ืฆืจื™ื›ื:
+
+ ื“ืฃ ืžื˜,ื‘ ืžืฉื ื” ื›ืœ ื”ื›ืฉืจ ืœื“ื•ืŸ ื›ืฉืจ ืœื”ืขื™ื“ ื•ื™ืฉ ืฉื›ืฉืจ ืœื”ืขื™ื“ ื•ืื™ื ื• ื›ืฉืจ ืœื“ื•ืŸ:
+
+ ื“ืฃ ืžื˜,ื‘ ื’ืžืจื ืœืืชื•ื™ื™ ืžืื™ ื"ืจ ื™ื•ื—ื ืŸ ืœืืชื•ื™ื™ ืกื•ืžื ื‘ืื—ืช ืžืขื™ื ื™ื• ื•ืžื ื™
+
+ ื“ืฃ ื ,ื ื’ืžืจื ืจื‘ื™ ืžืื™ืจ ื”ื™ื ื“ืชื ื™ื ื”ื™ื” ืจื‘ื™ ืžืื™ืจ ืื•ืžืจ ืžื” ืช"ืœ (ื“ื‘ืจื™ื ื›ื) ืขืœ ืคื™ื”ื ื™ื”ื™ื” ื›ืœ ืจื™ื‘ ื•ื›ืœ ื ื’ืข ื•ื›ื™ ืžื” ืขื ื™ืŸ ืจื™ื‘ื™๏ฟฝ๏ฟฝ ืืฆืœ ื ื’ืขื™ื ืžืงื™ืฉ ืจื™ื‘ื™ื ืœื ื’ืขื™ื ืžื” ื ื’ืขื™ื ื‘ื™ื•ื ื“ื›ืชื™ื‘ (ื•ื™ืงืจื ื™ื’) ื•ื‘ื™ื•ื ื”ืจืื•ืช ื‘ื• ืืฃ ืจื™ื‘ื™ื ื‘ื™ื•ื ื•ืžื” ื ื’ืขื™ื ืฉืœื ื‘ืกื•ืžื ื“ื›ืชื™ื‘ (ื•ื™ืงืจื ื™ื’) ืœื›ืœ ืžืจืื” ืขื™ื ื™ ื”ื›ื”ืŸ ืืฃ ืจื™ื‘ื™ื ืฉืœื ื‘ืกื•ืžื ื•ืžืงื™ืฉ ื ื’ืขื™ื ืœืจื™ื‘ื™ื ืžื” ืจื™ื‘ื™ื ืฉืœื ื‘ืงืจื•ื‘ื™ื ืืฃ ื ื’ืขื™ื ืฉืœื ื‘ืงืจื•ื‘ื™ื ืื™ ืžื” ืจื™ื‘ื™ื ื‘ืฉืœืฉื” ืืฃ ื ื’ืขื™ื ื‘ืฉืœืฉื” ื•ื“ื™ืŸ ื”ื•ื ืžืžื•ื ื• ื‘ืฉืœืฉื” ื’ื•ืคื• ืœื ื›"ืฉ ืช"ืœ (ื•ื™ืงืจื ื™ื’) ื•ื”ื•ื‘ื ืืœ ืื”ืจืŸ ื”ื›ื”ืŸ ืื• ืืœ ืื—ื“ ืžื‘ื ื™ื• ื”ื›ื”ื ื™ื ื”ื ืœืžื“ืช ืฉืืคื™ืœื• ื›ื”ืŸ ืื—ื“ ืจื•ืื” ืืช ื”ื ื’ืขื™ื ื”ื”ื•ื ืกืžื™ื ื“ื”ื•ื” ื‘ืฉื‘ื‘ื•ืชื™ื” ื“ืจื‘ื™ ื™ื•ื—ื ืŸ ื“ื”ื•ื” ืงื“ื™ื™ืŸ ื“ื™ื ื ื•ืœื ืงืืžืจ ืœื™ื” ื•ืœื ืžื™ื“ื™ ื”ื™ื›ื™ ืขื‘ื™ื“ ื”ื›ื™ ื•ื”ืืžืจ ืจื‘ื™ ื™ื•ื—ื ืŸ ื”ืœื›ื” ื›ืกืชื ืžืฉื ื” ื•ืชื ืŸ ื›ืœ ื”ื›ืฉืจ ืœื“ื•ืŸ ื›ืฉืจ ืœื”ืขื™ื“ ื•ื™ืฉ ื›ืฉืจ ืœื”ืขื™ื“ ื•ืื™ืŸ ื›ืฉืจ ืœื“ื•ืŸ ื•ืืžืจื™ื ืŸ ืœืืชื•ื™ื™ ืžืื™ ื•ืืžืจ ืจื‘ื™ ื™ื•ื—ื ืŸ ืœืืชื•ื™ื™ ืกื•ืžื ื‘ืื—ืช ืžืขื™ื ื™ื• ืจื‘ื™ ื™ื•ื—ื ืŸ ืกืชืžื ืื—ืจื™ื ื ืืฉื›ื— ื“ืชื ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื“ื ื™ืŸ ื‘ื™ื•ื ื•ื’ื•ืžืจื™ืŸ ื‘ืœื™ืœื” ื•ืžืื™ ืื•ืœืžื™ื” ื“ื”ืื™ ืกืชืžื ืžื”ืื™ ืกืชืžื ืื™ื‘ืขื™ืช ืื™ืžื ืกืชืžื ื“ืจื‘ื™ื ืขื“ื™ืฃ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืฉื•ื ื“ืงืชื ื™ ืœื” ื’ื‘ื™ ื”ืœื›ืชื ื“ื“ื™ื ื™:
+
+ ื“ืฃ ื ,ื ืžืฉื ื” ื›ืœ ืฉื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ืžื˜ืžื ื˜ื•ืžืืช ืื•ื›ืœื™ืŸ ื•ื™ืฉ ืฉืžื˜ืžื ื˜ื•ืžืืช ืื•ื›ืœื™ืŸ ื•ืื™ื ื• ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช:
+
+ ื“ืฃ ื ,ื ื’ืžืจื ืœืืชื•ื™ื™ ืžืื™ ืœืืชื•ื™ื™ ื‘ืฉืจ ื•ื“ื’ื™ื ื•ื‘ื™ืฆื™ื:
+
+ ื“ืฃ ื ,ื ืžืฉื ื” ื›ืœ ืฉื—ื™ื™ื‘ ื‘ืคืื” ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ื•ื™ืฉ ืฉื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ื•ืื™ื ื• ื—ื™ื™ื‘ ื‘ืคืื”:
+
+ ื“ืฃ ื ,ื ื’ืžืจื ืœืืชื•ื™ื™ ืžืื™ ืœืืชื•ื™ื™ ืชืื ื” ื•ื™ืจืง ืฉืื™ื ื• ื—ื™ื™ื‘ ื‘ืคืื” ื“ืชื ืŸ ื›ืœืœ ืืžืจื• ื‘ืคืื” ื›ืœ ืฉื”ื•ื ืื•ื›ืœ ื•ื ืฉืžืจ ื•ื’ื™ื“ื•ืœื• ืžืŸ ื”ืืจืฅ ื•ืœืงื™ื˜ืชื• ื›ืื—ื“ ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื ื—ื™ื™ื‘ ื‘ืคืื” ืื•ื›ืœ ืœืžืขื•ื˜ื™ ืกืคื™ื—ื™ ืกื˜ื™ื ื•ืงื•ืฆื” ื•ื ืฉืžืจ ืœืžืขื•ื˜ื™ ื”ืคืงืจ ื•ื’ื™ื“ื•ืœื• ืžืŸ ื”ืืจืฅ ืœืžืขื•ื˜ื™ ื›ืžื”ื™ื ื•ืคื˜ืจื™ื•ืช ื•ืœืงื™ื˜ืชื• ื›ืื—ื“ ืœืžืขื•ื˜ื™ ืชืื ื” ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื ืœืžืขื•ื˜ื™ ื™ืจืง ื•ืื™ืœื• ื’ื‘ื™ ืžืขืฉืจ ืชื ืŸ ื›ืœ ืฉื”ื•ื ืื•ื›ืœ ื•ื ืฉืžืจ ื•ื’ื™ื“ื•ืœื• ืžืŸ ื”ืืจืฅ ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ื•ืื™ืœื• ืœืงื™ื˜ืชื• ื›ืื—ื“ ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื ืœื ืงืชื ื™ ืื ื”ื™ื• ื‘ื”ื ืฉื•ืžื™ื ื•ื‘ืฆืœื™ืŸ ื—ื™ื™ื‘ื™ืŸ ื“ืชื ืŸ ืžืœื‘ื ื•ืช ื‘ืฆืœื™ื ืฉื‘ื™ืŸ ื”ื™ืจืง ืจ' ื™ื•ืกื™ ืื•ืžืจ ืคืื” ืžื›ืœ ืื—ืช ื•ืื—ืช ื•ื—ื›"ื ืžืื—ืช ืขืœ ื”ื›ืœ ืืžืจ ืจื‘ื” ื‘ืจ ื‘ืจ ื—ื ื” ื"ืจ ื™ื•ื—ื ืŸ ืขื•ืœืฉื™ืŸ ืฉื–ืจืขืŸ ืžืชื—ื™ืœื” ืœื‘ื”ืžื” ื•ื ืžืœืš ืขืœื™ื”ืŸ ืœืื“ื
+ """
+
+ # Divide the passage into 12-word chunks
+ divided_sentences = divide_into_sentences(example_passage)
+
+ # Append the labeled chunks to the output text file
+ output_filename = "divided_sentences.txt"
+ write_sentences_to_file(divided_sentences, output_filename)
+
+ print(f"Divided sentences written to '{output_filename}'")
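The script above removes page headers with non-greedy regular expressions and then cuts the remaining text into fixed 12-word chunks. The following is a minimal sketch of those two steps on a stand-in input. The names `strip_headers` and `chunk_words`, and the ASCII header `PAGE ... GEMARA`, are hypothetical placeholders for the script's Hebrew patterns.

```python
import re


def strip_headers(text: str, header_pattern: str) -> str:
    """Remove every span matching header_pattern (non-greedy, as in the script)."""
    return re.sub(header_pattern, "", text)


def chunk_words(text: str, size: int = 12) -> list[str]:
    """Split text on whitespace and join every `size` consecutive words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


if __name__ == "__main__":
    # Stand-in passage: one header followed by 14 placeholder words.
    sample = "PAGE 1a GEMARA " + " ".join(f"w{i}" for i in range(1, 15))
    body = strip_headers(sample, r"PAGE .*? GEMARA ")
    for chunk in chunk_words(body, size=12):
        # Same row shape as the script: chunk text, a comma, then the label 2.
        print(f"{chunk},2")
```

Because the chunks ignore punctuation, a chunk can start or end mid-sentence. That appears to be acceptable here, since the classifier only needs short text samples of a roughly uniform length.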
data_creation/craet_model_bible_talmud.py ADDED
@@ -0,0 +1,49 @@
+ import pandas as pd
+ import re
+ import nltk
+ from nltk.tokenize import word_tokenize
+ from sklearn.feature_extraction.text import TfidfVectorizer
+ from sklearn.svm import SVC
+ from sklearn.model_selection import train_test_split
+ from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
+ import joblib
+
+ # Load the dataset (assuming it is in UTF-8 encoding)
+ data = pd.read_csv('bible_talmud_data.csv', encoding='utf-8')
+
+ # Separate features (text) and labels (0, 1, or 2)
+ X = data['text']
+ y = data['label']
+
+ # Create a TF-IDF vectorizer using NLTK's word_tokenize (a general-purpose tokenizer, not Hebrew-specific)
+ vectorizer = TfidfVectorizer(tokenizer=word_tokenize, lowercase=True)
+
+ # Fit the TF-IDF vectorizer on the full dataset and transform it
+ X_tfidf = vectorizer.fit_transform(X)
+
+ # Split data into training (80%) and test (20%) sets
+ X_train, X_test, y_train, y_test = train_test_split(X_tfidf, y, test_size=0.2, random_state=15)
+
+ # Create a linear Support Vector Machine (SVM) classifier with probability estimates enabled
+ classifier = SVC(kernel='linear', C=2.0, probability=True)
+
+ # Train the SVM classifier on the training data
+ classifier.fit(X_train, y_train)
+
+ # Evaluate the model on the test data (metrics are support-weighted across the three classes)
+ y_pred = classifier.predict(X_test)
+ accuracy = accuracy_score(y_test, y_pred)
+ precision = precision_score(y_test, y_pred, average='weighted', zero_division=1)
+ recall = recall_score(y_test, y_pred, average='weighted')
+ f1 = f1_score(y_test, y_pred, average='weighted')
+
+ print("Accuracy:", accuracy)
+ print("Precision:", precision)
+ print("Recall:", recall)
+ print("F1 Score:", f1)
+
+ # Save the trained model and vectorizer to files
+ model_filename = "text_identification_model.pkl"
+ vectorizer_filename = "text_identification_vectorizer.pkl"
+ joblib.dump(classifier, model_filename)
+ joblib.dump(vectorizer, vectorizer_filename)
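The three-class script reports precision, recall, and F1 with `average='weighted'`, which averages the per-class scores using each class's share of the true labels as its weight. A minimal pure-Python sketch of weighted precision follows. `weighted_precision` is a hypothetical helper written for illustration, not scikit-learn's implementation, and its `zero_division` default of 0.0 differs from the value 1 that the script passes.

```python
from collections import Counter


def weighted_precision(y_true, y_pred, zero_division=0.0):
    """Average per-class precision, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        predicted = sum(1 for p in y_pred if p == label)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        # A class that is never predicted has undefined precision.
        precision = correct / predicted if predicted else zero_division
        score += precision * count / total
    return score


if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(round(weighted_precision(y_true, y_pred), 4))
```

With imbalanced classes the weighted average is dominated by the largest class, so inspecting per-class scores as well can reveal a weak minority class.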
data_creation/creat_csv.py ADDED
@@ -0,0 +1,168 @@
+ import csv
+
+ def process_text(input_text):
+     lines = input_text.strip().split('\n')
+     processed_lines = []
+     for line in lines:
+         line = line.strip()
+         if line:
+             if not line.endswith(',1'):
+                 line += ',1'
+             processed_lines.append(line)
+     return processed_lines
+
+ def save_to_csv(processed_lines, output_file):
+     with open(output_file, mode='w', newline='', encoding='utf-8') as csvfile:
+         csv_writer = csv.writer(csvfile)
+         for line in processed_lines:
+             csv_writer.writerow(line.rsplit(',', 1))  # split text and label into separate columns
+
+ if __name__ == "__main__":
+     input_text = """
+ ื•ื™ืขืŸ ืืœื™ืคื– ื”ืชื™ืžื ื™ ื•ื™ืืžืจ
+
+ ื”ื ืกื” ื“ื‘ืจ ืืœื™ืš ืชืœืื” ื•ืขืฆืจ ื‘ืžืœื™ืŸ ืžื™ ื™ื•ื›ืœ
+
+ ื”ื ื” ื™ืกืจืช ืจื‘ื™ื ื•ื™ื“ื™ื ืจืคื•ืช ืชื—ื–ืง
+
+ ื›ื•ืฉืœ ื™ืงื™ืžื•ืŸ ืžืœื™ืš ื•ื‘ืจื›ื™ื ื›ืจืขื•ืช ืชืืžืฅ
+
+ ื›ื™ ืขืชื” ืชื‘ื•ื ืืœื™ืš ื•ืชืœื ืชื’ืข ืขื“ื™ืš ื•ืชื‘ื”ืœ
+
+ ื”ืœื ื™ืจืืชืš ื›ืกืœืชืš ืชืงื•ืชืš ื•ืชื ื“ืจื›ื™ืš
+
+ ื–ื›ืจ ื ื ืžื™ ื”ื•ื ื ืงื™ ืื‘ื“ ื•ืื™ืคื” ื™ืฉืจื™ื ื ื›ื—ื“ื•
+
+ ื›ืืฉืจ ืจืื™ืชื™ ื—ืจืฉื™ ืื•ืŸ ื•ื–ืจืขื™ ืขืžืœ ื™ืงืฆืจื”ื•
+
+ ืžื ืฉืžืช ืืœื•ื” ื™ืื‘ื“ื• ื•ืžืจื•ื— ืืคื• ื™ื›ืœื•
+
+ ืฉืื’ืช ืืจื™ื” ื•ืงื•ืœ ืฉื—ืœ ื•ืฉื ื™ ื›ืคื™ืจื™ื ื ืชืขื•
+
+ ืœื™ืฉ ืื‘ื“ ืžื‘ืœื™ ื˜ืจืฃ ื•ื‘ื ื™ ืœื‘ื™ื ื™ืชืคืจื“ื•
+
+ ื•ืืœื™ ื“ื‘ืจ ื™ื’ื ื‘ ื•ืชืงื— ืื–ื ื™ ืฉืžืฅ ืžื ื”ื•
+
+ ื‘ืฉืขืคื™ื ืžื—ื–ื™ื ื•ืช ืœื™ืœื” ื‘ื ืคืœ ืชืจื“ืžื” ืขืœ ืื ืฉื™ื
+
+ ืคื—ื“ ืงืจืื ื™ ื•ืจืขื“ื” ื•ืจื‘ ืขืฆืžื•ืชื™ ื”ืคื—ื™ื“
+
+ ื•ืจื•ื— ืขืœ ืคื ื™ ื™ื—ืœืฃ ืชืกืžืจ ืฉืขืจืช ื‘ืฉืจื™
+
+ ื™ืขืžื“ ื•ืœื ืื›ื™ืจ ืžืจืื”ื• ืชืžื•ื ื” ืœื ื’ื“ ืขื™ื ื™ ื“ืžืžื” ื•ืงื•ืœ ืืฉืžืข
+
+ ื”ืื ื•ืฉ ืžืืœื•ื” ื™ืฆื“ืง ืื ืžืขืฉื”ื• ื™ื˜ื”ืจ ื’ื‘ืจ
+
+ ื”ืŸ ื‘ืขื‘ื“ื™ื• ืœื ื™ืืžื™ืŸ ื•ื‘ืžืœืื›ื™ื• ื™ืฉื™ื ืชื”ืœื”
+
+ ืืฃ ืฉื›ื ื™ ื‘ืชื™ ื—ืžืจ ืืฉืจ ื‘ืขืคืจ ื™ืกื•ื“ื ื™ื“ื›ืื•ื ืœืคื ื™ ืขืฉ
+
+ ืžื‘ืงืจ ืœืขืจื‘ ื™ื›ืชื• ืžื‘ืœื™ ืžืฉื™ื ืœื ืฆื— ื™ืื‘ื“ื•
+
+ ื”ืœื ื ืกืข ื™ืชืจื ื‘ื ื™ืžื•ืชื• ื•ืœื ื‘ื—ื›ืžื”
+
+ ืงืจื ื ื ื”ื™ืฉ ืขื•ื ืš ื•ืืœ ืžื™ ืžืงื“ืฉื™ื ืชืคื ื”
+
+ ื›ื™ ืœืื•ื™ืœ ื™ื”ืจื’ ื›ืขืฉ ื•ืคืชื” ืชืžื™ืช ืงื ืื”
+
+ ืื ื™ ืจืื™ืชื™ ืื•ื™ืœ ืžืฉืจื™ืฉ ื•ืืงื•ื‘ ื ื•ื”ื• ืคืชืื
+
+ ื™ืจื—ืงื• ื‘ื ื™ื• ืžื™ืฉืข ื•ื™ื“ื›ืื• ื‘ืฉืขืจ ื•ืื™ืŸ ืžืฆื™ืœ
+
+ ืืฉืจ ืงืฆื™ืจื• ืจืขื‘ ื™ืื›ืœ ื•ืืœ ืžืฆื ื™ื ื™ืงื—ื”ื• ื•ืฉืืฃ ืฆืžื™ื ื—ื™ืœื
+
+ ื›ื™ ืœื ื™ืฆื ืžืขืคืจ ืื•ืŸ ื•ืžืื“ืžื” ืœื ื™ืฆืžื— ืขืžืœ
+
+ ื›ื™ ืื“ื ืœืขืžืœ ื™ื•ืœื“ ื•ื‘ื ื™ ืจืฉืฃ ื™ื’ื‘ื™ื”ื• ืขื•ืฃ
+
+ ืื•ืœื ืื ื™ ืื“ืจืฉ ืืœ ืืœ ื•ืืœ ืืœื”ื™ื ืืฉื™ื ื“ื‘ืจืชื™
+
+ ืขืฉื” ื’ื“ืœื•ืช ื•ืื™ืŸ ื—ืงืจ ื ืคืœืื•ืช ืขื“ ืื™ืŸ ืžืกืคืจ
+
+ ื”ื ืชืŸ ืžื˜ืจ ืขืœ ืคื ื™ ืืจืฅ ื•ืฉืœื— ืžื™ื ืขืœ ืคื ื™ ื—ื•ืฆื•ืช
+
+ ืœืฉื•ื ืฉืคืœื™ื ืœืžืจื•ื ื•ืงื“ืจื™ื ืฉื’ื‘ื• ื™ืฉืข
+
+ ืžืคืจ ืžื—ืฉื‘ื•ืช ืขืจื•ืžื™ื ื•ืœื ืชืขืฉื™ื ื” ื™ื“ื™ื”ื ืชื•ืฉื™ื”
+
+ ืœื›ื“ ื—ื›ืžื™ื ื‘ืขืจืžื ื•ืขืฆืช ื ืคืชืœื™ื ื ืžื”ืจื”
+
+ ื™ื•ืžื ื™ืคื’ืฉื• ื—ืฉืš ื•ื›ืœื™ืœื” ื™ืžืฉืฉื• ื‘ืฆื”ืจื™ื
+
+ ื•ื™ืฉืข ืžื—ืจื‘ ืžืคื™ื”ื ื•ืžื™ื“ ื—ื–ืง ืื‘ื™ื•ืŸ
+
+ ื•ืชื”ื™ ืœื“ืœ ืชืงื•ื” ื•ืขืœืชื” ืงืคืฆื” ืคื™ื”
+
+ ืื ืขืœ ื”ืžืœืš ื˜ื•ื‘ ื™ื›ืชื‘ ืœืื‘ื“ื ื•ืขืฉืจืช ืืœืคื™ื ื›ื›ืจ ื›ืกืฃ ืืฉืงื•ืœ ืขืœ ื™ื“ื™ ืขืฉื™ ื”ืžืœืื›ื” ืœื”ื‘ื™ื ืืœ ื’ื ื–ื™ ื”ืžืœืš
+
+ ื•ื™ืกืจ ื”ืžืœืš ืืช ื˜ื‘ืขืชื• ืžืขืœ ื™ื“ื• ื•ื™ืชื ื” ืœื”ืžืŸ ื‘ืŸ ื”ืžื“ืชื ื”ืื’ื’ื™ ืฆืจืจ ื”ื™ื”ื•ื“ื™ื
+
+ ื•ื™ืืžืจ ื”ืžืœืš ืœื”ืžืŸ ื”ื›ืกืฃ ื ืชื•ืŸ ืœืš ื•ื”ืขื ืœืขืฉื•ืช ื‘ื• ื›ื˜ื•ื‘ ื‘ืขื™ื ื™ืš
+
+ ื•ื™ืงืจืื• ืกืคืจื™ ื”ืžืœืš ื‘ื—ื“ืฉ ื”ืจืืฉื•ืŸ ื‘ืฉืœื•ืฉื” ืขืฉืจ ื™ื•ื ื‘ื• ื•ื™ื›ืชื‘ ื›ื›ืœ ืืฉืจ ืฆื•ื” ื”ืžืŸ ืืœ ืื—ืฉื“ืจืคื ื™ ื”ืžืœืš ื•ืืœ ื”ืคื—ื•ืช ืืฉืจ ืขืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื•ืืœ ืฉืจื™ ืขื ื•ืขื ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื›ื›ืชื‘ื” ื•ืขื ื•ืขื ื›ืœืฉื•ื ื• ื‘ืฉื ื”ืžืœืš ืื—ืฉื•ืจืฉ ื ื›ืชื‘ ื•ื ื—ืชื ื‘ื˜ื‘ืขืช ื”ืžืœืš
+
+ ื•ื ืฉืœื•ื— ืกืคืจื™ื ื‘ื™ื“ ื”ืจืฆื™ื ืืœ ื›ืœ ืžื“ื™ื ื•ืช ื”ืžืœืš ืœื”ืฉืžื™ื“ ืœื”ืจื’ ื•ืœืื‘ื“ ืืช ื›ืœ ื”ื™ื”ื•ื“ื™ื ืžื ืขืจ ื•ืขื“ ื–ืงืŸ ื˜ืฃ ื•ื ืฉื™ื ื‘ื™ื•ื ืื—ื“ ื‘ืฉืœื•ืฉื” ืขืฉืจ ืœื—ื“ืฉ ืฉื ื™ื ืขืฉืจ ื”ื•ื ื—ื“ืฉ ืื“ืจ ื•ืฉืœืœื ืœื‘ื•ื–
+
+ ืคืชืฉื’ืŸ ื”ื›ืชื‘ ืœื”ื ืชืŸ ื“ืช ื‘ื›ืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื’ืœื•ื™ ืœื›ืœ ื”ืขืžื™ื ืœื”ื™ื•ืช ืขืชื“ื™ื ืœื™ื•ื ื”ื–ื”
+
+ ื”ืจืฆื™ื ื™ืฆืื• ื“ื—ื•ืคื™ื ื‘ื“ื‘ืจ ื”ืžืœืš ื•ื”ื“ืช ื ืชื ื” ื‘ืฉื•ืฉืŸ ื”ื‘ื™ืจื” ื•ื”ืžืœืš ื•ื”ืžืŸ ื™ืฉื‘ื• ืœืฉืชื•ืช ื•ื”ืขื™ืจ ืฉื•ืฉืŸ ื ื‘ื•ื›ื”
+
+
+ ื•ืžืจื“ื›ื™ ื™ื“ืข ืืช ื›ืœ ืืฉืจ ื ืขืฉื” ื•ื™ืงืจืข ืžืจื“ื›ื™ ืืช ื‘ื’ื“ื™ื• ื•ื™ืœื‘ืฉ ืฉืง ื•ืืคืจ ื•ื™ืฆื ื‘ืชื•ืš ื”ืขื™ืจ ื•ื™ื–ืขืง ื–ืขืงื” ื’ื“ืœื” ื•ืžืจื”
+
+ ื•ื™ื‘ื•ื ืขื“ ืœืคื ื™ ืฉืขืจ ื”ืžืœืš ื›ื™ ืื™ืŸ ืœื‘ื•ื ืืœ ืฉืขืจ ื”ืžืœืš ื‘ืœื‘ื•ืฉ ืฉืง
+
+ ื•ื‘ื›ืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ืžืงื•ื ืืฉืจ ื“ื‘ืจ ื”ืžืœืš ื•ื“ืชื• ืžื’ื™ืข ืื‘ืœ ื’ื“ื•ืœ ืœื™ื”ื•ื“ื™ื ื•ืฆื•ื ื•ื‘ื›ื™ ื•ืžืกืคื“ ืฉืง ื•ืืคืจ ื™ืฆืข ืœืจื‘ื™ื
+
+ ื•ืชื‘ื•ืื™ื ื” [ื•ืชื‘ื•ืื ื”] ื ืขืจื•ืช ืืกืชืจ ื•ืกืจื™ืกื™ื” ื•ื™ื’ื™ื“ื• ืœื” ื•ืชืชื—ืœื—ืœ ื”ืžืœื›ื” ืžืื“ ื•ืชืฉืœื— ื‘ื’ื“ื™ื ืœื”ืœื‘ื™ืฉ ๏ฟฝ๏ฟฝืช ืžืจื“ื›ื™ ื•ืœื”ืกื™ืจ ืฉืงื• ืžืขืœื™ื• ื•ืœื ืงื‘ืœ
+
+ ื•ืชืงืจื ืืกืชืจ ืœื”ืชืš ืžืกืจื™ืกื™ ื”ืžืœืš ืืฉืจ ื”ืขืžื™ื“ ืœืคื ื™ื” ื•ืชืฆื•ื”ื• ืขืœ ืžืจื“ื›ื™ ืœื“ืขืช ืžื” ื–ื” ื•ืขืœ ืžื” ื–ื”
+
+ ื•ื™ืฆื ื”ืชืš ืืœ ืžืจื“ื›ื™ ืืœ ืจื—ื•ื‘ ื”ืขื™ืจ ืืฉืจ ืœืคื ื™ ืฉืขืจ ื”ืžืœืš
+
+ ื•ื™ื’ื“ ืœื• ืžืจื“ื›ื™ ืืช ื›ืœ ืืฉืจ ืงืจื”ื• ื•ืืช ืคืจืฉืช ื”ื›ืกืฃ ืืฉืจ ืืžืจ ื”ืžืŸ ืœืฉืงื•ืœ ืขืœ ื’ื ื–ื™ ื”ืžืœืš ื‘ื™ื”ื•ื“ื™ื™ื [ื‘ื™ื”ื•ื“ื™ื] ืœืื‘ื“ื
+
+ ื•ืืช ืคืชืฉื’ืŸ ื›ืชื‘ ื”ื“ืช ืืฉืจ ื ืชืŸ ื‘ืฉื•ืฉืŸ ืœื”ืฉืžื™ื“ื ื ืชืŸ ืœื• ืœื”ืจืื•ืช ืืช ืืกืชืจ ื•ืœื”ื’ื™ื“ ืœื” ื•ืœืฆื•ื•ืช ืขืœื™ื” ืœื‘ื•ื ืืœ ื”ืžืœืš ืœื”ืชื—ื ืŸ ืœื• ื•ืœื‘ืงืฉ ืžืœืคื ื™ื• ืขืœ ืขืžื”
+
+ ื•ื™ื‘ื•ื ื”ืชืš ื•ื™ื’ื“ ืœืืกืชืจ ืืช ื“ื‘ืจื™ ืžืจื“ื›ื™
+
+ ื•ืชืืžืจ ืืกืชืจ ืœื”ืชืš ื•ืชืฆื•ื”ื• ืืœ ืžืจื“ื›ื™
+
+ ื›ืœ ืขื‘ื“ื™ ื”ืžืœืš ื•ืขื ืžื“ื™ื ื•ืช ื”ืžืœืš ื™ื•ื“ืขื™ื ืืฉืจ ื›ืœ ืื™ืฉ ื•ืืฉื” ืืฉืจ ื™ื‘ื•ื ืืœ ื”ืžืœืš ืืœ ื”ื—ืฆืจ ื”ืคื ื™ืžื™ืช ืืฉืจ ืœื ื™ืงืจื ืื—ืช ื“ืชื• ืœื”ืžื™ืช ืœื‘ื“ ืžืืฉืจ ื™ื•ืฉื™ื˜ ืœื• ื”ืžืœืš ืืช ืฉืจื‘ื™ื˜ ื”ื–ื”ื‘ ื•ื—ื™ื” ื•ืื ื™ ืœื ื ืงืจืืชื™ ืœื‘ื•ื ืืœ ื”ืžืœืš ื–ื” ืฉืœื•ืฉื™ื ื™ื•ื
+
+ ื•ื™ื’ื™ื“ื• ืœืžืจื“ื›ื™ ืืช ื“ื‘ืจื™ ืืกืชืจ
+
+ ื•ื™ืืžืจ ืžืจื“ื›ื™ ืœื”ืฉื™ื‘ ืืœ ืืกืชืจ ืืœ ืชื“ืžื™ ื‘ื ืคืฉืš ืœื”ืžืœื˜ ื‘ื™ืช ื”ืžืœืš ืžื›ืœ ื”ื™ื”ื•ื“ื™ื
+
+ ื›ื™ ืื ื”ื—ืจืฉ ืชื—ืจื™ืฉื™ ื‘ืขืช ื”ื–ืืช ืจื•ื— ื•ื”ืฆืœื” ื™ืขืžื•ื“ ืœื™ื”ื•ื“ื™ื ืžืžืงื•ื ืื—ืจ ื•ืืช ื•ื‘ื™ืช ืื‘ื™ืš ืชืื‘ื“ื• ื•ืžื™ ื™ื•ื“ืข ืื ืœืขืช ื›ื–ืืช ื”ื’ืขืช ืœืžืœื›ื•ืช
+
+ ื•ืชืืžืจ ืืกืชืจ ืœื”ืฉื™ื‘ ืืœ ืžืจื“ื›ื™
+ ื›ืžื’ื“ืœ ื“ื•ื™ื“ ืฆื•ืืจืš ื‘ื ื•ื™ ืœืชืœืคื™ื•ืช ืืœืฃ ื”ืžื’ืŸ ืชืœื•ื™ ืขืœื™ื• ื›ืœ ืฉืœื˜ื™ ื”ื’ื‘ื•ืจื™ื
+
+ ืฉื ื™ ืฉื“ื™ืš ื›ืฉื ื™ ืขืคืจื™ื ืชืื•ืžื™ ืฆื‘ื™ื” ื”ืจื•ืขื™ื ื‘ืฉื•ืฉื ื™ื
+
+ ืขื“ ืฉื™ืคื•ื— ื”ื™ื•ื ื•ื ืกื• ื”ืฆืœืœื™ื ืืœืš ืœื™ ืืœ ื”ืจ ื”ืžื•ืจ ื•ืืœ ื’ื‘ืขืช ื”ืœื‘ื•ื ื”
+
+ ื›ืœืš ื™ืคื” ืจืขื™ืชื™ ื•ืžื•ื ืื™ืŸ ื‘ืš ืืชื™ ืžืœื‘ื ื•ืŸ ื›ืœื” ืืชื™ ืžืœื‘ื ื•ืŸ ืชื‘ื•ืื™ ืชืฉื•ืจื™ ืžืจืืฉ ืืžื ื” ืžืจืืฉ ืฉื ื™ืจ ื•ื—ืจืžื•ืŸ ืžืžืขื ื•ืช ืืจื™ื•ืช ืžื”ืจืจื™ ื ืžืจื™ื
+
+ ืœื‘ื‘ืชื ื™ ืื—ืชื™ ื›ืœื” ืœื‘ื‘ืชื™ื ื™ ื‘ืื—ื“ [ื‘ืื—ืช] ืžืขื™ื ื™ืš ื‘ืื—ื“ ืขื ืง ืžืฆื•ืจื ื™ืš
+
+ ืžื” ื™ืคื• ื“ื“ื™ืš ืื—ืชื™ ื›ืœื” ืžื” ื˜ื‘ื• ื“ื“ื™ืš ืžื™ื™ืŸ ื•ืจื™ื— ืฉืžื ื™ืš ืžื›ืœ ื‘ืฉืžื™ื
+
+ ื ืคืช ืชื˜ืคื ื” ืฉืคืชื•ืชื™ืš ื›ืœื” ื“ื‘ืฉ ื•ื—ืœื‘ ืชื—ืช ืœืฉื•ื ืš ื•ืจื™ื— ืฉืœืžืชื™ืš ื›ืจื™ื— ืœื‘ื ื•ืŸ ื’ืŸ ื ืขื•ืœ ืื—ืชื™ ื›ืœื” ื’ืœ ื ืขื•ืœ ืžืขื™ืŸ ื—ืชื•ื
+
+ ืฉืœื—ื™ืš ืคืจื“ืก ืจืžื•ื ื™ื ืขื ืคืจื™ ืžื’ื“ื™ื ื›ืคืจื™ื ืขื ื ืจื“ื™ื
+
+ ื ืจื“ ื•ื›ืจื›ื ืงื ื” ื•ืงื ืžื•ืŸ ืขื ื›ืœ ืขืฆื™ ืœื‘ื•ื ื” ืžืจ ื•ืื”ืœื•ืช ืขื ื›ืœ ืจืืฉื™ ื‘ืฉืžื™ื
+
+ ืžืขื™ืŸ ื’ื ื™ื ื‘ืืจ ืžื™ื ื—ื™ื™ื ื•ื ื–ืœื™ื ืžืŸ ืœื‘ื ื•ืŸ
+
+ ืขื•ืจื™ ืฆืคื•ืŸ ื•ื‘ื•ืื™ ืชื™ืžืŸ ื”ืคื™ื—ื™ ื’ื ื™ ื™ื–ืœื• ื‘ืฉืžื™ื• ื™ื‘ื ื“ื•ื“ื™ ืœื’ื ื• ื•ื™ืื›ืœ ืคืจื™ ืžื’ื“ื™ื•
+
+ ื‘ืืชื™ ืœื’ื ื™ ืื—ืชื™ ื›ืœื” ืืจื™ืชื™ ืžื•ืจื™ ืขื ื‘ืฉืžื™ ืื›ืœืชื™ ื™ืขืจื™ ืขื ื“ื‘ืฉื™ ืฉืชื™ืชื™ ื™ื™ื ื™ ืขื ื—ืœื‘ื™ ืื›ืœื• ืจืขื™ื ืฉืชื• ื•ืฉื›ืจื• ื“ื•ื“ื™ื
+ """
+
+     processed_text = process_text(input_text)
+
+     output_file = 'processed_text.csv'
+     save_to_csv(processed_text, output_file)
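The training scripts in this commit read a `text` column and a `label` column, so each verse and its label need to end up in separate CSV fields. The sketch below shows one way to write such rows with the standard `csv` module, which quotes any text that itself contains a comma. `rows_to_csv` is a hypothetical helper, and writing a header row is an assumption about how the final dataset files were assembled.

```python
import csv
import io


def rows_to_csv(sentences, label):
    """Serialize sentences as two-column CSV rows: text, label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "label"])  # column names the training scripts read
    for sentence in sentences:
        writer.writerow([sentence, label])
    return buffer.getvalue()


if __name__ == "__main__":
    output = rows_to_csv(["a plain sentence", "one, with a comma"], 1)
    print(output)
    # Reading the data back recovers the original text, comma included.
    parsed = list(csv.DictReader(io.StringIO(output)))
    print(parsed[1]["text"])
```

Passing the text and label as separate list items lets the writer handle quoting, which avoids rows whose label is swallowed into a quoted text field.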
data_creation/creat_model_bible.py ADDED
@@ -0,0 +1,70 @@
+ import pandas as pd
+ import re
+ import nltk
+ from nltk.tokenize import word_tokenize
+ from sklearn.feature_extraction.text import TfidfVectorizer
+ from sklearn.svm import SVC
+ from sklearn.model_selection import train_test_split
+ from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
+ import joblib
+
+
+ """
+ # Download the NLTK stopwords corpus (if not already downloaded)
+ nltk.download('stopwords')
+
+ # Function to remove punctuation and special characters from text
+ def remove_punctuation(text):
+     return re.sub(r'[^\w\s]', '', text)
+
+ # Function to remove custom stop words from text
+ def remove_custom_stopwords(text):
+     hebrew_stopwords = {'אני', 'אתה', 'את', 'אנחנו', 'אתם', 'אתן', 'הם', 'הן'}  # Add your custom Hebrew stopwords here
+     return ' '.join(word for word in text.split() if word not in hebrew_stopwords)
+
+ # Remove punctuation and custom stop words from the text data
+ data['text'] = data['text'].apply(remove_punctuation)
+ data['text'] = data['text'].apply(remove_custom_stopwords)
+ """
+
+ # Load the dataset (assuming it is in UTF-8 encoding)
+ data = pd.read_csv('bible_data.csv', encoding='utf-8')
+
+
+
+ # Separate features (text) and labels (0 or 1)
+ X = data['text']
+ y = data['label']
+
+ # Create a TF-IDF vectorizer using NLTK's word_tokenize (a general-purpose tokenizer, not Hebrew-specific)
+ vectorizer = TfidfVectorizer(tokenizer=word_tokenize, lowercase=True)
+
+ # Fit the TF-IDF vectorizer on the full dataset and transform it
+ X_tfidf = vectorizer.fit_transform(X)
+
+ # Split data into training (80%) and test (20%) sets
+ X_train, X_test, y_train, y_test = train_test_split(X_tfidf, y, test_size=0.2, random_state=47)
+
+ # Create a linear Support Vector Machine (SVM) classifier with probability estimates enabled
+ classifier = SVC(kernel='linear', C=0.5, probability=True)
+
+ # Train the SVM classifier on the training data
+ classifier.fit(X_train, y_train)
+
+ # Evaluate the model on the test data (binary metrics, positive class = label 1)
+ y_pred = classifier.predict(X_test)
+ accuracy = accuracy_score(y_test, y_pred)
+ precision = precision_score(y_test, y_pred)
+ recall = recall_score(y_test, y_pred)
+ f1 = f1_score(y_test, y_pred)
+
+ print("Accuracy:", accuracy)
+ print("Precision:", precision)
+ print("Recall:", recall)
+ print("F1 Score:", f1)
+
+ # Save the trained model and vectorizer to files
+ model_filename = "is_this_bible_model.pkl"
+ vectorizer_filename = "is_this_bible_vectorizer.pkl"
+ joblib.dump(classifier, model_filename)
+ joblib.dump(vectorizer, vectorizer_filename)
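Both training scripts turn each sentence into a TF-IDF vector before fitting the SVM. The sketch below computes such vectors in plain Python using the smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, and the L2 normalization that scikit-learn's `TfidfVectorizer` applies by default. `tfidf_vectors` is a hypothetical helper, and splitting on whitespace is a simplifying assumption in place of `word_tokenize`.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.split() for doc in documents]  # whitespace tokens (assumption)
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        raw = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(weight * weight for weight in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm else {})
    return vectors


if __name__ == "__main__":
    for vector in tfidf_vectors(["a b", "a c"]):
        print({term: round(weight, 3) for term, weight in vector.items()})
```

Terms that occur in fewer documents receive larger weights, which is what lets a linear classifier pick up on vocabulary characteristic of one class.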
data_creation/divided_sentences.txt ADDED
@@ -0,0 +1,107 @@
+ ืืข"ืค ืฉืื™ ืืคืฉืจ ื•ื”ืœื ื‘ื ื‘ื ืœืจ' ืžืื™ืจ ืืข"ืค ืฉืื™ ืืคืฉืจ ืœืจื‘ื ืŸ,2
+ ื•ืœืชื ื™ ื‘ื ื”ืขืœื™ื•ืŸ ืจ"ืž ืื•ืžืจ ืœื ื—ื•ืœืฆืช ื•ืœื ืžืชื™ื‘ืžืช ื•ื—ื›"ื ืื• ื—ื•ืœืฆืช,2
+ ืื• ืžืชื™ื‘ืžืช ื•ืื ื ื™ื“ืขื ื ืžืฉื•ื ื“ืื™ ืืคืฉืจ ื”ื•ื ืื™ ืœื ืชื ื ืืข"ืค,2
+ ืฉืื™ ืืคืฉืจ ื”ื•ื” ืืžื™ื ื ืจื•ื‘ ื ืฉื™ื ืชื—ืชื•ืŸ ืืชื™ ื‘ืจื™ืฉื ื•ืžื™ืขื•ื˜ ืขืœื™ื•ืŸ ืืชื™,2
+ ื‘ืจื™ืฉื ื•ืจื‘ื™ ืžืื™ืจ ืœื˜ืขืžื™ื” ื“ื—ื™ื™ืฉ ืœืžื™ืขื•ื˜ื ื•ืจื‘ื ืŸ ืœื˜ืขืžื™ื™ื”ื• ื“ืœื ื—ื™ื™ืฉื™ ืœืžื™ืขื•ื˜ื ื•ื”ื ื™,2
6
+ ืžื™ืœื™ ื‘ืกืชืžื ืื‘ืœ ื”ื™ื›ื ื“ื‘ื“ืงืŸ ื•ืœื ืืฉื›ื—ืŸ ืื™ืžืจ ืžื•ื“ื• ืœื™ื” ืจื‘ื ืŸ ืœืจ"ืž,2
7
+ ื“ืขืœื™ื•ืŸ ืงื“ื™ื ืงืž"ืœ ื“ืื™ ืืคืฉืจ ื•ื“ืื™ ืืชื™ ื•ืžื ืชืจ ื”ื•ื ื“ื ืชืจ ื‘ืฉืœืžื ืœืจ"ืž,2
8
+ ื”ื™ื™ื ื• ื“ื›ืชื™ื‘ (ื™ื—ื–ืงืืœ ื˜ื–) ืฉื“ื™ื ื ื›ื•ื ื• ื•ืฉืขืจืš ืฆืžื— ืืœื ืœืจื‘ื ืŸ ืื™ืคื›ื ืžื‘ืขื™,2
9
+ ืœื™ื” ื”"ืง ื›ื™ื•ืŸ ืฉืฉื“ื™ื ื ื›ื•ื ื• ื‘ื™ื“ื•ืข ืฉืฉืขืจืš ืฆืžื— ื‘ืฉืœืžื ืœืจ"ืž ื”ื™ื™ื ื• ื“ื›ืชื™ื‘,2
10
+ (ื™ื—ื–ืงืืœ ื›ื’) ื‘ืขืฉื•ืช ืžืžืฆืจื™ื ื“ื“ื™ืš ืœืžืขืŸ ืฉื“ื™ ื ืขื•ืจื™ืš ืืœื ืœืจื‘ื ืŸ ืื™ืคื›ื ืžื‘ืขื™,2
11
+ ืœื™ื” ื”"ืง ื›ื™ื•ืŸ ืฉื‘ืื• ื“ื“ื™ืš ื‘ื™ื“ื•ืข ืฉื‘ืื• ื ืขื•ืจื™ืš ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืื™ ืฉื“ื™,2
12
+ ื›ื•ืœื” ื‘ื“ื“ื™ ื›ืชื™ื‘ ื•ื”"ืง ื”ืงื‘"ื” ืœื™ืฉืจืืœ ืื™ื›ืจืคื• ื“ื“ื™ืš ืœื ื”ื“ืจืช ื‘ืš ืื™ืฉืชื“ื•,2
13
+ ื“ื“ื™ืš ื ืžื™ ืœื ื”ื“ืจืช ื‘ืš ื“ื›ื•ืœื™ ืขืœืžื ืžื™ื”ื ืืชื—ืชื•ืŸ ืกืžื›ื™ื ืŸ ืžื ืœืŸ ืืžืจ,2
14
+ ืจื‘ ื™ื”ื•ื“ื” ืืžืจ ืจื‘ ื•ื›ืŸ ืชื ื ื“ื‘ื™ ืจ' ื™ืฉืžืขืืœ ืืžืจ ืงืจื (ื‘ืžื“ื‘ืจ,2
15
+ ื”) ืื™ืฉ ืื• ืืฉื” ื›ื™ ื™ืขืฉื• ืžื›ืœ ื—ื˜ืื•ืช ื”ืื“ื ื”ืฉื•ื” ื”ื›ืชื•ื‘ ืืฉื”,2
16
+ ืœืื™ืฉ ืœื›ืœ ืขื•ื ืฉื™ืŸ ืฉื‘ืชื•ืจื” ืžื” ืื™ืฉ ื‘ืกื™ืžืŸ ืื—ื“ ืืฃ ืืฉื” ื‘ืกื™ืžืŸ ืื—ื“,2
17
+ ื•ืื™ืžื ืื• ื”ืื™ ืื• ื”ืื™ ื›ืื™ืฉ ืžื” ืื™ืฉ ืชื—ืชื•ืŸ ื•ืœื ืขืœื™ื•ืŸ ืืฃ,2
18
+ ืืฉื” ืชื—ืชื•ืŸ ื•ืœื ืขืœื™ื•ืŸ ืชื ื™ื ื ืžื™ ื”ื›ื™ ื"ืจ ืืœื™ืขื–ืจ ื‘ืจ' ืฆื“ื•ืง ื›ืš,2
19
+ ื”ื™ื• ืžืคืจืฉื™ืŸ ื‘ื™ื‘ื ื” ื•ืืžืจื• ื›ื™ื•ืŸ ืฉื‘ื ืชื—ืชื•ืŸ ืฉื•ื‘ ืื™ืŸ ืžืฉื’ื™ื—ื™ืŸ ืขืœ ืขืœื™ื•ืŸ,2
20
+ ืชื ื™ื ืจืฉื‘"ื’ ืื•ืžืจ ื‘ื ื•ืช ื›ืจื›ื™ื ืชื—ืชื•ืŸ ืžืžื”ืจ ืœื‘ื ืžืคื ื™ ืฉืจื’ื™ืœื•ืช ื‘ืžืจื—ืฆืื•ืช ื‘ื ื•ืช,2
21
+ ื›ืคืจื™ื ืขืœื™ื•ืŸ ืžืžื”ืจ ืœื‘ื ืžืคื ื™ ืฉื˜ื•ื—ื ื•ืช ื‘ืจื—ื™ื ืจ"ืฉ ื‘ืŸ ืืœืขื–ืจ ืื•ืžืจ ื‘ื ื•ืช,2
22
+ ืขืฉื™ืจื™ื ืฆื“ ื™ืžื™ืŸ ืžืžื”ืจ ืœื‘ื ืฉื ื™ืฉื•ืฃ ื‘ืืคืงืจื™ืกื•ืชืŸ ื‘ื ื•ืช ืขื ื™ื™ื ืฆื“ ืฉืžืืœ ืžืžื”ืจ,2
23
+ ืœื‘ื ืžืคื ื™ ืฉืฉื•ืื‘ื•ืช ื›ื“ื™ ืžื™ื ืขืœื™ื”ืŸ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืคื ื™ ืฉื ื•ืฉืื™ืŸ ืื—ื™ื”ืŸ ืขืœ,2
24
+ ื’ืกืกื™ื”ืŸ ืช"ืจ ืฆื“ ืฉืžืืœ ืงื•ื“ื ืœืฆื“ ื™ืžื™ืŸ ืจื‘ื™ ื—ื ื™ื ื ื‘ืŸ ืื—ื™ ืจ',2
25
+ ื™ื”ื•ืฉืข ืื•ืžืจ ืžืขื•ืœื ืœื ืงื“ื ืฆื“ ืฉืžืืœ ืœืฆื“ ื™ืžื™ืŸ ื—ื•ืฅ ืžืื—ืช ืฉื”ื™ืชื”,2
26
+ ื‘ืฉื›ื•ื ืชื™ ืฉืงื“ื ืฆื“ ืฉืžืืœ ืœืฆื“ ื™ืžื™ืŸ ื•ื—ื–ืจ ืœืื™ืชื ื• ืช"ืจ ื›ืœ ื”ื ื‘ื“ืงื•ืช ื ื‘ื“ืงื•ืช,2
27
+ ืขืœ ืคื™ ื ืฉื™ื ื•ื›ืŸ ื”ื™ื” ืจื‘ื™ ืืœื™ืขื–ืจ ืžื•ืกืจ ืœืืฉืชื• ื•ืจื‘ื™ ื™ืฉืžืขืืœ ืžื•ืกืจ,2
28
+ ืœืืžื• ืจื‘ื™ ื™ื”ื•ื“ื” ืื•ืžืจ ืœืคื ื™ ื”ืคืจืง ื•ืœืื—ืจ ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืชื•ืš,2
29
+ ื”ืคืจืง ืื™ืŸ ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืฉืื™ืŸ ืžืฉื™ืื™ืŸ ืกืคืงื•ืช ืขืœ ืคื™ ื ืฉื™ื ืจ"ืฉ,2
30
+ ืื•ืžืจ ืืฃ ืชื•ืš ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ื•ื ืืžื ืช ืืฉื” ืœื”ื—ืžื™ืจ ืื‘ืœ ืœื,2
31
+ ืœื”ืงืœ ื›ื™ืฆื“ ื’ื“ื•ืœื” ื”ื™ื ืฉืœื ืชืžืืŸ ืงื˜ื ื” ื”ื™ื ืฉืœื ืชื—ืœื•ืฅ ืื‘ืœ ืื™ืŸ,2
32
+ ื ืืžื ืช ืœื•ืžืจ ืงื˜ื ื” ื”ื™ื ืฉืชืžืืŸ ื•ื’ื“ื•ืœื” ื”ื™ื ืฉืชื—ืœื•ืฅ ืืžืจ ืžืจ ืจื‘ื™ ื™ื”ื•ื“ื”,2
33
+ ืื•ืžืจ ืœืคื ื™ ื”ืคืจืง ื•ืœืื—ืจ ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ื‘ืฉืœืžื ืœืคื ื™ ื”ืคืจืง ื‘ืขื™,2
34
+ ื‘ื“ื™ืงื” ื“ืื™ ืžืฉืชื›ื—ื™ ืœืื—ืจ ื”ืคืจืง ืฉื•ืžื ื ื™ื ื”ื• ืืœื ืœืื—ืจ ื”ืคืจืง ืœืžื” ืœื™,2
35
+ ื‘ื“ื™ืงื” ื•ื”ืืžืจ ืจื‘ื ืงื˜ื ื” ืฉื”ื’ื™ืขื” ืœื›ืœืœ ืฉื ื•ืชื™ื” ืื™ื ื” ืฆืจื™ื›ื” ื‘ื“ื™ืงื” ื—ื–ืงื” ื”ื‘ื™ืื”,2
36
+ ืกื™ืžื ื™ืŸ ื›ื™ ืืžืจ ืจื‘ื ื—ื–ืงื” ืœืžื™ืื•ืŸ ืื‘ืœ ืœื—ืœื™ืฆื” ื‘ืขื™ื ื‘ื“ื™ืงื” ืชื•ืš ื”ืคืจืง,2
37
+ ืื™ืŸ ื ืฉื™ื ื‘ื•ื“ืงื•ืช ืื•ืชืŸ ืงืกื‘ืจ ืชื•ืš ื”ืคืจืง ื›ืœืื—ืจ ื”ืคืจืง <ื“ืžื™> ื•ืœืื—ืจ ื”ืคืจืง,2
38
+ ื“ืื™ื›ื ื—ื–ืงื” ื“ืจื‘ื ืกืžื›ื™ื ืŸ ืื ืฉื™ื ื•ื‘ื“ืงื™ ืชื•ืš ื”ืคืจืง ื“ืœื™ื›ื ื—ื–ืงื” ื“ืจื‘ื ืœื,2
39
+ ืกืžื›ื™ื ืŸ ืื ืฉื™ื ื•ืœื ื‘ื“ืงื™ ื ืฉื™ื ืจ"ืฉ ืื•ืžืจ ืืฃ ืชื•ืš ื”ืคืจืง ื ืฉื™ื ื‘ื•ื“ืงื•ืช,2
40
+ ืื•ืชืŸ ืงืกื‘ืจ ืชื•ืš ื”ืคืจืง ื›ืœืคื ื™ ื”ืคืจืง ื•ื‘ืขื™ื ื‘ื“ื™ืงื” ื“ืื™ ืžืฉืชื›ื—ื™ ืœืื—ืจ ื”ืคืจืง,2
41
+ ืฉื•ืžื ื ื™ื ื”ื• ื•ื ืืžื ืช ืืฉื” ืœื”ื—ืžื™ืจ ืื‘ืœ ืœื ืœื”ืงืœ ื”ืื™ ืžืืŸ ืงืชื ื™ ืœื”,2
42
+ ืื™ื‘ืขื™ืช ืื™ืžื ืจื‘ื™ ื™ื”ื•ื“ื” ื•ืืชื•ืš ื”ืคืจืง ื•ืื™ื‘ืขื™ืช ืื™ืžื ืจื‘ื™ ืฉืžืขื•ืŸ ื•ืœืื—ืจ ื”ืคืจืง,2
43
+ ื•ืœื™ืช ืœื™ื” ื—ื–ืงื” ื“ืจื‘ื: ืžืคื ื™ ืฉืืžืจื• ืืคืฉืจ ื›ื•': ื”ื ืชื• ืœืžื” ืœื™,2
44
+ ื”ื ืชื ื ืœื™ื” ืจื™ืฉื ื•ื›ื™ ืชื™ืžื ืžืฉื•ื ื“ืงื ื‘ืขื™ ืœืžืกืชืžื” ื›ืจื‘ื ืŸ ืคืฉื™ื˜ื,2
45
+ ื™ื—ื™ื“ ื•ืจื‘ื™ื ื”ืœื›ื” ื›ืจื‘ื™ื ืžื”ื• ื“ืชื™ืžื ืžืกืชื‘ืจื ื˜ืขืžื ื“ืจ"ืž ื“ืงื ืžืกื™ื™ืข ืœื™ื”,2
46
+ ืงืจืื™ ืงืž"ืœ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืฉื•ื ื“ืงื ื‘ืขื™ ืœืžืชื ื™ ื›ื™ื•ืฆื ื‘ื•: ื›ื™ื•ืฆื ื‘ื•,2
47
+ ื›ืœ ื›ืœื™ ื—ืจืก ืฉื”ื•ื ืžื›ื ื™ืก ืžื•ืฆื™ื ื•ื™ืฉ ืฉืžื•ืฆื™ื ื•ืื™ื ื• ืžื›ื ื™ืก ื›ืœ ืื‘ืจ,2
48
+ ืฉื™ืฉ ื‘ื• ืฆืคื•ืจืŸ ื™ืฉ ื‘ื• ืขืฆื ื•ื™ืฉ ืฉื™ืฉ ื‘ื• ืขืฆื ื•ืื™ืŸ ื‘ื•,2
49
+ ืฆืคื•ืจืŸ ื›ืœ ื”ืžื˜ืžื ืžื“ืจืก ืžื˜ืžื ื˜ืžื ืžืช ื•ื™ืฉ ืฉืžื˜ืžื ื˜ืžื ืžืช ื•ืื™ื ื•,2
50
+ ืžื˜ืžื ืžื“ืจืก: ืžื›ื ื™ืก ืคืกื•ืœ ืœืžื™ ื—ื˜ืืช ื•ืคืกื•ืœ ืžืฉื•ื ื’ืกื˜ืจื ืžื•ืฆื™ื ื›ืฉืจ ืœืžื™,2
51
+ ื—ื˜ืืช ื•ืคืกื•ืœ ืžืฉื•ื ื’ืกื˜ืจื ืืžืจ ืจื‘ ืืกื™ ืฉื•ื ื™ืŸ ื›ืœื™ ื—ืจืก ืฉื™ืขื•ืจื• ื‘ื›ื•ื ืก,2
52
+ ืžืฉืงื” ื•ืœื ืืžืจื• ืžื•ืฆื™ื ืžืฉืงื” ืืœื ืœืขื ื™ืŸ ื’ืกื˜ืจื ื‘ืœื‘ื“ ืžืื™ ื˜ืขืžื ืืžืจ,2
53
+ ืžืจ ื–ื•ื˜ืจื ื‘ืจื™ื” ื“ืจื‘ ื ื—ืžืŸ ืœืคื™ ืฉืื™ืŸ ืื•ืžืจื™ื ื”ื‘ื ื’ืกื˜ืจื ืœื’ืกื˜ืจื ืชื ื•,2
54
+ ืจื‘ื ืŸ ื›ื™ืฆื“ ื‘ื•ื“ืงื™ืŸ ื›ืœื™ ื—ืจืก ืœื™ื“ืข ืื ื ื™ืงื‘ ื‘ื›ื•ื ืก ืžืฉืงื” ืื ืœืื•,2
55
+ ื™ื‘ื™ื ืขืจื™ื‘ื” ืžืœืื” ืžื™ื ื•ื ื•ืชืŸ ืงื“ืจื” ืœืชื•ื›ื” ืื ื›ื ืกื” ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื”,2
56
+ ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื” ืจื‘ื™ ื™ื”ื•ื“ื” ืื•ืžืจ ื›ื•ืคืฃ ืื–ื ื™ ืงื“ืจื” ืœืชื•ื›ื”,2
57
+ ื•ืžืฆื™ืฃ ืขืœื™ื” ืžื™ื ื•ืื ื›ื•ื ืก ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื,2
58
+ ืžืฉืงื” ืื• ืฉื•ืคืชื” ืขืœ ื’ื‘ื™ ื”ืื•ืจ ืื ื”ืื•ืจ ืžืขืžื™ื“ื” ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื”,2
59
+ ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉืžื›ื ื™ืก ืžืฉืงื” ืจ' ื™ื•ืกื™ ืื•ืžืจ ืืฃ ืœื ืฉื•ืคืชื” ืขืœ,2
60
+ ื’ื‘ื™ ื”ืื•ืจ ืžืคื ื™ ืฉื”ืื•ืจ ืžืขืžื™ื“ื” ืืœื ืฉื•ืคืชื” ืขืœ ื’ื‘ื™ ื”ืจืžืฅ ืื ืจืžืฅ,2
61
+ ืžืขืžื™ื“ื” ื‘ื™ื“ื•ืข ืฉืžื•ืฆื™ื ืžืฉืงื” ื•ืื ืœืื• ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ื”ื™ื” ื˜ื•ืจื“ ื˜ื™ืคื”,2
62
+ ืื—ืจ ื˜ื™ืคื” ื‘ื™ื“ื•ืข ืฉื›ื•ื ืก ืžืฉืงื” ืžืื™ ืื™ื›ื ื‘ื™ืŸ ืช"ืง ืœืจ' ื™ื”ื•ื“ื” ืืžืจ,2
63
+ ืขื•ืœื ื›ื™ื ื•ืก ืขืœ ื™ื“ื™ ื”ื“ื—ืง ืื™ื›ื ื‘ื™ื ื™ื™ื”ื•: ื›ืœ ืื‘ืจ ืฉื™ืฉ ื‘ื• ืฆืคื•ืจืŸ,2
64
+ ื•ื›ื•': ื™ืฉ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ื‘ืื”ืœ ื™ืฉ ื‘ื• ืขืฆื ื•ืื™ืŸ,2
65
+ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ืื™ื ื• ืžื˜ืžื ื‘ืื”ืœ ืืžืจ ืจื‘ ื—ืกื“ื ื“ื‘ืจ,2
66
+ ื–ื” ืจื‘ื™ื ื• ื”ื’ื“ื•ืœ ืืžืจื• ื”ืžืงื•ื ื™ื”ื™ื” ื‘ืขื–ืจื• ืืฆื‘ืข ื™ืชืจื” ืฉื™ืฉ ื‘ื• ืขืฆื,2
67
+ ื•ืื™ืŸ ื‘ื• ืฆืคื•ืจืŸ ืžื˜ืžื ื‘ืžื’ืข ื•ื‘ืžืฉื ื•ืื™ื ื• ืžื˜ืžื ื‘ืื”ืœ ืืžืจ ืจื‘ื” ื‘ืจ,2
68
+ ื‘ืจ ื—ื ื” ื"ืจ ื™ื•ื—ื ืŸ ื•ื›ืฉืื™ื ื” ื ืกืคืจืช ืขืœ ื’ื‘ ื”ื™ื“: ื›ืœ ื”ืžื˜ืžื ืžื“ืจืก,2
69
+ ื•ื›ื•': ื›ืœ ื“ื—ื–ื™ ืœืžื“ืจืก ืžื˜ืžื ื˜ืžื ืžืช ื•ื™ืฉ ืฉืžื˜ืžื ื˜ืžื ืžืช ื•ืื™ืŸ,2
70
+ ืžื˜ืžื ืžื“ืจืก ืœืืชื•ื™ื™ ืžืื™ ืœืืชื•ื™ื™ ืกืื” ื•ืชืจืงื‘ ื“ืชื ื™ื (ื•ื™ืงืจื ื˜ื•) ื•ื”ื™ื•ืฉื‘ ืขืœ,2
71
+ ื”ื›ืœื™ ื™ื›ื•ืœ ื›ืคื” ืกืื” ื•ื™ืฉื‘ ืขืœื™ื” ืื• ืชืจืงื‘ ื•ื™ืฉื‘ ืขืœื™ื• ื™ื”ื ื˜ืžื,2
72
+ ืช"ืœ (ื•ื™ืงืจื ื˜ื•) ืืฉืจ ื™ืฉื‘ ืขืœื™ื• ื”ื–ื‘ ืžื™ ืฉืžื™ื•ื—ื“ ืœื™ืฉื™ื‘ื” ื™ืฆื ื–ื”,2
73
+ ืฉืื•ืžืจื™ื ืœื• ืขืžื•ื“ ื•ื ืขืฉื” ืžืœืื›ืชื ื•: ื›ืœ ื”ืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช ืจืื•ื™ ืœื“ื•ืŸ,2
74
+ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื•ื™ืฉ ืฉืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื•ืื™ื ื• ืจืื•ื™ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช:,2
75
+ ืืžืจ ืจื‘ ื™ื”ื•ื“ื” ืœืืชื•ื™ื™ ืžืžื–ืจ ืชื ื™ื ื ื—ื“ื ื–ื™ืžื ื ื”ื›ืœ ื›ืฉืจื™ืŸ ืœื“ื•ืŸ ื“ื™ื ื™,2
76
+ ืžืžื•ื ื•ืช ื•ืื™ืŸ ื”ื›ืœ ื›ืฉืจื™ืŸ ืœื“ื•ืŸ ื“ื™ื ื™ ื ืคืฉื•ืช ื•ื”ื•ื™ื ืŸ ื‘ื” ืœืืชื•ื™ื™ ืžืื™ ื•ืืžืจ,2
77
+ ืจื‘ ื™ื”ื•ื“ื” ืœืืชื•ื™ื™ ืžืžื–ืจ ื—ื“ื ืœืืชื•ื™ื™ ื’ืจ ื•ื—ื“ื ืœืืชื•ื™ื™ ืžืžื–ืจ ื•ืฆืจื™ื›ื™ ื“ืื™,2
78
+ ืืฉืžืขื™ื ืŸ ื’ืจ ืžืฉื•ื ื“ืจืื•ื™ ืœื‘ื ื‘ืงื”ืœ ืื‘ืœ ืžืžื–ืจ ื“ืื™ืŸ ืจืื•ื™ ืœื‘ื ื‘ืงื”ืœ,2
79
+ ืื™ืžื ืœื ื•ืื™ ืืฉืžืขื™ื ืŸ ืžืžื–ืจ ืžืฉื•ื ื“ืงืืชื™ ืžื˜ืคื” ื›ืฉืจื” ืื‘ืœ ื’ืจ ื“ืงืืชื™,2
80
+ ืžื˜ืคื” ืคืกื•ืœื” ืื™ืžื ืœื ืฆืจื™ื›ื: ื›ืœ ื”ื›ืฉืจ ืœื“ื•ืŸ ื›ืฉืจ ืœื”ืขื™ื“ ื•ื™ืฉ ืฉื›ืฉืจ,2
81
+ ืœื”ืขื™ื“ ื•ืื™ื ื• ื›ืฉืจ ืœื“ื•ืŸ: ืœืืชื•ื™ื™ ืžืื™ ื"ืจ ื™ื•ื—ื ืŸ ืœืืชื•ื™ื™ ืกื•ืžื ื‘ืื—ืช ืžืขื™ื ื™ื•,2
82
+ ื•ืžื ื™ ืจื‘ื™ ืžืื™ืจ ื”ื™ื ื“ืชื ื™ื ื”ื™ื” ืจื‘ื™ ืžืื™ืจ ืื•ืžืจ ืžื” ืช"ืœ (ื“ื‘ืจื™ื,2
83
+ ื›ื) ืขืœ ืคื™ื”ื ื™ื”ื™ื” ื›ืœ ืจื™ื‘ ื•ื›ืœ ื ื’ืข ื•ื›ื™ ืžื” ืขื ื™ืŸ ืจื™ื‘ื™ื,2
84
+ ืืฆืœ ื ื’ืขื™ื ืžืงื™ืฉ ืจื™ื‘ื™ื ืœื ื’ืขื™ื ืžื” ื ื’ืขื™ื ื‘ื™ื•ื ื“ื›ืชื™ื‘ (ื•ื™ืงืจื ื™ื’) ื•ื‘ื™ื•ื,2
85
+ ื”ืจืื•ืช ื‘ื• ืืฃ ืจื™ื‘ื™ื ื‘ื™ื•ื ื•ืžื” ื ื’ืขื™ื ืฉืœื ื‘ืกื•ืžื ื“ื›ืชื™ื‘ (ื•ื™ืงืจื ื™ื’),2
86
+ ืœื›ืœ ืžืจืื” ืขื™ื ื™ ื”ื›ื”ืŸ ืืฃ ืจื™ื‘ื™ื ืฉืœื ื‘ืกื•ืžื ื•ืžืงื™ืฉ ื ื’ืขื™ื ืœืจื™ื‘ื™ื ืžื”,2
87
+ ืจื™ื‘ื™ื ืฉืœื ื‘ืงืจื•ื‘ื™ื ืืฃ ื ื’ืขื™ื ืฉืœื ื‘ืงืจื•ื‘ื™ื ืื™ ืžื” ืจื™ื‘ื™ื ื‘ืฉืœืฉื” ืืฃ,2
88
+ ื ื’ืขื™ื ื‘ืฉืœืฉื” ื•ื“ื™ืŸ ื”ื•ื ืžืžื•ื ื• ื‘ืฉืœืฉื” ื’ื•ืคื• ืœื ื›"ืฉ ืช"ืœ (ื•ื™ืงืจื ื™ื’),2
89
+ ื•ื”ื•ื‘ื ืืœ ืื”ืจืŸ ื”ื›ื”ืŸ ืื• ืืœ ืื—ื“ ืžื‘ื ื™ื• ื”ื›ื”ื ื™ื ื”ื ืœืžื“ืช ืฉืืคื™ืœื•,2
90
+ ื›ื”ืŸ ืื—ื“ ืจื•ืื” ืืช ื”ื ื’ืขื™ื ื”ื”ื•ื ืกืžื™ื ื“ื”ื•ื” ื‘ืฉื‘ื‘ื•ืชื™ื” ื“ืจื‘ื™ ื™ื•ื—ื ืŸ ื“ื”ื•ื”,2
91
+ ืงื“ื™ื™ืŸ ื“ื™ื ื ื•ืœื ืงืืžืจ ืœื™ื” ื•ืœื ืžื™ื“ื™ ื”ื™ื›ื™ ืขื‘ื™ื“ ื”ื›ื™ ื•ื”ืืžืจ ืจื‘ื™,2
92
+ ื™ื•ื—ื ืŸ ื”ืœื›ื” ื›ืกืชื ืžืฉื ื” ื•ืชื ืŸ ื›ืœ ื”ื›ืฉืจ ืœื“ื•ืŸ ื›ืฉืจ ืœื”ืขื™ื“ ื•ื™ืฉ ื›ืฉืจ,2
93
+ ืœื”ืขื™ื“ ื•ืื™ืŸ ื›ืฉืจ ืœื“ื•ืŸ ื•ืืžืจื™ื ืŸ ืœืืชื•ื™ื™ ืžืื™ ื•ืืžืจ ืจื‘ื™ ื™ื•ื—ื ืŸ ืœืืชื•ื™ื™ ืกื•ืžื,2
94
+ ื‘ืื—ืช ืžืขื™ื ื™ื• ืจื‘ื™ ื™ื•ื—ื ืŸ ืกืชืžื ืื—ืจื™ื ื ืืฉื›ื— ื“ืชื ืŸ ื“ื™ื ื™ ืžืžื•ื ื•ืช ื“ื ื™ืŸ ื‘ื™ื•ื,2
95
+ ื•ื’ื•ืžืจื™ืŸ ื‘ืœื™ืœื” ื•ืžืื™ ืื•ืœืžื™ื” ื“ื”ืื™ ืกืชืžื ืžื”ืื™ ืกืชืžื ืื™ื‘ืขื™ืช ืื™ืžื ืกืชืžื ื“ืจื‘ื™ื,2
96
+ ืขื“ื™ืฃ ื•ืื™ื‘ืขื™ืช ืื™ืžื ืžืฉื•ื ื“ืงืชื ื™ ืœื” ื’ื‘ื™ ื”ืœื›ืชื ื“ื“ื™ื ื™: ื›ืœ ืฉื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช,2
97
+ ืžื˜ืžื ื˜ื•ืžืืช ืื•ื›ืœื™ืŸ ื•ื™ืฉ ืฉืžื˜ืžื ื˜ื•ืžืืช ืื•ื›ืœื™ืŸ ื•ืื™ื ื• ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช: ืœืืชื•ื™ื™ ืžืื™,2
98
+ ืœืืชื•ื™ื™ ื‘ืฉืจ ื•ื“ื’ื™ื ื•ื‘ื™ืฆื™ื: ื›ืœ ืฉื—ื™ื™ื‘ ื‘ืคืื” ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ื•ื™ืฉ ืฉื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช,2
99
+ ื•ืื™ื ื• ื—ื™ื™ื‘ ื‘ืคืื”: ืœืืชื•ื™ื™ ืžืื™ ืœืืชื•ื™ื™ ืชืื ื” ื•ื™ืจืง ืฉืื™ื ื• ื—ื™ื™ื‘ ื‘ืคืื” ื“ืชื ืŸ,2
100
+ ื›ืœืœ ืืžืจื• ื‘ืคืื” ื›ืœ ืฉื”ื•ื ืื•ื›ืœ ื•ื ืฉืžืจ ื•ื’ื™ื“ื•ืœื• ืžืŸ ื”ืืจืฅ ื•ืœืงื™ื˜ืชื• ื›ืื—ื“,2
101
+ ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื ื—ื™ื™ื‘ ื‘ืคืื” ืื•ื›ืœ ืœืžืขื•ื˜ื™ ืกืคื™ื—ื™ ืกื˜ื™ื ื•ืงื•ืฆื” ื•ื ืฉืžืจ ืœืžืขื•ื˜ื™ ื”ืคืงืจ,2
102
+ ื•ื’ื™ื“ื•ืœื• ืžืŸ ื”ืืจืฅ ืœืžืขื•ื˜ื™ ื›ืžื”ื™ื ื•ืคื˜ืจื™ื•ืช ื•ืœืงื™ื˜ืชื• ื›ืื—ื“ ืœืžืขื•ื˜ื™ ืชืื ื” ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื,2
103
+ ืœืžืขื•ื˜ื™ ื™ืจืง ื•ืื™ืœื• ื’ื‘ื™ ืžืขืฉืจ ืชื ืŸ ื›ืœ ืฉื”ื•ื ืื•ื›ืœ ื•ื ืฉืžืจ ื•ื’ื™ื“ื•ืœื• ืžืŸ,2
104
+ ื”ืืจืฅ ื—ื™ื™ื‘ ื‘ืžืขืฉืจื•ืช ื•ืื™ืœื• ืœืงื™ื˜ืชื• ื›ืื—ื“ ื•ืžื›ื ื™ืกื• ืœืงื™ื•ื ืœื ืงืชื ื™ ืื ื”ื™ื•,2
105
+ ื‘ื”ื ืฉื•ืžื™ื ื•ื‘ืฆืœื™ืŸ ื—ื™ื™ื‘ื™ืŸ ื“ืชื ืŸ ืžืœื‘ื ื•ืช ื‘ืฆืœื™ื ืฉื‘ื™ืŸ ื”ื™ืจืง ืจ' ื™ื•ืกื™ ืื•ืžืจ,2
106
+ ืคืื” ืžื›ืœ ืื—ืช ื•ืื—ืช ื•ื—ื›"ื ืžืื—ืช ืขืœ ื”ื›ืœ ืืžืจ ืจื‘ื” ื‘ืจ ื‘ืจ,2
107
+ ื—ื ื” ื"ืจ ื™ื•ื—ื ืŸ ืขื•ืœืฉื™ืŸ ืฉื–ืจืขืŸ ืžืชื—ื™ืœื” ืœื‘ื”ืžื” ื•ื ืžืœืš ืขืœื™ื”ืŸ ืœืื“ื,2
data_creation/graf_model.py ADDED
@@ -0,0 +1,62 @@
+ import joblib
+ import numpy as np
+ import matplotlib.pyplot as plt
+ from sklearn.svm import SVC
+
+ # Load the trained model and vectorizer
+ model_filename = "text_identification_model.pkl"
+ vectorizer_filename = "text_identification_vectorizer.pkl"
+ loaded_classifier = joblib.load(model_filename)
+ vectorizer = joblib.load(vectorizer_filename)
+
+ # Create a sample text ("A healthy and pleasant summer, friends!")
+ sample_text = "קיץ בריא ונעים חברים!"
+
+ # Transform the sample text using the vectorizer
+ sample_text_tfidf = vectorizer.transform([sample_text])
+
+ # Make predictions using the loaded model
+ predicted_class = loaded_classifier.predict(sample_text_tfidf)
+
+ # Visualize the decision boundaries (example with a simple 2D dataset)
+ # Modify this part according to your data and model
+ # For complex data, consider using libraries like plotly
+ # to create more informative visualizations
+
+ # Generate random data for visualization (unrelated to the text model)
+ X_visual = np.random.rand(300, 2) * 10
+ y_visual = np.random.randint(0, 3, size=300)
+
+ # Train an SVM on the generated data
+ svm_classifier = SVC(kernel='linear', C=1.0)
+ svm_classifier.fit(X_visual, y_visual)
+
+ # Plot the data points
+ plt.scatter(X_visual[:, 0], X_visual[:, 1], c=y_visual, cmap=plt.cm.Paired)
+
+ # Plot the decision boundaries
+ ax = plt.gca()
+ xlim = ax.get_xlim()
+ ylim = ax.get_ylim()
+
+ xx, yy = np.meshgrid(np.linspace(xlim[0], xlim[1], 50),
+                      np.linspace(ylim[0], ylim[1], 50))
+
+ Z = svm_classifier.predict(np.c_[xx.ravel(), yy.ravel()])
+ Z = Z.reshape(xx.shape)
+
+ plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)
+
+ # Highlight the support vectors
+ plt.scatter(svm_classifier.support_vectors_[:, 0],
+             svm_classifier.support_vectors_[:, 1],
+             s=100, facecolors='none', edgecolors='k')
+
+ # Plot the predicted sample point
+ # Note: the coordinates are the first two TF-IDF features of the sample,
+ # which is only illustrative; they are unrelated to the random 2D data above.
+ plt.scatter(sample_text_tfidf[0, 0], sample_text_tfidf[0, 1], marker='x', color='red', label=f'Predicted Class: {predicted_class[0]}')
+
+ plt.title('Support Vector Machine Visualization')
+ plt.xlabel('Feature 1')
+ plt.ylabel('Feature 2')
+ plt.legend()
+ plt.show()
data_creation/processed_text.csv ADDED
@@ -0,0 +1,71 @@
1
+ "ื•ื™ืขืŸ ืืœื™ืคื– ื”ืชื™ืžื ื™ ื•ื™ืืžืจ,1"
2
+ "ื”ื ืกื” ื“ื‘ืจ ืืœื™ืš ืชืœืื” ื•ืขืฆืจ ื‘ืžืœื™ืŸ ืžื™ ื™ื•ื›ืœ,1"
3
+ "ื”ื ื” ื™ืกืจืช ืจื‘ื™ื ื•ื™ื“ื™ื ืจืคื•ืช ืชื—ื–ืง,1"
4
+ "ื›ื•ืฉืœ ื™ืงื™ืžื•ืŸ ืžืœื™ืš ื•ื‘ืจื›ื™ื ื›ืจืขื•ืช ืชืืžืฅ,1"
5
+ "ื›ื™ ืขืชื” ืชื‘ื•ื ืืœื™ืš ื•ืชืœื ืชื’ืข ืขื“ื™ืš ื•ืชื‘ื”ืœ,1"
6
+ "ื”ืœื ื™ืจืืชืš ื›ืกืœืชืš ืชืงื•ืชืš ื•ืชื ื“ืจื›ื™ืš,1"
7
+ "ื–ื›ืจ ื ื ืžื™ ื”ื•ื ื ืงื™ ืื‘ื“ ื•ืื™ืคื” ื™ืฉืจื™ื ื ื›ื—ื“ื•,1"
8
+ "ื›ืืฉืจ ืจืื™ืชื™ ื—ืจืฉื™ ืื•ืŸ ื•ื–ืจืขื™ ืขืžืœ ื™ืงืฆืจื”ื•,1"
9
+ "ืžื ืฉืžืช ืืœื•ื” ื™ืื‘ื“ื• ื•ืžืจื•ื— ืืคื• ื™ื›ืœื•,1"
10
+ "ืฉืื’ืช ืืจื™ื” ื•ืงื•ืœ ืฉื—ืœ ื•ืฉื ื™ ื›ืคื™ืจื™ื ื ืชืขื•,1"
11
+ "ืœื™ืฉ ืื‘ื“ ืžื‘ืœื™ ื˜ืจืฃ ื•ื‘ื ื™ ืœื‘ื™ื ื™ืชืคืจื“ื•,1"
12
+ "ื•ืืœื™ ื“ื‘ืจ ื™ื’ื ื‘ ื•ืชืงื— ืื–ื ื™ ืฉืžืฅ ืžื ื”ื•,1"
13
+ "ื‘ืฉืขืคื™ื ืžื—ื–ื™ื ื•ืช ืœื™ืœื” ื‘ื ืคืœ ืชืจื“ืžื” ืขืœ ืื ืฉื™ื,1"
14
+ "ืคื—ื“ ืงืจืื ื™ ื•ืจืขื“ื” ื•ืจื‘ ืขืฆืžื•ืชื™ ื”ืคื—ื™ื“,1"
15
+ "ื•ืจื•ื— ืขืœ ืคื ื™ ื™ื—ืœืฃ ืชืกืžืจ ืฉืขืจืช ื‘ืฉืจื™,1"
16
+ "ื™ืขืžื“ ื•ืœื ืื›ื™ืจ ืžืจืื”ื• ืชืžื•ื ื” ืœื ื’ื“ ืขื™ื ื™ ื“ืžืžื” ื•ืงื•ืœ ืืฉืžืข,1"
17
+ "ื”ืื ื•ืฉ ืžืืœื•ื” ื™ืฆื“ืง ืื ืžืขืฉื”ื• ื™ื˜ื”ืจ ื’ื‘ืจ,1"
18
+ "ื”ืŸ ื‘ืขื‘ื“ื™ื• ืœื ื™ืืžื™ืŸ ื•ื‘ืžืœืื›ื™ื• ื™ืฉื™ื ืชื”ืœื”,1"
19
+ "ืืฃ ืฉื›ื ื™ ื‘ืชื™ ื—ืžืจ ืืฉืจ ื‘ืขืคืจ ื™ืกื•ื“ื ื™ื“ื›ืื•ื ืœืคื ื™ ืขืฉ,1"
20
+ "ืžื‘ืงืจ ืœืขืจื‘ ื™ื›ืชื• ืžื‘ืœื™ ืžืฉื™ื ืœื ืฆื— ื™ืื‘ื“ื•,1"
21
+ "ื”ืœื ื ืกืข ื™ืชืจื ื‘ื ื™ืžื•ืชื• ื•ืœื ื‘ื—ื›ืžื”,1"
22
+ "ืงืจื ื ื ื”ื™ืฉ ืขื•ื ืš ื•ืืœ ืžื™ ืžืงื“ืฉื™ื ืชืคื ื”,1"
23
+ "ื›ื™ ืœืื•ื™ืœ ื™ื”ืจื’ ื›ืขืฉ ื•ืคืชื” ืชืžื™ืช ืงื ืื”,1"
24
+ "ืื ื™ ืจืื™ืชื™ ืื•ื™ืœ ืžืฉืจื™ืฉ ื•ืืงื•ื‘ ื ื•ื”ื• ืคืชืื,1"
25
+ "ื™ืจื—ืงื• ื‘ื ื™ื• ืžื™ืฉืข ื•ื™ื“ื›ืื• ื‘ืฉืขืจ ื•ืื™ืŸ ืžืฆื™ืœ,1"
26
+ "ืืฉืจ ืงืฆื™ืจื• ืจืขื‘ ื™ืื›ืœ ื•ืืœ ืžืฆื ื™ื ื™ืงื—ื”ื• ื•ืฉืืฃ ืฆืžื™ื ื—ื™ืœื,1"
27
+ "ื›ื™ ืœื ื™ืฆื ืžืขืคืจ ืื•ืŸ ื•ืžืื“ืžื” ืœื ื™ืฆืžื— ืขืžืœ,1"
28
+ "ื›ื™ ืื“ื ืœืขืžืœ ื™ื•ืœื“ ื•ื‘ื ื™ ืจืฉืฃ ื™ื’ื‘ื™ื”ื• ืขื•ืฃ,1"
29
+ "ืื•ืœื ืื ื™ ืื“ืจืฉ ืืœ ืืœ ื•ืืœ ืืœื”ื™ื ืืฉื™ื ื“ื‘ืจืชื™,1"
30
+ "ืขืฉื” ื’ื“ืœื•ืช ื•ืื™ืŸ ื—ืงืจ ื ืคืœืื•ืช ืขื“ ืื™ืŸ ืžืกืคืจ,1"
31
+ "ื”ื ืชืŸ ืžื˜ืจ ืขืœ ืคื ื™ ืืจืฅ ื•ืฉืœื— ืžื™ื ืขืœ ืคื ื™ ื—ื•ืฆื•ืช,1"
32
+ "ืœืฉื•ื ืฉืคืœื™ื ืœืžืจื•ื ื•ืงื“ืจื™ื ืฉื’ื‘ื• ื™ืฉืข,1"
33
+ "ืžืคืจ ืžื—ืฉื‘ื•ืช ืขืจื•ืžื™ื ื•ืœื ืชืขืฉื™ื ื” ื™ื“ื™ื”ื ืชื•ืฉื™ื”,1"
34
+ "ืœื›ื“ ื—ื›ืžื™ื ื‘ืขืจืžื ื•ืขืฆืช ื ืคืชืœื™ื ื ืžื”ืจื”,1"
35
+ "ื™ื•ืžื ื™ืคื’ืฉื• ื—ืฉืš ื•ื›ืœื™ืœื” ื™ืžืฉืฉื• ื‘ืฆื”ืจื™ื,1"
36
+ "ื•ื™ืฉืข ืžื—ืจื‘ ืžืคื™ื”ื ื•ืžื™ื“ ื—ื–ืง ืื‘ื™ื•ืŸ,1"
37
+ "ื•ืชื”ื™ ืœื“ืœ ืชืงื•ื” ื•ืขืœืชื” ืงืคืฆื” ืคื™ื”,1"
38
+ "ืื ืขืœ ื”ืžืœืš ื˜ื•ื‘ ื™ื›ืชื‘ ืœืื‘ื“ื ื•ืขืฉืจืช ืืœืคื™ื ื›ื›ืจ ื›ืกืฃ ืืฉืงื•ืœ ืขืœ ื™ื“ื™ ืขืฉื™ ื”ืžืœืื›ื” ืœื”ื‘ื™ื ืืœ ื’ื ื–ื™ ื”ืžืœืš,1"
39
+ "ื•ื™ืกืจ ื”ืžืœืš ืืช ื˜ื‘ืขืชื• ืžืขืœ ื™ื“ื• ื•ื™ืชื ื” ืœื”ืžืŸ ื‘ืŸ ื”ืžื“ืชื ื”ืื’ื’ื™ ืฆืจืจ ื”ื™ื”ื•ื“ื™ื,1"
40
+ "ื•ื™ืืžืจ ื”ืžืœืš ืœื”ืžืŸ ื”ื›ืกืฃ ื ืชื•ืŸ ืœืš ื•ื”ืขื ืœืขืฉื•ืช ื‘ื• ื›ื˜ื•ื‘ ื‘ืขื™ื ื™ืš,1"
41
+ "ื•ื™ืงืจืื• ืกืคืจื™ ื”ืžืœืš ื‘ื—ื“ืฉ ื”ืจืืฉื•ืŸ ื‘ืฉืœื•ืฉื” ืขืฉืจ ื™ื•ื ื‘ื• ื•ื™ื›ืชื‘ ื›ื›ืœ ืืฉืจ ืฆื•ื” ื”ืžืŸ ืืœ ืื—ืฉื“ืจืคื ื™ ื”ืžืœืš ื•ืืœ ื”ืคื—ื•ืช ืืฉืจ ืขืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื•ืืœ ืฉืจื™ ืขื ื•ืขื ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื›ื›ืชื‘ื” ื•ืขื ื•ืขื ื›ืœืฉื•ื ื• ื‘ืฉื ื”ืžืœืš ืื—ืฉื•ืจืฉ ื ื›ืชื‘ ื•ื ื—ืชื ื‘ื˜ื‘ืขืช ื”ืžืœืš,1"
42
+ "ื•ื ืฉืœื•ื— ืกืคืจื™ื ื‘ื™ื“ ื”ืจืฆื™ื ืืœ ื›ืœ ืžื“ื™ื ื•ืช ื”ืžืœืš ืœื”ืฉืžื™ื“ ืœื”ืจื’ ื•ืœืื‘ื“ ืืช ื›ืœ ื”ื™ื”ื•ื“ื™ื ืžื ืขืจ ื•ืขื“ ื–ืงืŸ ื˜ืฃ ื•ื ืฉื™ื ื‘ื™ื•ื ืื—ื“ ื‘ืฉืœื•ืฉื” ืขืฉืจ ืœื—ื“ืฉ ืฉื ื™ื ืขืฉืจ ื”ื•ื ื—ื“ืฉ ืื“ืจ ื•ืฉืœืœื ืœื‘ื•ื–,1"
43
+ "ืคืชืฉื’ืŸ ื”ื›ืชื‘ ืœื”ื ืชืŸ ื“ืช ื‘ื›ืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ื’ืœื•ื™ ืœื›ืœ ื”ืขืžื™ื ืœื”ื™ื•ืช ืขืชื“ื™ื ืœื™ื•ื ื”ื–ื”,1"
44
+ "ื”ืจืฆื™ื ื™ืฆืื• ื“ื—ื•ืคื™ื ื‘ื“ื‘ืจ ื”ืžืœืš ื•ื”ื“ืช ื ืชื ื” ื‘ืฉื•ืฉืŸ ื”ื‘ื™ืจื” ื•ื”ืžืœืš ื•ื”ืžืŸ ื™ืฉื‘ื• ืœืฉืชื•ืช ื•ื”ืขื™ืจ ืฉื•ืฉืŸ ื ื‘ื•ื›ื”,1"
45
+ "ื•ืžืจื“ื›ื™ ื™ื“ืข ืืช ื›ืœ ืืฉืจ ื ืขืฉื” ื•ื™ืงืจืข ืžืจื“ื›ื™ ืืช ื‘ื’ื“ื™ื• ื•ื™ืœื‘ืฉ ืฉืง ื•ืืคืจ ื•ื™ืฆื ื‘ืชื•ืš ื”ืขื™ืจ ื•ื™ื–ืขืง ื–ืขืงื” ื’ื“ืœื” ื•ืžืจื”,1"
46
+ "ื•ื™ื‘ื•ื ืขื“ ืœืคื ื™ ืฉืขืจ ื”ืžืœืš ื›ื™ ืื™ืŸ ืœื‘ื•ื ืืœ ืฉืขืจ ื”ืžืœืš ื‘ืœื‘ื•ืฉ ืฉืง,1"
47
+ "ื•ื‘ื›ืœ ืžื“ื™ื ื” ื•ืžื“ื™ื ื” ืžืงื•ื ืืฉืจ ื“ื‘ืจ ื”ืžืœืš ื•ื“ืชื• ืžื’ื™ืข ืื‘ืœ ื’ื“ื•ืœ ืœื™ื”ื•ื“ื™ื ื•ืฆื•ื ื•ื‘ื›ื™ ื•ืžืกืคื“ ืฉืง ื•ืืคืจ ื™ืฆืข ืœืจื‘ื™ื,1"
48
+ "ื•ืชื‘ื•ืื™ื ื” [ื•ืชื‘ื•ืื ื”] ื ืขืจื•ืช ืืกืชืจ ื•ืกืจื™ืกื™ื” ื•ื™ื’ื™ื“ื• ืœื” ื•ืชืชื—ืœื—ืœ ื”ืžืœื›ื” ืžืื“ ื•ืชืฉืœื— ื‘ื’ื“ื™ื ืœื”ืœื‘ื™ืฉ ืืช ืžืจื“ื›ื™ ื•ืœื”ืกื™ืจ ืฉืงื• ืžืขืœื™ื• ื•ืœื ืงื‘ืœ,1"
49
+ "ื•ืชืงืจื ืืกืชืจ ืœื”ืชืš ืžืกืจื™ืกื™ ื”ืžืœืš ืืฉืจ ื”ืขืžื™ื“ ืœืคื ื™ื” ื•ืชืฆื•ื”ื• ืขืœ ืžืจื“ื›ื™ ืœื“ืขืช ืžื” ื–ื” ื•ืขืœ ืžื” ื–ื”,1"
50
+ "ื•ื™ืฆื ื”ืชืš ืืœ ืžืจื“ื›ื™ ืืœ ืจื—ื•ื‘ ื”ืขื™ืจ ืืฉืจ ืœืคื ื™ ืฉืขืจ ื”ืžืœืš,1"
51
+ "ื•ื™ื’ื“ ืœื• ืžืจื“ื›ื™ ืืช ื›ืœ ืืฉืจ ืงืจื”ื• ื•ืืช ืคืจืฉืช ื”ื›ืกืฃ ืืฉืจ ืืžืจ ื”ืžืŸ ืœืฉืงื•ืœ ืขืœ ื’ื ื–ื™ ื”ืžืœืš ื‘ื™ื”ื•ื“ื™ื™ื [ื‘ื™ื”ื•ื“ื™ื] ืœืื‘ื“ื,1"
52
+ "ื•ืืช ืคืชืฉื’ืŸ ื›ืชื‘ ื”ื“ืช ืืฉืจ ื ืชืŸ ื‘๏ฟฝ๏ฟฝื•ืฉืŸ ืœื”ืฉืžื™ื“ื ื ืชืŸ ืœื• ืœื”ืจืื•ืช ืืช ืืกืชืจ ื•ืœื”ื’ื™ื“ ืœื” ื•ืœืฆื•ื•ืช ืขืœื™ื” ืœื‘ื•ื ืืœ ื”ืžืœืš ืœื”ืชื—ื ืŸ ืœื• ื•ืœื‘ืงืฉ ืžืœืคื ื™ื• ืขืœ ืขืžื”,1"
53
+ "ื•ื™ื‘ื•ื ื”ืชืš ื•ื™ื’ื“ ืœืืกืชืจ ืืช ื“ื‘ืจื™ ืžืจื“ื›ื™,1"
54
+ "ื•ืชืืžืจ ืืกืชืจ ืœื”ืชืš ื•ืชืฆื•ื”ื• ืืœ ืžืจื“ื›ื™,1"
55
+ "ื›ืœ ืขื‘ื“ื™ ื”ืžืœืš ื•ืขื ืžื“ื™ื ื•ืช ื”ืžืœืš ื™ื•ื“ืขื™ื ืืฉืจ ื›ืœ ืื™ืฉ ื•ืืฉื” ืืฉืจ ื™ื‘ื•ื ืืœ ื”ืžืœืš ืืœ ื”ื—ืฆืจ ื”ืคื ื™ืžื™ืช ืืฉืจ ืœื ื™ืงืจื ืื—ืช ื“ืชื• ืœื”ืžื™ืช ืœื‘ื“ ืžืืฉืจ ื™ื•ืฉื™ื˜ ืœื• ื”ืžืœืš ืืช ืฉืจื‘ื™ื˜ ื”ื–ื”ื‘ ื•ื—ื™ื” ื•ืื ื™ ืœื ื ืงืจืืชื™ ืœื‘ื•ื ืืœ ื”ืžืœืš ื–ื” ืฉืœื•ืฉื™ื ื™ื•ื,1"
56
+ "ื•ื™ื’ื™ื“ื• ืœืžืจื“ื›ื™ ืืช ื“ื‘ืจื™ ืืกืชืจ,1"
57
+ "ื•ื™ืืžืจ ืžืจื“ื›ื™ ืœื”ืฉื™ื‘ ืืœ ืืกืชืจ ืืœ ืชื“ืžื™ ื‘ื ืคืฉืš ืœื”ืžืœื˜ ื‘ื™ืช ื”ืžืœืš ืžื›ืœ ื”ื™ื”ื•ื“ื™ื,1"
58
+ "ื›ื™ ืื ื”ื—ืจืฉ ืชื—ืจื™ืฉื™ ื‘ืขืช ื”ื–ืืช ืจื•ื— ื•ื”ืฆืœื” ื™ืขืžื•ื“ ืœื™ื”ื•ื“ื™ื ืžืžืงื•ื ืื—ืจ ื•ืืช ื•ื‘ื™ืช ืื‘ื™ืš ืชืื‘ื“ื• ื•ืžื™ ื™ื•ื“ืข ืื ืœืขืช ื›ื–ืืช ื”ื’ืขืช ืœืžืœื›ื•ืช,1"
59
+ "ื•ืชืืžืจ ืืกืชืจ ืœื”ืฉื™ื‘ ืืœ ืžืจื“ื›ื™,1"
60
+ "ื›ืžื’ื“ืœ ื“ื•ื™ื“ ืฆื•ืืจืš ื‘ื ื•ื™ ืœืชืœืคื™ื•ืช ืืœืฃ ื”ืžื’ืŸ ืชืœื•ื™ ืขืœื™ื• ื›ืœ ืฉืœื˜ื™ ื”ื’ื‘ื•ืจื™ื,1"
61
+ "ืฉื ื™ ืฉื“ื™ืš ื›ืฉื ื™ ืขืคืจื™ื ืชืื•ืžื™ ืฆื‘ื™ื” ื”ืจื•ืขื™ื ื‘ืฉื•ืฉื ื™ื,1"
62
+ "ืขื“ ืฉื™ืคื•ื— ื”ื™ื•ื ื•ื ืกื• ื”ืฆืœืœื™ื ืืœืš ืœื™ ืืœ ื”ืจ ื”ืžื•ืจ ื•ืืœ ื’ื‘ืขืช ื”ืœื‘ื•ื ื”,1"
63
+ "ื›ืœืš ื™ืคื” ืจืขื™ืชื™ ื•ืžื•ื ืื™ืŸ ื‘ืš ืืชื™ ืžืœื‘ื ื•ืŸ ื›ืœื” ืืชื™ ืžืœื‘ื ื•ืŸ ืชื‘ื•ืื™ ืชืฉื•ืจื™ ืžืจืืฉ ืืžื ื” ืžืจืืฉ ืฉื ื™ืจ ื•ื—ืจืžื•ืŸ ืžืžืขื ื•ืช ืืจื™ื•ืช ืžื”ืจืจื™ ื ืžืจื™ื,1"
64
+ "ืœื‘ื‘ืชื ื™ ืื—ืชื™ ื›ืœื” ืœื‘ื‘ืชื™ื ื™ ื‘ืื—ื“ [ื‘ืื—ืช] ืžืขื™ื ื™ืš ื‘ืื—ื“ ืขื ืง ืžืฆื•ืจื ื™ืš,1"
65
+ "ืžื” ื™ืคื• ื“ื“ื™ืš ืื—ืชื™ ื›ืœื” ืžื” ื˜ื‘ื• ื“ื“ื™ืš ืžื™ื™ืŸ ื•ืจื™ื— ืฉืžื ื™ืš ืžื›ืœ ื‘ืฉืžื™ื,1"
66
+ "ื ืคืช ืชื˜ืคื ื” ืฉืคืชื•ืชื™ืš ื›ืœื” ื“ื‘ืฉ ื•ื—ืœื‘ ืชื—ืช ืœืฉื•ื ืš ื•ืจื™ื— ืฉืœืžืชื™ืš ื›ืจื™ื— ืœื‘ื ื•ืŸ ื’ืŸ ื ืขื•ืœ ืื—ืชื™ ื›ืœื” ื’ืœ ื ืขื•ืœ ืžืขื™ืŸ ื—ืชื•ื,1"
67
+ "ืฉืœื—ื™ืš ืคืจื“ืก ืจืžื•ื ื™ื ืขื ืคืจื™ ืžื’ื“ื™ื ื›ืคืจื™ื ืขื ื ืจื“ื™ื,1"
68
+ "ื ืจื“ ื•ื›ืจื›ื ืงื ื” ื•ืงื ืžื•ืŸ ืขื ื›ืœ ืขืฆื™ ืœื‘ื•ื ื” ืžืจ ื•ืื”ืœื•ืช ืขื ื›ืœ ืจืืฉื™ ื‘ืฉืžื™ื,1"
69
+ "ืžืขื™ืŸ ื’ื ื™ื ื‘ืืจ ืžื™ื ื—ื™ื™ื ื•ื ื–ืœื™ื ืžืŸ ืœื‘ื ื•ืŸ,1"
70
+ "ืขื•ืจื™ ืฆืคื•ืŸ ื•ื‘ื•ืื™ ืชื™ืžืŸ ื”ืคื™ื—ื™ ื’ื ื™ ื™ื–ืœื• ื‘ืฉืžื™ื• ื™ื‘ื ื“ื•ื“ื™ ืœื’ื ื• ื•ื™ืื›ืœ ืคืจื™ ืžื’ื“ื™ื•,1"
71
+ "ื‘ืืชื™ ืœื’ื ื™ ืื—ืชื™ ื›ืœื” ืืจื™ืชื™ ืžื•ืจื™ ืขื ื‘ืฉืžื™ ืื›ืœืชื™ ื™ืขืจื™ ืขื ื“ื‘ืฉื™ ืฉืชื™ืชื™ ื™ื™ื ื™ ืขื ื—ืœื‘ื™ ืื›ืœื• ืจืขื™ื ืฉืชื• ื•ืฉื›ืจื• ื“ื•ื“ื™ื,1"
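
Both data files above store one sample per row as text followed by a numeric label after the last comma (1 in `processed_text.csv`, 2 in `divided_sentences.txt`), and the rows of `processed_text.csv` are wrapped in double quotes as a whole. A minimal sketch of reading such a row, assuming that layout holds for every row; the helper name `parse_labeled_row` is hypothetical and not part of the repository.

```python
def parse_labeled_row(line):
    """Split one raw row into (text, label); the label follows the last comma."""
    row = line.strip()
    # Rows may be wrapped in one pair of double quotes covering text and label.
    if len(row) >= 2 and row.startswith('"') and row.endswith('"'):
        row = row[1:-1]
    # Split on the LAST comma so commas inside the text are preserved.
    text, _, label = row.rpartition(",")
    if not text or not label.strip().isdigit():
        raise ValueError(f"not a 'text,label' row: {line!r}")
    return text, int(label)


# Placeholder rows, invented for this example
print(parse_labeled_row('"some verse text,1"'))
print(parse_labeled_row("some other, longer passage,2"))
```

Splitting on the last comma rather than the first matters because the text itself can contain commas, as several of the longer rows do.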
data_creation/text_identification_model.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d34b21921901b9ca8fb1ed6ad2896e731a2bfb0d83a0202f874e2244bd0aa44c
+ size 180099
data_creation/text_identification_vectorizer.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ef67b7027f2a506145ea801c6daeb177c535b97b2cbfefabafeb033e84487366
+ size 241793
data_creation/try_model.py ADDED
@@ -0,0 +1,49 @@
+ from sys import argv
+ import nltk
+ from nltk.tokenize import word_tokenize
+ from sklearn.feature_extraction.text import TfidfVectorizer
+ import joblib
+
+ # Load the trained model from the file
+ loaded_classifier = joblib.load("text_identification_model.pkl")
+
+ # Load the TF-IDF vectorizer used for training
+ vectorizer = joblib.load("text_identification_vectorizer.pkl")
+
+ # Define labels for your categories
+ categories = {0: 'Other', 1: 'Bible', 2: 'Talmud'}
+
+ def parse_text(new_text):
+     # Transform the new text using the TF-IDF vectorizer
+     new_text_tfidf = vectorizer.transform([new_text])
+
+     # Make predictions on the new text
+     prediction = loaded_classifier.predict(new_text_tfidf)
+
+     # Get the confidence score for the predicted class
+     # (predict_proba columns follow the order of classifier.classes_)
+     probabilities = loaded_classifier.predict_proba(new_text_tfidf)
+     class_index = list(loaded_classifier.classes_).index(prediction[0])
+     confidence_score = probabilities[0, class_index]
+
+     # Determine the predicted category label
+     predicted_category = categories[prediction[0]]
+
+     # Print the prediction and the confidence score
+     print(f"Text: {new_text} | Prediction: {predicted_category} | Confidence Score: {confidence_score:.4f}")
+
+
+ text_list = [
+     'כמה יפה ונאה כששומעים השירה שלהם',
+     'חדשות הערב: שלושה אנשים נצאו טובעים בכינרת',
+     'והיה בעת ההיא אחפש את ירושלים בנרות והודעתיה את כל תועבותיה',
+     'ויאמר משה אל בני ישראל',
+     'דאמר נשיא מביא שעיר תו הא דתנן',
+     'אמר ליה אביי לרב זעירא',
+     'ואיהו לא קא יהיב שעורא במשכא',
+ ]
+
+
+ if argv[1:]:
+     new_text = argv[1]
+     parse_text(new_text)
+ else:
+     for new_text in text_list:
+         parse_text(new_text)
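
In scikit-learn, `predict_proba` returns one column per entry of the classifier's `classes_` attribute, in that order, so the confidence for a prediction is the column whose class label matches it. A minimal stand-alone sketch of that lookup using plain lists; the helper name `confidence_for` and the probability values are assumptions made for illustration, not output from the trained model.

```python
def confidence_for(classes, probabilities, predicted):
    """Return the probability that belongs to the predicted class label."""
    # classes and probabilities are parallel: probabilities[i] is P(classes[i])
    index = list(classes).index(predicted)
    return probabilities[index]


categories = {0: "Other", 1: "Bible", 2: "Talmud"}
classes = [0, 1, 2]          # stand-in for classifier.classes_
row = [0.10, 0.15, 0.75]     # stand-in for one predict_proba row (invented)
label = 2                    # stand-in for classifier.predict(...)[0]

print(categories[label], confidence_for(classes, row, label))
```

Looking the column up through the class list avoids assuming that label `k` always sits in column `k`, which only holds when the labels happen to be 0, 1, 2, ... in sorted order.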
is_this_bible_model.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fb51c50be730acff9cf5af92f2e322aee9572e2b4b9381c434559f2fa0a87da1
+ size 115035
is_this_bible_vectorizer.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d5a76c9d59793b194d9022b091115b85ecbef1f7feef33bd87b503aaabc93ea
+ size 181677
templates/index.html ADDED
@@ -0,0 +1,93 @@
+ <!DOCTYPE html>
+ <html dir="rtl" lang="he">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>גילוי פסוקי התנ"ך באמצעות AI</title>
+     <style>
+         body {
+             font-family: 'Tahoma', sans-serif;
+             background-color: #f9f9f9;
+             color: #333;
+             margin: 0;
+             padding: 0;
+             display: flex;
+             align-items: center;
+             justify-content: center;
+             min-height: 100vh;
+             overflow: hidden;
+         }
+         .container {
+             width: 400px;
+             padding: 20px;
+             background-color: #fff;
+             border-radius: 15px;
+             text-align: center;
+             box-shadow: 0px 10px 25px rgba(0, 0, 0, 0.1);
+         }
+         .fixed-input {
+             width: 90%;
+             padding: 10px;
+             margin-bottom: 15px;
+             border: none;
+             border-radius: 8px;
+             background-color: #f2f2f2;
+             color: #333;
+             resize: none;
+         }
+         input[type="submit"] {
+             width: 90%;
+             padding: 10px;
+             border: none;
+             border-radius: 8px;
+             background-color: #5d8d77;
+             color: white;
+             cursor: pointer;
+             transition: background-color 0.3s ease-in-out;
+         }
+         input[type="submit"]:hover {
+             background-color: #507b66;
+         }
+         h1 {
+             color: #5d8d77;
+             margin-bottom: 5px;
+         }
+         h2 {
+             color: #777;
+             margin-top: 20px;
+         }
+         .result {
+             border-top: 1px solid #ddd;
+             padding-top: 20px;
+             margin-top: 20px;
+             transition: all 0.5s ease-in-out;
+         }
+     </style>
+     <script>
+         function revealResults() {
+             var result = document.querySelector('.result');
+             result.style.opacity = 1;
+             result.style.marginTop = '20px';
+         }
+     </script>
+ </head>
+ <body>
+     <div class="container">
+         <h1>גילוי פסוקי התנ"ך באמצעות AI</h1>
+         <p>הקלידו את הטקסט שתרצו, וגלו האם הוא אכן מופיע בתנ"ך באמצעות קסם הבינה המלאכותית</p>
+         <form method="POST" action="/" onsubmit="revealResults()">
+             <textarea class="fixed-input" name="new_text" rows="4" cols="50" placeholder="הקלידו את הטקסט כאן..."></textarea><br>
+             <input type="submit" value="הפעילו את הקסם">
+         </form>
+
+         <div class="result">
+             {% if prediction %}
+             <h2>הסודות התנ"כיים נחשפים</h2>
+             <p>הטקסט: {{ new_text }}</p>
+             <p>זוהה כ...: {{ prediction }}</p>
+             <p>ציון ודאות: {{ confidence_score }}</p>
+             {% endif %}
+         </div>
+     </div>
+ </body>
+ </html>
try_model.py ADDED
@@ -0,0 +1,74 @@
+ from sys import argv
+ #import re
+ import nltk
+ from nltk.corpus import stopwords
+ import joblib
+
+
+ # Disabled preprocessing, kept for reference (raw string so the regex
+ # backslashes are not treated as escape sequences)
+ r"""
+ # Remove punctuation and special characters
+ def remove_punctuation(text):
+     return re.sub(r'[^\w\s]', '', text)
+
+ # Function to remove custom stop words from text
+ def remove_custom_stopwords(text):
+     hebrew_stopwords = set(stopwords.words('hebrew'))
+     additional_stopwords = {'אני', 'אתה', 'את', 'אנחנו', 'אתם', 'אתן', 'הם', 'הן'}
+     hebrew_stopwords.update(additional_stopwords)
+     return ' '.join(word for word in text.split() if word not in hebrew_stopwords)
+
+
+ # Preprocess the new text (remove punctuation and custom stop words)
+ # To re-enable this inactive function, move this variable after the new_text variable
+ new_text_cleaned = remove_custom_stopwords(remove_punctuation(new_text))
+ """
+
+
+ # Load the trained model from the file
+ loaded_classifier = joblib.load("is_this_bible_model.pkl")
+
+ # Load the TF-IDF vectorizer used for training
+ vectorizer = joblib.load("is_this_bible_vectorizer.pkl")
+
+ def parse_text(new_text):
+     # Transform the new text using the TF-IDF vectorizer
+     new_text_tfidf = vectorizer.transform([new_text])
+
+     # Make predictions on the new text
+     prediction = loaded_classifier.predict(new_text_tfidf)
+
+     # Get the probability that the text belongs to class "Bible" (index 1),
+     # regardless of which class was predicted
+     probabilities = loaded_classifier.predict_proba(new_text_tfidf)
+     confidence_score = probabilities[0, 1]
+
+     # Print the prediction and the confidence score
+     print(f"Text: {new_text} | Prediction: {'Bible' if prediction[0] == 1 else 'Other'} | Confidence Score: {confidence_score:.4f}")
+
+
+ text_list = [
+     'אני יושב פה בשקט ומקלל את העובדה שחלק מהתוכנות שאני מתחזק קשורה לפייתון 2.4, שאין לה את זה',
+     'כמה יפה ונאה כששומעים השירה שלהם',
+     'והיה בעת ההיא אחפש את ירושלים בנרות והודעתיה את כל תועבותיה',
+     'והיא שעמדה לאבותינו ולנו שלא אחד בלבד עמד עלינו לכלותינו',
+     'אני הסתכלתי לשמים אתה צללת במים',
+     'הצב הוא בעל חיים שחי בים וביבשה',
+     'והיה הנשאר בציון והנותר בירושלים קדוש יאמר לו',
+     'שיר השירים אשר לשלמה',
+     'ישקני מנשיקות פיהו כי טובים דודיך מיין',
+     'והיה רק מלא שמחה וחדוה תמיד כשהיה גומר המנעל ומן הסתם היה לו שלשה קצוות',
+     'זה מעשה שלו וזה מעשה שלי ועוד מה לנו לדבר מאחרים',
+     'דודי ירד לגנו לערוגות הבושם לרעות בגנים וללקוט שושנים',
+     'וימרו בי בית ישראל במדבר בחקותי לא הלכו ואת משפטי מאסו אשר יעשה אתם האדם וחי בהם',
+     'זה לא משנה אופניים נעליים העיקר זה בחיים',
+     'זכור את יום השבת לקדשו',
+     'וישלח יעקב מלאכים לפניו אל עשיו אחיו',
+     'לך לך מארצך וממולדתך ומבית אביך',
+     'עדכון :דור לדור תנ"ך ,מאורעות בזמן התנ"ך קרדיט']
+
+ if argv[1:]:
+     new_text = argv[1]
+     parse_text(new_text)
+
+ else:
+     for new_text in text_list:
+         parse_text(new_text)
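
The script classifies only `argv[1]`, so a multi-word verse passed without shell quotes is cut down to its first word. A minimal sketch of one alternative, joining every command-line word into a single text; the helper name `text_from_args` and the fallback samples are hypothetical and not part of the repository.

```python
import sys


def text_from_args(args, fallback):
    """Return the texts to classify: all CLI words joined, or the fallback list."""
    if args:
        # Treat every word after the script name as part of one text
        return [" ".join(args)]
    return list(fallback)


# Placeholder fallback samples, invented for this example
for text in text_from_args(sys.argv[1:], ["first sample", "second sample"]):
    print(text)
```

Each returned text could then be handed to `parse_text` in place of the `argv[1]` branch; quoting the verse on the command line remains an equally valid way to get the same effect.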
try_model_webui.py ADDED
@@ -0,0 +1,35 @@
+ from flask import Flask, render_template, request
+ import webbrowser
+ import nltk
+ from nltk.corpus import stopwords
+ import joblib
+
+ app = Flask(__name__)
+
+ # Load the trained model and vectorizer outside the routes for better performance
+ loaded_classifier = joblib.load("is_this_bible_model.pkl")
+ vectorizer = joblib.load("is_this_bible_vectorizer.pkl")
+
+ def parse_text(new_text):
+     new_text_tfidf = vectorizer.transform([new_text])
+     prediction = loaded_classifier.predict(new_text_tfidf)
+     probabilities = loaded_classifier.predict_proba(new_text_tfidf)
+     confidence_score = probabilities[0, 1]
+     return 'תנ"ך' if prediction[0] == 1 else 'אחר', confidence_score
+
+ @app.route('/', methods=['GET', 'POST'])
+ def index():
+     prediction = None
+     confidence_score = None
+     new_text = None
+
+     if request.method == 'POST':
+         new_text = request.form['new_text']
+         if new_text:
+             prediction, confidence_score = parse_text(new_text)
+     return render_template('index.html', new_text=new_text, prediction=prediction, confidence_score=confidence_score)
+
+
+ if __name__ == '__main__':
+     webbrowser.open('http://127.0.0.1:5000/')
+     app.run(debug=True)
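
With `debug=True`, Flask normally starts Werkzeug's reloader, which runs the script a second time in a child process, so the `webbrowser.open` call may fire twice. A minimal sketch of a guard, assuming the reloader child is marked by the `WERKZEUG_RUN_MAIN` environment variable being set to `"true"`; the helper name `should_open_browser` is hypothetical.

```python
import os


def should_open_browser(environ=None):
    """Return True only in the process that is not the reloader child."""
    env = os.environ if environ is None else environ
    # Assumption: the Werkzeug reloader sets this variable in the child process.
    return env.get("WERKZEUG_RUN_MAIN") != "true"


if should_open_browser():
    print("would open http://127.0.0.1:5000/ here")
```

In the web UI script this check would wrap the `webbrowser.open(...)` line; passing `use_reloader=False` to `app.run` is another option, at the cost of losing automatic reloads.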