aliasgerovs committed
Commit b472976 · 1 Parent(s): e81407b
Files changed (2)
  1. nohup.out +39 -201
  2. predictors.py +11 -0
nohup.out CHANGED
@@ -1,192 +1,48 @@
- 2024-05-10 14:22:25.922427: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:22:30.394731: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
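The `KeyError: 118` above is raised from pickle's opcode dispatch table, which almost always means "isotonic_regression_model.joblib" is truncated or is a Git LFS pointer file rather than the real artifact. A minimal defensive load, assuming the path from the traceback (the pointer check is a hypothetical helper, not part of the app):

    import joblib

    MODEL_PATH = "isotonic_regression_model.joblib"

    def looks_like_lfs_pointer(path):
        # Hypothetical helper: Git LFS pointer files are small text files
        # beginning with "version https://git-lfs.github.com/spec/v1".
        with open(path, "rb") as f:
            return f.read(7) == b"version"

    try:
        if looks_like_lfs_pointer(MODEL_PATH):
            raise RuntimeError(f"{MODEL_PATH} is a Git LFS pointer; run `git lfs pull` first")
        iso_reg = joblib.load(MODEL_PATH)
    except (KeyError, EOFError) as exc:
        raise RuntimeError(f"{MODEL_PATH} looks corrupted or truncated: {exc}") from exc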
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:46:28.024164: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:46:29.168832: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:47:42.322028: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:47:43.447037: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:51:34.826683: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:51:35.945870: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- [nltk_data] Downloading package cmudict to /root/nltk_data...
- [nltk_data] Package cmudict is already up-to-date!
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- [nltk_data] Downloading package wordnet to /root/nltk_data...
- [nltk_data] Package wordnet is already up-to-date!
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- Collecting en_core_web_sm==2.3.1
- Using cached en_core_web_sm-2.3.1-py3-none-any.whl
- Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.9/dist-packages (from en_core_web_sm==2.3.1) (2.3.9)
- Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6)
- Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4)
- Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8)
- Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2)
- Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (52.0.0)
- Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11)
- Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7)
- Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1)
- Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.25.1)
- Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10)
- Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.2)
- Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9)
- Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3)
- ✔ Download and installation successful
- You can now load the model via spacy.load('en_core_web_sm')
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- /usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
- warnings.warn("Can't initialize NVML")
- /usr/local/lib/python3.9/dist-packages/optimum/bettertransformer/models/encoder_models.py:301: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
- hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
- IMPORTANT: You are using gradio version 4.28.3, however version 4.29.0 is available, please upgrade.
- --------
- Running on local URL: http://0.0.0.0:80
- Running on public URL: https://49d1a9e1ca41e9bb0f.gradio.live
 
- This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
- Original BC scores: AI: 0.0003359480469953269, HUMAN: 0.9996640682220459
- Calibration BC scores: AI: 0.035897435897435895, HUMAN: 0.9641025641025641
- Input Text: sThe Felix M. Warburg House is a mansion at 1109 Fifth Avenue on the Upper East Side of Manhattan in New York City. It was built from 1907 to 1908 for the German-American Jewish financier Felix M. Warburg, in the Châteauesque style, and designed by C. P. H. Gilbert. After Warburg's death in 1937, his widow sold it to a real estate developer. When plans to replace it with luxury apartments fell through, ownership reverted to the Warburgs, who donated it in 1944 to the Jewish Theological Seminary of America. In 1947, the Seminary opened the Jewish Museum in the mansion. The house was named a New York City designated landmark in 1981 and was added to the National Register of Historic Places in 1982. /s
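The two score lines above are the calibration step in action: the raw binary-classifier probability is mapped through the isotonic regression model loaded at predictors.py line 80. A sketch of that mapping, assuming iso_reg is a fitted scikit-learn IsotonicRegression:

    import joblib

    iso_reg = joblib.load("isotonic_regression_model.joblib")

    raw_ai = 0.0003359480469953269  # "Original BC scores: AI" from the log above
    calibrated_ai = float(iso_reg.predict([raw_ai])[0])  # ~0.0359 in this run
    scores = {"AI": calibrated_ai, "HUMAN": 1.0 - calibrated_ai}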
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
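The repeated fork warning is harmless but noisy: the Rust tokenizers thread pool cannot survive a fork(), so it disables itself in every Gradio worker. The warning lists its own fixes; the environment-variable route is a one-liner, provided it runs before tokenizers is first imported or used:

    import os

    # Silences "huggingface/tokenizers: The current process just got forked..."
    os.environ["TOKENIZERS_PARALLELISM"] = "false"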
- /home/aliasgarov/copyright_checker/predictors.py:234: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
- probas = F.softmax(tensor_logits).detach().cpu().numpy()
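The deprecation warning points at predictors.py line 234: F.softmax needs an explicit dim. For 2-D logits of shape (batch, num_classes) the implicit choice resolves to the class dimension, so the equivalent explicit call is (the tensor here is a stand-in):

    import torch
    import torch.nn.functional as F

    tensor_logits = torch.randn(4, 2)  # stand-in for the model's output logits
    # dim=-1 normalizes over the class dimension and silences the warning
    probas = F.softmax(tensor_logits, dim=-1).detach().cpu().numpy()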
- Traceback (most recent call last):
- File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
- response = await route_utils.call_process_api(
- File "/usr/local/lib/python3.9/dist-packages/gradio/route_utils.py", line 270, in call_process_api
- output = await app.get_blocks().process_api(
- File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1847, in process_api
- result = await self.call_function(
- File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1433, in call_function
- prediction = await anyio.to_thread.run_sync(
- File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 56, in run_sync
- return await get_async_backend().run_sync_in_worker_thread(
- File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
- return await future
- File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 851, in run
- result = context.run(func, *args)
- File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
- response = f(*args, **kwargs)
- File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
- corrected_text, corrections = correct_text(text, bias_checker, bias_corrector)
- NameError: name 'bias_checker' is not defined
  Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
  response = await route_utils.call_process_api(
@@ -205,23 +61,5 @@ Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
  response = f(*args, **kwargs)
  File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
- corrected_text, corrections = correct_text(text, bias_checker, bias_corrector)
  NameError: name 'bias_checker' is not defined
- Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
- {'Archive Intel, a firm founded by industry veteran Larry Shumbres to combine communication archiving and compliance screening, announced that they raised a $1mm Seed Round from early-stage VC fund Social Leverage.': -0.0357487445636403, "From our standpoint, it's nice to see a VC stepping up and sole-funding a WealthTech startup.": 0.34833026271133577, 'Hopefully, that is another signs of a thaw in the funding market.': 0.34863399934833345, 'Of course, it also probably helps that it solidly falls into one of our thesis opportunities for WealthTech for the rest of this decade: making compliance integrated, scaled, and automated.': 0.19047304290779848, 'As our followers know, we separate compliance in wealth management into "regulatory compliance" and "direct surveillance".': -0.16152331111075888, 'Regulatory compliance is things like establishing and maintaining licenses, creating and maintaining policies and procedures, marketing reviews, etc.': -0.1912077539978308, 'These activities are usually applied at the firm level and, while technology can help significantly, it is not a gating factor.': 0.17226992177976566} quillbot
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
 
+ ***** Exporting submodel 1/1: DebertaV2ForSequenceClassification *****
+ Using framework PyTorch: 2.3.0+cu121
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:550: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ torch.tensor(mid - 1).type_as(relative_pos),
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:554: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ torch.ceil(torch.log(abs_pos / mid) / torch.log(torch.tensor((max_position - 1) / mid)) * (mid - 1)) + mid
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:713: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:713: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:788: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:788: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:800: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:800: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:801: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ if key_layer.size(-2) != query_layer.size(-2):
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:108: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ output = input.masked_fill(rmask, torch.tensor(torch.finfo(input.dtype).min))
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
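This warning spells out its own migration: decoding defaults belong in a GenerationConfig saved next to the model rather than in the model config. A sketch using exactly the values reported above (the save directory is a placeholder):

    from transformers import GenerationConfig

    gen_config = GenerationConfig(
        max_length=512,
        min_length=8,
        num_beams=2,
        no_repeat_ngram_size=4,
    )
    gen_config.save_pretrained("bias_corrector_model_dir")  # placeholder path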
+
+ ***** Exporting submodel 1/3: T5Stack *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
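The "Exporting submodel" lines come from optimum converting the seq2seq corrector to ONNX, triggered by the accelerator="ort" pipeline added in predictors.py below. Roughly the same conversion done explicitly (the checkpoint name is a placeholder; the real value of bias_corrector_model_name is defined elsewhere in predictors.py):

    from optimum.onnxruntime import ORTModelForSeq2SeqLM

    bias_corrector_model_name = "some/seq2seq-checkpoint"  # placeholder
    # export=True performs the "Exporting submodel 1/3: T5Stack" style steps above
    model = ORTModelForSeq2SeqLM.from_pretrained(bias_corrector_model_name, export=True)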
  Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
  response = await route_utils.call_process_api(
 
  File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
  response = f(*args, **kwargs)
  File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
+ results = bias_checker(raw_text)
  NameError: name 'bias_checker' is not defined
 
predictors.py CHANGED
@@ -78,6 +78,17 @@ text_bc_model = BetterTransformer.transform(text_bc_model)
  text_mc_model = BetterTransformer.transform(text_mc_model)
  quillbot_model = BetterTransformer.transform(quillbot_model)
 
  # model score calibration
  iso_reg = joblib.load("isotonic_regression_model.joblib")
 
  text_mc_model = BetterTransformer.transform(text_mc_model)
  quillbot_model = BetterTransformer.transform(quillbot_model)
 
+ bias_model_checker = AutoModelForSequenceClassification.from_pretrained(bias_checker_model_name)
+ tokenizer = AutoTokenizer.from_pretrained(bias_checker_model_name)
+ bias_model_checker = BetterTransformer.transform(bias_model_checker, keep_original_model=False)
+ bias_checker = pipeline(
+ "text-classification",
+ model=bias_checker_model_name,
+ tokenizer=bias_checker_model_name,
+ )
+ gc.collect()
+ bias_corrector = pipeline("text2text-generation", model=bias_corrector_model_name, accelerator="ort")
+
  # model score calibration
  iso_reg = joblib.load("isotonic_regression_model.joblib")
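One loose end in the hunk above: bias_model_checker is converted with BetterTransformer, but the pipeline is then built from the checkpoint name, so the converted model is never used and the weights are loaded twice. A possible tightening that reuses the objects already in memory, assuming bias_checker_model_name as defined earlier in predictors.py:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
    from optimum.bettertransformer import BetterTransformer

    bias_model_checker = AutoModelForSequenceClassification.from_pretrained(bias_checker_model_name)
    tokenizer = AutoTokenizer.from_pretrained(bias_checker_model_name)
    bias_model_checker = BetterTransformer.transform(bias_model_checker, keep_original_model=False)
    bias_checker = pipeline(
        "text-classification",
        model=bias_model_checker,  # pass the converted model, not the checkpoint name
        tokenizer=tokenizer,       # reuse the tokenizer loaded above
    )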