aliasgerovs committed
Commit b472976 · 1 Parent(s): e81407b
Files changed (2)
  1. nohup.out +39 -201
  2. predictors.py +11 -0
nohup.out CHANGED
@@ -1,192 +1,48 @@
- 2024-05-10 14:22:25.922427: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:22:30.394731: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
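The `KeyError: 118` above is raised from pickle's opcode dispatch table, which almost always means "isotonic_regression_model.joblib" is truncated or is a Git LFS pointer file rather than the real artifact. A minimal defensive load, assuming the path from the traceback (the pointer check is a hypothetical helper, not part of the app):

    import joblib

    MODEL_PATH = "isotonic_regression_model.joblib"

    def looks_like_lfs_pointer(path):
        # Hypothetical helper: Git LFS pointer files are small text files
        # beginning with "version https://git-lfs.github.com/spec/v1".
        with open(path, "rb") as f:
            return f.read(7) == b"version"

    try:
        if looks_like_lfs_pointer(MODEL_PATH):
            raise RuntimeError(f"{MODEL_PATH} is a Git LFS pointer; run `git lfs pull` first")
        iso_reg = joblib.load(MODEL_PATH)
    except (KeyError, EOFError) as exc:
        raise RuntimeError(f"{MODEL_PATH} looks corrupted or truncated: {exc}") from exc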
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:46:28.024164: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:46:29.168832: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:47:42.322028: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:47:43.447037: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- Traceback (most recent call last):
- File "/home/aliasgarov/copyright_checker/app.py", line 4, in <module>
- from predictors import predict_bc_scores, predict_mc_scores
- File "/home/aliasgarov/copyright_checker/predictors.py", line 80, in <module>
- iso_reg = joblib.load("isotonic_regression_model.joblib")
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 658, in load
- obj = _unpickle(fobj, filename, mmap_mode)
- File "/usr/local/lib/python3.9/dist-packages/joblib/numpy_pickle.py", line 577, in _unpickle
- obj = unpickler.load()
- File "/usr/lib/python3.9/pickle.py", line 1212, in load
- dispatch[key[0]](self)
- KeyError: 118
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- 2024-05-10 14:51:34.826683: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
- To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
- 2024-05-10 14:51:35.945870: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
- [nltk_data] Downloading package cmudict to /root/nltk_data...
- [nltk_data] Package cmudict is already up-to-date!
- [nltk_data] Downloading package punkt to /root/nltk_data...
- [nltk_data] Package punkt is already up-to-date!
- [nltk_data] Downloading package stopwords to /root/nltk_data...
- [nltk_data] Package stopwords is already up-to-date!
- [nltk_data] Downloading package wordnet to /root/nltk_data...
- [nltk_data] Package wordnet is already up-to-date!
- /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
- warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
- Collecting en_core_web_sm==2.3.1
- Using cached en_core_web_sm-2.3.1-py3-none-any.whl
- Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.9/dist-packages (from en_core_web_sm==2.3.1) (2.3.9)
- Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6)
- Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4)
- Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8)
- Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2)
- Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (52.0.0)
- Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11)
- Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7)
- Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1)
- Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.25.1)
- Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10)
- Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.2)
- Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9)
- Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3)
- ✔ Download and installation successful
- You can now load the model via spacy.load('en_core_web_sm')
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- /usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
- warnings.warn("Can't initialize NVML")
- /usr/local/lib/python3.9/dist-packages/optimum/bettertransformer/models/encoder_models.py:301: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
- hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
- IMPORTANT: You are using gradio version 4.28.3, however version 4.29.0 is available, please upgrade.
- --------
- Running on local URL: http://0.0.0.0:80
- Running on public URL: https://49d1a9e1ca41e9bb0f.gradio.live
 
- This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
- Original BC scores: AI: 0.0003359480469953269, HUMAN: 0.9996640682220459
- Calibration BC scores: AI: 0.035897435897435895, HUMAN: 0.9641025641025641
- Input Text: sThe Felix M. Warburg House is a mansion at 1109 Fifth Avenue on the Upper East Side of Manhattan in New York City. It was built from 1907 to 1908 for the German-American Jewish financier Felix M. Warburg, in the Châteauesque style, and designed by C. P. H. Gilbert. After Warburg's death in 1937, his widow sold it to a real estate developer. When plans to replace it with luxury apartments fell through, ownership reverted to the Warburgs, who donated it in 1944 to the Jewish Theological Seminary of America. In 1947, the Seminary opened the Jewish Museum in the mansion. The house was named a New York City designated landmark in 1981 and was added to the National Register of Historic Places in 1982. /s
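The two score lines above are the calibration step in action: the raw binary-classifier probability is mapped through the isotonic regression model loaded at predictors.py line 80. A sketch of that mapping, assuming iso_reg is a fitted scikit-learn IsotonicRegression:

    import joblib

    iso_reg = joblib.load("isotonic_regression_model.joblib")

    raw_ai = 0.0003359480469953269  # "Original BC scores: AI" from the log above
    calibrated_ai = float(iso_reg.predict([raw_ai])[0])  # ~0.0359 in this run
    scores = {"AI": calibrated_ai, "HUMAN": 1.0 - calibrated_ai}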
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
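The repeated fork warning is harmless but noisy: the Rust tokenizers thread pool cannot survive a fork(), so it disables itself in every Gradio worker. The warning lists its own fixes; the environment-variable route is a one-liner, provided it runs before tokenizers is first imported or used:

    import os

    # Silences "huggingface/tokenizers: The current process just got forked..."
    os.environ["TOKENIZERS_PARALLELISM"] = "false"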
- /home/aliasgarov/copyright_checker/predictors.py:234: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
- probas = F.softmax(tensor_logits).detach().cpu().numpy()
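The deprecation warning points at predictors.py line 234: F.softmax needs an explicit dim. For 2-D logits of shape (batch, num_classes) the implicit choice resolves to the class dimension, so the equivalent explicit call is (the tensor here is a stand-in):

    import torch
    import torch.nn.functional as F

    tensor_logits = torch.randn(4, 2)  # stand-in for the model's output logits
    # dim=-1 normalizes over the class dimension and silences the warning
    probas = F.softmax(tensor_logits, dim=-1).detach().cpu().numpy()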
- Traceback (most recent call last):
- File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
- response = await route_utils.call_process_api(
- File "/usr/local/lib/python3.9/dist-packages/gradio/route_utils.py", line 270, in call_process_api
- output = await app.get_blocks().process_api(
- File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1847, in process_api
- result = await self.call_function(
- File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1433, in call_function
- prediction = await anyio.to_thread.run_sync(
- File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 56, in run_sync
- return await get_async_backend().run_sync_in_worker_thread(
- File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
- return await future
- File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 851, in run
- result = context.run(func, *args)
- File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
- response = f(*args, **kwargs)
- File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
- corrected_text, corrections = correct_text(text, bias_checker, bias_corrector)
- NameError: name 'bias_checker' is not defined
  Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
  response = await route_utils.call_process_api(
@@ -205,23 +61,5 @@ Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
  response = f(*args, **kwargs)
  File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
- corrected_text, corrections = correct_text(text, bias_checker, bias_corrector)
  NameError: name 'bias_checker' is not defined
- Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
- {'Archive Intel, a firm founded by industry veteran Larry Shumbres to combine communication archiving and compliance screening, announced that they raised a $1mm Seed Round from early-stage VC fund Social Leverage.': -0.0357487445636403, "From our standpoint, it's nice to see a VC stepping up and sole-funding a WealthTech startup.": 0.34833026271133577, 'Hopefully, that is another signs of a thaw in the funding market.': 0.34863399934833345, 'Of course, it also probably helps that it solidly falls into one of our thesis opportunities for WealthTech for the rest of this decade: making compliance integrated, scaled, and automated.': 0.19047304290779848, 'As our followers know, we separate compliance in wealth management into "regulatory compliance" and "direct surveillance".': -0.16152331111075888, 'Regulatory compliance is things like establishing and maintaining licenses, creating and maintaining policies and procedures, marketing reviews, etc.': -0.1912077539978308, 'These activities are usually applied at the firm level and, while technology can help significantly, it is not a gating factor.': 0.17226992177976566} quillbot
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
- huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
- To disable this warning, you can either:
- - Avoid using `tokenizers` before the fork if possible
- - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
  The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
 
+ ***** Exporting submodel 1/1: DebertaV2ForSequenceClassification *****
+ Using framework PyTorch: 2.3.0+cu121
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:550: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ torch.tensor(mid - 1).type_as(relative_pos),
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:554: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ torch.ceil(torch.log(abs_pos / mid) / torch.log(torch.tensor((max_position - 1) / mid)) * (mid - 1)) + mid
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:713: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:713: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:788: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:788: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:800: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:800: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
+ scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:801: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ if key_layer.size(-2) != query_layer.size(-2):
+ /usr/local/lib/python3.9/dist-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:108: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
+ output = input.masked_fill(rmask, torch.tensor(torch.finfo(input.dtype).min))
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
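This warning spells out its own migration: decoding defaults belong in a GenerationConfig saved next to the model rather than in the model config. A sketch using exactly the values reported above (the save directory is a placeholder):

    from transformers import GenerationConfig

    gen_config = GenerationConfig(
        max_length=512,
        min_length=8,
        num_beams=2,
        no_repeat_ngram_size=4,
    )
    gen_config.save_pretrained("bias_corrector_model_dir")  # placeholder path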
+
+ ***** Exporting submodel 1/3: T5Stack *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
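The "Exporting submodel" lines come from optimum converting the seq2seq corrector to ONNX, triggered by the accelerator="ort" pipeline added in predictors.py below. Roughly the same conversion done explicitly (the checkpoint name is a placeholder; the real value of bias_corrector_model_name is defined elsewhere in predictors.py):

    from optimum.onnxruntime import ORTModelForSeq2SeqLM

    bias_corrector_model_name = "some/seq2seq-checkpoint"  # placeholder
    # export=True performs the "Exporting submodel 1/3: T5Stack" style steps above
    model = ORTModelForSeq2SeqLM.from_pretrained(bias_corrector_model_name, export=True)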
  Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/queueing.py", line 527, in process_events
  response = await route_utils.call_process_api(
 
  File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 788, in wrapper
  response = f(*args, **kwargs)
  File "/home/aliasgarov/copyright_checker/predictors.py", line 106, in update
+ results = bias_checker(raw_text)
  NameError: name 'bias_checker' is not defined
 
predictors.py CHANGED
@@ -78,6 +78,17 @@ text_bc_model = BetterTransformer.transform(text_bc_model)
  text_mc_model = BetterTransformer.transform(text_mc_model)
  quillbot_model = BetterTransformer.transform(quillbot_model)
 
  # model score calibration
  iso_reg = joblib.load("isotonic_regression_model.joblib")
 
  text_mc_model = BetterTransformer.transform(text_mc_model)
  quillbot_model = BetterTransformer.transform(quillbot_model)
 
+ bias_model_checker = AutoModelForSequenceClassification.from_pretrained(bias_checker_model_name)
+ tokenizer = AutoTokenizer.from_pretrained(bias_checker_model_name)
+ bias_model_checker = BetterTransformer.transform(bias_model_checker, keep_original_model=False)
+ bias_checker = pipeline(
+ "text-classification",
+ model=bias_checker_model_name,
+ tokenizer=bias_checker_model_name,
+ )
+ gc.collect()
+ bias_corrector = pipeline("text2text-generation", model=bias_corrector_model_name, accelerator="ort")
+
  # model score calibration
  iso_reg = joblib.load("isotonic_regression_model.joblib")
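One loose end in the hunk above: bias_model_checker is converted with BetterTransformer, but the pipeline is then built from the checkpoint name, so the converted model is never used and the weights are loaded twice. A possible tightening that reuses the objects already in memory, assuming bias_checker_model_name as defined earlier in predictors.py:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
    from optimum.bettertransformer import BetterTransformer

    bias_model_checker = AutoModelForSequenceClassification.from_pretrained(bias_checker_model_name)
    tokenizer = AutoTokenizer.from_pretrained(bias_checker_model_name)
    bias_model_checker = BetterTransformer.transform(bias_model_checker, keep_original_model=False)
    bias_checker = pipeline(
        "text-classification",
        model=bias_model_checker,  # pass the converted model, not the checkpoint name
        tokenizer=tokenizer,       # reuse the tokenizer loaded above
    )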