Commit 06cb314, committed by gchhablani
1 Parent(s): dc3ad35
Fix issues
- app.py +1 -1
- sections/intro.md +1 -1
app.py
CHANGED
@@ -49,7 +49,7 @@ def read_markdown(path, parent="./sections/"):
 # new_width = int(w * new_height / h)
 # return cv2.resize(image, (new_width, new_height))
 
-checkpoints = ["./ckpt/ckpt-60k-5999"] # TODO: Maybe add more checkpoints?
+checkpoints = ["./ckpt/vqa/ckpt-60k-5999"] # TODO: Maybe add more checkpoints?
 dummy_data = pd.read_csv("dummy_vqa_multilingual.tsv", sep="\t")
 code_to_name = {
     "en": "English",
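The app.py hunk only changes a hard-coded checkpoint path, so a quick existence check before loading may help catch this kind of mistake early. The following is a minimal sketch, not code from this repository: the helper name `missing_checkpoints` is hypothetical, and it assumes each checkpoint is a directory on disk.

```python
import os


def missing_checkpoints(paths):
    """Return the checkpoint paths that are not existing directories.

    Hypothetical helper for illustration; assumes every checkpoint
    is stored as a directory.
    """
    return [p for p in paths if not os.path.isdir(p)]


# Path taken from the diff above; whether it exists depends on the checkout.
checkpoints = ["./ckpt/vqa/ckpt-60k-5999"]
for path in missing_checkpoints(checkpoints):
    print(f"Checkpoint directory not found: {path}")
```

Reporting the missing path explicitly is one option; failing fast with an exception would work equally well.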
sections/intro.md
CHANGED
@@ -1,4 +1,4 @@
-This demo uses a [ViTBert model checkpoint](https://huggingface.co/flax-community/multilingual-vqa-pt-60k-ft/tree/main/ckpt-5999) fine-tuned on a [MarianMT](https://huggingface.co/transformers/model_doc/marian.html)-translated version of the [VQA v2 dataset](https://visualqa.org/challenge.html). The fine-tuning is performed
+This demo uses a [ViTBert model checkpoint](https://huggingface.co/flax-community/multilingual-vqa-pt-60k-ft/tree/main/ckpt-5999) fine-tuned on a [MarianMT](https://huggingface.co/transformers/model_doc/marian.html)-translated version of the [VQA v2 dataset](https://visualqa.org/challenge.html). The fine-tuning is performed after pre-training using text-only Masked LM on approximately 10 million image-text pairs taken from the [Conceptual 12M dataset](https://github.com/google-research-datasets/conceptual-12m) translated using [MBart](https://huggingface.co/transformers/model_doc/mbart.html). The translations are performed in the following four languages: English, French, German and Spanish.
 
 The model predicts one out of 3129 classes in English which can be found [here](https://huggingface.co/spaces/flax-community/Multilingual-VQA/blob/main/answer_reverse_mapping.json), and then the translated versions are provided based on the language chosen as `Answer Language`. The question can be present or written in any of the following: English, French, German and Spanish.
 
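The intro text describes a two-step answer path: the model picks one of 3129 English classes, and the answer is then shown in the chosen `Answer Language`. A rough sketch of that flow, under stated assumptions, might look like the following. The three-entry mapping stands in for answer_reverse_mapping.json (its "index string to English answer" layout is assumed), and `TRANSLATIONS` and `decode_answer` are hypothetical names, not the demo's actual code.

```python
import json

# Hypothetical stand-in for answer_reverse_mapping.json. The real file lists
# 3129 classes; the "index string -> English answer" layout is an assumption.
ANSWER_REVERSE_MAPPING = json.loads('{"0": "yes", "1": "no", "2": "red"}')

# Hypothetical translation table keyed by English answer, then language code.
TRANSLATIONS = {
    "yes": {"fr": "oui", "de": "ja", "es": "sí"},
    "no": {"fr": "non", "de": "nein", "es": "no"},
    "red": {"fr": "rouge", "de": "rot", "es": "rojo"},
}


def decode_answer(logits, answer_lang="en"):
    """Pick the highest-scoring class and return its answer in answer_lang."""
    class_idx = max(range(len(logits)), key=lambda i: logits[i])
    english = ANSWER_REVERSE_MAPPING[str(class_idx)]
    if answer_lang == "en":
        return english
    # Fall back to the English answer when no translation is available.
    return TRANSLATIONS.get(english, {}).get(answer_lang, english)


print(decode_answer([0.1, 0.2, 3.0], answer_lang="de"))
```

Keeping the classifier output in English and translating afterwards means the class set stays fixed regardless of how many answer languages are offered.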