Spaces: Runtime error
gchhablani committed:
Merge branch 'main' of https://huggingface.co/spaces/flax-community/Multilingual-VQA into main
app.py
CHANGED
@@ -9,7 +9,7 @@ def main():
     st.set_page_config(
         page_title="Multilingual VQA",
         layout="wide",
-        initial_sidebar_state="
+        initial_sidebar_state="auto",
         page_icon="./misc/mvqa-logo-3-white.png",
     )
 
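The removed line left initial_sidebar_state as an unterminated string literal, a syntax error that is the likely cause of the Space's runtime error. For context, a minimal sketch of the fixed call; valid values for initial_sidebar_state are "auto", "expanded", and "collapsed", and st.set_page_config must be the first Streamlit command the script executes:

import streamlit as st

def main():
    # Must run before any other Streamlit call in the script.
    st.set_page_config(
        page_title="Multilingual VQA",
        layout="wide",
        initial_sidebar_state="auto",
        page_icon="./misc/mvqa-logo-3-white.png",
    )

if __name__ == "__main__":
    main()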
apps/article.py
CHANGED
@@ -61,11 +61,11 @@ def app(state=None):
     st.write(read_markdown("limitations.md"))
 
     toc.header("Conclusion, Future Work, and Social Impact")
-
-
-
-
-
+    toc.subheader("Conclusion")
+    st.write(read_markdown("conclusion_future_work/conclusion.md"))
+    toc.subheader("Future Work")
+    st.write(read_markdown("conclusion_future_work/future_work.md"))
+    toc.subheader("Social Impact")
     st.write(read_markdown("conclusion_future_work/social_impact.md"))
 
     toc.header("References")
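read_markdown and toc are app-local helpers whose implementations this commit does not show, so the sketch below is an assumption: toc is understood as a sidebar table-of-contents builder whose header/subheader calls both render a heading and register an anchor, and read_markdown plausibly loads the section files added under sections/ in this same commit:

from pathlib import Path

SECTIONS_DIR = Path("sections")  # assumed location of the .md section files

def read_markdown(relative_path: str) -> str:
    # Hypothetical helper: load a markdown section shipped with the app.
    return (SECTIONS_DIR / relative_path).read_text(encoding="utf-8")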
sections/conclusion_future_work/conclusion.md
CHANGED
@@ -0,0 +1 @@
+In this project, we presented a proof of concept with our CLIP Vision + BERT baseline model, which leverages a multilingual checkpoint with pre-trained image encoders in four languages: **English, French, German, and Spanish**. We hope to improve this in the future by using better translators (e.g., the Google Translate API) to obtain more multilingual data, especially in low-resource languages.
sections/conclusion_future_work/future_work.md
CHANGED
@@ -0,0 +1,5 @@
+We hope to improve this project in the future by:
+- Using a superior translation model: Translation has a large impact on how the end model performs. Better translators (e.g., the Google Translate API) and language-specific seq2seq translation models can generate better data for both high-resource and low-resource languages.
+- Checking translation quality: Inspecting the quality of translated data is as important as the translation model itself. This will require either having native speakers manually inspect a sample of the translated data or devising unsupervised translation-quality metrics.
+- Using more data: Currently, we use only 2.5M images from Conceptual 12M for image captioning. We plan to include other datasets, such as Conceptual Captions 3M and a subset of the YFCC100M dataset.
+- Supporting low-resource languages: With better translation tools, we also wish to train our model on low-resource languages, which would further democratize the image-captioning solution and help people realize the potential of language systems.
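The second item above mentions unsupervised translation-quality metrics without naming one. A minimal sketch of one common proxy, round-trip consistency: translate source -> target -> source and score the reconstruction against the original with BLEU. The translate function here is hypothetical, standing in for whatever MT system is used; sacrebleu is a real library and corpus_bleu is used with its actual signature:

import sacrebleu

def round_trip_bleu(sentences, translate, src_lang, tgt_lang):
    # translate(text, src, tgt) -> str is an assumed wrapper around the MT system.
    forward = [translate(s, src_lang, tgt_lang) for s in sentences]
    back = [translate(t, tgt_lang, src_lang) for t in forward]
    # Compare the round-tripped text against the originals; higher is better.
    return sacrebleu.corpus_bleu(back, [sentences]).score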