import streamlit as st
def app():
    # Inject the app's custom CSS
    with open('style.css') as f:
        st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
    footer = """
    """
    st.markdown(footer, unsafe_allow_html=True)
st.subheader("Intro")
intro = """
Wikipedia Assistant is an example of a system for the task usually referred to as Long-Form Question Answering (LFQA).
These systems function by querying large document stores for relevant information and subsequently using
the retrieved documents to generate accurate, multi-sentence answers. The documents related to a given
query, colloquially called context passages, are not used merely as source tokens for extracted answers,
but instead provide a larger context for the synthesis of original, abstractive long-form answers.
LFQA systems usually consist of three components:
- A document store including content passages for a variety of topics
- Encoder models to encode documents/questions such that it is possible to query the document store
- A Seq2Seq language model capable of generating paragraph-long answers when given a question and
context passages retrieved from the document store
Wikipedia Assistant converts the text-based answer to speech via either the Google text-to-speech engine or an
ESPnet model hosted on the HuggingFace hub.
"""
    st.markdown(intro, unsafe_allow_html=True)
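
    # Illustrative sketch of the retrieval half described in the intro above: embed the
    # passages and the question with the same kind of encoder, index the passage vectors
    # with Faiss, and return the closest matches. It is not called anywhere in this app,
    # and the sentence-transformers model name is a generic stand-in, not necessarily the
    # encoder the Wikipedia Assistant backend actually uses.
    def _retrieval_sketch(question, passages, top_k=3):
        import faiss
        import numpy as np
        from sentence_transformers import SentenceTransformer

        encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

        # Encode the context passages and build an inner-product Faiss index over them
        passage_vectors = np.asarray(encoder.encode(passages), dtype="float32")
        index = faiss.IndexFlatIP(passage_vectors.shape[1])
        index.add(passage_vectors)

        # Encode the question and retrieve the closest context passages
        question_vector = np.asarray(encoder.encode([question]), dtype="float32")
        _, ids = index.search(question_vector, min(top_k, len(passages)))
        return [passages[i] for i in ids[0]]

    # Equally illustrative: the Google text-to-speech option mentioned above, using the
    # gTTS package to turn a text answer into an mp3 file.
    def _tts_sketch(answer_text, out_path="answer.mp3"):
        from gtts import gTTS
        gTTS(text=answer_text, lang="en").save(out_path)
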
st.subheader("Tips")
tips = """
The LFQA task is far from solved. Wikipedia Assistant will sometimes generate an answer that is unrelated to the
question asked, or even downright wrong. However, if the question is elaborate and specific, there is a decent
chance of getting a legible answer. LFQA systems target ELI5-style, non-factoid questions. As a general guideline,
questions starting with why, what, and how are better suited than where and who questions. Be elaborate.
For example, Wikipedia Assistant is better suited to answer the question "Why do airplane jet engines leave
contrails in the sky?" than "Why do contrails exist?". Detailed and precise questions are more likely to match
the right half-dozen relevant passages in a 20+ GB Wikipedia dump and thus to yield a good answer.
"""
    st.markdown(tips, unsafe_allow_html=True)
    st.subheader("Technical details")
    technical_intro = """
A question asked will be encoded with a question encoder and sent to a server to find the most relevant Wikipedia
passages. The Wikipedia passages were previously encoded with a passage encoder and stored in a Faiss index.
The question-matching passages (a.k.a. context passages) are retrieved from the Faiss index and passed to a
BART-based seq2seq model, which synthesizes an original answer to the question.
"""
    st.markdown(technical_intro, unsafe_allow_html=True)
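

# Illustrative sketch of the generation step described under "Technical details": the
# question and the retrieved context passages are concatenated and fed to a BART-based
# seq2seq model that writes the long-form answer. It is not called by the app; the
# checkpoint name and the "question: ... context: ..." input format are assumptions
# standing in for whatever the actual server-side model expects.
def _generation_sketch(question, context_passages):
    from transformers import BartForConditionalGeneration, BartTokenizer

    model_name = "facebook/bart-large"  # placeholder, not an LFQA fine-tuned checkpoint
    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name)

    # Condition the seq2seq model on both the question and the context passages
    conditioned_input = "question: {} context: {}".format(
        question, " ".join(context_passages)
    )
    inputs = tokenizer(
        conditioned_input, return_tensors="pt", truncation=True, max_length=1024
    )

    # Beam search with a minimum length encourages paragraph-long answers
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        min_length=64,
        max_length=256,
        num_beams=4,
        no_repeat_ngram_size=3,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)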