G.Hemanth Sai
title: Question Generator
emoji: 🔑
colorFrom: yellow
colorTo: yellow
sdk: streamlit
sdk_version: 1.10.0
app_file: app.py
pinned: false

Internship-IVIS-labs

  • The Intelligent Question Generator app is an easy-to-use interface built in Streamlit that uses KeyBERT, sense2vec, and T5.
  • It uses a minimal keyword extraction technique that leverages multiple NLP embeddings and relies on 🤗 Transformers to extract the keywords and keyphrases that are most similar to a document.
  • sense2vec (Trask et al., 2015) is a twist on word2vec that lets you learn more interesting and detailed word vectors.
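The idea of picking keywords that are "most similar to a document" can be sketched with the standard library alone. The example below is only an illustration under loud assumptions: it swaps the Transformer embeddings used by KeyBERT for plain word-count vectors, and the names `rank_keywords`, `cosine`, and `STOP_WORDS` are hypothetical helpers, not part of this repository or of KeyBERT. With count vectors the ranking reduces to word frequency; real embeddings would capture meaning as well.

```python
import math
import re
from collections import Counter

# Tiny stop-word list, for illustration only (an assumption, not KeyBERT's list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are",
              "for", "on", "that", "it", "from", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_keywords(document, top_n=3):
    """Rank single-word candidates by their similarity to the whole document."""
    tokens = [t for t in tokenize(document) if t not in STOP_WORDS]
    doc_vector = Counter(tokens)
    candidates = sorted(set(tokens))
    scored = [(word, cosine(Counter([word]), doc_vector)) for word in candidates]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


sample = ("Transformers generate questions. Questions test knowledge. "
          "Questions need keywords from the document.")
print(rank_keywords(sample))
```

In the app itself, the candidate and document vectors would come from a Transformer model rather than from word counts; the ranking step stays the same.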

Repository Breakdown

src Directory


  • src/Pipeline/QAhaystack.py: Contains the code for question answering using Haystack.
  • src/Pipeline/QuestGen.py: Contains the code for question generation.
  • src/Pipeline/Reader.py: Contains the code for reading the document.
  • src/Pipeline/TextSummariztion.py: Contains the code for text summarization.
  • src/PreviousVersionCode/context.py: Contains the code for finding the context of a paragraph.
  • src/PreviousVersionCode/QuestionGenerator.py: Contains the code from the first attempt at question generation.

Installation

$ git clone https://github.com/HemanthSai7/Internship-IVIS-labs.git
$ cd Internship-IVIS-labs
$ pip install -r requirements.txt
  • When running the app locally for the first time, uncomment the model-download lines in src/Pipeline/QuestGen.py so that the models are downloaded to the models directory.
$ streamlit run app.py
  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.0.103:8501


Timeline

Week 1-2:

Tasks

  • Understanding and brushing up on NLP concepts.
  • Extracting images and text from a PDF file and storing the text in a text file.
  • Exploring various open-source tools for generating questions from a given text.
  • Reading papers related to the project (BERT, T5, RoBERTa, etc.).
  • Summarizing the text extracted from the PDF file using the pre-trained T5-base model.
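The summarization step can be sketched roughly as follows. This is a minimal sketch, not the code in src/Pipeline/TextSummariztion.py: the function names, the 300-word chunk size, and the generation settings are assumptions chosen for illustration. It relies on the `transformers` library (plus `sentencepiece` and PyTorch) being installed, and on the fact that T5 expects a task prefix such as `summarize: ` in front of the input text.

```python
def chunk_words(text, max_words=300):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_t5_inputs(text, max_words=300):
    """Prefix every chunk with the T5 summarization task prefix."""
    return ["summarize: " + chunk for chunk in chunk_words(text, max_words)]


def summarize(text, model_name="t5-base"):
    """Summarize text chunk by chunk with a pre-trained T5 model."""
    # Imported lazily so the helpers above work without the library installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    summaries = []
    for prompt in build_t5_inputs(text):
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
        output_ids = model.generate(
            inputs["input_ids"],
            max_length=150,      # assumed upper bound on summary length
            num_beams=4,         # assumed beam width
            early_stopping=True,
        )
        summaries.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return " ".join(summaries)
```

Calling `summarize(extracted_text)` downloads the model on first use. Chunking keeps each input within the model's token limit, at the cost of summarizing each chunk independently.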

Week 3-4:

Tasks

  • Understanding the concept of QA systems.
  • Created a basic script for generating questions from the text.
  • Created a basic script for finding the context of the paragraph.

Week 5-6:

Tasks

  • Understanding how Transformer models work for NLP tasks such as question answering and question generation.
  • Understanding how to use the Haystack library for QA systems.
  • Understanding how to use the Haystack library for question generation.
  • Preprocessed the document for Haystack QA to get better results.
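Preprocessing for QA usually means cleaning the text and splitting it into short, overlapping passages so that an answer spanning a boundary is not cut in half. The sketch below shows that idea with the standard library only. It is an assumption about what the preprocessing looked like, not the project's code: the function name and default sizes are hypothetical, and the parameter names merely mirror the `split_length` and `split_overlap` options of Haystack's `PreProcessor`.

```python
def split_into_passages(text, split_length=100, split_overlap=20):
    """Split text into passages of split_length words, overlapping by split_overlap words."""
    if split_length <= 0:
        raise ValueError("split_length must be positive")
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be in the range [0, split_length)")

    # str.split() also collapses repeated whitespace and blank lines.
    words = text.split()
    step = split_length - split_overlap

    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + split_length]))
        # Stop once the current window has reached the end of the document.
        if start + split_length >= len(words):
            break
    return passages


demo = " ".join(f"w{i}" for i in range(10))
for passage in split_into_passages(demo, split_length=4, split_overlap=1):
    print(passage)
```

Smaller passages tend to help the retriever find the relevant text, while the overlap trades some duplication for context around passage boundaries.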

Week 7-8:

Tasks

  • Understanding how to generate questions intelligently.
  • Explored WordNet to find synonyms.
  • Used BertWSD to disambiguate the word senses in the provided sentence.
  • Used KeyBERT to find the keywords in the document.
  • Used sense2vec to find closely related words for the generated keywords.
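The sense2vec step can be sketched as below. This is a hedged illustration rather than the project's code: `clean_related` and `related_words` are hypothetical helper names, and `vectors_path` is assumed to point at a locally downloaded sense2vec vectors directory. sense2vec keys have the form `phrase|TAG` (for example `machine_learning|NOUN`), so the results need to be cleaned and de-duplicated before use.

```python
def clean_related(keyword, similar_keys):
    """Turn sense2vec (key, score) pairs into unique, readable phrases."""
    seen = {keyword.lower()}
    related = []
    for key, _score in similar_keys:
        # "machine_learning|NOUN" -> "machine learning"
        phrase = key.split("|")[0].replace("_", " ").strip()
        if phrase.lower() not in seen:
            seen.add(phrase.lower())
            related.append(phrase)
    return related


def related_words(keyword, vectors_path, n=10):
    """Look up words related to a keyword in pretrained sense2vec vectors."""
    # Imported lazily so clean_related works without sense2vec installed.
    from sense2vec import Sense2Vec

    s2v = Sense2Vec().from_disk(vectors_path)
    sense = s2v.get_best_sense(keyword.replace(" ", "_"))
    if sense is None:  # the keyword is not in the vocabulary
        return []
    return clean_related(keyword, s2v.most_similar(sense, n=n))
```

Related words found this way could serve, for example, as alternative answer options for a generated question; how the app uses them is not stated here.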

Week 9-10:

Tasks

  • Created a Streamlit app to demonstrate the project.