Spaces: bluebalam/paper-rec


bluebalam committed on Feb 6, 2022

Commit 5cb07ef

1 Parent(s): 5b6c150

app upgrade, ignore files, add requirements, and update README

Files changed (4)

.gitignore +137 -0
README.md +8 -36
app.py +55 -0
requirements.txt +8 -0

.gitignore ADDED

	@@ -0,0 +1,137 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# pycharm
+.idea
+# ipynb
+*.ipynb
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+.python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# idea
+.idea

README.md CHANGED

@@ -1,46 +1,18 @@
 ---
-title: Paper Rec
-emoji: 💩
-colorFrom: gray
 colorTo: blue
 sdk: gradio
 app_file: app.py
-pinned: false
 license: mit
 ---
-# Configuration
-`title`: _string_
-Display title for the Space
-`emoji`: _string_
-Space emoji (emoji-only character allowed)
-`colorFrom`: _string_
-Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
-`colorTo`: _string_
-Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
-`sdk`: _string_
-Can be either `gradio`, `streamlit`, or `static`
-`sdk_version` : _string_
-Only applicable for `streamlit` SDK.
-See [doc](https://hf.co/docs/hub/spaces) for more info on supported versions.
-`app_file`: _string_
-Path to your main application file (which contains either `gradio` or `streamlit` Python code, or `static` html code).
-Path is relative to the root of the repository.
-`models`: _List[string]_
-HF model IDs (like "gpt2" or "deepset/roberta-base-squad2") used in the Space.
-Will be parsed automatically from your code if not specified here.
-`datasets`: _List[string]_
-HF dataset IDs (like "common_voice" or "oscar-corpus/OSCAR-2109") used in the Space.
-Will be parsed automatically from your code if not specified here.
-`pinned`: _boolean_
-Whether the Space stays on top of your list.

 ---
+title: `paper-rec`
+emoji: 📃 🤖 💙
+colorFrom: indigo
 colorTo: blue
 sdk: gradio
 app_file: app.py
+pinned: true
 license: mit
 ---
+# `paper-rec` demo
+Which ML/AI paper should I read next? It is difficult to choose among the many great research publications released daily. This demo gives you a personalized selection of papers from the latest scientific contributions available on [arXiv](https://arxiv.org/).
+Enter the title or abstract (or both) of papers you liked in the past, or keywords for topics of interest, and get the top-10 article recommendations tailored to your taste.
+Enjoy!

app.py ADDED

	@@ -0,0 +1,55 @@

+import gradio as gr
+import torch
+from paper_rec import recommender, etl
+from gradio.inputs import Textbox
+def recommend(txt):
+    if len(txt.strip()) <= 0:
+        return {"msg": "no recommendations available for the input text."}
+    top_n = 10
+    # model user preferences:
+    cleaned_txt = etl.clean_text(txt)
+    sentences = etl.get_sentences_from_txt(txt)
+    rec = recommender.Recommender()
+    # loading data and model from HF
+    rec.load_data()
+    rec.load_model()
+    # compute user embedding
+    user_embedding = torch.from_numpy(rec.embedding(sentences))
+    # get recommendations based on user preferences
+    recs = rec.recommend(user_embedding, top_k=100)
+    # deduplicate
+    recs_output = []
+    seen_paper = set()
+    for p in recs:
+        if p["id"] not in seen_paper:
+            recs_output.append({"id": p["id"],
+                                "title": p["title"],
+                                "authors": p["authors"],
+                                "abstract": p["abstract"]
+                                })
+            seen_paper.add(p["id"])
+        if len(recs_output) >= top_n:
+            break
+    # report top-n
+    return recs_output
+def inputs():
+    pass
+title = "Interactive demo: paper-rec"
+description = "Demo that recommends which recent AI/ML papers to read next, based on what you like."
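+iface = gr.Interface(fn=recommend,
+                     inputs=[Textbox(lines=10, placeholder="Titles and abstracts from papers you like", default="", label="Sample of what I like <3")],
+                     outputs="json",
+                     layout='vertical',
+                     # pass the title and description defined above so they are shown in the UI
+                     title=title,
+                     description=description
+                     )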
+iface.launch()
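
The deduplication step in `recommend` above over-fetches ranked results (`top_k=100`) and keeps the first 10 distinct paper ids. A minimal standalone sketch of that idea follows, so it can be tried without the model. The helper name `dedupe_top_n` and the sample records are hypothetical and not part of this commit; the only assumption carried over from `app.py` is that each recommendation is a dict with an `"id"` key.

```python
from typing import Dict, Iterable, List


def dedupe_top_n(recs: Iterable[Dict], top_n: int = 10, key: str = "id") -> List[Dict]:
    """Return the first `top_n` records with a distinct `key`, preserving rank order."""
    if top_n <= 0:
        return []
    seen = set()
    unique = []
    for rec in recs:
        if rec[key] in seen:
            continue  # a higher-ranked copy of this paper was already kept
        seen.add(rec[key])
        unique.append(rec)
        if len(unique) >= top_n:
            break  # stop early once enough distinct papers are collected
    return unique


# Hypothetical ranked results; in app.py these come from rec.recommend(...)
ranked = [
    {"id": "a", "title": "Paper A"},
    {"id": "b", "title": "Paper B"},
    {"id": "a", "title": "Paper A"},
    {"id": "c", "title": "Paper C"},
]
top_two = dedupe_top_n(ranked, top_n=2)
print([r["id"] for r in top_two])  # prints ['a', 'b']
```

Because the input is already ranked, keeping the first occurrence of each id preserves the best-scoring copy, and the early `break` avoids scanning the rest of the over-fetched list.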

requirements.txt ADDED

	@@ -0,0 +1,8 @@

+torch
+sentence_transformers
+huggingface-hub
+feedparser
+beautifulsoup4
+lxml
+git+https://github.com/bluebalam/paper-rec.git
+gradio