Stefano Fiorucci

anakin87

AI & ML interests

Contributing to Haystack, the LLM Framework 🏗️. NLP / LLMs.

anakin87's activity

posted an update 3 days ago
🌌 Creating adventures with local LLMs

What if 🤔... Homer Simpson met Spider-Man and they went on a quest for donuts? 🍩
Or if Fred Astaire and Corporal Hicks teamed up to fight xenomorphs? 👾

In the words of Karpathy, LLMs are dream machines...
they seem specially made to simulate these wild scenarios!

๐„๐ฑ๐ฉ๐ž๐ซ๐ข๐ฆ๐ž๐ง๐ญ๐ข๐ง๐  ๐ฐ๐ข๐ญ๐ก ๐ญ๐ก๐ข๐ฌ ๐ข๐๐ž๐š ๐Ÿ‘‡
Nous Research / @teknium recently released NousResearch/CharacterCodex:
a massive dataset with information on 16k characters, both fictional and real.
I couldn't wait to play with it...

After a few attempts, I found that combining the information in this dataset with a good model (like meta-llama/Meta-Llama-3-8B-Instruct) opens the door to a myriad of chat adventures.
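
To make the idea concrete, here is a minimal, hypothetical sketch of turning two Character Codex entries into a roleplay system prompt. This is not the notebook's code, and the column names (character_name, description) are assumptions: check the dataset card and adjust them if they differ.

```python
# Hypothetical sketch: pick two random Character Codex entries and build a
# roleplay system prompt from them. Column names are assumptions, not verified.
import random
from datasets import load_dataset

codex = load_dataset("NousResearch/CharacterCodex", split="train")

idx_a, idx_b = random.sample(range(len(codex)), k=2)
char_a, char_b = codex[idx_a], codex[idx_b]

system_prompt = (
    "You are the narrator of an interactive adventure.\n"
    f"Character 1: {char_a['character_name']} - {char_a['description']}\n"
    f"Character 2: {char_b['character_name']} - {char_b['description']}\n"
    "Put them in the same scene and let the user steer the story."
)
print(system_prompt)
```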

🛠️ Stack:
🔹 Haystack for orchestration 🏗️
🔹 llamafile 🦙🗂️ to run our model locally.

📓 Check out the notebook: https://t.ly/y6jrZ
(includes a bonus 🕵️ Mystery Character Quiz)
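
As a rough sketch of this stack (not the notebook's exact code): llamafile exposes an OpenAI-compatible server, by default on http://localhost:8080/v1, so Haystack's OpenAIChatGenerator can talk to the local model directly. The model name, port, and prompts below are assumptions; adapt them to your llamafile.

```python
# Hedged sketch: chat with a local llamafile model through Haystack.
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_key=Secret.from_token("sk-no-key-required"),  # llamafile does not check the key
    model="LLaMA_CPP",                                # placeholder id; llamafile serves one model
    api_base_url="http://localhost:8080/v1",          # llamafile's default OpenAI-compatible endpoint
)

messages = [
    ChatMessage.from_system(
        "You are the narrator of an adventure starring Homer Simpson and Spider-Man."
    ),
    ChatMessage.from_user("They walk into a donut shop. What happens next?"),
]

result = generator.run(messages=messages)
print(result["replies"][0])
```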
posted an update 12 days ago
🧪 RAG Evaluation with 🔥 Prometheus 2 + Haystack

📝 Blog post: https://haystack.deepset.ai/blog/rag-evaluation-with-prometheus-2
📓 Notebook: https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/prometheus2_evaluation.ipynb

─── ⋆⋅☆⋅⋆ ───

When evaluating LLMs' responses, proprietary models like GPT-4 are commonly used due to their strong performance.
However, relying on closed models presents challenges related to data privacy 🔒, transparency, controllability, and cost 💸.

On the other hand, open models typically do not correlate well with human judgments and lack flexibility.


🔥 Prometheus 2 is a new family of open-source models designed to address these gaps:
🔹 two variants: prometheus-eval/prometheus-7b-v2.0; prometheus-eval/prometheus-8x7b-v2.0
🔹 trained on open-source data
🔹 high correlation with human evaluations and proprietary models
🔹 highly flexible: capable of performing direct assessments and pairwise rankings, and allowing the definition of custom evaluation criteria.

See my experiments with RAG evaluation in the links above.
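
To give a flavor of those experiments, here is a condensed, hypothetical sketch of using Prometheus 2 as a judge inside a Haystack pipeline. The rubric template is a simplified stand-in for the official Prometheus 2 direct-assessment prompt (see the model card for the real one), and this is not the pipeline from the blog post.

```python
# Hedged sketch: a tiny Haystack pipeline that asks Prometheus 2 to score a RAG answer 1-5.
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator

# Simplified rubric prompt; the official Prometheus 2 format is more detailed.
template = """You are a fair judge. Score the response from 1 to 5 using the rubric.
### Question: {{ question }}
### Context: {{ context }}
### Response to evaluate: {{ answer }}
### Rubric: 5 = correct and fully grounded in the context, 1 = wrong or unsupported.
Write brief feedback, then end with "[RESULT] <score>"."""

eval_pipeline = Pipeline()
eval_pipeline.add_component("prompt_builder", PromptBuilder(template=template))
eval_pipeline.add_component(
    "judge",
    HuggingFaceLocalGenerator(
        model="prometheus-eval/prometheus-7b-v2.0",
        task="text-generation",
        generation_kwargs={"max_new_tokens": 512},
    ),
)
eval_pipeline.connect("prompt_builder.prompt", "judge.prompt")

result = eval_pipeline.run({"prompt_builder": {
    "question": "Who wrote The Divine Comedy?",
    "context": "The Divine Comedy is a narrative poem by Dante Alighieri.",
    "answer": "It was written by Dante Alighieri.",
}})
print(result["judge"]["replies"][0])
```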
posted an update 23 days ago
⚙️ Prompt Optimization with Haystack and DSPy

Experimental notebook: 🧪📓 https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/prompt_optimization_with_dspy.ipynb

When building applications with LLMs, writing effective prompts is a long process of trial and error. 🔄
Often, if you switch models, you also have to change the prompt. 😩
What if you could automate this process?


💡 That's where DSPy comes in - a framework designed to algorithmically optimize prompts for Language Models.
By applying classical machine learning concepts (training and evaluation data, metrics, optimization), DSPy generates better prompts for a given model and task.


Recently, I explored combining DSPy with the robustness of Haystack Pipelines.

Here's how it works:
▶️ Start from a Haystack RAG pipeline with a basic prompt
🎯 Define a goal (in this case, get correct and concise answers)
📊 Create a DSPy program, define data and metrics
✨ Optimize and evaluate -> improved prompt
🚀 Build a refined Haystack RAG pipeline using the optimized prompt
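
For a feel of what the DSPy side of this workflow can look like, here is a minimal, hypothetical sketch (not the notebook's code). The LM setup, trainset, and the "correct and concise" metric are toy placeholders, and some API details vary across DSPy versions.

```python
# Hedged sketch: a tiny DSPy program with a "correct and concise" metric,
# optimized with BootstrapFewShot. Toy data; API names may differ by DSPy version.
import dspy
from dspy.teleprompt import BootstrapFewShot

dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))  # any supported LM works


class GenerateAnswer(dspy.Signature):
    """Answer the question using the context, in one short sentence."""

    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField(desc="short, factual answer")


class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, context, question):
        return self.generate(context=context, question=question)


def correct_and_concise(example, pred, trace=None):
    # the goal from above: correct (gold answer contained) and concise (< 30 words)
    return example.answer.lower() in pred.answer.lower() and len(pred.answer.split()) < 30


trainset = [
    dspy.Example(
        context="Haystack is an open-source LLM framework created by deepset.",
        question="Who created Haystack?",
        answer="deepset",
    ).with_inputs("context", "question")
]

optimizer = BootstrapFewShot(metric=correct_and_concise)
optimized_rag = optimizer.compile(RAG(), trainset=trainset)
# inspect the optimized prompt and reuse it in the Haystack RAG pipeline
```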
posted an update about 1 month ago
Do you want to play a game against Llama 3? 🦙🦙🦙

Meet 🧑‍🏫 AutoQuizzer, a new LLM application that you can use for learning or just for fun.

Try it out on Hugging Face Spaces 🤗 deepset/autoquizzer

๐‡๐จ๐ฐ ๐ข๐ญ ๐ฐ๐จ๐ซ๐ค๐ฌ
You provide an URL -> A multiple-choice quiz is instantly generated.

🔹 You can play the quiz yourself.

🔹 You can let the LLM play in two different ways:
📕 Closed book: the LLM answers using only the general topic of the quiz, its parametric knowledge, and its reasoning abilities.
🔎🌐 Web RAG: for each question, a Google search is done and the top 3 snippets are included in the prompt for the LLM.

๐’๐ญ๐š๐œ๐ค
๐Ÿ—๏ธ Haystack LLM framework https://haystack.deepset.ai/
๐Ÿฆ™ Llama 3 8B Instruct
โšก Groq

Original idea: @Tuana
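
Here is a hedged sketch of the Web RAG mode described above (not the Space's actual code): for each question, grab the top 3 web snippets and feed them to Llama 3 8B served by Groq through its OpenAI-compatible API. SerperDevWebSearch stands in for "a Google search", and the Groq model id, question, and options are assumptions.

```python
# Hedged sketch of the Web RAG mode: top-3 web snippets -> prompt -> Llama 3 on Groq.
from haystack.components.websearch import SerperDevWebSearch
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

search = SerperDevWebSearch(api_key=Secret.from_env_var("SERPERDEV_API_KEY"), top_k=3)
llm = OpenAIChatGenerator(
    api_key=Secret.from_env_var("GROQ_API_KEY"),
    model="llama3-8b-8192",                         # Groq's Llama 3 8B id (assumption)
    api_base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible endpoint
)

question = "Which company maintains the Haystack LLM framework?"
options = ["A) deepset", "B) OpenAI", "C) Google", "D) Meta"]

snippets = [doc.content for doc in search.run(query=question)["documents"]]
prompt = (
    f"Question: {question}\nOptions: {', '.join(options)}\n"
    "Web snippets:\n- " + "\n- ".join(snippets) +
    "\nReply with the letter of the correct option."
)
answer = llm.run(messages=[ChatMessage.from_user(prompt)])["replies"][0]
print(answer)
```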