llmware
/

bling-1b-0.1

Text Generation

text-generation-inference

doberst committed on Nov 4, 2023

Commit 492f901 • 1 Parent(s): ee3ab5a

Update README.md

Files changed (1)

README.md +18 -1

@@ -49,12 +49,29 @@ The first BLING models have been trained for common RAG scenarios, specifically:
 without the need for a lot of complex instruction verbiage - provide a text passage context, ask questions, and get clear fact-based responses.
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 Any model can provide inaccurate or incomplete information, and should be used in conjunction with appropriate safeguards and fact-checking mechanisms.
 ## How to Get Started with the Model
@@ -97,7 +114,7 @@ BLING models are built on top of EleutherAI/Pythia base - please see citation fo
 Darren Oberst & llmware team
-Please reach out anytime if you are interested in this project and would like to participate and work with us!

 without the need for a lot of complex instruction verbiage - provide a text passage context, ask questions, and get clear fact-based responses.
+### Benchmark Tests
+Evaluated against the benchmark test: [RAG-Instruct-Benchmark-Tester](https://www.huggingface.co/llmware/rag_instruct_benchmark_tester)
+Average of 2 test runs, with 1 point for a correct answer, 0.5 points for a partially correct or blank / NF (not found) answer, 0.0 points for an incorrect answer, and -1 point for a hallucination.
+--Score:  73.25 correct out of 100
+--Not Found Classification:  17.5%
+--Boolean:  29%
+--Math/Logic:  0%
+--Complex Questions (1-5):  1 (Low)
+--Summarization Quality (1-5):  1 (Coherent, extractive)
+For test run results, please see the files "core_rag_test" and "answer_sheet" in the repo.
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 Any model can provide inaccurate or incomplete information, and should be used in conjunction with appropriate safeguards and fact-checking mechanisms.
+This model can be used effectively for quick testing and will be generally accurate in relatively simple extractive Q&A and basic summarization.
 ## How to Get Started with the Model
 Darren Oberst & llmware team
+Please reach out anytime if you are interested in this project.
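
Note on the scoring rule in the added "Benchmark Tests" section: the README states the point values but not how they are applied. A minimal sketch of one plausible reading follows, assuming each question gets a single grade label and each run's total is the sum of its points. The label names (`correct`, `partial`, `incorrect`, `hallucination`), the function names, and the sample grades are hypothetical and are not taken from the benchmark files.

```python
from statistics import mean

# Point values as stated in the README's scoring rule.
# The label names themselves are hypothetical.
POINTS = {
    "correct": 1.0,
    "partial": 0.5,         # partially correct, blank, or "not found"
    "incorrect": 0.0,
    "hallucination": -1.0,
}


def score_run(grades):
    """Sum the points for one test run, given one grade label per question."""
    return sum(POINTS[g] for g in grades)


def average_score(runs):
    """Average the per-run totals across several test runs."""
    return mean(score_run(run) for run in runs)


# Made-up grades for two tiny 4-question runs (not real benchmark data).
run_a = ["correct", "partial", "incorrect", "hallucination"]
run_b = ["correct", "correct", "partial", "incorrect"]
print(average_score([run_a, run_b]))
```

Under this reading, half-point grades averaged over two runs can produce quarter-point totals, which would be consistent with a reported score such as 73.25.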