Improve performance with contextual compression, a technique that compresses retrieved documents and filters out irrelevant information. 9caad80 vinhnx90 committed on Apr 2
Disable tokenizer parallelism in Transformers to avoid deadlocks 91d4c2f unverified Vinh Nguyen committed on Apr 2
Use the all-mpnet-base-v2 embedding model for highest performance 1ec5b20 unverified Vinh Nguyen committed on Apr 2
Refactor app with better document retrieval embedding and better chat streaming db70198 vinhnx90 committed on Mar 31
Refactor app with better document retrieval embedding and better chat streaming 5440da0 vinhnx90 committed on Mar 31
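
For reference, the tokenizer change in 91d4c2f most likely amounts to setting the `TOKENIZERS_PARALLELISM` environment variable, which the Hugging Face tokenizers library reads to decide whether to use its internal thread pool. The diff itself is not shown here, so the snippet below is a minimal sketch of the usual approach, not the commit's actual contents.

```python
import os

# Assumption: this runs at the top of the app's entry point, before any
# tokenizer is loaded or used, so the setting is in place from the start.
# "false" turns off the tokenizers library's internal parallelism, which
# can otherwise deadlock when the process forks after a tokenizer was used.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```

Setting the variable in code keeps the behavior the same regardless of how the app is launched; exporting it in the shell or in the hosting environment's settings would have the same effect.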