Santiago Viquez
santiviquez
AI & ML interests
ML @ NannyML. A bit of everything. NLP, RL, and, of course, tabular. In the GenAI era, how can you not love tabular data? Educational content and OSS.
Posts
19
Post
1445
I ran 580 experiments (yes, 580 🤯) to check if we can quantify data drift's impact on model performance using only drift metrics.
For these experiments, I built a technique that relies on drift signals to estimate model performance. I compared its results against the current SoTA performance estimation methods and checked which technique performs best.
The plot below summarizes the overall results. It shows the quality of the performance estimation against the absolute performance change (lower is better).
Full experiment: https://www.nannyml.com/blog/data-drift-estimate-model-performance
In it, I describe the setup, datasets, models, benchmarking methods, and the code used in the project.
Post
1555
Looking for someone with 10+ years of experience training Deep Kolmogorov-Arnold Networks.
Any suggestions?
Collections
1
Collection of LLM hallucination and evaluation papers that I've been exploring and implementing. Some of them have my comments and annotated doodles.
- Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation
  Paper • 2208.05309 • 1
- LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models
  Paper • 2305.13711 • 2
- Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
  Paper • 2302.09664 • 2
- BARTScore: Evaluating Generated Text as Text Generation
  Paper • 2106.11520 • 1
Models
16
santiviquez/t5-small-finetuned-samsum-en • Summarization • 27
santiviquez/bart-base-finetuned-samsum-en • Summarization • 12
santiviquez/amazon-reviews-sentiment-bert-base-uncased-6000-samples
santiviquez/amazon-reviews-sentiment-distilbert-base-uncased-6000-samples • Text Classification • 12
santiviquez/amazon-reviews-finetuning-distilbert-base-uncased • Text Classification • 11
santiviquez/amazon-reviews-finetuning-distilbert-base-uncased_books • Text Classification • 5
santiviquez/amazon-reviews-finetuning-bert-base-sentiment • Text Classification • 13
santiviquez/amazon_reviews_finetuning-sentiment-model-3000-samples • Text Classification • 12
santiviquez/noisy_human_cnn
santiviquez/ssr-base-finetuned-samsum-en • Summarization • 13