--- library_name: transformers tags: - e-commerce - query-generation license: mit datasets: - smartcat/Amazon-2023-GenQ language: - en metrics: - rouge base_model: - BeIR/query-gen-msmarco-t5-base-v1 pipeline_tag: text2text-generation --- # Model Card for T5-GenQ-TD-v1 🤖 ✨ 🔍 Generate precise, realistic user-focused search queries from product text 🛒 🚀 📊 ### Model Description - **Model Name:** Fine-Tuned Query-Generation Model - **Model type:** Text-to-Text Transformer - **Finetuned from model:** [BeIR/query-gen-msmarco-t5-base-v1](https://huggingface.co/BeIR/query-gen-msmarco-t5-base-v1) - **Dataset**: [smartcat/Amazon-2023-GenQ](https://huggingface.co/datasets/smartcat/Amazon-2023-GenQ) - **Primary Use Case**: Generating accurate and relevant search queries from item descriptions - **Repository:** [smartcat-labs/product2query](https://github.com/smartcat-labs/product2query) ### Model variations
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
---|---|---|---|---|
T5-GenQ-T-v1 | 75.2151 | 54.8735 | 74.5142 | 74.5262 |
T5-GenQ-TD-v1 | 78.2570 | 58.9586 | 77.5308 | 77.5466 |
T5-GenQ-TDE-v1 | 76.9075 | 57.0980 | 76.1464 | 76.1502 |
T5-GenQ-TDC-v1 (best) | 80.0754 | 61.5974 | 79.3557 | 79.3427 |
Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
---|---|---|---|---|
T5-GenQ-TD-v1 | 76.15 | 56.23 | 75.49 | 75.49 |
query-gen-msmarco-t5-base-v1 | 34.92 | 15.28 | 34.17 | 34.17 |
Input Text | Target Query | Before Fine-tuning | After Fine-tuning |
---|---|---|---|
Dr. Scholl's Women's Trance Slip Resistant Clog Our trance work shoe combines exceptional style and performance. An oil and slip-resistant outsole combined with a molded EVA construction will add layers of safety and comfort. |
Dr. Scholl's Women's Trance Clog | dr scholl trance shoes | Dr. Scholl's Trance Clog |
Girls Birthday Tutu Skirts Dress with Mermaid Birthday Girl Tshirt, Headband, Satin Sash Girls Mermaid Dress Set with Tshirt, Dress, Headband and sash. |
girls mermaid dress set | what to wear for a mermaid birthday | Girls Mermaid Birthday Dress Set |
Saucony Women's Omni 15 Running Shoe If we could design shoelaces for pronators, we’d do that too. The omni 15 delivers everything a moderate to severe pronator could need, including enhanced cushioning, exceptional support, flexibility and a smooth, fluid ride. |
Saucony Omni 15 Women's Running Shoe | what shoes are good for pronators | Saucony Omni 15 Running Shoe |
Epoch | Step | Loss | Grad Norm | Learning Rate | Eval Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
---|---|---|---|---|---|---|---|---|---|
1.0 | 4285 | 0.2515 | 1.890405 | 0.000049 | 0.165247 | 76.4578 | 56.4813 | 75.7754 | 75.7835 |
2.0 | 8570 | 0.1744 | 1.433518 | 0.000042 | 0.157739 | 77.2138 | 57.4609 | 76.5478 | 76.5589 |
3.0 | 12855 | 0.1595 | 1.340541 | 0.000035 | 0.154977 | 77.5761 | 57.9620 | 76.8824 | 76.8854 |
4.0 | 17140 | 0.1488 | 1.370982 | 0.000028 | 0.153134 | 77.9366 | 58.5720 | 77.2561 | 77.2692 |
5.0 | 21425 | 0.1407 | 1.549360 | 0.000021 | 0.153177 | 78.1102 | 58.7207 | 77.4106 | 77.4241 |
6.0 | 25710 | 0.1344 | 1.258538 | 0.000014 | 0.152852 | 78.1691 | 58.8640 | 77.4554 | 77.4651 |
7.0 | 29995 | 0.1299 | 1.200458 | 0.000007 | 0.153884 | 78.2001 | 58.8603 | 77.4833 | 77.4984 |
8.0 | 34280 | 0.1267 | 1.079393 | 0.000000 | 0.154507 | 78.2570 | 58.9586 | 77.5308 | 77.5466 |
![]() |
```checkpoint-34280``` (T5-GenQ-TD-v1) significantly outperforms ```query-gen-msmarco-t5-base-v1``` across all ROUGE metrics. The difference is most notable in ROUGE-2, where ```checkpoint-34280``` achieves 56.24% vs. 15.29% for the baseline model. These results suggest ```checkpoint-34280``` produces more precise and high-overlap text generations. |
![]() |
```checkpoint-34280``` (T5-GenQ-TD-v1) has strong peaks near 100%, indicating high overlap with reference texts. ```query-gen-msmarco-t5-base-v1``` shows a broader distribution, with peaks at low to mid-range scores (10-40%), suggesting greater variability but lower precision. ROUGE-2 has a high density at 0% for the baseline model, implying many instances with no bigram overlap. |
![]() |
```checkpoint-34280``` (T5-GenQ-TD-v1, blue) shows a steady increase toward high ROUGE scores, peaking at 100%. ```query-gen-msmarco-t5-base-v1``` (orange) has multiple low-score peaks, particularly in ROUGE-2, reinforcing its lower text overlap performance. These histograms confirm that ```checkpoint-34280``` consistently generates more accurate outputs. |
![]() |
This visualization compares average ROUGE scores and score differences across different word sizes. Consistent ROUGE Scores (Sizes 2-8): ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-LSUM scores remain high and stable across most word sizes. Sharp Drop at Size 9: A significant decrease in scores occurs for size 9 words, with negative score differences, suggesting longer phrases are less aligned with reference texts. Score Differences Stay Near Zero (Sizes 2-8): Models perform similarly for shorter text spans but diverge at larger word sizes. |
![]() |
This histogram visualizes the distribution of cosine similarity scores, which measure the semantic similarity between paired texts (generated query and target query). A strong peak near 1.0 indicates that most pairs are highly semantically similar. Low similarity scores (0.0–0.4) are rare, suggesting the dataset consists mostly of highly related text pairs. |
![]() |
This scatter plot matrix shows the relationship between semantic similarity (cosine similarity) and ROUGE scores: Higher similarity → Higher ROUGE scores, indicating a positive correlation. ROUGE-1 & ROUGE-L show the strongest alignment, while ROUGE-2 has greater variance. Some low-similarity outliers still achieve moderate ROUGE scores, suggesting surface-level overlap without deep semantic alignment. This analysis helps understand how semantic similarity aligns with n-gram overlap metrics for evaluating text models. |