@deoxykev I am still working on it, but for now, other zero-shot approaches built specifically for this task, such as GLiNER, demonstrate better precision.
Ihor Stepanov
AI & ML interests
Text classification, computational biology, relation extraction, path reasoning
Recent Activity
replied to their post 4 days ago
🚀 Reproducing DeepSeek R1 for Text-to-Graph Extraction
liked a model 4 days ago
takara-ai/SwarmFormer-Sentiment-Base
updated a dataset 5 days ago
Ihor/Text2Graph-Open-R1
Ihor's activity
replied to their post 4 days ago
Release the training / fine-tuning scripts?
#1 opened 6 days ago by yashmalviya
commented on "Replicating DeepSeek R1 for Information Extraction" 6 days ago
Additionally, we take the JSON data generated with structured prediction and feed it, together with the text, into DeepSeek-R1 Llama 70B to generate a chain of thought that explains the extraction process.
Why don't you use the original R1 (>600B) to get the best results?
From my tests, I didn't see a major difference between the models for this task, so the smaller model was chosen as more efficient.
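For illustration, here is a minimal sketch of how that chain-of-thought generation step could look. The prompt wording and the use of deepseek-ai/DeepSeek-R1-Distill-Llama-70B via transformers are assumptions, not the exact pipeline.

```python
import json
from transformers import pipeline

# Hypothetical sketch of the distillation step described above: prompt the
# distilled R1 model to explain how a structured-prediction JSON graph follows
# from the source text. Model choice and prompt wording are assumptions.
generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
    device_map="auto",
)

def build_cot_prompt(text: str, graph: dict) -> str:
    """Ask the teacher model to justify each entity and relation in the graph."""
    return (
        "Given the source text and the extracted JSON graph, explain step by step "
        "how each entity and relation can be derived from the text.\n\n"
        f"Text:\n{text}\n\n"
        f"Extracted graph:\n{json.dumps(graph, indent=2)}\n"
    )

prompt = build_cot_prompt(
    "Marie Curie received the Nobel Prize in Physics in 1903.",
    {
        "entities": ["Marie Curie", "Nobel Prize in Physics", "1903"],
        "relations": [["Marie Curie", "received", "Nobel Prize in Physics"]],
    },
)
reasoning = generator(prompt, max_new_tokens=512)[0]["generated_text"]
```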
upvoted an article 6 days ago
Open-R1: Update #1
upvoted an article 8 days ago
Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial
posted an update 9 days ago
🚀 Reproducing DeepSeek R1 for Text-to-Graph Extraction
I’ve been working on replicating DeepSeek R1, focusing on zero-shot text-to-graph extraction—a challenging task where LMs extract entities and relations from text based on predefined types.
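For illustration, here is roughly what a single extraction looks like; the exact prompt and output schema used in the project are not shown in this post, so the field names below are assumptions.

```python
# Illustrative only: the project's actual schema is not shown in this post,
# so the entity/relation field names here are assumptions.
text = "Marie Curie received the Nobel Prize in Physics in 1903."
entity_types = ["person", "award", "date"]
relation_types = ["received", "awarded in"]

# A zero-shot text-to-graph model is expected to return a JSON graph such as:
expected_graph = {
    "entities": [
        {"text": "Marie Curie", "type": "person"},
        {"text": "Nobel Prize in Physics", "type": "award"},
        {"text": "1903", "type": "date"},
    ],
    "relations": [
        {"head": "Marie Curie", "type": "received", "tail": "Nobel Prize in Physics"},
        {"head": "Nobel Prize in Physics", "type": "awarded in", "tail": "1903"},
    ],
}
```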
🧠 Key Insight:
Language models struggle when constrained by entity/relation types. Supervised training alone isn’t enough, but reinforcement learning (RL), specifically Group Relative Policy Optimization (GRPO), shows promise.
💡 Why GRPO?
It trains the model to generate structured graphs, optimizing multiple reward functions (format, JSON validity, and extraction accuracy); a minimal sketch of such rewards follows below.
It allows the model to learn from both positive and hard negative examples dynamically.
The RL setup can be tuned to emphasize improvements in relation extraction.
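As referenced above, here is a minimal sketch of what reward functions along those three axes could look like; the tag format, triple representation, and scoring are assumptions, not the project's actual reward code.

```python
import json
import re

# Minimal sketch of the three reward signals described in the post (format,
# JSON validity, extraction accuracy). The <answer> tag convention and the
# (head, relation, tail) triple representation are assumptions.

def format_reward(completion: str) -> float:
    """1.0 if the completion wraps its graph in the expected <answer> tags."""
    return 1.0 if re.search(r"<answer>.*?</answer>", completion, re.DOTALL) else 0.0

def json_validity_reward(completion: str) -> float:
    """1.0 if the answer block parses as JSON."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match is None:
        return 0.0
    try:
        json.loads(match.group(1))
        return 1.0
    except json.JSONDecodeError:
        return 0.0

def extraction_f1_reward(predicted_triples: set, gold_triples: set) -> float:
    """Micro-F1 over (head, relation, tail) triples against the gold graph."""
    if not predicted_triples or not gold_triples:
        return 0.0
    tp = len(predicted_triples & gold_triples)
    precision = tp / len(predicted_triples)
    recall = tp / len(gold_triples)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a GRPO setup (e.g. with trl's GRPOTrainer), rewards like these would be evaluated for each sampled completion and combined into a scalar score that drives the policy update.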
📊 Early Results:
Even with limited training, F1 scores consistently improved, and we saw clear benefits from RL-based optimization. More training = better performance!
🔬 Next Steps:
We’re scaling up experiments with larger models and high-quality data. Stay tuned for updates! Meanwhile, check out one of our experimental models here:
https://huggingface.co/Ihor/Text2Graph-R1-Qwen2.5-0.5b
📔 Learn more details from the blog post: https://medium.com/p/d8b648d9f419
Feel free to share your thoughts and ask questions!