Stepanov

Ihor

AI & ML interests

Text classification, computational biology, relation extraction, path reasoning

Recent Activity

updated a model 1 day ago
Ihor/gliner-biomed-base-1stg-v1.0
updated a model 1 day ago
Ihor/gliner-biomed-small-1stg-v1.0
published a dataset 1 day ago
Ihor/Text2Graph-Open-R1

Organizations

Knowledgator Engineering, Blog-explorers, GLiNER Community, eyva.ai

Posts 8

🚀 Reproducing DeepSeek R1 for Text-to-Graph Extraction

I've been working on replicating DeepSeek R1, focusing on zero-shot text-to-graph extraction, a challenging task in which LMs extract entities and relations from text based on predefined types.

🧠 Key Insight:
Language models struggle when constrained by entity/relation types. Supervised training alone isn't enough, but reinforcement learning (RL), specifically Group Relative Policy Optimization (GRPO), shows promise.
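
For context, a minimal sketch of the group-relative part of GRPO: several completions are sampled for the same prompt, and each one's reward is compared with the others in that group. The function name, the `eps` constant, and the use of the population standard deviation are illustrative assumptions, not details taken from our training code.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of completions for the same prompt.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero; real implementations may differ in these details.
    """
    baseline = mean(rewards)   # group mean acts as the baseline
    spread = pstdev(rewards)   # group spread rescales the signal
    return [(r - baseline) / (spread + eps) for r in rewards]


# Completions that beat the group average get a positive advantage,
# the rest get a negative one.
advantages = group_relative_advantages([1.0, 0.2, 0.0, 0.6])
```

If every completion in a group gets the same reward, all advantages are zero and that group contributes no learning signal.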

💡 Why GRPO?
It trains the model to generate structured graphs, optimizing multiple reward functions (format, JSON validity, and extraction accuracy).
It allows the model to learn from both positive and hard negative examples dynamically.
The RL setup can be tuned to put more weight on relation extraction.
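
The three reward signals mentioned above could look roughly like this. It is a sketch under stated assumptions: the `<answer>` tag convention, the `relations` / `head` / `type` / `tail` JSON keys, and the weights are all hypothetical, chosen only for illustration.

```python
import json
import re
from typing import Optional, Set, Tuple

# Hypothetical convention: the model wraps its graph in <answer>...</answer>.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(completion: str) -> Optional[str]:
    match = ANSWER_RE.search(completion)
    return match.group(1).strip() if match else None


def parse_graph(completion: str) -> Optional[dict]:
    """Return the answer as a dict, or None if it is missing or not a JSON object."""
    answer = extract_answer(completion)
    if answer is None:
        return None
    try:
        graph = json.loads(answer)
    except json.JSONDecodeError:
        return None
    return graph if isinstance(graph, dict) else None


def format_reward(completion: str) -> float:
    """1.0 if the completion contains an <answer> block, else 0.0."""
    return 1.0 if extract_answer(completion) is not None else 0.0


def json_reward(completion: str) -> float:
    """1.0 if the answer block parses as a JSON object, else 0.0."""
    return 1.0 if parse_graph(completion) is not None else 0.0


def relation_triples(graph: dict) -> Set[Tuple[str, str, str]]:
    """Collect (head, type, tail) triples; malformed entries are skipped."""
    triples: Set[Tuple[str, str, str]] = set()
    relations = graph.get("relations", [])
    if not isinstance(relations, list):
        return triples
    for rel in relations:
        if isinstance(rel, dict) and {"head", "type", "tail"} <= rel.keys():
            triples.add((str(rel["head"]), str(rel["type"]), str(rel["tail"])))
    return triples


def f1_reward(completion: str, gold_graph: dict) -> float:
    """F1 between predicted and gold relation triples (exact match)."""
    graph = parse_graph(completion)
    if graph is None:
        return 0.0
    predicted, gold = relation_triples(graph), relation_triples(gold_graph)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def total_reward(completion: str, gold_graph: dict,
                 weights: Tuple[float, float, float] = (0.2, 0.2, 0.6)) -> float:
    """Weighted sum of the three signals; the weights are illustrative."""
    w_format, w_json, w_f1 = weights
    return (w_format * format_reward(completion)
            + w_json * json_reward(completion)
            + w_f1 * f1_reward(completion, gold_graph))
```

Raising the weight on the F1 term is one simple way to shift the emphasis toward relation extraction.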

📊 Early Results:
Even with limited training, F1 scores consistently improved, and we saw clear benefits from RL-based optimization. More training = better performance!

🔬 Next Steps:
We're scaling up experiments with larger models and high-quality data. Stay tuned for updates! Meanwhile, check out one of our experimental models here:
Ihor/Text2Graph-R1-Qwen2.5-0.5b

📔 Learn more details from the blog post: https://medium.com/p/d8b648d9f419

Feel free to share your thoughts and ask questions!

🚀 Welcome the New and Improved GLiNER-Multitask! 🚀

Since the release of our beta version, GLiNER-Multitask has received many positive responses and has been adopted in many consulting, research, and production environments. Thank you, everyone, for your feedback. It helped us rethink the strengths and weaknesses of the first model, and we are excited to present the next iteration of this multi-task information extraction model.

💡 What's New?
Here are the key improvements in this latest version:
🔹 Expanded Task Support: Now includes text classification and other new capabilities.
🔹 Enhanced Relation Extraction: Significantly improved accuracy and robustness.
🔹 Improved Prompt Understanding: Optimized for open-information extraction tasks.
🔹 Better Named Entity Recognition (NER): More accurate and reliable results.

🔧 How We Made It Better:
These advancements were made possible by:
🔹 Leveraging a better and more diverse dataset.
🔹 Using a larger backbone model for increased capacity.
🔹 Implementing advanced model merging techniques.
🔹 Employing self-learning strategies for continuous improvement.
🔹 Improving training strategies and hyperparameter tuning.

📄 Read the Paper: https://arxiv.org/abs/2406.12925
⚙️ Try the Model: knowledgator/gliner-multitask-v1.0
💻 Test the Demo: knowledgator/GLiNER_HandyLab
📌 Explore the Repo: https://github.com/urchade/GLiNER