maywell 
posted an update Apr 28
🔥 Transfer a model's chat feature, context length, and knowledge to another model in under 1 minute, without any training.

Imagine being able to create chat models, expand context, and transfer domain-specific knowledge to models, all within a matter of minutes. Our innovative approach, based on a combination of diff-based techniques and sigmoid ratio calculations, makes this possible.

By considering the diffs between the model that carries the desired information (long context or chat) and the base model, as well as the diffs between the base model and the target model, we can efficiently transfer features and expand context without the need for extensive training or resources.
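To make the two diffs concrete, here is a minimal sketch in plain Python. It is not the code from the blog post: the function names, the `scale` parameter, and the exact sigmoid-based weighting are assumptions chosen for illustration. Flat lists of floats stand in for a single weight tensor. The idea shown is that weights the target model has already moved far from the base receive less of the transferred diff.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def transfer_feature(base, informative, target, scale=1.0):
    """Apply the informative model's diff to the target, weight by weight.

    base, informative, target: equal-length lists of floats standing in
    for one weight tensor of each model (hypothetical representation).
    scale: hypothetical knob controlling how quickly the ratio decays.
    """
    merged = []
    for b, i, t in zip(base, informative, target):
        feature_diff = i - b  # what the informative model changed relative to base
        target_diff = t - b   # what the target model changed relative to base
        # Assumed weighting: ratio is 1.0 where the target left the weight
        # untouched and approaches 0.0 where the target changed it a lot.
        ratio = 2.0 * (1.0 - sigmoid(scale * abs(target_diff)))
        merged.append(t + ratio * feature_diff)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0]
    informative = [1.0, 1.0]  # e.g. a chat or long-context variant of base
    target = [0.0, 10.0]      # e.g. a domain-tuned variant of base
    print(transfer_feature(base, informative, target))
```

In this sketch, a weight the target never touched takes the full informative diff, while a heavily modified weight stays close to the target's value. The blog post linked below describes the actual ratio calculation.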

Our method minimizes model degradation and ensures that only the desired information is captured, resulting in high-quality models that can be created with just a single click. Whether you need a chat model, expanded context, or domain-specific knowledge transfer, our approach offers a rapid and effective solution.

In the blog post below, we dive into the details of our method, provide code examples, and showcase the results achieved using our approach. Get ready to revolutionize your model creation process and unlock new possibilities with this powerful technique.

Blog - https://huggingface.co/blog/maywell/llm-feature-transfer

Hi!
Could you please read this article: https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail
I'd be interested to hear your thoughts!

Hi, I just read it. Its merging method with calibration looks interesting. Without calibration, I don't see their method having a significant benefit over previous methods.