29 4 33

Fidite Nemini PRO

FiditeNemini

FiditeNemini2023

AI & ML interests

Prompt engineering, unalignment, MLX, model merging, diffusion models

Recent Activity

liked a model 5 days ago

huihui-ai/DeepSeek-V3-abliterated

liked a model 5 days ago

huihui-ai/QwQ-32B-abliterated

new activity 10 days ago

TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF:Wrong gguf's in repo?

View all activity

Organizations

FiditeNemini's activity

liked 2 models 5 days ago

huihui-ai/DeepSeek-V3-abliterated

Updated about 9 hours ago • 64

huihui-ai/QwQ-32B-abliterated

Text Generation • Updated about 18 hours ago • 768 • 39

New activity in TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF 10 days ago

Wrong gguf's in repo?

#1 opened 11 days ago by

FiditeNemini

reacted to luigi12345's post with 👍 22 days ago

Post

1770

✅BEST DEBUG PROMPT
Language: Any. 🌀Project Type: Any

What prompt if sent to you  will make you  detect and fix all the code crashing  issues in the COMPLETE codebase so I don't have to ask you to fix again and again ?
Step 1. Gimme such prompt
Step 2 Follow it yourself quietly and COMPLETELY
Step3: State  that if you are asked again about finding fatal bugs and logic issues and inconsistencies in the current codebase you would not be able  to  find more. (You cannot lie, so you must make all the code adjustments necessary prior to  such statement).

reacted to grimjim's post with 👍 27 days ago

Post

2104

This recent paper points to an explanation for the unreasonable effectiveness of Frankenmerges: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (2502.05171)

Specifically, the duplication of layers in Frankenmerges serves a purpose similar to what occurs in their recurrent-depth architecture. Successful frankenmerges that operate without additional fine-tuning are able to recover or "heal" from any damage due to abrupt transitions between layer blocks. Operational replicated layer blocks can provide functional benefits grounded in latent reasoning. Frankenmerges can also result in hybrid reasoning, by splicing together the latent reasoning of different models.

Back in April 2024, I was able to duplicate a few layers in the Llama 3 8B model, turning it into a 9B model, without harming benchmarks significantly, despite any transition damage.
grimjim/llama-3-experiment-v1-9B
My informal experimentation suggested that latent reasoning circuits could occupy continguous stacks of 2-4 layers, though the result was highly sensitive to the choice of transition location between layers.

1 reply

liked a dataset about 1 month ago

cognitivecomputations/dolphin-r1

Viewer • Updated Jan 30 • 814k • 4.87k • 271

updated a model about 1 month ago

FiditeNemini/Unhinged-Author-70B

Text Generation • Updated Jan 29 • 46 • 1

published a model about 1 month ago

FiditeNemini/Unhinged-Author-70B

Text Generation • Updated Jan 29 • 46 • 1

updated a model about 1 month ago

FiditeNemini/Unhinged-Qwen2.5-R1-1M-Uncensored-BF16

Updated Jan 28 • 38

published a model about 1 month ago

FiditeNemini/Unhinged-Qwen2.5-R1-1M-Uncensored-BF16

Updated Jan 28 • 38

reacted to mkurman's post with 🔥 about 1 month ago

Post

1918

I’ve simplified things for the AI OS community!

Check out Qwen-2.5-14B-DeepSeek-R1-1M! This one's a cool blend of the latest Qwen 2.5 with 14 billion parameters and has a massive 1 million token context window. It also comes with the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M