wing lian PRO

winglian

AI & ML interests

None yet

Recent Activity

liked a dataset about 11 hours ago

Intelligent-Internet/II-Thought-RL-v0

liked a dataset 2 days ago

secemp9/instruction_solution_thought

updated a model 2 days ago

winglian/reasoning-llama-3.1-70b-stratos-cold-start

winglian's activity

New activity in moonshotai/Moonlight-16B-A3B-Instruct 16 days ago

remove import/call to code no longer in latest transformers

#3 opened 16 days ago by

New activity in deepseek-ai/DeepSeek-V3 about 1 month ago

remove reference to deprecated transformers code

#74 opened about 1 month ago by

New activity in nvidia/Hymba-1.5B-Base 3 months ago

fix int/str for conv_dim indexing

#5 opened 3 months ago by

New activity in axolotl-ai-co/romulus-mistral-nemo-12b-simpo 6 months ago

Update README.md

#2 opened 6 months ago by

New activity in deepseek-ai/DeepSeek-Prover-V1.5-Base 7 months ago

the config class and config.json uses DeepseekConfig, not v2

#5 opened 7 months ago by

Match the config class name to what the modeling code expects

#4 opened 7 months ago by

New activity in microsoft/Phi-3.5-mini-instruct 7 months ago

trust_remote_code=True

#9 opened 7 months ago by

New activity in NousResearch/Hermes-2-Pro-Llama-3-8B 10 months ago

add axolotl tag

#1 opened 10 months ago by

New activity in mattshumer/Llama-3-8B-16K 11 months ago

add axolotl tag

#3 opened 11 months ago by

New activity in cognitivecomputations/dolphin-2.9-llama3-8b 11 months ago

add axolotl tag

#12 opened 11 months ago by

New activity in openbmb/Eurus-RM-7b 11 months ago

Enable flash_attention_2 support since the underlying Mistral model supports it

#3 opened 11 months ago by

New activity in meta-llama/Meta-Llama-3-8B 11 months ago

Rename original/tokenizer.model to tokenizer.model

#6 opened 11 months ago by

commented a paper 11 months ago

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2, 2024 • 58

New activity in PrunaAI/dbrx-base-bnb-4bit 11 months ago

invalid weights doesn't match modeling code

#3 opened 11 months ago by

New activity in SinclairSchneider/dbrx-base-quantization-fixed 11 months ago

reduce verbosity of logging

#1 opened 11 months ago by

New activity in databricks/dbrx-instruct 12 months ago

The fused expert parameters means load_in_4bit doesn't work properly, nor does LoRA

#10 opened 12 months ago by

New activity in LnL-AI/dbrx-base-converted-v2 12 months ago

reduce logging verbosity

#3 opened 12 months ago by

New activity in SinclairSchneider/dbrx-instruct-quantization-fixed 12 months ago

dbrx-base

#2 opened 12 months ago by

New activity in ai21labs/Jamba-v0.1 12 months ago

finetuning issues

#9 opened 12 months ago by

Fix bias logic to enable QLoRA finetuning

#5 opened 12 months ago by