JJ

J22

AI & ML interests

None yet

Recent Activity

new activity about 2 months ago

ibm-granite/granite-3.0-3b-a800m-instruct: Upload tokenizer.json

updated a model about 2 months ago

ibm-granite/granite-3.0-3b-a800m-instruct

new activity about 2 months ago

facebook/MobileLLM-1B: a horrible function in `modeling_mobilellm.py`

Organizations

None yet

J22's activity

New activity in ibm-granite/granite-3.0-3b-a800m-instruct about 2 months ago

Upload tokenizer.json

#1 opened about 2 months ago by

New activity in facebook/MobileLLM-1B about 2 months ago

a horrible function in `modeling_mobilellm.py`

#5 opened about 2 months ago by

New activity in allenai/OLMoE-1B-7B-0924-Instruct 3 months ago

Run this on CPU

#6 opened 3 months ago by

New activity in openbmb/MiniCPM3-4B 3 months ago

Run on CPU

#13 opened 3 months ago by

New activity in microsoft/Phi-3.5-MoE-instruct 4 months ago

need gguf

#4 opened 4 months ago by

New activity in meta-llama/Llama-3.1-8B-Instruct 5 months ago

Best practice for tool calling with meta-llama/Meta-Llama-3.1-8B-Instruct

#33 opened 5 months ago by

Run this on CPU and use tool calling

#38 opened 5 months ago by

New activity in AI-MO/NuminaMath-7B-TIR 5 months ago

My alternative quantizations.

#5 opened 5 months ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 6 months ago

Tool calling is supported by ChatLLM.cpp

#36 opened 6 months ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 7 months ago

can't say hello

#9 opened 7 months ago by

no system message?

#14 opened 7 months ago by

New activity in microsoft/Phi-3-small-8k-instruct 7 months ago

"small" is so different from "mini" and "medium"

#8 opened 7 months ago by

New activity in nvidia/Llama3-ChatQA-1.5-8B 8 months ago

how to set context in multi-turn QA?

#14 opened 8 months ago by

New activity in microsoft/Phi-3-mini-128k-instruct 8 months ago

clarification on the usage of `short_factor` and `long_factor`?

#49 opened 8 months ago by

Continue the discussion: `long_factor` and `short_factor`

#32 opened 8 months ago by

New activity in microsoft/Phi-3-mini-4k-instruct 8 months ago

is the '\n' after `'<|end|>'`?

#43 opened 8 months ago by

Is sliding window used or not?

#25 opened 8 months ago by

New activity in microsoft/Phi-3-mini-128k-instruct 8 months ago

`long_factor` is never used?

#22 opened 8 months ago by

generate +6 min, +20GB V-ram

#17 opened 8 months ago by

`sliding_window` is larger than `max_position_embeddings`

#21 opened 8 months ago by