Jue Wang

juewang

https://juewang.me/about/

juewang's activity

New activity in codellama/CodeLlama-70b-Instruct-hf about 1 year ago

Context length?

#2 opened about 1 year ago by

New activity in EleutherAI/neox-ckpt-pythia-12b-v1 over 1 year ago

Missing files?

#1 opened over 1 year ago by

New activity in togethercomputer/LLaMA-2-7B-32K over 1 year ago

Correct the output dtype of rmsnorm_func

#13 opened over 1 year ago by

how to fine tune peft qlora and SFTTrainer?

#2 opened over 1 year ago by

New activity in togethercomputer/RedPajama-INCITE-7B-Instruct over 1 year ago

Poor performance?

#6 opened over 1 year ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B almost 2 years ago

Can you help me fine-tune this with LoRA? (Having an error)

#12 opened almost 2 years ago by

What kind of machine would be suitable for this model (in amazon sagemaker)?

#7 opened almost 2 years ago by

Will it be possible to run this on PC with 8 GeForce RTX 3060 with 8 Gb VRAM each?

#11 opened almost 2 years ago by

New activity in togethercomputer/GPT-JT-6B-v1 almost 2 years ago

Any way to set the "stop, split by" when running the model locally?

#26 opened almost 2 years ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B almost 2 years ago

Issue with loading model to GPU when using pipeline

#5 opened almost 2 years ago by

Is it a wrong prompt?

#8 opened almost 2 years ago by tatyanavidrevich

New activity in togethercomputer/GPT-JT-6B-v1 almost 2 years ago

Feature requests and suggestions for V2

#4 opened over 2 years ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B almost 2 years ago

use accelerate to load model

#4 opened almost 2 years ago by

This model requires A LOT of resources... But how much? Trying to build a chatbot

#3 opened about 2 years ago by

New activity in togethercomputer/GPT-JT-6B-v1 almost 2 years ago

Generated Text have issues

#22 opened about 2 years ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B almost 2 years ago

Is UL2 used?

#2 opened about 2 years ago by

New activity in togethercomputer/GPT-JT-6B-v1 about 2 years ago

Question-Answering over documents

#19 opened about 2 years ago by

Confused about bidirectional attention when implementing custom sampling loop

#25 opened about 2 years ago by ericanthonymitchell

Model behavior during adaptation phase

#24 opened about 2 years ago by

Fine Tuning // Download Full Weights

#23 opened about 2 years ago by