which dataset?

#1
by cruiser - opened

gpt4all or something else? If gpt4all, hopefully it was trained on the unfiltered dataset with all the "as a large language model" responses removed.

I hope it's a GPT-4 dataset without any "I'm sorry, as a large language model" bullshit inside.

It is GPT-4 self-instruct.

I have GPT4All installed. How do I install this?

This has nothing to do with GPT4All; you pip install transformers and run inference on this model.
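For example, a minimal inference sketch (the model path and the Alpaca-style prompt format are assumptions; substitute the actual repo id or local path):

```python
# Minimal inference sketch; "path/to/this-model" is a placeholder and the
# Alpaca-style "### Instruction / ### Response" prompt format is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/this-model")
model = AutoModelForCausalLM.from_pretrained("path/to/this-model")

prompt = "### Instruction:\nExplain what a self-instruct dataset is.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```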

will you be releasing the dataset for further research?

Here is the dataset: https://github.com/teknium1/GPTeacher
It's the general-instruct set.

ValueError: Tokenizer class LlamaTokenizer does not exist or is not currently imported.

You are not in the right thread! Try 'pip install git+https://github.com/huggingface/transformers' in your environment.
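A quick sanity check after reinstalling (just a sketch; no model download needed):

```python
# After installing transformers from source, the Llama classes should import cleanly.
import transformers
from transformers import LlamaTokenizer  # raises ImportError on older releases

print(transformers.__version__)
```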

Yes, there is an issue with the class names in the model config: open it up and change LLaMA to Llama.

There is also an issue in the tokenizer config: change 512 to 2048 in the max token field.
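For reference, a sketch of those two edits; the field names assume the standard config.json / tokenizer_config.json layout and can differ slightly between transformers versions:

```python
# Sketch of the two config edits described above; run from the model directory.
# Field names assume the usual config.json / tokenizer_config.json layout.
import json

with open("config.json") as f:
    cfg = json.load(f)
cfg["architectures"] = ["LlamaForCausalLM"]   # was ["LLaMAForCausalLM"]
cfg["tokenizer_class"] = "LlamaTokenizer"     # was "LLaMATokenizer"
with open("config.json", "w") as f:
    json.dump(cfg, f, indent=2)

with open("tokenizer_config.json") as f:
    tok_cfg = json.load(f)
tok_cfg["model_max_length"] = 2048            # was 512
with open("tokenizer_config.json", "w") as f:
    json.dump(tok_cfg, f, indent=2)
```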

Hi @teknium, where in the fine-tuning code is the max sequence length changed to 2048?

LLaMA is already set for 2048 tokens; it's just set wrong in the config here.

Thank you, I got it now.

It seems weird that LLaMA recommends changing the config to 512 to make it fit better on GPUs; I always thought that the input size to an LLM is fixed and shorter inputs are always padded to the maximum length anyway. A question that doesn't relate to this repo, but:

How does reducing the sequence length to 512 during inference (like in LLaMA) help? Wouldn't the model just pad to the maximum size it was trained on?
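To make the question concrete, a tiny sketch (the model path is a placeholder):

```python
# Tiny sketch to make the question concrete; "path/to/model" is a placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/model")
ids = tokenizer("A short prompt.", return_tensors="pt").input_ids
print(ids.shape[-1])               # only a handful of tokens
print(tokenizer.model_max_length)  # 2048 after the config fix above
```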
