manojpreveen/gpt-neoxt-20b-v2


Duplicate from iamplus/gpt-neoxt-20b-v2


	---
	license: bigscience-openrail-m
	datasets:
	- iamplus/Instruction_Tuning
	---
	GPT-NeoXT-20B model, instruction-tuned on the Stanford Alpaca-2 instruction-tuning dataset (52k examples, outputs from ChatGPT) using *Colossal AI*.

	Base Model: togethercomputer/GPT-NeoXT-Chat-Base-20B (not fine-tuned on feedback data)

	Training Details :
	* Epochs: 5
	* Batch Size : 16 per device x 1 gradient accumulation step x 8 GPUs = 128 (effective)
	* Max Length : 1024
	* Weight Decay : 0
	* Learning Rate : 2e-5
	* Learning Rate Scheduler Type : Cosine
	* Number of warmup steps : 30
	* Machine : 8xA100 80GB
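
The arithmetic behind these settings can be sketched in a few lines of Python. This is only an illustration, not the training script: it assumes exactly 52,000 examples (the card says "52k"), a linear warmup, and a cosine decay to zero over the remaining steps (the card only says "Cosine" with 30 warmup steps). The function names are hypothetical.

```python
import math

# Values taken from the training details above.
PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 1
NUM_GPUS = 8
EPOCHS = 5
PEAK_LR = 2e-5
WARMUP_STEPS = 30

# Assumption: exactly 52,000 examples; the card only says "52k".
NUM_EXAMPLES = 52_000


def effective_batch_size(per_device, grad_accum, num_gpus):
    """Number of examples consumed per optimizer step."""
    return per_device * grad_accum * num_gpus


def total_steps(num_examples, batch_size, epochs):
    """Optimizer steps over the whole run, rounding each epoch up."""
    return math.ceil(num_examples / batch_size) * epochs


def lr_at(step, peak_lr, warmup_steps, num_steps):
    """Learning rate at a step.

    Assumption: linear warmup from 0 to peak_lr, then cosine decay to 0.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, num_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    batch = effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_GPUS)
    steps = total_steps(NUM_EXAMPLES, batch, EPOCHS)
    print("effective batch size:", batch)
    print("total optimizer steps (under the 52,000-example assumption):", steps)
    for s in (0, WARMUP_STEPS, steps // 2, steps):
        print(s, lr_at(s, PEAK_LR, WARMUP_STEPS, steps))
```

Under these assumptions the 30 warmup steps cover only a small fraction of the run, so the learning rate reaches its peak of 2e-5 almost immediately and spends nearly all of training on the cosine decay.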

	Dataset Details :

	Dataset : iamplus/Instruction_Tuning

	Files :
	* stanford_alpaca_it_v2.csv