Request: Would the amazon team be willing to train a model on my high quality dataset?

#5
by rombodawg - opened

I have created and refined an open-source dataset named "LosslessmegacodeV3" (linked at the end) which I believe could produce one of the best open-source AI models if trained on the right base model. However, since I am severely lacking in funding (a.k.a. I'm broke), I haven't been able to do the training myself. I'm curious whether your team would be willing to take on the challenge of training one AI model on my dataset for coding and non-coding tasks (the dataset is made for both) to create possibly one of the best AI models available. If you are up for the challenge, here is a list of the top models I would recommend training with my dataset, in order of highest priority (note that I would only ask you to train one model; I am merely giving multiple options):

1: WizardLM/WizardCoder-Python-13B-V1.0

2: amazon/MistralLite

3: WizardLM/WizardCoder-Python-34B-V1.0 (#3 and #4 are equal in priority)

4: Phind/Phind-CodeLlama-34B-v2 (#3 and #4 are equal in priority)

5: jondurbin/airoboros-l2-c70b-3.1.2


If you agree, I have some name suggestions for the model that would be released, if you would allow me to propose them. I've listed them below; "Lossless" and "V3" refer to the dataset version used to train the models.

1: LosslessWizardCoder-Python-13B-V3

2: LosslessMistralLitecoderV3

3: LosslessWizardCoder-Python-34B-V3

4: LosslessPhind-LlamaCoder-34B-V3

5: LosslessAiroborosCoder-l2-c70b-V3


Dataset link:

Amazon Web Services org

Hi @rombodawg, thanks for the suggestion!

Unfortunately, we have our own internal process for deciding what to work on in this area.

Based on your description, CodeWhisperer might be something you are interested in. Please give it a try :)

yinsong1986 changed discussion status to closed
