---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
base_model: LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III
datasets:
- gretelai/synthetic_text_to_sql
- HuggingFaceTB/cosmopedia
- teknium/OpenHermes-2.5
- Open-Orca/SlimOrca
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin-coder
- databricks/databricks-dolly-15k
- yahma/alpaca-cleaned
- uonlp/CulturaX
- mwitiderrick/SwahiliPlatypus
- swahili
- Rogendo/English-Swahili-Sentence-Pairs
- ise-uiuc/Magicoder-Evol-Instruct-110K
- meta-math/MetaMathQA
- abacusai/ARC_DPO_FewShot
- abacusai/MetaMath_DPO_FewShot
- abacusai/HellaSwag_DPO_FewShot
- HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset
metrics:
- accuracy
- bertscore
- bleu
- brier_score
- cer
- character
- charcut_mt
- chrf
- code_eval
y-Gene:
- LeroyDyer/Mixtral_AI_DeepMind
- LeroyDyer/Mixtral_AI_CyberUltron_DPO
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_AI_DeepMedicalMind
- LeroyDyer/Mixtral_AI_Samantha
x-Gene:
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_BioMedical
- LeroyDyer/Mixtral_AI_Medic
- LeroyDyer/Mixtral_Cyber_BioMedic
- LeroyDyer/Mixtral_AI_DeepMedicalMind
Variant:
- LeroyDyer/MetaMath_LLM
- LeroyDyer/TruthfulQA_LLM
- LeroyDyer/HellaSwag_LLM
- LeroyDyer/Mixtral_AI_DeepMedicalMind
---

# ::: DEEP MIND PROJECT :::

Here we begin the models for Deep Mind: this model was created from the first trained models: DeepMind!

These models contain:

## Thoughts and processes
## SelfRAG
## Agent generation
## Chain of thought
## Deep thinking and memory recall

Training prompt version - working GREAT! The model checks itself when discussing complex questions: given a question it does not know the answer to, it tries to discuss the problem with itself to find a result (sometimes unsuccessfully).

It generates mini agents to perform small tasks such as entity recognition, step-by-step definitions, writing pseudo-codebases, generating use cases, performing calculations, and analyzing content.

It thinks... sometimes sarcasm, sometimes reflection, sometimes random thoughts. It has personalities: by holding various long in-persona discussions with ChatGPT, it was possible to generate role-play conversation data, which was added to its conversational chat Q/A, along with a dataset from the Samantha TV show... and HER! So it is a personal assistant, and a very friendly one.

It has mainly been trained on coding datasets and medical information: from experiments, to research, to patient/doctor dialogue, to diagnosis,
to problem solving. It has also been trained to act as a counsellor and to assist with psychological problems through empathetic discussion. This model has its own thoughts regardless of the prompt given (if you allow the thought prompt, it will display those thoughts; a hedged prompt sketch is included at the end of this card). This is a highly focused model.

### Methodology:

Many functions, such as defining words and NLP tasks, were also added via datasets with very complex data structures and prompts. These prompts are removed after training, and standard Alpaca training is applied on top (this lets the previously highly overfit tasks become embedded underneath the newer layer). It is important to change the LoRA configuration so it covers the embedding layers of the model, as well as to fine-tune on top of the previous training (a configuration sketch is given at the end of this card). Usually I use a factor of 8 when sizing my LoRAs, but for this one I chose a factor of 9 (9-18/18/36), which trained so smoothly that I was able to train many different datasets in a single sitting, down to a loss below 0.9, across all variations of the Alpaca prompt. After testing there was absolutely zero loss of previous knowledge, and some responses were enhanced while comparative responses were provided for others.

I personally use a top-k of 1000. This gives the model many choices (it is the candidate pool for the next token). I set top-p to 0.68 (68%), so the model samples from that share of the probability mass, which lets me keep the temperature at 1: the selected slice of next-token probabilities is renormalized, giving the lower-probability candidates a scaled chance of being selected (these values are shown as `generate()` arguments in the sampling sketch at the end of this card). A degree of randomness in the response is important, or you will ask the same question and get the same answer! We need varied answers for some queries and focused answers for others. How do we do this? Duplicates! Raise the probability of some information by repetition, as this is how humans learn truth: truth is that which has been repeated so many times it cannot be disputed. Hence some information is absolute, while other information is transient and constantly updating.

As a predictive model it needs to be able to calculate, predict, and classify, as well as recall exact information. Hence, when utilizing a RAG, the conversation history is the data to be fine-tuned into the model as frequent data, and multiple similar queries are produced to query the RAG system for Q/A pairs, which are also then updated onto the model (see the RAG loop sketch at the end of this card). As we are in this development period, we are currently focused on the BRAIN.

# Uploaded model

- **Developed by:** LeroyDyer
- **License:** apache-2.0
- **Finetuned from model:** LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
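### Prompt sketch

The exact training and "thought" prompts are not reproduced in this card, so the template below is only a minimal sketch of how the visible "thoughts" behaviour might be elicited on top of the standard Alpaca layout used for the final training pass. The `### Thoughts:` header is an assumption, not a canonical format.

```python
# Sketch only (assumption): an Alpaca-style instruction prompt with an extra
# "Thoughts" section so the model can show its internal discussion before the
# final response.
ALPACA_THOUGHTS_TEMPLATE = """Below is an instruction that describes a task.
Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Thoughts:
"""

prompt = ALPACA_THOUGHTS_TEMPLATE.format(
    instruction="Discuss, step by step, how you would investigate intermittent fevers."
)
print(prompt)
```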
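### LoRA configuration sketch

A minimal sketch, using the `peft` library, of what "changing the LoRA configuration for the embedding layers" could look like. The rank/alpha values are only one possible reading of the "factor of 9 (9-18/18/36)" note above, and the target module names are the usual ones for Mistral-style models; none of these values are confirmed training settings.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=18,            # assumption: one reading of the "factor of 9" sizing note
    lora_alpha=36,   # assumption: alpha kept at twice the rank
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    # Attention and MLP projections typically adapted on Mistral-style models.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    # Also train the embedding and output layers, as recommended above, so new
    # prompt styles can be absorbed rather than only adapted.
    modules_to_save=["embed_tokens", "lm_head"],
)
```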
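### Sampling settings sketch

The top-k, top-p, and temperature values described in the methodology map directly onto Hugging Face `generate()` arguments. The repo id below is the base model named in this card; substitute this model's own repo id when running against the uploaded weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model named in this card; replace with this model's own repo id.
model_id = "LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "### Instruction:\nWhat is SelfRAG?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    do_sample=True,    # keep some randomness so repeated questions vary
    top_k=1000,        # wide candidate pool ("many choices")
    top_p=0.68,        # sample from the top 68% of the probability mass
    temperature=1.0,   # renormalize that slice without extra sharpening
    max_new_tokens=512,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```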
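### RAG loop sketch

A rough sketch of the loop described in the methodology: expand a user query into several similar queries, answer each one against the RAG store, and keep the resulting Q/A pairs as records for the next fine-tuning pass. All three helpers (`paraphrase_query`, `retrieve`, `generate_answer`) are hypothetical stand-ins, not released tooling.

```python
import json

def paraphrase_query(query: str, n: int = 3) -> list[str]:
    """Hypothetical stand-in: in practice, ask the model for n rewordings."""
    return [f"{query} (rephrased {i + 1})" for i in range(n)]

def retrieve(query: str) -> str:
    """Hypothetical stand-in: in practice, search the RAG vector store."""
    return f"<retrieved passages for: {query}>"

def generate_answer(query: str, context: str) -> str:
    """Hypothetical stand-in: in practice, call the model with query + context."""
    return "<model answer grounded in the retrieved context>"

def harvest_qa_pairs(user_query: str) -> list[dict]:
    """Turn one conversation turn into several Q/A records for later fine-tuning."""
    records = []
    for q in [user_query, *paraphrase_query(user_query)]:
        context = retrieve(q)
        records.append({"instruction": q, "output": generate_answer(q, context)})
    return records

# Appending the records to a JSONL file builds up the "frequent data" corpus
# that the card describes folding back into the model on the next pass.
with open("rag_finetune_data.jsonl", "a", encoding="utf-8") as f:
    for rec in harvest_qa_pairs("What are the symptoms of iron deficiency?"):
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```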