
Posted here first: https://www.reddit.com/r/Oobabooga/comments/192qb2c/mermaidmistral_a_work_in_progress_model_for_flow/

The model is just barely under 16GB in full precision. You can give it a try using a Google Colab notebook I threw together, but be warned: it will OOM on long contexts because the Free Tier is limited to 15GB of VRAM.
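As a rough check on that figure, the weight footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch that assumes the 7.24B parameters and FP16 tensor type listed for this model. It counts weights only, so the KV cache and activations (which grow with context length) come on top, which is why long contexts run out of memory on a 15GB card.

```python
def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumed figures: 7.24B parameters stored as FP16 (2 bytes each).
fp16_gb = weights_gb(7.24e9, 2)

# Assumed budget: the 15GB of VRAM mentioned for the Colab Free Tier.
headroom_gb = 15.0 - fp16_gb

print(f"FP16 weights: ~{fp16_gb:.2f} GB")
print(f"Left for KV cache and activations: ~{headroom_gb:.2f} GB")
```

With only about half a gigabyte of headroom under these assumptions, short prompts are the safe choice on the Free Tier.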

Introduction:

Introducing MermaidMistral, a powerful yet compact 7-billion-parameter language model adept at Python code understanding and crafting engaging story flow maps. Trained on a meticulously hand-curated dataset of 478 diverse Python examples and hand-crafted Mermaid flow maps built with https://mermaid.live, this model goes beyond its size to deliver exceptional performance in code understanding and story visualization.

Key Features:

MermaidMistral is not a "Chatty Kathy": it should respond only with a Mermaid code block containing a flow diagram in Mermaid.js syntax, and nothing more.

1. Code Understanding:

  • Grasps Python intricacies with finesse.
  • Generates clear and accurate Mermaid Diagram Flow Charts.
  • Ideal for developers seeking visual representations of their code's logic.

2. Storytelling Capabilities:

  • Converts narrative inputs into captivating Mermaid Diagrams.
  • Maps character interactions, plot developments, and narrative arcs effortlessly.

3. Unmatched Performance:

  • Surpasses larger models, like GPT-4, in generating well-organized and detailed Mermaid Diagrams for story flows.

4. Training Insights:

  • Trained on 478 Python examples for just under three epochs on a single RTX 3090 with a batch size of 1 (true stochastic gradient descent).
  • Exhibited emergent properties in story-to-flow map translations.
  • Adaptable and efficient in resource utilization.
  • Due to hardware constraints, this fine-tune has a token limit of 2048.

Example generations (images not reproduced here):

  1. Mermaid Mistral Generation 1
  2. Mermaid Mistral Generation 2
  3. ChatGPT Generation

Collaboration:

MermaidMistral is open to collaboration to further strengthen its capabilities. The dataset, formatted in Alpaca, provides a unique foundation for understanding Python intricacies. If you're interested in contributing or collaborating to enhance the model's performance, feel free to reach out to troydoesai@gmail.com. Your expertise could play a pivotal role in refining MermaidMistral.

Example Use Cases:

1. Code Documentation:

  • Developers can use MermaidMistral to automatically generate visual flow charts from their Python code, aiding in documentation and code understanding.

2. Storyboarding:

  • Storytellers and writers can input their narrative and receive visually appealing Mermaid Diagrams, offering a structured overview of character interactions and plot progression.

3. Project Planning:

  • Project managers can leverage MermaidMistral to create visual project flow maps, facilitating effective communication and planning among team members.

4. Learning Python:

  • Students and beginners can use MermaidMistral to visually understand Python code structures, enhancing their learning experience.

5. Game Design:

  • Game developers can utilize MermaidMistral for visualizing game storylines, ensuring a coherent narrative structure and character development.

Proof of Concept:

MermaidMistral proves that innovation thrives in compact packages, delivering exceptional performance across diverse applications. Its adaptability and efficiency showcase the potential for groundbreaking results even in resource-constrained environments.

These Mermaid code blocks can be converted directly into images using the mermaid-cli tool found here: https://github.com/mermaid-js/mermaid-cli
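A minimal sketch of that conversion step, assuming mermaid-cli is installed and its `mmdc` command is on your PATH. The helper names `extract_mermaid` and `render`, the sample response, and the file names are all hypothetical and only there for illustration. The fence marker is built programmatically so the example nests cleanly inside this document.

```python
import re
import subprocess
from pathlib import Path

FENCE = "`" * 3  # three backticks, the code-fence marker


def extract_mermaid(response: str) -> str:
    """Return the body of the first mermaid code block in a model response."""
    pattern = re.compile(FENCE + r"mermaid\s*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(response)
    if match is None:
        raise ValueError("no mermaid code block found")
    return match.group(1).strip()


def render(diagram: str, out_path: str = "diagram.svg") -> None:
    """Write the diagram to a .mmd file and render it with mermaid-cli."""
    src = Path(out_path).with_suffix(".mmd")
    src.write_text(diagram, encoding="utf-8")
    # mmdc takes the input file with -i and the output file with -o.
    subprocess.run(["mmdc", "-i", str(src), "-o", out_path], check=True)


# Hypothetical model response, used only to exercise the extraction.
sample = FENCE + "mermaid\ngraph TD\n    A[Start] --> B[End]\n" + FENCE
diagram = extract_mermaid(sample)
print(diagram)
# render(diagram)  # uncomment once mermaid-cli (mmdc) is installed
```

Changing the extension of `out_path` (for example to `.png`) should let mermaid-cli pick the output format, provided your installation supports it.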

I plan to release my working proof-of-concept VS Code extension, which currently displays the live flow map every time the user stops typing for more than 10 seconds.
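The "stopped typing for more than 10 seconds" behavior is a debounce. The extension's code is not published here, so the following is only an illustrative sketch of the idea, written in Python rather than the extension's own language. The `IdleTrigger` class and its method names are hypothetical. Timestamps are passed in explicitly so the logic is easy to follow and test.

```python
from typing import Optional


class IdleTrigger:
    """Fires once after `idle_seconds` have passed without a keystroke."""

    def __init__(self, idle_seconds: float = 10.0) -> None:
        self.idle_seconds = idle_seconds
        self._last_keystroke: Optional[float] = None
        self._fired = True  # nothing to render until the user types

    def on_keystroke(self, now: float) -> None:
        """Record a keystroke and re-arm the trigger."""
        self._last_keystroke = now
        self._fired = False

    def should_refresh(self, now: float) -> bool:
        """Return True exactly once per idle period longer than the threshold."""
        if self._fired or self._last_keystroke is None:
            return False
        if now - self._last_keystroke > self.idle_seconds:
            self._fired = True
            return True
        return False


trigger = IdleTrigger()
trigger.on_keystroke(now=0.0)
trigger.on_keystroke(now=4.0)
print(trigger.should_refresh(now=12.0))  # 8 s idle: not yet
print(trigger.should_refresh(now=14.5))  # 10.5 s idle: refresh the flow map
print(trigger.should_refresh(now=20.0))  # already refreshed for this pause
```

Firing only once per pause avoids re-running the model while the code is unchanged.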

Stay tuned.

Example Story -> Flow

https://chat.openai.com/share/e3163857-981b-4968-b2db-98ad869c9259

Insights on how to get best results

For best results, run the model in full precision and use one of the three instruction types below:

  • "instruction": "Create the mermaid diagram for the following code:",
  • "instruction": "Create the mermaid diagram for the following story:",
  • "instruction": "Create the mermaid diagram for the following:",
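
Since the dataset is formatted in Alpaca, a prompt can be assembled along these lines. This is a sketch, not the confirmed training format: it assumes the widely used Stanford Alpaca template with an input field, and the exact wording used during fine-tuning may differ. The `build_prompt` helper and the short keys `code`, `story`, and `general` are hypothetical names chosen for the example.

```python
# Assumption: the standard Alpaca template with an "Input" section.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# The three instruction strings listed above.
INSTRUCTIONS = {
    "code": "Create the mermaid diagram for the following code:",
    "story": "Create the mermaid diagram for the following story:",
    "general": "Create the mermaid diagram for the following:",
}


def build_prompt(kind: str, content: str) -> str:
    """Fill the template with one of the three instructions and the user's text."""
    return ALPACA_TEMPLATE.format(instruction=INSTRUCTIONS[kind], input=content)


prompt = build_prompt("code", "def add(a, b):\n    return a + b")
print(prompt)
```

Keep the combined prompt and expected diagram within the 2048-token limit mentioned under Training Insights.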

Model size: 7.24B params (Safetensors, FP16)