metadata
language:
  - en
license:
  - apache-2.0
  - cc-by-nc-4.0
tags:
  - generated_from_trainer
  - instruct
  - instructions
  - code
  - instructiongen
datasets: pszemraj/fleece2instructions-codealpaca
metrics:
  - rouge
widget:
  - text: |
      git lfs install
      huggingface-cli lfs-enable-largefiles .
      git lfs track "*.bin"
      git add .
      git commit -a -m "add fp32 chkpt"
      git push
    example_title: bash
  - text: |
      export interface DocumentParams {
        pageContent: string;

        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        metadata: Record<string, any>;
      }

      /**
       * Interface for interacting with a document.
       */
      export class Document implements DocumentParams {
        pageContent: string;

        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        metadata: Record<string, any>;

        constructor(fields?: Partial<DocumentParams>) {
          this.pageContent = fields?.pageContent ?? this.pageContent;
          this.metadata = fields?.metadata ?? {};
        }
      }
    example_title: js
  - text: |
      def merge(left, right):
          if len(left) == 0:
              return right

          if len(right) == 0:
              return left

          result = []
          index_left = index_right = 0

          while len(result) < len(left) + len(right):
              if left[index_left] <= right[index_right]:
                  result.append(left[index_left])
                  index_left += 1
              else:
                  result.append(right[index_right])
                  index_right += 1

              if index_right == len(right):
                  result += left[index_left:]
                  break

              if index_left == len(left):
                  result += right[index_right:]
                  break

          return result
    example_title: merge
  - text: >
      import pandas as pd

      import plotly.graph_objects as go


      df =
      pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_apple_stock.csv')


      fig = go.Figure(go.Scatter(x = df['AAPL_x'], y = df['AAPL_y'],
                        name='Share Prices (in USD)'))

      fig.update_layout(title='Apple Share Prices over time (2014)',
                         plot_bgcolor='rgb(230, 230,230)',
                         showlegend=True)

      fig.show()
    example_title: plot
  - text: |
      from spellchecker import SpellChecker

      spell = SpellChecker()

      def check_word_spelling(word: str):
          misspelled = spell.unknown([word])
          return len(misspelled) == 0

      def eval_and_replace(text: str, match_token: str = "- "):
          if match_token not in text:
              return text
          else:
              while True:
                  full_before_text = text.split(match_token, maxsplit=1)[0]
                  before_text = [
                      char for char in full_before_text.split()[-1] if char.isalpha()
                  ]
                  before_text = "".join(before_text)
                  full_after_text = text.split(match_token, maxsplit=1)[-1]
                  after_text = [char for char in full_after_text.split()[0] if char.isalpha()]
                  after_text = "".join(after_text)
                  full_text = before_text + after_text
                  if check_word_spelling(full_text):
                      text = full_before_text + full_after_text
                  else:
                      text = full_before_text + " " + full_after_text
                  if match_token not in text:
                      break
              return text

      text = "I- am- a go- od- boy"
      eval_and_replace(text)
    example_title: spell check
  - text: >
      import torch

      from transformers import AutoTokenizer, AutoModelForSequenceClassification


      checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

      tokenizer = AutoTokenizer.from_pretrained(checkpoint)

      model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

      sequences = ["I've been waiting for a HuggingFace course my whole life.",
      "So have I!"]


      tokens = tokenizer(sequences, padding=True, truncation=True,
      return_tensors="pt")

      output = model(**tokens)
    example_title: model inference
inference:
  parameters:
    max_length: 96
    num_beams: 4
base_model: facebook/bart-base

bart-base-code-instructiongen

Use this text2text model to find out which LLM instructions could have produced an arbitrary piece of code!
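As a rough sketch of how that might look with the `transformers` library: the repo id `pszemraj/bart-base-code-instructiongen` is an assumption pieced together from the model name and the dataset namespace on this card, and `generate_instruction` is a hypothetical helper, not something shipped with the model. The generation settings mirror the `inference.parameters` block in the metadata.

```python
# Assumed repo id (not confirmed by this card); change it if the model lives elsewhere.
MODEL_ID = "pszemraj/bart-base-code-instructiongen"

# Generation settings copied from the card's `inference.parameters` metadata.
GENERATION_KWARGS = {"max_length": 96, "num_beams": 4}


def generate_instruction(code: str, model_id: str = MODEL_ID) -> str:
    """Hypothetical helper: return one candidate instruction for a code snippet."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long inputs to the tokenizer's maximum length.
    inputs = tokenizer(code, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_instruction("def add(a, b):\n    return a + b")` would download the weights on first use and return a single generated instruction string.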

This model is a fine-tuned version of facebook/bart-base on the pszemraj/fleece2instructions-codealpaca dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0136
  • Rouge1: 59.9513
  • Rouge2: 33.9118
  • Rougel: 55.7815
  • Rougelsum: 56.9064
  • Gen Len: 29.7146
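For intuition about what the Rouge1 figure measures, here is a deliberately simplified sketch of unigram-overlap F1. It assumes whitespace tokenization and lowercasing only; real ROUGE implementations typically apply their own tokenization and optional stemming, so this will not reproduce the numbers above exactly. The scores listed above appear to be on a 0 to 100 scale, whereas this helper returns a value between 0 and 1. The function name and example strings are illustrative.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: F1 over clipped unigram overlap (illustrative only)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated instruction vs. hypothetical reference instruction.
score = rouge1_f1(
    "write a function to merge lists",
    "write a function to merge two sorted lists",
)
```

Here all six candidate tokens appear in the eight-token reference, so precision is 1.0, recall is 0.75, and the F1 is 6/7.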

Intended uses & limitations

🚨 note: because the authors released the original dataset under cc-by-nc, that license carries over to this model, which therefore cannot be used for commercial activity.

This is only a base-size model: it does a decent job for its size but is not perfect. For better-quality instructions, check out bart-large or fine-tune your own larger model on the dataset :)

Intended use: Research on domain adaptation and/or other improvements to LLMs by extending instruction:text data pairs.

Training and evaluation data

Refer to the linked dataset card for pszemraj/fleece2instructions-codealpaca or the original dataset repo.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 3.0
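To make the interaction between these settings concrete, the sketch below works out the effective batch size and a learning-rate curve. It assumes the common linear-warmup-then-cosine-decay form, takes the total of 843 optimizer steps from the last row of the training results, and rounds the warmup length up. The `num_devices` parameter is hypothetical: the card does not list a device count, and the default of 1 is simply what makes 4 × 16 match the listed total of 64.

```python
import math

BASE_LR = 8e-5          # learning_rate
PER_DEVICE_BATCH = 4    # train_batch_size
GRAD_ACCUM_STEPS = 16   # gradient_accumulation_steps
WARMUP_RATIO = 0.02     # lr_scheduler_warmup_ratio
TOTAL_STEPS = 843       # final step reported in the training results


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices is an assumption)."""
    return per_device * grad_accum * num_devices


def lr_at(step: int, total_steps: int = TOTAL_STEPS,
          warmup_ratio: float = WARMUP_RATIO, base_lr: float = BASE_LR) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS)
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)
```

Under these assumptions the warmup lasts 17 steps, the rate peaks at 8e-05 right after warmup, and it decays to zero by step 843.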

Training results

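| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.1165        | 1.0   | 281  | 1.1090          | 57.9239 | 31.9259 | 53.8737 | 54.9811   | 28.2924 |
| 1.0763        | 2.0   | 563  | 1.0267          | 59.9605 | 34.0298 | 55.7523 | 56.8021   | 29.6966 |
| 0.9595        | 2.99  | 843  | 1.0136          | 59.9513 | 33.9118 | 55.7815 | 56.9064   | 29.7146 |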