metadata

license: bigscience-bloom-rail-1.0
tags:
  - stable-diffusion
  - diffusion
model-index:
  - name: bloom-560m-RLHF-SD2-prompter
    results: []
datasets:
  - Gustavosta/Stable-Diffusion-Prompts
widget:
  - text: '<s>Prompt: '
inference:
  parameters:
    eos_token_id: 2
    max_length: 128
    do_sample: true

This model uses RLHF (Reinforcement Learning from Human Feedback) to further fine-tune mrm8488/bloom-560m-finetuned-sd-prompts for SD2.0 (Stable Diffusion 2.0).

batch_size = 16
learning_rate = 0.001 # this is why I didn't have to spend _forever_ on it

To generate a prompt extension, start the input with "<s>Prompt: " followed by whatever your normal prompt is.
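Here is a minimal sketch of that prompt format using the `transformers` library. It assumes `transformers` and PyTorch are installed and that the model loads under the repo id from this card. The helper names (`build_input`, `extend_prompt`) are made up for illustration, and the sampling settings are copied from the inference parameters in the metadata.

```python
MODEL_ID = "crumb/bloom-560m-RLHF-SD2-prompter"
PREFIX = "<s>Prompt: "

# Sampling settings copied from the card's inference parameters.
GENERATION_KWARGS = {"eos_token_id": 2, "max_length": 128, "do_sample": True}


def build_input(prompt: str) -> str:
    """Prepend the prefix the model expects to a normal prompt."""
    return PREFIX + prompt.strip()


def extend_prompt(prompt: str) -> str:
    """Sample one extended prompt (downloads the model on first call)."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_input(prompt), return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `extend_prompt("a castle on a hill")` should return the input followed by sampled prompt text. Because `do_sample` is on, the result differs from run to run.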

I did the human feedback myself: I sat down and just ranked images for so long. The model has gone through a couple of iterations. Only the biases and layernorm weights were trained. The commit messages are a MESS.
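Training only the biases and layernorm weights could be set up roughly like the sketch below. This is not the training code used for this model, and the RLHF loop itself is not shown. It assumes a PyTorch-style model exposing `named_parameters()`, and it assumes layernorm parameters can be recognized by `layernorm` or `ln_f` in their names, as in the Hugging Face BLOOM implementation. Both function names are hypothetical.

```python
def is_bias_or_layernorm(name: str) -> bool:
    """Return True for parameters that stay trainable.

    Assumption: layernorm parameters have "layernorm" or "ln_f" in their
    name, and bias parameters end in ".bias".
    """
    lowered = name.lower()
    return lowered.endswith(".bias") or "layernorm" in lowered or "ln_f" in lowered


def freeze_all_but_biases_and_layernorm(model) -> list:
    """Freeze every parameter except biases and layernorm weights.

    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = is_bias_or_layernorm(name)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

After calling `freeze_all_but_biases_and_layernorm(model)`, an optimizer would be given only the parameters with `requires_grad` set. With so few trainable parameters, a learning rate as high as the 0.001 listed above is easier to get away with.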