metadata
license: bigscience-bloom-rail-1.0
tags:
- stable-diffusion
- diffusion
model-index:
- name: bloom-560m-RLHF-SD2-prompter
  results: []
datasets:
- Gustavosta/Stable-Diffusion-Prompts
widget:
- text: '<s>Prompt: '
inference:
  parameters:
    eos_token_id: 2
    max_length: 128
    do_sample: true
This model uses RLHF (Reinforcement Learning from Human Feedback) to further fine-tune mrm8488/bloom-560m-finetuned-sd-prompts for Stable Diffusion 2.0 (SD2.0).
batch_size = 16
learning_rate = 0.001 # this is why I didn't have to spend _forever_ on it
To generate a prompt extension, start the input with "<s>Prompt: " followed by whatever your normal prompt is.
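A minimal sketch of that usage with the `transformers` library. The generation settings are copied from the inference parameters in the metadata; the `model_id` argument is a placeholder you have to fill in with the full Hub repo ID (the owner namespace is not stated here), and the helper names `build_input` and `extend_prompt` are made up for illustration.

```python
PREFIX = "<s>Prompt: "

# Taken from the card's inference parameters.
GENERATION_KWARGS = {"eos_token_id": 2, "max_length": 128, "do_sample": True}


def build_input(prompt: str) -> str:
    """Prepend the prefix the model expects before the user's prompt."""
    return PREFIX + prompt


def extend_prompt(prompt: str, model_id: str) -> str:
    """Sample one extended prompt.

    `model_id` is an assumption: pass the full Hub repo ID of this model.
    Requires `transformers` and `torch` to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_input(prompt), return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Drop the leading "Prompt: " label if it survived decoding.
    label = "Prompt: "
    return text[len(label):] if text.startswith(label) else text
```

Because `do_sample` is true, each call to `extend_prompt("a castle on a hill", model_id)` should return a different extension.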
I did this myself: I sat down and just ranked images for so long. It's gone through a couple of iterations. Only the biases and layernorm weights were trained. The commit messages are a MESS.
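Training only the biases and layernorm weights can be sketched roughly as below. This is not the actual training code: the name matching assumes the parameter naming used by the `transformers` BLOOM implementation (for example `input_layernorm`, `ln_f`, and names ending in `.bias`), and `is_trainable` and `freeze_all_but_bias_and_layernorm` are hypothetical helpers.

```python
def is_trainable(param_name: str) -> bool:
    """Return True for bias and layernorm parameters, False otherwise.

    Assumes BLOOM-style names such as
    'transformer.h.0.input_layernorm.weight' or 'transformer.ln_f.weight'.
    """
    lowered = param_name.lower()
    return (
        lowered.endswith(".bias")
        or "layernorm" in lowered
        or ".ln_f." in lowered
    )


def freeze_all_but_bias_and_layernorm(model) -> list:
    """Set requires_grad on every parameter and return the trainable names.

    `model` is anything exposing `named_parameters()`, e.g. a torch.nn.Module.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = is_trainable(name)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

After calling `freeze_all_but_bias_and_layernorm(model)`, an optimizer would be built only over parameters with `requires_grad` set, which keeps the number of updated weights small.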