It's the 4th version and you still have not eliminated text from the dataset?
Lots of work for poor orgs, it's 4th version and we can still mitigate stuff like this with negative prompt or regenerate seed.
Hey there! I totally get your frustration, but let me break this down for you!
Imagine trying to manually check 8.4 MILLION images - that's like trying to count every grain of sand on a beach! Even with automated tools, detecting and removing text is super tricky because text appears in so many different ways in images.
But here's the cool part - that's exactly why negative prompts exist! Instead of trying to remove all text from the dataset (which would be a massive undertaking), you have full control to remove text whenever you want by simply adding a tag like "text" to your negative prompt. It's actually a more flexible solution that puts the power in your hands!
Think of it as a feature rather than a limitation. You get to decide when you want text-free images! 🎨✨
Also make sure to follow our guideline on how to use the model~
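For anyone who wants to try the two mitigations mentioned here (a "text" negative and re-rolling the seed), below is a minimal sketch, not an official recipe. The helper names, the example tags, and the idea of passing the model id in as a parameter are all assumptions for illustration; the only library calls used are the standard `diffusers` SDXL pipeline ones, and that part is optional because it needs a GPU and a downloaded checkpoint.

```python
from typing import Dict, List


def merge_negative_prompt(base_negative: str, extra_tags: List[str]) -> str:
    """Append tags to a comma-separated negative prompt, skipping duplicates."""
    tags = [t.strip() for t in base_negative.split(",") if t.strip()]
    seen = {t.lower() for t in tags}
    for tag in extra_tags:
        cleaned = tag.strip()
        if cleaned and cleaned.lower() not in seen:
            tags.append(cleaned)
            seen.add(cleaned.lower())
    return ", ".join(tags)


def generation_requests(
    prompt: str, base_negative: str, seeds: List[int]
) -> List[Dict[str, object]]:
    """Build one request per seed, each with "text" in the negative prompt."""
    negative = merge_negative_prompt(base_negative, ["text"])
    return [
        {"prompt": prompt, "negative_prompt": negative, "seed": seed}
        for seed in seeds
    ]


def run_with_diffusers(model_id: str, requests: List[Dict[str, object]]):
    """Optional step: requires torch, diffusers, a CUDA GPU, and a model id
    that you supply yourself (hypothetical parameter, not taken from this thread)."""
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    images = []
    for req in requests:
        # A fresh generator per request makes each seed reproducible.
        generator = torch.Generator(device="cuda").manual_seed(req["seed"])
        result = pipe(
            prompt=req["prompt"],
            negative_prompt=req["negative_prompt"],
            generator=generator,
        )
        images.append(result.images[0])
    return images
```

Calling `generation_requests("1girl, solo", "lowres, bad anatomy", [1, 2])` gives two requests that differ only in seed, so if one result still shows stray text you can compare it against the other without changing anything else.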
Yeah, look, regardless of what ifs, ands, or buts come out of either side, I have trouble checking 200 images, let alone 8.4 million lol.
You guys do amazing work.
And of course, users in general do amazing work checking in with the trainers.
Lots of work for poor orgs, it's 4th version and we can still mitigate stuff like this with negative prompt or regenerate seed.
It's already happening with the negatives in place; naturally, the default NAI negatives used everywhere include "text". But NoobAI and Pony don't need those negatives to stay free of text, so it's not a matter of 8.4 million images.
Okay, is it also a feature when the model forgets characters it used to know in 3.1?
This was supposed to be Garnet from FF9, now it's a furry. (4-opt on top, 3.1 below).
3.1 and 4.0 are very different models. 3.1 was fine-tuned from 2.0, while 4.0 was fine-tuned directly from SDXL 1.0, which makes a new branch. So, some of the prompt styling and the knowledge might be very different.
Also, 'NAI negative' is not working in this model due to 'masterpiece, best quality' being less important than 'high score, great score.'
And yes, it's not only a matter of comparison with NoobAI or Pony. NoobAI is a continuation of Illustrious XL v0.1, which used a very high learning rate and 20 epochs, then continued for another 20+ epochs on 32x H100, while our model had just 10 epochs with a very low learning rate, so the two are not comparable. Not to mention, Pony uses 2.6M images out of 10M, all of which are high quality. We're doing very minimal dataset cleaning to make it a better pre-trained model rather than an aesthetic model, so people have the freedom to fine-tune it or simply steer it with tags.
When using this model, you need to forget how you prompt on Pony or NoobAI, because the usage guide is very different. Please refer to the guides.
Text is completely fine when training at 1024x1024 and at danbooru scales. It made sense to remove it in the SD1.5 era because of the mess it made when downscaled and passed through the VAE, but that's not a problem anymore. Regular non-schizo negatives tend to prevent it. Noob didn't remove text from their set either, idk about Pony.
Pony didn't..
And in XL-formatted base models it's fairly easy to get text OUT because, unlike XL base, most finetunes aren't as worried about text compatibility.
I haven't tested trying to train sassy VHS tape covers on AnimagineXL yet....
Anyone wanna dare me?