Second and third tries on a text description to see if it draws it well
Text: "a lovely and lively elegant fantasy woman weighing 110 kilograms and standing 5 feet 9 inches tall, wearing bluejean overalls and barefoot, stands on the sidewalk in front of her house, and bends over to pick up the newspaper. Her long, red, shiny hair falls to the ground, and two boys watch her from across the street. The sky is mostly clear and blue, with a plane flying overhead.", perfect-face, painted with flair in an abstract fashion, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
Text: "a lovely and lively elegant fantasy man, wearing purple sweatpants, a green sweater, and enormous clown-shoes, sits on the hood of a car that is parked in front of the United States Capital building. He holds an enormous bug-net in the air and catches dollar-bills with it.", perfect-face, photorealistic, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
(Very slight change... removed the word "fantasy" from the man's description and replaced "car" with "dark-blue BMW")
Text: "a lovely and lively elegant man, wearing purple sweatpants, a green sweater, and enormous clown-shoes, sits on the hood of a dark-blue BMW that is parked in front of the United States Capital building. He holds an enormous bug-net in the air and catches dollar-bills with it.", perfect-face, photorealistic, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
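The one-change-at-a-time comparison above is easy to script, so that each try differs from the previous one only in the words that were meant to change. This is a minimal sketch that only builds prompt strings; it assumes nothing about how the Space accepts input, and `STYLE_TAGS`, `build_prompt`, and `vary` are hypothetical names, not part of any library.

```python
# Style tags copied from the photorealistic prompts in this thread.
STYLE_TAGS = [
    "perfect-face", "photorealistic", "highly detailed", "sharp focus",
    "perfect-body", "masterpiece", "critically-acclaimed", "intricate",
    "volumetric lighting", "beautiful composition",
]


def build_prompt(description, tags=STYLE_TAGS):
    """Quote the description and append the comma-separated style tags."""
    return '"' + description + '", ' + ", ".join(tags)


def vary(description, replacements):
    """Apply (old, new) substitutions in order, failing loudly if `old` is absent."""
    for old, new in replacements:
        if old not in description:
            raise ValueError(f"{old!r} not found in description")
        description = description.replace(old, new, 1)
    return description


# First description of the man, exactly as posted above.
BASE = (
    "a lovely and lively elegant fantasy man, wearing purple sweatpants, "
    "a green sweater, and enormous clown-shoes, sits on the hood of a car "
    "that is parked in front of the United States Capital building. "
    "He holds an enormous bug-net in the air and catches dollar-bills with it."
)

# The "very slight change": drop "fantasy" and swap the car for a dark-blue BMW.
second_try = vary(BASE, [("fantasy ", ""), ("a car", "a dark-blue BMW")])
print(build_prompt(second_try))
```

Keeping the base description in one place and listing the substitutions makes it obvious afterwards which single word caused a big change in the output.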
(Changed his shoes from clown-shoes to black Reeboks. This drastically changed pretty much everything, strangely. It really thinks it knows things about someone based on their shoes.)
Text: "a lovely and lively elegant man, wearing purple sweatpants, a green sweater, and black Reeboks, sits on the hood of a dark-blue BMW that is parked in front of the United States Capital building. He holds an enormous bug-net in the air and catches dollar-bills with it.", perfect-face, photorealistic, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
This time, I changed the shoes again, to Dockers, and called him "bombastic" rather than "lovely" (why not). More importantly, I made the two sentences into one run-on sentence. I feel like the second sentence, about the bug net and dollar bills, has been almost ignored every time, so I made one (even longer) run-on sentence instead. However, this doesn't seem to have improved anything at all. I also notice that because I mentioned the colors green, purple, and black in the description, these colors get used for everything.
Text: "bombastic and lively elegant man, wearing purple sweatpants, a green sweater, and black dockers, sits on the hood of a dark-blue BMW that is parked in front of the United States Capital building, and holds an enormous bug-net in the air and catches dollar-bills with it.", perfect-face, photorealistic, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
Overhauled the sentence with brackets, to see if it aids the AI's comprehension.
Surprised that it still gets the color of the car wrong so often. Changed it from "dark-blue" to "blue", but that didn't do much.
Text: "man who is [ bombastic and lively elegant ] and [ wearing [ purple sweatpants] and [ green sweater ] and [ black dockers ] ], sits on [ [ the hood of ] a blue BMW parked [ [ in front of ] the [ United States Capital building ] and [ holds [ [in the air ] [ an enormous ] net ] and catches [ dollars [ that are flying around ] ].", perfect-face, photorealistic, highly detailed, sharp focus, perfect-body, masterpiece, critically-acclaimed, intricate, volumetric lighting, beautiful composition
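When nesting brackets by hand like this, it is easy to lose track of which ones are still open. Below is a small sketch that counts unmatched square brackets in a prompt; `bracket_report` is a hypothetical helper, and whether the model treats brackets as grouping at all is an open question, so this only checks the text, not the model's behavior.

```python
def bracket_report(prompt):
    """Return (unclosed_opens, stray_closes) for square brackets in `prompt`."""
    depth = 0   # brackets currently open
    stray = 0   # closing brackets that had nothing to close
    for ch in prompt:
        if ch == "[":
            depth += 1
        elif ch == "]":
            if depth == 0:
                stray += 1
            else:
                depth -= 1
    return depth, stray


# The bracketed description, copied as posted above.
BRACKETED = (
    "man who is [ bombastic and lively elegant ] and [ wearing [ purple sweatpants] "
    "and [ green sweater ] and [ black dockers ] ], sits on [ [ the hood of ] a blue BMW "
    "parked [ [ in front of ] the [ United States Capital building ] and [ holds "
    "[ [in the air ] [ an enormous ] net ] and catches [ dollars [ that are flying around ] ]."
)

unclosed, stray = bracket_report(BRACKETED)
print(f"unclosed opening brackets: {unclosed}, stray closing brackets: {stray}")
```

On the prompt as pasted here, the check reports opening brackets that are never closed, starting around "sits on [ [ the hood of ]", so the intended grouping of the second half is ambiguous even to a human reader.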
Wow, I love your method!
Thank you :)
The AI has a very limited "memory". It can mash together many general and stylistic terms, but when it comes to simple facts about the things to be depicted, the more you add, the more confused the generated image becomes.
It can do "a cat", "a black cat", and "a black cat sitting in a box" consistently, as long as you're not too picky about anatomical accuracy. It can even do "a black cat sitting in a white box on a lawn" 1/4 of the time. "A black cat with green eyes sitting in a white box on a lawn" will work about as often. But the moment you add another detail, it takes a lot more tries. So "a black cat with green eyes wearing a collar sitting in a white box on a lawn" can still be done, but it's down to less than 7%. And as for "a black cat with green eyes wearing a red collar sitting in a white box on a lawn", it might be faster to get a cat, collar, and box, and photograph the thing itself.
Looks like 8 or 9 precise concrete facts might be the limit for a reasonable amount of time, unless you just happen to get lucky, or you've asked for something that appears in millions of images (e.g. "A wide angle close-up selfie taken from above of an adult female with pouty lips, wearing lingerie, in a cluttered bathroom with bad lighting," which I'm guessing you might get on the first try, even if you didn't want it).
1. Cat
2. Black cat
3. Black cat with green eyes
4. Black cat with green eyes sitting
5. Black cat with green eyes sitting in a box
6. Black cat with green eyes sitting in a white box
7. Black cat with green eyes sitting in a white box on a lawn
8. Black cat with green eyes wearing a collar sitting in a white box on a lawn
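To put the quoted success rates into numbers of attempts, here is a rough sketch. It assumes each generation is an independent try with a fixed success probability, which is a simplification; the rates 1/4 and 7% are the figures from this comment (the 7% is stated as an upper bound, so the real number of tries would be at least this large), and the function names are made up for illustration.

```python
import math


def expected_tries(p):
    """Mean number of attempts until the first success (geometric distribution)."""
    return 1.0 / p


def tries_for_confidence(p, confidence=0.95):
    """Smallest n such that at least one success in n tries has probability >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


for label, p in [
    ("black cat in a white box on a lawn", 0.25),
    ("... with green eyes and a collar", 0.07),
]:
    print(f"{label}: about {expected_tries(p):.1f} tries on average, "
          f"{tries_for_confidence(p)} tries for a 95% chance of one good image")
```

Under these assumptions, going from 1/4 to 7% moves the average from 4 tries to roughly 14, and the number of tries needed for a 95% chance of at least one good image from 11 to 42.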
OK, so just for fun, I tried this description you gave, lol.
text: "A wide angle close-up selfie taken from above of an adult female with pouty lips, wearing lingerie, in a cluttered bathroom with bad lighting,"
Then I modified it, just to see whether finding and eliminating unnecessary words helps. I also added "lively and elegant", idk, to make her look a bit more... lively and... elegant.
text: wide-angle, close-up selfie, from above, lively and elegant woman, pouty lips, wearing lingerie, in a cluttered bathroom, bad lighting