
Talk:Stable Diffusion/Archive 1

From Wikipedia, the free encyclopedia

Gallery of examples?

The Japanese language page has a gallery of various examples that Stable Diffusion can create, perhaps we should do the same to showcase a few examples for people to see. I'd be curious to hear others weigh in. Camdoodlebop (talk) 00:57, 11 September 2022 (UTC)

The built-in Batch, Matrix and XY plot functions are great for this. Please feel free to use this example for the Img2Img section to explain parameters: https://i.imgur.com/I6I4AGu.jpeg Here I've used an original photo of a dirty bathroom window and transformed it using the prompt "(jean-michel basquiat) painting" with various CFG scale and denoising strength values. 73.28.226.42 (talk) 16:42, 8 October 2022 (UTC)
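For context on what those two knobs do mechanically: the CFG scale weights how strongly each denoising step follows the prompt, while denoising strength decides how much of the diffusion schedule is re-run on the noised input photo. A rough sketch of the common strength-to-steps mapping (the function name is hypothetical; several img2img implementations use essentially this arithmetic):

```python
# Hedged sketch, not the code of any particular UI: "denoising strength"
# in [0, 1] controls how many of the scheduled denoising steps are
# actually applied to the (partially noised) init image.
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run on the init image."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # strength = 0 leaves the image untouched; strength = 1 ignores it
    # entirely and behaves like plain txt2img.
    return min(int(num_inference_steps * strength), num_inference_steps)

print(img2img_steps(50, 0.75))  # 37
```

At low strength the Basquiat style only tints the window photo; near 1.0 the composition itself is repainted.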

External links

The AUTOMATIC1111 fork of Stable Diffusion is indubitably the most popular client for Stable Diffusion. It should definitely have its place in the external links section. Thoughts? Leszek.hanusz (talk) 16:26, 7 October 2022 (UTC)

Reddit comments aren't reliable for anything, and Wikipedia is WP:NOT a link directory. We should not be providing links to clients at all. MrOllie (talk) 16:30, 7 October 2022 (UTC)
This is just one metric; it has more than 7K stars on GitHub. What more do you want? Do you actually use Stable Diffusion yourself? It is now THE reference. Leszek.hanusz (talk) 16:37, 7 October 2022 (UTC)
GitHub stars (or any form of social media likes) are also indicative of precisely nothing. What I want is that you do not advertise on wikipedia by adding external links to your own project. - MrOllie (talk) 16:38, 7 October 2022 (UTC)
Automatic1111 is not my own project, it has nothing to do with me. http://diffusionui.com is my own project and I agree it should not be in the external links. Leszek.hanusz (talk) 16:51, 7 October 2022 (UTC)
I agree with MrOllie. Nothing here (reddit comments, GitHub stars) is the type of sourcing that would suggest this should be included in this article. Elspea756 (talk) 16:53, 7 October 2022 (UTC)
I think it's evident in that most, nearly all, published/shared prompts for SD use the parentheses/brackets/prompt-editing syntactic sugar, which is a feature exclusive to Automatic1111-webui's version. That should be a good indicator of its popularity if you can't use GitHub stats for some reason. 73.28.226.42 (talk) 13:41, 10 October 2022 (UTC)

Using images to promote unsourced opinions

I've removed two different versions of editors trying to use images to promote unsourced legal opinions and other viewpoints. Please, just use reliable sources that support that these images illustrate these opinions, if those sources exist. You can't just place an image and claim that it illustrates an unsourced opinion. Thanks. Elspea756 (talk) 15:54, 6 October 2022 (UTC)

"But Stable Diffusion’s lack of safeguards compared to systems like DALL-E 2 poses tricky ethical questions for the AI community. Even if the results aren’t perfectly convincing yet, making fake images of public figures opens a large can of worms." - TechCrunch. "And three weeks ago, a start-up named Stable AI released a program called Stable Diffusion. The AI image-generator is an open-source program that, unlike some rivals, places few limits on the images people can create, leading critics to say it can be used for scams, political disinformation and privacy violations." - Washington Post. I don't know what additional convincing you need. As for your edit summary of "Removing unsourced claim that it is a "common concern" that this particular image might mislead people to believe this is an actual photograph of Vladimir Putin", nowhere in the caption was that ever mentioned, that's purely your own personal interpretation that completely misses the mark of what the caption meant. --benlisquareTCE 16:02, 6 October 2022 (UTC)
Thank you for discussing here on the talking page. I see images of Barack Obama and Boris Johnson included in that Tech Crunch article, so those do seem to illustrate the point you are trying to make and are supported by the source you are citing. Can we agree to replace the previously used unsourced image with either that Barack Obama image or series of Boris Johnson images? Elspea756 (talk) 16:06, 6 October 2022 (UTC)
That would not be suitable, because those images of Boris Johnson and Barack Obama are copyrighted by whoever created those images in Stable Diffusion and added those to the TechCrunch article. Per the WP:IUP and WP:NFCC policies, we do not use non-free images if a free-licence image is already available. A free licence image is available, because I literally made one, and released it under a Creative Commons licence. --benlisquareTCE 16:09, 6 October 2022 (UTC)
OK, now I understand why you feel so strongly about this image, it's because as you say you "literally made" an image and now you want to include your image in this wikipedia article. I hope you can understand you are not a neutral editor when it comes to decisions about this image you "literally made", that you have a conflict of interest here, and shouldn't be spamming your image you made into this article. Your image you are spamming into this article does not accurately illustrate the topic, so it should be removed. Elspea756 (talk) 16:15, 6 October 2022 (UTC)
It's your prerogative to gain WP:CONSENSUS for your revert, given that you are the reverting party. If you can convince myself, and the wider community of editors, that your revert is justified, then I will by all means happily agree with your revert. --benlisquareTCE 16:17, 6 October 2022 (UTC)
Nope, it is your obligation to provide sources and gain consensus for your "image made literally 27 minutes ago ffs." We have no obligation to host your "image made literally 27 minutes ago ffs." Elspea756 (talk) 16:21, 6 October 2022 (UTC)
Point to me the Wikipedia policy that says this. Almost all image content on Wikipedia is user self-created, anyway; your idea that Wikipedia editors cannot upload their own files to expand articles is completely nonsensical. All of your arguments have not been grounded in any form of Wikipedia policy; rather, they are exclusively grounded in subjective opinion, and a misunderstanding of how Wikipedia works. "We" "Your" - my brother in Christ, you joined Wikipedia on 2021-06-14, it's wholly inappropriate for you to be condescending as if you were the exclusive in-group participant here. --benlisquareTCE 16:23, 6 October 2022 (UTC)
As I've said, you have a very clear conflict of interest here. It is very evident from your language choices here, writing "ffs," "Christ," etc., that you are not a neutral editor and that you feel very strongly about spamming your unsourced, user-generated content here. I understand very clearly where you are coming from now. There is no need for you to continually restate your opinions with further escalating profanity. Elspea756 (talk) 16:34, 6 October 2022 (UTC)
I totally agree with Elspea756 and removed some images. This is clearly original research; while it is reasonable to be more lax with WP:OR as it applies to images, WP:OI (quite reasonably) states: [Original images are acceptable ...] so long as they do not illustrate or introduce unpublished ideas or arguments. Commentary on the differences between specific pictures is very different than something like File:Phospholipids aqueous solution structures.svg, which is inspired by existing diagrams and does not introduce "unpublished ideas". Yes, the idea that "AI images are dependent on the input", is published; no, no one independent has analyzed these specific pictures. Also, using AI-generated art with prompts asking to emulate specific artists' styles is not only blatantly unethical, but also potentially a copyright violation; that it is legally acceptable is not yet established. Finally, we shouldn't have a preponderance of pictures appealing to the male gaze. There are thousands, millions of potential subjects, and there is nothing special about these. Ovinus (talk) 01:47, 8 October 2022 (UTC)
This thread is specifically in reference to the Putin image used in the "Societal impacts" section, however. The disagreement here is whether or not it's appropriate to use the Putin image to illustrate the ethics concerns raised in the TechCrunch article; my position is that we cannot use the Boris Johnson image from the TechCrunch article as that would fall afoul of WP:NFCC. As discussed in a previous thread, I had already planned to replace a few of the sample images in the article with ones that are less entwined with the female form and/or male gaze, I just haven't found the time to do so yet, since creating prompts of acceptable quality is more time-consuming than most might actually assume. --benlisquareTCE 02:07, 8 October 2022 (UTC)
I understand, but the images in this article are broadly problematic, not just the Putin image. It's quite arguably a WP:BLP violation, actually. A much less controversial alternative could be chosen; for example, using someone who's been dead for a while. Ovinus (talk) 02:12, 8 October 2022 (UTC)
In that case, that's definitely an easy job to fix. I'll figure out an alternative deceased person in due time. --benlisquareTCE 02:15, 8 October 2022 (UTC)
Thank you, Ovinus, for "total agree"ment that these spam images are a problem and that they are "clearly original research." In a moment I will be once again removing the unsourced, inaccurate image of Vladimir Putin from this article that has been repeatedly spammed into this article. Besides being obvious spam and unsourced original research, it is also a non-neutral political illustration created to express an individual wikipedia editor's personal point of view, and its subject is a living person so this violates our policy on biographies of living persons. Once again, the obligation to provide sources and gain consensus is on the editor who wants their content included. We do not need a week-long discussion before removing an unsourced user-generated spam image expressing a personal political viewpoint about a living person. Elspea756 (talk) 13:22, 13 October 2022 (UTC)
I wouldn't call it spam. Benlisquare is clearly here in good faith. Ovinus (talk) 14:42, 13 October 2022 (UTC)

Wiki Education assignment: WRIT 340 for Engineers - Fall 2022 - 66826

This article was the subject of a Wiki Education Foundation-supported course assignment, between 22 August 2022 and 2 December 2022. Further details are available on the course page. Student editor(s): Bruhjuice, Aswiki1, Johnheo1128, Kyang454 (article contribs).

— Assignment last updated by 1namesake1 (talk) 23:38, 17 October 2022 (UTC)

Not Open Source

The license has usage restrictions, and therefore does not meet the Open Source Definition (OSD):

https://opensource.org/faq#restrict
https://stability.ai/blog/stable-diffusion-public-release

Nor is the "Creative ML OpenRAIL-M" license OSI-approved:

https://opensource.org/licenses/alphabetical

It would be correct to refer to it as "source available" or perhaps "ethical source", but it certainly isn't Open Source.

Gladrim (talk) 12:40, 7 September 2022 (UTC)

This is my understanding as well, and I thought about editing this article to reflect this. However I'm not sure how to do this in a way that is compliant with WP:NOR, as the Stability press release clearly states that the model is open source and I have been unable to find a WP:RS that clearly contradicts that specific claim. The obvious solution is to say "Stability claims it is open source" but even that doesn't seem appropriate given the lack of sourcing saying anything else (after all, the explicit purpose of that language is to cast implicit doubt on the claim). I have a relatively weak understanding of Wikipedia policy and would be more than happy if someone can point to evidence that correcting this claim would be consistent with Wikipedia policy, but at the current moment I don't see a way to justify it.
It's also worth noting that the OSI-approved list hasn't been updated since Stable Diffusion came out, and SD is the first model to be released with this license as far as I can tell. Thus the lack of endorsement is not evidence of non-endorsement. Perhaps we could say "Stability claims it is open source, though OSI has not commented on the novel license" (this is poorly worded but you get my point)
Stellaathena (talk) 17:41, 7 September 2022 (UTC)
According to the license, which is adapted from Open RAIL-M (Responsible AI Licenses), the 'M' means the usage restrictions apply only to the published Model or derivatives of the Model, not the source code.
Open RAIL has various types of licenses available: RAIL-D (use restrictions apply only to the Data), RAIL-A (only to the Application/executable), RAIL-M (only to the Model), RAIL-S (only to the Source code), and these can be combined in D-A-M-S order, e.g. RAIL-DAMS, RAIL-MS, RAIL-AM.
The term 'Open' can be added to a license to clarify that it is royalty-free and that the work and subsequent derivative works can be re-licensed 'as long as the Use Restrictions similarly apply to the relicensed artifacts'.
"Open RAIL Licenses
Does a RAIL License include open-access/free-use terms, akin to what is used with open source software?
If it does, it would be helpful for the community to know upfront that the license promotes free use and re-distribution of the applicable artifact, albeit subject to Use Restrictions. We suggest the use of the prefix "Open" to each RAIL license to clarify, on its face, that the licensor offers the licensed artifact at no charge and allows licensees to re-license such artifact or any subsequent derivative works as they choose, as long as the Use Restrictions similarly apply to the relicensed artifacts and its subsequent derivatives. A RAIL license that does not offer the artifact royalty-free and/or does not permit downstream licensing of the artifact or derivative versions of it in any form would not use the "Open" prefix." (source)
so technically the source code is 'Open Source'
Maybe a useful link:
https://huggingface.co/blog/open_rail
https://www.licenses.ai/ai-licenses
https://www.licenses.ai/blog/2022/8/26/bigscience-open-rail-m-license Dorakuthelekor (talk) 23:04, 17 September 2022 (UTC)
It is definitely not open source, and to describe it that way is misleading. Ichiji (talk) 15:56, 3 October 2022 (UTC)
Just a short note for now as I'll revisit this issue later: Stable Diffusion is clearly open source. The question is whether it's free and open source software (FOSS) or just open source (including any potential subtypes of open source, which don't have to be approved by any "Open Source Definition" (OSD)). The source code is voluntarily and fully available in an accessible manner, so it's open source by definition. As to whether or not it's FOSS: I would argue it is, but maybe there should be a clarification that it's not a fully condition-free type of FOSS.
Several WP:RS sources have stated that it is FOSS and have even made that a major topic of their articles. See these refs; there are probably more: [1][2][3][4][5][6]

References

  1. ^ Edwards, Benj (6 September 2022). "With Stable Diffusion, you may never believe what you see online again". Ars Technica. Retrieved 15 September 2022.
  2. ^ "Stable Diffusion Public Release". Stability.Ai. Retrieved 15 September 2022.
  3. ^ "Stable Diffusion creator Stability AI accelerates open-source AI, raises $101M". VentureBeat. 18 October 2022. Retrieved 10 November 2022.
  4. ^ Kamath, Bhuvana (19 October 2022). "Stability AI, the Company Behind Stable Diffusion, Raises $101 Mn at A Billion Dollar Valuation". Analytics India Magazine. Retrieved 10 November 2022.
  5. ^ Pesce, Mark. "Creative AI gives overpowered PCs something to do, at last". The Register. Retrieved 10 November 2022.
  6. ^ "Is generative AI really a threat to creative professionals?". The Guardian. 12 November 2022. Retrieved 12 November 2022.

Prototyperspective (talk) 17:27, 12 November 2022 (UTC)

Image variety

Benlisquare, I appreciate all the work you've done expanding this article, including the addition of images, but I think the article would be improved if we could get a greater variety of subject matter in the examples. To be honest, I think any amount of "cute anime girl with eye-popping cleavage" content has the potential to raise the hackles of readers who are sensitive to the well-known biases of Wikipedia's editorship, so it might be better to avoid that minefield altogether. At the very least though, we should strive for variety.

I was thinking about maybe replacing the inpainting example with figure 12 from the latent diffusion paper, but that's not totally ideal since it's technically not the output of Stable Diffusion itself (but rather a model trained by LMU researchers under very similar conditions, though I think with slightly fewer parameters). Colin M (talk) 21:49, 28 September 2022 (UTC)

My rationale for leaving the examples as-is is threefold:
  1. Firstly, based on my completely anecdotal and non-scientific experimentation from generating over 9,500 images (approx. 11 GB) using SD, non-photorealistic images play best with img2img's ability to upscale images and fill in tiny, closer details without the final result appearing too uncanny to the human eye, which is why I opted to generate a non-photorealistic image of a person for my inpainting/outpainting example. Sure, we theoretically could leave all our demonstration examples as 512x512 images (akin to how the majority of example images throughout that paper were small squares), but my spicy and highly subjective take on this is, why not strive for better? If we can generate high detail, high resolution images, then I may as well do so. The technology exists, the software exists, the means to go above and beyond exists. At least, that's how I feel.
  2. Specifically regarding figure 12 from that paper, it makes no mention as to whether or not the original inpainted images are generated through txt2img which were then inpainted using img2img, or whether they used img2img to inpaint an existing real-world photograph. If it is the latter, then we'd run into issues concerning Commons:Derivative works. At least with all of the txt2img images that I generate, I can guarantee that there wouldn't be any concern in this area, as long as I don't outright prompt to generate a copyrighted object like the Eiffel Tower or Duke Nukem or something.
  3. Finally, I don't particularly think the systemic bias issue on this page is that severe. Out of the four images currently on this article, we have a photorealistic image of an astronaut, an architectural diagram, and two demonstration images containing artworks featuring non-photorealistic women. From my perspective, I don't think that's at the point of concern. Of course, if you still express concern in spite of my assurances, given time I could generate another 10+ row array of different txt2img prompts featuring a different subject, but it'll definitely take me quite some time to finetune and perfect to a reasonable standard (given the unpredictability of txt2img outputs). As a sidenote, the original 13-row array I generated was over 300 MB with dimensions of 14336 x 26624 pixels, and the file size limit for uploading to Commons is 100 MB, which is why I needed to split the image into four parts.
Let me know of your thoughts, @Colin M. Cheers, --benlisquareTCE 03:08, 29 September 2022 (UTC)
Actually, now that I think about it, would you be keen on a compromise where I generate a fifth image, either containing a landscape, or an object, or a man, to demonstrate how negative prompting works, as a counterbalance to the images already present? The final result would be something like this: Astronaut in the infobox, diagram under "Architecture", the 13-row matrix comparing art styles under "Usage" (right-align), some nature landscape or urban skyline image under "Text to image generation" (left-align), the inpainting/outpainting demonstration under "Inpainting and outpainting" (right-align). I'm open to adjustments if suggested, of course. --benlisquareTCE 03:33, 29 September 2022 (UTC)
Regarding your point 1:
  1. I don't think we're obliged to carefully curate prompts and outputs that give the nicest possible results. We're trying to document the actual capabilities of the model, not advertise it. Seeing the ways that the model fails to generate photorealistic faces, for example, could be very helpful to the reader's understanding.
  2. Even if we accept the reasoning of your point 1, that's merely an argument for showing examples in a non-photorealistic style. But why specifically non-photorealistic images of sexualized young women? Why not cartoonish images of old women, or sharks, or clocktowers, or literally anything else? It's distracting and borderline WP:GRATUITOUS.
Colin M (talk) 04:19, 29 September 2022 (UTC)
Creating the inpainting example took me quite a few hours worth of trial-and-error, given that for any satisfactory img2img output obtained one would need to cherrypick through dozens upon dozens of poor quality images with deformities and physical mutations, so I hope you can understand why I might be a bit hesitant with replacing it. Yes, I'm aware that's not a valid argument for keeping or not keeping something, I'm merely providing my viewpoint. As for WP:GRATUITOUS, I don't think that particularly applies, the subject looks like any other youthful woman one would find on the street in inner-city Melbourne during office hours, but I can understand the concern that it may reflect poorly on the systemic bias of Wikipedia's editorbase. Hence, my suggested solution to that issue would be to balance it out with more content, since there's always room for prose and image expansion. --benlisquareTCE 06:01, 29 September 2022 (UTC)
I've gone ahead and added the landscape art demonstration for negative prompting to the article. When generating these, this time I've specifically left in a couple of visual defects (e.g. roof tiles appearing out of nowhere from inside a tree, and strange squiggles appearing on the sides of some columns), because what you mentioned earlier about also showcasing Stable Diffusion's flaws and imperfections does also make sense. There are two potential ways we can layout these, at least with the current amount of text prose we have (which optimistically would increase, one would hope), between this revision and this revision which would seem more preferable? --benlisquareTCE 06:05, 29 September 2022 (UTC)
+1 on avoiding the exploitive images. The history of AI is rife with them, let's not add to that. Ichiji (talk) 15:59, 3 October 2022 (UTC)
I agree with Ichiji that editors should not be adding "exploitive images". I also agree with Colin M above in questioning why editors would be adding "images of sexualized young women" to this article. And I agree with Ovinus below that "we shouldn't have a preponderance of pictures appealing to the male gaze." In a moment I will be removing four unsourced, unnecessary, user-generated images created with "Prompt: busty young girl ..." We are not obligated to host anyone's "busty young girl" image collection. PLEASE NOTE: We don't need four editors to take over a week to disagree with someone spamming into this article 95 user-generated images from their "busty young girl" image collection. The obligation to provide sources and gain consensus is on the editor who wants their content included. An editor adding 90+ unsourced, user-generated images to an article is obvious spam and can be just removed on sight. Elspea756 (talk) 17:09, 8 October 2022 (UTC)
Nonsense. The consensus takes time and thorough discussion, there is no WP:DEADLINE to rush anything through without dialogue. Also, consider reading WP:SPAM to see what spam actually is. Your allegations are bordering upon personal attacks here. --benlisquareTCE 18:22, 8 October 2022 (UTC)
If certain types of images are characteristic of the usage of AI in general, or this particular program in particular, why should this article pretend otherwise? Of course, it would be ideal if they were published in some RS first, but this is to be balanced with other concerns, like permissive licenses. See the Visual novel article illustrations, for example. Smeagol 17 (talk) 10:12, 16 October 2022 (UTC)
I'd have to concur here; other types of AI-generated images are already well represented throughout Wikipedia, for instance the DALL-E page. In contrast with other text-to-image models, SD is particularly good at creating non-photorealistic art of anatomically realistic humans, and on occasion photorealistic images of people too if you're lucky with how the hands and limbs are generated, so showcasing such outputs make sense compared to generic images of beach balls and teddy bears.

On a sparingly-related tangent, today I have gotten the DreamBooth article onto the front page of Wikipedia which is arguably the part of Wikipedia with the most visibility, where it is showcased within the DYK section, and this article features an AI-generated topless photo of Wikipedia founder Jimmy Wales that I personally created; there were no objections during the entire DYK review process, there were no last minute objections from any sysops managing the DYK queues, and the DYK entry has been on the front page for 23 hours now, with no one raising any objections regarding the image. Not a single Wikipedia reader, not a single Wikipedia editor. It's interesting how there's essentially zero objection to a half-nude Jimmy Wales, but as soon as a fictional woman is involved it suddenly becomes a big issue, heaven forbid anyone sees a woman on the internet. My primary complaint is not that American prudishness exists; rather, my gripe is that Americans have the culturally imperialistic habit to dictate to everyone else on the planet how they should and should not feel about images of women. --benlisquareTCE 23:44, 8 December 2022 (UTC)

So, how about returning at least Inpainting and Outpainting images? They were very illustrative. Smeagol 17 (talk) 23:43, 11 November 2022 (UTC)
I'd definitely welcome it, the article currently looks like an utter desert completely lacking in visual content, which is absolutely silly for an article about image generation AI models out of all things. Feel free to re-apply the image yourself from an old diff if you wish to; personally I'm going to refrain from doing so myself, because I have no interest in having obsessive editors breathing down my neck for weeks and weeks again. I'll leave it to the judgement of someone else. --benlisquareTCE 23:54, 8 December 2022 (UTC)

Regarding Square Brackets as Negative Prompt

This is my first time contributing to a discussion, so please be understanding if I'm not following etiquette properly. I read the talk guidelines but it assumes a rather high level of familiarity of these systems, which I do not have.

In any case, I just wanted to start a discussion regarding the interpretation of citation 16. If I am reading the source material correctly, brackets around a keyword actually create emphasis around the keyword rather than a negative correlation. It states: "The result of a quantitative analysis is that square brackets in combination with the aforementioned prompt have a small but statistically significant positive effect. No effect or a negative effect can be excluded with 99.98% confidence." It goes on to state that with very specific prompt engineering, square brackets can be used to create an inconsistent and negligible effect.

It then discusses exclamation points and that they do seem to have some negative effect on reducing the appearance of certain keywords in images. Since I am new to both contributing to Wikipedia and to Stable Diffusion I wanted to see if someone smarter than me could confirm my interpretation of the source material before making the corrections to the article. Thank you. — Preceding unsigned comment added by Abidedadude (talkcontribs) 19:58, 30 October 2022 (UTC)

By design (as mentioned here, here and here as examples; note these are just for use as examples, I'd strongly recommend not citing them in any serious work), parentheses, for example (Tall) man with (((brown))) hair, increase emphasis, while square brackets, for example [Red] car with [[[four doors]]], decrease emphasis. Gaessler's findings suggest that while attempting to decrease the occurrence of something via [French] architecture, it actually has a "small but statistically significant positive effect" yet also one "not perceptible to humans" based on his data and methodology; meanwhile, the use of rainy!! weather to emphasise (i.e. increase the occurrence of; since ! can only be used for emphasis and not de-emphasis) was not very coherent and resulted in a high chi2/NDF, and the use of negative prompts to decrease the occurrence of keywords resulted in a highly statistically significant change in outcome. --benlisquareTCE 02:31, 31 October 2022 (UTC)
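As a rough illustration of the nesting rule described above (the helper name is hypothetical, and the 1.1 factor is the multiplier commonly attributed to the AUTOMATIC1111 web UI's attention syntax, not something defined by the model itself):

```python
# Illustrative sketch only: each enclosing pair of parentheses is commonly
# described as multiplying a token's attention weight by 1.1, and each
# enclosing pair of square brackets as dividing it by 1.1.
def emphasis_weight(depth_parens: int, depth_brackets: int) -> float:
    """Attention multiplier for a token nested in emphasis markers."""
    return (1.1 ** depth_parens) * (1.1 ** -depth_brackets)

print(round(emphasis_weight(3, 0), 3))  # (((brown)))  -> 1.331
print(round(emphasis_weight(0, 1), 3))  # [Red]        -> 0.909
```

This also makes clear why a single bracket level is barely perceptible: a 0.909 multiplier is a much smaller nudge than the model's run-to-run variance.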
I probably should also point out that some Stable Diffusion GUI implementations might use different formatting rules for emphasis; for example, NovelAI (which is a custom-modified, anime/furry-oriented model checkpoint of Stable Diffusion hosted on a SaaS cloud service) uses curly brackets, for example {rusty} tractor, for positive emphasis instead of parentheses. Not all Stable Diffusion implementations will process prompt formatting in the same way. --benlisquareTCE 02:56, 31 October 2022 (UTC)
X/Y plot demonstrating how deemphasis and emphasis markers work

In case it satisfies your curiosity, I've just generated this X/Y plot in Stable Diffusion to demonstrate to you how [deemphasis] (emphasis) prompting works in practice. In my personal opinion, you can barely see any visual difference at all among the [deemphasised examples], so usually I don't see much point in using them at all while prompting. Of course, all of this is original research, so this is just a FYI explanation, nothing more and nothing less. --benlisquareTCE 03:48, 31 October 2022 (UTC)

That's definitely interesting. I do like experimenting with prompts and learning about other people's experiences, you're saying that your info is based on NovelAI's custom implementation, yes? If so, perhaps it would be better to put the emphasis/de-emphasis info in the article about NovelAI? Because it seems the default checkpoint for Stable Diffusion hasn't been trained in any specific way to handle square brackets, and the citation in question doesn't really seem to support the assertion in the article about emphasis. In any case, perhaps it's all moot if everybody seems to more or less agree that the effect is imperceptible, even if it does exist. Maybe this sort of granular prompt fine tuning just shouldn't be mentioned at all, given that it's all pretty unreliable and results can be unpredictable with any machine learning? As a side note, with regards to anecdotal FYI info, I have been experimenting with JSON to input prompts (with default checkpoints) and the results have been pretty interesting. It's obvious that it hasn't been trained to interpret that in any way, but it really seems to make minor changes to the prompt result in much more significant differences vs natural language in terms of the resulting image. I definitely haven't experienced brackets the way anybody else is describing them, but again, it's all anecdotal. Abidedadude (talk) 07:58, 31 October 2022 (UTC)
No, 100.000000% (9 significant figures) of what I've covered above has nothing to do with NovelAI. I just mentioned in passing that NovelAI uses curly brackets instead of parentheses.
given that it's all pretty unreliable
That's precisely what the citation says: that using emphasis markers is less reliable than using negative prompts. And that's what's asserted in the Wikipedia article as well: "The use of negative prompts has a highly statistically significant effect... compared to the use of emphasis markers" (bold emphasis mine).
it really seems to make minor changes to the prompt result in much more significant differences
Yes, that is correct, and it's because even slight adjustments to things like word order, punctuation, and spelling add additional noise to the prompt, which will lead to a different output. The model doesn't parse the prompt like a human would, and we see this when big red sedan driving in highway and highway, with red big sedan, driving on it result in different outputs even with the same seed value. --benlisquareTCE 10:54, 31 October 2022 (UTC)
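The behaviour described above (same seed, reworded prompt, different image) can be sketched with a toy stand-in. This is purely illustrative: a real pipeline uses a text encoder such as CLIP and a denoising loop, not a hash, but the seed-determinism and prompt-sensitivity behave the same way. The function name `toy_generate` is a hypothetical placeholder, not part of any real Stable Diffusion API.

```python
import hashlib
import random

def toy_generate(prompt: str, seed: int) -> str:
    """Toy stand-in for a diffusion pipeline: the seed fixes the
    initial 'noise', while the prompt text conditions the result.
    (A real model encodes the prompt with a text encoder, not a
    hash, but any change to the string changes the conditioning.)"""
    rng = random.Random(seed)
    noise = rng.getrandbits(64)  # deterministic for a given seed
    conditioning = hashlib.sha256(prompt.encode()).hexdigest()
    return f"{noise:016x}-{conditioning[:16]}"

a = toy_generate("big red sedan driving in highway", seed=42)
b = toy_generate("highway, with red big sedan, driving on it", seed=42)
c = toy_generate("big red sedan driving in highway", seed=42)

assert a == c  # same seed + same prompt -> identical output
assert a != b  # same seed, reworded prompt -> different output
```

The two asserts capture the point being made: the seed alone does not pin down the image, because the prompt string is a second independent input to the process.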
I have to be honest, I'm even more confused now than when we started this discussion, so I'm probably just going to go ahead and bow out at this point. I'm pretty sure your own original research in the example above was intended to show me that negative prompting was less effective than emphasis, but right now you're telling me that the assertion in the article - about negative prompts being more effective - is correct. Even though the original research you conducted is consistent with the citation (which is then inconsistent with the statement in the article). Perhaps it's all a joke of some kind, because the 9 significant digits bit is pretty funny. All that extra emphasis on significance, and yet it doesn't change the end result one bit, much like the topic of discussion, no? Anyway, I did enjoy the discussion, but I'm afraid it's either going over my head or that the contradictions are just becoming too much for me to care about. I thought I was just helping out with a quick fix. I do appreciate you taking the time to engage with me either way. Abidedadude (talk) 17:34, 31 October 2022 (UTC)
An example of one of many different UI implementations of Stable Diffusion. This particular one is built upon the Gradio web frontend library, but there are non-web frontends for Windows and macOS as well. These UI frontends allow the user to interact with the model checkpoint without needing to type commands into a python console, lowering the barrier to entry for new users. All of these UIs have separate text fields for "prompt" and "negative prompt", as seen above. You enter what you want to see in the output image into the "prompt" text field; you enter what you don't want to see in the "negative prompt" field.
My bad, I should definitely be clearer in my explanation. My first question to you is, are you using a UI frontend while using Stable Diffusion, or are you directly inputting the settings (e.g. sampler steps, CFG, prompts, etc.) into a command-line interface? If you are using a UI frontend, which one are you using? Are you running the model locally on your own computer, or are you using a cloud service via a third-party website?
As mentioned in the article, these features are provided by open-source UI implementations of Stable Diffusion, and not the 3.97GB model checkpoint itself. The UI implementation acts as an interface between the user and the model, so that the user doesn't need to punch parameters into a python console window. There are many different open-source UI implementations for Stable Diffusion, including Stable Diffusion UI by cmdr2, Web-based UI for Stable Diffusion by Sygil.Dev, Stable Diffusion web UI by AUTOMATIC1111, InvokeAI Stable Diffusion Toolkit, and stablediffusion-infinity by lkwq007, among others. All of the aforementioned implementations utilise both negative prompting features and emphasis features. In fact, almost every single Stable Diffusion user interface now has these features; it is now the norm, rather than the exception, for Stable Diffusion prompts to feature negative prompting and emphasis marking given that they significantly reduce the quantity of wasteful, low quality generations to sift through; go to any prompt sharing website or Stable Diffusion online discussion thread, and the vast majority of shared prompts will feature negative prompts or emphasis markers, or even both. Since this is a common question raised by someone else above, I should point out it is inappropriate for the Wikipedia article itself to list all of these UI implementations, as Wikipedia is WP:NOT a repository of external links; the examples I've provided above are just to make sure you have full context on what's going on.
Just like how the original CompVis repo provided a collection of python scripts that allow the user to interact with the model checkpoint file (the 3.97GB *.ckpt file that does much of the heavy lifting, so to speak), and those python scripts aren't "part" of the model checkpoint, open-source user interfaces likewise implement their own interface between the user and the 3.97GB *.ckpt; this space has been rapidly evolving and improving over the past few months, mostly as an open-source community driven effort, and the "norm" for the range of configurable settings available to make prompts has shifted considerably since September.
If you have any additional questions relating to this topic in particular, or if you would like assistance on how to set up any of the aforementioned UI implementations or how to improve your prompting to obtain better outputs, feel free to let me know. As someone who has generated over 33GB of images through experimentation in Stable Diffusion and is quite passionate in fine-tuning prompts to find the most perfect outputs, I'd be quite glad to help out. --benlisquareTCE 22:20, 31 October 2022 (UTC)
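For readers wondering how a frontend's "negative prompt" field can influence the model at all, one common implementation strategy is classifier-free guidance: each denoising step combines two noise predictions, and the negative prompt's embedding is substituted for the usual empty/unconditional one. The following is a minimal numeric sketch of that combination step only, using toy lists in place of real tensors; the function name `cfg_combine` and the sample numbers are illustrative assumptions, not any particular codebase's API.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional (or negative-prompt) prediction, toward the
    prompt-conditioned one. Operates element-wise on the two
    per-step noise predictions."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# Toy per-step noise predictions (real ones are large tensors).
pred_for_prompt   = [0.8, 0.2, 0.5]  # conditioned on the prompt
pred_for_negative = [0.4, 0.6, 0.5]  # conditioned on the negative prompt

guided = cfg_combine(pred_for_prompt, pred_for_negative, scale=7.5)
# Each element moves away from the negative prediction, e.g.
# 0.4 + 7.5 * (0.8 - 0.4) = 3.4 for the first element.
```

This also shows why the CFG scale setting matters: a larger `scale` amplifies the difference between the two predictions, which is why overly high CFG values tend to produce oversaturated artifacts.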

Consensus is that we do not need to host a user's "busty young girl" image collection

There seems to be a dispute by editor Smeagol over whether there is a consensus for whether or not this page should be used to once again host benlisquare's unsourced, user-generated images created by prompting Stable Diffusion to create images of "busty young girls." Editor Ovinus above has disagreed with including these images, saying "we shouldn't have a preponderance of pictures appealing to the male gaze. There are thousands, millions of potential subjects, and there is nothing special about these." Editor Colin M above has also disagreed with including these "busty young girls," saying "why specifically non-photorealistic images of sexualized young women? Why not ... literally anything else? It's distracting and borderline WP:GRATUITOUS." Editor Ichiji has agreed, saying "+1 on avoiding the exploitive images." I agree with Ovinus, Colin M, and Ichiji, so that is a very clear consensus of 4 editors in agreement that benlisquare should not use wikipedia to host their "busty young girl" image spam. My understanding is now editor Smeagol wishes to add some illustration of inpainting and outpainting, and so has reverted my removal of these images. A desire to include an example of inpainting does not change the current consensus that if an illustration of inpainting were to be included here, it should be from a source other than benlisquare's unsourced, user-generated images of busty young girls. Such an example can be, as Colin M has said and others agreed, of "literally anything else." Does that make sense? Or is there a disagreement with this? Elspea756 (talk) 00:13, 19 December 2022 (UTC)

You are obsessed. Heaven forbid somebody depict your messiah Vladimir Putin in a less-than-positive light, huh. My one and only regret is creating that Putin image, which seems to be the sole origin for this multiple month-long obsession.

There seems to be a dispute by editor Smeagol... so that is a very clear consensus of 4 editors in agreement

Sounds like you don't have a consensus, then. Considering myself, @Smeagol 17, and 49.228.220.126 from Thailand are not in agreement, that makes 3 editors who disagree with 4 editors - hardly a consensus by any stretch. Not to mention, why are you grasping onto a tiny handful of posts from months ago, given that consensus can change, anyway? Furthermore, consensus is built based on the quality of arguments and not the quantity of proponents; frankly, lazy arguments that are shit and can be easily pulled apart can be given less credence.

it should be from a source other than benlisquare's unsourced, user-generated images

You contradict yourself. The Artificial intelligence art article, which is your personal precious baby that you valiantly defend every single day, is full of Wikipedia user-generated content such as this and this and this and this, yet frankly you don't seem to give a shit. Funny that, huh?

...of busty young girls

It's not my problem that you get boners looking at... checks notes... ordinary women wearing clothes. This is strictly your problem, not mine. Unless you live a sheltered life residing in the basement of a Baptist church in the American midwest, women in the real world like to wear fashionable clothes (horror, I know), and the depicted women look like any typical urbanite youth in any non-shithole city. It's the 21st century, don't tell women what to wear, teach men not to harm women instead. Is your solution to gender inequality to unironically erase women from the public eye? If your complaint is that the women are attractive and that I should have made her uglier, then my question is... why should I make pictures of ugly people, of any gender? From the marble statues of Cleopatra to the Mona Lisa, people generally prefer to create art of attractive people, and art will be heavily skewed towards attractive-looking people, news at 11, sports at 11:30.

Apart from this constant whining, have you contributed anything to this article? It's always the people who have nothing useful to bring to the table who whine and moan the loudest. You know why the current inpainting example images are being used? Because literally nobody has bothered to put in the effort to create something better in quality and educational usefulness. If you have such dire complaints, then why don't you create some better images to replace the existing content with? Surely it's easy for you, right? The tools are right there, freely downloadable, readily useable, and fully documented. Or are you conceding that you are incapable of making any contributions yourself to fix the issue of "women that are too attractive" as opposed to whining about it for months and months on end? --benlisquareTCE 02:38, 19 December 2022 (UTC)

As I said earlier, such pictures are fairly typical for Stable Diffusion use, afaik. So why should we pretend otherwise, and on its own page, no less? (And it is one out of four/five, for now). Not to mention, the choice is between this picture and no picture, given that no one created another one in 2 months. Smeagol 17 (talk) 03:00, 19 December 2022 (UTC)
Hi, Smeagol, thank you for discussing on this talk page. No, these don't seem typical to me of the Stable Diffusion images I've created or that I have seen created by anyone else I know. As has been repeated several times here, Stable Diffusion can be used to create anything one types a prompt in for. Yes, I am sure we could find a different more suitable image of inpainting from a reliable source, or we could make our own. Would an example of inpainting involving dogs or cats, maybe somewhat similar to those seen at https://huggingface.co/stabilityai/stable-diffusion-2-inpainting be a suitable solution for you here? Elspea756 (talk) 04:00, 19 December 2022 (UTC)
I don't have a statistic, but I would be surprised if less than 1/5 of all images created by SD right now are of "beautiful women" (see for example this, for midjourney: https://tokenizedhq.com/wp-content/uploads/2022/09/midjourney-statistics-facts-and-figures-popular-prompt-words-infographic.jpg?ezimgfmt=ng:webp/ngcb1) A picture from a reliable source would be ideal... in principle, but our RS-s are unlikely to have permissive enough licenses for that. If someone creates a competing fig. of better quality using cats, we can discuss replacement, but if it is of the same or worse quality, then, imho, the one who first took the effort to create an illustration (in this case, benlisquare) shall have priority. Smeagol 17 (talk) 08:01, 19 December 2022 (UTC)
If the new result is worse in quality, I would still consider that a bad change... Just because we CAN change to a picture of a cat doesn't mean a lower resolution/blurry/unclear picture is better 49.228.220.126 (talk) 17:04, 19 December 2022 (UTC)
these don't seem typical to me of the Stable Diffusion images I've created or that have seen created by anyone else I know - sounds like you should pay more attention to what the SD community has been churning out over the past few months, finding similar examples isn't even difficult. Being unaware is not an excuse. --benlisquareTCE 15:41, 19 December 2022 (UTC)
what part of the wikipedia rules talks about the male gaze, this tutorial is super helpful so even if you might be offended by it why should it go???? i've seen worse content on this site, why not go protest that????? 98.109.134.212 (talk) 22:43, 20 December 2022 (UTC)
Speaking of tutorials, does Elspea756 provide a step-by-step guide on how to replicate his output image in the file description? Of course not, he's not here to educate or help the reader, he's here to win an internet argument. At least my file descriptions actually have some effort put into them. --benlisquareTCE 02:49, 21 December 2022 (UTC)
Since Wikipedia isn't for writing step-by-step guides (WP:NOTHOWTO) that really isn't a problem. - MrOllie (talk) 03:43, 21 December 2022 (UTC)
You're not wrong about WP:NOTHOWTO, but you still haven't addressed my main point that his image is the epitome of low-effort. --benlisquareTCE 03:54, 21 December 2022 (UTC)
It illustrates the concept in the article so the reader can understand it, and it does so without a sexualized image. The amount of effort involved isn't really a factor. - MrOllie (talk) 04:00, 21 December 2022 (UTC)
Understood. If I put the woman in a hijab, would you still have opposition to it? That would remove the entire "sexualisation" element that forms the crux of this dispute. It would take me no more than 15 minutes, and would alleviate your primary concern regarding WP:SYSTEMIC. --benlisquareTCE 04:04, 21 December 2022 (UTC)
Covering the hair really isn't the point. MrOllie (talk) 04:14, 21 December 2022 (UTC)
I should clarify: the entire body, apart from the arms and face. The additional bonus would be that it would allow for representation of non-white cultures, another issue relating to WP:SYSTEMIC. Two birds with one stone, wouldn't you agree? --benlisquareTCE 04:15, 21 December 2022 (UTC)

WP:SOFIXIT

Disruptive and discriminatory imagery and commentary. ~ Pbritti (talk) 20:37, 22 December 2022 (UTC)

Alhamdulillah she has seen the folly of her ways, and has learned to embrace modesty and the grace of Allah. Fatima (fictional character, any resemblance to a real-world Fatima is purely coincidental) shall no longer partake in the folly and hedonism of decadent fashion, inshallah.

Before
Demonstration of inpainting and outpainting techniques using img2img within Stable Diffusion
Step 1: An image is generated, coincidentally with the subject having one arm missing.
Step 2: Via outpainting, the bottom of the image is extended by 512 pixels and filled with AI-generated content.
Step 3: In preparation for inpainting, a makeshift arm is drawn using the paintbrush in GIMP.
Step 4: An inpainting mask is applied over the makeshift arm, and img2img generates a new arm while leaving the remainder of the image untouched.
After
Demonstration of inpainting and outpainting techniques using img2img within Stable Diffusion
Step 1: An image is generated, coincidentally with the subject having one arm missing.
Step 2: Via outpainting, the bottom of the image is extended by 768 pixels and filled with AI-generated content.
Step 3: In preparation for inpainting, a makeshift arm is drawn using the paintbrush in GIMP.
Step 4: An inpainting mask is applied over the makeshift arm, and img2img generates a new arm while leaving the remainder of the image untouched.
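The masking behaviour described in Step 4 (regenerate the masked region, leave everything else untouched) can be sketched as a simple composite. This toy example uses a 1-D row of integer "pixels" instead of a real image and latent-space blending, and the function name `composite` is a hypothetical placeholder; it only illustrates the selection logic an inpainting mask applies.

```python
def composite(original, generated, mask):
    """Inpainting composite: where mask == 1, take the freshly
    generated pixel; elsewhere, keep the original untouched."""
    return [g if m else o for o, g, m in zip(original, generated, mask)]

original  = [10, 20, 30, 40, 50]  # toy pixel row from the source image
generated = [11, 22, 33, 44, 55]  # toy pixel row from the img2img pass
mask      = [ 0,  0,  1,  1,  0]  # 1 = region to inpaint (the "arm")

result = composite(original, generated, mask)
# -> [10, 20, 33, 44, 50]: only the masked region changes
```

Outpainting is the same idea at the canvas edge: the image is padded (here, by 512 or 768 pixels), the padding is fully masked, and the model fills it in.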

This is how things are actually fixed. Are there any further complaints, or can we finally put an end to this multiple month-long circus? --benlisquareTCE 09:52, 21 December 2022 (UTC)

Brilliant, nobody has any objections. I sure hope nobody will start whining again immediately after I put this new version into the article, now that the prior concerns have been resolved. --benlisquareTCE 13:45, 22 December 2022 (UTC)
Yeah, as I've suspected. Elspea756 doesn't give a shit about WP:SYSTEMIC, WP:GRATUITOUS or modesty, he's just spiteful and bitter that he's not getting his way. I guess that puts the "I care about protecting Wikipedia from sexual depictions of women" hypothesis to rest, huh? --benlisquareTCE 13:54, 22 December 2022 (UTC)

New non-WP:GRATUITOUS, more representative, higher resolution, and up-to-date version images to replace previous disputed images created by prompt "busty young girl"

I have uploaded new images to replace the previous disputed series of images that had been created with the prompt "busty young girl." This new set of images is far superior to the previous images for the following reasons: 1) These new images address the concerns of editors Ovinus, Colin M, and Ichiji who requested that we stop using "pictures appealing to the male gaze," "sexualized young women," images in violation of "WP:GRATUITOUS," and "exploitive images." 2) Editor Smeagol requested that the images be created by prompts that are "typical" and "representative" of what AI artists typically use, and Smeagol provided (thank you) a link to a list and chart of 20 "most popular" prompt terms. Terms like "Busty" and "Young" appear nowhere on this list. The top three terms, each of which the chart seems to show as at least twice as popular as any of the other terms below, are "man," "city," and "space." The new images I've uploaded are created with a prompt using 4 of the top 5 words on this list (man, city, space and cat), as well as other words that are in this top 20 ("cyberpunk," etc.). So, these new images are likely several times more representative than any images created with prompts like "busty and young" which are nowhere on the list provided by Smeagol. 3) Smeagol and IP editor 49.228.220.126 expressed a preference for images of higher resolution. These new images are up to twice the resolution of any of the previous images. And 4) Creating these new images used the 2.1 version of Stable Diffusion which was just released on December 7, so these images are a more up-to-date example of Stable Diffusion than any images created previously. 
So for at least all of those reasons and likely more —non-gratuitous, more representative, higher resolution, and created with a more up-to-date version — I think and hope we can all agree that these images created with a "man with cat in a city in space" prompt are a vast improvement over the previous images created with a prompt for "busty young girl." Thank you again. Elspea756 (talk) 17:42, 20 December 2022 (UTC)

honestly it looks trash, the cat looks stoned on weed and the guy's head anatomy is completely off. bring back the old one, this really reeks of desperation by someone lacking competence. 2003:C6:5F00:9700:E847:13DA:2EC6:9854 (talk) 21:48, 20 December 2022 (UTC)
The old 'busty girl' image was a literal poster child for WP:SYSTEMICBIAS. Keep the new one. - MrOllie (talk) 22:07, 20 December 2022 (UTC)
Your logic appears to be that you're willing to accept any low-quality substitute as long as "systemic bias" is not introduced, is that correct? Even if the non-systemic bias alternative is five steps backwards? I feel like you're more interested in culture wars rather than making the article better. 142.188.128.54 (talk) 03:22, 21 December 2022 (UTC)
Have fun arguing with that straw man. - MrOllie (talk) 03:29, 21 December 2022 (UTC)
Did you really drag the resolution sliders up with no regard to what the final image looks like, just so that you could "beat" the resolution size of my previous image? Just look at it, the finer details are messy and it's clearly very blurry once you zoom in to native resolution. Also, you have set CFG way too high, which is why you are seeing random purple splotches everywhere. Not to mention, your file description contains barely any information at all; what is the reader supposed to "learn" from your image? You haven't even given the reader the courtesy to reproduce your steps should they choose to do so; by comparison, I have been 100% transparent regarding each step I took, and how I got to my final image, within my descriptions.
I don't really know why you like to bring up WP:GRATUITOUS so often, an image of a normal, clothed woman is not gratuitous, and just because someone has mentioned WP:GRATUITOUS doesn't make it true. These images are WP:GRATUITOUS:

Images (Redacted)

...which is why we don't use these images in Wikipedia articles, unless we absolutely need to as a last resort. The earlier inpainting images on the other hand were not WP:GRATUITOUS, and no amount of ad nauseam repetition will make this falsehood true. --benlisquareTCE 02:47, 21 December 2022 (UTC)
I also vote it is not GRATUITOUS, given context and usage. It is like complaining that an article about Italian Renaissance painting or ancient Greek art has some not-covered-from-head-to-toe women. Smeagol 17 (talk) 10:19, 22 December 2022 (UTC)