Stable Diffusion and Midjourney can even understand “kawaii.” Anime-style characters created by AI image generators



UK start-up company Stability AI has released the public version of their open-source AI image generator Stable Diffusion. The company also launched the beta version of web service DreamStudio, a front-end API that uses Stable Diffusion.

AI image generators are programs that use AI to create images based on textual descriptions. Midjourney, in particular, has become very popular on the internet in recent times, but many are praising the newly released Stable Diffusion and claiming that it is even more advanced.

Stable Diffusion generates images from text, just like Midjourney, and can be used by anyone who signs up for a DreamStudio account. The code and documentation for Stable Diffusion have been released on AI community site Hugging Face, and there is also a demo page on the site where you can try it out. The Stable Diffusion model has been released under a permissive license that allows for commercial and non-commercial use, provided the license is made available to end users of the model in any service built on it.

I (the original author of this article) also tried it out by inputting the word “AUTOMATON” on the demo page. The resulting image can be seen below:


Since I didn’t input text that described something in detail, the results ended up all being abstract images. However, it would take a considerable amount of time and effort for a real person to create images such as these. A service that can automatically generate elaborate images based on a single word in only a few minutes is a major technological innovation.

One of the distinguishing traits of Stable Diffusion is the ability to create realistic images that mimic real photos. By inputting suitable descriptions, you can create fictional animals by combining existing ones and create detailed images that appear no different than photos taken in real life. A large variety of different images can be seen in Yamakazu’s article (in Japanese) explaining the ins and outs of using Stable Diffusion.

Tweet Translation:
It’s fun combining different kinds of cute animals.
1. Dog-cat
2. Angora rabbit-cat
3. Panda-cat

Tweet Translation:
I’m currently experimenting with generating photos. Anything from close-ups to landscapes can be created with this level of quality. Incredible.


Another thing that should be highlighted is the incredibly high precision with which Stable Diffusion can create anime-style illustrations. These kinds of images, too, are already being shared on the internet.

https://twitter.com/8co28/status/1561932766002167808

That being said, generating such detailed images requires long strings of text and is not so simple that anyone can do it. There’s a certain knack to composing text that generates images just as you envision them, and in Japan, particularly useful words and phrases have come to be known as Jumon (magic words).

It appears that Midjourney, the previous AI image generator to make a splash on the internet, has also undergone some changes. Thanks to an update that likely incorporates learning from Stable Diffusion’s source code, Midjourney has also become able to create images featuring “kawaii” 2D characters (cute anime-style 2D characters).

For example, the images below, generated with the intention of creating a “Kawaii 2D character,” are just as good as the illustration-style images that were generated with Stable Diffusion. Being able to incorporate open-source content and evolve in a short period of time is one of the advantages of AI.

https://twitter.com/8co28/status/1561967065741037568

Midjourney has even been used to create seemingly real photos of traffic accidents caused by fictional creatures. They look comparable to movie scenes created using CG.

Tweet Translation:
Generating fake dashcam images of traffic accidents is super fun.


Despite the innovation of these AI image generators, they can also create a number of issues, particularly in relation to copyright. Generally, using copyrighted materials as training data for an AI is not seen as an issue because things like the pattern, style, and composition of an image are not subject to copyright. However, there is the possibility that AI-generated images could infringe upon the image and likeness rights of specific people.

Whether or not output generated by an AI should be subject to copyright is a hotly debated topic, and there could be cases where an AI image unintentionally bears a close resemblance to an existing illustration. However, it is incredibly hard to prove that an AI has deliberately tried to plagiarize an image, since establishing clear judgment criteria is difficult.

In addition to Stable Diffusion and Midjourney, there are a number of other AI image generators that are already available or will be released in the future. The appearance of this technology will have a variety of effects on different industries, and we have already seen Midjourney images used to create horror visual novels (related article). There are still some hurdles that need to be overcome, but it is likely that we will see a rise in content that uses AI images in the future. We may very well be standing on the doorstep of an era in which anybody can easily create high-quality images regardless of device.


Written by Marco Farinaccia based on the original Japanese article (original article’s publication date: 2022-08-24 16:20 JST)

Len Aoi (VTuber)

JP AUTOMATON writer
