As soon as Apple announced its plans to inject generative AI into the iPhone, it was as good as official: The technology is now all but unavoidable. Large language models will soon lurk on most of the world’s smartphones, generating images and text in messaging and email apps. AI has already colonized web search, appearing in Google and Bing. OpenAI, the $80 billion start-up that has partnered with Apple and Microsoft, feels ubiquitous; the auto-generated products of its ChatGPTs and DALL-Es are everywhere. And for a growing number of consumers, that’s a problem.
Rarely has a technology risen—or been forced—into prominence amid such controversy and consumer anxiety. Certainly, some Americans are excited about AI, though a majority said in a recent survey, for instance, that they are concerned AI will increase unemployment; in another, three out of four said they believe it will be abused to interfere with the upcoming presidential election. And many AI products have failed to impress. The launch of Google’s “AI Overview” was a disaster; the search giant’s new bot cheerfully told users to add glue to pizza and that potentially poisonous mushrooms were safe to eat. Meanwhile, OpenAI has been mired in scandal, incensing former employees with a controversial nondisclosure agreement and allegedly ripping off one of the world’s most famous actors for a voice-assistant product. Thus far, much of the resistance to the spread of AI has come from watchdog groups, concerned citizens, and creators worried about their livelihood. Now a consumer backlash to the technology has begun to unfold as well—so much so that a market has sprung up to capitalize on it.
Obligatory “fuck 99.9999% of all AI use-cases, the people who make them, and the techbros that push them.”
Not OP but familiar enough with open source diffusion image generators to be able to chime in.
Now I’d argue that being an artist comes down to being able to envision something in your mind’s eye and then reproduce it in the real world using some medium, whether it’s a graphite pencil, oil paint, a block of marble, a Wacom tablet on a PC, or even a negotiation with an AI model. Your definition might be different, but for the sake of conversation this is how I’m thinking about it.
The workflow for an AI-generated image can have a few steps before the result feels like it sufficiently aligns with your vision. Prompting for specific details can be tricky, so usually step 1 is to generate the basic outline of the image you’re after. Depending on your GPU or cloud service, this could take several minutes or hours before you get a basis that you can work with. Once you have the basic image, you can then use inpainting tools to mask specific areas of the image and change specific details, colors, etc. This again can take many, many generations before you land on something that sufficiently matches your vision.
This all comes after you go through the process of reviewing and selecting one of the hundreds of models that have been trained specifically for different types of output. Want to generate anime-style art? There’s a model for that. Want something great at landscapes? There’s a different one for that. Sure, you can use an all-purpose model for everything, but some models simply don’t have the training to align to your vision, so you either choose to live with ‘close enough’ or you start downloading new options, comparing them with your existing workflow, etc.
There’s certainly skill associated with the current state of image generation. Perhaps not the same level of practice you need to perfectly represent a transparent veil in graphite, but as with other formats, I have a hard time suggesting that when someone represents their vision in the real world, it’s automatically “not art”.
So if I walked into a restaurant that specialized in a certain cuisine (choosing the right one out of hundreds is a skill, right?) and wrote down a list of ingredients, and the restaurant made me a meal with those ingredients according to however the restaurant functions (nobody can see into the kitchen, after all), does this make me a chef?
Is there any chance you’re at a kbbq or hotpot restaurant? Because then you get to cook the meal yourself, which is arguably chef-like.
Jokes aside, I see the comparison you’re making and it’s not a bad one. I’d counter by giving the example of a menu - when you get to a restaurant you’re given a menu with text descriptions of the food you can receive from the kitchen. Since this is an analogy and not an exact comparison, let’s say that a meal on the menu is like the starting point of the workflow I described.
Based on that you have an idea of what the output will be when you order - but let’s say you don’t like mushrooms and you prefer your sauce on the side. When you make your order you provide those modifications - this is like inpainting.
Certainly you’re not a ‘chef’, but if the dish you design is both bespoke and previously unimaginable, I’d argue that at the very least you contributed to the creative process and participated in creating something new that matches your internal vision.
Not exactly the same but I don’t think it’s entirely different.
You keep using the word “vision”, but I have a hard time understanding how an AI artist has a vision equivalent to that of a traditional artist based on the explanation you’ve provided. It still sounds like they are just cycling through AI-generated options until they find something they like/that looks good. That is not the same as seeing something in your mind and then manually recreating that to the best of your ability.
Is a photographer an artist? They need to have some technical skill to capture sharp photos with good lighting, but a lot of the process is designing a scene and later selecting among the photos from a shoot for which one had the right look.
Or to step even further from the actual act of creation, is a creative director an artist? There’s certainly some skill involved in designing and recognizing a compelling image, even if you were not the one who actually produced it.
You’re sort of stepping around the issue here. Are you confirming that AI art is about cycling through options blind until you stumble across something you like?
No, both of those examples involve both design and selection, which is reminiscent of the AI art process. They’re not just typing in “make me a pretty image” and then refreshing a lot.
The only explanation I’ve received so far sounded exactly like this, just with more steps to disguise the underlying process.
It isn’t. People design a scene and then change and refine the prompt to add elements. Some part of it could be refreshing the same prompt, but that’s just like a photographer taking multiple photos of a scene they’ve directed to catch the right flutter of hair or a dress, or a creative director saying “give me three versions of X”.
Ready to get back to my original questions?