What is generative AI and how does it work?

Calvin Wankhede / Android Authority

If you’ve read about the buzz around chatbots like ChatGPT and image generators like Midjourney, you may have come across the term generative AI. The term is most commonly used to describe modern artificial intelligence systems that can mimic humans and perform complex tasks in seconds. Generative AI is especially impressive at creative tasks like drawing and writing poetry, which computers have struggled with in the past. But what has caused the sudden explosion of generative AI and how does the technology work? Here’s everything you need to know.

What is generative AI?

Bing Image Creator on a phone showing a single image of a blue AI creature with orange eyes in front of a screen of zeros and ones

Rita El Khoury / Android Authority

Generative AI is a collective term used to describe computer programs that can generate text, images, videos and audio on their own.

Until now, most AI systems have not been very creative and would produce much worse results than a human. However, that is no longer the case with generative AI. For example, you can ask a generative AI tool like Bing Image Creator to create a photo-realistic image of a “cute blue AI creature with orange eyes” and it will return the results you see above. The tool in question was not explicitly taught or trained to produce this image, but it delivered an impressive result nonetheless.

Generative AI can create text and art in an instant.

Generative AI tools have become increasingly capable, with new developments arriving every few months. An image made with a recent AI image generator even managed to fool experts and win a prestigious photo contest. Similarly, several AI-generated images have gone viral on social media, including some with a political agenda.

So whether or not you plan on using generative AI yourself, it’s important to know that these tools exist and what their limitations are. Fortunately, we have not yet reached the point where they are perfect. In fact, they are prone to making glaring mistakes. This means that with the right information and training, you can differentiate between real and AI-generated content.

How does generative AI work?

Generative AI falls under the category of machine learning, a broad term for computer algorithms that learn patterns from large amounts of data instead of following hand-written rules. These algorithms are designed to mimic the way humans perform tasks.

The first step is extracting patterns from existing data. If you want an AI that can generate new faces, you feed it a dataset of face images. With enough training, the algorithm learns what a face looks like, including common features such as a nose, eyes, ears, and lips. From there, it can start working on smaller details like expressions, facial hair, and skin tones.
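The learn-then-generate loop described above can be illustrated with a deliberately tiny sketch. It is not how image generators are built: it assumes each face is reduced to two made-up measurements, learns an average and spread for each one, and then samples a new, unseen combination. The names `fit`, `generate`, and `training_faces` are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def fit(samples):
    """Learn a mean and standard deviation for each feature column."""
    columns = list(zip(*samples))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def generate(model, rng):
    """Draw one new sample from the learned per-feature distributions."""
    return [rng.gauss(mean, std) for mean, std in model]


# Hypothetical training data: (eye spacing, nose length) in centimetres.
training_faces = [
    (6.1, 4.9),
    (6.3, 5.2),
    (5.9, 5.0),
    (6.4, 5.4),
    (6.0, 4.8),
]

model = fit(training_faces)       # step 1: extract patterns from existing data
rng = random.Random(0)            # fixed seed so the run is repeatable
new_face = generate(model, rng)   # step 2: produce something not in the dataset
print(new_face)
```

The generated values stay close to the training data because the model only knows what it has seen. With fewer or less varied examples, the learned averages become unreliable, which is the toy equivalent of an undertrained model producing unconvincing faces.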

Generative AI can make glaring mistakes, but you have to look closely.

Without sufficient training, the machine learning model in our example will not produce results that resemble a human face. In fact, this issue currently affects AI image generators such as Midjourney. Experts were able to quickly identify fake images of Pope Francis by carefully examining the fingers visible in the image. Since photos of people holding objects often don’t show full fingers, generative AI algorithms can struggle to extract enough information about hands from the training data.

Transformers and reinforcement learning

Many of the modern generative AI tools you may have heard of, including ChatGPT, rely on the Transformer architecture. Transformers allow the algorithm to weigh how strongly each part of the input relates to every other part, a mechanism known as attention. In a large language model like GPT-3, for example, these relationships are used to predict which word is likely to appear next.
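To make "focusing on relationships within the data" a little more concrete, here is a minimal sketch of scaled dot-product attention, the core operation inside a Transformer, written in plain Python. The two-dimensional vectors are invented for illustration, and a real model would learn them and run many such operations in parallel, so treat this as a simplified picture rather than a description of how GPT-3 is implemented.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Similarity between the current word and each earlier word.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to those weights.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical 2-dimensional representations of three earlier words.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 1.0]  # representation of the current word

weights, output = attention(query, keys, values)
print(weights)  # the third word receives the largest weight
print(output)
```

The third key is the most similar to the query, so it gets the largest share of attention. In a language model, the blended output would then feed into further layers that eventually score every candidate next word.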

Reinforcement learning is another common technique used in generative AI. Simply put, human raters score a model’s output so that bad responses are filtered out and the algorithm is nudged to respond in a certain way. Thanks to a public research paper on the LaMDA language model, we know that Google hired part-time workers to provide this kind of feedback. Over time, their feedback helped the model deliver high-quality and actionable answers to user prompts.
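The idea of steering a model with human scores can be sketched with a toy example. Production systems typically train a separate reward model and then update the network itself, so this is only an illustration of the feedback loop, not of any company’s pipeline. The response styles, the 0-to-1 rating scale, and the `update_preferences` helper are all assumptions made up for this sketch.

```python
from collections import defaultdict


def update_preferences(preferences, ratings, learning_rate=0.5):
    """Nudge each response style's weight toward its average human score."""
    by_style = defaultdict(list)
    for style, score in ratings:
        by_style[style].append(score)

    updated = dict(preferences)
    for style, scores in by_style.items():
        average = sum(scores) / len(scores)
        current = updated.get(style, 0.0)
        updated[style] = current + learning_rate * (average - current)
    return updated


def best_style(preferences):
    """Return the style the system currently prefers."""
    return max(preferences, key=preferences.get)


# Every style starts out equally likely.
preferences = {"vague": 0.5, "detailed": 0.5, "rude": 0.5}

# Hypothetical human ratings on a 0 (bad) to 1 (good) scale.
ratings = [
    ("vague", 0.4),
    ("detailed", 0.9),
    ("rude", 0.0),
    ("detailed", 1.0),
    ("rude", 0.1),
]

preferences = update_preferences(preferences, ratings)
print(best_style(preferences))  # detailed
```

After one round of feedback, the highly rated style pulls ahead and the poorly rated one falls behind. Repeating the loop with fresh ratings is what gradually shapes the kinds of answers a chatbot gives.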

What are the advantages and limitations of generative AI?

Stock photo of Google Bard website on Phone 7

Edgar Cervantes / Android Authority

As with any new technology, we will undoubtedly see generative AI used in both creative and malicious ways. Let’s start with the benefits of generative AI:

  • Less manual work: In tasks that involve a lot of repetition, generative AI can lighten the load with little to no effort. For example, computer code contains a lot of boilerplate text. A developer can automate most of the initial steps using a chatbot.
  • Increased efficiency: Computers can process large amounts of information significantly faster than a human. A language model can quickly summarize a long document or research paper and answer questions that require critical thinking.
  • Human-like decision-making: Generative AI can often handle new and unseen scenarios well, which means it can also be useful for decision-making. For example, GPT-4 can already pass standardized tests designed for college students and solve complex math problems.

As promising as generative AI tools are, they also have numerous drawbacks. We already have a dedicated post about the dangers of AI, but here’s a quick recap:

  • Bias: As mentioned earlier, generative AI tools only perform well after sufficient training. Sadly, the endless variation of the real world makes an unbiased or perfect AI practically unattainable for now. For example, an AI designed to screen job applicants may inadvertently favor certain races or genders because of biases in its training data.
  • Malicious acts: From amateur programmers using ChatGPT to generate malware to social media users creating deepfake images of politicians, generative AI tools can harm or mislead the general population with very little effort.
  • Job loss: Generative AI has the potential to make some jobs obsolete or at least reduce the demand for staff. This is especially true in the art industry, where a single text-based prompt can produce images almost instantly. A trained human can then spend only a short amount of time fine-tuning the AI-generated art rather than creating it from scratch.

What are some examples of generative AI?

We have already discussed some examples of generative AI in this article. But we can also go a step further and group them based on their role.

  1. Text and dialogue: Chatbots such as ChatGPT, Bing Chat, and Google Bard fall under this category. They are trained and refined to handle a back-and-forth conversation, making them perfect for tasks like research and customer support.
  2. Image and video: AI image generators like Midjourney, DALL-E and Stable Diffusion can turn a few words into art. They can also work with existing images to replace backgrounds, add or blend elements, and create scaled-up copies of low-quality inputs.
  3. Speech and sound: Companies like Google have been working on using generative AI to synthesize speech. You may already be familiar with the WaveNet text-to-speech model, as it is used for the Google Assistant. That’s not all: other generative AI models like OpenAI Jukebox can also create music with instruments and vocals in specific genres and styles.
  4. Code: What if computers could write their own programs? We’re not quite there yet, but programmers can already use an AI companion like GitHub Copilot or OpenAI Codex to speed up their workflows.

It’s worth noting that most of these generative AI tools didn’t even exist a few years ago. With breakthroughs appearing seemingly every other week, it’s impossible to predict what the future will bring.
