Generative AI

Generative AI models support the creation of novel information based on the data they are trained on

Generative Models

Generative AI ("GenAI") models are capable of creating new, novel information, such as text, images, audio, and video, based on the data they are trained on. Generative models work by learning a "probability distribution" from the data they are trained on, and can generate new samples from that distribution.

A "model" is simply a computer program, based on some data, which can be used for a particular purpose:

  • Traditional AI models have historically been used for pattern recognition (e.g. spam filters), anomaly detection (e.g. safety alerts), decision-making (e.g. product recommendations), or classification (e.g. facial recognition).
  • Generative AI models, in contrast, are designed at their core to produce novel data.

You may also hear other kinds of models discussed:

  • Discriminative models seek to tell different kinds of data apart. Typically these are traditional AI models used for classification tasks, such as distinguishing faces, or kinds of animals (e.g. dogs from cats); a minimal sketch follows this list.
  • Simulation models, on the other hand, seek to replicate the dynamics of real-world environments, systems, or processes in virtual form. They attempt to realistically recreate the real world, often to support experimentation within their virtual environments, to help humans better understand the systems being modeled, or to support the training of AI models (e.g. through Deep Reinforcement Learning). Building a simulation model takes time and data, but once it exists, experimentation within it is often much cheaper and faster than in offline, physical settings.
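
For contrast with the generative sketch above, here is a minimal discriminative model: a hypothetical two-feature "cats vs. dogs" classifier built with scikit-learn. The data and features are invented purely for illustration:

```python
# A minimal sketch of a discriminative model: rather than generating new
# data, it learns a decision boundary between classes. The two-feature
# "cats vs. dogs" data here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(seed=0)

# Synthetic features (e.g. ear length, snout length) for two classes
cats = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(100, 2))
dogs = rng.normal(loc=[4.0, 3.0], scale=0.5, size=(100, 2))
X = np.vstack([cats, dogs])
y = np.array([0] * 100 + [1] * 100)  # 0 = cat, 1 = dog

# The model learns to discriminate between the classes...
clf = LogisticRegression().fit(X, y)

# ...and can then classify unseen examples, but cannot generate new ones
print(clf.predict([[2.1, 1.2], [3.9, 2.8]]))  # -> [0 1]
```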

Types of GenAI

Transformer Models

Transformers are primarily used for text generation and are at the core of almost all Large Language Models (LLMs). They are well-suited to processing sequential, ordered information (like text, code, DNA, or time series data), exhibiting a strong ability to capture dependencies across an entire sequence.
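
To make "dependencies across a sequence" concrete, here is a minimal sketch of the self-attention operation at the heart of transformers, written in PyTorch with arbitrary illustrative dimensions (it is not any particular LLM's architecture):

```python
# A minimal sketch of self-attention: every token's representation is
# updated using information from every other token in the sequence.
import torch
import torch.nn.functional as F

seq_len, d_model = 6, 16           # e.g. 6 tokens, 16-dim embeddings
x = torch.randn(seq_len, d_model)  # token embeddings for one sequence

# Learned projections (randomly initialized here) for queries/keys/values
W_q = torch.randn(d_model, d_model)
W_k = torch.randn(d_model, d_model)
W_v = torch.randn(d_model, d_model)

q, k, v = x @ W_q, x @ W_k, x @ W_v

# Each token attends to every other token; this is how transformers
# capture dependencies across a whole sequence, in parallel.
scores = (q @ k.T) / (d_model ** 0.5)
weights = F.softmax(scores, dim=-1)  # each row sums to 1
output = weights @ v                 # context-aware token representations
print(output.shape)  # torch.Size([6, 16])
```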

Diffusion Models

Diffusion models are most commonly used for image generation. They are trained by gradually adding "noise" to a sample until it becomes unrecognizable, and then learning to reverse the process to reconstruct the original sample. Once trained, diffusion models can transform pure noise into realistic data resembling their original training dataset. Typically this is a multi-step process, involving many "passes" through which noise is iteratively transformed into new data.
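
A minimal sketch of the two halves of that process, using NumPy. The "denoiser" below is a placeholder standing in for a trained neural network, and the noise schedule and step count are arbitrary:

```python
# A minimal sketch of the diffusion idea: corrupt data with noise step by
# step (the "forward" process used during training), then iteratively
# denoise (the learned "reverse" process used for generation).
import numpy as np

rng = np.random.default_rng(seed=0)
steps, beta = 100, 0.02  # illustrative step count and noise schedule

x = rng.normal(size=64)  # a "clean" data sample (e.g. a tiny flattened image)

# Forward process: gradually add noise until the sample is unrecognizable
noisy = x.copy()
for _ in range(steps):
    noisy = np.sqrt(1 - beta) * noisy + np.sqrt(beta) * rng.normal(size=64)

def predicted_noise(sample, t):
    """Placeholder for a trained denoising network (returns no correction)."""
    return np.zeros_like(sample)

# Reverse process (conceptual): a trained model predicts and removes a
# little noise per pass; many passes turn pure noise into new data.
sample = rng.normal(size=64)  # start from pure noise
for t in reversed(range(steps)):
    sample = sample - predicted_noise(sample, t)  # one denoising pass
```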

Autoregressive Models

Newer image generation models, such as OpenAI's 4o image generation model, are autoregressive rather than diffusion-based. This approach predicts elements within an image one at a time, creating a tighter connection between visual elements and language concepts, giving models a degree of layout awareness, and supporting conversational-style image editing.
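
The core autoregressive loop, generating one element at a time conditioned on everything generated so far, can be sketched with a toy word-level model. The vocabulary and probability table below are invented; a real model learns these from data, and image models predict image tokens rather than words:

```python
# A minimal sketch of autoregressive generation: produce a sequence one
# element at a time, each prediction conditioned on what came before.
import numpy as np

rng = np.random.default_rng(seed=0)
vocab = ["sky", "blue", "grass", "green", "is"]

# Toy conditional distribution P(next | previous); a real model learns this
bigram_probs = {
    "sky":   [0.0, 0.0, 0.0, 0.0, 1.0],  # "sky"  -> "is"
    "is":    [0.0, 0.5, 0.0, 0.5, 0.0],  # "is"   -> "blue" or "green"
    "blue":  [0.0, 0.0, 1.0, 0.0, 0.0],  # "blue" -> "grass"
    "green": [1.0, 0.0, 0.0, 0.0, 0.0],  # "green"-> "sky"
    "grass": [0.0, 0.0, 0.0, 0.0, 1.0],  # "grass"-> "is"
}

sequence = ["sky"]
for _ in range(4):
    probs = bigram_probs[sequence[-1]]
    next_token = rng.choice(vocab, p=probs)  # sample the next element
    sequence.append(next_token)
print(" ".join(sequence))  # e.g. "sky is blue grass is"
```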

Generative Adversarial Networks (GANs)

GANs utilize "generator" and "discriminator" neural networks in competition, using each neural network to generate more and more increasingly authentic-seeming new data, based on a given training dataset. For example, new images or new music may be created from a training database containing examples to be replicated. Generators learn to create "plausible" data while discriminators learn to distinguish fake data from real data.

Variational Autoencoders (VAEs)

VAEs learn the patterns or "structure" underlying data, allowing them to be used to generate new, similar data given some input. They are useful for synthetic data generation, anomaly detection, and certain kinds of data compression.
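
A minimal sketch of the VAE structure in PyTorch, with single linear layers standing in for real encoder/decoder networks and arbitrary dimensions:

```python
# A minimal sketch of a VAE: an encoder compresses data into a latent
# distribution, and a decoder reconstructs (or generates) data from
# latent samples.
import torch
import torch.nn as nn

data_dim, latent_dim = 32, 4
encoder = nn.Linear(data_dim, latent_dim * 2)  # outputs mean and log-variance
decoder = nn.Linear(latent_dim, data_dim)

x = torch.randn(16, data_dim)  # stand-in for a batch of real data

# Encode: each input becomes a distribution over the latent space
mu, log_var = encoder(x).chunk(2, dim=-1)

# Reparameterization trick: sample latents in a differentiable way
z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)

reconstruction = decoder(z)  # compared against x during training

# Generation after training: decode latents sampled from the prior
new_data = decoder(torch.randn(5, latent_dim))
print(new_data.shape)  # torch.Size([5, 32])
```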

Recurrent Neural Networks (RNNs)

Like transformer models, RNNs are a kind of neural network designed to process "sequential" information whose order matters to its meaning, for example time series data or a sentence. Unlike traditional artificial neural networks, RNNs have a "memory" which retains some information from previous inputs, influencing the processing of subsequent inputs. Historically, their primary utility has been in tasks such as time series forecasting, speech recognition, and language translation. However, modern transformer-based approaches now typically outperform RNNs in Natural Language Processing (NLP) tasks, due to their ability to process sequences in parallel and capture long-range dependencies more effectively.
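
That "memory" can be seen in a minimal PyTorch sketch: a hidden state is carried forward and updated one sequence step at a time (the dimensions are illustrative):

```python
# A minimal sketch of the recurrence that gives RNNs their "memory":
# a hidden state is updated at each step and carried to the next.
import torch
import torch.nn as nn

input_dim, hidden_dim, seq_len = 8, 16, 10
cell = nn.RNNCell(input_dim, hidden_dim)

sequence = torch.randn(seq_len, input_dim)  # e.g. ten time-series readings
hidden = torch.zeros(1, hidden_dim)         # the "memory", initially empty

# Unlike a transformer, the sequence must be processed one step at a time
for step in sequence:
    hidden = cell(step.unsqueeze(0), hidden)  # new memory depends on old

print(hidden.shape)  # torch.Size([1, 16]): a summary of the whole sequence
```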
