The rapid development of artificial intelligence (AI) is making machines more capable. Because they can learn autonomously from input data, machines are opening up new ways to support humans in carrying out increasingly complex tasks.

One very promising approach that is already delivering impressive results in many areas is the generative adversarial network (GAN). GANs are primarily used to generate images, but they can also be used to create text automatically. But what exactly are GANs? How do they work? And which applications are they suited to?

What is a GAN?

Before we explore what GANs can do for us, let's look at what generative adversarial networks actually are.

A GAN is a machine learning system introduced in 2014 by Ian Goodfellow and his team. Its task is to generate new creations of its own based on a set of real example data. The results can be deceptively realistic, to the point where it becomes hard to tell that a computer-generated image was not created by human hands. To achieve this, a GAN uses two neural networks that compete with each other.

The generator network is tasked with creating a fake. Suppose the training data consists of photos of women. Over the course of training, the generator learns which properties the originals have in common. In a standard GAN it does so indirectly, through the feedback of the second network, while its own input is random noise. Based on what it has learned, it then creates its own photo. The new picture isn't a duplicate of one piece of source data, but an entirely new image that is similar in nature: in our example, the photo of a (non-existent) woman.

Both the real data and the generated data are passed to the partner network. The task of this discriminator network is to check everything it receives and determine whether it is real or fake. An image can be deemed fake not only because it deviates too far from the real data, but also because it looks too perfect. If the generator simply averaged all the data to produce a new image, the result would be easy to identify as machine-generated. The discriminator therefore also filters out results that don't appear natural.

Both networks try to outdo each other. If the discriminator network recognizes a generated sample as fake, it rejects it. In this case, the generator network wasn't good enough and needs to keep learning. At the same time, the discriminator also learns. Because the two networks train against each other, this is referred to as adversarial training, and since both are typically deep neural networks, GANs are counted among deep learning methods. The generator attempts to create samples that appear so genuine that the discriminator classifies them as real. The discriminator, in turn, tries to examine and understand the real examples so closely that fake samples have no chance of being classified as real.
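This tug-of-war can be written as a single objective. The formula below is the minimax value function from the original 2014 GAN paper. Here D(x) is the probability the discriminator assigns to a sample being real, G(z) is the sample the generator produces from random noise z, and p_data and p_z denote the distributions of the real data and of the noise:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$

The discriminator tries to make this value as large as possible by scoring real samples close to 1 and fakes close to 0. The generator tries to make it as small as possible by producing fakes that the discriminator scores close to 1.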

How do GANs work?

Like any other machine learning model, a GAN has to be trained. The process can be broken down into six steps:

  1. Problem definition: In the first step, the problem the system should solve has to be defined. Here, the developers collect the real data that the system will learn from.
  2. Architecture: Different problems call for different generative adversarial networks. For this reason, the GAN has to be given an architecture that suits the application.
  3. First discriminator training: Actual training begins during this step. The generator is frozen, while the discriminator analyzes only the real data and learns to understand it.
  4. First generator training: Now the discriminator is frozen and the generator starts to produce falsified data.
  5. Second discriminator training: The discriminator network is now fed the new, falsified data from the generator and has to decide which samples are real and which are fake.
  6. Second generator training: The generator network is improved further using the result of the second discriminator training stage. It gets to know the weaknesses of the discriminator and attempts to exploit them in order to generate even more realistic fakes.
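The alternation in steps 3 to 6 can be sketched in a few lines of plain Python. This is a deliberately tiny toy example, not a realistic GAN. Everything in it is an assumption made for illustration: the "real data" are numbers drawn from a normal distribution around 4, the generator is the linear function G(z) = a * z + b, the discriminator is a single logistic unit D(x) = sigmoid(w * x + c), and the gradients are written out by hand. The function name `train_toy_gan` and its parameters are hypothetical.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=16, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a * z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        # Discriminator update: the generator is frozen and only supplies fakes.
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        fake = [a * rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        grad_w = grad_c = 0.0
        for x, g in zip(real, fake):
            d_real = sigmoid(w * x + c)
            d_fake = sigmoid(w * g + c)
            # Gradient of the loss -log D(x) - log(1 - D(g)) w.r.t. w and c.
            grad_w += (d_real - 1.0) * x + d_fake * g
            grad_c += (d_real - 1.0) + d_fake
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator update: the discriminator is frozen and only gives feedback.
        grad_a = grad_b = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            d_fake = sigmoid(w * (a * z + b) + c)
            # Gradient of the loss -log D(G(z)) w.r.t. a and b.
            grad_a += (d_fake - 1.0) * w * z
            grad_b += (d_fake - 1.0) * w
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: G(z) = {a:.3f} * z + {b:.3f}")
    print(f"discriminator: D(x) = sigmoid({w:.3f} * x + {c:.3f})")
```

In this setup the generator's offset b is pushed toward the region where the real data lies, because that is where the discriminator scores samples as more likely to be real. With such a simple discriminator the two networks tend to circle around each other rather than settle, which is a small-scale preview of the balance problem described in the next section.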

Both networks develop as part of this competition and become better and more efficient in the process. The generator network learns to produce increasingly realistic samples. The discriminator network learns to identify even seemingly real samples as fake.

What challenges does the system need to overcome?

As with almost any technology, the developers of GANs face a number of challenges that have to be solved for training to run smoothly.

Balanced competition

As explained above, GANs are based on the competition between two neural networks. This can only work if both networks are roughly equally strong. If one of the two is clearly superior, training breaks down. For instance, if the generator is too effective, the discriminator can no longer tell fakes apart and classifies all falsified data as real, so its feedback becomes uninformative. Conversely, if the discriminator has the upper hand, it confidently classifies everything from the generator as fake, and the generator receives hardly any useful signal for improving. In either case, neither network can develop further.
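The case of an overly strong discriminator can be made concrete with a small calculation. The sketch below assumes a discriminator that outputs D = sigmoid(s) for a fake sample, where s is its raw score (the logit). It compares the learning signal the generator receives from the original minimax loss, log(1 - D), with the signal from the alternative loss -log(D), which the original GAN paper suggests for exactly this reason. The function names are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def saturating_grad(s):
    """Derivative of log(1 - sigmoid(s)) with respect to the logit s."""
    return -sigmoid(s)


def non_saturating_grad(s):
    """Derivative of -log(sigmoid(s)) with respect to the logit s."""
    return sigmoid(s) - 1.0


# A very negative logit means the discriminator is certain the sample is fake.
for s in (0.0, -5.0, -10.0):
    print(f"logit={s:6.1f}  "
          f"minimax loss gradient={saturating_grad(s):.6f}  "
          f"alternative loss gradient={non_saturating_grad(s):.6f}")
```

The more confident the discriminator is that a sample is fake, the closer the gradient of the minimax loss gets to zero, so the generator barely learns anything. The alternative loss keeps the gradient close to -1 in the same situation. This softens the problem but does not remove the need to keep the two networks in balance.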

Correctly understanding objects

Generative adversarial networks often have problems correctly recognizing and understanding objects. This is particularly true for images. Here's an example: A real image shows two cats, each with two eyes. If the generator doesn't understand the overall structure of the image and where its elements belong, it might generate an image of one cat with four eyes instead. GANs can also be caught out by perspective and fail to understand that two images show the same subject from different angles.

Where are GANs used?

Generative adversarial networks attracted particular attention, even beyond the field of computer science, after the artist collective Obvious used the technology to generate a work of art. The painting was sold at auction for $432,500. But GANs can also deliver astonishing results outside of art.

Video prediction

Based on the individual frames of a video, GANs can predict how it continues and thereby extend the footage autonomously beyond its end. They take all elements of the video into account, including motions and actions, as well as background changes like rain or fog.

Image generation using text

GANs can generate images based on a description. For example, they can use a script to independently generate a storyboard.

Generation of complex objects

Generative adversarial networks can automatically transform even simple sketches into complex three-dimensional objects in very little time. A simple drawing of a tree, for example, can be turned into a highly detailed rendering with tiny details like leaves fluttering in the wind and a swaying trunk.

Improving image details

GANs can add new details to an image that has a low resolution or is missing picture elements. To do so, they draw on information learned from similar images to fill in the missing image information.

Developing new products

Some companies are already experimenting with GANs in product development and using the system to create completely new designs and product lines.

Product text generation

GANs can also handle text creation and are already used to generate product texts, which play a major role in the purchase decisions of consumers. With GANs, these descriptions can not only be created quickly. The networks can also analyze which product texts were most successful in the past and use this information to compose similar texts.

Generative adversarial networks are already being put to use in all of these areas, and companies and developers are constantly working on new applications. In the near future, GANs will likely have a major influence on many aspects of our lives and work.
