Game Character Dataset

Create a PyTorch dataset

Procedural Generation

This step is unnecessary if you already have the images you want to train on. Since we are constructing the dataset from scratch, we generate images procedurally from a sample space of source images. We start with a set of images for each action that a character, and each type of that character, can perform. For example, if we take the character Pikachu, its types are its different evolutions.

Types of Pikachu

Each type of Pikachu performs an action in its own way. For example, they may walk or attack differently, as depicted below.

Different actions performed by a type of Pikachu

Each image shows two characters interacting with each other. One is the actor, who instigates the interaction, and the other is the reactor, who reacts to the actor. Hence, we procedurally generate images for every combination of character types as actor and reactor, each performing a different set of actions. Our dataset has two characters, Satyr and Golem. Here are a few images from our dataset.
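As a minimal sketch of the enumeration step, the snippet below lists every actor/reactor combination with `itertools.product`. Only the character names Satyr and Golem come from the text. The type, action, and reaction values are hypothetical placeholders, so the resulting count will not match the real dataset, and the step that renders each combination into an image is left out.

```python
from itertools import product

# Character names come from the text; everything else is a placeholder.
CHARACTERS = ["Satyr", "Golem"]
TYPES = [1, 2, 3]              # hypothetical type ids
ACTIONS = ["idle", "attack"]   # hypothetical actor actions
REACTIONS = ["idle", "hurt"]   # hypothetical reactor reactions


def enumerate_scenes():
    """Yield one label dictionary per actor/reactor combination."""
    combos = product(CHARACTERS, TYPES, ACTIONS, CHARACTERS, TYPES, REACTIONS)
    for a_char, a_type, action, r_char, r_type, reaction in combos:
        yield {
            "actor_character": a_char,
            "actor_type": a_type,
            "actor_action": action,
            "reactor_character": r_char,
            "reactor_type": r_type,
            "reactor_reaction": reaction,
        }


scenes = list(enumerate_scenes())
print(len(scenes))  # 2 * 3 * 2 * 2 * 3 * 2 = 144
```

Each dictionary could then drive an image compositing step that places the actor sprite and the reactor sprite on a shared canvas.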

A Satyr attacking a Satyr variant, which gets hurt
Satyr and Golem attacking each other

Dataset

To create a dataset compatible with PyTorch, we need to create a dataset class and override a few methods. In this section, we'll go over this in detail. The new class subclasses PyTorch's Dataset class and overrides the __getitem__ and __len__ methods.

The __getitem__ method takes an index and returns the image and the labels at that index. The __len__ method returns the number of samples in the dataset.
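A minimal sketch of such a class might look as follows. The class name `GameCharacterDataset`, the in-memory tensors, and the image size are assumptions made for illustration; a real implementation would more likely load the generated images from disk inside `__getitem__`.

```python
import torch
from torch.utils.data import Dataset


class GameCharacterDataset(Dataset):
    """Hypothetical dataset wrapping pre-loaded image tensors and labels."""

    def __init__(self, images, labels):
        # images: float tensor of shape (N, C, H, W)
        # labels: integer tensor of shape (N, 6), one column per class label
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.images)

    def __getitem__(self, index):
        # Return the image and its labels at the given index.
        return self.images[index], self.labels[index]


# Random stand-in data, used only to show the interface.
images = torch.rand(8, 3, 64, 64)
labels = torch.randint(0, 2, (8, 6))
dataset = GameCharacterDataset(images, labels)

image, label = dataset[0]
print(len(dataset), tuple(image.shape), tuple(label.shape))
```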

DataLoader

A DataLoader batches the dataset. Without batching, the weights would be updated only once per pass over the entire dataset, so with a huge dataset they would change very slowly during training. We choose an appropriate batch size so that backpropagation and a weight update take place after every batch. The batch size is usually much smaller than the number of samples in the dataset.

We can create separate data loaders for training and testing. More info on how to create a custom dataset in PyTorch can be found here.
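The sketch below illustrates batching with separate training and testing loaders. The stand-in tensors, the 80/20 split, and the batch size of 16 are assumptions chosen for the example, not values from the original project.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Random stand-in data: 100 images with 6 integer labels each.
images = torch.rand(100, 3, 64, 64)
labels = torch.randint(0, 2, (100, 6))
dataset = TensorDataset(images, labels)

# Assumed 80/20 split; the seeded generator makes the split reproducible.
generator = torch.Generator().manual_seed(0)
train_set, test_set = random_split(dataset, [80, 20], generator=generator)

# Shuffle only the training data.
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
test_loader = DataLoader(test_set, batch_size=16, shuffle=False)

for batch_images, batch_labels in train_loader:
    # A training step (forward pass, loss, backward pass, optimizer step)
    # would run here, once per batch.
    print(tuple(batch_images.shape), tuple(batch_labels.shape))
    break

print(len(train_loader), len(test_loader))  # 5 2
```

With 80 training samples and a batch size of 16, one epoch performs five weight updates instead of one.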

Final Dataset

We end up with 432 images, and this number is increased further using data augmentation techniques. Every image carries the following class labels.

  1. Actor Character

  2. Reactor Character

  3. Actor Type

  4. Reactor Type

  5. Actor Action

  6. Reactor Reaction
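One way to turn these six labels into a training target is to map each one to an integer class index. The sketch below is an illustration only: the `SceneLabels` and `encode` names and the type, action, and reaction vocabularies are hypothetical, and only the character names come from the text.

```python
from typing import List, NamedTuple

CHARACTERS = ["Satyr", "Golem"]          # from the text
TYPES = ["type_1", "type_2", "type_3"]   # hypothetical
ACTIONS = ["idle", "attack"]             # hypothetical
REACTIONS = ["idle", "hurt"]             # hypothetical


class SceneLabels(NamedTuple):
    actor_character: str
    reactor_character: str
    actor_type: str
    reactor_type: str
    actor_action: str
    reactor_reaction: str


def encode(labels: SceneLabels) -> List[int]:
    """Map each of the six labels to its index in the matching vocabulary."""
    return [
        CHARACTERS.index(labels.actor_character),
        CHARACTERS.index(labels.reactor_character),
        TYPES.index(labels.actor_type),
        TYPES.index(labels.reactor_type),
        ACTIONS.index(labels.actor_action),
        REACTIONS.index(labels.reactor_reaction),
    ]


example = SceneLabels("Satyr", "Golem", "type_1", "type_2", "attack", "hurt")
print(encode(example))  # [0, 1, 0, 1, 1, 1]
```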
