Improving the Video Game Recommender

In my last post, we took the 380+ video game attributes that we extracted from the Steam video game dataset and wrote an algorithm to cluster them into 24 groups. In this post, we will use those clusters to build a new and improved video game recommender. If you haven't read my first and second posts on the video game recommender, please read them before continuing.

Step 1: Reconstructing the game features table

The game features table that we built in the first post contained hundreds of attributes. We will construct a much smaller table using the game attribute clusters.

Game Features Table

Each column in the table indicates how strongly one attribute cluster is represented in each game. The numbers are computed using the attribute-cluster assignments we obtained from K-means: for each game, we count the number of its attributes that belong to each cluster. The construction of the new game features table is described in the code snippet below.

https://gist.github.com/daryleserrant/729ad34f7e932c6bc3e0547b6572efc8
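For readers who want to see the counting step without opening the gist, here is a minimal sketch on toy data. It is not the code from the gist: the names `game_attributes`, `attribute_cluster`, and `build_game_features` are hypothetical, and it assumes the original table is a binary game-by-attribute matrix.

```python
import pandas as pd

# Hypothetical toy data: 1 means the game has the attribute.
game_attributes = pd.DataFrame(
    {
        "Action": [1, 0, 1],
        "Shooter": [1, 0, 0],
        "Puzzle": [0, 1, 0],
        "Indie": [0, 1, 1],
    },
    index=["Game A", "Game B", "Game C"],
)

# Hypothetical attribute -> cluster assignments (stand-in for the K-means output).
attribute_cluster = pd.Series({"Action": 0, "Shooter": 0, "Puzzle": 1, "Indie": 1})


def build_game_features(game_attributes, attribute_cluster):
    """Count, for each game, how many of its attributes fall in each cluster."""
    # Line the cluster labels up with the attribute columns.
    clusters = attribute_cluster.reindex(game_attributes.columns)
    # Transpose so attributes are rows, sum the rows within each cluster,
    # then transpose back to a game-by-cluster table.
    features = game_attributes.T.groupby(clusters.to_numpy()).sum().T
    features.columns = [f"cluster_{c}" for c in features.columns]
    return features


game_features = build_game_features(game_attributes, attribute_cluster)
print(game_features)
```

With 24 clusters instead of 2, the same function would turn the 380+ attribute columns into 24.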

Step 2: Reconstructing the user features table

Using the game features table we built in step 1, we will rebuild a much smaller user features table.

User Features Table

Similar to the game features table, each column in the user features table indicates the degree to which each user prefers games with attributes from that cluster. The numbers are computed for each user by retrieving the features of each game the user has played from the game features table and adding them up.

https://gist.github.com/daryleserrant/81308561d720373b6bd5cf8311a0d2b0
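The summing step can be sketched in a few lines of pandas. This is an illustration under assumptions, not the gist's code: the play history is assumed to be a table with hypothetical `user` and `game` columns, and each game is counted once per user.

```python
import pandas as pd

# Hypothetical game features table (games as rows, clusters as columns).
game_features = pd.DataFrame(
    {"cluster_0": [2, 0, 1], "cluster_1": [0, 2, 1]},
    index=["Game A", "Game B", "Game C"],
)

# Hypothetical play history: one row per (user, game) pair.
plays = pd.DataFrame(
    {
        "user": ["u1", "u1", "u2"],
        "game": ["Game A", "Game C", "Game B"],
    }
)


def build_user_features(plays, game_features):
    """Sum the cluster counts of every game each user has played."""
    # Assumption: a game counts once per user, so drop repeated rows.
    unique_plays = plays.drop_duplicates(subset=["user", "game"])
    # Attach each played game's feature row, matching on the game name.
    per_play = unique_plays.join(game_features, on="game")
    # Add the feature rows up per user.
    return per_play.groupby("user")[list(game_features.columns)].sum()


user_features = build_user_features(plays, game_features)
print(user_features)
```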

Step 3: Defining a new recommender function

With our new game and user feature tables in place, a new method for measuring similarity between users is in order. In our first recommender, we used the matching dissimilarity score. In our new recommender, we're going to use cosine similarity. Cosine similarity is a measure of similarity between two numerical vectors: the dot product of the two vectors divided by the product of their lengths (magnitudes).
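As a quick illustration of that definition, here is a small NumPy sketch. The function name `cosine_similarity` and the decision to return 0 for a zero-length vector are my assumptions for this example.

```python
import numpy as np


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        # Assumption: a user with an all-zero feature row is treated
        # as not similar to anyone, instead of dividing by zero.
        return 0.0
    return float(np.dot(a, b) / denom)


same_direction = cosine_similarity([1, 2], [2, 4])  # parallel vectors
orthogonal = cosine_similarity([1, 0], [0, 1])      # nothing in common
print(same_direction, orthogonal)
```

Because the score depends only on direction, a user who has played many games and a user who has played a few can still come out as similar if their preferences are spread across clusters in the same proportions.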

Let's suppose that we have a user that we want to generate recommendations for. We'll call the user in question u and the number of recommendations we'd like to generate x. We will use cosine similarity to find another user, v, whose preferences are the most similar to u's. We will then select x games from v's play history that have not been played by u and recommend them.

Here's the new recommendation procedure in code form.

https://gist.github.com/daryleserrant/1036526c1e4eaa9413d2d69bdacc4d13
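For a self-contained view of the procedure, here is a rough sketch on toy data. It is not the gist's code. The function name `recommend`, the `user` and `game` column names, and the choice to take the first x unplayed games in play-history order are all assumptions; it also assumes there is at least one other user to compare against.

```python
import numpy as np
import pandas as pd

# Hypothetical user features table (users as rows, clusters as columns).
user_features = pd.DataFrame(
    {"cluster_0": [3, 0, 2], "cluster_1": [1, 2, 1]},
    index=["u1", "u2", "u3"],
)

# Hypothetical play history.
plays = pd.DataFrame(
    {
        "user": ["u1", "u1", "u2", "u3", "u3"],
        "game": ["Game A", "Game C", "Game B", "Game A", "Game D"],
    }
)


def recommend(u, x, user_features, plays):
    """Return the most similar user v and up to x of v's games unplayed by u."""
    target = user_features.loc[u].to_numpy(dtype=float)
    others = user_features.drop(index=u)
    matrix = others.to_numpy(dtype=float)

    # Cosine similarity between u and every other user; rows with a zero
    # denominator keep a similarity of 0.
    denom = np.linalg.norm(matrix, axis=1) * np.linalg.norm(target)
    sims = np.divide(
        matrix @ target, denom, out=np.zeros(len(others)), where=denom > 0
    )
    v = others.index[int(np.argmax(sims))]

    # Keep v's games that u has not played, in play-history order.
    played_by_u = set(plays.loc[plays["user"] == u, "game"])
    v_games = plays.loc[plays["user"] == v, "game"].drop_duplicates()
    candidates = [g for g in v_games if g not in played_by_u]
    return v, candidates[:x]


nearest_user, recommendations = recommend("u1", 2, user_features, plays)
print(nearest_user, recommendations)
```

Note that v may have fewer than x games that u hasn't played, in which case this sketch returns a shorter list.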

That's all folks!

You can find the code for this post here. Until next time!