{"id":276,"date":"2022-05-03T02:50:28","date_gmt":"2022-05-03T02:50:28","guid":{"rendered":"https:\/\/www.rousingdata.com\/?p=276"},"modified":"2022-11-11T06:02:54","modified_gmt":"2022-11-11T06:02:54","slug":"clustering-video-game-attributes","status":"publish","type":"post","link":"https:\/\/daryleserrant.com\/clustering-video-game-attributes\/","title":{"rendered":"Clustering Video Game Attributes"},"content":{"rendered":"\n

In a previous blog post<\/a>, I walked through the creation of a simple recommender system that recommends video games to existing users on Steam. Since creating the recommender, my student and I have been exploring ways to improve it. One enhancement we’ve been looking at is reducing the recommender’s computation time by clustering video game attributes (tags, genres, and specs) into smaller, more manageable groups. In this post, I will describe how we used the K-means algorithm to do this.<\/p>\n\n\n\n

Improving the Recommender by Computing Probabilities<\/h2>\n\n\n\n

The current system uses 188 categorical attributes to recommend video games to existing users. The biggest disadvantage of this approach is the large amount of computation time needed to build the tables the recommender requires. The large number of categorical features also makes it hard to add numerical features (such as game price) to the system: the influence of the categorical features would outweigh any impact the numerical features could have on the recommendations. To address both issues, we will attempt to reduce the number of attributes the recommender uses by applying K-means clustering.<\/p>\n\n\n\n

The raw, unprocessed video game metadata contains 380+ attributes. Looking at the data for several video games, we observed that some attributes tend to appear together with others. This gave me the idea to group the attributes based on the likelihood that they appear together. We will do this by constructing a square matrix in which each entry holds the conditional probability that a video game has one attribute given that it has another. We then use this matrix to cluster the game attributes.<\/p>\n\n\n
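To make the idea concrete, here is a minimal sketch on a toy dataset. It assumes that the entry in row a, column b is P(b | a), estimated as the number of games having both attributes divided by the number of games having a. The `games` dictionary and the function names are hypothetical illustrations, not the actual Steam data or the code used in this post.

```python
# Hypothetical toy data: game title -> set of attributes (tags, genres, specs).
games = {
    "Game A": {"Action", "Multiplayer", "Shooter"},
    "Game B": {"Action", "Shooter"},
    "Game C": {"Puzzle", "Singleplayer"},
    "Game D": {"Action", "Singleplayer"},
}


def games_by_attribute(games):
    """Map each attribute to the set of games that have it."""
    index = {}
    for title, attrs in games.items():
        for attr in attrs:
            index.setdefault(attr, set()).add(title)
    return index


def conditional_probability_matrix(games):
    """Return (attrs, matrix) where matrix[i][j] estimates P(attrs[j] | attrs[i])."""
    index = games_by_attribute(games)
    attrs = sorted(index)
    matrix = [
        # Games having both attributes, divided by games having the row attribute.
        [len(index[a] & index[b]) / len(index[a]) for b in attrs]
        for a in attrs
    ]
    return attrs, matrix


attrs, matrix = conditional_probability_matrix(games)

# Print each row so the co-occurrence pattern of an attribute is visible.
for attr, row in zip(attrs, matrix):
    print(attr, [round(p, 2) for p in row])
```

Each row describes how often the other attributes show up alongside one attribute, so attributes with similar rows tend to co-occur with the same things. Under this assumption, those rows are what a clustering algorithm such as K-means would group.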

\n
\"\"<\/figure><\/div>\n\n\n

The code snippet below shows the construction of the probability matrix. We iterate through each game in the Steam games dataset and construct a dictionary containing the set of games that have each attribute.<\/p>\n\n\n\n

\n