DATA MINING CUP 2011

Scenario

Recommendation engines (REs) are increasingly used in e-commerce for product recommendations. Recommendation algorithms automatically calculate and recommend products on the basis of the product detail views opened by visitors to web shops. The aim is to maximise both user activity (the number of views opened) and success (sales, turnover). Developing powerful algorithms for REs is currently one of the most popular research areas in data mining.

In the scenario in question, the operator of a web shop would like to use a recommendation engine that maximises both activity and success, with success weighted higher. The task is therefore to select the best algorithm. Three types of transaction are considered for each web session: opening a product detail view, placing a product in the shopping basket and purchasing a product. A session typically takes the following course: the user browses the web shop, opening product detail views along the way. If the user likes a product, he places it directly in his shopping basket. At the end of the session, the user can open his basket and order the products he is interested in.

Task

The DMC 2011 competition consists of two tasks. These are assessed independently of each other.

The first task involves evaluating an algorithm statically, i.e. training it on historical transaction data (the training data). To make it possible to evaluate the prediction quality of the recommendations, the first transactions of each session in a test set are provided; these are the test data. The objective of the algorithm is to predict the remaining transactions of each session. The generated prediction file is sent to the prudsys DMC team. The predicted products are then compared against the actual remaining transactions of the sessions (the evaluation data). The team with the highest score on the evaluation data wins.
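The static setting of the first task can be illustrated with a minimal Python sketch. It is not the competition's reference method: the co-occurrence model, the function names (`train_cooccurrence`, `predict_remaining`, `hit_count`), the toy product IDs and the hit-count measure are all assumptions made for illustration, and the actual DMC scoring rule is not stated here.

```python
from collections import Counter, defaultdict
from itertools import permutations


def train_cooccurrence(sessions):
    """Count, for each product a, how often product b occurs in the same session."""
    co = defaultdict(Counter)
    for products in sessions:
        for a, b in permutations(set(products), 2):
            co[a][b] += 1
    return co


def predict_remaining(co, seen, k=3):
    """Return up to k unseen products, ranked by summed co-occurrence with `seen`."""
    seen = set(seen)
    scores = Counter()
    for a in seen:
        for b, n in co.get(a, {}).items():
            if b not in seen:
                scores[b] += n
    # Sort by descending score, then by product ID so ties are deterministic.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


def hit_count(predicted, actual):
    """Hypothetical quality measure: number of predicted products that really occurred."""
    return len(set(predicted) & set(actual))


# Toy training data: each session is the list of products it touched.
training_sessions = [["A", "B", "C"], ["A", "B"], ["B", "C"], ["A", "D"]]
model = train_cooccurrence(training_sessions)

# Test data: the first transactions of a session; evaluation data: the rest.
prediction = predict_remaining(model, seen=["A"], k=2)
score = hit_count(prediction, actual=["B"])
```

Here the model is fitted once on the training sessions and never updated, which is what distinguishes this static evaluation from the dynamic evaluation in the second task.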

The second task involves evaluating an algorithm dynamically; its implementation is sent to the prudsys DMC team. The objective is to apply the algorithm step by step to historical transaction data and to continuously predict the next products in a session.

Because the algorithm receives the transactions of each session one after another, it learns and predicts at the same time. The team with the highest score over all prediction steps wins.
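The predict-then-learn loop of the second task might be sketched as follows in Python. This is only an illustration under stated assumptions: the class `OnlineRecommender`, the function `run_stream`, the simple "which product came next" counts and the hit-based score are hypothetical and are not taken from the competition material.

```python
from collections import Counter, defaultdict


class OnlineRecommender:
    """Toy online model: counts which product directly followed which."""

    def __init__(self):
        self.follows = defaultdict(Counter)

    def predict(self, current, k=1):
        """Return up to k products most often seen right after `current`."""
        ranked = self.follows.get(current, Counter())
        return sorted(ranked, key=lambda p: (-ranked[p], p))[:k]

    def learn(self, previous, current):
        """Update the counts with one observed transition."""
        self.follows[previous][current] += 1


def run_stream(sessions, k=1):
    """Feed sessions one transaction at a time: predict first, then learn."""
    model = OnlineRecommender()
    hits = steps = 0
    for products in sessions:
        for current, actual_next in zip(products, products[1:]):
            predicted = model.predict(current, k)
            hits += actual_next in predicted  # assumed score: 1 per correct step
            steps += 1
            model.learn(current, actual_next)  # learn only after predicting
    return hits, steps


stream = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]]
hits, steps = run_stream(stream)
```

The model starts empty, so the earliest predictions miss; the score improves as more transitions have been observed. Predicting before learning at every step ensures the model never sees the transaction it is being scored on.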

Downloads

Task: DMC 2011 - Task (1 file, 25.44 MB)
Solution Task 1: DMC 2011 - Realclass Task 1 (1 file, 2.72 MB)
Solution Task 2: DMC 2011 - Realclass Task 2 (1 file, 2.49 MB)