The year 2014 presented participants with yet another tricky data mining task. The topic this time was “Forecasting returns”. Returns are a very significant cost factor for online retailers: on average, half of all customer orders are returned. The topic will take on even greater importance with the introduction of the new EU consumer protection directives in June 2014. A lower returns rate will therefore become a major competitive advantage in online retailing. On the basis of historical purchase data from an online shop, participants had to learn a model that, given new purchase data from the shop, predicts the probability that a given purchase will result in a return. For this purpose, the historical data contained purchase and shipping data as well as various product and customer attributes. For the historical data, the information “return yes/no” was also known.
For the task, one year of historical data is available from which a model for predicting returns can be learned. For the purchases of one month, it must be assessed whether each item will be returned or not. To this end, a prediction is to be made for each order item: the higher the value, the more probable the return. The error with respect to the actual return outcome of the order item should be as small as possible.
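The setup described above can be illustrated with a minimal sketch. It is not the method used by any participant: it assumes a very simple baseline (a smoothed historical return rate per item) and assumes that the error is measured as the sum of absolute differences between prediction and outcome, which the text does not specify. The field names `item_id` and `returned` and all function names are hypothetical.

```python
from collections import defaultdict


def fit_return_rates(history, key="item_id", prior_weight=5.0):
    """Learn a smoothed return rate per key from historical order items.

    `history` is a list of dicts with a hypothetical `returned` field (0 or 1).
    Rates are pulled toward the global return rate so that keys with few
    observations do not get extreme predictions.
    """
    global_rate = sum(row["returned"] for row in history) / len(history)
    counts = defaultdict(lambda: [0, 0])  # key -> [returns, order items]
    for row in history:
        counts[row[key]][0] += row["returned"]
        counts[row[key]][1] += 1
    rates = {
        k: (returns + prior_weight * global_rate) / (n + prior_weight)
        for k, (returns, n) in counts.items()
    }
    return rates, global_rate


def predict(order_items, rates, global_rate, key="item_id"):
    """Return one value per order item; higher means a return is more probable."""
    # Keys never seen in the historical data fall back to the global rate.
    return [rates.get(row[key], global_rate) for row in order_items]


def total_absolute_error(predictions, outcomes):
    """Assumed error measure: sum of |prediction - actual outcome|."""
    return sum(abs(p - y) for p, y in zip(predictions, outcomes))


# Toy data standing in for the one year of historical order items.
history = [
    {"item_id": "A", "returned": 1},
    {"item_id": "A", "returned": 1},
    {"item_id": "B", "returned": 0},
    {"item_id": "B", "returned": 0},
]
# Toy data standing in for the month to be predicted.
new_items = [{"item_id": "A"}, {"item_id": "B"}, {"item_id": "C"}]

rates, global_rate = fit_return_rates(history, prior_weight=2.0)
predictions = predict(new_items, rates, global_rate)
error = total_absolute_error(predictions, [1, 0, 1])
```

A real model would draw on the purchase, shipping, product and customer attributes mentioned above; the sketch only shows the shape of the task: learn from labelled history, output one value per order item, and score it against the actual outcome.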