E-commerce, Review Data, Supervised Learning, Counterfactual Explanations
Eindhoven University of Technology
This study examines how accurately different machine learning methods can predict a seller review score from a collected tabular dataset. The dataset contains features that domain experts suggested as potentially valuable input variables for this prediction task, whereas the machine learning algorithms and corresponding methods were selected based on the results of academic studies conducted under comparable circumstances. The best predictive performance was achieved by an XGBoost classification model that combined class weighting with random undersampling. This model correctly predicted 65.2% of the ’unsatisfied customer’ class, with a balanced accuracy of 77.2%. Based on this model, counterfactuals are generated to gain insight into which mutable order-related features are most often suggested to be changed in order to convert an ’unsatisfied customer’ into a ’satisfied customer’. The ultimate aim of these counterfactuals is to provide the third-party sellers of an e-commerce platform with tailored recommendations on how to improve customer satisfaction in future orders. The study concludes by formulating business recommendations and suggestions for future research based on the insights obtained.
Haas Xander J.W.