Abstract:
In an increasingly digital world, the rapid development of the Internet has reshaped customers’ expectations and attitudes and has changed their shopping habits. Online stores, which offer easier product and price comparison and effortless searching and browsing, are becoming preferable to traditional shopping. Consequently, estimating the future behavior of customers is increasingly important for gaining an advantage in a competitive market. With this motivation, this research builds several behavioral models using clickstream data, which contain factors that affect the purchasing probability, such as customers’ past transactions, behavioral frequencies, season and channel. Since the objective of this study is to estimate the likelihood that a customer makes a purchase, alternative classification approaches such as logistic regression, random forest and boosting are considered. Models constructed with logistic regression and boosting are found to have better predictive accuracy than those built with the other method. According to the results, customers’ past behavior, behavioral frequencies, seasonality and conversion-rate-related factors have a significant effect on the purchasing probability. Moreover, when the computation times of logistic regression and boosting are benchmarked, logistic regression is found to require less time to train a model.