Abstract:
In modern digital systems, algorithms that deliver personalized content shape the user experience and affect user satisfaction, and hence long-term engagement with the system. What the system presents also influences the parties providing content to it, since visibility to users is vital for reachability. Such algorithms learn to deliver personalized content from data on previous user behavior, e.g., choices, clicks, and ratings, interpreted as a proxy for user preferences. In the first part of this work, we review prevalent models for learning from user feedback on content, including our contributions to the literature. As such data is ever-growing, we discuss computational aspects of learning algorithms and focus on software libraries for scalable implementations, including our contributions.

The second part concerns learning from user interactions with algorithmic personalization systems. Although such data is helpful, human behavior is subject to cognitive biases, and data sets comprising item choices are subject to sampling biases, posing problems for learning algorithms that rely on them. As users interact with the system, the problem worsens: the algorithms use biased data to compose future content and thereby self-reinforce their inaccurate beliefs about user preferences. We review several of these biases and investigate one in particular: the user's tendency to choose from the alternatives presented by the system, putting the least effort into exploring further. To account for it, we develop a Bayesian choice model that, when inferring user preferences, explicitly incorporates the user's limited exposure to a subset of items systematically selected by the algorithm. The model yields an efficient online algorithm for learning user preferences through interactions.