Abstract:
Many real-world classification applications require flexible representations of complex objects that preserve the information relevant for class separation. Multiple instance learning (MIL) addresses classification problems in which each object is represented by a bag of instances, and class labels are provided for the bags rather than for individual instances. The aim is to learn a function that correctly labels new bags. In this thesis, we propose statistical learning and mathematical optimization methods to solve MIL problems from diverse application domains. We first present bag encoding strategies to obtain bag-level feature vectors for MIL. Simple instance space partitioning approaches are used to learn representative feature vectors for the bags. Our experiments on a large database of MIL problems show that random tree-based encoding is scalable and that its performance is competitive with state-of-the-art methods. Mathematical programming-based approaches to the MIL problem construct a bag-level decision function. In this context, we formulate the MIL problem as a linear programming model that optimizes bag orderings for correct classification. The proposed formulation combines instance-level scores to return an estimate of the bag label. All problem instances are solved to optimality on various data representations in reasonable computation time. Finally, we develop a quadratic programming formulation that improves on previous MIL formulations with respect to their underlying assumptions and computational difficulties. The proposed MIL framework models the contributions of instances to the bag class labels and provides a bag class decision threshold. Experimental results verify that the proposed formulation enables effective classification in various MIL applications.