Abstract:
This study investigates the performance of selected machine learning algorithms in the classification and diagnostic prediction of small round blue cell tumors (SRBCTs) of childhood. Before the samples, which include both tumor biopsy material and cell lines, are classified on the basis of their gene expression profiles, the dimensionality of the problem is reduced. The reduction is carried out in two steps: correlation-based feature selection (CFS) followed by principal component analysis (PCA). Logistic model trees (LMT) and multilayer perceptrons (MLP) are then trained to assign the samples to four distinct diagnostic categories. The posterior probabilities that LMT and MLP provide for each sample are used to construct a measure for deciding whether to assign a sample to one of the diagnostic categories or to reject classification.
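The classify-or-reject decision can be illustrated with a minimal sketch. The abstract does not specify how the measure is built from the posterior probabilities, so the rule below (accept the most probable category only if its posterior reaches a threshold) is an assumption chosen for illustration, not the measure used in the study. The function name `classify_or_reject`, the placeholder labels, and the threshold value 0.9 are all hypothetical.

```python
from typing import Optional, Sequence

# Placeholder names for the four diagnostic categories (hypothetical labels).
LABELS = ("class_1", "class_2", "class_3", "class_4")


def classify_or_reject(
    posteriors: Sequence[float],
    labels: Sequence[str] = LABELS,
    threshold: float = 0.9,  # assumed cut-off, not taken from the study
) -> Optional[str]:
    """Return the most probable label, or None to reject the sample.

    Assumed rule: a sample is classified only when the largest posterior
    probability is at least `threshold`; otherwise it is rejected.
    """
    if len(posteriors) != len(labels):
        raise ValueError("one posterior probability is needed per label")
    # Index of the category with the highest posterior probability.
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    if posteriors[best] >= threshold:
        return labels[best]
    return None  # posterior too low: decline to classify


# A confident prediction is accepted; an ambiguous one is rejected.
confident = classify_or_reject([0.95, 0.02, 0.02, 0.01])
ambiguous = classify_or_reject([0.40, 0.30, 0.20, 0.10])
```

Under this assumed rule, raising the threshold trades coverage for reliability: more samples are rejected, but the accepted ones carry higher posterior support.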