Robust Optimization and Machine Learning for Classification of Noisy and Imbalanced Data
Theodore Trafalis
School of Industrial and Systems Engineering, University of Oklahoma, Norman, USA
Support vector machine (SVM) methods have found wide use in recent years for data classification problems. However, classification is complicated by the uncertain and imbalanced nature of many real data sets. Uncertainty is prevalent in many data sets and is not addressed efficiently by most data mining techniques. Imbalanced data sets describe rare events, and for such data sets the elements of the minority class become critical to the classification algorithm. Most data mining techniques perform poorly in predicting the minority class for imbalanced data. In this seminar we present a robust semi-definite programming (SDP) approach to solve the soft-margin SVM classification problem for both uncertain and imbalanced data. Specifically, we express the SVM classification as an SDP problem and develop its robust counterpart for the case where data points are subject to Euclidean-norm-bounded uncertainties. The robust SDP approach improves the learning of the kernel matrix and the misclassification penalty in the SVM classification scheme. Such robust SDP formulations are computationally tractable because they are expressed as linear matrix inequalities. Results are presented for several standard data sets, demonstrating strong performance of the SDP approach. In addition, we present a specific example from weather prediction and compare several machine learning techniques.
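As a simplified illustration of the two ideas in the abstract, and not the SDP formulation presented in the seminar, the sketch below evaluates a worst-case, class-weighted hinge loss for a linear classifier. It assumes each data point may be shifted by any perturbation of Euclidean norm at most rho, in which case the worst-case margin of a linear score drops by rho times the norm of w. The function name and the parameters rho, c_pos and c_neg are hypothetical and chosen here for illustration only; the separate penalties stand in for a misclassification penalty that treats the minority class differently.

```python
import numpy as np

def robust_weighted_hinge(w, b, X, y, rho, c_pos, c_neg):
    """Worst-case class-weighted hinge loss for a linear classifier (illustrative).

    Assumption: every row x_i of X may be perturbed by any delta with
    ||delta||_2 <= rho. For the linear score w.x + b, the worst such
    perturbation reduces the margin y_i * (w.x_i + b) by rho * ||w||_2.
    """
    # Worst-case margin of each point under the bounded perturbation.
    margins = y * (X @ w + b) - rho * np.linalg.norm(w)
    # Soft-margin slack: how far each worst-case margin falls short of 1.
    slack = np.maximum(0.0, 1.0 - margins)
    # Separate penalties per class (labels are assumed to be +1 / -1).
    weights = np.where(y > 0, c_pos, c_neg)
    return float(np.sum(weights * slack))

# Two points, one per class, both classified with margin 2 when rho = 0.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 0.0])

nominal = robust_weighted_hinge(w, 0.0, X, y, rho=0.0, c_pos=10.0, c_neg=1.0)
robust = robust_weighted_hinge(w, 0.0, X, y, rho=1.5, c_pos=10.0, c_neg=1.0)
print(nominal, robust)
```

With rho = 0 both points satisfy the margin and the loss is zero; with rho = 1.5 each worst-case margin falls to 0.5, and the point in the positive class contributes ten times as much loss as the other because of its larger penalty.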