A mixed quadratic programming model for a robust support vector machine

Authors

  • Raquel Serna-Diaz Facultad de Ciencias, Universidad Nacional Agraria la Molina, La molina, Lima, Perú.
  • Raimundo Santos Leite Instituto de Ciencias Exatas e Biológicas , Universidade Federal de Ouro Preto, Campus UniversitárioMorro do Cruzeiro, CEP:35400-000, Ouro Preto, MG, Brasil.
  • Paulo J. S. Silva Instituto de Matemática, Estatística e Computação Científica, Universidade Estadual de Campinas, Rua Sérgio Buarque de Holanda, 651 13083-859, Campinas, SP, Brasil.

DOI:

https://doi.org/10.17268/sel.mat.2021.01.03

Keywords:

SVM, Mixed integer quadratic programming, Outliers, Classification

Abstract

Support Vector Machines are extensively used to solve classification problems in Pattern Recognition. They deal with small errors in the training data using the concept of soft margin, that allowfor imperfect classification.

However, if the training data have systematic errors or outliers such strategy is not robust resulting in bad generalization. In this paper we present a model for robust Support Vector Machine classification that can automatically ignore spurius data. We show then that the model can be solved using a high performance Mixed Integer Quadratic Programming solver and present preliminary numerical experiments using real world data that looks promissing.

References

Zhan Y Shen D. Increasing Efficiency of SVM by Adaptively Penalizing Outliers. In: Rangarajan A., Vemuri B., Yuille A.L.(eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2005. Lecture Notes in Computer Science, vol 3757. Springer, Berlin, Heidelberg; 2005. p. 539–551.

R SHNMMS. On Improving the Generalization of SVMClassifier. vol. 157. Springer, Berlin, Heidelberg: Computer Networks and Intelligent Computing; 2011.

Hoak J. The Effects of Outliers on Support VectorMachines; 2010.

Thongkam J, XuYanchun G, Huang ZF. Support Vector Machine for Outlier Detection in Breast Cancer Survivability Prediction. vol. 4977 (99-109). Springer, Berlin, Heidelberg: Advanced Web and Network Technologies, and Applications. Lecture Notes in Computer Science; 2008.

Debruyne M. An Outlier Map for Support Vector Machine Classification. Universiteit Antwerpen: The Annals of Applied Statistics; 2009.

Xu L, Crammer K, Schuurmans D. Robust Support Vector Machine Training via Convex Outlier Ablation. In: Proceedings of the Twenty-First National Conference on Artificial Intelligence. EUA; 2006. p. 536–542.

Tsyurmasto P, Zabarankin M, Uryasev S. Value-at-Risk Support Vector Machine: Stability to Outliers. vol. 28 (218–232). Journal of Combinatorial Optimization; 2014.

Andreani R, Martínez JM, Martínez L, Yano FS. Low Order-Value Optimization and Applications. vol. 43 (1-22). Journal of Global Optimization; 2009-01.

Hess EJ, Brooks JP. The Support Vector Machine and Mixed Integer Linear Programming: Ramp Loss SVM with L1-Norm Regularization. 14th INFORMS Computing Society Conference; 2015.

Anguita D, Ghio A, Pischiutta S, Ridella S. A support vector machine with integer parameters. vol. 72 (480-489).

Neurocomputing;2008.

Gurobi Optimization L. Gurobi Optimizer ReferenceManual; 2020. Available from: http://www.gurobi.com.

Deng N, Tian Y, Zhang C. Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions. Chapman & Hall/CRC DataMining and Knowledge Discovery Series. Boca Raton: CRC Press, Taylor & Francis Group; 2013.

Cominetti R, Mascarenhas WF, Silva PJS. A Newton’s Method for the Continuous Quadratic Knapsack Problem. vol. 6(151–169). Mathematical Programming Computation; 2014-02.

Birgin EG, Martínez JM, RaydanM. Nonmonotone Spectral ProjectedGradientMethods for Convex Sets. vol. 10 (1196–1211). SIAM Journal on Optimization; 2000.

Chang CC, Lin CJ. LIBSVM: A Library for Support Vector Machines. vol. 2(7:1–27:27). ACM Transactions on Intelligent Systems and Technology; 2011-05-06.

Boyd SP, Vandenberghe L. Convex Optimization. Cambridge, UK ; New York: Cambridge University Press; 2004.

Gaudioso M, Gorgone E, Hiriart-Urruty J. Feature selection in SVM via Polyhedral k-norm. Optimization Letters (19-36); 2020.

Leite RS. Support VectorMachines and the Low Order-Value Optimization. Campinas; 2019.

Street WN, Wolberg WH, Mangasarian OL. Nuclear Feature Extraction for Breast Tumor Diagnosis. In: Biomedical Image Processing and Biomedical Visualization. vol. 1905. International Society for Optics and Photonics; 1993-07-29. p. 861–870.

Mangasarian OL, Street WN, Wolberg WH. Breast Cancer Diagnosis and Prognosis Via Linear Programming. vol. 43 (570–577). Operations Research; 1995-08.

Sigillito VG, Wing SP, Hutton LP, Baker KB. Classification of Radar Returns from the Ionosphere Using Neural Networks. 19. Johns Hopkins APL Technical Digest; 1989.

Dal Pozzolo A, Boracchi G, Caelen O, Alippi C, Bontempi G. Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy. vol. 29 (3784–3797). IEEE Transactions on Neural Networks and Learning Systems; 2018-08.

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-Learn: Machine Learning in Python. vol. 12. Journal of Machine Learning Research; 2011. Available from: http://jmlr.org/papers/v12/pedregosa11a.html.

Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines: And Other Kernel-Based Learning Methods. Cambridge ; New York: Cambridge University Press; 2000.

Horst R, Thoai NV. DC Programming: Overview. vol. 103 (1–43). Journal of Optimization Theory and Applications; 1999-10-01.

Martinez-Legaz JE, VolleM. Duality in D.C. Programming: The Case of Several D.C. Constraints. vol. 237 (657–671). Journal of Mathematical Analysis and Applications; 1999.

Tao PD, An LTH. Convex Analysis Approach to D. C. Programming: Theory, Algorithms and Applications. vol. 22(289–355). ACTA MATHEMATICA VIETNAMICA; 1997.

Bertsekas DP. Convex Optimization Theory. Belmont, Massachusetts: Athena Scientific; 2009-06-30.

Calafiore G, El Ghaoui L. OptimizationModels. Cambridge: Cambridge University Press; 2014.

Hiriart-Urruty JB, Lemaréchal C. Fundamentals of Convex Analysis. Grundlehren Text Editions. Berlin Heidelberg: Springer-Verlag; 2001. Available from: //www.springer.com/la/book/9783540422051.

Arora S, Bhattacharjee D, Nasipuri M, Malik L, KunduM, Basu D. Performance Comparison of SVM and ANN for Handwritten Devnagari Character Recognition. vol. abs/1006.5902. ArXiv; 2010.

Joachims T. Estimating the Generalization Performance of an SVM Efficiently. Proceedings of the Seventeenth International Conference onMachine Learning (ICML 2000), Stanford University; 2000. p. 431–438.

Downloads

Published

2021-07-29

How to Cite

Serna-Diaz, R., Santos Leite, R., & S. Silva, P. J. (2021). A mixed quadratic programming model for a robust support vector machine. Selecciones Matemáticas, 8(01), 27-36. https://doi.org/10.17268/sel.mat.2021.01.03