Abstract
We study perceptrons when datasets are randomly corrupted by noise and the corrupted examples are subsequently discarded from the training process. Overall, perceptrons appear to be remarkably stable; their accuracy drops only slightly when large portions of the original datasets have been excluded from training in response to verifiable random data corruption. Furthermore, we identify a real-world dataset on which perceptrons appear to require a longer time for training, both in the general case and in the framework that we consider. Finally, we explore empirically a bound on the learning rate of Gallant’s “pocket” algorithm for learning perceptrons and observe that the bound is tighter for non-linearly separable datasets.
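As a rough illustration of the setting described in the abstract, the following minimal sketch discards examples that are flagged as corrupted and then trains a perceptron while keeping the best weights seen so far "in the pocket". It is not the authors' implementation. It uses the common textbook simplification of the pocket idea (keep the weights with the fewest training mistakes), which may differ from Gallant's original formulation, and the function names, the corruption model (each example flagged independently with probability `rate`), and the toy dataset are all assumptions made for illustration.

```python
import random


def predict(w, x):
    # Sign of the affine score; w[0] is the bias term.
    score = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if score >= 0 else -1


def mistakes(w, data):
    # Number of training examples that w misclassifies.
    return sum(1 for x, y in data if predict(w, x) != y)


def drop_corrupted(data, rate, seed=0):
    # Assumption: each example is independently flagged as corrupted with
    # probability `rate`; because the corruption is verifiable, flagged
    # examples are simply removed from the training set.
    rng = random.Random(seed)
    return [example for example in data if rng.random() >= rate]


def pocket_train(data, iterations=1000, seed=0):
    # Simplified pocket variant: run perceptron updates on randomly drawn
    # examples and remember the weights with the fewest training mistakes.
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * (dim + 1)
    pocket, pocket_err = list(w), mistakes(w, data)
    for _ in range(iterations):
        x, y = rng.choice(data)
        if predict(w, x) != y:
            # Standard perceptron update on a misclassified example.
            w[0] += y
            for i, xi in enumerate(x):
                w[i + 1] += y * xi
            err = mistakes(w, data)
            if err < pocket_err:
                pocket, pocket_err = list(w), err
    return pocket, pocket_err


# Hypothetical toy data: label +1 when the first coordinate exceeds the second.
toy = [((a, b), 1 if a > b else -1)
       for a in range(-3, 4) for b in range(-3, 4) if a != b]
kept = drop_corrupted(toy, rate=0.3, seed=1)
weights, train_err = pocket_train(kept, iterations=2000, seed=1)
print(len(toy), len(kept), train_err)
```

Checking the training mistakes after every update costs a full pass over the data, which is acceptable for a small sketch but would be the first thing to revisit on larger datasets.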
Notes
1. Homepage: https://archive.ics.uci.edu.
References
Barocas, S., Hardt, M., Narayanan, A.: Fairness and machine learning: limitations and opportunities. fairmlbook.org (2019). http://www.fairmlbook.org
Baum, E.: The perceptron algorithm is fast for non-malicious distributions. In: NeurIPS 1989, vol. 2, pp. 676–685. Morgan-Kaufmann (1989)
Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: ICML 2012. icml.cc/Omnipress (2012)
Brown, T.B., et al.: Language models are few-shot learners. In: NeurIPS 2020, Virtual (2020)
Quiñonero Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N.D.: Dataset Shift in Machine Learning. The MIT Press, Cambridge (2008)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
Dekel, O., Shamir, O., Xiao, L.: Learning to classify with missing and corrupted features. Mach. Learn. 81(2), 149–178 (2010)
Diochnos, D.I., Trafalis, T.B.: Learning reliable rules under class imbalance. In: SDM, pp. 28–36. SIAM (2021)
Fellicious, C., Weißgerber, T., Granitzer, M.: Effects of random seeds on the accuracy of convolutional neural networks. In: LOD 2020, Revised Selected Papers, Part II. LNCS, vol. 12566, pp. 93–102. Springer, Heidelberg (2020). https://doi.org/10.1007/978-3-030-64580-9_8
Flansburg, C., Diochnos, D.I.: Wind prediction under random data corruption (student abstract). In: AAAI 2022, pp. 12945–12946. AAAI Press (2022)
Gallant, S.I.: Perceptron-based learning algorithms. IEEE Trans. Neural Netw. 1(2), 179–191 (1990)
García-Laencina, P.J., Sancho-Gómez, J., Figueiras-Vidal, A.R.: Pattern classification with missing data: a review. Neural Comput. Appl. 19(2), 263–282 (2010)
Goldblum, M., et al.: Dataset security for machine learning: data poisoning, backdoor attacks, and defenses. IEEE Trans. Pattern Anal. Mach. Intell. 45(2), 1563–1580 (2023)
Goodfellow, I.J., McDaniel, P.D., Papernot, N.: Making machine learning robust against adversarial inputs. Commun. ACM 61(7), 56–66 (2018)
He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)
Impagliazzo, R., Lei, R., Pitassi, T., Sorrell, J.: Reproducibility in learning. In: STOC 2022, pp. 818–831. ACM (2022)
Kearns, M.J., Li, M.: Learning in the presence of malicious errors. SIAM J. Comput. 22(4), 807–837 (1993)
Koh, P.W., Liang, P.: Understanding black-box predictions via influence functions. In: ICML 2017. Proceedings of Machine Learning Research, vol. 70, pp. 1885–1894. PMLR (2017)
Koh, P.W., Steinhardt, J., Liang, P.: Stronger data poisoning attacks break data sanitization defenses. Mach. Learn. 111(1), 1–47 (2022)
Krishnaswamy, A.K., Li, H., Rein, D., Zhang, H., Conitzer, V.: Classification with strategically withheld data. In: AAAI 2021, pp. 5514–5522. AAAI Press (2021)
Laird, P.D.: Learning from Good and Bad Data, vol. 47. Springer, Heidelberg (2012). https://doi.org/10.1007/978-1-4613-1685-5
Marcus, G.: Hoping for the best as AI evolves. Commun. ACM 66(4), 6–7 (2023). https://doi.org/10.1145/3583078
Molnar, C.: Interpretable Machine Learning, 2nd edn. Independently Published, Chappaqua (2022). https://christophm.github.io/interpretable-ml-book
Rosenblatt, F.: Principles of Neurodynamics. Spartan Books, New York (1962)
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)
Shafahi, A., et al.: Poison frogs! targeted clean-label poisoning attacks on neural networks. In: NeurIPS 2018, pp. 6106–6116 (2018)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning - From Theory to Algorithms. Cambridge University Press, Cambridge (2014)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Varshney, K.R.: Trustworthy Machine Learning. Independently Published, Chappaqua (2022)
Vorobeychik, Y., Kantarcioglu, M.: Adversarial machine learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning, # 38. Morgan & Claypool, San Rafael (2018)
Acknowledgements
Part of the work was performed at the OU Supercomputing Center for Education & Research (OSCER) at the University of Oklahoma. The work was supported by the second author’s startup fund. The first author worked on this topic while he was an undergraduate McNair Scholar.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Escamilla, J.E.A., Diochnos, D.I. (2024). Perceptrons Under Verifiable Random Data Corruption. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14505. Springer, Cham. https://doi.org/10.1007/978-3-031-53969-5_8
DOI: https://doi.org/10.1007/978-3-031-53969-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53968-8
Online ISBN: 978-3-031-53969-5
eBook Packages: Computer Science, Computer Science (R0)