Abstract
Before symbolic rules are extracted from a trained neural network, the network is usually pruned to obtain more concise rules. Typical pruning algorithms require retraining the network, which incurs additional cost. This paper presents FERNN, a fast method for extracting rules from trained neural networks without network retraining. Given a fully connected trained feedforward network with a single hidden layer, FERNN first identifies the relevant hidden units by computing their information gains. For each relevant hidden unit, the range of its activation values is divided into two subintervals such that the information gain is maximized. FERNN then finds the set of relevant network connections from the input units to this hidden unit by checking the magnitudes of their weights; connections with large weights are identified as relevant. Finally, FERNN generates rules that distinguish the two subintervals of the hidden activation values in terms of the network inputs. Experimental results show that the size and the predictive accuracy of the tree generated by FERNN are comparable to those obtained by a method that prunes and retrains the network.
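The two core steps sketched in the abstract, splitting a hidden unit's activation values at the point of maximal information gain and selecting incoming connections by weight magnitude, can be illustrated with a minimal Python sketch. This is not the authors' implementation: best_split simply scans candidate thresholds between consecutive distinct activation values, and relevant_inputs uses a cumulative-fraction cutoff that is an assumed stand-in for the paper's own criterion for "large" weights.

import numpy as np

def entropy(labels):
    # Empirical entropy (in bits) of a 1-D array of class labels.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(activations, labels):
    # Threshold on one hidden unit's activation values that maximizes
    # information gain over the training samples (illustrative sketch).
    order = np.argsort(activations)
    acts = np.asarray(activations)[order]
    labs = np.asarray(labels)[order]
    n = len(labs)
    base = entropy(labs)
    best_gain, best_thr = 0.0, None
    for i in range(1, n):
        if acts[i] == acts[i - 1]:
            continue  # no boundary between equal activation values
        thr = 0.5 * (acts[i] + acts[i - 1])
        gain = base - (i / n) * entropy(labs[:i]) - ((n - i) / n) * entropy(labs[i:])
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    return best_thr, best_gain

def relevant_inputs(weights, fraction=0.8):
    # Keep the input connections whose weight magnitudes account for a
    # chosen fraction of the total magnitude; the 80% cutoff is an
    # assumption for illustration, not the criterion used in the paper.
    mags = np.abs(np.asarray(weights, dtype=float))
    order = np.argsort(mags)[::-1]
    cum = np.cumsum(mags[order]) / mags.sum()
    k = int(np.searchsorted(cum, fraction)) + 1
    return sorted(order[:k].tolist())

For example, with activations [0.1, 0.2, 0.8, 0.9] and labels [0, 0, 1, 1], best_split returns the threshold 0.5 with an information gain of 1 bit, i.e. the two subintervals separate the classes perfectly.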
Cite this article
Setiono, R., Leow, W.K. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks. Applied Intelligence 12, 15–25 (2000). https://doi.org/10.1023/A:1008307919726