Tangent-cut optimizer on gradient descent: an approach towards Hybrid Heuristics | Artificial Intelligence Review Skip to main content
Log in

Tangent-cut optimizer on gradient descent: an approach towards Hybrid Heuristics

  • Published:
Artificial Intelligence Review Aims and scope Submit manuscript

Abstract

The world has witnessed a surfeit of usage of Artificial Intelligence systems for a long time. Nowadays, most of the problems are transforming from logical solutions into statistical domains. This requires the implementation of machine learning algorithms to mine useful data from the statistical datasets which in turn demands high-end computing. Generally, machine learning algorithms utilize Gradient Descent as a tool to find the optimal solution of computationally expensive problems. This gave rise to the development of optimization algorithms like Momentum, RMSProp, Adam and the like, which could speed up the convergence to the global optimum besides increasing the learning accuracy. However, nowadays the supervised machine learning models got more data intensive which increased their computational cost, putting the efficiency of these algorithms into question. In this context, a new optimization algorithm namely, the Tangent-Cut Optimizer (TC-Opt) has been proposed which can converge faster than the traditional optimization algorithms for supervised machine learning models. Furthermore, the proposed work brings forward a phenomenon that intertwines the statistical and logical decision-making model into a single unit while shedding light on a new heuristic approach named “Hybrid Heuristics”. The proposed algorithm has been implemented on the standard dataset of Boston House Pricing Dataset for linear regression and MNIST image dataset of handwritten digits from 0 to 9 for logistic regression and its performance has been compared with the existing algorithms. Finally, the robustness and high accuracy of the proposed optimization algorithm have been proved and demonstrated in the presentation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Data availability

No explicit data is available for this work. All experimental data has been provided in the article itself and some more important information related to the work has been provided through the supplementary material.

Code availability

Custom codes are available. No external software was used for experimentation.

References

  • Akhtar A, (2019) Evolution of Ant Colony Optimization Algorithm—A brief literature review. Preprint at https://arxiv.org/abs/1908.08007[cs.NE] 1–11

  • Akram S, Ann Q (2015) Newton Raphson Method. Int J Sci Eng Res 6(7):1748–1752

    Google Scholar 

  • Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In 19th International Conference on Computational Statistics, Paris, France 177–186

  • Chakhlevitch K, Cowling P (2008) Hyperheuristics: recent developments. Adapt Multilevel Metaheuristics Stud Comput Intell 136:3–29

    Article  Google Scholar 

  • Chee J, Toulis P (2018) Convergence diagnostics for stochastic gradient descent with constant learning rate. https://arxiv.org/abs/1710.06382[stat.ML] 1–36

  • Cowling P, Kendall G, Soubeiga E (2001) A hyperheuristic approach to scheduling a sales summit. In: Burke E, Erben W (eds) International conference on the practice and theory of automated timetabling III, vol 2079. Springer. Berlin, Heidelberg, pp 176–190

    Chapter  Google Scholar 

  • D’Angelo G, Palmieri F (2021) GGA: a modified genetic algorithm with gradient-based local search for solving constrained optimization problems. Inf Sci 547(8):136–162

    Article  MathSciNet  Google Scholar 

  • Dauphin YN, Vries H, Chung J, Bengio Y (2015) Equilibrated adaptive learning rates for non-convex optimization. Preprint at https://arxiv.org/abs/1502.04390[cs.LG] 1–10

  • Doncieux S, Mouret JB, Bredeche N, Padois V (2011) Evolutionary robotics exploring new horizons. In: Doncieux S, Bredèche N, Mouret J-B (eds) New horizons in evolutionary robotics. Springer, Berlins

    Chapter  Google Scholar 

  • Dozat T (2016) Incorporating Nesterov Momentum into Adam. International Conference on Learning Representation 2016, San Juan, PR 1–4

  • Duan H, Qiao P (2014) Pigeon-inspired optimization: a new swarm intelligence optimizer for robot path planning. Int J Intell Comput Cybern 7(1):24–37

    Article  MathSciNet  Google Scholar 

  • Gitman I, Dilipkumar D, Parr B (2018) Convergence analysis of gradient descent algorithms with proportional updates. Preprint at https://arxiv.org/abs/1801.03137 [cs.LG] 1–12

  • Henderson D, Jacobson SH, Johnson AW (2003) The theory and practice of simulated annealing. In: Glover F, Kochenberger GA (eds) Handbook of metaheuristics. International series in operations research and management science. Springer, Boston

    Google Scholar 

  • Hennig P, Kiefel M (2013) Quasi-Newton methods: a new direction. J Mach Learn Res 14(1):843–865

    MathSciNet  MATH  Google Scholar 

  • Kamrani AK, Gonzalez R (2002) A heuristic genetic algorithm methodology. Proceedings of the 5th Biannual World Automation Congress, Orlando, FL 71–76

  • Kennedy J, Eberhart R (1995) Particle swarm optimization. Proc. of IEEE International Conference on Neural Networks, Perth, WA 4: 1942–1948

  • Khirirat S, Feyzmahdavian HR, Johansson M (2017) Mini-batch gradient descent: Faster convergence under data sparsity. In 2017 IEEE 56th Annual Conference on Decision and Control, Melbourne, VIC 2880–2887

  • Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. International Conference on Learning Representations, San Diego, CA, 1–13

  • Kokash N (2005) An introduction to heuristic algorithms. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.105.8050. Accessed 22 May 2020 1–8

  • Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

    Article  Google Scholar 

  • Lemaréchal C (2012) Cauchy and the gradient method. Doc Math 17:251–254

    MathSciNet  MATH  Google Scholar 

  • Lewis R (2009) A general-purpose hill-climbing method for order independent minimum grouping problems: a case study in graph colouring and bin packing. Comput Oper Res 36(7):2295–2310

    Article  MathSciNet  Google Scholar 

  • Li X, Zhao H, Jin L (2012) An Algorithm for Global Optimization Problems Based on ABC-BFGS. In 2012 Fifth International Joint Conference on Computational Sciences and Optimization, Harbin 881–884

  • Liepins GE, Hilliard MR (1989) Genetic algorithms: foundations and applications. Ann Oper Res 21:31–57

    Article  Google Scholar 

  • Liu L, Jiang H, He P, Chen W, Liu X, Gao J, Han J (2020) On the Variance of the Adaptive Learning Rate and Beyond. Preprint at https://arxiv.org/abs/1908.03265[cs.LG] 1–14

  • Lourenço HR, Martin OC, Stützle T (2003) Iterated local search. In: Glover F, Kochenberger GA (eds) Handbook of Metaheuristics. International Series in Operations Research and Management Science. Springer, Boston

    Google Scholar 

  • Najafabadi M, Khoshgoftaar T, Villanustre F, Holt J (2017) Large-scale distributed L-BFGS. J Big Data 4(22):1–17

    Google Scholar 

  • Nath S, Sing JK, Sarkar SK, (2018) Performance comparison of PSO and its new variants in the context of VLSI global routing. Particle swarm optimization with applications, Pakize Erdoğmuş, IntechOpen 1–21. https://doi.org/10.5772/intechopen.72811. Available from: https://www.intechopen.com/books/particle-swarm-optimization-with-applications/performance-comparison-of-pso-and-its-new-variants-in-the-context-ofvlsi-global-routing

  • Nath S, Gupta S, Biswas S, Banerjee R, Sing JK, Sarkar SK (2020) GPSO Hybrid Algorithm for Rectilinear Steiner Tree Optimization. In 2020 IEEE VLSI DEVICE CIRCUIT AND SYSTEM, Kolkata, WB 365–369

  • Nesterov Y (1983) A method for unconstrained convex minimization problem with the rate of convergence O(1/k2). Soviet Math Dokl 27(2):543–547

    Google Scholar 

  • Noel MM (2012) A new gradient based particle swarm optimization algorithm for accurate computation of global minimum. Appl Soft Comput 12(1):353–359

    Article  Google Scholar 

  • Rice JR (1976) The algorithm selection problem. Adv Comput 15:65–118

    Article  Google Scholar 

  • Ross P (2005) Hyper-Heuristics. In: Burke EK, Kendall G (eds) Search methodologies. Springer, Boston. https://doi.org/10.1007/0-387-28356-0_17

    Chapter  Google Scholar 

  • Ruder S (2017) An overview of gradient descent optimization algorithms. Preprint at https://arxiv.org/abs/1609.04747 [cs.LG] 1–14

  • Ryser-Welch P, Miller JF (2014) A review of hyper-heuristic frameworks. In 50th Annual Convention of the AISB 1–7

  • Salvatierra MA (2013) Quasi-Newton optimization algorithm to solve molecular distance geometry problems. In 2013 XXXIX Latin American Computing Conference, Naiguata 1–3

  • Smith-Miles K, Lopes L (2012) Measuring instance difficulty for combinatorial optimization problems. Comput Oper Res 39(5):875–889

    Article  MathSciNet  Google Scholar 

  • Sutskever I, Martens J, Dahl G, Hinton G (2013) On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, Atlanta, Ga, 28(3): 1139–1147

  • Veenhuis C (2010) Binary invasive weed optimization. In 2010 Second World Congress on Nature and Biologically Inspired Computing, Fukuoka 449–454

  • Voudouris C, Tsang EPK (2003) Guided local search. In: Glover F, Kochenberger GA (eds) Handbook of metaheuristics, international series in operations research and management science. Springer, Boston

    Google Scholar 

Download references

Funding

None.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Saptarshi Biswas.

Ethics declarations

Conflict of interest

All the authors of this article declare no competing interest related to this research work.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 479 kb)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Biswas, S., Nath, S., Dey, S. et al. Tangent-cut optimizer on gradient descent: an approach towards Hybrid Heuristics. Artif Intell Rev 55, 1121–1147 (2022). https://doi.org/10.1007/s10462-021-09984-0

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10462-021-09984-0

Keywords

Navigation