Analysis of Checkpointing for Real-Time Systems | Real-Time Systems Skip to main content
Log in

Analysis of Checkpointing for Real-Time Systems

  • Published:
Real-Time Systems Aims and scope Submit manuscript

Abstract

Predictable performance in the event of failuresis of paramount importance in most safety critical real-timesystems. Various hardware as well as software fault-toleranttechniques are employed towards this goal among which checkpointingis a relatively cost-effective scheme. Since checkpointing schemesdepend on time redundancy, they could affect the correctnessof the system by causing deadlines to be missed. This paper providesexact schedulability tests for fault tolerant task sets undera specified failure hypothesis and employing checkpointing toassist in fault recovery. The effects of checkpointing strategieson task response time are analysed and some insights for optimalcheckpointing are provided. The emphasis here is on utilizingthis analysis as an off-line design support tool.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Audsley, N. C., Burns, A., Richardson, M. F., Tindell, K., and Wellings, A. 1993. Applying new scheduling theory to static priority pre-emptive scheduling. Software Engineering Journal 8(5): 284-292.

    Google Scholar 

  • Audsley, N. C., Burns, A., Richardson, M. F., and Wellings, A. 1991. Hard real-time scheduling: The deadline monotonic approach. IFAC/IFIP Workshop. Atlanta, Georgia, pp. 127-132.

  • Bannister, J. A., and Trivedi, K. S. 1983. Task allocation in fault-tolerant distributed systems. Acta Informatica 20: 261-281.

    Google Scholar 

  • Bowen, N. S., and Pradhan, D. K. 1993. Processor and memory based checkpoint and rollback recovery. IEEE Computer 26(2): 22-29.

    Google Scholar 

  • Burns, A. 1994. Preemptive priority based scheduling: An appropriate engineering approach. In: Advances in Real-Time Systems, Son, S., ed. Prentice Hall, pp. 225-248.

  • Burns, A., Davis, R. I., and Punnekkat, S. 1996. Feasibility analysis of fault-tolerant real-time task sets. Euromicro Real-Time Systems Workshop, pp. 29-33.

  • Campbell, A., McDonald, P., and Ray, K. 1992. Single event upset rates in space. IEEE Transactions on Nuclear Science 39(6): 1828-1835.

    Google Scholar 

  • Castillo, X., McConnel, S., and Siewiorek, D. 1982. Derivation and calibration of a transient error reliability model. IEEE Transactions on Computers 31(7): 658-671.

    Google Scholar 

  • Chandy, K. M., Browne, J. C., Dissly, C. W., and Uhrig, W. R. 1975. Analytic models for rollback and recovery strategies in data base systems. IEEE Transactions on Software Engineering SE-1(1): 100-110.

    Google Scholar 

  • Gelenbe, E. 1979. On the optimum checkpoint interval. Journal of the ACM 26(2): 259-270.

    Google Scholar 

  • Ghosh, S., Melhem, R., and Mosse, D. 1995. Enhancing real-time schedules to tolerate transient faults. Proceedings Real-Time Systems Symposium.

  • Ghosh, S., Melhem, R., Mosse, D., and Sansarma, J. 1998. Fault tolerant rate monotonic scheduling. Real-Time Systems Journal 15:(2).

    Google Scholar 

  • Joseph, M., and Pandya, P. 1986. Finding response times in a real-time system. The Computer Journal-British Computer Society 29(5): 390-395.

    Google Scholar 

  • Katcher, D., Arakawa, H., and Strosnider, J. 1993. Engineering and analysis of fixed priority schedulers. IEEE Transactions on Software Engineering 19(9): 920-934.

    Google Scholar 

  • Krishna, C., and Shin, K. 1986. On scheduling tasks with a quick recovery from failure. IEEE Transactions on Computers 35(5): 448-455.

    Google Scholar 

  • Lehoczky, J., Sha, L., and Ding, Y. 1989. The rate monotonic scheduling algorithm-Exact characterization and average case behaviour. Proceedings of IEEE Real-Time Systems Symposium, pp. 166,171.

  • Liestman, A. L., and Campbell, R. H. 1986. A fault-tolerant scheduling problem. IEEE Transactions on Software Engineering 12(11): 1089-1095.

    Google Scholar 

  • Liu, C. L., and Layland, J.W. 1973. Scheduling algorithms for multiprogramming in a hard real-time environment. Journal of the ACM 20(1): 40-61.

    Google Scholar 

  • Miremadi, G., and Torin, J. 1995. Evaluating processor-behaviour and three error-detection mechanisms using physical fault-injection. IEEE Transactions on Reliability 44(3): 441-453.

    Google Scholar 

  • Nicola, V. F. 1995. Checkpointing and the modeling of program execution time. In Software Fault Tolerance, Lyu, M., ed. John Wiley & Sons Ltd, pp. 167-188.

  • Oh, Y., and Son, S. H. 1994. Enhancing fault-tolerance in rate-monotonic scheduling. Real-Time Systems 7(3): 315-329.

    Google Scholar 

  • Pandya, M., and Malek, M. 1998. Minimum achievable utilization for fault-tolerant processing of periodic tasks. IEEE Transactions on Computers 47(10): 1102-1112.

    Google Scholar 

  • Punnekkat, S., Davis, R., and Burns, A. 1997. Sensitivity analysis of real-time task sets. Asian Computing Science Conference, Kathmandu, Nepal, LNCS-1345, Springer-Verlag, pp. 72-82.

  • Ramos-Thuel, S., and Strosnider, J. 1994. Scheduling fault recovery operations for time-critical applications. In Proceedings of fourth IFIP working conference on Dependable Computing for Critical Applications.

  • Randell, B. 1975. System structure for software fault tolerance. IEEE Transactions on Software Engineering SE-1(2): 220-232.

    Google Scholar 

  • Serlin, O. 1972. Scheduling of time critical processes. Proceedings AFIPS Spring Computing Conference, pp. 925-932.

  • Sha, L., Rajkumar, R., and Lehoczky, J. 1990. Priority inheritance protocols: An approach to real-time synchronization. IEEE Transactions on Computers 39(9): 1175-1185.

    Google Scholar 

  • Shin, K. G., Lin, T., and Lee, Y. 1987. Optimal checkpointing of real-time tasks. IEEE Transactions on Computers C-36(11): 1328-1341.

    Google Scholar 

  • Tantawi, A. N., and Ruschitzka, M. 1984. Performance qnalysis of checkpointing wtrategies. ACM Transactions on Computer Systems 2(2): 123-143.

    Google Scholar 

  • Tindell, K. W., Burns, A., and Wellings, A. 1994. An extendible approach for analysing fixed priority hard real-time tasks. Journal of Real-Time Systems 6(2): 133-151.

    Google Scholar 

  • Tsuchiya, T., Kakuda, Y., and Kikuno, T. 1995. Fault-tolerant scheduling algorithm for distributed real-time systems. In Procceding of 3rd Workshop on Parallel & Distributed Real-time Systems, pp. 99-103.

  • Vaidya, N. H. 1997. Impact of checkpoint latency on overhead ratio of a checkpointing scheme. IEEE Transactions on Computers 46(8): 942-947.

    Google Scholar 

  • Young, J.W. 1974. A first order approximation to the optimum checkpoint interval. Communications of the ACM 17(9): 530-531.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Punnekkat, S., Burns, A. & Davis, R. Analysis of Checkpointing for Real-Time Systems. Real-Time Systems 20, 83–102 (2001). https://doi.org/10.1023/A:1026589200419

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1026589200419

Navigation