{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,15]],"date-time":"2024-09-15T02:53:10Z","timestamp":1726368790728},"reference-count":53,"publisher":"Wiley","issue":"9","license":[{"start":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T00:00:00Z","timestamp":1590969600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Softw Pract Exp"],"published-print":{"date-parts":[[2020,9]]},"abstract":"In recent years, the performance of deep neural networks (DNNs) has improved significantly, making them suitable for many application fields, such as autonomous driving, advanced robotics, and industrial control. Although much research has been devoted to improving the accuracy of DNNs, only limited effort has been spent on enhancing their timing predictability, which is required in several real\u2010time applications. This paper proposes a software infrastructure based on the Linux operating system to integrate DNNs within a real\u2010time multicore system. It has been realized by modifying both the internal scheduler of the popular TensorFlow framework and the SCHED_DEADLINE scheduling class of Linux. The proposed infrastructure provides timing isolation for DNN inference tasks, hence improving the determinism of the temporal interference generated by TensorFlow. The proposal is finally evaluated with a case study derived from a state\u2010of\u2010the\u2010art benchmark inspired by an autonomous industrial system. 
Extensive experiments demonstrate the effectiveness of the proposed solution and show a significant reduction of both average and longest\u2010observed response times of TensorFlow tasks.","DOI":"10.1002\/spe.2840","type":"journal-article","created":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T11:05:21Z","timestamp":1591009521000},"page":"1760-1777","update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Timing isolation and improved scheduling of deep neural networks for real\u2010time systems"],"prefix":"10.1002","volume":"50","author":[{"ORCID":"http:\/\/orcid.org\/0000-0003-4719-3631","authenticated-orcid":false,"given":"Daniel","family":"Casini","sequence":"first","affiliation":[{"name":"Department of Excellence in Robotics & AI Scuola Superiore Sant'Anna Pisa Italy"},{"name":"TeCIP Institute Scuola Superiore Sant'Anna Pisa Italy"}]},{"given":"Alessandro","family":"Biondi","sequence":"additional","affiliation":[{"name":"Department of Excellence in Robotics & AI Scuola Superiore Sant'Anna Pisa Italy"},{"name":"TeCIP Institute Scuola Superiore Sant'Anna Pisa Italy"}]},{"given":"Giorgio","family":"Buttazzo","sequence":"additional","affiliation":[{"name":"Department of Excellence in Robotics & AI Scuola Superiore Sant'Anna Pisa Italy"},{"name":"TeCIP Institute Scuola Superiore Sant'Anna Pisa Italy"}]}],"member":"311","published-online":{"date-parts":[[2020,6]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"crossref","unstructured":"ChenC SeffA KornhauserA XiaoJ. DeepDriving: learning affordance for direct perception in autonomous driving. Paper presented at: Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015). 
December 2\u20106 2015; Araucano Park Las Condes Chile.","DOI":"10.1109\/ICCV.2015.312"},{"key":"e_1_2_10_3_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364914549607"},{"key":"e_1_2_10_4_1","doi-asserted-by":"crossref","unstructured":"LenssenJE TomaA SeeboldA et al. Real\u2010time low snr signal processing for nanoparticle analysis with deep neural networks. Paper presented at: Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018). January 19\u201021 2018; Funchal Portugal.","DOI":"10.5220\/0006596400360047"},{"key":"e_1_2_10_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.04.061"},{"key":"e_1_2_10_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_10_7_1","unstructured":"http:\/\/blog.paralleldots.com\/data\u2010science\/must\u2010read\u2010path\u2010breaking\u2010papers\u2010about\u2010image\u2010classification\/."},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-0676-1"},{"key":"e_1_2_10_9_1","unstructured":"AbadiM AgarwalA BarhamP et al. TensorFlow: large\u2010scale machine learning on heterogeneous distributed systems;2015.http:\/\/download.tensorflow.org\/paper\/whitepaper2015.pdf."},{"key":"e_1_2_10_10_1","unstructured":"Eigen Library.http:\/\/eigen.tuxfamily.org\/index.php?title=Main_Page."},{"key":"e_1_2_10_11_1","unstructured":"MinSJ IancuC YelickK. Hierarchical work stealing on manycore clusters. Paper presented at: Proceedings of the 5th Conference on Partitioned Global Address Space Programming Models (PGAS 2011); October 15\u201018 2011; Tremont House Galveston Island TX."},{"key":"e_1_2_10_12_1","doi-asserted-by":"crossref","unstructured":"SzegedyC VanhouckeV IoffeS ShlensJ WojnaZ. Rethinking the inception architecture for computer vision. 
Paper presented at: Proceedings of the IEEE\/CVF 29th Conference on Computer Vision and Pattern Recognition (CVPR 2016); June 26\u2010July 1 2016; Las Vegas NV United States.","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_2_10_13_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2335"},{"key":"e_1_2_10_14_1","doi-asserted-by":"crossref","unstructured":"BiondiA MelaniA BertognaM. Hard constant bandwidth server: comprehensive formulation and critical scenarios. Paper presented at: Proceedings of the 9th IEEE International Symposium on Industrial Embedded Systems (SIES 2014); June 18\u201020 2014; Pisa Italy.","DOI":"10.1109\/SIES.2014.6871182"},{"key":"e_1_2_10_15_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2333"},{"key":"e_1_2_10_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2020.101729"},{"key":"e_1_2_10_17_1","doi-asserted-by":"crossref","unstructured":"AlbaqsamiA HosseiniMS BagherzadehN. HTF\u2010MPR: a heterogeneous tensorflow mapper targeting performance using genetic algorithms and gradient boosting regressors. Paper presented at: Proceedings of the Design Automation Test in Europe Conference Exhibition (DATE 2018); March 19\u201023 2018; Florence Italy.","DOI":"10.23919\/DATE.2018.8342031"},{"key":"e_1_2_10_18_1","doi-asserted-by":"crossref","unstructured":"ZhouH BateniS LiuC. S3DNN: supervised streaming and scheduling for GPU\u2010accelerated real\u2010time DNN workloads. Paper presented at: Proceedings of the 24th IEEE Real\u2010Time and Embedded Technology and Applications Symposium (RTAS 2018); April 11\u201013 2018; Porto Portugal.","DOI":"10.1109\/RTAS.2018.00028"},{"key":"e_1_2_10_19_1","doi-asserted-by":"crossref","unstructured":"M. YangSW BakitaJ VuT SmithFD AndersonJH FrahmJ. Re\u2010thinking CNN frameworks for time\u2010sensitive autonomous\u2010driving applications: addressing an industrial challenge. 
Paper presented at: Proceedings of the 25th IEEE Real\u2010Time and Embedded Technology and Applications Symposium (RTAS 2019); April 16\u201018 2019; Montreal QC Canada.","DOI":"10.1109\/RTAS.2019.00033"},{"key":"e_1_2_10_20_1","doi-asserted-by":"crossref","unstructured":"LaneND BhattacharyaS GeorgievP et al. DeepX: a software accelerator for low\u2010power deep learning inference on mobile devices. Paper presented at: Proceedings of the 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks (IPSN 2016); April 11\u201014 2016; Vienna Austria.","DOI":"10.1109\/IPSN.2016.7460664"},{"key":"e_1_2_10_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11241-018-9314-y"},{"key":"e_1_2_10_22_1","doi-asserted-by":"crossref","unstructured":"BateniS ZhouH ZhuY LiuC. PredJoule: a timing\u2010predictable energy optimization framework for deep neural networks. Paper presented at: Proceedings of the 39th IEEE Real\u2010Time Systems Symposium (RTSS 2018); December 11\u201014 2018; Nashville TN.","DOI":"10.1109\/RTSS.2018.00020"},{"key":"e_1_2_10_23_1","doi-asserted-by":"crossref","unstructured":"HongH OhH HaS. Hierarchical dataflow modeling of iterative applications. Paper presented at: Proceedings of the 54th Annual Design Automation Conference (DAC 2017); June 18\u201022 2017; Austin TX USA.","DOI":"10.1145\/3061639.3062260"},{"key":"e_1_2_10_24_1","doi-asserted-by":"crossref","unstructured":"CasiniD BiondiA ButtazzoG. Analyzing parallel real\u2010time tasks implemented with thread pools. Paper presented at: Proceedings of the 56th Annual Design Automation Conference (DAC 2019); June 2\u20106 2019; Las Vegas NV.","DOI":"10.1145\/3316781.3317771"},{"key":"e_1_2_10_25_1","doi-asserted-by":"crossref","unstructured":"BateniS LiuC. ApNet: approximation\u2010aware real\u2010time neural network. 
Paper presented at: Proceedings of the 39th IEEE Real\u2010Time Systems Symposium (RTSS 2018); December 11\u201014 2018; Nashville TN.","DOI":"10.1109\/RTSS.2018.00017"},{"key":"e_1_2_10_26_1","unstructured":"TensorRT.https:\/\/developer.nvidia.com\/tensorrt."},{"key":"e_1_2_10_27_1","doi-asserted-by":"crossref","unstructured":"ForsbergB MarongiuA BeniniL. GPUguard: towards supporting a predictable execution model for heterogeneous SoC. Paper presented at: Proceedings of the Design Automation Test in Europe Conference Exhibition (DATE 2017); March 27\u201031 2017; Lausanne Switzerland.","DOI":"10.23919\/DATE.2017.7927008"},{"key":"e_1_2_10_28_1","doi-asserted-by":"crossref","unstructured":"CapodieciN CavicchioliR ValenteP BertognaM. SiGAMMA: server based integrated GPU arbitration mechanism for memory accesses. Paper presented at: Proceedings of the 25th ACM International Conference on Real\u2010Time Networks and Systems (RTNS 2017); October 4\u20106 2017; Grenoble France.","DOI":"10.1145\/3139258.3139270"},{"key":"e_1_2_10_29_1","doi-asserted-by":"crossref","unstructured":"AliW YunH. Protecting real\u2010time GPU kernels on integrated CPU\u2010GPU SoC platforms. Paper presented at: Proceedings of the Euromicro Conference on Real\u2010Time Systems (ECRTS 2018); July 3\u20106 2018; Barcelona Spain.","DOI":"10.1109\/RTAS.2017.26"},{"key":"e_1_2_10_30_1","doi-asserted-by":"crossref","unstructured":"CapodieciN CavicchioliR BertognaM ParamakuruA. Deadline\u2010based scheduling for GPU with preemption support. Paper presented at: Proceedings of the 39th IEEE Real\u2010Time Systems Symposium (RTSS 2018); December 11\u201014 2018; Nashville TN.","DOI":"10.1109\/RTSS.2018.00021"},{"key":"e_1_2_10_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01094172"},{"key":"e_1_2_10_32_1","doi-asserted-by":"crossref","unstructured":"DavisRI BurnsA. Resource sharing in hierarchical fixed priority pre\u2010emptive systems. 
Paper presented at: Proceedings of the 27th IEEE International Real\u2010Time Systems Symposium (RTSS 2006); December 5\u20108 2006; Rio de Janeiro Brazil.","DOI":"10.1109\/RTSS.2006.42"},{"key":"e_1_2_10_33_1","doi-asserted-by":"crossref","unstructured":"BehnamM ShinI NolteT NolinM. Scheduling of semi\u2010independent real\u2010time components: overrun methods and resource holding times. Paper presented at: Proceedings of the 2008 IEEE International Conference on Emerging Technologies and Factory Automation (ETFA 2008); September 15\u201018 2008; Hamburg Germany.","DOI":"10.1109\/ETFA.2008.4638456"},{"key":"e_1_2_10_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2009.2037918"},{"key":"e_1_2_10_35_1","unstructured":"LamastraG LipariG AbeniL. A bandwidth inheritance algorithm for real\u2010time task synchronization in open systems. Paper presented at: Proceedings 22nd IEEE Real\u2010Time Systems Symposium (RTSS 2001); December 3\u20106 2001; London UK."},{"key":"e_1_2_10_36_1","unstructured":"deNizD AbeniL SaewongS RajkumarR. Resource sharing in reservation\u2010based systems. Paper presented at: Proceedings 22nd IEEE Real\u2010Time Systems Symposium (RTSS 2001); December 3\u20106 2001; London UK."},{"key":"e_1_2_10_37_1","doi-asserted-by":"crossref","unstructured":"FaggioliD LipariG CucinottaT. The multiprocessor bandwidth inheritance protocol. Paper presented at: Proceedings of the 22nd Euromicro Conference on Real\u2010Time Systems (ECRTS 2010); July 6\u20109 2010; Brussels Belgium.","DOI":"10.1109\/ECRTS.2010.19"},{"key":"e_1_2_10_38_1","doi-asserted-by":"crossref","unstructured":"BehnamM ShinI NolteT NolinM. SIRAP: a synchronization protocol for hierarchical resource sharing in real\u2010time open systems. 
Paper presented at: Proceedings of the 7th ACM & IEEE International Conference on Embedded Software (EMSOFT 2007); September 30\u2010October 3 2007; Salzburg Austria.","DOI":"10.1145\/1289927.1289970"},{"key":"e_1_2_10_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2009.2026051"},{"key":"e_1_2_10_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2015.2444833"},{"key":"e_1_2_10_41_1","doi-asserted-by":"crossref","unstructured":"BrandenburgB G\u00fclM. Global scheduling not required: simple near\u2010optimal multiprocessor real\u2010time scheduling with semi\u2010partitioned reservations. Paper presented at: Proceedings of the 37th IEEE Real\u2010Time Systems Symposium (RTSS 2016); November 29\u2010December 2 2016; Porto Portugal.","DOI":"10.1109\/RTSS.2016.019"},{"key":"e_1_2_10_42_1","unstructured":"CasiniD BiondiA ButtazzoG. Semi\u2010partitioned scheduling of dynamic real\u2010time workload: a practical approach based on analysis\u2010driven load balancing. Paper presented at: Proceedings of the 29th Euromicro Conference on Real\u2010Time Systems (ECRTS 2017); June 27\u201030 2017; Dubrovnik Croatia."},{"key":"e_1_2_10_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11241-018-9303-1"},{"key":"e_1_2_10_44_1","unstructured":"Linux kernel mailing list: towards implementing proxy execution.https:\/\/lkml.org\/lkml\/2018\/10\/9\/431."},{"key":"e_1_2_10_45_1","unstructured":"ParriA MarinoniM LelliJ LipariG. An implementation of a multiprocessor bandwidth reservation mechanism for groups of tasks. Paper presented at: Proceedings of the 16th Real Time Linux Workshop (RTLWS 2014); October 12\u201013 2014; Dusseldorf Germany."},{"key":"e_1_2_10_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01995675"},{"key":"e_1_2_10_47_1","unstructured":"ShinI LeeI. Periodic resource model for compositional real\u2010time guarantees. 
Paper presented at: Proceedings of the 24th IEEE Real\u2010Time Systems Symposium (RTSS 2003); December 3\u20105 2003; Cancun Mexico."},{"key":"e_1_2_10_48_1","doi-asserted-by":"crossref","unstructured":"CasiniD AbeniL BiondiA CucinottaT ButtazzoG. Constant bandwidth servers with constrained deadlines. Paper presented at: Proceedings of the 25th ACM International Conference on Real\u2010Time Networks and Systems (RTNS 2017); October 4\u20106 2017; Grenoble France.","DOI":"10.1145\/3139258.3139285"},{"key":"e_1_2_10_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2009.58"},{"key":"e_1_2_10_50_1","doi-asserted-by":"crossref","unstructured":"SandlerM HowardA ZhuM ZhmoginovA ChenL. MobileNetV2: inverted residuals and linear bottlenecks. Paper presented at: Proceedings of the IEEE\/CVF 31st Conference on Computer Vision and Pattern Recognition (CVPR 2018); June 18\u201023 2018; Salt Lake City UT.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_2_10_51_1","unstructured":"FalkH AltmeyerS HellinckxP et al. TACLeBench: a benchmark collection to support worst\u2010case execution time research. Paper presented at: Proceedings of the 16th International Workshop on Worst\u2010Case Execution Time Analysis (WCET 2016); July 5 2016; Toulouse France."},{"key":"e_1_2_10_52_1","doi-asserted-by":"publisher","DOI":"10.3390\/s19030644"},{"key":"e_1_2_10_53_1","unstructured":"Linux kernel profiling with perf.https:\/\/perf.wiki.kernel.org\/index.php\/Tutorial."},{"key":"e_1_2_10_54_1","doi-asserted-by":"crossref","unstructured":"BiondiA BalsiniA PaganiM RossiE MarinoniM ButtazzoG. A framework for supporting real\u2010time applications on dynamic reconfigurable FPGAs. 
Paper presented at: Proceedings of the 37th IEEE Real\u2010Time Systems Symposium (RTSS 2016); November 29\u2010December 2 2016; Porto Portugal.","DOI":"10.1109\/RTSS.2016.010"}],"container-title":["Software: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fspe.2840","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.2840","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/spe.2840","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.2840","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,3]],"date-time":"2023-09-03T11:08:10Z","timestamp":1693739290000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/spe.2840"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6]]},"references-count":53,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["10.1002\/spe.2840"],"URL":"https:\/\/doi.org\/10.1002\/spe.2840","archive":["Portico"],"relation":{},"ISSN":["0038-0644","1097-024X"],"issn-type":[{"value":"0038-0644","type":"print"},{"value":"1097-024X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6]]},"assertion":[{"value":"2019-12-20","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-03-25","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2020-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}