\( LogRank^+ \): A Novel Approach to Support Business Process Event Log Sampling

  • Conference paper
Web Information Systems Engineering – WISE 2020 (WISE 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12343)

Abstract

Modern information systems collect and store massive amounts of business process event logs. Over the past decades, numerous process discovery approaches have been proposed to extract descriptive process models from such event logs. To improve process discovery efficiency, event log sampling techniques have been proposed. A sample log is a carefully selected subset of the original log that requires less computational cost to process. However, existing sampling techniques have difficulties, e.g., low efficiency, in handling large-scale event logs. To tackle this challenge, we propose a novel ranking-based event log sampling approach, denoted as \( LogRank^+ \), to support efficient sampling. In addition, we introduce a framework to evaluate the effectiveness of different sampling techniques by quantifying the sampling efficiency and the quality of sample logs. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Experimental evaluation with both synthetic and real-life event logs demonstrates that the proposed approach improves event log sampling efficiency while ensuring high quality of the obtained sample logs from a process discovery perspective.
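
To make the general idea of ranking-based log sampling concrete, the following is a minimal sketch and not the \( LogRank^+ \) algorithm itself, whose scoring is not described in this abstract. It assumes a log is a list of traces (tuples of activity names), scores each trace by the average support of its directly-follows pairs, and keeps the top-ranked share. The names `directly_follows`, `sample_log`, and `ratio` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def directly_follows(trace):
    """Return the set of directly-follows pairs (a, b) in one trace."""
    return set(zip(trace, trace[1:]))


def sample_log(log, ratio):
    """Keep the top `ratio` share of traces, ranked by an illustrative score.

    Assumption: this score is a stand-in for illustration only; it is not
    the ranking used by LogRank+.
    """
    # Count how many traces exhibit each directly-follows pair.
    pair_counts = Counter(p for trace in log for p in directly_follows(trace))

    def score(trace):
        pairs = directly_follows(trace)
        # Average support of the trace's pairs; traces without pairs score 0.
        return sum(pair_counts[p] for p in pairs) / len(pairs) if pairs else 0.0

    k = max(1, round(len(log) * ratio))
    # sorted() is stable, so tied traces keep their original log order.
    ranked = sorted(log, key=score, reverse=True)
    return ranked[:k]


log = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "b", "c", "d"),
    ("a", "e", "d"),
]
sample = sample_log(log, 0.5)
```

With this toy log, the two traces following the most common behaviour rank highest, so the half-size sample keeps the mainstream variant and drops the rarer ones. A real evaluation would compare models discovered from the sample against the original log, as the framework in the abstract proposes.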

Notes

  1. http://www.processmining.org/

  2. https://svn.win.tue.nl/repos/prom/Packages/SoftwareProcessMining/

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grant 61902222, the Science and Technology Development Fund of Shandong Province of China under Grant ZR2017MF027, the Taishan Scholars Program of Shandong Province under Grants ts20190936 and tsqn201909109, and the SDUST Research Fund under Grant 2015TDJH102.

Author information

Corresponding authors

Correspondence to Qingtian Zeng or Hua Duan.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, C., Pei, Y., Zeng, Q., Duan, H., Zhang, F. (2020). \( LogRank^+ \): A Novel Approach to Support Business Process Event Log Sampling. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science, vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_29

  • DOI: https://doi.org/10.1007/978-3-030-62008-0_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62007-3

  • Online ISBN: 978-3-030-62008-0

  • eBook Packages: Computer Science, Computer Science (R0)
