UIS Log: Synthetic User Interface with Screenshots Log
Description
These data correspond to the set of problems used to evaluate the proposal detailed in Martínez-Rojas et al. (2022). The evaluation uses a set of synthetic problems that simulate realistic administrative use cases. Each problem includes a UI Log with a synthetic screenshot for each event. Together, the problems cover 3 distinct processes (P) of varying complexity. The complexity levels are defined by the number of activities, the number of process execution variants, and the visual features that influence decisions between these variants.
The implementation of this proposal can be found in the tool available at this GitHub repository, which uses the logs of these 3 processes for validation. The processes are described below:
- P1 Client creation. A process with 5 activities and 2 variants. The single decision in this process is made based on the existence of an attachment in the reception email.
- P2 Client validation. A process with 7 activities and 2 variants. The decision is made based on the user’s response to a query.
- P3 Client deletion. A process with 7 activities and 4 variants. The decisions are made based on two conditions: (1) the existence of pending invoices and (2) the existence of an attachment to justify the payment of the invoices.
Each process contains a single decision point, although the one in P3 is more complex because it combines two conditions. All processes include:
- synthetic screen captures for their activities and
- a sample event log with a single instance for each variant.
To generate the objects for the evaluation, we generate event logs of different sizes (|L|) for each of these processes by deriving events from the sample event log. We consider log sizes |L| in {10, 25, 50, 100} events. Note that only complete instances are kept in the log; thus, the last instance is removed if it would take the log beyond |L| events.
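The truncation rule above can be sketched as follows. This is only an illustration and not the generator used to build the dataset: the function name `build_log`, the activity names, and the trace lengths are all hypothetical.

```python
from typing import List, Sequence


def build_log(sample_traces: Sequence[Sequence[str]],
              variant_order: Sequence[int],
              max_events: int) -> List[List[str]]:
    """Collect whole instances until the next one would exceed max_events."""
    log: List[List[str]] = []
    n_events = 0
    for variant in variant_order:
        trace = list(sample_traces[variant])
        if n_events + len(trace) > max_events:
            break  # same effect as adding the instance and then removing it
        log.append(trace)
        n_events += len(trace)
    return log


# Hypothetical sample log: one trace per variant (activity names are made up).
sample = [
    ["A", "B", "C", "D", "E"],  # variant 0, 5 events
    ["A", "B", "C", "E"],       # variant 1, 4 events
]
log = build_log(sample, variant_order=[0, 1, 0, 1], max_events=10)
print(len(log), sum(len(trace) for trace in log))
```

With |L| = 10, the third instance would bring the total to 14 events, so the sketch keeps 2 complete instances (9 events).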
Some of these logs are generated with a balanced number of instances per variant, while others are unbalanced (B?), meaning that the frequencies of the most frequent and the least frequent variants differ by more than 20%. To average the results over a collection of problems, 30 instances are randomly generated for each tuple <P, |L|, B?>.
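One possible reading of the balance criterion is sketched below. It assumes that "20%" refers to the gap between the relative frequencies (shares of instances) of the most and least frequent variants; the dataset description does not state the formula, so both this interpretation and the name `is_unbalanced` are assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def is_unbalanced(instance_variants: Sequence[str],
                  variants: Optional[Sequence[str]] = None,
                  threshold: float = 0.20) -> bool:
    """Return True if the share gap between the most and least frequent
    variants exceeds the threshold (assumed interpretation)."""
    if not instance_variants:
        raise ValueError("the log contains no instances")
    counts = Counter(instance_variants)
    # If the full variant list is given, variants absent from the log count as 0.
    names = list(variants) if variants is not None else list(counts)
    total = len(instance_variants)
    shares = [counts[name] / total for name in names]
    return max(shares) - min(shares) > threshold


balanced_log = ["v1"] * 5 + ["v2"] * 5    # shares 0.5 / 0.5
unbalanced_log = ["v1"] * 7 + ["v2"] * 3  # shares 0.7 / 0.3
print(is_unbalanced(balanced_log), is_unbalanced(unbalanced_log))
```

Passing the optional `variants` argument makes the check account for variants that never appear in the log.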
The dataset contains 3 zip files, one for each family. Each family corresponds to a process:
- Basic corresponds to P1
- Intermediate corresponds to P2
- Advanced corresponds to P3
Within each family folder, there are 30 different scenarios (one folder each), in which the look and feel of the applications shown in the screenshots undergoes slight variations. Within each scenario, log instances are generated by varying the data entered in the forms and the images or attachments present in the user interface, depending on the characteristics of each process.
For each scenario, there are 8 folders, each containing a concrete problem defined by Log_size (in {10, 25, 50, 100}) and Balanced (in {Balanced, Unbalanced}). The names of these folders have the format Family_LogSize_Balanced.
Each problem folder contains the UI Log and the screen captures.
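As a quick illustration of the naming scheme, the following sketch enumerates the 8 problem folder names expected for one family. The literal tokens (for example, whether the family is spelled "Basic" and the balance flag "Balanced"/"Unbalanced" in the folder name) are assumed from the description above and may differ from the actual names in the zips. Scenario folder names are not specified here, so they are left out.

```python
from itertools import product
from typing import List

FAMILIES = ("Basic", "Intermediate", "Advanced")  # P1, P2, P3
LOG_SIZES = (10, 25, 50, 100)
BALANCE = ("Balanced", "Unbalanced")
SCENARIOS_PER_FAMILY = 30


def problem_folder_names(family: str) -> List[str]:
    """Build the Family_LogSize_Balanced names for one family (assumed spelling)."""
    return [f"{family}_{size}_{balance}"
            for size, balance in product(LOG_SIZES, BALANCE)]


names = problem_folder_names("Basic")
total_problems = len(FAMILIES) * SCENARIOS_PER_FAMILY * len(names)
print(names)
print(total_problems)
```

Under these assumptions, each scenario holds 8 problem folders, which gives 3 x 30 x 8 = 720 problems in the whole dataset.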
References
Martínez-Rojas, A., Jiménez-Ramírez, A., Enríquez, J. G., & Reijers, H. A. (2022, September). Analyzing variable human actions for robotic process automation. In International Conference on Business Process Management (pp. 75-90). Cham: Springer International Publishing.