
1 Introduction

Open government data (OGD) refers to the publication of public sector data in open and reusable formats, without restriction or charge, for use by society [1]. This movement has gained great popularity across the world [2]. The public sector generates and maintains a large amount of data [3], and its ability to collect data with low-cost devices enabled by the Internet of Things (IoT) has resulted in an explosion of digital data. Nevertheless, data is useless unless it is used to generate benefits [4]. Although policy-makers and smart cities could use these data themselves, the data need to be opened and shared with the public in order to achieve the full potential of OGD.

Artificial Intelligence (AI) techniques have been used extensively to support decision making and problem solving in different industries for many years [5,6,7]. AI can create new insights from combined datasets. Furthermore, AI systems acquire their intelligence by using data to understand how past problems were solved and by applying this learning for predictive purposes. With the advancement of open government data, it becomes possible to combine AI technologies and OGD to obtain more value from data.

Some studies have explored the economic value of OGD [8, 9] and its social and political value, such as transparency and innovation [10, 11]. However, the use of AI for OGD remains scarce, although there are some examples of employing AI to process open data. For example, Piscopo et al. [12] concluded that a prediction model of community capacity built with machine learning is more accurate than models built with conventional statistics. Saltos and Cocea [13] used open crime data to show that decision trees can reliably predict crime frequency. Kouziokas [14] applied AI to forecast high crime risk transportation areas in an urban environment, testing for the most efficient algorithm and the optimal model. Most of these studies combined different datasets and suggested an in-depth investigation of AI, especially of its application in specific areas such as the public sector [15]. These studies help us understand how open government data could drive AI. However, most of them are conducted from a technical perspective and do not focus on understanding value creation from OGD based on AI.

In this study, we investigate the following research question: what kind of value is generated through the application of AI, and how? To answer it, we conducted an exploratory study of OGD innovation cases that use AI. We analyze the whole process from collecting data to creating value, including how AI technologies facilitate data analysis and which specific value is generated.

2 Research Approach

To investigate the value created from OGD using AI, we employ the comparative case study research method. In this type of research, two or more cases are systematically compared through examination of a real-world phenomenon within its naturally occurring context [16]. We follow the steps suggested by Yin [17]: (1) identify a specific research question to determine whether a comparative case study is appropriate; (2) select the cases and design the case study; (3) collect case data and analyze the case evidence; (4) report the case study.

Firstly, our research aim is to investigate how value is generated through AI and what kinds of value result. The types of research question most appropriate for case studies are “how” and “why” questions, since they focus on the underlying process and the causal relationships [16]. Accordingly, we focus on cases that use AI in order to explore how value is created from OGD. For this, we examine the whole process from collecting the data to evaluating its impact.

Secondly, we identified three cases. The selection criteria were that a case should be in the public domain, should employ AI, and should have data available. We searched the European data portal for suitable cases. This portal covers more than 500 cases of OGD applications or ideas from the EU, yet the number of cases employing AI is limited. Furthermore, we wanted to cover a diversity of AI applications, since, for example, several of the cases use chatbots. Finally, we selected three cases that reflect the three phases of AI application. In phase 1, AI uses structured data to automate simple and repetitive processes. In phase 2, AI is able to adapt to and learn from changes in the automated process. In phase 3, AI provides new and innovative insights by analyzing and learning from previous actions [18]. The first selected case is AI parking, a German application that uses real-time traffic data to help people find a parking space. The second case is a chatbot application in Singapore. The last case is crime prevention through artificial intelligence as employed by Kent Police in the UK.

For the third step, we construct a context-input-process-output/outcome framework in order to establish the logic of the comparative study; the framework is discussed in detail in the next section. Finally, we analyze the cases according to the framework and report the results, which are presented in Sect. 4.

3 Towards a Comparison Framework

This section presents a comparative framework for OGD innovation cases using AI. Firstly, we describe the context-input-process-output/outcome (CIPO) model. Subsequently, we derive a series of elements from the CIPO model in order to construct the comparative framework for analyzing AI-based innovation cases.

3.1 Context-Input-Process-Output/Outcome (CIPO) Model

Systems theory offers an important insight into the role of information systems or technology in the sequence from data to information to knowledge. Systems thinking, especially soft systems thinking, provides a way to conceptualize the social process in a particular context [19]. Services or products such as applications can be created by exploring OGD, a process in which data is transformed into results. Any transformation process can be viewed as an input-output system [20]. Other researchers later developed this into the input-process-output model and applied it in areas such as project evaluation and team innovation management [21, 22]. However, projects that use government data or are applied in the public sector should be studied as a complex process, because they are influenced by socio-political and other contextual factors [23]. In addition, a distinction is often made between output and outcome, in which the output is the immediate result and the outcome is the long-term impact [24]. In this work, we study both the short-term and the long-term impact. Therefore, to better understand the OGD innovation cases, we used the context-input-process-output/outcome (CIPO) framework, which has been used in the past to analyze OGD policies [24].

3.2 Elements for Comparing Cases

In this section, we derive the elements for comparing the OGD innovation cases from the literature, related reports and websites. The findings are sorted into four parts following the CIPO model, namely project context, project input, process and output/outcome (see Table 1).

Table 1. CIPO Framework elements for comparing cases

Project Context.

The background describes the situation at hand, including the places, the stakeholders and their contributions. The project objectives are the aims to be achieved through the project. These projects may be set up to address multiple societal challenges. The OGD goals emphasized by different countries and different levels of government vary. It is therefore necessary to investigate the context of a project in order to evaluate its outcome.

Project Input.

Project input mainly refers to the data and the technology, since high-quality data is the fuel and the technologies are the core part driving the innovation process with AI. This part focuses on what kind of data is required to enable a successful AI application and how the data is provided (batch or real time, as a file or through an API). We then evaluate the quality of the opened data according to the 5-star open data model developed by Berners-Lee, as shown in Table 2 [25]. In addition, AI technology is often treated as a black box, and it is frequently unknown which type of AI technology is used to process the data and how the data is explored.

Table 2. The 5-star open data model, taken from Berners-Lee [25]
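The cumulative logic of the 5-star scheme can be illustrated with a minimal Python sketch. The class `DatasetDescription`, its field names and the function `star_rating` are hypothetical names introduced here for illustration only; the level definitions follow the commonly published wording of the 5-star scheme, in which each level presupposes all lower ones.

```python
from dataclasses import dataclass


@dataclass
class DatasetDescription:
    """Hypothetical description of a published dataset (field names are assumptions)."""
    on_web_with_open_license: bool  # 1 star: on the web, any format, open license
    machine_readable: bool          # 2 stars: structured data, e.g. a spreadsheet, not a scan
    non_proprietary_format: bool    # 3 stars: open format, e.g. CSV instead of Excel
    uses_uris: bool                 # 4 stars: URIs identify the things described
    linked_to_other_data: bool      # 5 stars: links to other data provide context


def star_rating(d: DatasetDescription) -> int:
    """Return the number of stars; counting stops at the first unmet level."""
    levels = [
        d.on_web_with_open_license,
        d.machine_readable,
        d.non_proprietary_format,
        d.uses_uris,
        d.linked_to_other_data,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars
```

Because the levels are cumulative, a dataset that links to other data but is not published in a non-proprietary format would still be rated at most two stars under this sketch.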

Process.

The process steps are analyzed in terms of how datasets were combined, processed and explored to create value. Data can be analyzed from multiple perspectives, resulting in many useful applications.

Output/Outcome.

This part evaluates both the immediate result and the long-term outcome, which includes the value as well as the risks and challenges. The value refers mainly to the benefits of the AI applications. Although it is hard to identify exactly which values can be gained from OGD, because datasets differ and applications vary, some scholars have tried to classify the benefits expected from leveraging open government data, for example into economic value and social value [4]. This classification helps us identify the value of these projects.
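One way to make the four CIPO parts operational is to record each case in the same structure and then read one element across all cases. The following Python sketch is only an illustration: the class `CipoCase`, the function `compare` and the dictionary keys are hypothetical, and the values are limited to facts stated in the case descriptions of Sect. 4.

```python
from dataclasses import dataclass, field


@dataclass
class CipoCase:
    """One case described along the four CIPO parts (structure is an assumption)."""
    name: str
    context: dict = field(default_factory=dict)
    project_input: dict = field(default_factory=dict)
    process: dict = field(default_factory=dict)
    output_outcome: dict = field(default_factory=dict)


def compare(cases, part, element):
    """Collect one element of one CIPO part across all cases, side by side."""
    return {
        case.name: getattr(case, part).get(element, "not reported")
        for case in cases
    }


cases = [
    CipoCase(
        name="Case 1",
        context={"country": "Germany"},
        project_input={"data": "traffic parking data"},
        output_outcome={"application": "AIPARK"},
    ),
    CipoCase(
        name="Case 2",
        context={"country": "Singapore"},
        project_input={"data": "government Q&A data"},
        output_outcome={"application": "Ask Jamie"},
    ),
    CipoCase(
        name="Case 3",
        context={"country": "UK"},
        project_input={"data": "crime data"},
        output_outcome={"application": "Predpol"},
    ),
]

# Read the same element across the three cases.
print(compare(cases, "project_input", "data"))
```

Elements that a case does not report are returned as "not reported" rather than omitted, so that gaps in the comparison stay visible.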

4 Case Analysis Result

In this section, three cases from Germany, Singapore and the UK that use AI to explore government data are compared. The cases are diverse in background, objectives, data used, AI tools and results (Table 3).

Table 3. Overview of each element of three cases

4.1 Project Context

The stakeholders face different challenges, and the objectives they emphasize also differ. In the first case, the main aim is to reduce the time spent searching for parking through prediction using AI. In the latter two cases, which were launched by governments, the main focus is on social challenges such as improving service quality and decreasing the crime rate while expenditure is being reduced.

4.2 Project Input

Type of Data.

In terms of data content, the three cases use traffic parking data, government Q&A data and crime data, respectively. In terms of data structure, unstructured text data is used in case 2, while structured data is collected in case 3, where five data points for each crime incident are used to generate predictions. The data source in case 1 is a mix of structured location information and unstructured picture information. This analysis shows the innovative potential of OGD.

In addition, we evaluated the quality of the opened data. According to the 5-star model developed by Berners-Lee, we give each of the three cases one star. In case 1, only part of the data was shared with the government. In case 2, people can find questions and answers and the service catalog on the Singapore government website, but the full Q&A data and the catalog data were not in an open and machine-readable format. In case 3, the number and type of crimes in Kent can be found on the website, but not all the data points needed for crime prediction are published, owing to data privacy.

Type of AI Technologies.

Several technologies, namely neural networks, natural language processing and machine learning, are adopted in the three cases. The algorithm models, however, combine scene-based (domain) knowledge with technology. For example, for crime prediction, experts from the related fields together created specific machine learning models based on knowledge of criminal behavior. Similarly, in the parking and chatbot applications, developers constructed specific algorithm models so that they could be better applied in the respective fields.

4.3 Project Process

We analyzed the process steps taken to transform inputs into outputs by looking at three steps. (1) Data collection: after the original data has been collected, the necessary clean data points are extracted from the combined datasets. (2) Using AI to process the data: AI technologies are used to construct models and process the data. (3) Application in a specific field: finally, the data exploration processes are turned into applications.
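The three steps above can be sketched as a small Python pipeline. This is a deliberately simplified illustration and not the method of any of the three cases: the function names are hypothetical, and a plain frequency count stands in for the AI model in step 2.

```python
from collections import Counter


def collect(raw_records, required_fields):
    """Step 1: keep only records that contain every required data point."""
    return [
        {f: r[f] for f in required_fields}
        for r in raw_records
        if all(r.get(f) is not None for f in required_fields)
    ]


def fit_frequency_model(records, key):
    """Step 2 (stand-in for the AI step): count past events per key value."""
    return Counter(r[key] for r in records)


def apply_model(model, top_n):
    """Step 3: turn the model into an application-level answer (a ranked list)."""
    return [value for value, _ in model.most_common(top_n)]


# Made-up records for illustration only.
raw = [
    {"area": "A", "type": "x"},
    {"area": "B", "type": "y"},
    {"area": "A", "type": "z"},
    {"area": None, "type": "x"},  # dropped in step 1: missing data point
]
clean = collect(raw, ["area", "type"])
model = fit_frequency_model(clean, "area")
print(apply_model(model, top_n=1))
```

Separating the steps in this way makes it possible to replace the stand-in model with a real learning algorithm without changing the data collection or the application layer.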

4.4 Output/Outcome

Output.

The three cases yielded three applications, namely AIPARK, Ask Jamie and Predpol. The first two applications are still under development and improvement. Kent Police, however, decided to stop using the Predpol system in 2018.

Outcomes of Generated Value.

Previous literature has summarized a variety of benefits of open data, including social, political and economic benefits [3, 4, 26]. We derived the values generated by the cases according to this literature. For case 1, the main value is economic, since the team attracted investment and founded a company in Germany. Moreover, the team showed that this innovative service makes the time spent searching for parking three times shorter than on a normal trip. For case 2, the chatbot improves service delivery, especially by saving citizens’ time for simple information requests. A government survey showed that 58% of citizens considered it successful and 60% found the answers useful (footnote 1), which helped to improve public satisfaction with the government. On the operational level, it also saved public servants’ time and optimized the administrative process so that they can focus on more complex tasks. In case 3, street violence fell by 6% during a four-month trial in Medway (footnote 2). This has social and political value in improving police service delivery.

Risks and Challenges.

In the first case, the main challenge concerns data updates, since outdated data could lead to a wrong parking suggestion. This application could also be extended by using number plate data. The second case faces a similar challenge regarding recommendation accuracy: only about 60% of users find it useful. For the prediction application in the UK, there are several challenges. The first concern is the lack of transparency about how predictive algorithms reach their decisions. Another concern is that the software entrenches pre-existing discrimination, since the program had “learned” racism and bias. The last major risk is data privacy. The principle of AI is that the more data an algorithm receives, the better it becomes at its task; more datasets, however, generally mean less privacy.

5 Discussion

The framework was found to be useful for understanding the cases at a high level. It helps to analyze the cases systematically and enables us to compare them along the same dimensions, which provides a comprehensive picture of OGD innovation using AI. However, each innovation is a complicated process that includes many details, such as data collection and data cleaning. The framework cannot cover the entire process; it focuses on the critical steps in the innovation cases. In this section, we discuss the case analysis results according to the framework.

5.1 Context: Collaboration Between Government and Private Company

Although these projects were launched by different stakeholders, the implementation of each project was based on cooperation between government and private companies. In the first case, the company conducted a data collection project in cooperation with the local municipality to validate the parking prediction model. In the second and third cases, the government provided the data and then hired technology companies to produce solutions. Private companies are therefore an indispensable intermediary for creating more applications and more value for society from OGD using AI [28]. Future research on combining AI and OGD can focus on strengthening cooperation between government and enterprises.

5.2 Input: Improve the Data Quality and Technology Transparency

Improve the Data Quality.

The three applications still provide relatively useful predictions even though the datasets are rated with one star. Indeed, some AI mechanisms can deal with low data quality. Nevertheless, AI applications are highly dependent on data quality (accuracy, completeness, availability, timeliness) [29]. Whereas people can easily spot mistakes in data, AI is not able to spot such mistakes, which can easily result in incorrect findings [30]. Evaluating the quality of the information before using it with AI technology is therefore important. Furthermore, if data is formatted to comply with the 5-star model suggested by Berners-Lee [25], the use of OGD based on AI becomes relatively easy, as the data is structured and well described and has a persistent URI that ensures a continuous stream of data. There is also a need for semantic descriptions of open data in order to explore the use of OGD. Such descriptions are a condition for exploring OGD to find new insights, and they add to representational information quality.
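Two of the quality dimensions named above, completeness and timeliness, can be checked before data is passed to an AI component. The following Python sketch shows one possible way to do so; the function names, the field names and the freshness threshold are assumptions made for illustration and are not taken from the cases.

```python
from datetime import datetime, timedelta


def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields)
        for r in records
    )
    return complete / len(records)


def timeliness(records, timestamp_field, now, max_age):
    """Share of records whose timestamp is no older than max_age."""
    if not records:
        return 0.0
    fresh = sum(
        1
        for r in records
        if r.get(timestamp_field) is not None
        and now - r[timestamp_field] <= max_age
    )
    return fresh / len(records)


# Made-up parking records for illustration only.
now = datetime(2020, 1, 1, 12, 0)
records = [
    {"spot": "P1", "free": 3, "updated": datetime(2020, 1, 1, 11, 55)},
    {"spot": "P2", "free": None, "updated": datetime(2020, 1, 1, 9, 0)},
]
print(completeness(records, ["spot", "free"]))
print(timeliness(records, "updated", now, timedelta(minutes=10)))
```

Low scores on either measure would be a signal to repair or refresh the data first, since, as noted above, the AI component itself will not notice such shortcomings.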

Improve the Transparency of AI Technologies.

There is a wide variety of AI techniques, and AI can be used to create various types of value. There seems to be no single best AI technique; which type of AI is used depends on the type of problem being tackled. AI can be used to increase efficiency (parking), facilitate a positive relationship with citizens (chatbots) and find anomalies (crime prevention). The use of AI might also have negative effects, such as strengthening discrimination, since little information is given about how predictive algorithms reach their decisions. These effects need to be managed, and clear accountabilities should be ensured. The need for guidance by a legal framework should be further explored.

5.3 Output/Outcome: Create More Value While Effectively Preventing Risks

Increase the Users’ Take-Up Rate.

In order to achieve value from these projects sustainably, it is necessary to increase the users’ take-up rate [31]. Both AI parking and the chatbot are still at an early stage of development, and the number of users is limited. As for the crime prediction software, Kent Police stopped using it in 2018. Developing the application is therefore not the end point. It is important to keep improving the service in order to increase the level of usage and so sustain value creation.

Objectives Value.

OGD initiatives often focus on transparency, accountability, innovation for companies and participation [32]. In contrast, the objectives of the three cases were focused on efficiency, innovation and crime prevention. The objectives were often very specific, and the value generated contributed little to the original OGD objectives. Also, some of the AI projects were beneficial for governments rather than for the public. This could be due to selection bias, but it could also be because the public has limited capabilities to make use of AI. We recommend stimulating AI projects that aim at these general objectives of OGD initiatives.

The Balance Between Privacy and Utility.

In the parking case, the application might easily be extended with vehicle number plate recognition, and in the third case the names of criminals might be identified, both of which might violate privacy. The more data is released and available, the greater the need for privacy-preserving techniques [33]. There is a tradeoff between privacy and utility in OGD publishing [33]. One suggestion is to release only the metadata, including semantic descriptions, and to grant access only once the purpose has become clear, in order to avoid misuse. With the use of AI algorithms, data collection and processing become less visible [34]. As such, mechanisms for dealing with the risks, and AI governance, become more important.
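The tradeoff between privacy and utility can be made concrete with a small Python sketch. The text above does not name a specific privacy-preserving technique; the sketch assumes a simple k-anonymity-style suppression, in which records are released only if their combination of quasi-identifiers occurs at least k times. The function name, the field names and the records are hypothetical.

```python
from collections import Counter


def suppress_small_groups(records, quasi_identifiers, k):
    """Keep only records whose quasi-identifier combination occurs at least k times."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]


# Made-up incident records for illustration only.
incidents = [
    {"area": "A", "hour": 22, "type": "x"},
    {"area": "A", "hour": 22, "type": "y"},
    {"area": "B", "hour": 3, "type": "x"},  # unique combination, easy to single out
]

for k in (1, 2, 3):
    released = suppress_small_groups(incidents, ["area", "hour"], k)
    print(k, len(released))
```

A larger k protects individuals better but releases fewer records, which is one simple way of seeing why more privacy generally means less utility for the AI application.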

6 Conclusion and Implication

This study developed a framework for comparing OGD innovation cases using AI that consists of the critical elements related to an AI project. The framework was useful for describing and comparing the AI cases and for drawing attention to the main value creation elements. It can be used to analyze other cases, including cases outside the OGD or AI domain, which can help governments to improve the data they provide and to adjust their objectives.

AI can be used to create more value from OGD. Surprisingly, the types of value created using AI differ from those that OGD initiatives often aim for. The cases show the high potential of AI, but also the challenges that complicate attaining the value. The analysis of the three cases shows that it is necessary to strengthen the collaboration between companies and government, since companies might have both data and the capacity to process data using AI. In addition, the quality of the data is a key aspect of creating value using AI. This becomes more important because AI is less aware of shortcomings in the data, whereas a data analyst can take them into account. Following the 5-star model when opening data can simplify the use of AI.

A wide variety of AI techniques, such as neural networks, natural language processing and machine learning, can be used to create various types of value. It is important, however, to improve algorithm transparency and accountability in order to assess whether there is bias or discrimination. Furthermore, the project objectives and the value created in the three cases hardly contributed to the ideals of the open government movement, such as transparency, accountability, innovation and participation. This may be explained by the fact that the projects investigated were initiated by governments. As a further recommendation for policy-makers, we suggest more experiments with AI for OGD that focus on the OGD objectives, in order to facilitate learning.