{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,23]],"date-time":"2025-04-23T23:25:44Z","timestamp":1745450744838,"version":"3.32.0"},"reference-count":71,"publisher":"Wiley","issue":"10","license":[{"start":{"date-parts":[[2005,12,13]],"date-time":"2005-12-13T00:00:00Z","timestamp":1134432000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2006,8,25]]},"abstract":"Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery \u2018pipelines\u2019. A related trend is that more and more scientific communities realize the benefits of sharing their data and computational services, and are thus contributing to a distributed data and computational community infrastructure (a.k.a. \u2018the Grid\u2019). However, this infrastructure is only a means to an end and ideally scientists should not be too concerned with its existence. The goal is for scientists to focus on development and use of what we call scientific workflows. These are networks of analytical steps that may involve, e.g., database access and querying steps, data analysis and mining steps, and many other steps including computationally intensive jobs on high\u2010performance cluster computers. In this paper we describe characteristics of and requirements for scientific workflows as identified in a number of our application projects. We then elaborate on Kepler, a particular scientific workflow system, currently under development across a number of scientific data management projects. 
We describe some key features of Kepler and its underlying Ptolemy II system, planned extensions, and areas of future research. Kepler is a community\u2010driven, open source project, and we always welcome related projects and new contributors to join. Copyright \u00a9 2005 John Wiley & Sons, Ltd.","DOI":"10.1002\/cpe.994","type":"journal-article","created":{"date-parts":[[2005,12,13]],"date-time":"2005-12-13T12:19:57Z","timestamp":1134476397000},"page":"1039-1065","source":"Crossref","is-referenced-by-count":1153,"title":["Scientific workflow management and the Kepler system"],"prefix":"10.1002","volume":"18","author":[{"given":"Bertram","family":"Lud\u00e4scher","sequence":"first","affiliation":[]},{"given":"Ilkay","family":"Altintas","sequence":"additional","affiliation":[]},{"given":"Chad","family":"Berkley","sequence":"additional","affiliation":[]},{"given":"Dan","family":"Higgins","sequence":"additional","affiliation":[]},{"given":"Efrat","family":"Jaeger","sequence":"additional","affiliation":[]},{"given":"Matthew","family":"Jones","sequence":"additional","affiliation":[]},{"given":"Edward A.","family":"Lee","sequence":"additional","affiliation":[]},{"given":"Jing","family":"Tao","sequence":"additional","affiliation":[]},{"given":"Yang","family":"Zhao","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2005,12,13]]},"reference":[{"key":"e_1_2_1_2_2","unstructured":"Web Services Description Language (WSDL) Version 1.2 June2003.http:\/\/www.w3.org\/TR\/wsdl12."},{"key":"e_1_2_1_3_2","unstructured":"OWL Web Ontology Language Reference W3C Proposed Recommendation December2003.http:\/\/www.w3.org\/TR\/owl\u2010ref\/."},{"key":"e_1_2_1_4_2","unstructured":"Scientific Data Management Framework Workshop Argonne National Labs August2003. 
Available at:http:\/\/sdm.lbl.gov\/\u02dcarie\/sdm\/SDM.Framework.wshp.htm."},{"key":"e_1_2_1_5_2","unstructured":"e\u2010Science Workflow Services Workshop e\u2010Science Institute Edinburgh U.K. December2003. Available at:http:\/\/www.nesc.ac.uk\/esi\/events\/303\/index.html."},{"key":"e_1_2_1_6_2","unstructured":"e\u2010Science Grid Environments Workshop e\u2010Science Institute Edinburgh U.K. May2004. Available at:http:\/\/www.nesc.ac.uk\/esi\/events\/."},{"key":"e_1_2_1_7_2","unstructured":"GRIST Workshop on Service Composition for Data Exploration in the Virtual Observatory California Institute of Technology July2004. Available at:http:\/\/grist.caltech.edu\/sc4devo\/."},{"key":"e_1_2_1_8_2","unstructured":"LINK\u2010Up Workshop on Scientific Workflows San Diego Supercomputer Center October2004. Available at:http:\/\/kbis.sdsc.edu\/events\/link\u2010up\u201011\u201004\/."},{"key":"e_1_2_1_9_2","unstructured":"Workflow in Grid Systems Workshop GGF10 Berlin Germany March2004. Available at:http:\/\/www.extreme.indiana.edu\/groc\/Worflow\u2010call.html."},{"key":"e_1_2_1_10_2","unstructured":"NSF\/ITR. 
Enabling the Science Environment for Ecological Knowledge (SEEK).http:\/\/www.seek.ecoinformatics.org."},{"key":"e_1_2_1_11_2","article-title":"A knowledge environment for the biodiversity and ecological sciences","author":"Michener WK","year":"2004","journal-title":"Journal of Intelligent Information Systems"},{"key":"e_1_2_1_12_2","unstructured":"AltintasIet al.A modeling and execution environment for distributed scientific workflows.Proceedings of the 15th International Conference on Scientific and Statistical Database Management (SSDBM) Boston MA 2003."},{"key":"e_1_2_1_13_2","series-title":"Lecture Notes in Computer Science","volume-title":"Proceedings of the International Workshop on Data Integration in the Life Sciences (DILS)","author":"Bowers S","year":"2004"},{"volume-title":"Proceedings of the 24th International Conference on Conceptual Modeling","series-title":"Lecture Notes in Computer Science","author":"Bowers S","key":"e_1_2_1_14_2"},{"key":"e_1_2_1_15_2","unstructured":"Kepler: A system for scientific workflows.http:\/\/www.kepler\u2010project.org."},{"key":"e_1_2_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/S1389-0344(00)00071-X"},{"key":"e_1_2_1_17_2","unstructured":"PetersonL YinE NelsonD AltintasI Lud\u00e4scherB CritchlowT WyrobekAJ ColemanMA.Mining the frequency distribution of transcription factor binding sites of ionizing radiation responsive genes.New Horizons in Genomics DOE\/SC\u20100071 Santa Fe NM 30 March\u20131 April2003."},{"key":"e_1_2_1_18_2","article-title":"An XML\u2010enabled data extraction tool for Web sources","author":"Liu L","year":"2001","journal-title":"International Journal of Information Systems (Special Issue on Data Extraction, Cleaning, and Reconciliation)"},{"key":"e_1_2_1_19_2","unstructured":"National Center for Biotechnology Information (NCBI) 2004.http:\/\/www.ncbi.nlm.nih.gov\/."},{"key":"e_1_2_1_20_2","unstructured":"Ptolemy II project and system. 
Department of EECS UC Berkeley 2004.http:\/\/ptolemy.eecs.berkeley.edu\/ptolemyII\/."},{"key":"e_1_2_1_21_2","unstructured":"BrooksC LeeEA LiuX NeuendorfferS ZhaoY ZhengH.Heterogeneous concurrent modeling and design in Java (vols. 1\u20133).Technical Memoranda UCB\/ERL M04\/27 M04\/16 M04\/17 Department of EECS University of California Berkeley 2004."},{"key":"e_1_2_1_22_2","doi-asserted-by":"publisher","DOI":"10.1002\/jcc.540141112"},{"key":"e_1_2_1_23_2","unstructured":"AbramsonD GiddyJ KotlerL.High performance parametric modeling with Nimrod\/G: Killer application for the global Grid.Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS) Cancun Mexico May2000. Available at:http:\/\/www.csse.monash.edu.au\/\u02dcdavida\/nimrod\/."},{"key":"e_1_2_1_24_2","doi-asserted-by":"publisher","DOI":"10.1021\/bk-1998-0712"},{"key":"e_1_2_1_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF03040952"},{"key":"e_1_2_1_26_2","unstructured":"NSF\/ITR. GEON: A research project to create cyberinfrastructure for the geosciences.http:\/\/www.geongrid.org."},{"key":"e_1_2_1_27_2","unstructured":"Scientific Data Management Center (SDM).http:\/\/sdm.lbl.gov\/sdmcenter\/;http:\/\/www.npaci.edu\/online\/v5.17\/scidac.html."},{"key":"e_1_2_1_28_2","unstructured":"Biomedical Informatics Research Network Coordinating Center (BIRN\u2010CC) University of California San Diego CA.http:\/\/nbirn.net\/."},{"key":"e_1_2_1_29_2","unstructured":"ROADNet: Real\u2010time observatories applications and data management network.http:\/\/roadnet.ucsd.edu."},{"key":"e_1_2_1_30_2","first-page":"5","volume-title":"Interoperating Geographic Information Systems","author":"Sheth A","year":"1998"},{"key":"e_1_2_1_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100054"},{"key":"e_1_2_1_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24741-8_25"},{"volume-title":"Bioinformatics: Managing Scientific Data","year":"2003","author":"Lud\u00e4scher 
B","key":"e_1_2_1_33_2"},{"key":"e_1_2_1_34_2","unstructured":"BowersS LinK Lud\u00e4scherB.On integrating scientific resources through semantic registration.Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM) Santorini Island Greece 2004."},{"key":"e_1_2_1_35_2","unstructured":"Proceedings of the ICAPS Workshop on Planning for Web Services Trento Italy June2003."},{"key":"e_1_2_1_36_2","unstructured":"BlytheJ DeelmanE GilY.Planning for workflow construction and maintenance on the Grid.Proceedings of the ICAPS Workshop on Planning for Web Services Trento Italy June2003."},{"key":"e_1_2_1_37_2","unstructured":"Lud\u00e4scherB AltintasI GuptaA.Compiling abstract scientific workflows into Web service workflows.Proceedings of the 15th International Conference on Scientific and Statistical Database Management (SSDBM) Boston MA 2003. Available at:http:\/\/kbis.sdsc.edu\/SciDAC\u2010SDM\/ludaescher\u2010compiling.pdf."},{"key":"e_1_2_1_38_2","unstructured":"Lud\u00e4scherB NashA.Web service composition through declarative queries: The case of conjunctive queries with union and negation.Proceedings of the 20th International Conference on Data Engineering (ICDE) 2004."},{"key":"e_1_2_1_39_2","doi-asserted-by":"crossref","unstructured":"AlonsoG MohanC.Workflow management systems: The next generation of distributed processing tools.Advanced Transaction Models and Architectures Jajodia S Kerschberg L (eds.) 
1997.","DOI":"10.1007\/978-1-4615-6217-7_2"},{"volume-title":"Workflow\u2010based Process Controlling","year":"2004","author":"zur Muehlen M","key":"e_1_2_1_40_2"},{"key":"e_1_2_1_41_2","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7301.001.0001","volume-title":"Workflow Management: Models, Methods, and Systems (Cooperative Information Systems)","author":"van der Aalst W","year":"2002"},{"key":"e_1_2_1_42_2","unstructured":"van der AalstW.Don't go with the flow: Web services composition standards exposed.IEEE Intelligent Systems. Web Services\u2014Been There Done That? Trends and Controversies January\/February2003. Available at:http:\/\/tmitwww.tm.tue.nl\/research\/patterns\/download\/ieeewebflow.pdf."},{"key":"e_1_2_1_43_2","unstructured":"CurberaF GolandY KleinJ LeymanF RollerD ThatteS WeerawaranaS. Business Process Execution Language for Web Services (BPEL4WS) Version 1.0 2002.http:\/\/www.ibm.com\/developerworks\/library\/ws\u2010bpel\/."},{"issue":"3","key":"e_1_2_1_44_2","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1022883727209","article-title":"Workflow patterns","volume":"14","author":"van der Aalst W","year":"2003","journal-title":"Distributed and Parallel Databases"},{"key":"e_1_2_1_45_2","unstructured":"The Taverna Project.http:\/\/taverna.sf.net\/."},{"key":"e_1_2_1_46_2","unstructured":"The Triana Project.http:\/\/www.trianacode.org\/."},{"issue":"5","key":"e_1_2_1_47_2","article-title":"The Object\u2010Protocol Model: Design, implementation, and scientific applications","volume":"20","author":"Chen I","year":"1995","journal-title":"ACM Transactions on Information Systems"},{"key":"e_1_2_1_48_2","unstructured":"MeidanisJ VossenG WeskeM.Using workflow management in DNA sequencing.Proceedings of the International Conference on Cooperative Information Systems (CoopIS) 1996."},{"key":"e_1_2_1_49_2","unstructured":"AilamakiA IoannidisYE LivnyM.Scientific workflow management by database management.Proceedings of the 10th 
International Conference on Scientific and Statistical Database Management (SSDBM) Capri Italy 1998."},{"key":"e_1_2_1_50_2","unstructured":"KiepuszewskiB.Expressiveness and suitability of languages for control flow modelling in workflows.PhD Thesis Queensland University of Technology 2002."},{"key":"e_1_2_1_51_2","unstructured":"KahnG MacQueenDB.Coroutines and networks of parallel processes.Proceedings of the IFIP Congress 77 Gilchrist B (ed.) 1977;993\u2013998."},{"key":"e_1_2_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.381846"},{"volume-title":"Flow\u2010Based Programming\u2014A New Approach to Application Development","year":"1994","author":"Morrison JP","key":"e_1_2_1_53_2"},{"key":"e_1_2_1_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-7091-7488-3_27"},{"key":"e_1_2_1_55_2","unstructured":"McPhillipsTM.Pipelined scientific workflows for inferring evolutionary relationships.Unpublished Paper Natural Diversity Discovery Project 2005."},{"key":"e_1_2_1_56_2","unstructured":"ProjectG. GridFTP\u2014Universal Data Transfer for the Grid 2000.http:\/\/www.globus.org\/datagrid\/gridftp.html."},{"key":"e_1_2_1_57_2","unstructured":"The Globus Alliance.http:\/\/www.globus.org."},{"key":"e_1_2_1_58_2","unstructured":"SDSC Storage Resource Broker.http:\/\/www.sdsc.edu\/srb\/."},{"key":"e_1_2_1_59_2","unstructured":"R\u2014Statistical data analysis.http:\/\/www.r\u2010project.org."},{"key":"e_1_2_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2002.805829"},{"key":"e_1_2_1_61_2","unstructured":"Lud\u00e4scherB AltintasI.On providing declarative design and programming constructs for scientific workflows based on process networks.Technical Report SciDAC\u2010SPA\u2010TN\u20102003\u201001 San Diego Supercomputer Center 2003. 
Available at:http:\/\/kbi.sdsc.edu\/SciDAC\u2010SDM\/scidac\u2010tn\u2010map\u2010constructs."},{"key":"e_1_2_1_62_2","unstructured":"ReekieHJ.Realtime signal processing: Dataflow visual and functional programming.PhD Thesis School of Electrical Engineering University of Technology Sydney 1995."},{"volume-title":"Implicit Parallel Programming in pH","year":"2001","author":"Nikhil RS","key":"e_1_2_1_63_2"},{"key":"e_1_2_1_64_2","unstructured":"Lud\u00e4scherB.Towards actor\u2010oriented Web service\u2010based scientific workflows (or: How to handle handles).Technical Report San Diego Supercomputer Center September2004."},{"key":"e_1_2_1_65_2","doi-asserted-by":"publisher","DOI":"10.1016\/B978-012387582-2\/50033-2"},{"key":"e_1_2_1_66_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bth361"},{"key":"e_1_2_1_67_2","doi-asserted-by":"crossref","DOI":"10.1002\/cpe.992","article-title":"Programming scientific and distributed workflow with Triana services","author":"Churches D","year":"2006","journal-title":"Concurrency and Computation: Practice and Experience"},{"key":"e_1_2_1_68_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1024000426962"},{"key":"e_1_2_1_69_2","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.938"},{"key":"e_1_2_1_70_2","unstructured":"YuJ BuyyaR.A taxonomy of workflow management systems for Grid computing.Technical Report GRIDS\u2010TR\u20102005\u20101 Grid Computing and Distributed Systems Laboratory University of Melbourne 2005. 
Available at:http:\/\/www.gridbus.org\/reports\/GridWorkflowTaxonomy.pdf."},{"key":"e_1_2_1_71_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0098-3004(02)00031-6"},{"key":"e_1_2_1_72_2","unstructured":"ArmstrongR GannonD GeistA KeaheyK KohnS McInnesL ParkerS SmolinskiB.Toward a common component architecture for high\u2010performance scientific computing.Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computation August1999."}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.994","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.994","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,6]],"date-time":"2025-01-06T13:17:12Z","timestamp":1736169432000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.994"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,12,13]]},"references-count":71,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2006,8,25]]}},"alternative-id":["10.1002\/cpe.994"],"URL":"https:\/\/doi.org\/10.1002\/cpe.994","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2005,12,13]]}}}