{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T02:31:19Z","timestamp":1740105079270,"version":"3.37.3"},"reference-count":19,"publisher":"Wiley","issue":"17","license":[{"start":{"date-parts":[[2015,6,5]],"date-time":"2015-06-05T00:00:00Z","timestamp":1433462400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China","doi-asserted-by":"crossref","award":["61472362"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Zhejiang Province Science and Technology Plan","award":["2014C33070"]},{"name":"National Key Technology Research and Development Program of the Ministry of Science and Technology of China","award":["2014BAK14B01"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2015,12,10]]},"abstract":"Summary<\/jats:title>This paper presented two schemes of parallel 2D discrete wavelet transform (DWT) on Compute Unified Device Architecture graphics processing units. For the first scheme, the image and filter are transformed to spectral domain by using Fast Fourier Transformation (FFT), multiplied and then transformed back to space domain by using inverse FFT. For the second scheme, the image pixels are convolved directly with filters. Because there is no data relevance, the convolution for data points on different positions could be executed concurrently. To reduce data transfer, the boundary extension and down\u2010sampling are processed during data loading stage, and transposing is completed implicitly during data storage. A similar skill is adopted when parallelizing inverse 2D DWT. To further speed up the data access, the filter coefficients are stored in the constant memory. We have parallelized the 2D DWT for dozens of wavelet types and achieved a speedup factor of over 380 times compared with that of its CPU version. We applied the parallel 2D DWT in a ring artifact removal procedure; the executing speed was accelerated near 200 times compared with its CPU version. The experimental results showed that the proposed parallel 2D DWT on graphics processing units can significantly improve the performance for a wide variety of wavelet types and is promising for various applications. Copyright \u00a9 2015 John Wiley & Sons, Ltd.<\/jats:p>","DOI":"10.1002\/cpe.3559","type":"journal-article","created":{"date-parts":[[2015,6,6]],"date-time":"2015-06-06T03:41:31Z","timestamp":1433562091000},"page":"5188-5202","source":"Crossref","is-referenced-by-count":3,"title":["Parallel multi\u2010level 2D\u2010DWT on CUDA GPUs and its application in ring artifact removal"],"prefix":"10.1002","volume":"27","author":[{"given":"Leqing","family":"Zhu","sequence":"first","affiliation":[{"name":"School of Computer Science and Information Engineering Zhejiang Gongshang University Hangzhou 310018 China"}]},{"given":"Yadong","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering Zhejiang Gongshang University Hangzhou 310018 China"}]},{"given":"Daxing","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Graphics and Image Hangzhou Dianzi University Hangzhou 310018 China"}]},{"given":"Dadong","family":"Wang","sequence":"additional","affiliation":[{"name":"Quantitative Imaging CSIRO Computational Informatics North Ryde Sydney NSW 2113 Australia"}]},{"given":"Huiyan","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering Zhejiang Gongshang University Hangzhou 310018 China"}]},{"given":"Xun","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Engineering Zhejiang Gongshang University Hangzhou 310018 China"}]}],"member":"311","published-online":{"date-parts":[[2015,6,5]]},"reference":[{"key":"e_1_2_8_2_1","unstructured":"JohnDL OwensD NagaG MarkH JensK AaronE LefohnT TimothyJP. A survey of general\u2010purpose computation on graphics hardware.Eurographics 2005 State of the Art Reports 2005;21\u201351."},{"key":"e_1_2_8_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2008.31"},{"key":"e_1_2_8_4_1","doi-asserted-by":"crossref","unstructured":"HopfM ErtlT. Hardware accelerated wavelet transformations.Proc. EG\/IEEE TCVG Symp. Visualization (VisSym \u201800) 2000;93\u2013103.","DOI":"10.1007\/978-3-7091-6783-0_10"},{"key":"e_1_2_8_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00371-005-0332-0"},{"key":"e_1_2_8_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2006.887994"},{"key":"e_1_2_8_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2007.70716"},{"key":"e_1_2_8_8_1","unstructured":"WladimirJL RoerdinkJBTM JalbaAC. Accelerating wavelet\u2010based video coding on graphics hardware using CUDA.IEEE Proceedings of 6th International Symposium on Image and Signal Processing and Analysis2009;608\u2013613."},{"issue":"1","key":"e_1_2_8_9_1","first-page":"132","article-title":"Accelerating wavelet lifting on graphics hardware using CUDA","volume":"22","author":"Wladimir JL","year":"2010","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"e_1_2_8_10_1","doi-asserted-by":"crossref","unstructured":"FrancoJ Bernab\u00e9G Fern\u00e1ndezJ AcacioME. A parallel implementation of the 2d wavelet transform using CUDA.2009 17th Euromicro International Conference on Parallel Distributed and Network\u2010based Processing 2009;111\u2013118.","DOI":"10.1109\/PDP.2009.40"},{"key":"e_1_2_8_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-011-0224-7"},{"key":"e_1_2_8_12_1","unstructured":"MichalK DavidB MichalK PavelZ. 2\u2010d discrete wavelet transform using GPU.IEEE 26th International Symposium on Computer Architecture and High Performance Computing Workshops 2014;1\u20136."},{"key":"e_1_2_8_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.192463"},{"key":"e_1_2_8_14_1","doi-asserted-by":"crossref","unstructured":"MallatSG. Multifrequency channel decompositons of images and wavelet models.IEEE Transactions On Acoustics.And Signal Processing 1989;2091\u20132110.","DOI":"10.1109\/29.45554"},{"volume-title":"A Wavelet Tour of Signal Processing","year":"1998","author":"Mallat SG","key":"e_1_2_8_15_1"},{"key":"e_1_2_8_16_1","unstructured":"MorelandK AngelE. The FFT on a GPU.Proceedings of the ACM SIGGRAPH\/EUROGRAPHICS Conference on Graphics Hardware 2003 2003;112\u2013119."},{"key":"e_1_2_8_17_1","unstructured":"NVIDIA.CUFFT_Library 5.5.http:\/\/docs.nvidia.com\/cuda\/cufft\/index.html#advanced\u2010data\u2010layout[Accessed on 15 July 2013]."},{"key":"e_1_2_8_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2014.2308221"},{"key":"e_1_2_8_19_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3183"},{"key":"e_1_2_8_20_1","doi-asserted-by":"publisher","DOI":"10.1364\/OE.17.008567"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.3559","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.3559","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T10:38:45Z","timestamp":1693651125000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.3559"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,6,5]]},"references-count":19,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2015,12,10]]}},"alternative-id":["10.1002\/cpe.3559"],"URL":"https:\/\/doi.org\/10.1002\/cpe.3559","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2015,6,5]]}}}