{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,5]],"date-time":"2024-09-05T02:04:08Z","timestamp":1725501848217},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2019,12,7]],"date-time":"2019-12-07T00:00:00Z","timestamp":1575676800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,12,7]],"date-time":"2019-12-07T00:00:00Z","timestamp":1575676800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001827","name":"University of Amsterdam","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001827","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2020,4]]},"abstract":"While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from limited semantic understanding. To address this shortcoming, we propose to exploit pixelated object semantics to guide image colorization. The rationale is that human beings perceive and distinguish colors based on the semantic categories of objects. Starting from an autoregressive model, we generate image color distributions, from which diverse colored results are sampled. We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator. Specifically, the proposed network includes two branches. One branch learns what the object is, while the other branch learns the object colors. 
The network jointly optimizes a color embedding loss, a semantic segmentation loss and a color generation loss, in an end-to-end fashion. Experiments on Pascal VOC2012 and COCO-stuff reveal that our network, when trained with semantic segmentation labels, produces more realistic and finer results compared to the colorization state-of-the-art.","DOI":"10.1007\/s11263-019-01271-4","type":"journal-article","created":{"date-parts":[[2019,12,7]],"date-time":"2019-12-07T05:05:41Z","timestamp":1575695141000},"page":"818-834","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":52,"title":["Pixelated Semantic Colorization"],"prefix":"10.1007","volume":"128","author":[{"given":"Jiaojiao","family":"Zhao","sequence":"first","affiliation":[]},{"given":"Jungong","family":"Han","sequence":"additional","affiliation":[]},{"given":"Ling","family":"Shao","sequence":"additional","affiliation":[]},{"given":"Cees G. M.","family":"Snoek","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,12,7]]},"reference":[{"issue":"1","key":"1271_CR1","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1109\/TIP.2013.2288929","volume":"23","author":"A Bugeau","year":"2014","unstructured":"Bugeau, A., Ta, V. T., & Papadakis, N. (2014). Variational exemplar-based image colorization. IEEE Transactions on Image Processing, 23(1), 298\u2013307.","journal-title":"IEEE Transactions on Image Processing"},{"key":"1271_CR2","doi-asserted-by":"crossref","unstructured":"Caesar, H., Uijlings, J., & Ferrari, V. (2018). COCO-stuff: Thing and stuff classes in context. In CVPR.","DOI":"10.1109\/CVPR.2018.00132"},{"key":"1271_CR3","doi-asserted-by":"crossref","unstructured":"Cao, Y., Zhou, Z., Zhang, W., & Yu, Y. (2017). Unsupervised diverse colorization via generative adversarial networks. 
In Joint European conference on machine learning and knowledge discovery in databases.","DOI":"10.1007\/978-3-319-71249-9_10"},{"key":"1271_CR4","doi-asserted-by":"crossref","unstructured":"Charpiat, G., Hofmann, M., & Sch\u00f6lkopf, B. (2008). Automatic image colorization via multimodal predictions. In ECCV.","DOI":"10.1007\/978-3-540-88690-7_10"},{"key":"1271_CR5","doi-asserted-by":"crossref","unstructured":"Cheng, Z., Yang, Q., & Sheng, B. (2015). Deep colorization. In ICCV.","DOI":"10.1109\/ICCV.2015.55"},{"issue":"4","key":"1271_CR6","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"LC Chen","year":"2018","unstructured":"Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2018). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834\u2013848.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"6","key":"1271_CR7","doi-asserted-by":"publisher","first-page":"156","DOI":"10.1145\/2070781.2024190","volume":"30","author":"AYS Chia","year":"2011","unstructured":"Chia, A. Y. S., Zhuo, S., Gupta, R. K., Tai, Y. W., Cho, S. Y., Tan, P., et al. (2011). Semantic colorization with internet images. ACM Transactions on Graphics, 30(6), 156.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR8","doi-asserted-by":"crossref","unstructured":"Comaniciu, D., & Meer, P. (1997). Robust analysis of feature spaces: Color image segmentation. In CVPR.","DOI":"10.1109\/CVPR.1997.609410"},{"key":"1271_CR9","unstructured":"Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully convolutional networks. In NIPS."},{"key":"1271_CR10","doi-asserted-by":"crossref","unstructured":"Danelljan, M., Shahbaz Khan, F., Felsberg, M., & Van\u00a0de Weijer, J. (2014). 
Adaptive color attributes for real-time visual tracking. In CVPR.","DOI":"10.1109\/CVPR.2014.143"},{"key":"1271_CR11","doi-asserted-by":"crossref","unstructured":"Deshpande, A., Lu, J., Yeh, M. C., Chong, M. J., & Forsyth, D. (2017). Learning diverse image colorization. In CVPR.","DOI":"10.1109\/CVPR.2017.307"},{"key":"1271_CR12","doi-asserted-by":"crossref","unstructured":"Deshpande, A., Rock, J., & Forsyth, D. (2015). Learning large-scale automatic image colorization. In ICCV.","DOI":"10.1109\/ICCV.2015.72"},{"issue":"1","key":"1271_CR13","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","volume":"111","author":"M Everingham","year":"2015","unstructured":"Everingham, M., Eslami, S. A., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2015). The Pascal visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1), 98\u2013136.","journal-title":"International Journal of Computer Vision"},{"key":"1271_CR14","unstructured":"Frans, K. (2017). Outline colorization through tandem adversarial networks. arXiv:1704.08834."},{"issue":"2\u20133","key":"1271_CR15","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1007\/s11263-008-0171-3","volume":"86","author":"A Gijsenij","year":"2010","unstructured":"Gijsenij, A., Gevers, T., & van de Weijer, J. (2010). Generalized gamut mapping using image derivative structures for color constancy. International Journal of Computer Vision, 86(2\u20133), 127\u2013139.","journal-title":"International Journal of Computer Vision"},{"key":"1271_CR16","doi-asserted-by":"crossref","unstructured":"Guadarrama, S., Dahl, R., Bieber, D., Norouzi, M., Shlens, J., & Murphy, K. (2017). Pixcolor: Pixel recursive colorization. In BMVC.","DOI":"10.5244\/C.31.112"},{"key":"1271_CR17","doi-asserted-by":"crossref","unstructured":"Gupta, R. K., Chia, A. Y. S., Rajan, D., Ng, E. S., & Zhiyong, H. (2012). Image colorization using similar images. 
In Multimedia.","DOI":"10.1145\/2393347.2393402"},{"key":"1271_CR18","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In CVPR.","DOI":"10.1109\/CVPR.2016.90"},{"issue":"4","key":"1271_CR19","first-page":"47","volume":"37","author":"M He","year":"2018","unstructured":"He, M., Chen, D., Liao, J., Sander, P. V., & Yuan, L. (2018). Deep exemplar-based colorization. ACM Transactions on Graphics, 37(4), 47.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR20","doi-asserted-by":"crossref","unstructured":"Huang, Y. C., Tung, Y. S., Chen, J. C., Wang, S. W., & Wu, J. L. (2005). An adaptive edge detection based colorization algorithm and its applications. In Multimedia.","DOI":"10.1145\/1101149.1101223"},{"issue":"4","key":"1271_CR21","doi-asserted-by":"publisher","first-page":"110","DOI":"10.1145\/2897824.2925974","volume":"35","author":"S Iizuka","year":"2016","unstructured":"Iizuka, S., Simo-Serra, E., & Ishikawa, H. (2016). Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Transactions on Graphics, 35(4), 110.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR22","unstructured":"Ironi, R., Cohen-Or, D., & Lischinski, D. (2005). Colorization by example. In Rendering techniques."},{"key":"1271_CR23","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In CVPR.","DOI":"10.1109\/CVPR.2017.632"},{"key":"1271_CR24","unstructured":"Khan, F. S., Anwer, R. M., Van\u00a0de Weijer, J., Bagdanov, A. D., Vanrell, M., & Lopez, A. M. (2012). Color attributes for object detection. In CVPR."},{"key":"1271_CR25","unstructured":"Khan, F. S., Van De\u00a0Weijer, J., & Vanrell, M. (2009). Top-down color attention for object recognition. 
In ICCV."},{"key":"1271_CR26","unstructured":"Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In ICLR."},{"key":"1271_CR27","unstructured":"Kingma, D. P., & Welling, M. (2014). Auto-encoding variational bayes. In ICLR."},{"key":"1271_CR28","doi-asserted-by":"crossref","unstructured":"Larsson, G., Maire, M., & Shakhnarovich, G. (2016). Learning representations for automatic colorization. In ECCV.","DOI":"10.1007\/978-3-319-46493-0_35"},{"issue":"3","key":"1271_CR29","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1145\/1015706.1015780","volume":"23","author":"A Levin","year":"2004","unstructured":"Levin, A., Lischinski, D., & Weiss, Y. (2004). Colorization using optimization. ACM Transactions on Graphics, 23(3), 689\u2013694.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR30","doi-asserted-by":"crossref","unstructured":"Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., et\u00a0al. (2014). Microsoft COCO: Common objects in context. In ECCV.","DOI":"10.1007\/978-3-319-10602-1_48"},{"issue":"5","key":"1271_CR31","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1145\/1409060.1409105","volume":"27","author":"X Liu","year":"2008","unstructured":"Liu, X., Wan, L., Qu, Y., Wong, T. T., Lin, S., Leung, C. S., et al. (2008). Intrinsic colorization. ACM Transactions on Graphics, 27(5), 152.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR32","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In CVPR.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"1271_CR33","doi-asserted-by":"crossref","unstructured":"Lou, Z., Gevers, T., Hu, N., Lucassen, M. P., et\u00a0al. (2015). Color constancy by deep learning. In BMVC.","DOI":"10.5244\/C.29.76"},{"key":"1271_CR34","unstructured":"Luan, Q., Wen, F., Cohen-Or, D., Liang, L., Xu, Y. Q., & Shum, H. Y. (2007). Natural image colorization. 
In Proceedings of the 18th eurographics conference on rendering techniques."},{"key":"1271_CR35","doi-asserted-by":"crossref","unstructured":"Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In ICCV.","DOI":"10.1109\/ICCV.2015.178"},{"key":"1271_CR36","doi-asserted-by":"crossref","unstructured":"P\u00e9rez, P., Hue, C., Vermaak, J., & Gangnet, M. (2002). Color-based probabilistic tracking. In ECCV.","DOI":"10.1007\/3-540-47969-4_44"},{"key":"1271_CR37","doi-asserted-by":"crossref","unstructured":"Perez, E., Strub, F., De\u00a0Vries, H., Dumoulin, V., & Courville, A. (2018). Film: Visual reasoning with a general conditioning layer. In AAAI.","DOI":"10.1609\/aaai.v32i1.11671"},{"issue":"4","key":"1271_CR38","doi-asserted-by":"publisher","first-page":"838","DOI":"10.1137\/0330046","volume":"30","author":"BT Polyak","year":"1992","unstructured":"Polyak, B. T., & Juditsky, A. B. (1992). Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4), 838\u2013855.","journal-title":"SIAM Journal on Control and Optimization"},{"issue":"3","key":"1271_CR39","doi-asserted-by":"publisher","first-page":"1214","DOI":"10.1145\/1141911.1142017","volume":"25","author":"Y Qu","year":"2006","unstructured":"Qu, Y., Wong, T. T., & Heng, P. A. (2006). Manga colorization. ACM Transactions on Graphics, 25(3), 1214\u20131220.","journal-title":"ACM Transactions on Graphics"},{"key":"1271_CR40","unstructured":"Radford, A., Metz, L., & Chintala, S. (2016). Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR."},{"key":"1271_CR41","doi-asserted-by":"crossref","unstructured":"Royer, A., Kolesnikov, A., & Lampert, C. H. (2017). Probabilistic image colorization. In BMVC.","DOI":"10.5244\/C.31.85"},{"key":"1271_CR42","unstructured":"Salimans, T., Karpathy, A., Chen, X., Kingma, D. P., & Bulatov, Y. (2017). 
PixelCNN++: A PixelCNN implementation with discretized logistic mixture likelihood and other modifications. In ICLR."},{"key":"1271_CR43","doi-asserted-by":"crossref","unstructured":"Sanchez, J. M., & Binefa, X. (2000). Improving visual recognition using color normalization in digital video applications. In ICME.","DOI":"10.1109\/ICME.2000.871573"},{"key":"1271_CR44","doi-asserted-by":"crossref","unstructured":"Sangkloy, P., Lu, J., Fang, C., Yu, F., & Hays, J. (2017). Scribbler: Controlling deep image synthesis with sketch and color. In CVPR.","DOI":"10.1109\/CVPR.2017.723"},{"issue":"1","key":"1271_CR45","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1007\/BF00130487","volume":"7","author":"MJ Swain","year":"1991","unstructured":"Swain, M. J., & Ballard, D. H. (1991). Color indexing. International Journal of Computer Vision, 7(1), 11\u201332.","journal-title":"International Journal of Computer Vision"},{"key":"1271_CR46","unstructured":"Tai, Y. W., Jia, J. Y., & Tang, C. K. (2005). Local color transfer via probabilistic segmentation by expectation\u2013maximization. In CVPR."},{"issue":"9","key":"1271_CR47","doi-asserted-by":"publisher","first-page":"1582","DOI":"10.1109\/TPAMI.2009.154","volume":"32","author":"KEA van de Sande","year":"2010","unstructured":"van de Sande, K. E. A., Gevers, T., & Snoek, C. G. M. (2010). Evaluating color descriptors for object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1582\u20131596.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"11","key":"1271_CR48","doi-asserted-by":"publisher","first-page":"1482","DOI":"10.1364\/JOSA.59.001482","volume":"59","author":"GJ Van der Horst","year":"1969","unstructured":"Van der Horst, G. J., & Bouman, M. A. (1969). Spatiotemporal chromaticity discrimination. 
Journal of the Optical Society of America, 59(11), 1482\u20131488.","journal-title":"Journal of the Optical Society of America"},{"key":"1271_CR49","unstructured":"van\u00a0den Oord, A., Kalchbrenner, N., Espeholt, L., Vinyals, O., Graves, A., et\u00a0al. (2016). Conditional image generation with PixelCNN decoders. In NIPS."},{"key":"1271_CR50","doi-asserted-by":"crossref","unstructured":"Vondrick, C., Shrivastava, A., Fathi, A., Guadarrama, S., & Murphy, K. (2018). Tracking emerges by colorizing videos. In ECCV.","DOI":"10.1007\/978-3-030-01261-8_24"},{"key":"1271_CR51","doi-asserted-by":"crossref","unstructured":"Wang, Z., Simoncelli, E. P., & Bovik, A. C. (2003). Multiscale structural similarity for image quality assessment. In The 37th Asilomar conference on signals, systems & computers.","DOI":"10.1109\/ACSSC.2003.1292216"},{"key":"1271_CR52","doi-asserted-by":"crossref","unstructured":"Wang, X., Yu, K., Dong, C., & Loy, C. C. (2018). Recovering realistic texture in image super-resolution by deep spatial feature transform. In CVPR.","DOI":"10.1109\/CVPR.2018.00070"},{"issue":"3","key":"1271_CR53","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1145\/566654.566576","volume":"21","author":"T Welsh","year":"2002","unstructured":"Welsh, T., Ashikhmin, M., & Mueller, K. (2002). Transferring color to greyscale images. ACM Transactions on Graphics, 21(3), 277\u2013280.","journal-title":"ACM Transactions on Graphics"},{"issue":"5","key":"1271_CR54","doi-asserted-by":"publisher","first-page":"1120","DOI":"10.1109\/TIP.2005.864231","volume":"15","author":"L Yatziv","year":"2006","unstructured":"Yatziv, L., & Sapiro, G. (2006). Fast image and video colorization using chrominance blending. IEEE Transactions on Image Processing, 15(5), 1120\u20131129.","journal-title":"IEEE Transactions on Image Processing"},{"key":"1271_CR55","doi-asserted-by":"crossref","unstructured":"Zhang, R., Isola, P., & Efros, A. A. (2016). Colorful image colorization. 
In ECCV.","DOI":"10.1007\/978-3-319-46487-9_40"},{"key":"1271_CR56","doi-asserted-by":"crossref","unstructured":"Zhang, R., Zhu, J. Y., Isola, P., Geng, X., Lin, A. S., Yu, T., et\u00a0al. (2017). Real-time user-guided image colorization with learned deep priors. In SIGGRAPH.","DOI":"10.1145\/3072959.3073703"},{"key":"1271_CR57","unstructured":"Zhao, J., Liu, L., Snoek, C. G. M., Han, J., & Shao, L. (2018). Pixel-level semantics guided image colorization. In BMVC."}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-019-01271-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11263-019-01271-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-019-01271-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,7]],"date-time":"2022-10-07T22:40:34Z","timestamp":1665182434000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11263-019-01271-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12,7]]},"references-count":57,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,4]]}},"alternative-id":["1271"],"URL":"https:\/\/doi.org\/10.1007\/s11263-019-01271-4","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12,7]]},"assertion":[{"value":"20 January 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 November 
2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 December 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}