Abstract
Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting. Top-performing methods usually require a rehearsal buffer to store past pristine examples for experience replay; however, this limits their practical value due to privacy and memory constraints. In this work, we present a simple yet effective framework, DualPrompt, which learns a tiny set of parameters, called prompts, to properly instruct a pre-trained model to learn tasks arriving sequentially without buffering past examples. DualPrompt presents a novel approach to attach complementary prompts to the pre-trained backbone, and then formulates the objective as learning task-invariant and task-specific “instructions”. With extensive experimental validation, DualPrompt consistently achieves state-of-the-art performance under the challenging class-incremental setting. In particular, DualPrompt outperforms recent advanced continual learning methods that use relatively large buffer sizes. We also introduce a more challenging benchmark, Split ImageNet-R, to help generalize rehearsal-free continual learning research. Source code is available at https://github.com/google-research/l2p.
Z. Wang—Work done while the author was an intern at Google Cloud AI Research.
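To make the mechanism described in the abstract concrete, below is a minimal, illustrative sketch in PyTorch of how a shared task-invariant (G-)prompt and a set of task-specific (E-)prompts with learnable keys might be attached to a frozen transformer backbone. This is a simplification under stated assumptions, not the authors' implementation: the paper inserts prompts into specific self-attention layers of a pre-trained ViT, whereas this sketch simply prepends them to the input token sequence, and all module names, sizes, and the stand-in backbone are illustrative.

```python
# Minimal illustrative sketch (assumption-laden, NOT the authors' implementation):
# a frozen transformer backbone is "instructed" by two small sets of learnable
# prompts -- a shared, task-invariant G-Prompt and per-task E-Prompts that are
# selected via learnable keys when the task identity is unknown at test time.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualPromptSketch(nn.Module):
    def __init__(self, embed_dim=768, num_tasks=10, num_classes=100,
                 g_len=5, e_len=20, depth=2):
        super().__init__()
        # Stand-in for a frozen, pre-trained ViT encoder (kept frozen throughout).
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=12, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=depth)
        for p in self.backbone.parameters():
            p.requires_grad = False

        # Task-invariant G-Prompt shared by all tasks.
        self.g_prompt = nn.Parameter(0.02 * torch.randn(g_len, embed_dim))
        # Task-specific E-Prompts plus learnable keys for test-time selection.
        self.e_prompts = nn.Parameter(0.02 * torch.randn(num_tasks, e_len, embed_dim))
        self.e_keys = nn.Parameter(0.02 * torch.randn(num_tasks, embed_dim))
        self.head = nn.Linear(embed_dim, num_classes)  # classifier over all seen classes

    def forward(self, tokens, task_id=None):
        # tokens: (B, N, D) patch embeddings produced by the frozen tokenizer.
        batch = tokens.size(0)
        query = tokens.mean(dim=1)  # (B, D) query feature for prompt selection
        if task_id is None:
            # Test time: cosine-match the query against the E-Prompt keys.
            sim = F.normalize(query, dim=-1) @ F.normalize(self.e_keys, dim=-1).t()
            idx = sim.argmax(dim=-1)                     # (B,)
        else:
            # Training time: the ground-truth task identity is available.
            idx = torch.full((batch,), task_id, dtype=torch.long)
        e_prompt = self.e_prompts[idx]                   # (B, e_len, D)
        g_prompt = self.g_prompt.expand(batch, -1, -1)   # (B, g_len, D)
        # Simplification: prepend both prompts to the input tokens; the paper
        # instead attaches them to specific self-attention layers of the ViT.
        x = torch.cat([g_prompt, e_prompt, tokens], dim=1)
        x = self.backbone(x)
        return self.head(x.mean(dim=1))


# Usage with random patch embeddings (batch of 4, 196 tokens, width 768).
model = DualPromptSketch()
logits = model(torch.randn(4, 196, 768), task_id=0)
print(logits.shape)  # torch.Size([4, 100])
```

In this sketch only the prompts, keys, and classifier head would be optimized while the backbone stays frozen, which is consistent with the abstract's claim that no buffer of past examples is needed.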
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z. et al. (2022). DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13686. Springer, Cham. https://doi.org/10.1007/978-3-031-19809-0_36
DOI: https://doi.org/10.1007/978-3-031-19809-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19808-3
Online ISBN: 978-3-031-19809-0