{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T13:59:42Z","timestamp":1742392782585},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual representations with promising zero-shot performance. To further improve its downstream accuracy, existing works propose additional learnable modules upon CLIP and fine-tune them by few-shot training sets. However, the resulting extra training cost and data requirement severely hinder the efficiency for model deployment and knowledge transfer. In this paper, we introduce a free-lunch enhancement method, CALIP, to boost CLIP's zero-shot performance via a parameter-free attention module. Specifically, we guide visual and textual representations to interact with each other and explore cross-modal informative features via attention. As the pre-training has largely reduced the embedding distances between two modalities, we discard all learnable parameters in the attention and bidirectionally update the multi-modal features, enabling the whole process to be parameter-free and training-free. In this way, the images are blended with textual-aware signals and the text representations become visual-guided for better adaptive zero-shot alignment. We evaluate CALIP on various benchmarks of 14 datasets for both 2D image and 3D point cloud few-shot classification, showing consistent zero-shot performance improvement over CLIP. Based on that, we further insert a small number of linear layers in CALIP's attention module and verify our robustness under the few-shot settings, which also achieves leading performance compared to existing methods. Those extensive experiments demonstrate the superiority of our approach for efficient enhancement of CLIP. 
Code is available at https:\/\/github.com\/ZiyuGuo99\/CALIP.<\/jats:p>","DOI":"10.1609\/aaai.v37i1.25152","type":"journal-article","created":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T16:04:04Z","timestamp":1687881844000},"page":"746-754","source":"Crossref","is-referenced-by-count":32,"title":["CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention"],"prefix":"10.1609","volume":"37","author":[{"given":"Ziyu","family":"Guo","sequence":"first","affiliation":[]},{"given":"Renrui","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Longtian","family":"Qiu","sequence":"additional","affiliation":[]},{"given":"Xianzheng","family":"Ma","sequence":"additional","affiliation":[]},{"given":"Xupeng","family":"Miao","sequence":"additional","affiliation":[]},{"given":"Xuming","family":"He","sequence":"additional","affiliation":[]},{"given":"Bin","family":"Cui","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2023,6,26]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/25152\/24924","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/25152\/24924","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T16:04:05Z","timestamp":1687881845000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/25152"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,26]]},"references-count":0,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,6,27]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v37i1.25152","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2023,6,26]]}}}
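The abstract above describes a parameter-free attention step: visual spatial tokens and per-class text features attend to each other with no learnable projections, and the attended features are blended back into the zero-shot logits. Below is a minimal PyTorch sketch of that idea, assuming CLIP-style L2-normalized features; the argument names, softmax temperatures (alpha, beta), and mixing weights (gammas) are illustrative placeholders, not the paper's exact formulation or tuned values, and the authors' reference implementation is at the GitHub link above.

```python
import torch
import torch.nn.functional as F


def calip_style_attention(f_text, f_spatial, f_global,
                          alpha=1.0, beta=1.0, gammas=(1.0, 1.0, 1.0)):
    """Sketch of parameter-free bidirectional cross-modal attention.

    f_text:    (K, C)  L2-normalized text features, one per class prompt
    f_spatial: (HW, C) L2-normalized visual spatial tokens
    f_global:  (C,)    L2-normalized global visual feature
    alpha, beta: softmax temperatures for the two attention directions
    gammas:      mixing weights for the three logit terms (all assumed)
    """
    # Cross-modal affinity; no learnable query/key/value projections.
    attn = f_spatial @ f_text.t()  # (HW, K)

    # Bidirectional, training-free feature updates:
    f_spatial_a = F.softmax(attn / alpha, dim=-1) @ f_text       # text -> visual
    f_text_a = F.softmax(attn.t() / beta, dim=-1) @ f_spatial    # visual -> text

    # Blend original and attended features into zero-shot logits.
    logits = gammas[0] * (f_global @ f_text.t())                       # (K,)
    logits = logits + gammas[1] * (f_global @ f_text_a.t())            # (K,)
    logits = logits + gammas[2] * (f_spatial_a @ f_text.t()).mean(0)   # (K,)
    return logits


if __name__ == "__main__":
    # Toy shapes: 10 classes, 512-dim CLIP features, 7x7 spatial grid.
    K, C, HW = 10, 512, 49
    f_text = F.normalize(torch.randn(K, C), dim=-1)
    f_spatial = F.normalize(torch.randn(HW, C), dim=-1)
    f_global = F.normalize(torch.randn(C), dim=-1)
    print(calip_style_attention(f_text, f_spatial, f_global).shape)  # torch.Size([10])
```

Because the affinity matrix is built directly from pre-aligned CLIP embeddings, the whole update needs no gradient steps or extra data, which is what makes the enhancement "free-lunch" in the abstract's terms; the few-shot variant mentioned there would replace the identity mappings here with a small number of linear layers.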