
Six Insights on Preference Signals for AI Training



“Eagle Traffic Signals – 1970s” by RS 1990 is licensed under CC BY-NC-SA 2.0.

At the intersection of rapid advancements in generative AI and our ongoing strategy refresh, we’ve been deeply engaged in researching, analyzing, and fostering conversations about AI and value alignment. Our goal is to ensure that our legal and technical infrastructure remains robust and suitable in this rapidly evolving landscape.

In these uncertain times, one thing is clear: there is an urgent need to develop new, nuanced approaches to digital sharing. This is Creative Commons’ speciality and we’re ready to take on this challenge by exploring a possible intervention in the AI space: preference signals. 

Understanding Preference Signals

We’ve previously discussed preference signals, but let’s revisit this concept. Preference signals would empower creators to indicate the terms by which their work can or cannot be used for AI training. Preference signals would represent a range of creator preferences, all rooted in the shared values that inspired the Creative Commons (CC) licenses. At the moment, preference signals are not meant to be legally enforceable. Instead, they aim to define a new vocabulary and establish new norms for sharing and reuse in the world of generative AI.

For instance, a preference signal might be “Don’t train,” “Train, but disclose that you trained on my content,” or even “Train, only if using renewable energy sources.”
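
To ground these examples, here is a minimal sketch of what a machine-readable vocabulary of such signals might look like. This is purely illustrative: Creative Commons has not defined any format, and the signal names and metadata fields below are hypothetical assumptions drawn from the examples above.

```python
from enum import Enum

class TrainingPreference(Enum):
    """Hypothetical vocabulary of AI-training preference signals.

    The values mirror the illustrative examples above; they are not
    an official Creative Commons specification.
    """
    NO_TRAINING = "dont-train"                # "Don't train"
    TRAIN_WITH_DISCLOSURE = "train-disclose"  # "Train, but disclose that you trained on my content"
    TRAIN_RENEWABLE_ONLY = "train-renewable"  # "Train, only if using renewable energy sources"

# A creator might attach a signal to a work's metadata alongside its license:
work_metadata = {
    "title": "Example Photograph",
    "license": "CC BY-NC-SA 2.0",
    "training_preference": TrainingPreference.TRAIN_WITH_DISCLOSURE.value,
}
```

Like the CC licenses themselves, the value of such a vocabulary would come less from any single format than from a broad, shared understanding of what each signal means.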

Why Do We Need New Tools for Expressing Creator Preferences?

Empowering creators to signal how they wish their content to be used to train generative AI models is crucial.

We’re in the research phase of exploring what a system of preference signals could look like, and over the next several months we’ll be hosting more roundtables and workshops to discuss the idea and gather feedback from a range of stakeholders. In June, we took a big step forward by organizing our most focused and dedicated conversation about preference signals in New York City, hosted by the Engelberg Center at NYU.

Six Highlights from Our NYC Workshop on Preference Signals

1. Creative Commons is a global movement, making us uniquely positioned to tackle what sharing means in the context of generative AI. We understand the importance of stewarding the commons and the balance between human creation and public sharing.

2. Designing tools for sharing in an AI-driven era involves collectively defining a new social contract for the digital commons. This process is essential for maintaining a healthy and collaborative community. Just as the CC licenses gave creators options beyond “no rights reserved” and “all rights reserved,” preference signals have the potential to define a spectrum of sharing preferences in the context of AI that goes beyond the binary options of opt-in or opt-out. (A minimal sketch of this idea appears after these six highlights.)

3. Should preference signals communicate individual values and principles such as equity and fairness? Adding content to the commons with a CC license is an act of communicating values; should preference signals do the same? Workshop participants emphasized the need for mechanisms that support informed consent by both the creator and user.

4. The most obvious and prevalent use case for preference signals is to limit the use of content in generative AI models in order to protect artists and creators. There is also a paradox: users may want to benefit from more relaxed creator preferences on others’ content than they are willing to grant on their own. We believe that preference signals that meet the sector-specific needs of creators and users, and social and community-driven norms that continue to strengthen the commons, are not mutually exclusive.

5. While tags for AI-generated content are becoming common, what about tags for human-created content? The general goal of preference signals should be to foster the commons and encourage more human creativity and sharing. For many, discussions about AI are inherently discussions about labor issues and the risk of exploitation. The law currently has no concept of “lovingly human,” since humanness has been taken for granted until now. Is “lovingly human” the new “non-commercial”? Generative AI models also force us to consider what it means to be a creator, especially as most digital creative tools will soon be driven by AI. Is there a specific set of activities that needs to be protected in the process of creating and sharing? And how do we address the inputs and outputs of human and generative AI collaboration?

6. We must ensure that AI benefits everyone. Increased public investment and participatory governance of AI are vital. Large commercial entities should provide a public benefit in exchange for using creator content for training purposes. We cannot rely on commercial players to set forth industry norms that influence the future of the open commons.
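
To make the second highlight’s “beyond opt-in or opt-out” point concrete, here is a minimal sketch of how a conditional signal differs from a simple yes/no flag. The signal strings reuse the hypothetical vocabulary sketched earlier; the conditions and field names are likewise assumptions for illustration, not a proposed standard.

```python
from dataclasses import dataclass

@dataclass
class TrainingContext:
    """How a prospective trainer intends to use a work (illustrative)."""
    will_disclose_sources: bool
    uses_renewable_energy: bool

def is_use_permitted(signal: str, context: TrainingContext) -> bool:
    """Evaluate a conditional preference signal rather than a binary flag."""
    if signal == "dont-train":
        return False
    if signal == "train-disclose":
        return context.will_disclose_sources
    if signal == "train-renewable":
        return context.uses_renewable_energy
    # Unknown signals fall back to the most protective interpretation.
    return False

# The same training context can satisfy one signal and not another:
ctx = TrainingContext(will_disclose_sources=True, uses_renewable_energy=False)
print(is_use_permitted("train-disclose", ctx))   # True
print(is_use_permitted("train-renewable", ctx))  # False
```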

Next Steps

Moving forward, our success will depend on expanded and representative community consultations, which we will continue over the coming months through further roundtables and workshops.

These steps are just the beginning: our hope is to pilot a framework within the next year. Watch this space as we explore and share more details and plans. We’re grateful to Morrison Foerster for providing support for the workshop in New York.

Join us by supporting this ongoing work

You have the power to make a difference in a way that suits you best. By donating to CC, you are not only helping us continue our vital work, but you also benefit from a tax-deductible contribution. Thank you for your support.

Posted 23 August 2024
