Fixed #373 -- Added CompositePrimaryKey. by csirmazbendeguz · Pull Request #18056 · django/django · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #373 -- Added CompositePrimaryKey. #18056

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

csirmazbendeguz
Copy link
Contributor

@csirmazbendeguz csirmazbendeguz commented Apr 7, 2024

Trac ticket number

ticket-373

Branch description

This branch adds the CompositePrimaryKey field. If present, Django will create a composite primary key.

Please refer to the docs for a more in-depth explanation.

Proposal
Previous PR
Composite FK
Admin
Composite Generic FK
Model._is_pk_set()
Tuple Lookups
Serial Fields

class Tenant(models.Model):
    pass


class User(models.Model):
    pk = models.CompositePrimaryKey("tenant_id", "id", primary_key=True)
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    id = models.IntegerField()


class Comment(models.Model):
    pk = models.CompositePrimaryKey("tenant_id", "id")
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    id = models.IntegerField()
    user_id = models.IntegerField()
    user = models.ForeignObject(
        User,
        on_delete=models.CASCADE,
        from_fields=("tenant_id", "user_id"),
        to_fields=("tenant_id", "id"),
        related_name="+",
    )

Checklist

  • This PR targets the main branch.
  • The commit message is written in past tense, mentions the ticket number, and ends with a period.
  • I have checked the "Has patch" ticket flag in the Trac system.
  • I have added or updated relevant tests.
  • I have added or updated relevant docs, including release notes if applicable.
  • For UI changes, I have attached screenshots in both light and dark modes.

@grjones
Copy link

grjones commented Apr 17, 2024

I was trying out this exciting branch and ran into this error when running a test:

<...>/lib/python3.12/site-packages/django/db/models/lookups.py:30: in __init__
    self.rhs = self.get_prep_lookup()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = TupleIn(<django.db.models.fields.composite.Cols object at 0x107560980>, <django.db.models.sql.query.Query object at 0x1074e23f0>)

    def get_prep_lookup(self):
        if not isinstance(self.lhs, Cols):
            raise ValueError(
                "The left-hand side of the 'in' lookup must be an instance of Cols"
            )
        if not isinstance(self.rhs, Iterable):
>           raise ValueError(
                "The right-hand side of the 'in' lookup must be an iterable"
            )
E           ValueError: The right-hand side of the 'in' lookup must be an iterable

The issue stems from the use of isnull like so:

MyModel.objects.filter(
    type_override__severity__isnull=False
).update(severity="high")

Curious if anyone ran into this as well.

Edited for traceback:

<...>
lib/python3.12/site-packages/django/db/models/sql/compiler.py:2080: in pre_sql_setup
    self.query.add_filter("pk__in", query)
lib/python3.12/site-packages/django/db/models/sql/query.py:1601: in add_filter
    self.add_q(Q((filter_lhs, filter_rhs)))
lib/python3.12/site-packages/django/db/models/sql/query.py:1617: in add_q
    clause, _ = self._add_q(q_object, self.used_aliases)
lib/python3.12/site-packages/django/db/models/sql/query.py:1649: in _add_q
    child_clause, needed_inner = self.build_filter(
lib/python3.12/site-packages/django/db/models/sql/query.py:1563: in build_filter
    condition = self.build_lookup(lookups, col, value)
lib/python3.12/site-packages/django/db/models/sql/query.py:1393: in build_lookup
    lookup = lookup_class(lhs, rhs)
lib/python3.12/site-packages/django/db/models/lookups.py:30: in __init__
    self.rhs = self.get_prep_lookup()

So, this is part of SQLUpdateCompiler and is coming from the update code path.

@csirmazbendeguz
Copy link
Contributor Author

csirmazbendeguz commented Apr 18, 2024

Thanks for testing and reporting the issue @grjones! Indeed, I forgot to handle this use case. I'll look into it this week.

@csirmazbendeguz csirmazbendeguz force-pushed the ticket_373 branch 3 times, most recently from c75dcdd to 4be1c68 Compare April 20, 2024 00:12
@csirmazbendeguz
Copy link
Contributor Author

@grjones, FYI I pushed the fix

@grjones
Copy link

grjones commented Apr 20, 2024

@grjones, FYI I pushed the fix

Nice! I hope this gets merged in soon. Your branch has been working great for me.

@grjones
Copy link

grjones commented Apr 22, 2024

I may have found one other small issue. When adding a regular primary_key=True on a single field, a unique constraint is added. But when using this branch, it becomes an IntegrityError instead. Adding a UniqueConstraint on the composite fields is a work-a-round but ideally would be captured in this PR. Imo, this PR is sooooo close. I'm excited for it to be merged in.

@csirmazbendeguz
Copy link
Contributor Author

@grjones , thanks, I appreciate the feedback, I'll look into it. If a model defines Meta.primary_key, defining primary_key=True on a field should not be possible - could you give me a code example so I know how to reproduce the issue? I didn't know Django added unique constraints to primary keys, I'll check, but isn't that redundant?

@grjones
Copy link

grjones commented Apr 23, 2024

@grjones , thanks, I appreciate the feedback, I'll look into it. If a model defines Meta.primary_key, defining primary_key=True on a field should not be possible - could you give me a code example so I know how to reproduce the issue? I didn't know Django added unique constraints to primary keys, I'll check, but isn't that redundant?

I'll see if I can give you a solid failing test. My "unique constraint" phrasing might not be exactly right. But ultimately, I believe Django queries the DB first to see if the new object's PK already exists and throws a validation error. The composite key logic doesn't seem to be doing that and so an unhandled IntegrityError is raised instead.

@csirmazbendeguz
Copy link
Contributor Author

csirmazbendeguz commented May 1, 2024

@grjones , sorry for the late reply, I've been busy last week. Could you give me more specifics? What's the error message you expect?

@grjones
Copy link

grjones commented May 2, 2024

@grjones , sorry for the late reply, I've been busy last week. Could you give me more specifics? What's the error message you expect?

Actually, I think it's mostly ok. I was using Django Spanner and it's just not quite working with composite keys and will need to be fixed there. I wrote this and it passed. It probably shouldn't say Id though?

from django.core.exceptions import ValidationError
from django.test import TestCase

from .models import Tenant, User


class CompositePKCleanTests(TestCase):
    """
    Test the .clean() method of composite_pk models.
    """

    @classmethod
    def setUpTestData(cls):
        cls.tenant = Tenant.objects.create()

    def test_validation_error_is_raised_when_pk_already_exists(self):
        test_cases = [
            {"tenant": self.tenant, "id": 2412, "email": "user2412@example.com"},
            {"tenant_id": self.tenant.id, "id": 5316, "email": "user5316@example.com"},
            {"pk": (self.tenant.id, 7424), "email": "user7424@example.com"},
        ]
        expected = "{'id': ['User with this Id already exists.']}"
        for fields in test_cases:
            User.objects.create(**fields)
            with self.assertRaisesMessage(ValidationError, expected):
                User(**fields).clean()

Copy link
Contributor

@LilyFoote LilyFoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start!

I've left a bunch of ideas for improvement. Feel free to push back if you think I'm wrong about anything.

django/db/models/base.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
tests/composite_pk/test_get.py Outdated Show resolved Hide resolved
tests/composite_pk/test_get.py Outdated Show resolved Hide resolved
tests/composite_pk/test_update.py Outdated Show resolved Hide resolved
tests/composite_pk/tests.py Outdated Show resolved Hide resolved
tests/composite_pk/tests.py Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Contributor Author

Thank you so much for taking the time to review my changes @LilyFoote !
I have two questions:

  1. If Meta.primary_key is defined, this PR will automatically add a composite field called primary_key to the model. What do you think about this approach? I felt like it was easier to handle the composite primary keys this way as we can run checks against the meta class instead of traversing the model's fields for a composite field.
  2. I wrote a lot of tests testing the underlying queries made by the ORM. It makes a lot of sense to me, but I haven't seen this type of tests that much in the Django source code - do these tests look okay to you?

@LilyFoote
Copy link
Contributor

If Meta.primary_key is defined, this PR will automatically add a composite field called primary_key to the model. What do you think about this approach?

I don't feel strongly that this is better or worse than another option here, so happy to go with what you think is best.

I wrote a lot of tests testing the underlying queries made by the ORM. It makes a lot of sense to me, but I haven't seen this type of tests that much in the Django source code - do these tests look okay to you?

I like your tests quite a bit - they're pretty readable and comprehensive. The main issue I have with them is that they're written for specific databases instead of for generic database features. Where possible Django strongly prefers to test based on features because then the tests apply to as many databases as possible (including third party database libraries). I think the asserts of the actual SQL might be a bit tricky to adapt though, so we might need a different way to check what they're checking.

Also, after I reviewed yesterday, I thought of some more things:

  • We should add migrations tests to make sure that adding/removing Meta.primary_key works correctly and that removing a field that's part of a primary key also does something appropriate.
  • We might want tests for composite keys in forms and the admin. Maybe there's other areas too that we need to check the interactions.

tests/composite_pk/models/tenant.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/sql/compiler.py Outdated Show resolved Hide resolved
tests/composite_pk/test_filter.py Outdated Show resolved Hide resolved
tests/composite_pk/test_filter.py Outdated Show resolved Hide resolved
tests/composite_pk/test_filter.py Outdated Show resolved Hide resolved
Copy link
Contributor

@LilyFoote LilyFoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking better and better. I have a few new comments and questions.

django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/base.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/query.py Outdated Show resolved Hide resolved
tests/composite_pk/test_checks.py Outdated Show resolved Hide resolved
tests/composite_pk/test_create.py Show resolved Hide resolved
tests/composite_pk/test_filter.py Outdated Show resolved Hide resolved
tests/composite_pk/test_get.py Outdated Show resolved Hide resolved
@csirmazbendeguz csirmazbendeguz changed the title Fixed #373 -- Added CompositeField-based Meta.primary_key. Fixed #373 -- Added CompositePrimaryKey-based Meta.primary_key. May 16, 2024
Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments here and there. It's great to see the progress you've made on combining MultiColSource 🎉

Something that came through my mind while reviewing is that we likely want a plan to eventually deprecate Options.pk in favor of Options.primary_key?

django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
tests/composite_pk/test_filter.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Show resolved Hide resolved
django/db/models/fields/related.py Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Contributor Author

Thanks @charettes !

Something that came through my mind while reviewing is that we likely want a plan to eventually deprecate Options.pk in favor of Options.primary_key?

I'm not sure what you mean by that, I don't think we can, because Options.pk refers to the field, while Options.primary_key is the list of field names.

@csirmazbendeguz
Copy link
Contributor Author

So as far as I understand, at the moment MultiColSource is used by Django internally to represent JOINs on multiple fields - that's why it has a sources field.

I'm not sure it's the right decision to reuse this for composite fields, which on the other hand don't need sources, it just needs to represent a list of Cols as an expression.

Let me know what you think!

@charettes
Copy link
Member

I'm not sure what you mean by that, I don't think we can, because Options.pk refers to the field, while Options.primary_key is the list of field names.

You're completely right. In this case is pk set to CompositePrimaryKey when Meta.primary_key is defined and is primary_key set when a non-composite primary is used as well?

django/db/models/fields/composite.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
Copy link
Contributor

@LilyFoote LilyFoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two minor points about error message wording. They're not urgent to address.

tests/composite_pk/test_get.py Outdated Show resolved Hide resolved
django/db/models/fields/composite.py Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Contributor Author

csirmazbendeguz commented May 20, 2024

I'm not sure what you mean by that, I don't think we can, because Options.pk refers to the field, while Options.primary_key is the list of field names.

You're completely right. In this case is pk set to CompositePrimaryKey when Meta.primary_key is defined and is primary_key set when a non-composite primary is used as well?

It would not be set, if it's a regular primary key, Meta.primary_key is None.

@csirmazbendeguz
Copy link
Contributor Author

csirmazbendeguz commented Oct 14, 2024

I'll resolve the conflicts and rebase this after #18674 is merged.

@csirmazbendeguz
Copy link
Contributor Author

Hey everyone!

We have managed to tackle some issues in separate PRs.
I cannot find anything else that we can merge separately.
It's time to review this PR once again.
It looks like a big one, but there's only 442 lines of code changes in django, the rest are docs and tests.

Any review is appreciated.

@csirmazbendeguz
Copy link
Contributor Author

@charettes , could you take a look at this when you have the time? We'll need your approval before proceeding (as per Sarah's request). Any feedback is appreciated.

Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work @csirmazbendeguz and nearing the finish line 🏁

I reviewed everything except the tests themselves as I remember having another go at it. Let me know if you'd like me to review something in particular that I might have missed.

django/db/models/fields/tuple_lookups.py Show resolved Hide resolved
Comment on lines +2009 to +2013
def _pk_constraint_sql(self, columns):
return self.sql_pk_constraint % {
"columns": ", ".join(self.quote_name(column) for column in columns)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that might have been an opportunity to always use the PRIMARY KEY constraint syntax even for single column primary keys (instead of the PRIMARY KEY next to the column) but that should do for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do a refactoring later, I didn't want to deal with it in this PR. I'm not sure if it's worth changing

Comment on lines 192 to 195
assert fields == pk_fields, (
"Count doesn't support counting composite fields that are not composite"
"primary keys"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should raise a NotSupportedError instead of an AssertionError.

Copy link
Contributor Author

@csirmazbendeguz csirmazbendeguz Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, as far as I understand, based on the discussion #18485 (comment) , we shouldn't do these type of checks in Django. Since the only way to reach this path is via composite primary keys, there's no need to check.

Comment on lines 515 to 517
and field.column is None
or field.generated
or isinstance(field, CompositePrimaryKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already covered by field.column is None? It'd be great to avoid an instance check here as this is on the hot path for model instance creation.

Copy link
Contributor Author

@csirmazbendeguz csirmazbendeguz Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is there's an and.

field.attname not in kwargs
and field.column is None

Maybe we could remove this line?

field.attname not in kwargs

I pushed a commit to remove the line. It doesn't break any tests.

django/db/models/options.py Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/ref/models/fields.txt Outdated Show resolved Hide resolved
docs/ref/models/fields.txt Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Contributor Author

Thanks @charettes ! I have addressed everything, please check my replies 👍

docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
docs/howto/composite-primary-key.txt Outdated Show resolved Hide resolved
)

cols = expr.get_cols()
return Count(cols[0], filter=result.filter)
Copy link
Member

@charettes charettes Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small note that if we add support for composite fields in the future and they rely on ColPairs as well this code will break as there might NULL in there.

This currently only works because all cols of a CompositePrimaryKey which has references to it turned into ColPairs must be NOT NULL so counting one member of the composite is equivalent to counting all of them (remember the SQL COUNT function returns the number of non-NULL values)

To support non-primary key composite fields this would require doing

cols = expr.get_cols()
cols_filter = result.filter
for col in cols[1:]:
   cols_filter &= Q(IsNull(col, False))
return Count(cols[0], filter=composite_filter)

I guess we could also only do it if columns are potentially NULL today which would keep the optimization around when unnecessary

cols = expr.get_cols()
cols_filter = result.filter
for col in cols[1:]:
   if col.output_field.null:
       cols_filter &= Q(IsNull(col, False))
return Count(cols[0], filter=composite_filter)

But there would be no way to test it until non-primary key composite field support lands.

Last interesting note on the subject, recent versions of Postgres elide IS NULL and IS NOT NULL when possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think for composite PKs we should count one field, and for other composite fields in the future do it the way you're proposing. 👍

Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about the Count implementation that is dependent on ColPairs being from an origin when all cols are not col.output_field.null (which is always the case for CompositePrimaryKey). I'm really glad we added that as it will make integration with Pagination and the admin (which relies on count) much easier.

I'll give it a statuary approval given @sarahboyce and you have the documentation under control 🔍 🧙✅.

Congrats on getting this project so far @csirmazbendeguz and looking forward to getting it merged and the follow up tweaks and adjustments that will come from it 🎉 🏅

@csirmazbendeguz
Copy link
Contributor Author

Amazing, thank you @charettes ! 🎉

@sarahboyce
Copy link
Contributor

buildbot, test on oracle.

@sarahboyce
Copy link
Contributor

I pushed a few very small docs tweaks, the diff can be seen here, if anyone doesn't like those feel free to add some comments

Copy link
Contributor

@sarahboyce sarahboyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @csirmazbendeguz 🎉
I will wait a day or two in case anyone else wants to jump in with feedback, I plan to land this on Monday 👍

@csirmazbendeguz
Copy link
Contributor Author

Thank you @sarahboyce ! The docs look good to me 👍

@timgraham
Copy link
Member

I see nothing about whether or not this field is supported in forms.

@csirmazbendeguz
Copy link
Contributor Author

@timgraham, you're right. It supports the model's clean function, but it's not supported otherwise. I haven't given much thought to it, because I don't think there should be a CompositePrimaryKey field on a form. If they want the primary key on the form, they could use the individual fields instead. Let me know what kind of support you had in mind for this? Sorry if I missed something obvious, I work with DRF and rarely use forms...

@timgraham
Copy link
Member

Just an explanation in the documentation. What happens if you try to use such a model in forms, formsets, etc. And be more explicit about what lack of admin support means... that a model that uses the field can't be registered.

@csirmazbendeguz
Copy link
Contributor Author

I added a section to warn users on correct usage with forms 👍

Comment on lines 160 to 161
To avoid issues, make sure :attr:`~django.db.models.Field.editable`
is set to ``False`` on all primary key fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably raise an error if this isn't the case - maybe using the checks framework.

Copy link
Contributor Author

@csirmazbendeguz csirmazbendeguz Nov 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we could add a check to force composite PK fields to be editable=False.

I'm just not sure if we should. This is an issue with regular primary keys as well: https://code.djangoproject.com/ticket/2259 . So if we force this for composite PKs , shouldn't we force it for regular PKs too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a skim through that issue and it seems the sticking point is creation vs updates - it should be possible to set the pk values during create and not after that. I'm not sure if Django has anything to handle that cleanly. If we don't, I'm happy to say this is out of scope (and in scope of 2259 instead).

Thanks Lily Foote and Simon Charette for reviews and mentoring
this Google Summer of Code 2024 project.
@sarahboyce
Copy link
Contributor

@timgraham @smithdc1 I believe your comments have been addressed, thank you for the reviews!
Please check and let us know if you think there should be further changes or if you're happy 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark Apply to have ASV benchmarks run on a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.