Assume I have these Django models:
class Book(models.Model):
    title = models.CharField(max_length=100)

class Author(models.Model):
    name = models.CharField(max_length=100)
    books = models.ManyToManyField(Book)
I already have a production system with several objects and several Author <-> Book connections.
Now I want to switch to:
class Book(models.Model):
    title = models.CharField(max_length=100)

class BookAuthor(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE)
    author = models.ForeignKey("Author", on_delete=models.CASCADE)
    impact = models.IntegerField(default=1)

    class Meta:
        unique_together = ("book", "author")

class Author(models.Model):
    name = models.CharField(max_length=100)
    books = models.ManyToManyField(Book, through=BookAuthor)
If I write this migration:
from django.db import migrations, models


def migrate_author_books(apps, schema_editor):
    Author = apps.get_model('yourappname', 'Author')
    BookAuthor = apps.get_model('yourappname', 'BookAuthor')
    for author in Author.objects.all():
        for book in author.books.all():
            # Create a BookAuthor entry with default impact=1
            BookAuthor.objects.create(author=author, book=book, impact=1)


class Migration(migrations.Migration):
    dependencies = [
        ('yourappname', 'previous_migration_file'),
    ]
    operations = [
        migrations.CreateModel(name="BookAuthor", ...),
        migrations.RunPython(migrate_author_books),
        migrations.RemoveField(model_name="author", name="books"),
        migrations.AddField(model_name="author", name="books", field=models.ManyToManyField(...)),
    ]
then the loop for book in author.books.all() will access the new (and empty) BookAuthor table instead of iterating over the existing default table Django created.
How can I make the data migration?
The only way I see is to have two releases:
Add the BookAuthor model and fill it with data, but keep the existing relation. That means introducing a new field while keeping the old one, and changing every single place where author.books is used to author.books_new.
Isn't there a simpler way?
You don't actually need a data migration at all to add a through table to a many-to-many relation.
When you create a many-to-many relation without a through, Django creates a virtual model for you behind the scenes:
>>> from xxxx.models import Author
>>> Author.books.through
<class 'xxxx.models.Author_books'>
>>> Author.books.through._meta.db_table
'xxxx_author_books'
>>> Author.books.through._meta.get_fields()
(<django.db.models.fields.BigAutoField: id>, <django.db.models.fields.related.ForeignKey: author>, <django.db.models.fields.related.ForeignKey: book>)
>>> Author.books.through._meta.unique_together
(('author', 'book'),)
Armed with this knowledge, you can create the same table as a real model (nb: I didn't check the exact fields from the virtual through table – you might want more diligence here!).
The important bit, however, is that you will need to set db_table manually to what the virtual through table's name is.
class BookAuthor(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE)
    author = models.ForeignKey("Author", on_delete=models.CASCADE)

    class Meta:
        unique_together = ("book", "author")
        db_table = "xxxx_author_books"
You'll also need to set the through on the ManyToManyField at this point.
If you create a migration out of this, you will get a CreateModel, but trying to run that migration will understandably fail: you already have a "xxxx_author_books" table.
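The failure is nothing Django-specific; the database simply refuses to create a table whose name is taken. A minimal sketch with the standard-library sqlite3 module, assuming a table layout that mirrors the fields listed in the shell session above (the DDL here is hand-written for illustration, not what Django emits):

```python
import sqlite3

# Hand-written stand-in for the join table Django created implicitly.
DDL = """
CREATE TABLE "xxxx_author_books" (
    "id" integer PRIMARY KEY AUTOINCREMENT,
    "author_id" bigint NOT NULL,
    "book_id" bigint NOT NULL,
    UNIQUE ("author_id", "book_id")
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)  # the table that already exists in production

try:
    conn.execute(DDL)  # what an unmodified CreateModel would attempt
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(error)
```

The second CREATE TABLE raises an OperationalError saying the table already exists, which is the same kind of error the unmodified migration would run into.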
You'll need to modify the migration so that its operations are wrapped in a SeparateDatabaseAndState and the physical database is not touched. After all, it doesn't need to be touched, since all we did was write out the same table that had already been created implicitly:
migrations.SeparateDatabaseAndState(
    database_operations=[],
    state_operations=[
        migrations.CreateModel(...),
        migrations.AlterField(...),
        ...
    ],
)
Migrating this should go through without a hitch (and without touching the database).
You now have a bona fide through table that you can add the impact field to and migrate as usual.
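Adding impact with default=1 also covers what the original data migration was trying to do: the existing rows pick up the default when the column is added. The sketch below illustrates this at the database level with sqlite3; the table layout and the ALTER TABLE statement are hand-written assumptions, and the SQL Django actually runs for the AddField depends on the backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the existing join table with a few production rows.
conn.execute(
    'CREATE TABLE "xxxx_author_books" ('
    '"id" integer PRIMARY KEY AUTOINCREMENT, '
    '"author_id" bigint NOT NULL, '
    '"book_id" bigint NOT NULL, '
    'UNIQUE ("author_id", "book_id"))'
)
conn.executemany(
    'INSERT INTO "xxxx_author_books" ("author_id", "book_id") VALUES (?, ?)',
    [(1, 10), (1, 11), (2, 10)],
)

# Roughly what adding impact = IntegerField(default=1) amounts to.
conn.execute(
    'ALTER TABLE "xxxx_author_books" '
    'ADD COLUMN "impact" integer NOT NULL DEFAULT 1'
)

rows = conn.execute(
    'SELECT "author_id", "book_id", "impact" '
    'FROM "xxxx_author_books" ORDER BY "id"'
).fetchall()
print(rows)
```

Every pre-existing Author <-> Book link is still there and now carries impact 1, so no RunPython step is needed.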
EDIT: I just noticed that this operation is described in the manual, too.