21. Migrations

Django models are a description of an underlying data resource made up of Python classes with class variables which represent the data that we can use in our application by instantiating instances of these objects.

However, another way to consider our model classes is as a representation of a database schema and one that we can use to actually build the underlying database. In fact, this is exactly the functionality that many modern web frameworks provide to us

In providing this functionality within the framework code it frees us from interacting with the underlying data resource directly, and in turn takes us away from having to engage with or write much SQL.

In this way, the entire logic of our application can be expressed purely in Python and can be contained wholly within the code base of our web application.

Let’s develop our database on a local computer with some simple built-in database system, such as the default SQLite 3.

Then when we come to deploying production, we can swap out the underlying database system for any other SQL compliant database, such as MySQL, Postgres, Oracle, or any other database system that Django or Python has support for.

We will now show what many web frameworks call migrations, which is short for a schema migration.

A schema migration is an incremental, reversible set of instructions that takes our underlying datastore schema from one state to another. That is, it migrates from state to state.

In short, what we’re going to do is take the classes in our models file and use them as instructions to create migrations, these are other files of code. Then we’ll apply those migrations to the database to actually create or alter the tables.

But the first thing we must do is configure our application to use Postgres so that we can run the migrations.

We can close the models.py file for a while, and open up to bioweb/bioweb/settings.py.

The first thing we must do is include the genedata app in the list of installed apps. The name of the app in this section is always ‘names.apps.NameConfig’, like this:

INSTALLED_APPS = [
    'genedata.apps.GenedataConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

The other lines, starting with django, were already present by default.

Scroll down the settings and update the database settings. Originally we have:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',
    }
}

We need to update this as follows:

DATABASES = {
‘default’: {
‘ENGINE’: ‘django.db.backends.postgresql_psycopg2’,
‘NAME’: ‘bioweb_db’,
‘USER’: ‘coder’,
‘PASSWORD’: ‘coder’,
‘HOST’: ‘localhost’,
‘PORT’: ‘5432’,
}
}

We do need to go back now and make some changes to the virtual environment. Go back to CMD and to the desktop/bioweb directory (where we did pip install django in part 19).

(env) $ pip install psycopg2

Open up SQL Shell (psql) which we installed last time.

Use all defaults including your username that you included above, and enter password you chose previously.

Now in psql create the database that we promised would exist.

$ CREATE DATABASE bioweb_db;

If you type \l you should see it there. We also need to create a user with this password:

CREATE USER coder WITH PASSWORD 'coder';

We can now quit psql by typing \q.

Go back to CMD and navigate to bioweb/bioweb – the directory with manage.py in it.

We can now start to look at migrations.

(env) $ python manage.py showmigrations

If you have followed the coding so far this comes up with a useful error.

NameError: name 'Sequencing' is not defined

This is because in our models.py file, in the first class Gene, we have referenced the Sequencing and EC classes before we have defined them. We can move these two to the top of the list, before the Gene class.

Once we run showmigrations again we may see something like this:

admin
 [ ] 0001_initial
 [ ] 0002_logentry_remove_auto_add
 [ ] 0003_logentry_add_action_flag_choices
auth
 [ ] 0001_initial
 [ ] 0002_alter_permission_name_max_length
 [ ] 0003_alter_user_email_max_length
 [ ] 0004_alter_user_username_opts
 [ ] 0005_alter_user_last_login_null
 [ ] 0006_require_contenttypes_0002
 [ ] 0007_alter_validators_add_error_messages
 [ ] 0008_alter_user_username_max_length
 [ ] 0009_alter_user_last_name_max_length
 [ ] 0010_alter_group_name_max_length
 [ ] 0011_update_proxy_permissions
 [ ] 0012_alter_user_first_name_max_length
contenttypes
 [ ] 0001_initial
 [ ] 0002_remove_content_type_name
genedata
 (no migrations)
sessions
 [ ] 0001_initial

Here we see a list of possible upcoming database changes and those which have been applied to the database.

You’ll notice we have our genedata app listed and showing no migrations to run, and we have various other migrations that are upcoming.

These refer to some functionality that’s built-in to Django. We don’t appear to have any pending migrations for genedata, so we need to make those, like this:

(env) $ python manage.py makemigrations

This gives you a little bit of logging information that tells you what this migration is going to do. It’s going to create our various models, and it’s going to add various fields. We will see something like this returned:

Migrations for 'genedata':
  genedata\migrations\0001_initial.py
    - Create model Attribute
    - Create model EC
    - Create model Gene
    - Create model Sequencing
    - Create model Product
    - Create model GeneAttributeLink
    - Add field sequencing to gene
    - Add field gene to attribute

Now if we run showmigrations again you will see the first initial migration for genedata listed.

admin
 [ ] 0001_initial
 [ ] 0002_logentry_remove_auto_add
 [ ] 0003_logentry_add_action_flag_choices
auth
 [ ] 0001_initial
 [ ] 0002_alter_permission_name_max_length
 [ ] 0003_alter_user_email_max_length
 [ ] 0004_alter_user_username_opts
 [ ] 0005_alter_user_last_login_null
 [ ] 0006_require_contenttypes_0002
 [ ] 0007_alter_validators_add_error_messages
 [ ] 0008_alter_user_username_max_length
 [ ] 0009_alter_user_last_name_max_length
 [ ] 0010_alter_group_name_max_length
 [ ] 0011_update_proxy_permissions
 [ ] 0012_alter_user_first_name_max_length
contenttypes
 [ ] 0001_initial
 [ ] 0002_remove_content_type_name
genedata
 [ ] 0001_initial
sessions
 [ ] 0001_initial

The makemigrations command created a new file called 0001_initial.py. You can actually go to genedata/migrations in VSC and see the file, which is a python file with classes showing various operations that it’s going to perform.

If we are curious we can run a command to see what SQL changes are going to be made:

(env) $ python manage.py sqlmigrate genedata 0001_initial

This will return a lot of information:

BEGIN;
--
-- Create model Attribute
--
CREATE TABLE "genedata_attribute" ("id" bigserial NOT NULL PRIMARY KEY, "key" varchar(256) NOT NULL, "value" varchar(256) NOT NULL);
--
-- Create model EC
--
CREATE TABLE "genedata_ec" ("id" bigserial NOT NULL PRIMARY KEY, "ec_name" varchar(256) NOT NULL);
--
-- Create model Gene
--
CREATE TABLE "genedata_gene" ("id" bigserial NOT NULL PRIMARY KEY, "gene_id" varchar(256) NOT NULL, "entity" varchar(256) NOT NULL, "start" integer NOT NULL, "stop" integer NOT NULL, "sense" varchar(1) NOT NULL, "start_codon" varchar(1) NOT NULL, "ec_id" bigint NOT NULL);
--
-- Create model Sequencing
--
CREATE TABLE "genedata_sequencing" ("id" bigserial NOT NULL PRIMARY KEY, "sequencing_factory" varchar(256) NOT NULL, "factory_location" varchar(256) NOT NULL);
--
-- Create model Product
--
CREATE TABLE "genedata_product" ("id" bigserial NOT NULL PRIMARY KEY, "type" varchar(256) NOT NULL, "product" varchar(256) NOT NULL, "gene_id" bigint NOT NULL);
--
-- Create model GeneAttributeLink
--
CREATE TABLE "genedata_geneattributelink" ("id" bigserial NOT NULL PRIMARY KEY, "attribute_id" bigint NOT NULL, "gene_id" bigint NOT NULL);
--
-- Add field sequencing to gene
--
ALTER TABLE "genedata_gene" ADD COLUMN "sequencing_id" bigint NOT NULL CONSTRAINT "genedata_gene_sequencing_id_a1edf45e_fk_genedata_sequencing_id" REFERENCES "genedata_sequencing"("id") DEFERRABLE INITIALLY DEFERRED; SET CONSTRAINTS "genedata_gene_sequencing_id_a1edf45e_fk_genedata_sequencing_id" IMMEDIATE;
--
-- Add field gene to attribute
--
ALTER TABLE "genedata_gene" ADD CONSTRAINT "genedata_gene_ec_id_c7d59001_fk_genedata_ec_id" FOREIGN KEY ("ec_id") REFERENCES "genedata_ec" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "genedata_gene_ec_id_c7d59001" ON "genedata_gene" ("ec_id");
ALTER TABLE "genedata_product" ADD CONSTRAINT "genedata_product_gene_id_739ec280_fk_genedata_gene_id" FOREIGN KEY ("gene_id") REFERENCES "genedata_gene" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "genedata_product_gene_id_739ec280" ON "genedata_product" ("gene_id");
ALTER TABLE "genedata_geneattributelink" ADD CONSTRAINT "genedata_geneattribu_attribute_id_0ae8a52a_fk_genedata_" FOREIGN KEY ("attribute_id") REFERENCES "genedata_attribute" ("id") DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "genedata_geneattributelink" ADD CONSTRAINT "genedata_geneattributelink_gene_id_ed1a4517_fk_genedata_gene_id" FOREIGN KEY ("gene_id") REFERENCES "genedata_gene" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "genedata_geneattributelink_attribute_id_0ae8a52a" ON "genedata_geneattributelink" ("attribute_id");
CREATE INDEX "genedata_geneattributelink_gene_id_ed1a4517" ON "genedata_geneattributelink" ("gene_id");
CREATE INDEX "genedata_gene_sequencing_id_a1edf45e" ON "genedata_gene" ("sequencing_id");
COMMIT;

If we’re happy with all of this, we can now go ahead and make this database, adding all the tables. To do this we do:

(env) $ python manage.py migrate

We will get something like this returned:

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, genedata, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying auth.0010_alter_group_name_max_length... OK
  Applying auth.0011_update_proxy_permissions... OK
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying genedata.0001_initial... OK
  Applying sessions.0001_initial... OK

We can double check that everything expected has happened. Go back to psql and do this:

\c bioweb_db;

You should see:

You are now connected to database "bioweb_db" as user "postgres".

And then type this to describe the database:

\d

There’s lots more tables here than the six we have defined. There’s support here for administration interfaces and application users, which we may look at later.

What about future migrations?

Say we need to edit our models later – perhaps we want to add indices to certain columns to facilitate faster searches, etc.

As an example we are going to add an index to the gene_id field in the Gene class (gene table).

We can go to models.py and make this modification in class Gene:

    gene_id = models.CharField(max_length=256, null=False, blank=False, db_index=True)

We’ve just added db_index=True to the end here. Save the file, and go back to CMD. Make a new migration:

(env) $ python manage.py makemigrations

This will return:

Migrations for 'genedata':
  genedata\migrations\0002_alter_gene_gene_id.py
    - Alter field gene_id on gene

So it just details that we want to alter gene_id in the gene table.

We can now showmigrations as before and we will see this. Now we can see [X] to indicate migrations that have been run, with our new migration also showing as 0002.

admin
 [X] 0001_initial
 [X] 0002_logentry_remove_auto_add
 [X] 0003_logentry_add_action_flag_choices
auth
 [X] 0001_initial
 [X] 0002_alter_permission_name_max_length
 [X] 0003_alter_user_email_max_length
 [X] 0004_alter_user_username_opts
 [X] 0005_alter_user_last_login_null
 [X] 0006_require_contenttypes_0002
 [X] 0007_alter_validators_add_error_messages
 [X] 0008_alter_user_username_max_length
 [X] 0009_alter_user_last_name_max_length
 [X] 0010_alter_group_name_max_length
 [X] 0011_update_proxy_permissions
 [X] 0012_alter_user_first_name_max_length
contenttypes
 [X] 0001_initial
 [X] 0002_remove_content_type_name
genedata
 [X] 0001_initial
 [ ] 0002_alter_gene_gene_id
sessions
 [X] 0001_initial

We can now run the migration again by doing python manage.py migrate and we will get this:

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, genedata, sessions
Running migrations:
  Applying genedata.0002_alter_gene_gene_id... OK

Finally, psql stores a table of Django migrations. You can go to psql and look at this.

$ SELECT * FROM django_migrations WHERE app='genedata';

returns:

 id |   app    |          name           |            applied
----+----------+-------------------------+-------------------------------
 18 | genedata | 0001_initial            | 2021-11-03 10:08:19.941164+00
 20 | genedata | 0002_alter_gene_gene_id | 2021-11-03 10:18:06.67375+00
(2 rows)

Further useful reading: https://docs.djangoproject.com/en/3.0/topics/migrations/
Includes information on how migrations could be reversed.

Wednesday 3 November 2021, 31 views


Leave a Reply

Your email address will not be published. Required fields are marked *