Django ORM Fundamentals
What is an ORM and Why Django Uses It
Imagine you have a database with tables, rows, and columns. Normally, to get data you'd write raw SQL like SELECT * FROM blog_post WHERE author_id = 5. That's powerful, but tedious, error-prone, and ties your code to a specific database engine.
An ORM (Object-Relational Mapper) is a layer that lets you interact with your database using Python objects instead of raw SQL. Django's ORM translates your Python code into the correct SQL automatically.
import sqlite3 conn = sqlite3.connect('db.sqlite3') cursor = conn.cursor() cursor.execute("SELECT * FROM posts WHERE active=1") rows = cursor.fetchall() # Now parse tuples manually...
posts = Post.objects.filter(active=True) # Returns Python objects โ clean!
Models โ Your Database Tables
A Model is a Python class that maps to a database table. Each attribute of the class maps to a column. Let's build a Blog app from scratch.
from django.db import models from django.contrib.auth.models import User class Post(models.Model): # โโ Fields โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ title = models.CharField(max_length=200) slug = models.SlugField(unique=True) body = models.TextField() author = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts') view_count = models.PositiveIntegerField(default=0) is_active = models.BooleanField(default=True) created_at = models.DateTimeField(auto_now_add=True) updated_at = models.DateTimeField(auto_now=True) class Meta: ordering = ['-created_at'] # newest first verbose_name = 'Post' verbose_name_plural = 'Posts' indexes = [ models.Index(fields=['slug']), models.Index(fields=['-created_at']), ] def __str__(self): return self.title
All Important Field Types
| Field | Use For | Key Options | Example |
|---|---|---|---|
CharField | Short text (names, titles) | max_length (required) | CharField(max_length=100) |
TextField | Long text (articles, descriptions) | โ | TextField(blank=True) |
IntegerField | Whole numbers | default | IntegerField(default=0) |
FloatField / DecimalField | Decimal numbers | max_digits, decimal_places | DecimalField(max_digits=10, decimal_places=2) |
BooleanField | True/False flags | default | BooleanField(default=True) |
DateField | Date only (no time) | auto_now, auto_now_add | DateField(auto_now_add(usecase: created_at)=True) |
DateTimeField | Date + time | auto_now, auto_now_add | DateTimeField(auto_now(usecase: updated_at)=True) |
SlugField | URL-friendly strings | unique, max_length | SlugField(unique=True) |
EmailField | Email addresses | max_length | EmailField(unique=True) |
URLField | URLs/links | max_length | URLField(blank=True) |
ImageField | Image uploads | upload_to | ImageField(upload_to='avatars/') |
ForeignKey | Many-to-one relation | on_delete, related_name | ForeignKey(User, on_delete=models.CASCADE) |
ManyToManyField | Many-to-many relation | through, related_name | ManyToManyField('Tag') |
OneToOneField | One-to-one relation | on_delete, related_name | OneToOneField(User, on_delete=models.CASCADE) |
JSONField | JSON data | default=dict | JSONField(default=dict) |
UUIDField | UUID identifiers | default=uuid.uuid4 | UUIDField(default=uuid.uuid4) |
Common Field Options
| Option | What it does | Default |
|---|---|---|
null=True | Allows NULL in the database column | False |
blank=True | Allows empty string in form validation | False |
default=... | Default value if none provided | โ |
unique=True | Database-level uniqueness constraint | False |
db_index=True | Creates a database index for fast lookups | False |
choices=... | Restricts field to given values | โ |
verbose_name | Human-readable field name | โ |
editable=False | Exclude from forms/admin | True |
null=True is a database concern โ it allows NULL in the DB column. blank=True is a validation concern โ it allows empty values in Django forms. For CharField/TextField, avoid null=True (use blank=True only); for non-string fields like IntegerField, use null=True if the field is optional.
Field Options โ Your Questions Answered
โ When do we use editable=False?
editable=False tells Django to exclude this field from all ModelForms and the Django Admin. The field still exists in the database โ it just cannot be edited by a user through a form or admin panel.
Real-world use cases:
- Auto-set timestamps you never want a user to change
- Fields computed or set by code (e.g., a slug auto-generated from a title)
- Internal system fields (e.g., a UUID primary key, a hash, a status set by logic)
class Order(models.Model): order_number = models.UUIDField( default=uuid.uuid4, editable=False # โ never shown in forms; set automatically ) created_at = models.DateTimeField( auto_now_add=True, editable=False # โ auto_now_add implies editable=False already ) checksum = models.CharField(max_length=64, editable=False) # checksum is set by your save() logic, never by the user # โ This field DOES exist in the DB and can be set in code: order = Order() order.checksum = compute_hash(order) order.save() # โ But it will NOT appear in ModelForm or Django Admin forms
auto_now=True and auto_now_add=True automatically set editable=False internally โ you don't need to add it yourself for those fields. It's most useful for fields you compute or assign in code.
โก How is verbose_name used in Forms and Django Admin?
verbose_name is the human-readable label Django shows for a field in forms and the admin panel. Without it, Django auto-generates a label from the field name by replacing underscores with spaces.
class UserProfile(models.Model): dob = models.DateField( verbose_name='Date of Birth' # Admin/form shows "Date of Birth" ) # Without it: shows "Dob" (ugly!) phone_no = models.CharField( max_length=15, verbose_name='Phone Number' ) class Meta: verbose_name = 'User Profile' # singular (admin list heading) verbose_name_plural = 'User Profiles' # plural (admin sidebar)
verbose_name is purely a display/UI concept. ORM queries always use the actual Python field name (e.g., filter(dob__gte=...)), never the verbose name. It has zero effect on SQL or query behavior.
| Where verbose_name appears | Where it does NOT appear |
|---|---|
| Django Admin field labels | ORM queries (filter, order_by, etc.) |
| ModelForm field labels | Database column names |
| Error messages from form validation | Python attribute access |
| Admin sidebar (verbose_name_plural) | Migration files |
โข How does the choices option work and what table does it create?
choices restricts a field to a fixed set of values. The key insight: Django does NOT create a separate table for choices. It stores the raw value (the first item in each tuple) directly in the same column. The choices are enforced only at the form/validation level โ not in the database.
class Order(models.Model): # โโ Define choices as class-level constants (best practice) โโ class Status(models.TextChoices): PENDING = 'pending', 'Pending' PAID = 'paid', 'Paid' SHIPPED = 'shipped', 'Shipped' CANCELLED = 'cancelled', 'Cancelled' status = models.CharField( max_length=20, choices=Status.choices, default=Status.PENDING, ) # โโ What gets stored in the DB? โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Only the RAW VALUE ('pending', 'paid', etc.) is stored. # No extra table is created. Just a VARCHAR column. # โโ How to use in code โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ order = Order.objects.get(pk=1) print(order.status) # 'pending' โ raw stored value print(order.get_status_display()) # 'Pending' โ human-readable label # Filter using the raw value Order.objects.filter(status=Order.Status.PAID) Order.objects.filter(status='paid') # same thing # โโ Old-style tuple syntax (still works) โโโโโโโโโโโโโโโโโโโโโ STATUS_CHOICES = [ ('P', 'Pending'), ('A', 'Approved'), ('R', 'Rejected'), ] status = models.CharField(max_length=1, choices=STATUS_CHOICES)
models.TextChoices (Django 3.0+) โ it gives you autocomplete, prevents typos, and auto-generates the choices list. Use models.IntegerChoices when storing integers. Old-style tuple lists still work but are harder to maintain.
choices when the options are fixed and defined in code (e.g., order statuses, priority levels). Use a separate model + ForeignKey when options need to be created/edited at runtime by users (e.g., tags, categories, countries). Choices live in your code; FK tables live in your database.
โฃ What exactly does on_delete=models.CASCADE do?
When you define a ForeignKey, you must tell Django: "What should happen to the child row when the parent row is deleted?" on_delete is your answer. CASCADE means delete the child too.
class Post(models.Model): # CASCADE: delete Post when its Author is deleted # โ A post cannot exist without an author. Makes sense. author = models.ForeignKey(User, on_delete=models.CASCADE) # SET_NULL: keep Post but clear its category when category is deleted # โ A post can exist without a category. Requires null=True. category = models.ForeignKey(Category, on_delete=models.SET_NULL, null=True, blank=True) # PROTECT: prevent deleting a Tag if any Post still uses it # โ Raises ProtectedError โ forces you to reassign first. primary_tag = models.ForeignKey(Tag, on_delete=models.PROTECT, null=True) # SET_DEFAULT: set to a default value when parent deleted editor = models.ForeignKey(User, on_delete=models.SET_DEFAULT, default=1, related_name='edited_posts') # SET(): call a function to get the value to set reviewer = models.ForeignKey(User, on_delete=models.SET(get_sentinel_user), related_name='reviewed_posts', null=True) # โโ What CASCADE looks like in practice โโโโโโโโโโโโโโโโโโโโโโโโโโ # You have: User(id=5) โ [Post(id=10), Post(id=11), Post(id=12)] User.objects.filter(pk=5).delete() # SQL: DELETE FROM auth_user WHERE id=5 # Django then automatically runs: # SQL: DELETE FROM blog_post WHERE author_id=5 # All 3 posts are gone too โ cascaded!
on_delete=CASCADE, then deleting a User deletes their Posts, which deletes all Comments on those posts. Always trace the full cascade chain before using it. Use PROTECT when you want to be safe and explicit.
| on_delete option | What happens to child row | Requires | Best for |
|---|---|---|---|
CASCADE | Child is deleted too | โ | Child can't exist without parent |
SET_NULL | FK column set to NULL | null=True | Child can exist alone (optional parent) |
SET_DEFAULT | FK set to field's default | default=... | A meaningful fallback parent exists |
PROTECT | Raises ProtectedError | โ | Must not delete parent while children exist |
RESTRICT | Like PROTECT but smarter in transactions | โ | Complex deletion logic |
SET(func) | FK set to return value of function | callable | Dynamic fallback (e.g., "anonymous" user) |
DO_NOTHING | Nothing โ may cause DB integrity error | โ | When you manage integrity yourself (rare) |
Meta Options
The inner Meta class controls model-level behavior like default ordering, table name, and constraints.
class Product(models.Model): name = models.CharField(max_length=200) sku = models.CharField(max_length=50) price = models.DecimalField(max_digits=10, decimal_places=2) store = models.ForeignKey('Store', on_delete=models.CASCADE) class Meta: db_table = 'shop_product' # custom table name(Instead of 'appname_product', this table will be named 'shop_product') ordering = ['name'] # default alphabetical verbose_name = 'Product' verbose_name_plural = 'Products' # Composite unique constraint constraints = [ models.UniqueConstraint( fields=['sku', 'store'], name='unique_sku_per_store' ) ] # Database indexes indexes = [ models.Index(fields=['name', 'price']), # composite models.Index(fields=['-price']), # descending ]
Migrations
Migrations are Django's way of propagating changes you make to your models into your database schema. Think of them as version control for your database.
# 1. Detect changes in models and create migration files python manage.py makemigrations # 2. Apply migrations to the database python manage.py migrate # 3. See what SQL will be run (inspect before applying) python manage.py sqlmigrate blog 0001 # 4. Show migration status python manage.py showmigrations # 5. Rollback a migration (go back to previous) python manage.py migrate blog 0002 # 6. Fake a migration (mark as applied without running) python manage.py migrate blog 0001 --fake # 7. Create an empty migration (for data migrations) python manage.py makemigrations blog --empty --name=seed_categories
CRUD โ Create, Read, Update, Delete
CREATE
from blog.models import Post from django.contrib.auth.models import User author = User.objects.get(username='alice') # Method 1: save() โ Two-step, lets you modify before saving post = Post(title='My First Post', slug='my-first-post', body='Hello world!', author=author) post.save() # Method 2: create() โ One step, saves immediately post = Post.objects.create( title='My Second Post', slug='my-second-post', body='Content here', author=author ) # Method 3: get_or_create() โ Avoid duplicates post, created = Post.objects.get_or_create( slug='my-first-post', defaults={'title': 'My First Post', 'body': '...', 'author': author} ) # created=True if new, False if it already existed # Method 4: update_or_create() post, created = Post.objects.update_or_create( slug='my-first-post', defaults={'title': 'Updated Title', 'is_active': True} )
READ
# Get ALL records (returns a QuerySet) posts = Post.objects.all() # Get ONE record โ raises DoesNotExist or MultipleObjectsReturned post = Post.objects.get(id=1) post = Post.objects.get(slug='my-first-post') # Safe get โ returns None if not found (no exception) try: post = Post.objects.get(pk=999) except Post.DoesNotExist: post = None # filter() โ returns QuerySet (can be empty) active_posts = Post.objects.filter(is_active=True) my_posts = Post.objects.filter(author=author, is_active=True) # exclude() โ opposite of filter not_mine = Post.objects.exclude(author=author) # Ordering newest_first = Post.objects.order_by('-created_at') by_title = Post.objects.order_by('title') # Slicing (LIMIT/OFFSET in SQL) first_five = Post.objects.all()[:5] page_two = Post.objects.all()[5:10] # first() and last() latest = Post.objects.order_by('-created_at').first() oldest = Post.objects.order_by('created_at').first()
UPDATE
# Method 1: Save on instance โ Good for one object, triggers signals post = Post.objects.get(pk=1) post.title = 'Updated Title' post.save() # Method 2: update() on QuerySet โ Fast bulk update, skips signals Post.objects.filter(is_active=False).update(is_active=True) # โ ๏ธ update() does NOT trigger post_save signals! # update() does NOT call Model.save() โ it goes directly to SQL # Update specific fields only (more efficient) post.title = 'New Title' post.save(update_fields=['title']) # Only UPDATEs the title column
DELETE
# Delete a single instance post = Post.objects.get(pk=1) post.delete() # Bulk delete via QuerySet Post.objects.filter(is_active=False).delete() # Returns: (number_deleted, {model: count_dict}) # Delete all (be careful!) Post.objects.all().delete()
QuerySets & Lazy Evaluation
A QuerySet is a collection of database objects. The crucial thing to understand: QuerySets are lazy โ they don't hit the database until you actually need the data.
# โ NO database query happens here yet! posts = Post.objects.filter(is_active=True) # โ Still no query โ chaining is free posts = posts.order_by('-created_at') # โ DATABASE HIT happens here (evaluation) for post in posts: # iteration evaluates print(post.title) # Other things that evaluate a QuerySet: list(posts) # convert to list posts[0] # indexing len(posts) # length (use .count() instead!) bool(posts) # boolean check (use .exists() instead!)
Field Lookups (The Double Underscore __)
Field lookups are how you specify the SQL WHERE clause. They use the __ (double underscore) syntax.
| Lookup | SQL Equivalent | Example |
|---|---|---|
__exact | = 'value' | filter(title__exact='Hello') |
__iexact | = 'value' (case insensitive) | filter(title__iexact='hello') |
__contains | LIKE '%value%' | filter(title__contains='Django') |
__icontains | LIKE '%value%' (case insensitive) | filter(title__icontains='django') |
__startswith | LIKE 'value%' | filter(title__startswith='My') |
__endswith | LIKE '%value' | filter(title__endswith='ORM') |
__gt | > value | filter(view_count__gt=100) |
__gte | >= value | filter(view_count__gte=100) |
__lt | < value | filter(view_count__lt=10) |
__lte | <= value | filter(view_count__lte=10) |
__in | IN (v1, v2, ...) | filter(id__in=[1,2,3]) |
__isnull | IS NULL / IS NOT NULL | filter(deleted_at__isnull=True) |
__range | BETWEEN a AND b | filter(price__range=(10, 100)) |
__date | extract date part | filter(created_at__date=today) |
__year / __month | extract year/month | filter(created_at__year=2024) |
from datetime import date # Search titles containing 'django' (case-insensitive) posts = Post.objects.filter(title__icontains='django') # Posts with more than 1000 views popular = Post.objects.filter(view_count__gt=1000) # Posts published in 2024 posts_2024 = Post.objects.filter(created_at__year=2024) # Posts from specific IDs specific = Post.objects.filter(id__in=[1, 5, 10]) # Posts where body contains Django AND views > 100 results = Post.objects.filter( body__icontains='django', view_count__gt=100 ) # Multiple args = AND condition
null=True and blank=True? When would you use each?Product model and want to find all products whose name contains "laptop" (case-insensitive) and price is between $500 and $2000. Write the filter query.Post.objects.get(pk=1) and Post.objects.filter(pk=1)? When would each fail?post.save(update_fields=['title']) over plain post.save() when you only changed the title?Relationships in Django ORM
OneToOneField
A one-to-one relationship means one record in Table A corresponds to exactly one record in Table B. Classic example: a User has one Profile.
class Profile(models.Model): user = models.OneToOneField( User, on_delete=models.CASCADE, related_name='profile' # user.profile ) bio = models.TextField(blank=True) avatar = models.ImageField(upload_to='avatars/', blank=True) website = models.URLField(blank=True) location = models.CharField(max_length=100, blank=True) # โโ Queries โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Forward access (Profile โ User) profile = Profile.objects.get(pk=1) user = profile.user # access the related user # Reverse access (User โ Profile) user = User.objects.get(username='alice') profile = user.profile # because related_name='profile' # Without related_name it would be: user.profile_set โ but OneToOne # creates a single object accessor, not a manager
ForeignKey (One-to-Many)
One author can write many posts, but each post has only one author. This is a one-to-many relationship, modeled with ForeignKey.
class Category(models.Model): name = models.CharField(max_length=100) slug = models.SlugField(unique=True) def __str__(self): return self.name class Post(models.Model): title = models.CharField(max_length=200) author = models.ForeignKey( User, on_delete=models.CASCADE, related_name='posts' # user.posts.all() ) category = models.ForeignKey( Category, on_delete=models.SET_NULL, null=True, blank=True, related_name='posts' # category.posts.all() ) # โโ Queries โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Forward: Post โ Author (simple attribute access) post = Post.objects.get(pk=1) print(post.author.username) # โ ๏ธ This does a 2nd query! print(post.author_id) # โ No extra query โ stored as FK column # Reverse: Author โ Posts (returns a Manager) alice = User.objects.get(username='alice') alices_posts = alice.posts.all() # QuerySet of all Alice's posts alices_posts = alice.posts.filter(is_active=True) # Category โ Posts tech = Category.objects.get(slug='tech') tech_posts = tech.posts.order_by('-created_at')
on_delete Options โ Explained
| Option | What happens when parent is deleted | Use When |
|---|---|---|
CASCADE | Delete all related child objects too | Children can't exist without parent (posts without author) |
SET_NULL | Set FK column to NULL (requires null=True) | Children can exist without parent (post without category) |
SET_DEFAULT | Set FK to its default value | You have a meaningful default |
PROTECT | Raise ProtectedError โ prevent deletion | Parent must not be deleted if children exist |
RESTRICT | Like PROTECT but allows if related objects also deleted in same transaction | Complex deletion scenarios |
DO_NOTHING | Do nothing โ can cause integrity errors! | Rarely โ when you manage integrity yourself |
ManyToManyField
A post can have many tags, and a tag can belong to many posts. That's a many-to-many relationship. Django creates a junction table automatically.
class Tag(models.Model): name = models.CharField(max_length=50, unique=True) slug = models.SlugField(unique=True) def __str__(self): return self.name class Post(models.Model): title = models.CharField(max_length=200) tags = models.ManyToManyField(Tag, related_name='posts', blank=True) # โโ Many-to-Many Queries โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # ADD tags to a post post = Post.objects.get(pk=1) django_tag = Tag.objects.get(slug='django') python_tag = Tag.objects.get(slug='python') post.tags.add(django_tag) post.tags.add(django_tag, python_tag) # add multiple post.tags.set([django_tag, python_tag]) # replace all # REMOVE tags post.tags.remove(python_tag) post.tags.clear() # remove all # QUERY: all tags of a post post.tags.all() # REVERSE: all posts with a tag tag = Tag.objects.get(slug='django') tag.posts.all() # uses related_name # Filter posts that have a specific tag Post.objects.filter(tags__slug='django')
Through Model (Custom M2M Table)
Use a through model when you need extra data on the relationship itself (e.g., when a student enrolled in a course, or their grade).
class Student(models.Model): user = models.OneToOneField(User, on_delete=models.CASCADE) class Course(models.Model): title = models.CharField(max_length=200) students = models.ManyToManyField( Student, through='Enrollment', related_name='courses' ) class Enrollment(models.Model): """The through/junction model with extra data.""" student = models.ForeignKey(Student, on_delete=models.CASCADE) course = models.ForeignKey(Course, on_delete=models.CASCADE) enrolled_at = models.DateTimeField(auto_now_add=True) grade = models.CharField(max_length=2, blank=True) # A, B+, etc. completed = models.BooleanField(default=False) class Meta: unique_together = ['student', 'course'] # one enrollment per combo # โโ Queries with through model โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Create enrollment (must use through model directly) Enrollment.objects.create(student=student, course=course, grade='A') # Get all students in a course (through course.students) course.students.all() # Get enrollment details enrollment = Enrollment.objects.get(student=student, course=course) print(enrollment.grade) # All courses a student completed student.courses.filter(enrollment__completed=True)
related_name โ Why It Matters
related_name defines the name of the reverse relation from the related model back to this one. Without it, Django auto-generates modelname_set. With it, you control the name.
# WITHOUT related_name โ Django auto-creates: user.post_set class Post(models.Model): author = models.ForeignKey(User, on_delete=models.CASCADE) user.post_set.all() # works but ugly # WITH related_name โ you control the accessor name class Post(models.Model): author = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts') user.posts.all() # clean and readable โ # โโ When to use '+' as related_name โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Use '+' to DISABLE the reverse relation (saves memory, avoids clashes) class Post(models.Model): created_by = models.ForeignKey(User, on_delete=models.CASCADE, related_name='+') # Now user.post_set does NOT exist โ accessing it raises AttributeError # Use '+' when you'll NEVER need reverse access from that field # โโ Multiple FKs to same model โ need related_name! โโโโโโโโโโโโโโ class Transfer(models.Model): from_account = models.ForeignKey('Account', on_delete=models.CASCADE, related_name='outgoing_transfers') to_account = models.ForeignKey('Account', on_delete=models.CASCADE, related_name='incoming_transfers') # Without distinct related_names, Django raises a clash error
on_delete=CASCADE vs SET_NULL? Give a real-world example for each.User. What must you do and why?through model? Give an example scenario.Advanced Querying
values() and values_list()
Instead of returning full model objects, values() returns dictionaries and values_list() returns tuples โ much more efficient when you only need specific fields.
# values() โ list of dicts Post.objects.values('id', 'title', 'author__username') # [{'id': 1, 'title': 'Hello', 'author__username': 'alice'}, ...] # values_list() โ list of tuples Post.objects.values_list('id', 'title') # [(1, 'Hello'), (2, 'World'), ...] # flat=True for single field โ plain list Post.objects.values_list('id', flat=True) # [1, 2, 3, 4, ...] โ Just a flat list of IDs # Practical use: get all unique categories Category.objects.values_list('name', flat=True) # Practical use: build choices for a form choices = list(Category.objects.values_list('id', 'name'))
annotate() and aggregate()
aggregate() computes a single value across the whole QuerySet. annotate() computes a value per object and adds it as a new attribute.
from django.db.models import Count, Sum, Avg, Max, Min, F # โโ aggregate() โ single result dict โโโโโโโโโโโโโโโโโโโโโโโโโโโโ result = Post.objects.aggregate( total=Count('id'), avg_views=Avg('view_count'), max_views=Max('view_count'), total_views=Sum('view_count'), ) # {'total': 42, 'avg_views': 128.5, 'max_views': 5000, ...} # โโ annotate() โ per-object value โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ # Count how many posts each author has from django.contrib.auth.models import User authors = User.objects.annotate(post_count=Count('posts')) for a in authors: print(a.username, a.post_count) # Count posts per category, ordered by count Category.objects\ .annotate(post_count=Count('posts'))\ .order_by('-post_count') # Filter on annotations (HAVING clause) popular_authors = User.objects\ .annotate(post_count=Count('posts'))\ .filter(post_count__gte=5) # Annotate with Sum โ total order value per customer from shop.models import Customer, Order, OrderItem customers = Customer.objects.annotate( total_spent=Sum('orders__items__price') ).order_by('-total_spent')
F() Expressions
An F() object lets you reference a field's value in a query without pulling it into Python. The operation happens entirely in the database.
from django.db.models import F # โ BAD โ Requires two queries and a Python round-trip post = Post.objects.get(pk=1) post.view_count += 1 post.save() # โ GOOD โ Single atomic SQL UPDATE (race-condition safe!) Post.objects.filter(pk=1).update(view_count=F('view_count') + 1) # Compare two fields on same row # Products where stock is less than reorder level Product.objects.filter(stock__lt=F('reorder_level')) # Sort by computed difference Product.objects.annotate( diff=F('selling_price') - F('cost_price') ).order_by('-diff') # most profitable first
Q() Objects โ Complex OR/AND Queries
Normally, filter() chaining creates AND conditions. For OR or NOT logic, use Q() objects.
from django.db.models import Q # OR: posts about Python OR Django Post.objects.filter( Q(title__icontains='python') | Q(title__icontains='django') ) # AND with OR: active posts by Alice OR Bob Post.objects.filter( Q(is_active=True), Q(author__username='alice') | Q(author__username='bob') ) # NOT: exclude posts about "old" Post.objects.filter(~Q(title__icontains='old')) # Complex: (A OR B) AND NOT C Post.objects.filter( (Q(view_count__gt=1000) | Q(is_featured=True)) & ~Q(author__is_staff=True) ) # Dynamic Q building (e.g., from search form) q = Q() if search_term: q &= Q(title__icontains=search_term) if category_id: q &= Q(category_id=category_id) results = Post.objects.filter(q)
Conditional Expressions โ Case / When
from django.db.models import Case, When, Value, IntegerField, CharField # Annotate a "popularity" tier per post posts = Post.objects.annotate( popularity=Case( When(view_count__gte=10000, then=Value('viral')), When(view_count__gte=1000, then=Value('popular')), When(view_count__gte=100, then=Value('growing')), default=Value('new'), output_field=CharField(), ) ) # Conditional ordering: active posts first posts = Post.objects.annotate( priority=Case( When(is_active=True, then=Value(0)), default=Value(1), output_field=IntegerField(), ) ).order_by('priority', '-created_at')
Subquery() and OuterRef()
from django.db.models import OuterRef, Subquery # Annotate each author with the title of their LATEST post latest_post = Post.objects\ .filter(author=OuterRef('pk'))\ .order_by('-created_at')\ .values('title')[:1] authors = User.objects.annotate( latest_post_title=Subquery(latest_post) )
Raw SQL When Necessary
from django.db import connection # Method 1: Model.objects.raw() โ returns model instances posts = Post.objects.raw( 'SELECT * FROM blog_post WHERE view_count > %s', [100] ) # Method 2: connection.cursor() โ for arbitrary SQL with connection.cursor() as cursor: cursor.execute('SELECT COUNT(*) FROM blog_post') row = cursor.fetchone() total = row[0] # โ ๏ธ ALWAYS use parameterized queries โ NEVER string format! # โ BAD (SQL Injection risk!) cursor.execute(f'SELECT * FROM blog_post WHERE title = "{user_input}"') # โ GOOD (parameterized) cursor.execute('SELECT * FROM blog_post WHERE title = %s', [user_input])
aggregate() and annotate()? When would you use each?F('view_count') + 1 safer than fetching the object, incrementing, and saving?Nested Relations & Deep Queries
Traversing Relationships with __
The double underscore __ isn't just for field lookups โ it also traverses relationships. You can follow ForeignKey, OneToOne, and ManyToMany relationships arbitrarily deep.
# Models: Order โ OrderItem โ Product โ Category # Get orders that contain products in the "Electronics" category Order.objects.filter( items__product__category__name='Electronics' ) # Get users who wrote posts tagged with 'django' User.objects.filter(posts__tags__slug='django').distinct() # .distinct() because one user might match multiple times via M2M # Filter across 4 levels deep # Company โ Department โ Employee โ Task โ status='done' Company.objects.filter( departments__employees__tasks__status='done' ) # Annotate with nested count from django.db.models import Count Order.objects.annotate( item_count=Count('items'), # direct relation product_count=Count('items__product', distinct=True) # nested ) # Order by related field Post.objects.order_by('author__username') # sort by author's name # Filter + annotate across relations Category.objects\ .filter(posts__is_active=True)\ .annotate(active_post_count=Count('posts'))\ .filter(active_post_count__gt=5)\ .order_by('-active_post_count')
.distinct() to avoid duplicates in your QuerySet.
__ hop across a relation adds a JOIN to the generated SQL. Deep queries are powerful but can become slow โ this is exactly why Module 5 (performance) matters so much.
School โ Class โ Student โ Grade. Write a query to get all schools that have at least one student with a grade above 90..distinct() when filtering across a ManyToMany relationship?Performance & Optimization
The N+1 Query Problem
This is the most common Django performance bug. It happens when you loop over a QuerySet and access a related object inside the loop โ causing one query per iteration.
# โ N+1 PROBLEM โ 1 query for posts + N queries for authors posts = Post.objects.all() # Query 1: SELECT * FROM post for post in posts: print(post.author.username) # Query 2: SELECT * FROM user WHERE id=1 # Query 3: SELECT * FROM user WHERE id=2 # Query 4: SELECT * FROM user WHERE id=3 # ... N more queries for N posts! # With 1000 posts โ 1001 queries! โโโ # โ SOLUTION: select_related() โ Fixes it with a JOIN posts = Post.objects.select_related('author').all() # SELECT post.*, user.* FROM post JOIN user ... โ only 1 query! for post in posts: print(post.author.username) # No extra query โ already loaded!
select_related() vs prefetch_related()
| Feature | select_related() | prefetch_related() |
|---|---|---|
| How it works | SQL JOIN (single query) | Separate query + Python joining |
| Use for | ForeignKey, OneToOneField | ManyToManyField, reverse FKs, GenericRelations |
| Number of queries | 1 (with JOIN) | 2+ (one per relation) |
| Memory | Can be large if many rows | More controlled |
| Nesting | Use __ to follow chains | Use Prefetch() object for control |
from django.db.models import Prefetch # โโ select_related (FK & OneToOne โ JOIN) โโโโโโโโโโโโโโโโโโโโโ posts = Post.objects\ .select_related('author')\ # author FK .select_related('author__profile')\ # chain: author โ profile .select_related('category') # category FK # Shorthand: all in one call posts = Post.objects.select_related('author', 'category') # โโ prefetch_related (M2M & Reverse FK) โโโโโโโโโโโโโโโโโโโโโโ posts = Post.objects\ .select_related('author')\ # FK โ JOIN .prefetch_related('tags') # M2M โ separate query # โโ Prefetch() object for fine control โโโโโโโโโโโโโโโโโโโโโโโโ active_posts_prefetch = Prefetch( 'posts', queryset=Post.objects.filter(is_active=True).select_related('category'), to_attr='active_posts' # store result in .active_posts instead of .posts ) authors = User.objects.prefetch_related(active_posts_prefetch) for author in authors: for post in author.active_posts: # uses prefetched list print(post.title)
only(), defer(), count(), exists(), iterator()
# only() โ load ONLY specified fields (defer the rest) posts = Post.objects.only('id', 'title', 'created_at') # Accessing post.body will trigger an extra query! # defer() โ load everything EXCEPT specified fields posts = Post.objects.defer('body') # skip the big text column # count() โ efficient COUNT(*) SQL โ don't use len(qs)! total = Post.objects.filter(is_active=True).count() # โ total = len(Post.objects.filter(is_active=True)) # โ loads all rows # exists() โ efficient EXISTS check โ don't use if qs: if Post.objects.filter(author=user).exists(): # โ SELECT 1 LIMIT 1 print('has posts') # iterator() โ for large QuerySets, don't cache in memory for post in Post.objects.all().iterator(chunk_size=500): process(post) # Streams 500 rows at a time โ great for data exports # explain() โ show the query execution plan print(Post.objects.filter(is_active=True).explain())
Bulk Operations
# bulk_create โ insert many rows in one query posts = [ Post(title=f'Post {i}', slug=f'post-{i}', author=author, body='...') for i in range(1000) ] Post.objects.bulk_create(posts, batch_size=100) # โ ~10 queries instead of 1000! # bulk_update โ update many rows efficiently posts = list(Post.objects.filter(is_active=False)) for post in posts: post.is_active = True Post.objects.bulk_update(posts, ['is_active'], batch_size=100) # update() on QuerySet โ even better for simple updates Post.objects.filter(is_active=False).update(is_active=True) # 1 single SQL UPDATE โ fastest option
Database Indexing Best Practices
class Post(models.Model): # db_index=True โ single column index slug = models.SlugField(unique=True) # unique implies index created_at = models.DateTimeField(db_index=True) # frequent order_by is_active = models.BooleanField(db_index=True) # frequent filter class Meta: indexes = [ # Composite index for queries that filter+order together models.Index(fields=['is_active', '-created_at'], name='post_active_date_idx'), ] # โ ๏ธ Index Rules: # โ Index: columns in WHERE, ORDER BY, JOIN conditions # โ Index: high-cardinality columns (many unique values) # โ Don't index: columns with few unique values (e.g., boolean) # unless combined in a composite index # โ Don't over-index: each index slows down INSERT/UPDATE/DELETE
order.customer.name and order.items.all() for each. How many queries does this cause? How would you fix it?select_related over prefetch_related and vice versa?Post.objects.filter(...).count() and len(Post.objects.filter(...)) in terms of performance?Best Practices & Advanced Topics
Custom Managers & QuerySets
class PostQuerySet(models.QuerySet): def active(self): return self.filter(is_active=True) def by_author(self, user): return self.filter(author=user) def popular(self, min_views=1000): return self.filter(view_count__gte=min_views) class PostManager(models.Manager): def get_queryset(self): return PostQuerySet(self.model, using=self._db) # Delegate custom methods def active(self): return self.get_queryset().active() def popular(self): return self.get_queryset().popular() class Post(models.Model): # ... fields ... objects = PostManager() # Replace default manager # Now you can chain cleanly: Post.objects.active().popular().order_by('-created_at')
Signals
from django.db.models.signals import post_save, pre_delete from django.dispatch import receiver from .models import Post # Auto-create Profile when a User is created @receiver(post_save, sender=User) def create_user_profile(sender, instance, created, **kwargs): if created: Profile.objects.create(user=instance) # Log before a Post is deleted @receiver(pre_delete, sender=Post) def log_post_deletion(sender, instance, **kwargs): DeletionLog.objects.create( model='Post', object_id=instance.id, title=instance.title ) # Connect signals in AppConfig.ready() class BlogConfig(AppConfig): def ready(self): import blog.signals # โ make sure this is imported
Transactions
from django.db import transaction # Decorator โ entire function is atomic @transaction.atomic def transfer_funds(from_account, to_account, amount): from_account.balance -= amount from_account.save() to_account.balance += amount to_account.save() # If any exception โ both saves are rolled back โ # Context manager โ finer control with transaction.atomic(): order = Order.objects.create(customer=customer) for item_data in cart: OrderItem.objects.create(order=order, **item_data) cart.clear() # Savepoints โ nested atomic blocks with transaction.atomic(): order = Order.objects.create(...) try: with transaction.atomic(): # savepoint send_email(order) # might fail except EmailError: pass # rolls back only email, not order
Common Pitfalls to Avoid
- N+1 queries โ always use
select_related/prefetch_relatedwhen accessing related objects in loops - Using
len(qs)instead ofqs.count()โ loads all rows into memory - Using
if qs:instead ofqs.exists()โ same issue - Modifying QuerySets after evaluation โ re-filtering an evaluated QS hits DB again
- Raw SQL without parameterization โ SQL injection vulnerability
- Signals for complex business logic โ use service layers instead; signals are hard to trace and test
- Forgetting
.distinct()on M2M filters โ causes duplicate results - Over-indexing โ every index slows writes; only index what you filter/sort on frequently
Security Best Practices
- Always use ORM or parameterized queries โ never string-concatenate user input into SQL
- Use
get_object_or_404()in views instead of bareget() - Validate and sanitize data at the form/serializer layer before saving
- Use
select_for_update()for pessimistic locking in concurrent scenarios - Never expose raw database IDs in URLs for sensitive models โ use UUIDs or slugs
with transaction.atomic(): # Locks the row until transaction ends โ prevents race conditions account = Account.objects\ .select_for_update()\ .get(pk=account_id) account.balance -= amount account.save()
Async ORM (Django 4.1+)
# Django 4.1+ supports async ORM operations import asyncio from django.http import JsonResponse async def post_list(request): # Async versions of common operations posts = await Post.objects.filter(is_active=True)\ .aall() # async all() post = await Post.objects.aget(pk=1) # async get() await Post.objects.acreate( # async create() title='Async Post', slug='async', author=user, body='...' ) return JsonResponse({'count': await Post.objects.acount()}) # Note: prefetch_related / select_related work async too # For loops over QuerySets use: async for post in Post.objects.all()
Category, Product, Customer, Order, OrderItem, Review. Then implement:1. A custom Manager with
in_stock(), by_category(), on_sale() methods on ProductManager2. A query to get the top 10 customers by total spend, with their last order date
3. A query to get all products that are low on stock (stock < reorder_level) in the Electronics category
4. A transaction-safe
place_order(customer, cart_items) function5. A signal that updates a Product's
average_rating field whenever a Review is saved