Primary Key vs Foreign Key

What You'll Learn

By the end of this page, you will be able to:

Explain what a primary key is and why every table needs one
Explain what a foreign key is and how it establishes relationships between tables
Compare primary keys vs foreign keys — their purpose, constraints, and behavior
Understand referential integrity and how foreign keys enforce it
Choose between natural keys and surrogate keys with trade-offs
Apply best practices for defining keys in real-world schemas

The Problem: Identifying and Relating Data

Imagine a database for an e-commerce platform. You have millions of orders, thousands of products, and thousands of users. Two fundamental questions arise:

How do you uniquely identify each row? — Without a reliable identifier, you cannot distinguish one order from another, one user from another.
How do you link related data across tables? — An order belongs to a user and contains products. Without explicit relationships, the data is disconnected and unreliable.

Primary keys solve the first problem. Foreign keys solve the second. Together, they are the backbone of relational database design.

Primary Key

A primary key is a column (or set of columns) that uniquely identifies each row in a table. Every table in a relational database should have exactly one primary key.

Properties

Property	Description
Unique	No two rows can have the same primary key value
Not NULL	Primary key columns cannot contain NULL values
Immutable	Primary key values should not change over time
Only one per table	A table can have only one primary key (though it can be composite)

Creating a primary key

-- Single-column primary key
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(255)
);

-- Named constraint (preferred for production)
CREATE TABLE users (
    id INT,
    name VARCHAR(100),
    email VARCHAR(255),
    CONSTRAINT pk_users PRIMARY KEY (id)
);

-- Composite primary key (multiple columns)
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

What makes a good primary key?

Criteria	Why it matters
Unique	Must never duplicate — ensures every row is identifiable
Never NULL	NULL cannot be compared, so it cannot serve as a reliable identifier
Immutable	Changing a primary key requires updating all foreign keys referencing it
Simple	Prefer a single integer column over multi-column composite keys
Efficient	Small data types (INT, BIGINT) index and join faster than strings or UUIDs

Foreign Key

A foreign key is a column (or set of columns) in one table that references the primary key of another table. It establishes a relationship between two tables and enforces referential integrity.

Properties

Property	Description
References a primary key	The foreign key points to a primary key (or unique key) in another table
Can be NULL	Foreign keys can be NULL, meaning the relationship is optional
Can duplicate	Multiple rows can reference the same parent row (one-to-many)
Multiple per table	A table can have multiple foreign keys referencing different tables

Creating a foreign key

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT,
    order_date DATE,
    status VARCHAR(20),
    CONSTRAINT fk_orders_users
        FOREIGN KEY (user_id)
        REFERENCES users(id)
);

Referential integrity

Referential integrity ensures that every foreign key value matches an existing primary key in the referenced table — no orphaned references.

Orders 101, 102, and 103 are valid — user_id references real users.
Order 104 is invalid — user_id = 999 does not exist in users. The foreign key constraint would reject this insert.

ON DELETE / ON UPDATE actions

When a referenced row is deleted or updated, the database must decide what happens to the rows that reference it. Foreign keys support referential actions:

CONSTRAINT fk_orders_users
    FOREIGN KEY (user_id)
    REFERENCES users(id)
    ON DELETE CASCADE          -- delete orders when user is deleted
    ON UPDATE CASCADE;         -- update user_id when users.id changes

Action	ON DELETE	ON UPDATE
CASCADE	Delete referencing rows automatically	Update referencing rows automatically
SET NULL	Set foreign key to NULL	Set foreign key to NULL
SET DEFAULT	Set foreign key to its default value	Set foreign key to its default value
RESTRICT	Prevent the delete (error)	Prevent the update (error)
NO ACTION	Same as RESTRICT (deferred in some DBs)	Same as RESTRICT (deferred in some DBs)

tip

RESTRICT (or NO ACTION) is the safest default — it forces the application to handle the relationship explicitly. Use CASCADE only when you are certain that deleting a parent should always delete its children (e.g., order items when an order is deleted).

Primary Key vs Foreign Key: Comparison

	Primary Key	Foreign Key
Purpose	Uniquely identifies each row in a table	References a primary key in another table
Uniqueness	Must be unique	Can have duplicate values
NULL	Cannot be NULL	Can be NULL (optional relationship)
Per table	Exactly one	Zero or more
Direction	Is referenced by foreign keys	References a primary key
Auto-generated	Often auto-incremented or UUID	Values come from the referenced table
Index	Automatically indexed (clustered in most DBs)	Not auto-indexed in some DBs (e.g., PostgreSQL, MySQL InnoDB)
Constraint type	Entity integrity	Referential integrity

Relationship cardinality

Foreign keys define the cardinality of relationships between tables:

Relationship	Example	How it works
One-to-Many	One user → many orders	Foreign key on the "many" side (`orders.user_id`)
One-to-One	One user → one profile	Foreign key on either side with a UNIQUE constraint
Many-to-Many	Students ↔ courses	Junction table with two foreign keys

-- One-to-One: user profile
CREATE TABLE profiles (
    id INT PRIMARY KEY,
    user_id INT UNIQUE,           -- UNIQUE makes it 1:1
    bio TEXT,
    CONSTRAINT fk_profiles_users FOREIGN KEY (user_id) REFERENCES users(id)
);

-- Many-to-Many: students ↔ courses (junction table)
CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    enrolled_at TIMESTAMP,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (course_id) REFERENCES courses(id)
);

Natural Key vs Surrogate Key

When choosing a primary key, you face a fundamental design decision:

	Natural Key	Surrogate Key
Definition	A column with business meaning (e.g., email, ISBN, SSN)	An artificially generated column with no business meaning (e.g., auto-increment INT, UUID)
Example	`email`, `isbn`, `tax_id`	`id INT AUTO_INCREMENT`, `id UUID`
Pros	No extra column; directly meaningful	Never changes; simple; uniform across tables
Cons	May change (email, SSN); may be composite; can be large	Extra column; no business meaning
Performance	Can be slow if the key is a long string	Fast joins on small integer types

Surrogate key (recommended)

-- Auto-increment integer (most common)
CREATE TABLE users (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,  -- PostgreSQL 10+
    name VARCHAR(100),
    email VARCHAR(255) UNIQUE
);

-- SQL Server
CREATE TABLE users (
    id INT IDENTITY(1,1) PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(255) UNIQUE
);

-- UUID (for distributed systems)
CREATE TABLE users (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,  -- PostgreSQL 13+
    name VARCHAR(100),
    email VARCHAR(255) UNIQUE
);

When to use natural keys

Natural keys make sense when the business identifier is truly immutable and guaranteed unique:

ISO country codes (US, VN, DE)
Currency codes (USD, EUR)
Language codes (en, vi)

warning

Avoid using emails, phone numbers, or SSNs as primary keys. They can change, be reused, or have privacy implications. Use a surrogate key and put a UNIQUE constraint on the natural identifier instead.

Foreign Key Indexing

A critical performance consideration: foreign key columns should be indexed.

Without an index on orders.user_id, the database must perform a full table scan on orders every time a row in users is deleted or updated (to enforce the foreign key constraint). This can be devastating for performance.

-- PostgreSQL, MySQL InnoDB: foreign keys are NOT auto-indexed on the referencing side
-- You must create the index manually:
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- SQL Server: foreign keys ARE auto-indexed... wait, no. They are NOT.
-- Same rule applies — always index your foreign key columns.

tip

Always index foreign key columns. This is one of the most common performance mistakes in database design. Some ORMs and frameworks don't do this automatically.

Common Pitfalls

Using a natural key that changes — if your primary key changes, every foreign key referencing it must also change. Use surrogate keys to avoid this.
Missing foreign key indexes — without an index on the foreign key column, deletes and updates on the parent table cause full table scans.
Overusing CASCADE — ON DELETE CASCADE can silently delete large amounts of data. Use it intentionally, not by default.
Not handling circular references — two tables referencing each other can make inserts and deletes complicated. Break the cycle with nullable foreign keys or deferred constraints.
Using GUID/UUID as clustered index — random UUIDs cause index fragmentation. Use sequential UUIDs or integer surrogate keys for clustered primary keys.
Composite primary keys in large tables — multi-column primary keys make all foreign key references wider and slower. Prefer single-column surrogate keys.

Best Practices

Always define a primary key — every table should have one. Without it, you cannot reliably identify or update rows.
Prefer surrogate keys — use auto-increment integers or UUIDs. Keep natural identifiers as UNIQUE constraints.
Always define foreign keys — don't rely on application logic to maintain relationships. The database enforces integrity even when the application has bugs.
Index every foreign key column — this is critical for join performance and constraint enforcement.
Use RESTRICT by default — choose CASCADE only when you have a clear business requirement for cascading behavior.
Keep primary keys small — INT or BIGINT is ideal. Every foreign key duplicates the primary key value, so smaller keys mean smaller indexes and faster joins.
Never change primary key values — if you need to change an identifier, add a new surrogate key and keep the old value as a unique column.

Interview Questions

1. What is the difference between a primary key and a foreign key?

	Primary Key	Foreign Key
Purpose	Uniquely identifies a row	References a row in another table
Uniqueness	Must be unique	Can have duplicates
NULL	Cannot be NULL	Can be NULL
Per table	Exactly one	Zero or more
Enforces	Entity integrity	Referential integrity
Auto-indexed	Yes (always)	Depends on the database

2. What is referential integrity and how is it enforced?

Referential integrity ensures that every foreign key value matches an existing primary key in the referenced table. It prevents:

Inserting a child row with a non-existent parent
Deleting a parent row that is still referenced by children (unless CASCADE)
Updating a parent's primary key while children still reference the old value

It is enforced by the database through foreign key constraints — the database checks every INSERT, UPDATE, and DELETE against the constraint.

3. What is the difference between a natural key and a surrogate key?

	Natural Key	Surrogate Key
Source	Business data (email, ISBN, SSN)	Generated (auto-increment, UUID)
Meaningful	Yes — has business meaning	No — artificial identifier
Stability	May change over time	Never changes
Size	Can be large (strings)	Small and uniform (INT, BIGINT)
Recommendation	Use as UNIQUE constraint	Use as primary key

Surrogate keys are generally preferred because they are stable, simple, and performant.

4. What happens when you delete a row that is referenced by a foreign key?

It depends on the ON DELETE action defined on the foreign key constraint:

Action	Result
RESTRICT / NO ACTION	Delete is rejected — error is raised
CASCADE	All referencing rows are automatically deleted
SET NULL	Foreign key columns in referencing rows are set to NULL
SET DEFAULT	Foreign key columns are set to their default value

The default is RESTRICT, which is the safest option.

5. Why should foreign key columns be indexed?

Without an index on the foreign key column:

Deleting a parent row requires a full table scan on the child table to verify no references exist (or to find rows to cascade)
Joining parent and child tables is slow — the database cannot efficiently look up matching rows
Locking can escalate from row-level to table-level under some databases (e.g., MySQL InnoDB)

Indexing foreign key columns ensures O(log n) lookups instead of O(n) scans.

6. Can a table have multiple foreign keys?

Yes. A table can have as many foreign keys as needed, each referencing a different table (or even the same table for self-referencing relationships).

CREATE TABLE order_items (
    id INT PRIMARY KEY,
    order_id INT REFERENCES orders(id),      -- FK to orders
    product_id INT REFERENCES products(id),  -- FK to products
    quantity INT
);

Common examples:

order_items references both orders and products
employees references departments (department) and employees (manager) — self-referencing
posts references users (author) and categories (category)

Learn More

ACID Properties — how transactions guarantee consistency
SQL Joins — querying data across related tables
Database Indexing — how indexes speed up queries on keys
Database Locking — concurrency control when modifying related data

What You'll Learn​

The Problem: Identifying and Relating Data​

Primary Key​

Properties​

Creating a primary key​

What makes a good primary key?​

Foreign Key​

Properties​

Creating a foreign key​

Referential integrity​

ON DELETE / ON UPDATE actions​

Primary Key vs Foreign Key: Comparison​

Relationship cardinality​

Natural Key vs Surrogate Key​

Surrogate key (recommended)​

When to use natural keys​

Foreign Key Indexing​

Common Pitfalls​

Best Practices​

Interview Questions​

1. What is the difference between a primary key and a foreign key?​

2. What is referential integrity and how is it enforced?​

3. What is the difference between a natural key and a surrogate key?​

4. What happens when you delete a row that is referenced by a foreign key?​

5. Why should foreign key columns be indexed?​

6. Can a table have multiple foreign keys?​

Learn More​

What You'll Learn

The Problem: Identifying and Relating Data

Primary Key

Properties

Creating a primary key

What makes a good primary key?

Foreign Key

Properties

Creating a foreign key

Referential integrity

ON DELETE / ON UPDATE actions

Primary Key vs Foreign Key: Comparison

Relationship cardinality

Natural Key vs Surrogate Key

Surrogate key (recommended)

When to use natural keys

Foreign Key Indexing

Common Pitfalls

Best Practices

Interview Questions

1. What is the difference between a primary key and a foreign key?

2. What is referential integrity and how is it enforced?

3. What is the difference between a natural key and a surrogate key?

4. What happens when you delete a row that is referenced by a foreign key?

5. Why should foreign key columns be indexed?

6. Can a table have multiple foreign keys?

Learn More