SQL Joins

What You'll Learn

By the end of this page, you will be able to:

Explain why joins are necessary in relational databases
Write INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN queries
Understand the difference between each join type using Venn diagrams and result tables
Use CROSS JOIN and SELF JOIN for specialized scenarios
Avoid common performance pitfalls when writing join queries
Choose the right join type for a given business question

The Problem: Data Lives in Multiple Tables

Relational databases split data across tables to avoid duplication and maintain integrity (normalization). But to answer real questions, you often need to combine data from multiple tables.

employees table:          departments table:
+----+--------+--------+  +--------+------------------+
| id | name   | dept_id|  | dept_id| dept_name        |
+----+--------+--------+  +--------+------------------+
|  1 | Alice  |   10   |  |   10   | Engineering      |
|  2 | Bob    |   20   |  |   20   | Marketing        |
|  3 | Carol  |   30   |  |   30   | Sales            |
|  4 | Dave   |  NULL  |  |   40   | HR               |
+----+--------+--------+  +--------+------------------+

Questions you can't answer from a single table:

"Which department does Alice work in?" — needs both tables
"Which departments have no employees?" — needs both tables
"Who has no department?" — needs both tables

JOINs are the SQL mechanism that combines rows from two tables based on a related column between them.

Sample Tables

All examples use these two tables:

CREATE TABLE departments (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50)
);

INSERT INTO departments VALUES
    (10, 'Engineering'),
    (20, 'Marketing'),
    (30, 'Sales'),
    (40, 'HR');

CREATE TABLE employees (
    id      INT PRIMARY KEY,
    name    VARCHAR(50),
    dept_id INT REFERENCES departments(dept_id)
);

INSERT INTO employees VALUES
    (1, 'Alice', 10),
    (2, 'Bob', 20),
    (3, 'Carol', 30),
    (4, 'Dave', NULL),
    (5, 'Eve', 10);

Key observations:

Department 40 (HR) has no employees
Employee 4 (Dave) has no department (dept_id is NULL)
This asymmetry makes the differences between join types visible

Join Types Overview

INNER JOIN

Returns only the rows where there is a match in both tables. Non-matching rows from either table are excluded.

SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

Result:

name	dept_name
Alice	Engineering
Bob	Marketing
Carol	Sales
Eve	Engineering

4 rows returned — only employees with a valid department
Dave is excluded (no department)
HR is excluded (no employees)

tip

INNER JOIN is the default — writing just JOIN is equivalent to INNER JOIN.

LEFT JOIN (Left Outer Join)

Returns all rows from the left table and the matching rows from the right table. If there's no match, the right side columns are filled with NULL.

SELECT e.name, d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

Result:

name	dept_name
Alice	Engineering
Bob	Marketing
Carol	Sales
Dave	NULL
Eve	Engineering

5 rows returned — all employees are included
Dave's dept_name is NULL because he has no department
HR is still excluded (no employees)

Finding rows with no match

A common pattern is using LEFT JOIN ... WHERE right.id IS NULL to find rows that exist in the left table but have no match in the right table:

-- Find employees with no department
SELECT e.name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
WHERE d.dept_id IS NULL;

Result:

name
Dave

RIGHT JOIN (Right Outer Join)

Returns all rows from the right table and the matching rows from the left table. If there's no match, the left side columns are filled with NULL.

SELECT e.name, d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

Result:

name	dept_name
Alice	Engineering
Bob	Marketing
Carol	Sales
NULL	HR
Eve	Engineering

5 rows returned — all departments are included
HR has NULL for name because it has no employees
Dave is now excluded (no department)

tip

RIGHT JOIN is functionally equivalent to LEFT JOIN with the table order swapped. In practice, most developers prefer LEFT JOIN for readability — it keeps the "primary" table first.

FULL OUTER JOIN

Returns all rows from both tables. Matching rows are combined; non-matching rows from either side get NULL for the other table's columns.

SELECT e.name, d.dept_name
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.dept_id;

Result:

name	dept_name
Alice	Engineering
Bob	Marketing
Carol	Sales
Dave	NULL
Eve	Engineering
NULL	HR

6 rows returned — every employee and every department
Dave has NULL for department (no department)
HR has NULL for name (no employees)

Finding unmatched rows from both sides

-- Find employees with no department AND departments with no employees
SELECT e.name, d.dept_name
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.dept_id
WHERE e.dept_id IS NULL OR d.dept_id IS NULL;

Result:

name	dept_name
Dave	NULL
NULL	HR

CROSS JOIN

Returns the Cartesian product — every row from the left table combined with every row from the right table. No ON condition is needed.

SELECT e.name, d.dept_name
FROM employees e
CROSS JOIN departments d;

Result: 5 employees x 4 departments = 20 rows

name	dept_name
Alice	Engineering
Alice	Marketing
Alice	Sales
Alice	HR
Bob	Engineering
...	...

warning

CROSS JOIN can produce very large result sets. A cross join of two tables with 10,000 rows each produces 100,000,000 rows. Use it intentionally, not accidentally.

When to use CROSS JOIN

Generating all combinations (e.g., all sizes x all colors)
Creating a matrix or grid of values
Building reporting periods by crossing date ranges with entities

SELF JOIN

A table joined to itself. Useful when rows in the same table have hierarchical or peer relationships.

Example: Employee and Manager

ALTER TABLE employees ADD COLUMN manager_id INT REFERENCES employees(id);

-- Alice and Bob report to Eve (manager_id = 5)
-- Carol reports to Alice (manager_id = 1)
-- Dave and Eve have no manager
UPDATE employees SET manager_id = 5 WHERE id IN (1, 2);
UPDATE employees SET manager_id = 1 WHERE id = 3;

SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

Result:

employee	manager
Alice	Eve
Bob	Eve
Carol	Alice
Dave	NULL
Eve	NULL

Common self-join scenarios

Scenario	Table references itself to find...
Org chart	Employee → Manager relationship
Bill of materials	Assembly → Component relationship
Category trees	Category → Parent category
Friend/ follower graphs	User → Related user

Join Type Comparison

	INNER	LEFT	RIGHT	FULL OUTER	CROSS
Returns	Matching rows only	All left + matching right	All right + matching left	All rows from both	Every combination
Unmatched left rows	Excluded	Included (NULL right)	Excluded	Included (NULL right)	N/A
Unmatched right rows	Excluded	Excluded	Included (NULL left)	Included (NULL left)	N/A
Result size	Smallest	Depends	Depends	Largest (no CROSS)	Left x Right
Common use	Default join	Most business queries	Rare (swap to LEFT)	Data auditing	Combinations

Multiple Joins

Real queries often join more than two tables. Joins chain left-to-right:

SELECT
    e.name AS employee,
    d.dept_name,
    p.project_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
LEFT JOIN employee_projects ep ON e.id = ep.employee_id
LEFT JOIN projects p ON ep.project_id = p.id;

tip

Mix join types deliberately. Use INNER JOIN for required relationships and LEFT JOIN for optional ones.

Join Performance

How the database executes joins

Databases use different join algorithms depending on table size, indexes, and query complexity:

Algorithm	Best For	Time Complexity	Memory
Nested Loop	Indexed joins, small outer table	O(M x log N) with index	Low
Hash Join	Large unsorted tables, equijoins	O(M + N)	O(smaller table)
Merge Join	Pre-sorted data, large tables	O(M + N)	Low

Performance tips

Index the join columns — the most impactful optimization. Foreign key columns should be indexed.

-- This makes joins on dept_id fast
CREATE INDEX idx_employees_dept_id ON employees(dept_id);

Filter early — apply WHERE conditions before joining to reduce row counts:

-- Good: filters before joining
SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Engineering';

-- Bad: joins everything, then filters
-- (optimizer may fix this, but be explicit)

Select only needed columns — avoid SELECT * in join queries:

-- Good
SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- Avoid
SELECT *
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

Be cautious with multiple joins — each join can multiply the intermediate result set. Check row counts with EXPLAIN.
Use explicit join conditions — avoid comma-style joins (cross joins with a WHERE filter) as they are harder to read and more error-prone.

-- Good: explicit JOIN syntax
SELECT e.name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- Avoid: implicit join (old syntax)
SELECT e.name, d.dept_name
FROM employees e, departments d
WHERE e.dept_id = d.dept_id;

Common Pitfalls

Accidental cross join — forgetting the ON clause produces a Cartesian product. Always include a join condition.
NULL in join columns — NULL = NULL evaluates to UNKNOWN (not TRUE), so rows with NULL join keys are excluded from INNER JOINs silently.
Duplicate rows from multiple matches — if the join column has duplicate values, you get more rows than expected. Check for duplicates with COUNT(*).
Wrong join type — using INNER JOIN when you need LEFT JOIN silently drops rows with no match. Think about whether unmatched rows matter.
Joining on the wrong column — always verify the join condition references the correct foreign key relationship.
Filtering in WHERE vs ON — for LEFT JOIN, putting right-table conditions in WHERE turns it into an effective INNER JOIN:

-- This EXCLUDES departments with no employees (WHERE filters after join)
SELECT d.dept_name, e.name
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
WHERE e.name LIKE 'A%';

-- This KEEPS all departments (ON filters before join)
SELECT d.dept_name, e.name
FROM departments d
LEFT JOIN employees e ON d.dept_id = d.dept_id AND e.name LIKE 'A%';

Best Practices

Use explicit JOIN syntax — not comma-separated tables with WHERE filters
Always qualify column names — use table aliases (e.name, d.dept_name) to avoid ambiguity
Index join columns — foreign keys and join predicates should be indexed
Choose the correct join type — INNER for required matches, LEFT for optional data
Prefer LEFT over RIGHT — swap table order and use LEFT for readability
Use table aliases — keep queries concise, but use meaningful aliases (e for employees, d for departments)
Check with EXPLAIN — verify the query plan uses the right join algorithm and indexes

Interview Questions

1. What is the difference between INNER JOIN and LEFT JOIN?

	INNER JOIN	LEFT JOIN
Returns	Only matching rows from both tables	All rows from left table + matching right rows
Unmatched left rows	Excluded	Included (right columns are NULL)
Use case	"Show me employees and their departments"	"Show me all employees, even those without departments"
Result size	Less than or equal to left table rows	Equal to left table rows

2. What is the difference between LEFT JOIN and RIGHT JOIN?

They are mirror images of each other:

A LEFT JOIN B = all rows from A + matching rows from B
A RIGHT JOIN B = all rows from B + matching rows from A

Any RIGHT JOIN can be rewritten as a LEFT JOIN by swapping the table order:

-- These are equivalent:
SELECT * FROM A RIGHT JOIN B ON A.id = B.a_id;
SELECT * FROM B LEFT JOIN A ON B.a_id = A.id;

In practice, LEFT JOIN is preferred for readability — keep the "driving" table first.

3. What is a FULL OUTER JOIN and when would you use it?

FULL OUTER JOIN returns all rows from both tables, with NULLs where there's no match. Use it when you need:

Data reconciliation — comparing two datasets to find mismatches
Finding orphaned records — rows in either table with no match in the other
Merging datasets — combining two sources while preserving all records

-- Find all unmatched rows in either table
SELECT e.name, d.dept_name
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.dept_id
WHERE e.id IS NULL OR d.dept_id IS NULL;

4. What is the difference between ON and WHERE in a LEFT JOIN?

This is a common trap:

-- WHERE clause: filters AFTER the join → turns LEFT into effective INNER
SELECT d.dept_name, e.name
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
WHERE e.name IS NOT NULL;

-- ON clause: filters DURING the join → preserves LEFT behavior
SELECT d.dept_name, e.name
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id AND e.name IS NOT NULL;

ON — determines which rows to join. Unmatched left rows still appear with NULLs.
WHERE — filters the final result. WHERE e.name IS NOT NULL removes rows where e.name is NULL, effectively turning the LEFT JOIN into an INNER JOIN.

5. How do you find rows in one table that have no match in another?

Use LEFT JOIN ... WHERE right.key IS NULL:

-- Employees with no department
SELECT e.name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
WHERE d.dept_id IS NULL;

-- Departments with no employees
SELECT d.dept_name
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
WHERE e.id IS NULL;

Alternatively, use NOT EXISTS (often preferred for readability):

SELECT e.name
FROM employees e
WHERE NOT EXISTS (
    SELECT 1 FROM departments d WHERE d.dept_id = e.dept_id
);

6. What is a self join and when would you use it?

A self join is a regular join where both sides reference the same table. It requires table aliases to distinguish the two "copies":

-- Find employees and their managers
SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

Common use cases: organizational hierarchies, category trees, adjacency lists, and comparing rows within the same table.

Learn More

Database Indexing — how indexes speed up join operations
ACID Properties — transaction guarantees that protect data integrity
Database Replication — how data is distributed across nodes

What You'll Learn​

The Problem: Data Lives in Multiple Tables​

Sample Tables​

Join Types Overview​

INNER JOIN​

LEFT JOIN (Left Outer Join)​

Finding rows with no match​

RIGHT JOIN (Right Outer Join)​

FULL OUTER JOIN​

Finding unmatched rows from both sides​

CROSS JOIN​

When to use CROSS JOIN​

SELF JOIN​

Example: Employee and Manager​

Common self-join scenarios​

Join Type Comparison​

Multiple Joins​

Join Performance​

How the database executes joins​

Performance tips​

Common Pitfalls​

Best Practices​

Interview Questions​

1. What is the difference between INNER JOIN and LEFT JOIN?​

2. What is the difference between LEFT JOIN and RIGHT JOIN?​

3. What is a FULL OUTER JOIN and when would you use it?​

4. What is the difference between ON and WHERE in a LEFT JOIN?​

5. How do you find rows in one table that have no match in another?​

6. What is a self join and when would you use it?​

Learn More​

What You'll Learn

The Problem: Data Lives in Multiple Tables

Sample Tables

Join Types Overview

INNER JOIN

LEFT JOIN (Left Outer Join)

Finding rows with no match

RIGHT JOIN (Right Outer Join)

FULL OUTER JOIN

Finding unmatched rows from both sides

CROSS JOIN

When to use CROSS JOIN

SELF JOIN

Example: Employee and Manager

Common self-join scenarios

Join Type Comparison

Multiple Joins

Join Performance

How the database executes joins

Performance tips

Common Pitfalls

Best Practices

Interview Questions

1. What is the difference between INNER JOIN and LEFT JOIN?

2. What is the difference between LEFT JOIN and RIGHT JOIN?

3. What is a FULL OUTER JOIN and when would you use it?

4. What is the difference between ON and WHERE in a LEFT JOIN?

5. How do you find rows in one table that have no match in another?

6. What is a self join and when would you use it?

Learn More