Skip to main content

SQL UNION

What You'll Learn

By the end of this page, you will be able to:

  • Explain what set operations are and how they combine result sets in SQL
  • Write UNION, UNION ALL, INTERSECT, and EXCEPT queries
  • Understand the difference between UNION and UNION ALL and when to use each
  • Apply the column compatibility rules that govern set operations
  • Avoid common performance and correctness pitfalls
  • Choose the right set operation for a given problem

The Problem: Combining Results from Multiple Queries

Sometimes you need data that spans multiple tables or multiple conditions that can't be expressed in a single SELECT. Set operations let you combine the results of two or more queries into a single result set.

customers table:          suppliers table:
+----+--------+-------+ +----+--------+-------+
| id | name | city | | id | name | city |
+----+--------+-------+ +----+--------+-------+
| 1 | Alice | Hanoi | | 1 | Gamma | HCMC |
| 2 | Bob | HCMC | | 2 | Delta | Hanoi |
| 3 | Carol | Hanoi | | 3 | Epslon| Da Nang|
+----+--------+-------+ +----+--------+-------+

Questions you can't answer from a single query:

  • "Give me all contacts — both customers and suppliers" — needs to combine two tables
  • "Which cities have both customers and suppliers?" — needs to find the overlap
  • "Which cities have customers but no suppliers?" — needs to find the difference

Set operations solve these by treating query results as mathematical sets.

Sample Tables

All examples use these tables:

CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(50),
city VARCHAR(50)
);

INSERT INTO customers VALUES
(1, 'Alice', 'Hanoi'),
(2, 'Bob', 'HCMC'),
(3, 'Carol', 'Hanoi');

CREATE TABLE suppliers (
id INT PRIMARY KEY,
name VARCHAR(50),
city VARCHAR(50)
);

INSERT INTO suppliers VALUES
(1, 'Gamma', 'HCMC'),
(2, 'Delta', 'Hanoi'),
(3, 'Epsilon', 'Da Nang');

Key observations:

  • Hanoi appears in both tables (Alice, Carol are customers; Delta is a supplier)
  • HCMC appears in both tables (Bob is a customer; Gamma is a supplier)
  • Da Nang appears only in suppliers
  • The name column has no overlap — useful for demonstrating deduplication

Set Operations Overview

Rules for Set Operations

Before diving into each operation, all set operations follow these rules:

  1. Same number of columns — both queries must return the same column count
  2. Compatible data types — corresponding columns must have compatible types
  3. Column names — the result uses column names from the first query
  4. ORDER BY — can only appear once, at the very end, and applies to the entire result
-- Valid: same columns, compatible types
SELECT name, city FROM customers
UNION
SELECT name, city FROM suppliers;

-- Invalid: different column counts
SELECT name, city, email FROM customers
UNION
SELECT name, city FROM suppliers; -- ERROR: each UNION query must have the same number of columns

UNION

Returns all distinct rows from both queries. Duplicate rows are removed.

SELECT name, city FROM customers
UNION
SELECT name, city FROM suppliers;

Result:

namecity
AliceHanoi
BobHCMC
CarolHanoi
DeltaHanoi
EpsilonDa Nang
GammaHCMC
  • 6 rows returned — all distinct combinations
  • Since all names are unique across both tables, no duplicates are removed here

When duplicates exist

-- Get all cities from both tables
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

Result:

city
Hanoi
HCMC
Da Nang
  • 3 rows returned — Hanoi and HCMC appear in both tables but UNION keeps only one copy
  • Without UNION (using UNION ALL), you'd get 6 rows

UNION ALL

Returns all rows from both queries, including duplicates. No deduplication step is performed.

SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;

Result:

city
Hanoi
HCMC
Hanoi
HCMC
Hanoi
Da Nang
  • 6 rows returned — every row from both queries, no deduplication
tip

Always prefer UNION ALL when you know duplicates are impossible or desired. UNION removes duplicates by sorting/hashing the result set, which adds overhead. UNION ALL simply appends results — it's faster.

UNION vs UNION ALL

UNIONUNION ALL
DuplicatesRemovedKept
PerformanceSlower (dedup step)Faster (no dedup)
Use whenYou need unique rows onlyDuplicates don't matter or are impossible
Equivalent toSELECT DISTINCT ... across all resultsConcatenation of results

INTERSECT

Returns only the rows that appear in both query results.

-- Cities that have BOTH customers and suppliers
SELECT city FROM customers
INTERSECT
SELECT city FROM suppliers;

Result:

city
Hanoi
HCMC
  • Only cities present in both tables are returned
  • Da Nang is excluded because no customer lives there
warning

MySQL does not support INTERSECT natively. Use INNER JOIN or WHERE ... IN as a workaround:

-- MySQL alternative for INTERSECT
SELECT DISTINCT c.city
FROM customers c
INNER JOIN suppliers s ON c.city = s.city;

EXCEPT

Returns rows from the first query that are not in the second query. Also known as MINUS in Oracle.

-- Cities with customers but NO suppliers
SELECT city FROM customers
EXCEPT
SELECT city FROM suppliers;

Result:

city
(empty — all customer cities also have suppliers)

Now let's reverse it:

-- Cities with suppliers but NO customers
SELECT city FROM suppliers
EXCEPT
SELECT city FROM customers;

Result:

city
Da Nang
  • Da Nang has a supplier but no customer — only it survives the EXCEPT
warning

MySQL does not support EXCEPT natively. Use LEFT JOIN ... WHERE IS NULL or NOT EXISTS:

-- MySQL alternative for EXCEPT
SELECT DISTINCT s.city
FROM suppliers s
LEFT JOIN customers c ON s.city = c.city
WHERE c.city IS NULL;
tip

In Oracle, EXCEPT is written as MINUS. They are functionally identical.

-- Oracle syntax
SELECT city FROM customers
MINUS
SELECT city FROM suppliers;

Set Operation Comparison

UNIONUNION ALLINTERSECTEXCEPT
ReturnsAll distinct rowsAll rowsRows in bothRows in first only
DuplicatesRemovedKeptRemovedRemoved
PerformanceModerateFastestSlowestSlow
MySQL supportYesYesNo (use JOIN)No (use JOIN)
AnalogyA ∪ BA ∪ B (with dupes)A ∩ BA − B

Practical Examples

Combining data from similar tables

A common use case is combining rows from tables with similar structure:

-- All contacts (customers + suppliers) with a source label
SELECT name, city, 'Customer' AS source
FROM customers
UNION ALL
SELECT name, city, 'Supplier' AS source
FROM suppliers;

Result:

namecitysource
AliceHanoiCustomer
BobHCMCCustomer
CarolHanoiCustomer
GammaHCMCSupplier
DeltaHanoiSupplier
EpsilonDa NangSupplier

Merging reports across time periods

-- Sales from Q1 and Q2, combined
SELECT product_id, amount, 'Q1' AS quarter FROM sales_q1
UNION ALL
SELECT product_id, amount, 'Q2' AS quarter FROM sales_q2
ORDER BY product_id, quarter;

Finding orphaned records

-- Products that exist in the catalog but have never been ordered
SELECT product_id FROM products
EXCEPT
SELECT DISTINCT product_id FROM order_items;

Finding complete vs incomplete records

-- Users who have both a profile AND an active subscription
SELECT user_id FROM profiles
INTERSECT
SELECT user_id FROM subscriptions WHERE status = 'active';

Performance Tips

1. Prefer UNION ALL over UNION

UNION performs an implicit DISTINCT operation — it must sort or hash the entire result set to eliminate duplicates. UNION ALL simply appends.

-- Slow: implicit dedup
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- Fast: no dedup (use when you know there are no duplicates)
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;

2. Filter before combining

Apply WHERE clauses in each individual query rather than filtering after the set operation:

-- Good: filters in each query, less data to combine
SELECT name, city FROM customers WHERE city = 'Hanoi'
UNION ALL
SELECT name, city FROM suppliers WHERE city = 'Hanoi';

-- Inefficient: combines everything, then filters
SELECT * FROM (
SELECT name, city FROM customers
UNION ALL
SELECT name, city FROM suppliers
) combined
WHERE city = 'Hanoi';

3. Index columns used in set operations

If you're frequently running INTERSECT or EXCEPT on a column, make sure it's indexed:

CREATE INDEX idx_customers_city ON customers(city);
CREATE INDEX idx_suppliers_city ON suppliers(city);

4. Limit early

Use LIMIT / TOP in subqueries when you only need a sample:

-- Only need 10 recent entries from each source
(SELECT id, name, created_at FROM customers ORDER BY created_at DESC LIMIT 10)
UNION ALL
(SELECT id, name, created_at FROM suppliers ORDER BY created_at DESC LIMIT 10)
ORDER BY created_at DESC;
tip

When combining ORDER BY or LIMIT with set operations, wrap each individual query in parentheses. The final ORDER BY applies to the entire combined result.

Common Pitfalls

  • Mixing up UNION and UNION ALL — UNION removes duplicates (slower), UNION ALL keeps them (faster). Choose deliberately.
  • Column mismatch — the most common runtime error. Both queries must return the same number of columns with compatible types.
  • Misleading column names — the result uses column names from the first query. If the second query's columns have different names, you may get confusing headers. Use aliases in the first query.
  • ORDER BY placementORDER BY can only appear once, at the end of the entire statement. Placing it in individual queries (without parentheses) causes an error.
  • NULL comparisons in INTERSECT/EXCEPT — set operations treat two NULLs as equal, unlike standard SQL comparisons where NULL = NULL is UNKNOWN.
  • Assuming order — set operations do not guarantee row order. Always use ORDER BY if order matters.

Best Practices

  1. Use UNION ALL by default — only switch to UNION when you explicitly need deduplication
  2. Add a source column — when combining similar tables, include a label column to track where each row came from
  3. Wrap individual queries in parentheses — when using ORDER BY or LIMIT per query
  4. Verify column alignment — double-check that corresponding columns have compatible types and semantics
  5. Use INTERSECT/EXCEPT for data validation — great for finding overlaps and differences between datasets
  6. Consider alternatives — sometimes a single query with OR, IN, or JOIN is simpler and faster than a set operation
-- Sometimes simpler than UNION
SELECT * FROM contacts
WHERE type = 'Customer' OR type = 'Supplier';

-- Sometimes simpler than INTERSECT
SELECT DISTINCT c.city
FROM customers c
INNER JOIN suppliers s ON c.city = s.city;

Interview Questions

1. What is the difference between UNION and UNION ALL?

UNIONUNION ALL
DuplicatesRemovedKept
PerformanceSlower (needs dedup)Faster (no dedup)
When to useNeed unique rowsDuplicates don't matter

UNION performs an extra step to eliminate duplicate rows (like wrapping the result in DISTINCT). UNION ALL simply concatenates the results. If you know duplicates are impossible or acceptable, always use UNION ALL for better performance.

2. What are the rules for SQL set operations?

  1. Both queries must return the same number of columns
  2. Corresponding columns must have compatible data types
  3. The result uses column names from the first query
  4. ORDER BY can only appear once, at the end of the entire statement
  5. NULLs are treated as equal for deduplication purposes

3. How would you find rows in one table that don't exist in another?

Use EXCEPT (or MINUS in Oracle):

SELECT product_id FROM products
EXCEPT
SELECT DISTINCT product_id FROM order_items;

In MySQL, use a workaround:

SELECT p.product_id
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE oi.product_id IS NULL;

4. When would you use INTERSECT vs INNER JOIN?

Both find overlapping data, but they work at different levels:

INTERSECTINNER JOIN
ComparesEntire rowsSpecific columns
ReturnsDistinct rows from both queriesCombined columns from both tables
DuplicatesRemoved automaticallyMay produce duplicates (one-to-many)
Use whenComparing complete row valuesNeed columns from both tables
-- INTERSECT: cities in both tables (just the city value)
SELECT city FROM customers
INTERSECT
SELECT city FROM suppliers;

-- INNER JOIN: same question, but can include more columns
SELECT DISTINCT c.city, c.name AS customer_name, s.name AS supplier_name
FROM customers c
INNER JOIN suppliers s ON c.city = s.city;

5. Can you use ORDER BY with UNION? How?

Yes, but only once — at the very end of the entire statement:

SELECT name, city FROM customers
UNION ALL
SELECT name, city FROM suppliers
ORDER BY city, name; -- applies to the combined result

To order individual queries before combining, wrap them in parentheses:

(SELECT name, city FROM customers ORDER BY name)
UNION ALL
(SELECT name, city FROM suppliers ORDER BY name)
ORDER BY city; -- final ordering of combined result

6. How do you add a "source" column to distinguish rows in a UNION?

Add a literal string in each query:

SELECT name, city, 'Customer' AS source FROM customers
UNION ALL
SELECT name, city, 'Supplier' AS source FROM suppliers;

This is a common pattern when merging data from similar tables and you need to track which table each row came from.

Learn More