DIFFERENCE BETWEEN UNION AND UNION ALL

In short, the difference between UNION operator and UNION ALL (UNION vs UNION ALL) operator is UNION gives unique results set in the output even though the combined result set has some duplicate records. It is like applying DISTINCT on the final result rest. But UNION ALL won’t filter any records.

To combine rows from two or more queries into a single result set, we use UNION operator. Below shows the basic syntax of the UNION operator:

QUERY-1
UNION [ALL]
QUERY-2
UNION [ALL]
QUERY-3
.
...;

Both UNION and UNION ALL operators combine rows from result sets into a single result set.

As discussed the only difference (union vs union all) is UNION ALL won’t eliminate the duplicate rows.

Performance factor: UNION ALL operator does not remove duplicate rows hence it runs faster than the UNION (has to apply DISTINCT – overhead to the system) operator.

A UNION command does a SELECT DISTINCT on the final results set. If you are sure that all the records returned are unique from your UNION, use UNION ALL instead, it gives results faster.

Let us dig into deeper on this and I will show you using one sample query. Create two tables departments and departments_new with the following syntax, this is written for Oracle database only and might not work for other databases.

-- create departments table
CREATE TABLE DEPARTMENTS (
name VARCHAR2(255),
location VARCHAR2(255),
country VARCHAR2(255)
) ;
-- create departments_new table
CREATE TABLE DEPARTMENTS_NEW (
name VARCHAR2(255),
location VARCHAR2(255)
) ;

if you notice there is no country column in DEPARTMENTS_NEW table.

Insert below 5 sample records into DEPARTMENTS table:

INSERT INTO DEPARTMENTS (name,location,country) VALUES ('Shipping and Receiving','Tanquecitos','United States'); 

INSERT INTO DEPARTMENTS (name,location, country) VALUES ('Human Resources', 'Sugarloaf','United States'); 

INSERT INTO DEPARTMENTS (name,location,country) VALUES ('Systems Architecture', 'Dale City','United States'); 

INSERT INTO DEPARTMENTS (name,location,country) VALUES ('Future Products','Grosvenor','United States'); 

INSERT INTO DEPARTMENTS (name,location,country) VALUES ('Delivery','Riverside','United States');


Insert below 5 sample records into departments_new table:

INSERT INTO DEPARTMENTS_NEW (name,location) VALUES ('Shipping and Receiving','Tanquecitos');

INSERT INTO DEPARTMENTS_NEW (name,location) VALUES ('Human Resources', 'Sugarloaf');

INSERT INTO DEPARTMENTS_NEW (name,location) VALUES ('Sales', 'Fourth state');

INSERT INTO DEPARTMENTS_NEW (name,location) VALUES ('Order Processing','Red gate');

INSERT INTO DEPARTMENTS_NEW (name,location) VALUES ('Accounts','Cross road');


Now the test tables are ready for our ‘union vs union all’ testing!

UNION:

SELECT name,location FROM DEPARTMENTS
UNION
SELECT name,location FROM DEPARTMENTS_NEW;

So now the result set has only 8 records instead of total 10 records, as we have two duplicate department records such as ‘Shipping and Receiving’ and ‘Human Resources’ including locations, they are eliminated in the final results set.

UNION ALL:

SELECT name,location FROM DEPARTMENTS
UNION ALL
SELECT name,location FROM DEPARTMENTS_NEW;

Now the result set has total of 10 records including duplicates from both the tables.

Rules for using UNION and UNION ALL:

  1. The number of columns in all queries should be the same. You will get an error like below as DEPARTMENTS table has extra column ‘country’
  1. Corresponding columns should have compatible data types.
  2. Column names of the first query decide the column names of the result set. You can observe the same in the below image, the final column name is given from the first query.
  1. The GROUP BY and HAVING clauses are applied to each query, not the final result set.
  2. The ORDER BY clause is applied to the combined result set, not within the individual result set.

Read also:

Comments

comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: