A Closer Look at SQL Joins
SQL is by far one of the most popular and powerful languages for dealing with databases, and, to me, one of its greatest (and basic) features is the ability to join tables. Doing so is a necessity for anyone who finds himself working with two or more tables which are interconnected.
In the business world, it’s common practice to have, for example, a table with customer data and order ids, a second table containing – for each order id – the products ordered, and another table in which products descriptions are stored.
The problem arises whenever we have the need to find what products a customer, identified by a number, has ordered.
Table Setup
For clarity and ease of explanation and understanding, we’ll simplify the above situation just a bit.
Our company needs two tables: one, called customers, storing customer information, while the other one, called products is filled with our magnificent limited edition pens details (yeah, we sell unique pens…).
The resulting table layout of the query
CREATE TABLE `customers` ( `customer_id` INT( 11 ) NOT NULL , `customer_name` VARCHAR( 255 ) NOT NULL , PRIMARY KEY ( `customer_id` ) ) ENGINE = MYISAM; <br /> CREATE TABLE `products` ( `pen_color ` VARCHAR( 255 ) NOT NULL , `customer_id` INT( 11 ) , PRIMARY KEY ( `pen_color ` ) ) ENGINE = MYISAM
will be something like:
+---------------+--------------+------+-----+---------+-------+ | FIELD | Type | NULL | KEY | DEFAULT | Extra | +---------------+--------------+------+-----+---------+-------+ | customer_id | int(11) | NO | PRI | NULL | | | customer_name | varchar(255) | NO | | NULL | | +---------------+--------------+------+-----+---------+-------+ <br /> +--------------+--------------+------+-----+---------+-------+ | FIELD | Type | NULL | KEY | DEFAULT | Extra | +--------------+--------------+------+-----+---------+-------+ | pen_color | varchar(255) | NO | PRI | | | | customer_id | int(11) | NO | | NULL | | +--------------+--------------+------+-----+---------+-------+
It’s important to notice how both tables have a customer_id column that can tie them together. This field represents the customer details (customers table) or what customer has ordered that particular pen (products table).
Your first simple JOIN
Alright, we got a lot of orders from our fellow aficionados and we want to start putting things together and ship the first few packages.
We have the following tables:
+-------------+---------------+ | customer_id | customer_name | +-------------+---------------+ | 1 | John | | 2 | Tim | | 3 | David | | 4 | Aaron | +-------------+---------------+ <br /> +--------------+-------------+ | pen_color | customer_id | +--------------+-------------+ | red | 1 | | blue | 3 | | yellow | 4 | | purple | 1 | | black | 0 | | green | 1 | | white | 3 | | brown | 0 | +--------------+-------------+
Hm, looks like someone wanted to test a few of our goodies!
So, to find out what colors John (ID #1), our longest-time customer has chosen this week, we’d run the following code:
SELECT * FROM customers JOIN products ON customers.customer_id = products.customer_id WHERE customers.customer_id = 1;
And here we go!
+-------------+---------------+--------------+-------------+ | customer_id | customer_name | pen_color | customer_id | +-------------+---------------+--------------+-------------+ | 1 | John | red | 1 | | 1 | John | purple | 1 | | 1 | John | green | 1 | +-------------+---------------+--------------+-------------+
On to LEFT JOIN
What the hell is a left join, you may be wondering.
Well, as the name very clearly states, it gives extra attention to the left table (being the first mentioned, on the left of the LEFT JOIN statement, customers in our example).
Each item of the left table will show up in the result, even thought there is no match with the right table.
Referring to the previous setup, the query
SELECT * FROM customers LEFT JOIN products ON customers.customer_id = products.customer_id;
yields
+-------------+---------------+--------------+-------------+ | customer_id | customer_name | pen_color | customer_id | +-------------+---------------+--------------+-------------+ | 1 | John | red | 1 | | 1 | John | purple | 1 | | 1 | John | green | 1 | | 2 | Tim | NULL | NULL | | 3 | David | blue | 3 | | 3 | David | white | 3 | | 4 | Aaron | yellow | 4 | +-------------+---------------+--------------+-------------+
The result shows the query preserved every customers, including those without any product ordered.
RIGHT JOIN is just the same
This time, though, the dominating table is the right one, so we expect to see all the products, regardless of their order status.
SELECT * FROM customers RIGHT JOIN products ON customers.customer_id = products.customer_id;
results in
+-------------+---------------+--------------+-------------+ | customer_id | customer_name | pen_color | customer_id | +-------------+---------------+--------------+-------------+ | 1 | John | red | 1 | | 3 | David | blue | 3 | | 4 | Aaron | yellow | 4 | | 1 | John | purple | 1 | | NULL | NULL | black | 0 | | 1 | John | green | 1 | | 3 | David | white | 3 | | NULL | NULL | brown | 0 | +-------------+---------------+--------------+-------------+
Selective RIGHT or LEFT JOIN
You can get a particular set of results specifing the WHERE clause after the JOIN statement.
SELECT * FROM customers RIGHT JOIN products ON customers.customer_id = products.customer_id WHERE products.customer_id = 0;
You’ll then select from the set you’d get from a regular RIGHT JOIN query all of the records that have not been bought yet.

+-------------+---------------+--------------+-------------+ | customer_id | customer_name | pen_color | customer_id | | NULL | NULL | black | 0 | | NULL | NULL | brown | 0 | +-------------+---------------+--------------+-------------+
FULL JOIN: get ‘em all!
Produces a of all records in the two tables, with matching records from both sides where available. If there is no match, though, the missing side will contain null.
SELECT * FROM customers FULL JOIN products ON customers.customer_id = products.customer_id;
results in…
+-------------+---------------+--------------+-------------+ | customer_id | customer_name | pen_color | customer_id | +-------------+---------------+--------------+-------------+ | 1 | John | red | 1 | | 3 | David | blue | 3 | | 4 | Aaron | yellow | 4 | | 1 | John | purple | 1 | | NULL | NULL | black | 0 | | 1 | John | green | 1 | | 3 | David | white | 3 | | NULL | NULL | brown | 0 | | 2 | Tim | NULL | NULL | +-------------+---------------+--------------+-------------+
INNER, OUTER and friends
You might have heard something about INNER and OUTER JOINS, but never understood what they really are used for.
By surprise, an INNER JOIN does a full join, thus acts in the same exact way as our first example does.
On the other side, the keyword OUTER is used when dealing with two or more tables only and may be added after the word LEFT or RIGHT or FULL in the last examples.
The reason of this all? ODBC compatibility!
CROSS JOIN
There’s also a so-called cartesian product available for SQL users.
This particular quesry joins everything to everuthing, resulting in a huge set of results that, sometimes can harm the system.
SELECT * FROM customers CROSS JOIN products;
The would result in 4 x 8 = 32 rows.
It should be pretty strightforward what would happen running this join against very large tables.
Self JOIN instead of a subquery
A Self join is basically a join on the same table.
Take the following example, and assume our customers table now looks like
+------------------+--------------+------+-----+---------+-------+ | FIELD | Type | NULL | KEY | DEFAULT | Extra | +------------------+--------------+------+-----+---------+-------+ | customer_id | int(11) | NO | PRI | NULL | | | customer_name | varchar(255) | NO | | NULL | | | customer_country | varchar(255) | NO | | NULL | | +------------------+--------------+------+-----+---------+-------+ <br /> +-------------+---------------+------------------+ | customer_id | customer_name | customer_country | +-------------+---------------+------------------+ | 1 | John | USA | | 2 | Tim | Germany | | 3 | David | Italy | | 4 | Aaron | USA | +-------------+---------------+------------------+
Now, suppose we found out John has’t paid his order and all the people from his country have learned how to trick our payment system. We want to find out who they are. But first, we must check which country John is from.
SELECT * FROM customers WHERE customer_country = (SELECT customer_country FROM customers WHERE customer_id = 1);
With the use of a subquery, the result is
+-------------+---------------+------------------+ | customer_id | customer_name | customer_country | +-------------+---------------+------------------+ | 1 | John | USA | | 4 | Aaron | USA | +-------------+---------------+------------------+
which is the same of (using aliases).
SELECT customers_1.* FROM customers AS customers_1 JOIN customers AS customers_2 ON customers_1.customer_country = customers_2.customer_country AND customers_2.customer_id = 1;
Joining and grouping
Have you ever wondered who is the customer making us always out of stock for certain colors?
A nice way to find out how many products a customer has ordered is by means of GROUP BY.
Here’s how to do it, instead of fetching all the customers and then, after each row, query again to count the products.
SELECT customers.customer_id, customers.customer_name, COUNT(products.pen_color) AS total_orders FROM customers JOIN products ON customers.customer_id = products.customer_id GROUP BY customers.customer_id;
And, to see people who haven’t ordered anything yet, use or beloved LEFT JOIN!
SELECT customers.customer_id, customers.customer_name, COUNT(products.pen_color) AS total_orders FROM customers LEFT JOIN products ON customers.customer_id = products.customer_id GROUP BY customers.customer_id;
+-------------+---------------+--------------+ | customer_id | customer_name | total_orders | +-------------+---------------+--------------+ | 1 | John | 3 | | 2 | Tim | 0 | | 3 | David | 2 | | 4 | Aaron | 1 | +-------------+---------------+--------------+
- Please subscribe to the Theme Forest RSS Feed, and follow us on Twitter.

















YES. Thanks for this article/tutorial. I’ve never understood joins. Thanks!
Excellent work in explaining. I’ve never come a cross a tutorial that explains joins as simple and clear like this article..
sounds like a http://www.khankennels.com/blog/index.php/archives/2007/04/20/getting-joins/ remake, with little examples
excellent!
Awesome!
Great stuff! If you get a chance, can you do a tutorial on indexing?
Nice tutorial and good examples.
I used this from Coding Horror back in 2007:
http://www.codinghorror.com/blog/archives/000976.html
very nice
thanks
Great articles. I am one of those that have stayed away from joins.
I did find certain parts were a little heavy without explanation such as the cross joins bit and the “which is the same of (using aliases).” bit. These lacke explanation I felt.
But on the whole it was a great article.
Would this also be applicable to sql server
Joins are something some people never touch because you do not have too ever use them. Nothing absolutely requires a join. However, if you are an efficiency expert then this is something you will benefit from knowing well.
I think you did a good job explaining them and your visuals were very helpful.
In your example, what happens when one color has multiple customer id’s or (the reverse) when customer orders overlap pens of the same color.
This must be where the order number table you describe in the introduction comes in, but I can’t seem to wrap my head around the fact that each order, while having one customer id, would have multiple pens. Would you get around this by providing multiple generic fields (product1, product2, etc) or would you want to store multiple product id’s in one field by concatenating them ($a = product1; $b = product2; $c = product 3; $delim = ‘,’; $storstrng = $a.$delim.$b.$delim.$c; //returns a string “product1,product2,product3″) and using a php explode to return the results in a readable format?
I’m curious which would be optimal. I hope there is an easier, more appropriate solution.
Good tutorial but i still dont really understand coz i to new to SQL, hope to see more this type of tutorial.
Brilliant article, thank you so much for taking the time to share this with all of us, will definitely help in the future
great stuff
Great article! I use SQL databases (almost) all the time and I wouldn’t know how to survive without JOINs.
Of course I don’t agree at all with Mack. He/she either never uses relational databases with inter-related tables or he/she writes all the logic necessary to ‘join’ table data in his/er code.
Thanks again.
Great article! I use SQL databases (almost) all the time and I wouldn’t know how to survive without JOINs.
Of course I don’t agree at all with Mack. He/she either never uses relational databases with inter-related tables or he/she writes all the logic necessary to ‘join’ table data in his/her code.
Thanks again.