Understanding JOINs in SQL

Tobi Williams Babatunde

By Tobi Williams Babatunde

Dec 14

One question I get asked a lot in SQL is, "How do you know which join type to use?", or "What's the difference between a left and a right join?".

If you have ever asked these questions before, then this tutorial is just right for you. I will explore the most common types of JOINs in SQL - INNER, LEFT/RIGHT, and SELF - and provide a framework for knowing which join to employ in your various business data problem-solving scenarios.

What are JOINs in SQL?

A JOIN is a method that allows you to combine data from two or more tables based on a common column or set of columns. For a join to be correctly executed, columns with similar values must exist in the tables to be joined.

In SQL, there are several types of JOINs, including INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, SELF JOIN, and CROSS JOIN. We will together explore the first 5 types and leave the final one for another tutorial.

I will be using PostgreSQL flavour of SQL. PostgreSQL is one of the most advanced databases in the world with rich capabilities and advanced analytics functions. The version 16 of the PostgreSQL server is installed for this tutorial.

I will also use a universal database administration tool - DBeaver - to interact with the data on the PostgreSQL server.

To ensure that you follow along, I will walk you through the installation steps of my configuration. If you already have a PostgreSQL installation or prefer to use a different relational database management system (RDBMS), feel free to skip the next section.

Installing PostgreSQL and DBeaver

Install PostgreSQL

From the PostgreSQL downloads page, select your operating system and proceed to download the latest version of the server. I use a Windows operating system so I followed the directions for download.

On the next page, locate the link to download the installer and follow the ensuing prompts. As mentioned, I downloaded the latest version which at the time of writing is version 16.

Open the installer and continue with the installation on your computer. Make sure to accept all the defaults.

Ensure to set a master password for your installation. Take note of this password and you will need it to sign into the master view of your server occasionally.

Continue to accept the default configurations including allowing PostgreSQL server to listen on port 5432. This process might take some time. Ensure you have gone through the installation process before having a go at the actual tutorial.

After the PostgreSQL installation, you will have both the server and the client administration tool called PgAdmin4. PgAdmin is the most popular and feature-rich Open Source administration and development platform for PostgreSQL, however, we will not be using that.

We will use a universal administrator tool called DBeaver which allows you to connect to multiple different Relational Database Management Systems including MS SQL Server, PostgreSQL, MySQL, and a host of others.

If you are very happy with using PgAdmin 4, feel free to skip the next section.

Install DBeaver Database Management Tool

As with the previous installation, we will go for the Windows community installer download from the DBeaver download page.

I will accept all the default configurations and complete the installation process.

After installing DBeaver, you can skip the installation of Stack Builder, an additional tool for installing add-ons and drivers.

Next, open DBeaver and begin by creating a new connection to the PostgreSQL server.

Select the Driver properties tab to install the latest driver for your PostgreSQL server. DBeaver provides a full suite of management functions to several databases through the use of drivers and connectors.

Make sure to download the required driver to ensure that you can make a connection to the PostgreSQL server you just installed.

After downloading the required driver, go back to the Main tab and use the default connection options.

Host: localhost

Database: postgres

Port: 5432

Username: postgres

Password: your_master_postgres_password

Once done, you can test the connection with the Test Connection option. If all is fine and good, proceed to finish with the setup. With that done, you will now have access to the local PostgreSQL server using DBeaver to query and manipulate its databases.

Installing the Practice Database in PostgreSQL

The database for this tutorial was culled from the first case study from the Danny Ma SQL challenge. I have modified it a bit for the current context of understanding joins in SQL.

If you have your DBeaver installation for PostgreSQL or PgAdmin 4, go ahead and copy the database code below into your SQL editor and run it. To create a new SQL script, go to the SQL Editor command menu > New SQL script. Or use the keyboard combination Ctrl + ].

CREATE SCHEMA resa_diner;
SET search_path = resa_diner;

CREATE TABLE sales (
  "customer_id" VARCHAR(1),
  "order_date" DATE,
  "product_id" INTEGER
);

INSERT INTO sales
  ("customer_id", "order_date", "product_id")
VALUES
  ('A', '2021-01-01', '1'),
  ('A', '2021-01-01', '2'),
  ('A', '2021-01-07', '2'),
  ('A', '2021-01-10', '3'),
  ('A', '2021-01-11', '3'),
  ('A', '2021-01-11', '3'),
  ('B', '2021-01-01', '2'),
  ('B', '2021-01-02', '2'),
  ('B', '2021-01-04', '1'),
  ('B', '2021-01-11', '1'),
  ('B', '2021-01-16', '3'),
  ('B', '2021-02-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-07', '3');
 

CREATE TABLE menu (
  "product_id" INTEGER,
  "product_name" VARCHAR(5),
  "price" INTEGER
);

INSERT INTO menu
  ("product_id", "product_name", "price")
VALUES
  ('1', 'sushi', '10'),
  ('2', 'curry', '15'),
  ('3', 'ramen', '12');
  

CREATE TABLE members (
  "customer_id" VARCHAR(1),
  "join_date" DATE
);

INSERT INTO members
  ("customer_id", "join_date")
VALUES
  ('A', '2021-01-07'),
  ('B', '2021-01-09');

 
CREATE TABLE menu_chefs (
  "start_date" DATE,
  "product_id" INTEGER,
  "chef_name" VARCHAR(20)
);

INSERT INTO menu_chefs
  ("start_date", "product_id", "chef_name")
VALUES
  ('2020-11-03', '1', 'Excella Ubah'),
  ('2021-01-01', '1', 'Patty Hironaka'),
  ('2020-10-07', '2', 'Prakash Silver'),
  ('2020-10-15', '3', 'Risi Onibread'),
  ('2020-10-27', '3', 'Nkiruka Taylor'),
  ('2021-01-04', '3', 'Broda Shola');

Once you're done running the script, you should have a new database called resa_diner created along with 4 tables namely sales, menu, members, and menu_chefs. If you don't see these tables immediately, refresh the schema within the PostgreSQL server object in the Database Navigator section of DBeaver.

Study the relationships between the tables using the Entity Relationship Diagram (ERD) below and feel free to also run some simple SELECT * FROM table LIMIT 10 statements to get an understanding of the data. Remember that you may have to qualify the table with its schema e.g. resa_diner.sales.

ERD was designed using the DB Diagram tool.

Understanding the LEFT JOIN

A LEFT JOIN returns all the rows from the left table and matching rows from the right table. If a row from the left table does not have a matching row in the right table, the result set will include NULL values for the columns from the right table.

Using a LEFT JOIN ensures that all the rows from the LEFT table i.e. the table in the FROM clause, are returned along with matching rows in the right table. The trick lies in first identifying which table will be in the FROM clause and which will be the table on the right.

Let's begin by looking at some relatable questions from our database - resa_diner. But first, what is Resa Diner and what sort of data does it collect?

---

Resa Diner is a new restaurant serving some of our favourite Japanese meals - sushi, ramen, and curry. Before now, records of sales were kept manually, however since 2021 with the new digitalization effort, we have an improved data system for recording transactions. The data for the first few months since digitalization are now available and we want to analyse the sales performance and understand our customers' purchase patterns, and their favourite meals. Ultimately, we would like deeper insights on how to improve our loyalty scheme.

---

Q. What is the total amount each customer spent at the restaurant?

SELECT customer_id, SUM(price) AS total_amount
FROM sales
LEFT JOIN menu USING (product_id)
GROUP BY customer_id

To calculate the amount spent by each customer, we need a record of all transactions - the sales table. This is also known as a FACT table. It is in this table that we will retrieve the price of each product from the menu table using the product_id as the common column.

Essentially, our LEFT table will be the sales table; it will appear in the FROM clause. We want all the records (rows) of sales and the corresponding price of each product from the menu table. For this operation, we make use of a LEFT JOIN.

In this scenario, an INNER JOIN will also work simply because all the products sold and recorded in the sales table have a corresponding entry in the menu table. But let's imagine for instance that you had a different menu table, call it the old_menu.

Run the code below in a new SQL script to populate a new table.

CREATE TABLE old_menu (
  "product_id" INTEGER,
  "product_name" VARCHAR(6),
  "price" INTEGER
);

INSERT INTO old_menu
  ("product_id", "product_name", "price")
VALUES
  ('1', 'sushi', '10'),
  ('6', 'egusi', '7'),
  ('7', 'suya', '12'),
  ('8', 'jollof', '15');

The the old menu, we have some items that we no longer sell such as egusi, suya, and jollof.

If we modify the first code to use the old_menu table with an INNER join along with the sales table and view the result, it tells a different story.

SELECT customer_id, product_id, price
FROM sales
INNER JOIN old_menu USING (product_id)

Result:

Consequently, modifying the last piece of code to include a LEFT JOIN will return all the rows from the LEFT table in the FROM clause along with any corresponding, matching rows from the right table.

SELECT customer_id, product_id, price
FROM sales
LEFT JOIN old_menu USING (product_id)

Result:

Here, all the rows from the sales table are returned along with matching rows from the old_menu table. Unmatched rows in the price column from the old_menu table return a value of NULL.

Understanding the INNER JOIN

An INNER JOIN returns only the rows from both tables that match the join condition. This means that if a row from one table does not have a matching row in the other table, it will not be included in the result set. In mathematics, an inner join is similar to an intersection of tables.

When making use of an INNER JOIN, it's important to identify whether you want to return ONLY matching rows from the join condition. The result returned is usually a subset of the main tables. INNER JOIN commands are sometimes abbreviated as JOIN, however, it's generally preferred written as INNER JOIN.

Q. Which items were purchased by the customers after they became members?

SELECT s.customer_id, product_id
FROM sales AS s
INNER JOIN members AS m 
	ON s.customer_id = m.customer_id 
	AND s.order_date > m.join_date

We start by breaking down the question into necessary tables. To calculate the first item that was purchased after a customer became a member, we will need both the sales and member tables. The member table contains only two customers who have joined the loyalty program. Immediately, you can make the connection that the type of join needed will be an INNER JOIN to return only matching rows of member sales.

In this particular scenario, we join on two conditions:

  • the customer_id to match similar customers
  • the order date which should be right after the join date

Result:

If we change the join type for this question to a LEFT JOIN, the results come out very erroneous.

SELECT s.customer_id, product_id
FROM sales AS s
LEFT JOIN members AS m 
	ON s.customer_id = m.customer_id 
	AND s.order_date > m.join_date

Result:

The modified code returns all the records from the LEFT table - sales - along with matching records from the join condition. It's erroneous because:

  • It returns records of customers who are not members
  • It disregards the condition on join_date vs order_date

What about changing the join condition to a RIGHT JOIN? What do you notice?

Yes, that's right. It gives the same output as the one with the INNER JOIN. A RIGHT JOIN is similar to a LEFT JOIN, but it returns all the rows from the right table and matching rows from the left table. If a row from the right table does not have a matching row in the left table, the result set will include NULL values for the columns from the left table.

In this case, the members table is on the RIGHT which implies that all rows from this table will be returned along with rows matching the join condition from the table in the FROM clause.

Mastering SQL joins takes some time especially as you're starting in your career. Even after many years, the skills to attain mastery in SQL come from varied exposure and experiences working with datasets from diverse industries.

Understanding the SELF JOIN

This is a unique type of join in which a table is intersected with itself to produce a result. It can be very useful to know when to join a table onto itself and many business questions require this type of approach.

Q. What items were purchased by each customer on their first purchase date?

SELECT s.customer_id,
order_date,
product_id
FROM sales AS s
INNER JOIN (
	SELECT customer_id, MIN(order_date) AS first_date
	FROM sales
	GROUP BY customer_id
) AS f
ON s.customer_id = f.customer_id
AND s.order_date = f.first_date

Result:

In this SELF JOIN example, the first executed blocks are the tables in the join command. The inner table finds the first date of purchase grouped by each customer. This table was then joined onto the sales table to extract matching sales records for each customer using their first date of purchase.

This section of the code is what I'm referring to as the inner table which has two columns - customer_id and first_date. The table has been aliased as f as demonstrated in the join conditions.

SELECT customer_id, MIN(order_date) AS first_date
FROM sales
GROUP BY customer_id

To continue building your skills, why don't you practice the questions below?

More SQL practice questions

  • How many days has each customer visited the restaurant?
  • What was the first item from the menu purchased by each customer?
  • What is the most purchased item on the menu and how many times was it purchased by all customers?
  • Generate a list of orders for ramen after its most recent chef was hired.
  • Which item was the most popular for each customer?
  • Which item was purchased first by the customer after they became a member?
  • Which item was purchased just before the customer became a member
  • What is the total amount spent on each item for each member before they became a member?
  • If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?
  • In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customers A and B have at the end of January?

Ready to Take Action?

1. Dive into our online data bootcamp! Learn at your own pace with our expert-led virtual programs designed to fit into your schedule. Become a qualified data expert in just 4 months and unlock your potential and land your dream career.

2. Learn more about our Data BootCamp programs by reading the testimonials of our graduates. Click HERE to access the testimonials.

3. You can also sign up for 1:1 personal tutoring with an expert instructor or request other solutions that we provide, which include data research, tech skill training, tech jobs in Africah, data products, and application development. Click HERE to begin.

4. Get in touch with us for further assistance from our team OR via email at servus@resagratia.com or our Whatsapp Number via +2349042231545.

Table of contents
  1. What are JOINs in SQL?
  2. Installing PostgreSQL and DBeaver
    1. Install PostgreSQL
    2. Install DBeaver Database Management Tool
  3. Installing the Practice Database in PostgreSQL
  4. Understanding the LEFT JOIN
    1. Q. What is the total amount each customer spent at the restaurant?
  5. Understanding the INNER JOIN
    1. Q. Which items were purchased by the customers after they became members?
  6. Understanding the SELF JOIN
    1. Q. What items were purchased by each customer on their first purchase date?
  7. More SQL practice questions
  8. Ready to Take Action?
resa logo

Empowering individuals and businesses with the tools to harness data, drive innovation, and achieve excellence in a digital world.

2025Resagratia (a brand of Resa Data Solutions Ltd). All Rights Reserved.