Integration


SQL - Order of execution

2020-02-18

: Vivian Herbst


Knowing the order during which an SQL query is executed will help us a good deal in optimizing our queries. This will often be very true with massive and sophisticated queries where knowing the order of execution can save us from unwanted results, and facilitate us by producing queries that execute quicker.


In SQL, the first clause that is processed is the FROM clause, while the SELECT clause is processed at a much later point in time. The logical processing of SQL queries is as follow: 1. FROM clause 2. OUTER clause 3. WHERE clause 4. GROUP BY clause 5. HAVING clause 6. SELECT clause 7. DISTINCT keyword 8. ORDER BY clause 9. TOP clause FROM clause SQL FROM clause is used to select and join tables and is the first part of a query that gets evaluated. What this means is that in a query with joins, the join will be the very first thing to happen. WHERE clause After the FROM clause, the WHERE clause will be the second to be evaluated. The WHERE clause is used to filter the data that was set in place by the FROM and JOIN clauses. GROUP BY clause Now that we finished filtering the data using the WHERE clause, we can group the data into different chunks/buckets by making use of the GROUP BY clause. We use the GROUP BY clause to aggregate the data according to one or more columns. HAVING clause Now that we finished grouping the data by making use of the GROUP BY clause, we can use the HAVING clause to filter out some chunks/buckets. Because we have already grouped the date, we can no longer apply a condition to filter out a single row inside a chunk/bucket, but we can apply a condition on an entire chunk/bucket to filter it out. SELECT clause Now that we are done grouping the data set and discarding rows that we don't want; we can SELECT the data we want. You can use aggregations, column names and subqueries inside the SELECT clause. DISTINCT keyword The DISTINCT operation takes place after the SELECT; it is a bit confusing because the keyword takes its place before the column names in the query. The DISTINCT keyword is used to discard rows with duplicate values from the rows that remained after the aggregation and filtering. ORDER BY clause Sorting takes place once the entire result set is ready. Once we have that, then we can sort the result set using columns, aggregation functions and selected aliases even if they are not part of the data that were selected. The data can be sorted using a descending (DESC) order or an ascending (ASC) order. TOP clause The TOP clause is part of the LIMIT and OFFSET clause. This clause is used to limit the amount of data to return.