As for all of the SQL for business lessons, this
lesson uses the data.iowa.gov database. The original data is available at https://data.iowa.gov/Economy/Iowa-Liquor-Sales/m3tr-qhgy, the modified database we use for this workshop is avaliable at https://github.com/data-lessons/SQL-business/tree/gh-pages/data
It is named as soda.db
, it can be imported into SQLite.
See this slide deck as a sample intro for the lesson: SQL Intro Deck
Key points:
If you’ve written up a diagram of the data analysis pipeline (raw data -> clean data -> import and analyze -> results -> visualization), it can be helpful to identify that you’re now somewhere between clean data and analysis.
You can also describe few scenario when SQL was used in business environment. For example, auditors often have to access clients’ database.
(optional) The first lesson includes a brief introduction to data design and choosing database systems. This material expands on the database design in the first section.
As you teach the lesson, it can be helpful to pause and write up the query keywords on the board. This could look like this:
FROM table
WHERE (conditional statement, applies to row values) (AND/OR)
ORDER BY column/FUNCTION(column) (ASC/DESC)
* After 02-sql-aggregation
SELECT column FUNCTION(column) AGGREGATE_FUNCTION(column) FROM table
WHERE (conditional statement, applies to row values) (AND/OR) (IS (NOT) NULL) GROUP BY column HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC)
* After 03-sql-joins-aliases
SELECT column FUNCTION(column) AGGREGATE_FUNCTION(column) FROM table JOIN table ON table.col = table.col WHERE (conditional statement, applies to row values) (AND/OR) (IS (NOT) NULL) GROUP BY column HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC)
As a bonus, if you can leave this on the board, it translates nicely into
the `dplyr` portion of the `R` lesson, i.e.:
SQL: dplyr:
SELECT column select(col) FUNCTION(column) mutate(col = fcn(col)) AGGREGATE_FUNCTION(column) summarize(col = fcn(col)) FROM table JOIN table ON table.col = table.col WHERE (conditional statement, applies to row values) filter(condition) (AND/OR) (IS (NOT) NULL) is.na() GROUP BY column group_by(col) HAVING (conditional statement, applies to group) ORDER BY column/FUNCTION(column) (ASC/DESC) arrange() ~~~
If you want to try something more active (esp. if you’re teaching SQL in the afternoon!), this is a an interactive activity to try.
SELECT name, name FROM students WHERE height > 66