Chapter 2 - WHERE AND/OR what?
I know what I want, and I know what I don't want.
First of all table manners
If you have any modicum of attention to detail, you must have been questioning yourself about capitalised letters, commas, and the use of single and double quotes.
If not, get ready for a crash course on coding etiquette!
Why you should capitalise some code
The dune compiler will read all 4 lines the same, just as all food that you eat will end up into a big warm mushy soup in your tummy.
SELECT * FROM table
Select * From table
select * from table
SeLeCt * frOM table
However in the spirit of orderliness, I tend to capitalise all machine targeted instructions, so that future readers of my code know which part is standard instruction and which is a ‘user generated data’.
Food is so much nicer to appreciate when well presented, as is code.
Why you should use indentation
As your queries gets more and more complicated (in a few weeks time your queries will likely hit the 200 lines on average) it gets harder and harder to read, even for yourself!
By taking an early habit of using indentation, you will make both debugging and building easier for yourself. Here are 2 identical queries:
With an intention to make you squint:
Both give identical results, but the first one is easier to understand as we can see clearly which data we are pulling, and all the conditions attached to the query, line by line.
Above is a link to Simon Holywell’s guide to sql styling, this is the “Mein Kampf” of SQL styling Nazis, OCD people will orgasm when they read that.
I am more of a libertarian in the sense that I believe styling is a personal choice, but it still matters as styling is a reflection of your mind, and orderly minds have orderly styles.
Now back onto actual query learning!!!
WHERE is your bread and butter as an apprentice wizard, it is you most used logical operator.
Let’s dive into
WHERE with a practical example with a simple query on our sample data table from Chapter 1:
SELECT minute, price, symbol, contract_address FROM dune_user_generated."april_2022_dex_token_prices_sample_1" WHERE symbol = 'WETH'
And by running the above we get the results below:
The first thing you will notice is that this query returns 43,200 rows, as opposed to the initial query from Chapter 1, which returned 86,400 rows.
This is because we told the machine to only give us the rows of data where the data in the symbol column exactly matched the word ‘WETH’. So all the rows where the symbol column was not ‘WETH’ got removed from the query results.
WHERE is a useful but capricious tool, if you want to have multiple conditions (e.g looking up the rows where symbol is ‘WETH’ and minute is after a certain date), you can’t use WHERE multiple times, instead you must use
This won’t work:
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol = 'WETH' WHERE minute > '2022-04-19 23:59'
But this will work:
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol = 'WETH' AND minute > '2022-04-19 23:59'
Multiple AND work as well:
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol = 'WETH' AND minute > '2022-04-19 23:59' AND minute < '2022-04-26 23:59'
WHERE OR is another way
WHERE can be used which is slightly different from the
WHERE AND syntax.
WHERE OR is used when you have an either or situation for your filter, e.g: I want to only see the rows where the symbol is either ‘WETH’ or ‘USD’ (we know that there is no ‘USD’ in our sample data table, but let’s ignore that for now).
This won’t work, there is no such row where the symbol is both ‘WETH’ and ‘USD’:
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol = 'WETH' AND symbol = 'USD'
But this will work:
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol = 'WETH' OR symbol = 'USD'
While the use cases of WHERE OR might not be obvious for now, it is a handy tool to keep in your toolbox for when the need arises.
Mathematical operators are an important part in queries, because not everything is boolean, i.e yes or no, a lot of data come across ranges, or other forms which is why logical operators are important.
Equal is the easiest of all, just as is its antithesis, not equal.
As you start your journey, you might think as > and >= being the same.
>= > <= <
But here is where the distinction becomes important:
You want to measure the price of a token as from the first increment of time of a specific day which is: 2022-04-19 00:01
If you use >, then the row with minute 2022-04-19 00:01 will be omitted from the query.
But if you use >=, then the row with minute 2022-04-19 00:01 will be included in the query.
More weird but useful operators
PostgreSQL is filled with operators that may seem alien for the first time users. We will encounter more of those as we proceed down the rabbit hole.
In the meanwhile here are a few you should familiarise yourself with that will prove useful in your early days as a wizard in training.
IS NULL/IS NOT NULL
Sometimes cells in rows or columns can be empty.
Imagine the whole internet shut down for 1 specific minute, so for that specific minute, there’d be no price for either ‘WBTC’ or ‘WETH’. (we call that value null because it is not zero).
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE price IS NULL
IS NULL operator, we can pinpoint the exact time when the price was not reported. This is equally useful when trying to troubleshoot your queries.
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE price IS NOT NULL
On the other hand, the
IS NOT NULL operator becomes very useful if we still want to visualise data, but want to omit the null values from our query.
IN / NOT IN
IN/NOT IN overlaps a bit with with = and != mathematical operators but they have their uses.
SELECT minute, price, symbol, contract_address FROM dune_user_generated.april_2022_dex_token_prices_sample_1 WHERE symbol IN ('WETH','USD') --achieves OR filter AND minute > '2022-04-19 23:59' AND minute < '2022-04-26 23:59'
In the above case we managed to achieve both the result of the
WHERE OR and the
WHERE AND, which otherwise would not have been achievable in a single query because you cannot do a
WHERE AND OR query.
Logical operators add a whole layer of complexity to queries but also make data analysis more malleable.
Mathematical operators add a second dimensionality to logical operators, that give the data analyst even more legroom to be creative.
Let’s go make our first dashboard!