Chapter 1 - SELECT Something FROM Something
The only syntax you will ever need to remember.
Starting from the start
If you want to learn about Data Analytics on Dune, everyone on the discord will tell you to go learn SQL first.
And they are 99% right, but 99% of people opening an SQL tutorial will likely lose interest in even proceeding any further after after doing so.
As a matter of fact, it took me 5 months between creating my Dune account and publishing my first dashboard (the dashboard itself was completed in 3 hours).
I will not take the easy way out by dumping my reader into a tutorial.
Through this substack, I will aspire to inspire you to learn SQL while playing with Dune Analytics.
The spreadsheet analogy
A spreadsheet is basically rows and columns, and this is all you need to visualise in your mind’s eye for now to understand data tables.
And your first task as a data analyst is to instruct the Dune to show you data that you want from that figurative spreadsheet.
What you need to get started
For the first exercise, I prepared a short data table on the Ethereum database named:
Dune has segregated databases for each blockchain, and I am using Ethereum as it lets me create user generated tables for the purposes of getting started.
Note: You can see how this table was made here. But I would advise against opening it at this point in time, it’s unnecessary headaches for you for now.
This table contains 4 columns named: minute, symbol, price, and contract_address; and contains 86,400 rows of data.
Don’t be impressed by the number of rows, you will almost never have to go beyond the first first 20 rows as a data analyst, less you are here to split hairs and find bugs.
Think of the above table table as spreadsheet.
Your first query
Now let’s make our first query.
Our first query will be a query to fetch all the rows and columns from the short data table I prepared earlier.
And the code you need to paste into the black box is as per below, and hit the ‘Run’ button.
Feels magical, right?
What did I just do?
SELECT means show
* means everything
FROM means from
You basically sent a message to the program saying: “hey, show me everything from in that data table.”
Easy peasy lemon squeezy.
Let’s try to tweak our instruction a bit
As you saw previously we have 4 columns in the sample data table, what if you only need 3 of them (let’s ditch the contract address), well we can actually do that by only querying the columns that we need, and to do that we simply ask for what we need in the code:
SELECT minute, symbol, price
And this will return us:
Note that you can also switch the order of the columns as such (I inverted price and symbol):
SELECT minute, price, symbol
And this will return us:
Switching columns is not really a big trick, but it can be useful to present a table in a more readable format.
Personally I like to see the timestamp on my left, then the token symbol and the price last, because it feels more natural to my eyes.
What did I actually query?
Now that you have successfully ran your first query, let’s have a looked at what was actually fished out.
And to do that we can have a look at the sample data table, and how it was originated.
What I did to create the sample data table was to query the
prices.usd data table (which is provided by dune) that contains all the prices of most of the cryptos in dollar terms taken at regular intervals of one minute.
For the purposes of this substack I truncated the data for only the month of April and only allowed for WBTC and WETH to be recorded in the sample data table.
Never mind the new commands that you just saw, we will go through them properly at a later stage, for now you just needed a brief explanatory on what those 86,400 rows of data contained.
What can I do with this query?
Next to the query result is a button called ‘New visualisation’, select ‘line chart’ from the drop down menu and click ‘Add Visualisation’, and you will get something like the chart below by default:
However if you look back at the query results, you will see that we actually have the data points for 2 sets of tokens, WBTC and WETH.
This means that for every minute in the data table, there are 2 rows:
Check the table out when you sort it by minute.
This enables us to do multi dimensional charting!
Scrolling down a bit further we can see that we can set the x-axis to be minute, and the y-axis to price, with grouping by symbol.
This creates 2 distinct line that uses x and y, and the symbol as a segregation factor.
Nice right? And all that with just a few lines of codes and a few clicks!
A data table is basically a big spreadsheet.
FROMsomething is the only syntax you need to remember.
That was a piece of cake, I want moar!!!