SQL tutorial •

SELECTing your first data with SQL


The most basic SQL query simply grabs data from a table. This query consists of a <span class="code">SELECT</span> statement describing which columns to grab and a ‘from’ statement describing which table to grab this data from. You can only run one select statement at a time in SQL. If you enter some new data for the first time, as you will now, your starting point will typically be to get an understanding of which data your tables consist of. If you browse in the sidebar to the left in the Weld editor and click one of the tables in the e_commerce_sample folder, you’ll see an overview of the column names. This gives you an idea of what data this table consists of. You can also use the <span class="code">SELECT *</span> function to grab all the columns in a table. You can try to grab all the columns in the customer table like this:

You can try out this yourself on your own Weld account with the Weld_Sample_Data. Make sure you point to exactly the right destination in the <span class="code">FROM</span> statement. If you haven’t changed the name of the Connector or Gather Sync when you set it up, you can copy the query directly from the text above. Otherwise, you can browse for the data in the Raw Data section to the left in the Weld Editor. The <span class="code">FROM</span> statement always defines the table you’re manipulating in the query. This also means that the content in the table defined in the <span class="code">FROM</span> statement is the only data you can manipulate — you won’t be able to suddenly reference a column or some data in a query that doesn’t exist in the table defined in the <span class="code">FROM</span> statement.

Pro Tip: To enable the lineage feature in Weld, you must always encapsulate the destination in the FROM statement in '{{' and '}}'. This is not strictly required, but recommended to get the data observability features in Weld.

The <span class="code">*</span> will always return all data in a given table. Your future work of analyzing or following up on data will entail a lot of data exploration. The <span class="code">SELECT *</span> query is therefore super helpful to quickly see what data is in which tables. You can start by exploring the three tables in the sample e-commerce data set, by running the <span class="code">SELECT *</span> statement on the three tables:

Remember to delete one query before you run another <span class="code">SELECT *</span> statement. When you have an understanding of which data you have available in your table, you can be a bit more specific about what you want to query. Say you want to grab the highlighted columns in the customer table:

To get Weld to return these columns, you have to write out the column headers exactly as they are named in a query, like this:

Try to write the query in Weld on your own and execute the query by clicking the Execute button. Make sure the old query is removed before you add the new one. Remember that you can only run one <span class="code">SELECT *</span> statement at a time. You’ll see the columns you’ve defined. Notice that the column headers are separated by commas in your query, and that there is not a comma after the last column header. This is how you define exactly which and how many columns you want to be returned.

You can see all the available columns in your table in the Weld editor by going to the sidebar to the left and clicking the table you’re working on in the Raw section. If you click "insert all columns" in the editor, Weld will insert all the column headers. If you click this, make sure the column names are located in the right place in your query, and execute it, you’ll see all the data in the table you’ve defined. Try this on your own. The query should look like this:

While queries in SQL need to be written with the correct content and in a specific order to get the results you’re expecting, you don’t need to differentiate between capitalized letters. Because of this, it’s a good idea to use capitalized and uncapitalized letters to make it easier for yourself or your colleagues to read your code. For example, you can write your statements with capitalized letters — or uncapitalized letters — to make them easier to read. Furthermore, the editor ignores spaces and line breaks, so you can also use these to make the code easier to read. 

Pro Tip: You can check out the suggested query formatting in the Tips & Tricks section and the end of this guide. Or you can press the "FORMAT" button in Weld, and the editor will do the formatting for you.

Exercise 1: Use the SELECT function in Weld

In this exercise, you’ll use the SELECT function in Weld to get the id, customer_id, and shipping_address_country columns from the Orders table in the Weld_Sample_Data.

For this exercise, try to grab some data from the Orders table instead of the Customers table you worked with before. The name of this table is <span class="code">webshop_order</span> and can be found in the same location as the Customers table. So all you need to do is to change the last part of the from statement from <span class="code">webshop_customer</span> to <span class="code">.webshop_order</span> like this:

Grab the <span class="code">id</span>, <span class="code">customer_id</span>, and <span class="code">shipping_address_country</span> columns from this table. If you run your query, it should give a result like this:

In SQL there are often many means to an end, so you’ll usually be able to get the required result by writing different variations of the same query. A simple way to solve the exercise above would be to write a query like this:

* vs. Column Names for Performance

A little comment on how to make sure your queries also perform well if you start to work on larger data sets without taking too much unnecessary computing power. From a performance perspective, writing out the column headers you need to use in your data analysis is more effective than using the <span class="code">*</span> statement, even if you need all columns in a table. It’s therefore recommended to write out the column headers (or have Weld do it automatically), and only use the <span class="code">*</span> in your data exploration phase.

Next up
Introduction to filtering (WHERE)