LIMITing the amount of data with SQL
SQL can handle very large amounts of data. But if you’re working with very large data sets, this requires a lot of computing power, which can take up resources both in terms of time and money.
A good practice:
When developing queries or building models, it’s often smart to work with a smaller sample of the data first. Once everything is working as expected, you can then apply your final model or query to the full dataset.
You can use the LIMIT
function for this.
For example, the order line table contains 1000 rows. If you wanted to develop a model for this table by only experimenting with the first 5 rows, it could be done like this:
1select
2 *
3from
4 {{raw.e_commerce_sample.webshop_order_line}}
5limit
6 5
This returns this table:

The order in which you place your limit
statement in your query is important. For example, if you want to add an ‘order by’ statement to the query above and order the query as:
- SELECT
- FROM
- LIMIT
- ORDER BY
The result would be to first select the 5 top rows in the raw table, and then order those 5 rows. If you change the order of the statements to:
- SELECT
- FROM
- ORDER BY
- LIMIT
The resulting table would instead be the 5 top rows after you’ve ordered them. You can try this on your own.