SQL fundamentals

Most of the world’s structured data lives in relational databases, and the language for talking to them is SQL (Structured Query Language). It has endured for half a century because the ideas beneath it are simple and powerful: organize data into tables, and describe what you want rather than how to fetch it. Learning a handful of core operations unlocks the ability to ask almost any question of your data.

The relational idea

The relational model organizes data into tables. Each table holds information about one kind of thing — customers, orders, products — and is made of rows and columns. A row is a single record, one customer or one order. A column is an attribute that every row shares, like email or price, with a defined data type.

The model’s quiet genius is how it connects tables. Rather than nesting data inside data, relationships are expressed through shared key values. An orders table does not contain a whole customer; it stores a customer’s identifier, which points back to a row in the customers table. This keeps each fact in exactly one place.

Keys: identity and links

Two kinds of key make this work. A primary key is a column (or set of columns) whose value uniquely identifies each row. It must be unique and never null, giving every record a stable identity — typically an id column.

A foreign key is a column in one table that references the primary key of another. When the orders table stores a customer_id that matches a customers.id, that customer_id is a foreign key. It links the two tables and enforces referential integrity: the database can ensure you never reference a customer that does not exist.

Reading data: SELECT and WHERE

The workhorse of SQL is the SELECT statement, which retrieves data. You name the columns you want and the table to read from:

SELECT name, email FROM customers;

To narrow the results, add a WHERE clause, which filters rows by a condition. Only rows that satisfy it are returned:

SELECT name, email FROM customers WHERE country = 'UK';

You can sort the output with ORDER BY, which arranges rows by one or more columns, ascending by default or descending on request. Sorting changes presentation, not which rows come back.

Combining tables: JOIN

Because data is split across tables, you often need to recombine it. A JOIN stitches rows from two tables together based on a related column — usually a foreign key matching a primary key:

SELECT orders.id, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id;

The type of join controls what happens to unmatched rows. An inner join returns only rows that have a match in both tables. A left outer join keeps every row from the left table even when the right has no match, filling the missing columns with NULL. Choosing the right join is the difference between “customers who placed orders” and “all customers, with their orders if any.”

Summarizing data: GROUP BY

Often you want not the raw rows but a summary of them. Aggregate functions — COUNT, SUM, AVG, MIN, MAX — collapse many rows into a single value. Paired with GROUP BY, they compute one result per group instead of one for the whole table:

SELECT country, COUNT(*) AS customer_count
FROM customers
GROUP BY country;

This returns one row per country with its customer count. GROUP BY is what turns a table of individual records into the kind of aggregated report a business actually reads.

SQL is declarative

A final idea ties everything together: SQL is declarative. You describe what result you want, and the database’s query engine decides how to produce it — which indexes to use, which table to scan first, how to execute the join.

You say “give me the total sales per region,” not “loop over these rows, accumulate these sums, then sort.” The engine’s optimizer figures out the efficient plan.