SQL is the language analysts use to explore, filter, and summarize data. This guide walks you through the commands you’ll use most often, showing not only how to write them but also why they work the way they do. Examples are included so you can see each command in action.
Core SQL Commands
The foundation of querying data is the SELECT
statement. For instance, if you want to retrieve the names and salaries of employees, you write:
SELECT name, salary FROM employees;
This tells the database to return only the name
and salary
columns from the employees
table. While it’s tempting to use SELECT *
to return all columns, such as in:
SELECT * FROM employees;
this approach is convenient for exploration but inefficient in production because it may return large amounts of unnecessary data and make your results harder to interpret. By selecting only the columns you need, you keep queries faster and more readable.
Filtering data is essential for analysis. The WHERE
clause lets you specify conditions. For example, to find all employees in the Sales department:
SELECT * FROM employees WHERE department = 'Sales';
You can filter numerical ranges using BETWEEN
, as in:
SELECT * FROM employees WHERE salary BETWEEN 50000 AND 100000;
or search for multiple values with IN
:
SELECT * FROM employees WHERE region IN ('EU', 'US');
Text patterns can be matched using LIKE
, for example:
SELECT * FROM employees WHERE email LIKE '%@company.com';
The %
wildcard matches any sequence of characters, and _
matches a single character. You can combine conditions with AND
, OR
, and NOT
to create more complex filters.
Sorting and Limiting Results
Often you want your results in a specific order or only a subset. ORDER BY
sorts the data, either ascending (ASC
) or descending (DESC
). For instance, to get the five highest-paid employees:
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
The LIMIT
clause ensures that only the top five rows are returned. In SQL Server, you would use TOP 5
or OFFSET ... FETCH
instead. Sorting and limiting make your reports focused and easier to interpret.
Aggregation and Grouping
When summarizing data, aggregate functions are essential. COUNT
, SUM
, AVG
, MIN
, and MAX
allow you to compute totals, averages, minimums, or maximums. To see the number of employees and average salary by department:
SELECT department, COUNT(*) AS total, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
The GROUP BY
clause collects rows by department. HAVING
filters groups after the aggregation is calculated, which is different from WHERE
, which filters rows before aggregation.
Joins – Combining Tables
Real-world data is spread across multiple tables, so joins are essential. An INNER JOIN
returns only rows that exist in both tables:
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id;
This retrieves each employee’s name along with their department name, but only for employees whose department exists in the departments
table. A LEFT JOIN
keeps all rows from the left table and fills unmatched rows from the right table with NULL
:
SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id;
This ensures that all employees are listed, even if their department is missing or has been removed. RIGHT
and FULL
joins are variations that adjust which table’s rows are preserved.
Subqueries and Window Functions
Sometimes you need a value computed from one query to filter another. Subqueries allow that. For example, to find employees earning more than the average salary:
SELECT name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
The inner query computes the average salary, and the outer query returns employees whose salaries exceed it. Subqueries can return single values, lists of values, or entire tables.
Window functions let you calculate rankings, running totals, and moving averages without collapsing the rows. For instance:
SELECT name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
SUM(sales) OVER (PARTITION BY region ORDER BY date) AS running_sales
FROM employees;
Here, RANK()
gives each employee a rank within their department by salary, while SUM(...) OVER
computes a running total of sales within each region over time. Window functions are powerful because they combine analysis with row-level detail.
CRUD Operations and Utilities
Adding, updating, and removing data are critical skills. To insert a new employee:
INSERT INTO employees (name, salary) VALUES ('Jane', 60000);
To update a salary for a specific employee:
UPDATE employees SET salary = 65000 WHERE id = 5;
To remove an employee:
DELETE FROM employees WHERE id = 5;
Always include a WHERE
clause with updates and deletes to avoid affecting all rows accidentally.
Other useful utilities include conditional logic with CASE
:
SELECT CASE WHEN salary > 100000 THEN 'High' ELSE 'Normal' END AS band
FROM employees;
Handling NULL
values can be done with COALESCE
:
SELECT COALESCE(manager, 'None') AS manager
FROM employees;
Casting converts data types:
SELECT CAST(order_date AS DATE) FROM orders;
Practical Tips for Analysts
Using SQL effectively requires attention to detail. Avoid SELECT *
in production queries because it can slow performance. Use EXPLAIN
or EXPLAIN ANALYZE
to check how queries execute. Index columns that are frequently used in filters or joins, but always measure the impact. Ensure that joined columns are of the same type to avoid implicit conversions. Finally, when writing updates or deletes, test the query first as a SELECT
to make sure only the intended rows will be affected.
Be aware that SQL dialects differ. MySQL and PostgreSQL use LIMIT
for row restriction, whereas SQL Server uses TOP
or OFFSET ... FETCH
. Full outer joins are not natively supported in MySQL, so you may need workarounds.