The last common SQL command used is the one that enables you to view the data in the database: SELECT. This action is by far the most common action performed in SQL. While data entry and modifications do happen on occasion, most databases spend the vast majority of their lives serving up data for reading. The general form of the SELECT statement is as follows:
SELECT column1, column2, ..., columnN FROM table1, table2, ..., tableN [WHERE clause]
This syntax is certainly the most common way in which you will retrieve data from any SQL database. Of course, there are variations for performing complex and powerful queries, especially under MySQL. We cover the full range of the SELECT syntax in Chapter 15, "SQL Reference".
The first part of a SELECT statement enumerates the columns you wish to retrieve. You may specify a "*" to say that you want to select all columns. The FROM clause specifies which tables those columns come from. The WHERE clause identifies the specific rows to be used and enables you to specify how to join two tables.
Joins put the "relational" in relational databases. Specifically, a join enables you to match a row from one table up with a row in another table. The basic form of a join is what you may hear sometimes described as an inner join . Joining tables is a matter of specifying equality in columns from two tables:
SELECT book.title, author.name FROM author, book WHERE book.author = author.id
Consider a database where the book table looks like Table 6-1.
ID |
Title |
Author |
Pages |
---|---|---|---|
1 |
The Green Mile |
4 |
894 |
2 |
Guards, Guards! |
2 |
302 |
3 |
Imzadi |
3 |
354 |
4 |
Gold |
1 |
405 |
5 |
Howling Mad |
3 |
294 |
And the author table looks like Table 6-2.
An inner join creates a table by combining the fields of both tables for rows that satisfy the query in both tables. In our example, the query specifies that the author field of the book table must be identical to the id field of the author table. The query's result would thus look like Table 6-3.
ID |
Name |
Citizen |
---|---|---|
1 |
Isaac Asimov |
US |
2 |
Terry Pratchet |
UK |
3 |
Peter David |
US |
4 |
Stephen King |
US |
5 |
Neil Gaiman |
UK |
An inner join creates a table by combining the fields of both tables for rows that satisfy the query in both tables. In our example, the query specifies that the author field of the book table must be identical to the id field of the author table. The query's result would thus look like Table 6-3.
Neil Gaiman is nowhere to be found in these results. He is left out because there is no value for his author.id value found in the book.author table. An inner join only contains those rows that exactly match the query. We will discuss the concept of an outer join later in the chapter for situations where we would be interested in the fact that we have an author in the database who does not have a book in the database.
Book Title |
Author Name |
---|---|
The Green Mile |
Stephen King |
Guards, Guards! |
Terry Pratchet |
Imzadi |
Peter David |
Gold |
Isaac Asimov |
Howling Mad |
Peter David |
Neil Gaiman is nowhere to be found in these results. He is left out because there is no value for his author.id value found in the book.author table. An inner join only contains those rows that exactly match the query. We will discuss the concept of an outer join later in the chapter for situations where we would be interested in the fact that we have an author in the database who does not have a book in the database.
When you use column names that are fully qualified with their table and column name, the names can grow to be quite unwieldy. In addition, when referencing SQL functions, which will be discussed later in the chapter, you will likely find it cumbersome to refer to the same function more than once within a statement. The aliased name, usually shorter and more descriptive, can be used anywhere in the same SQL statement in place of the longer name. For example:
# A column alias SELECT long_field_names_are_annoying AS myfield FROM table_name WHERE myfield = 'Joe' # A table alias under MySQL SELECT people.names, tests.score FROM tests, really_long_people_table_name AS people # A table alias under mSQL SELECT people.names, tests.score FROM tests, really_long_people_table_name=people
While mSQL fully supports table aliasing, it does not support column aliasing.
The results you get back from a select are, by default, indeterminate in the order they will appear. Fortunately, SQL provides some tools for imposing order on this seemingly random list. The first tool -- available in both MySQL and mSQL -- is ordering. You can tell a database that it should order any results you see by a certain column. For example, if you specify that a query should order the results by last_name, then the results will appear alphabetized according to the last_name value. Ordering comes in the form of the ORDER BY clause:
SELECT last_name, first_name, age FROM people ORDER BY last_name, first_name
In this situation, we are ordering by two columns. You can order by any number of columns, but the columns must be named in the SELECT clause. If we had failed to select the last_name above, we could not have ordered by the last_name field.
Grouping is an ANSI SQL tool that MySQL implements but mSQL does not. Because mSQL does not have any concept of aggregate functions, grouping simply does not make sense in mSQL. As its name implies, grouping lets you group rows with a similar value into a single row in order to operate on them together. You usually do this to perform aggregate functions on the results. We will go into functions a little later in the chapter.
Consider the following:
mysql> SELECT name, rank, salary FROM people\g
+--------------+----------+--------+ | name | rank | salary | +--------------+----------+--------+ | Jack Smith | Private | 23000 | | Jane Walker | General | 125000 | | June Sanders | Private | 22000 | | John Barker | Sargeant | 45000 | | Jim Castle | Sargeant | 38000 | +--------------+----------+--------+ 5 rows in set (0.01 sec)
If you group the results by rank, the output changes:
mysql> SELECT rank FROM people GROUP BY rank\g
+----------+ | rank | +----------+ | General | | Private | | Sargeant | +----------+ 3 rows in set (0.01 sec)
Now that you have the output grouped, you can finally find out the average salary for each rank. Again, we will discuss more on the functions you see in this example later in the chapter.
mysql> SELECT rank, AVG(salary) FROM people GROUP BY rank\g
+----------+-------------+ | rank | AVG(salary) | +----------+-------------+ | General | 125000.0000 | | Private | 22500.0000 | | Sargeant | 41500.0000 | +----------+-------------+ 3 rows in set (0.04 sec)
The power of ordering and grouping combined with the utility of SQL functions enables you to do a great deal of data manipulation even before you retrieve the data from the server. You should take great care not to rely too heavily on this power. While it may seem like an efficiency gain to place as much processing load as possible onto the database server, it is not really the case. Your client application is dedicated to the needs of a particular client, while the server is being shared by many clients. Because of the greater amount of work a server already has to do, it is almost always more efficient to place as little load as possible on the database server. MySQL and mSQL may be two of the fastest databases around, but you do not want to waste that speed on processing that a client application is better equipped to manage.
If you know that a lot of clients will be asking for the same summary information often (for instance, data on a particular rank in our previous example), just create a new table containing that information and keep it up to date as the original tables change. This is similar to caching and is a common database programming technique.
Copyright © 2001 O'Reilly & Associates. All rights reserved.