Queries (MySQL & mSQL)

6.7.1. Joins

Joins put the "relational" in relational databases. Specifically, a join enables you to match a row from one table up with a row in another table. The basic form of a join is what you may hear sometimes described as an inner join . Joining tables is a matter of specifying equality in columns from two tables:

SELECT book.title, author.name 
FROM author, book
WHERE book.author = author.id

Consider a database where the book table looks like Table 6-1.

Table 6-1. A book Table

ID	Title	Author	Pages
1	The Green Mile	4	894
2	Guards, Guards!	2	302
3	Imzadi	3	354
4	Gold	1	405
5	Howling Mad	3	294

And the author table looks like Table 6-2.

An inner join creates a table by combining the fields of both tables for rows that satisfy the query in both tables. In our example, the query specifies that the author field of the book table must be identical to the id field of the author table. The query's result would thus look like Table 6-3.

Table 6-2. An author Table

ID	Name	Citizen
1	Isaac Asimov	US
2	Terry Pratchet	UK
3	Peter David	US
4	Stephen King	US
5	Neil Gaiman	UK

An inner join creates a table by combining the fields of both tables for rows that satisfy the query in both tables. In our example, the query specifies that the author field of the book table must be identical to the id field of the author table. The query's result would thus look like Table 6-3.

Neil Gaiman is nowhere to be found in these results. He is left out because there is no value for his author.id value found in the book.author table. An inner join only contains those rows that exactly match the query. We will discuss the concept of an outer join later in the chapter for situations where we would be interested in the fact that we have an author in the database who does not have a book in the database.

Table 6-3. Query Results Based on an Inner Join

Book Title	Author Name
The Green Mile	Stephen King
Guards, Guards!	Terry Pratchet
Imzadi	Peter David
Gold	Isaac Asimov
Howling Mad	Peter David

Neil Gaiman is nowhere to be found in these results. He is left out because there is no value for his author.id value found in the book.author table. An inner join only contains those rows that exactly match the query. We will discuss the concept of an outer join later in the chapter for situations where we would be interested in the fact that we have an author in the database who does not have a book in the database.

6.7.2. Aliasing

When you use column names that are fully qualified with their table and column name, the names can grow to be quite unwieldy. In addition, when referencing SQL functions, which will be discussed later in the chapter, you will likely find it cumbersome to refer to the same function more than once within a statement. The aliased name, usually shorter and more descriptive, can be used anywhere in the same SQL statement in place of the longer name. For example:

# A column alias
SELECT long_field_names_are_annoying AS myfield
FROM table_name
WHERE myfield = 'Joe'
# A table alias under MySQL
SELECT people.names, tests.score 
FROM tests, really_long_people_table_name AS people
# A table alias under mSQL
SELECT people.names, tests.score
FROM tests, really_long_people_table_name=people

While mSQL fully supports table aliasing, it does not support column aliasing.

6.7.3. Grouping and Ordering

The results you get back from a select are, by default, indeterminate in the order they will appear. Fortunately, SQL provides some tools for imposing order on this seemingly random list. The first tool -- available in both MySQL and mSQL -- is ordering. You can tell a database that it should order any results you see by a certain column. For example, if you specify that a query should order the results by last_name, then the results will appear alphabetized according to the last_name value. Ordering comes in the form of the ORDER BY clause:

SELECT last_name, first_name, age
FROM people
ORDER BY last_name, first_name

In this situation, we are ordering by two columns. You can order by any number of columns, but the columns must be named in the SELECT clause. If we had failed to select the last_name above, we could not have ordered by the last_name field.

Grouping is an ANSI SQL tool that MySQL implements but mSQL does not. Because mSQL does not have any concept of aggregate functions, grouping simply does not make sense in mSQL. As its name implies, grouping lets you group rows with a similar value into a single row in order to operate on them together. You usually do this to perform aggregate functions on the results. We will go into functions a little later in the chapter.

Consider the following:

mysql> SELECT name, rank, salary FROM people\g

+--------------+----------+--------+
| name         | rank     | salary |
+--------------+----------+--------+
| Jack Smith   | Private  |  23000 |
| Jane Walker  | General  | 125000 |
| June Sanders | Private  |  22000 |
| John Barker  | Sargeant |  45000 |
| Jim Castle   | Sargeant |  38000 |
+--------------+----------+--------+
5 rows in set (0.01 sec)

If you group the results by rank, the output changes:

mysql> SELECT rank FROM people GROUP BY rank\g

+----------+
| rank     |
+----------+
| General  |
| Private  |
| Sargeant |
+----------+
3 rows in set (0.01 sec)

Now that you have the output grouped, you can finally find out the average salary for each rank. Again, we will discuss more on the functions you see in this example later in the chapter.

mysql> SELECT rank, AVG(salary) FROM people GROUP BY rank\g

+----------+-------------+
| rank     | AVG(salary) |
+----------+-------------+
| General  | 125000.0000 |
| Private  |  22500.0000 |
| Sargeant |  41500.0000 |
+----------+-------------+
3 rows in set (0.04 sec)

The power of ordering and grouping combined with the utility of SQL functions enables you to do a great deal of data manipulation even before you retrieve the data from the server. You should take great care not to rely too heavily on this power. While it may seem like an efficiency gain to place as much processing load as possible onto the database server, it is not really the case. Your client application is dedicated to the needs of a particular client, while the server is being shared by many clients. Because of the greater amount of work a server already has to do, it is almost always more efficient to place as little load as possible on the database server. MySQL and mSQL may be two of the fastest databases around, but you do not want to waste that speed on processing that a client application is better equipped to manage.

If you know that a lot of clients will be asking for the same summary information often (for instance, data on a particular rank in our previous example), just create a new table containing that information and keep it up to date as the original tables change. This is similar to caching and is a common database programming technique.