What was the point in creating the logical data model? You want to create a database to store data about CDs. The data model is only an intermediate step along the way. Ultimately, you would like to end up with a MySQL or mSQL database where you can store data. How do you get there? Physical database design translates your logical data model into a set of SQL statements that define your MySQL or mSQL database.
Since MySQL and mSQL are relational database systems, it is relatively easy to translate from a logical data model, such as the one we described earlier, into a physical MySQL or mSQL database. Here are the rules for translation:
Entities become tables in the physical database.
Attributes become columns in the physical database. You have to choose an appropriate datatype for each of the columns.
Unique identifiers become columns that are not allowed to have NULLs. These are called primary keys in the physical database. You may also choose to create a unique index on the identifiers to enforce uniqueness. For your purposes, mSQL does not have a concept of a primary key. It simply has unique indices. This issue does not apply to MySQL.
Relationships are modeled as foreign keys. We will cover this later.
If we apply these rules to our data model -- minus the Record Label address information -- we will end up with the physical database described in Table 2-1.
Table |
Column |
Datatype |
Notes |
---|---|---|---|
CD |
CDId |
INT |
primary key |
CDTitle |
TEXT(50) |
||
Artist |
ArtistId |
INT |
primary key |
ArtistName |
TEXT(50) |
||
Song |
SongId |
INT |
primary key |
SongName |
TEXT(50) |
||
RecordLabel |
RecordLabelId |
INT |
primary key |
RecordLabelName |
TEXT(50) |
primary key |
The first thing you may notice is that all of the spaces are gone from the entity names in our physical schema. This is because these names need to translate into SQL calls to create these tables. Table names should thus conform to SQL naming rules. Another thing to notice is that we made all primary keys of type INT. Because these attributes are complete inventions on our part, they can be of any indexible datatype.[3] The fact that they are of type INT here is almost purely arbitrary. It is almost arbitrary because it is actually faster to search on numeric fields in many database engines and hence numeric fields make good primary keys. However, we could have chosen CHAR as the type for the primary key fields and everything would work just fine. The bottom line is that this choice should be driven by your criteria for choosing identifiers.
[3]Later in this book, we will cover the datatypes supported by MySQL and mSQL. Each database engine has different rules about which datatypes can be indexible. Neither database, for example, allows indices to be created on whole TEXT fields. It would therefore be inappropriate to have a primary key column be of type TEXT.
The rest of the columns are set to be of type TEXT with a length of 50. This definition works for both MySQL and mSQL. For MySQL, however, VARCHAR would be a better choice but not important to this example. Picking the right datatype for columns is very important, but we will not dwell on it here since we have not yet covered the datatypes for MySQL and mSQL.
We now have a starting point for a physical schema. We haven't yet translated the relationships into the physical data model. As we discussed earlier, once you have refined your data model, you should have mostly 1-to-1 and 1-to-M relationships -- the M-to-M relationships were resolved via junction tables. We model relationships by adding a foreign key to one of the tables involved in the relationship. A foreign key is the unique identifier or primary key of the table on the other side of the relationship.
The most common relationship is the 1-to-M relationship. This relationship is mapped by placing the primary key on the "one" side of the relationship into the table on the "many" side. In our example, this rule means that we need to do the following:
Place a RecordLabelId column in the CD table.
Place a CDId column in the Song table.
Place an ArtistId column in the Song table.
Table 2-2 shows the new schema.
Table |
Column |
Datatype |
Notes |
---|---|---|---|
CD |
CdId |
INT |
primary key |
CDTitle |
TEXT(50) |
||
RecordLabelId |
INT |
foreign key |
|
Artist |
ArtistId |
INT |
primary key |
ArtistName |
TEXT(50) |
||
Song |
SongId |
INT |
primary key |
SongName |
TEXT(50) |
||
CdId |
INT |
foreign key |
|
ArtistId |
INT |
foreign key |
|
RecordLabel |
RecordLabelId |
INT |
primary key |
RecordLabelName |
TEXT(50) |
We do not have any 1-to-1 relationships in this data model. If we did have such a relationship, it should be mapped by picking one of the tables and giving it a foreign key column that matches the primary key from the other table. In theory, it does not matter which table you choose, but practical considerations may dictate which column makes the most sense as a foreign key.
We now have a complete physical database schema ready to go. The last remaining task is to translate that schema into SQL. For each table in the schema, you write one CREATE TABLE statement. Typically, you will choose to create unique indices on the primary keys to enforce uniqueness.
We are, in a sense, jumping ahead at this point. You may not be familiar with SQL yet, and it is not the purpose of this chapter to introduce the MySQL and mSQL variants of SQL. Nevertheless, here are two sample scripts to create the CD database. The first script, Example 2-1 is for MySQL. Example 2-2 is for mSQL.
CREATE TABLE CD (CD_ID INT NOT NULL, RECORD_LABEL_I INT, CD_TITLE TEXT, PRIMARY KEY (CD_ID)) CREATE TABLE Artist (ARTIST_ID INT NOT NULL, ARTIST_NAME TEXT, PRIMARY KEY (ARTIST_ID)) CREATE TABLE Song (SONG_ID INT NOT NULL, CD_ID INT, SONG_NAME TEXT, PRIMARY KEY (SONG_ID)) CREATE TABLE RecorLabel (RECORD_LABEL_ID INT NOT NULL, RECORD_LABEL_NAME TEXT, PRIMARY KEY(RECORD_LABEL_ID))
CREATE TABLE CD (CD_ID INT NOT NULL, RECORD_LABEL_ID INT, CD_TITLE TEXT(50)) CREATE UNIQUE INDEX CD_IDX ON CD (CD_ID) CREATE TABLE Artist (ARTIST_ID INT NO NULL, ARTIST_NAME TEXT(50)) CREATE UNIQUE INDEX Artist_IDX ON Artist (ARTIST_ID) CREATE TABLE Song (SONG_ID INT NOT NULL, CD_ID INT, SONG_NAME TEXT(50)) CREATE UNIQUE INDEX Song_IDX ON Song (SONG_ID) CREATE TABLE RecordLabel (RECORD_LABEL_ID INT NOT NULL, RECORD_LABEL_NAME TEXT(50)) CREATE UNIQUE INDEX RecordLabel_IDX ON RecordLabel(RECORD_LABEL_ID)
Data models are meant to be database independent. You can therefore take the techniques and the data model we have generated in this chapter and apply them not only to MySQL and mSQL, but to Oracle, Sybase, Ingres, or any other relational database engine. In the following chapters, we will discuss the details of how you can merge your new database design knowledge into MySQL and mSQL.
Copyright © 2001 O'Reilly & Associates. All rights reserved.