Command Prompt, Inc.
Data Types

Data Types

SQL is considered a strongly typed language. This means that any piece of data represented by PostgreSQL has an associated data type, even if it is not plainly obvious. A data value's type both defines and constrains the kinds of operations which may be performed on it.

Not only is every piece of data associated with a type, but types play a large part in the construction of tables. As stated in the Section called Introduction to Relational Databases," tables are made up of one or more columns. These columns must, in addition to having a name, have a specific data type.

Note: While PostgreSQL provides a wide variety of built-in data types, you also have the option to add new data types to PostgreSQL using the CREATE TYPE command. See the reference entry on CREATE TYPE for more on this command.

Table 3-10 lists the data types officially supported by PostgreSQL, as well as any PostgreSQL recognized aliases (alternative names that are identical in connotation). There are many other internal (meaning they are no longer intended for normal use) or deprecated (outdated, and discouraged) data types available that are unlisted.

Additionally, while most of the data types implemented in PostgreSQL are directly derived from SQL standards, there are some actively maintained data types that are non-standard (such as the geometric and spacial types). Therefore, you will not always be able to find equivalent types on other SQL-capable database management systems.

Table 3-10. PostgreSQL supported data types

Category

Data type

Description

Standardization

Boolean and binary types

boolean, bool

A single true or false value.

SQL99

bit(n )

An n -length bit string (exactly n binary bits).

SQL92

bit varying(n ), varbit(n )

A variable n -length bit string (up to n binary bits)

SQL92

Character types

character (n ), char(n )

A fixed n -length character string.

SQL89

character varying(n ), varchar(n )

A variable length character string of up to n characters.

SQL92

text

A variable length character string, of unlimited length.

PostgreSQL-specific

Numeric types

smallint, int2

A signed 2-byte integer.

SQL89

integer, int, int4

A signed, fixed-precision 4-byte number.

SQL92

bigint, int8

A signed 8-byte integer, up to 18 digits in length.

PostgreSQL-specific

real, float4

A 4-byte floating-point number.

SQL89

double precision, float8, float

An 8-byte floating-point number.

SQL89

numeric(p,s ), decimal(p,s )

An exact numeric type with arbitrary precision p, and scale s.

SQL99

money

A fixed precision, U.S.-style currency.

PostgreSQL-specific, deprecated.

serial

An auto-incrementing 4-byte integer.

PostgreSQL-specific

Date and time types

date

The calendar date (day, month and year).

SQL92

time

The time of day.

SQL92

time with time zone

The time of day, including time zone information.

SQL92

timestamp (includes time zone)

Both the date and time.

SQL92

interval

An arbitrarily specified length of time.

SQL92

Geometric types

box

A rectangular box in a 2D plane.

PostgreSQL-specific

line

An infinite line in a 2D plane.

PostgreSQL-specific

lseg

A finite line segment in a 2D plane.

PostgreSQL-specific

circle

A circle with center and radius.

PostgreSQL-specific

path

Open and closed geometric paths in a two-dimensional plane.

PostgreSQL-specific

point

geometric point in a 2D plane

PostgreSQL-specific

polygon

A closed geometric path in a 2D plane.

PostgreSQL-specific

Network types

cidr

An IP network specification.

PostgreSQL-specific

inet

A network IP address, with optional subnet bits.

PostgreSQL-specific

macaddr

A MAC address (e.g., an Ethernet card's hardware address).

PostgreSQL-specific

System types

oid

An object (row) identifier.

PostgreSQL-specific

xid

A transaction identifier.

PostgreSQL-specific

Remaining true to theme, the following sections on data types will describe in further detail each of the most widely used and practical types. This book will not go into detail on the non-standard and/or more esoteric types, such as the geometric, network and bitwise types. These sections include information on valid usage, storage considerations, input and output formats and general syntactic conventions. Before we go much further on specific data types there are a couple of topics worth discussing, including the NULL keyword.

NULL Values

Despite the previously discussed rule that a column can have only one data type and logically accept only that type, there is a value that all columns can be defined as, no matter what their data type. This is the value a column is set to when you use the SQL keyword NULL. Essentially, NULL has no data value, so it is not considered a type; it is a system value that indicates to the database that the field it is located within contains no value. The only exception to the rule that any column can contain a NULL is when the NOT NULL constraint is specified for a column.

NULL is often used in places where a value is optional. It can be a convenient way of omitting data without having to resort to strange or arbitrary conventions, such as storing negative values in an integer field to represent omitted data. While your system requirements may change over time, the connotation of NULL is always NULL.

NULL can be thought of as a meta-value: a value that represents a lack of a value, which will never be equivalent to a non-NULL value. One problem often encountered when working with NULL values is that they are easily confused with empty character strings, which return a blank value to the client when selected. The reason this can be confusing is that NULL values also return a blank value when selected; however, they are completely different than empty character strings and this must be understood in order to avoid creating faulty queries or code. A character string column that contains a blank value still contains a string of characters, though the characters that compose the string are blank; thus, there is still a value in the column. A NULL value represents the complete absence of value within the column, not that it is merely blank.

This is an important distinction, as the rules for SQL operations involving the NULL value are quite different than the rules for operations involving empty string values. This internal distinction is especially important in reference to joins, which are discussed in Chapter 4.

The return of both NULL and empty values is shown in Example 3-15, which retrieves a set of five books from the books table. The first SELECT query shows that there appear to be two books which have been inserted without titles. Upon successive querying, however, it becomes clear that while neither have visible titles, one of the books has an empty value for its title (id 100), while the other has a NULL value.

Example 3-15. Observing NULL values

booktown=# SELECT id, title FROM books;
  id  |        title
------+---------------------
 7808 | The Shining
  156 | The Tell-Tale Heart
 4513 | Dune
  100 |
  101 |
(5 rows)

booktown=# SELECT id, title FROM books WHERE title = '';
 id  | title
-----+-------
 100 |
(1 row)

booktown=# SELECT id, title FROM books WHERE title IS NULL;
 id  | title
-----+-------
 101 |
(1 row)

Example 3-16 demonstrates a more practical (and likely) use of NULL in a table called editions, which relates a book's ISBN number to its publication date.

Example 3-16. Using NULL values

booktown=# SELECT isbn, publication FROM editions;
    isbn    | publication
------------+-------------
 039480001X | 1957-03-01
 0394800753 | 1949-03-01
 0385121679 |
(3 rows)

booktown=# SELECT isbn, publication FROM editions WHERE publication IS NULL;
    isbn    | publication
------------+-------------
 0385121679 |
(1 row)

NULL might be used in this manner in order to represent books with editions that are not yet published, or for books whose publication date was unknown when entered into the database. It could be misleading to supply some arbitrarily illogical date for a book fitting either of these criteria, and in both cases, NULL makes sense as a solution.

Boolean Values

A Boolean value is a simple data structure which can only represent values of true or false. PostgreSQL supports the SQL99-defined boolean data type, with a PostgreSQL-specific alias of bool.

Like all other data types, Boolean values can also be set to NULL. If a Boolean is set to NULL, it will never be interpreted as either true or false; it will be interpreted as NULL. This may seem obvious, but it is significant in situations where you may think to check for NULL Booleans by checking for false values (which won't work). You must use IS NULL to check for NULL Booleans. The ability to be true, false, or NULL (and its related rules regarding the designation of NULL as not being true or false) is known as three-valued logic.

Table 3-11 shows the valid constant values for a true or false state that are recognized by PostgreSQL. Which convention you choose to employ is dependent solely on your own preference. All variations of true, as well as all variations of false, are interpreted identically by the server.

Table 3-11. Supported true or false constants

True

False

true

false

't'

'f '

'true'

'false'

'y'

'n'

'yes'

'no'

'1'

'0'

Warning

If you decide to use the constants listed in Table 3-11, every value (except for true and false) must be enclosed within single quotes. Failure to do so will result in a server error.

Example 3-17 creates a simple table named daily_inventory that logs what books are stock and which are not, correlating an ISBN number with a Boolean value. Once created, the table is populated with data via a series of INSERT statements involving a string constant (the ISBN number), and a variety of valid Boolean constants.

Example 3-17. Simple Boolean table

booktown=# CREATE TABLE daily_inventory (isbn text, in_stock boolean);
CREATE
booktown=# INSERT INTO daily_inventory VALUES ('0385121679', true);
INSERT 3390926 1
booktown=# INSERT INTO daily_inventory VALUES ('039480001X', 't');
INSERT 3390927 1
booktown=# INSERT INTO daily_inventory VALUES ('044100590X', 'true');
INSERT 3390928 1
booktown=# INSERT INTO daily_inventory VALUES ('0451198492', false);
INSERT 3390929 1
booktown=# INSERT INTO daily_inventory VALUES ('0394900014', '0');
INSERT 3390930 1
booktown=# INSERT INTO daily_inventory VALUES ('0441172717', '1');
INSERT 3390931 1
booktown=# INSERT INTO daily_inventory VALUES ('0451160916');
INSERT 3390932 1

Now that the table has been populated with records, a SELECT query may be issued to easily check which books are in stock, as shown in Example 3-18.

Example 3-18. Checking Boolean values

booktown=# SELECT * FROM daily_inventory WHERE in_stock = 'yes';
    isbn    | in_stock
------------+----------
 0385121679 | t
 039480001X | t
 044100590X | t
 0441172717 | t
(4 rows)

With a Boolean column you have the ability to imply a true value by referencing the column name without any kind of operator or modifying keyword. This can lead to more intuitive looking queries for well-designed tables, as shown in Example 3-19.

Example 3-19. Implying Boolean 'true'

booktown=# SELECT * FROM daily_inventory WHERE in_stock;
    isbn    | in_stock
------------+----------
 0385121679 | t
 039480001X | t
 044100590X | t
 0441172717 | t
(4 rows)

Although the second query does not specify 'true' or 'false', it implicitly looks for a value of 'true' by omitting a comparison operator.

Similarly, if you want to search for false values, you may either compare the named column's value against any of the valid boolean constants in Table 3-11, or you may use the SQL keyword NOT just before the column name. Each method is demonstrated in Example 3-20.

Example 3-20. Checking for 'false' Boolean values

booktown=# SELECT * FROM daily_inventory WHERE in_stock = 'no';
    isbn    | in_stock
------------+----------
 0451198492 | f
 0394900014 | f
(2 rows)

booktown=# SELECT * FROM daily_inventory WHERE NOT in_stock;
    isbn    | in_stock
------------+----------
 0451198492 | f
 0394900014 | f
(2 rows)

In this way, you can see how SQL was designed with human readability in mind. By naming your tables and columns in well-designed terms, a SQL query can read almost as plainly as an English sentence.

For the more programming-oriented readers, it may be of interest that you can use the inequality (!=) operator to compare the value of a boolean field against any of the values in Table 3-11 (e.g., WHERE in_stock != 't'). As such, the following three syntactic variations are each equivalent:

 SELECT * FROM daily_inventory WHERE NOT in_stock;
 SELECT * FROM daily_inventory WHERE in_stock = 'no';
 SELECT * FROM daily_inventory WHERE in_stock != 't';

You may have noticed that while seven rows were inserted into the table in Example 3-17, only six rows were returned between the books found in stock, and those found out of stock. This is due to the last insertion in Example 3-17 not supplying a value at all for the in_stock column, leaving the record for the book with ISBN 0451160916 with a NULL value in the in_stock column.

As stated previously, NULL will not register as either true or false. As such, you may use the SQL phrase IS NULL to check for rows with NULL values. Alternatively, you may use != but you will risk portability issues with other databases. The following syntax demonstrates a SQL query which uses the IS NULL phrase:

booktown=# SELECT * FROM daily_inventory WHERE in_stock IS NULL;
    isbn    | in_stock
------------+----------
 0451160916 |
(1 row)

Since IS NULL is a general SQL phrase, you can use the same WHERE clause in an UPDATE statement to correct any accidental NULL values.

Example 3-21. Correcting Null values

booktown=# UPDATE daily_inventory SET in_stock = 'f' WHERE in_stock IS NULL;
UPDATE 1

Character Types

Character types are required any time that you wish to reference character data, such as blocks of ASCII text. They are commonly used for storing names, addresses, and so on.

SQL provides two character types called character, and character varying. In addition to these, a general text type is supported by PostgreSQL, which does not require an explicitly declared upper limit on the size of the field. Columns of type text are automatically re-sized according to the data you put in them, and they may re-size without boundaries (discounting, of course, the 1GB limit for a single field). Table 3-12 shows the available character data types within PostgreSQL.

Table 3-12. Character types

Type

Storage

Description

character(n ), char(n )

(4 + n ) bytes

A fixed-length character string, padded with spaces so that it is n characters in length.

character varying(n ), varchar(n )

Up to (4 + n ) bytes

A variable-length character string with a limit of n characters

text

Variable

A variable, unlimited-length character string

The n in Table 3-12 represents an arbitrarily specified number of characters. This number is specified for a column when a table is created.

Note: Although the text data type is not part of the ANSI/ISO SQL standards, many other Relational Database Management Systems (RDBMS) provide this functionality, including Sybase and MS SQL Server.

Numeric Types

PostgreSQL's numeric types are used to represent both integers and decimal floating-point values. From a general perspective, PostgreSQL's supported numeric types consist of:

  • Two-, four-, and eight-byte integers

  • Four- and eight-byte floating-point numbers

  • Fixed precision decimals

PostgreSQL has support for special types which fall under the family of numeric types, including the deprecated money type, and the special serial construct.

Table 3-13. Numeric types overview

Data type

Storage

Range

bigint, int8

8 bytes

Whole integer values, –9,223,372,036,854,775,807 to +9,223,372,036,854,775,807

double precision, float8, float

8 bytes

Floating-point integer values, 15 significant digits, unlimited size (with limited precision)

integer, int, int4

4 bytes

Whole integer values, –2147483648 to +2147483647

numeric(p,s ), decimal (p,s )

Variable

Whole or floating point integers defined as p total digits (including digits to the right of the decimal) with s digits to the right of the decimal point

real, float4

4 bytes

Floating-point integer values, six significant digits, unlimited size (with limited precision)

smallint, int2

2 bytes

Whole integers, –32768 to +32767

money

4 bytes

Floating-point integer values with a scale of two digits to the right of the decimal, —21474836.48 to +21474836.47

serial

4 bytes

Whole integers, 0 to 2147483647

As shown in Table 3-13, several of PostgreSQL's data types have aliases that are equivalent to their associated data types. This was done for ease of use, but at times it can be confusing, due to the fact that some of the aliases sound familiar. If you are not careful to understand what data type an alias you are using is associated with, you may accidentally reference the wrong data type. For example, in PostgreSQL the real and double precision data types represent numbers you may be more familiar to using a float variable in other languages; however, because they both have aliases that contain the word "float" (float and float8 link to double precision; float4 links to real). Problems may result if if you attempt to use the float alias, thinking it is linked to real, when in fact it is associated with double precision.

The numeric type

The numeric (also known as decimal) type is a specially designed numeric data type that can represent arbitrarily large and precise values within a fixed length that is given by the user. When you create a table with a column of type numeric, you may specify in parentheses two values: the precision and the scale.

The precision is the maximum number of digits that the numeric value may hold (including digits to the right of the decimal point), while the scale describes how many of those digits of precision are to be to the right of the decimal point. If left unspecified, the precision will default to 30 digits, and scale to 6 digits. The maximum precision (and, hence, the maximum scale) you can set this to is 1,000. Setting the precision to 1,000 would allow a maximum 1,000 digits, which should be fairly adequate for most needs.

Note: PostgreSQL will not always return an error if you violate the precision and scale of a numeric column.

Unlike the floating-point data types, you will receive an overflow error if you attempt to insert a number that is larger than the allotted precision range. Beside this limitation, you should be able to insert any number that fits within the provided precision and scale of the numeric type column.

For example, in a numeric(11,6) column, you may safely insert the value 9.999999 with two digits too many to the right of the decimal point (though the value is rounded up to 10.000000). However, an attempt to insert the value 99999.99999999 will fail, as shown in Example 3-22.

Problems that arise from trying to insert values that are two large can be avoided by using the trunc() numeric truncating function within an INSERT command to make sure a number is truncated to a size suitable for the column it is being inserted into. You must provide the length it should be truncated to, which means you'll have to be aware of the precisions you've previously specified. The use of trunc() is also illustrated within Example 3-22.

Example 3-22. Avoiding overflow errors

booktown=# INSERT INTO numbers VALUES (9.99999999);
INSERT 3390697 1
booktown=# SELECT * FROM numbers;
    number
--------------
    10.000000
(1 row)

booktown=# INSERT INTO numbers VALUES (99999.99999999);
ERROR:  overflow on numeric ABS(value) >= 10^5 for field with precision 11 scale 6
booktown=# INSERT INTO numbers VALUES (trunc(99999.99999999, 6));
INSERT 3390698 1
booktown=# SELECT * FROM numbers;
    number
--------------
    10.000000
 99999.999999
(2 rows)

booktown=# INSERT INTO numbers VALUES (trunc(9.99999999, 6));
INSERT 3390699 1
booktown=# SELECT * FROM numbers;
    number
--------------
    10.000000
 99999.999999
     9.999999
(3 rows)

The monetary type

The money type stores U.S.-style currency notation and plain numeric values. As of the writing of this book, the money type is deprecated, and is discouraged from being actively used. It is only presented here as it is still a functional data type, and may be in use on existing PostgreSQL systems.

The suggested alternative to the money type is the numeric type, with a scale of 2 to represent coin values, and a precision large enough to store the largest necessary monetary value (including two digits for the coin precision). Formatting similar to that of the money type can be achieved with the to_char() function, as shown in Example 3-23. This example demonstrates the text concatenation operator, and the ltrim() text formatting function, each described in Chapter 4.

Example 3-23. A numeric alternative to money

booktown=# CREATE TABLE money_example (money_cash money, numeric_cash numeric(10,2));    
CREATE
booktown=# INSERT INTO money_example VALUES ('$12.24', 12.24);
INSERT 3391095 1
booktown=# SELECT * FROM money_example;
 money_cash | numeric_cash
------------+--------------
     $12.24 |        12.24
(1 row)

booktown=# SELECT money_cash, 
booktown-#        '$' || ltrim(to_char(numeric_cash, '9999.99')) 
booktown-#        AS numeric_cashified 
booktown-#        FROM money_example;
 money_cash | numeric_cashified
------------+-------------------
     $12.24 | $12.24
(1 row)

The serial type

The serial type is a non-standard but useful shortcut which allows you to easily create an identifier column within a table that contains a unique value for each row. The serial type literally combines the functionality of a 4-byte integer data type, an index, and a sequence. Example 3-24 shows the serial type being used to generate a unique identifier for each row in a table named auto_identifier. Example 3-25 shows the same thing being accomplished using an integer column, the nextval() function , and a sequence. As of the writing of this book, these two methods are functionally identical.

See Chapter 7 for more information on using sequences.

Example 3-24. Using the serial data type

booktown=# CREATE TABLE auto_identified (id serial);
NOTICE:  CREATE TABLE will create implicit sequence 'auto_identified_id_seq'
for SERIAL column 'auto_identified.id'
NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'auto_identified_id_key'
for table 'auto_identified'
CREATE

Example 3-25. Accomplishing the same goal manually

booktown=# CREATE SEQUENCE auto_identified_id_seq;
CREATE
booktown=# CREATE TABLE auto_identified 
booktown-# (id integer UNIQUE DEFAULT nextval('auto_identified_id_seq'));
NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'auto_identified_id_key' for table 'auto_identified'
CREATE

Caution with Implicit Sequences

Upon dropping a table, the implicit sequence created for the serial types are not automatically dropped. You must clean up after these types of sequences if you destroy a table which had a serial column, as shown in Example 3-24, with the DROP SEQUENCE command.

Date and Time Types

Date and time types are a convenient way to store date and time related data in a uniform SQL data structure, without having to worry about the conventions involved with storage (e.g., if you were to try to store such information in a character data type). PostgreSQL uses Julian dates for all date and time calculations. Julian date representation is the commonly used January through December calendar that you are most likely familiar with. By fixing the length of a year at about 365.24 days, Julian dates can correctly calculate any date after 4713 BC, as well as far into the future.

PostgreSQL supports all of the SQL92-defined date and time types shown in Table 3-14, as well as some PostgreSQL-specific extensions to help with SQL92's timezone limitations.

Table 3-14. Date and time types

Name

Storage

Description

Range

date

4 bytes

A calendar date (year, month, and day)

4713 BC to 32767 AD

time

4 bytes

The time of day only, without time zone information

00:00:00.00 to 23:59:59.99

time with time zone

4 bytes

The time of day only, including a time zone

00:00:00.00+12 to 23:59:59.99-12

timestamp (includes time zone)

8 bytes

Both the calendar date and time, with time zone information

1903 AD to 2037 AD

interval

12 bytes

A general time span interval

–1780000000 years to 17800000 years

Backward compatibility

To ensure compatibility with earlier versions of PostgreSQL, the developers have continued to provide the older datetime and timespan data types. The datetime type is now equivalent to timestamp, while the timespan is now equivalent to the interval types.

Other date/time data types include abstime and reltime, which are lower precision types. However, these types are internal to PostgreSQL, and any or all of these types may disappear in a future release. It is advised therefore to design new applications with the SQL-compliant data types in mind, and to convert older applications from any of these data types as soon as is possible.

Date conventions

Date input can be accepted by PostgreSQL in many common formats, including the ISO-8601 format, the traditional SQL format, the original PostgreSQL format, and more. Table 3-15 lists several of these date formats.

These formats are relevant to the date and the timestamp data types.

Table 3-15. Valid date formats

Format Example

Description

July 1, 2001

Named month, day and year

Sunday July 1, 2001

Named day, named month, day and year

July 15, 01 BC

Named month, day and year before the Common Era

2001-07-01

Standard ISO-8601 format: numeric year, month and day

20010715

ISO-8601: formatted numerically as complete year, month, day

010715

ISO-8601: formatted numerically as 2-digit year, month, day

7/01/2001

Non-European (U.S.) format: numeric month, day and year

1/7/2001

European format: numeric day, month and year

2001.182

Numeric format, with complete year, and sequential day of the year

When specifying a named month in a date value to PostgreSQL, you may either type the complete month name, or choose from a set of defined abbreviations for each month. These abbreviations are listed in Table 3-16.

Table 3-16. Month abbreviations

Month

Abbreviation

January

Jan

February

Feb

March

Mar

April

Apr

May

May

June

Jun

July

Jul

August

Aug

September

Sep, Sept

October

Oct

November

Nov

December

Dec

Similarly, Table 3-17 lists PostgreSQL-recognized abbreviations for weekday names.

Table 3-17. Day of the week abbreviations

Day

Abbreviation

Sunday

Sun

Monday

Mon

Tuesday

Tue, Tues

Wednesday

Wed, Weds

Thursday

Thu, Thur, Thurs

Friday

Fri

Saturday

Sat

Despite the wide variety of ways in which PostgreSQL can interpret date values, the values are always stored uniformally, and will be returned in a consistent format. As such, you have a variety of methods available to you to customize the default behavior with which date and time values are returned to you.

Note: While date values can always be formatted during selection via several formatting functions (e.g., to_char()), it is more efficient to configure your defaults as close to the most commonly used conventions as you can before having to resort to manual type conversion and text formatting.

To set the general date/time output format, the SET command can be applied to the run-time variable DATESTYLE. This variable may be set to one of four available general styles shown in Table 3-18.

Table 3-18. Date output formats

General format

Description

Example

ISO

ISO-8601 standard

2001-06-25 12:24:00-07

SQL

Traditional SQL style

06/25/2001 12:24:00.00 PDT

Postgres

Original PostgreSQL style

Mon 25 Jun 12:24:00 2001 PDT

German

Regional style for Germany

25.06.2001 12:24:00.00 PDT

As an example, you can use the following SQL statement to set the date style to SQL:

booktown=# SET DATESTYLE TO SQL;
SET VARIABLE

If you perform a SELECT current_timestamp query after setting this variable, PostgreSQL should return the current time using the ISO format as instructed:

booktown=# SELECT current_timestamp;
         timestamp
----------------------------
 08/10/2001 13:25:55.00 PDT
(1 row)

The SHOW command can be used to display the current value of the DATESTYLE variable while PostgreSQL is running.

booktown=# SHOW DATESTYLE;
NOTICE:  DateStyle is SQL with US (NonEuropean) conventions
SHOW VARIABLE

In addition to these general formats, PostgreSQL's date output format has two other variants which further describe how to display the date, shown in Table 3-19: European and non-European (U.S.). These determine whether the format is day followed by month, or vice versa. This variation can be applied on top of the previous four general formats with the same syntax to SET DATESTYLE and will not modify your chosen format except for the arrangement of the month and day.

Table 3-19. Extended date output formats

Month/day format

Description

Example

European

day/month/year

12/07/2001 17:34:50.00 MET

U.S., or Non-European

month/day/year

07/12/2001 17:34:50.0 PST

Furthermore, you may set both the general format and day/month convention by supplying both variables to the SET command, comma delimited. The order of these variables is not important to the SET command as long as the variables are not mutually exclusive (e.g., SQL and ISO), as shown in Example 3-26.

Example 3-26. Setting date formats

booktown=# SET DATESTYLE TO ISO,US;
SET VARIABLE
booktown=# SHOW DATESTYLE;
NOTICE:  DateStyle is ISO with US (NonEuropean) conventions
SHOW VARIABLE
booktown=# SET DATESTYLE TO NONEUROPEAN, GERMAN;
SET VARIABLE
booktown=# SHOW DATESTYLE;
NOTICE:  DateStyle is German with European conventions
SHOW VARIABLE

If you do not specify a month/day format, a reasonable default will usually be chosen (e.g., European is the default for the German regional format).

While SET DATESTYLE is a convenient way to set the output format, it is important to note that this is a run-time variable, which means that it exists only for the lifespan of your connected session. There are two methods available that allow you to provide a default value for the DATESTYLE variable, which lets you avoid explicitly setting the variable for each new session you begin:

  • You may change the PGDATESTYLE environment variable on the server running postmaster. For example, with the bash shell, you could add the export PGDATESTYLE="SQL US" line to the postgres user's .bash_ profile file. When the postgres user starts postmaster, the PGDATESTYLE variable will be read and applied globally to all date and time formatting performed by PostgreSQL.

  • You may change the PGDATESTYLE environment variable used by a client application (assuming it was written with the libpq library) on its session start-up, if you wish the client rather than the server to configure the output. For example, setting the PGDATESTYLE variable at a bash prompt with the export command before starting psql sets the format for psql to use.

Time conventions

Time values, like date values, may be entered in to a table in a number of ways. Commonly used formats are listed in Table 3-20. These apply to values of type time and time with time zone.

Table 3-20. Valid time formats

Format example

Description

01:24

ISO-8601, detailed to minutes

01:24 AM

Equivalent to 01:24 (the "AM" attached is for readability only, and does not affect the value)

01:24 PM

Equivalent to 13:24 (the hour must be less-than or equal to 12 to use "PM")

13:24

24-hour time, equivalent to 01:24 PM

01:24:11

ISO-8601, detailed to seconds

01:24:11.112

ISO-8601, detailed to microseconds

012411

ISO-8601, detailed to seconds, formatted numerically

In addition to these formats, PostgreSQL allows for further description of a time value which is defined as time with time zone by supporting extra time zone parameters following the time value. The supported formats are illustrated in Table 3-21.

Table 3-21. Valid time zone formats

Format example

Description

01:24:11-7

ISO-8601, 7 hours behind GMT

01:24:11-07:00

ISO-8601, 7 hours, zero minutes behind GMT

01:24:11-0700

ISO-8601, 7 hours, zero minutes behind GMT

01:24:11 PST

ISO-8601, Pacific Standard Time (7 hours behind GMT)

Note: PostgreSQL supports the use of all ISO standard time zone abbreviations.

The time with time zone data type is mainly supported by PostgreSQL to adhere to existing SQL standards and for portability with other database management systems. If you need to work with time zones, it is recommended that you use the timestamp data type discussed in the Section called Timestamps." This is primarily because of the fact that, due to daylight savings, time zones cannot always be meaningfully interpreted without an associated date.

Internally, PostgreSQL keeps track of all time zone information as a numeric offset of GMT (Greenwich Mean Time), which is also known as UTC (Universal Coordinated Time). By default, PostgreSQL's time display will use the time zone that your server's operating system is configured for. If you wish the time value to operate under a different time zone, there are four ways in which you can modify the output:

Set the TZ environment variable on the server

This variable is found by the backend server as the default time zone when the postmaster starts up. It can be set, for example, in the postgres user's .bash_ profile file with a bash export TZ='zone' command.

Set the PGTZ environment variable on the client

If the PGTZ environment variable is set, it can be read by any client written with libpq and interpreted as the client's default time zone.

Use the SET TIMEZONE TO SQL statement

This SQL command sets the time zone for the session to zone (e.g., SET TIMEZONE TO UTC)

Use the AT TIME ZONE SQL clause

This SQL92 clause can be used to specify zone as a text time zone (e.g., PST ) or as an interval (e.g., interval('—07:00')). This clause may be applied in the middle of a SQL statement following a value which contains a timestamp (e.g., SELECT my_timestamp AT TIME ZONE 'PST').

Note: Most systems will default to GMT when a time zone variable is set to an invalid time zone.

Additionally, if the compiler option USE_AUSTRALIAN_RULES was set when PostgreSQL was built, the EST time zone will refer to Australian Eastern Standard Time (with an offset of +10:00 hours from GMT) rather than U.S. Eastern Standard Time.

Timestamps

The PostgreSQL timestamp combines the functionality of the PostgreSQL date and time types into a single data type. The syntax of a timestamp value consists of a valid date format, followed by at least one whitespace character, and a valid time format. It can be followed optionally by a time zone value, if specified.

Combinations of all date and time formats listed in Table 3-15 and Table 3-20 are each supported in this fashion. Table 3-22 illustrates some examples of valid timestamp input.

Table 3-22. Some valid timestamp formats

Format Example

Description

1980-06-25 11:11-7

ISO-8601 date format, detailed to minutes, and PST time zone

25/06/1980 12:24:11.112

European date format, detailed to microseconds

06/25/1980 23:11

U.S. date format, detailed to minutes in 24-hour time

25.06.1980 23:11:12 PM

German regional date format, detailed to seconds, and PM attached

Warning

While PostgreSQL supports the syntax of creating a column or value with the type timestamp without time zone, as of PostgreSQL 7.1.2 the resultant data type still contains a time zone.

Intervals

The SQL92 standard specifies a data typed called an interval, which represents a fixed span of time. By itself, an interval represents only a quantity of time, and does not begin or end at any set date or time. These intervals can be useful when applied to date and time values to calculate a new date or time, either by subtracting or adding the quantity. They can also be handy for quickly determining the precise interval between two date or time values. This can be achieved by subtracting date values, time values or timestamps from one another.

The two syntax variations below can specify an interval within PostgreSQL:

  qty unit  [ ago ]
  qty1 unit [, qty2 unit2 ... ] [ ago ]

Where:

qty

Specifies the quantity of your interval, which may be any whole integer, or floating-point number in the case of microseconds. The literal meaning of this number is qualified by the subsequent unit.

unit

Qualifies the qty provided. The unit may be any one of the following keywords: second, minute, hour, day, week, month, year, decade, century, millennium. It can also be an abbreviation (as short as you want, as long as it cannot be confused with another keyword) or plurals of the previously mentioned units.

ago

The optional ago keyword of the interval determines whether or not you are describing a period of time before the associated time, rather than after. You can think of it as a negative sign for date and time types.

Example 3-27 shows functional syntax for date and interval values being meaningfully combined. You can see that subtracting an inverted time interval (e.g., one with the term ago) is functionally identical to adding a normal interval. This can be thought of as similar to the effect of adding negative numbers to integer values.

Example 3-27. Interpreting interval formats

booktown=# SELECT date('1980-06-25');
    date
------------
 1980-06-25
(1 row)

booktown=# SELECT interval('21 years 8 days');
    interval
-----------------
 21 years 8 days
(1 row)

booktown=# SELECT date('1980-06-25') + interval('21 years 8 days') 
booktown-# AS spanned_date;
      spanned_date
------------------------
 2001-07-03 00:00:00-07
(1 row)

booktown=# SELECT date('1980-06-25') - interval('21 years 8 days ago') 
booktown-# AS twice_inverted_interval_date;
 twice_inverted_interval_date
------------------------------
 2001-07-03 00:00:00-07
(1 row)

Built-in date and time constants

PostgreSQL supports many special constants for use when referencing dates and times. These constants represent common date/time values, such as now, tomorrow, and yesterday. The predefined date and time constants supported by PostgreSQL are listed in Table 3-23.

PostgreSQL also provides three built-in functions for retrieving the current time, date, and timestamp. These are aptly named current_date, current_time, and current_timestamp.

Table 3-23. Date and time constants

Constant

Description

current

The current transaction time, deferred. Unlike a now, current is not a timestamp; it represents the current system time and can be used to reference whatever that time may be.

epoch

1970-01-01 00:00:00+00 (UNIX's "Birthday")

infinity

An abstract constant later than all other valid dates and times

-infinity

An abstract constant earlier than all other valid dates and times

now

The current transaction timestamp

today

Midnight, on the current day

tomorrow

Midnight, on the day after the current day

yesterday

Midnight on the day before the current day

The now and current timestamp constants may seem to be identical, looking solely at their names. They are, however, very different in terms of storing them in a table. The now constant is translated into the timestamp of the system time at the execution of whichever command referenced it (e.g., the time of insertion, it now had been referenced in an INSERT statement). In contrast, the current constant, as it is a deferred identifier, will actually appear as the phrase current in the database. From there, it can be translated (e.g., via the to_char() function) to the timestamp associated with the transaction time of any query which requests that value.

In other words, current will always tell you the "current" time when queried, regardless of when it was stored to the table. The current constant can be used in special situations, such as process tracking, where you may need to calculate the difference between a timestamp made with now and the current date and time to find the total time the process has been running. Example 3-28 demonstrates using the now and current constants to create a log of tasks. First, a table is created to house the task's name, its start date and time, and its finished date and time. Two tasks are then added to the table, using the now constant to set the start date and current to set the completed date. The reason this is done is to show that both of these tasks are uncompleted. If a task were to be completed, the table could be updated to show a now timestamp for that task's timefinished column.

Note: The use of time/date constants requires the use of single-quotes around their respective names. See Example 3-28 for a valid representation of single-quoted time/date constants.

Example 3-28. Using the current and now constants

booktown=# CREATE TABLE tasklog
booktown=#   (taskname char(15),
booktown=#    timebegun timestamp,
booktown=#    timefinished timestamp);
CREATE
booktown=# INSERT INTO tasklog VALUES
booktown=#    ('delivery', 'now', 'current');
INSERT 169936 1
booktown=# INSERT INTO tasklog VALUES
booktown=#    ('remodeling', 'now', 'current');
INSERT 169937 1
booktown=# SELECT taskname, timefinished - timebegun AS timespent FROM tasklog;
    taskname     | timespent
-----------------+-----------
 delivery        | 00:15:32
 remodeling      | 00:04:42
(2 rows)

Therefore, you generally want to use now when storing a transaction timestamp in a table, or even the current_timestamp function, which is equivalent to the output of now. Example 3-29 shows how this could be a potentially disastrous SQL design issue if not properly understood. It shows a pair of INSERT statements; one which uses now, another which uses current. If you watch the first row returned from the two queries (the row with a current timestamp), you'll notice it changes in each query to show the updated system time, while the second row remains the same (this is he the row in which now was used).

Example 3-29. Comparing now to current

booktown=# INSERT INTO shipments (customer_id, isbn, ship_date) 
booktown-#        VALUES (1, '039480001X', 'current');
INSERT 3391221 1
booktown=# INSERT INTO shipments (customer_id, isbn, ship_date) 
booktown-#        VALUES (2, '0394800753', 'now');
INSERT 3391222 1
booktown=# SELECT isbn, ship_date FROM shipments;
    isbn    |       ship_date
------------+------------------------
 039480001X | current
 0394800753 | 2001-08-10 18:17:49-07
(2 rows)

booktown=# SELECT isbn, 
booktown-#        to_char(ship_date, 'YYYY-MM-DD HH24:MI:SS') 
booktown-#        AS value
booktown-#   FROM shipments;
    isbn    |        value
------------+---------------------
 039480001X | 2001-08-10 18:21:22
 0394800753 | 2001-08-10 18:17:49
(2 rows)

booktown=# SELECT isbn, to_char(ship_date, 'YYYY-MM-DD HH24:MI:SS') AS value
booktown-#   FROM shipments;
    isbn    |        value
------------+---------------------
 039480001X | 2001-08-10 18:22:35
 0394800753 | 2001-08-10 18:17:49
(2 rows)

Geometric types

Geometric types in PostgreSQL represent two dimensional spatial objects. These types are not standard SQL data types, and will not be discussed in depth in this book. Table 3-24 gives a brief overview of each of the available geometric types.

Table 3-24. Geometric types

Type Name

Storage

Description

Syntax

point

16 bytes

A dimensionless object with no properties except for its location, where x and y are floating-point numbers.

(x , y )

lseg

32 bytes

Finite line segment. The points specified are the end points of the line segment.

((x1 , y1 ), (x2 , y2 ))

box

32 bytes

Rectangular box. The points specified are the opposite corners of the box.

((x1 , y1 ), (x2 , y2 ))

path

4 + 32 * n bytes

Closed path (similar to polygon). A connected set of n points.

((x1 , y1 ), ...)

path

4 + 32 * n bytes

Open path. A connected set of n points.

[(x1 , y1 ), ...]

polygon

4 + 32 * n bytes

Polygon (similar to closed path), with n end points defining line segments that makes up the boundary of the polygon.

((x1 , y1 ), ...)

circle

24 bytes

The point (x , y ) is the center, while r is the radius of the circle.

<(x , y ), r >

Arrays

The original relational model specifies that the values represented by columns within a table be an atomic piece of data, object-relational database systems such as PostgreSQL allow non-atomic values to be used through data structures called arrays.

An array is a collection of data values referenced through a single identifier. The array may be a collection of values of a built-in data type or a user-defined data type, but every value in the array must be of the same type. Arrays can be accessed from a table through subscript notation via square brackets (e.g., my_array[0]). You can also use an array constant via curly braces within single quotes (e.g., '{value_one,value_two,value_three}').

Arrays in tables

When defining an array, the syntax allows for the array to be defined either as fixed-length or variable-length; however as of PostgreSQL 7.1.2, the fixed-length size restriction is not enforced. This means that you may treat the array as having a fixed number of elements at all times, but it can still be dynamically sized. For example, it is perfectly acceptable for a single column defined as an array to contain three values in one record, four values in another, and no values in a third.

Additionally, arrays may be defined as being multi-dimensional, meaning that each element of the array may actually represent another array, rather than an atomic value. Values that are selected from a multi-dimensional array will consist of nested curly braces in order to show an array within an array, as follows:

booktown=# SELECT editions FROM my_notes WHERE title='The Cat in the Hat';
                           editions
---------------------------------------------------------------
 {{"039480001X","1st Ed, Hard Cover"},{"0394900014","1st Ed"}}
(1 row)

Array constants

In order to actually insert array values into a table column, you need a way to refer to several values as an array in a SQL statement. The formal syntax of an array constant is a grouping of values, separated by delimiters (commas, for built-in data types), enclosed by curly braces ({}), which are in turn enclosed by single quotes, as follows:

 '{ value1 , value2 [, ...] }'

The values in this syntax can be any valid PostgreSQL data type. As the entire array is constrained by single quotes, the use of single quotes within an array value must be escaped, just as they must be within a string constant. The use of commas to delimit the values, however, poses an interesting problem pertaining to the use of character strings which contain commas themselves, as the commas will be interpreted as delimiters if not within single-quotes. However, as just mentioned, the singles quotes constrain the array, not the array's values.

PostgreSQL's method of handling this is to use double-quotes to quote string constants where single-quotes would ordinarily be used outside of an array context, as follows:

 '{"value1" , "value 2, which contains a comma" }'

It's vital to remember that arrays require the single quotes surrounding the curly braces in order to be interpreted correctly by PostgreSQL. You can think of array constants as being akin to a special type of string constant, which is interpreted as an array based on where it is used (e.g., when used to add records to a target column which is of an array data type). This is because unless used in an array context, a constant of the this format will be interpreted by PostgreSQL as a normal string constant (as it is bound by single quotes) which just happens to include curly braces.

Type Coercion

PostgreSQL supports three separate conventions for type coercion (also called type casting, or explicit type casting). Type coercion is a somewhat ugly looking term which refers to a PostgreSQL method for changing a value from one data type to another. In the middle of a SQL statement, this has the net effect of explicitly creating a constant of an arbitrary type.

Generally any of the following three methods can be used in order to cast the value contained within a string constant to another type:

  • type 'value '

  • 'value '::type

  • CAST ('value ' AS type )

In the case of maintained numeric constants that you wish to cast to a character string, you will need to use one of the following syntax forms:

  • value ::type

  • CAST (value AS type )

The value in this syntax represents the constant whose data type you wish to modify, and type represents the type that you wish to coerce, or cast, the value into.

Note: Remember that the money type is deprecated, and therefore not easily cast.

Constants are not the only data values that may be coerced to different types. Columns of a data set returned by a SQL query may be cast by using its identifier in one of the following syntax forms:

  • identifier ::type

  • CAST (identifier AS type )

Bear in mind that not every data type can be coerced into every other data type. For example, there is no meaningful way to convert the character string abcd into a binary bit type. Invalid casting will result in an error from PostgreSQL. Common valid casts are from character string, date/time type, or a numeric type to text, or character strings to numeric values.

In addition to these type casting conventions, there are some functions that can be called to achieve essentially the same effect as an explicit cast of any of the previously mentioned forms. These often bear the name of the type itself (such as the text() function), though others are named more specifically (such as bitfromint4()). Example 3-30 shows such a function, converting the integer 1000 to a character string of type text representing the characters 1000.

Example 3-30. Using Type Conversion Functions

booktown=# SELECT text(1000) 
booktown-# AS explicit_text;
 explicit_text
---------------
 1000
(1 row)

Because of conflicting semantics recognized by PostgreSQL's parser, the type coercion format of type 'value' can only be used to specify the data type of a single value (e.g., a string constant bound by single quotes). In contrast, the other available methods of type coercion ('value'::type, CAST('value' AS type ) and type conversion functions, where applicable) can be used to to specify the type of arbitrary expressions.

This is partially because attempting to follow a data type with a grouped expression (e.g., in parentheses) will cause PostgreSQL to expect a function with the name of the provided data type (which will often cause an error) while each of the other methods are syntactically valid upon grouped expressions.

booktown=# SELECT 1 + integer ('1' || '2') AS add_one_to_twelve;
ERROR:  Function 'integer(text)' does not exist
        Unable to identify a function that satisfies the given argument types
        You may need to add explicit typecasts
booktown=# SELECT 1 + ('1' || '2')::integer AS add_one_to_twelve;
 add_one_to_twelve
-------------------
                13
(1 row)

booktown=# SELECT 1 + CAST('1' || '2' AS integer) AS add_on_to_twelve;
 add_on_to_twelve
------------------
               13
(1 row)


Powered by Mammoth PostgreSQL
Copyright © 2000-2007 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.