Quoting String Constants
Printing Strings
Accessing Individual Characters
Cleaning Strings
Encoding and Escaping
Comparing Strings
Manipulating and Searching Strings
Regular Expressions
POSIX-Style Regular Expressions
Perl-Compatible Regular Expressions
Most data you encounter as you program will be sequences of characters, or strings. Strings hold people's names, passwords, addresses, credit-card numbers, photographs, purchase histories, and more. For that reason, PHP has an extensive selection of functions for working with strings.
This chapter shows the many ways to write strings in your programs, including the sometimes-tricky subject of interpolation (placing a variable's value into a string), then covers the many functions for changing, quoting, and searching strings. By the end of this chapter, you'll be a string-handling expert.
There are three ways to write a literal string in your program: using single quotes, double quotes, and the here document (heredoc) format derived from the Unix shell. These methods differ in whether they recognize special escape sequences that let you encode other characters or interpolate variables.
The general rule is to use the least powerful quoting mechanism necessary. In practice, this means that you should use single-quoted strings unless you need to include escape sequences or interpolate variables, in which case you should use double-quoted strings. If you want a string that spans many lines, use a heredoc.
When you define a string literal using double quotes or a heredoc, the string is subject to variable interpolation. Interpolation is the process of replacing variable names in the string with the values of those variables. There are two ways to interpolate variables into strings—the simple way and the complex way.
The simple way is to just put the variable name in a double-quoted string or heredoc:

    $who = 'Kilroy';
    $where = 'here';
    echo "$who was $where";

    Kilroy was here

The complex way is to surround the variable being interpolated with curly braces. This method can be used either to disambiguate or to interpolate array lookups. The classic use of curly braces is to separate the variable name from surrounding text:

    $n = 12;
    echo "You are the {$n}th person";

    You are the 12th person

Without the curly braces, PHP would try to print the value of the $nth variable.
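The text above mentions that curly braces can also interpolate array lookups. The following is a minimal sketch of that use; the `$user` and `$langs` arrays are hypothetical sample data invented for illustration.

```php
<?php
// Hypothetical sample data, used only for illustration.
$n = 12;
$user = ['name' => 'Kilroy', 'visits' => 3];
$langs = ['PHP', 'Perl'];

// Braces mark where the variable name ends.
$ordinal = "You are the {$n}th person";

// Braces also allow quoted array keys and numeric indexes.
$greeting = "{$user['name']} has visited {$user['visits']} times";
$first = "First language: {$langs[0]}";

echo $ordinal, "\n", $greeting, "\n", $first, "\n";
```

Inside the braces, the expression is written the same way it would be outside the string, which is why the array keys keep their quotes.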
Unlike in some shell environments, in PHP strings are not repeatedly processed for interpolation. Instead, any interpolations in a double-quoted string are processed, then the result is used as the value of the string:

    $bar = 'this is not printed';
    $foo = '$bar'; // single quotes
    print("$foo");

    $bar

Single-quoted strings do not interpolate variables. Thus, the variable name in the following string is not expanded because the string literal in which it occurs is single-quoted:

    $name = 'Fred';
    $str = 'Hello, $name'; // single-quoted
    echo $str;

    Hello, $name

The only escape sequences that work in single-quoted strings are \', which puts a single quote in a single-quoted string, and \\, which puts a backslash in a single-quoted string. Any other occurrence of a backslash is interpreted simply as a backslash:

    $name = 'Tim O\'Reilly'; // escaped single quote
    echo $name;
    $path = 'C:\\WINDOWS'; // escaped backslash
    echo $path;
    $nope = '\n'; // not an escape
    echo $nope;

    Tim O'Reilly
    C:\WINDOWS
    \n

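One way to check what a single-quoted literal really contains is to count its characters with `strlen()`. The sketch below does that for each rule just described; the variable names are chosen only for illustration.

```php
<?php
$quote = 'O\'Reilly';    // \' becomes one single quote: 8 characters
$path  = 'C:\\WINDOWS';  // \\ becomes one backslash: 10 characters
$nope  = '\n';           // not an escape: a backslash plus "n", 2 characters
$var   = '$name';        // no interpolation: 5 literal characters

echo strlen($quote), ' ', strlen($path), ' ', strlen($nope), ' ', strlen($var), "\n";
// prints: 8 10 2 5
```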
Double-quoted strings interpolate variables and expand the many PHP escape sequences. Table 4-1 lists the escape sequences recognized by PHP in double-quoted strings.
Table 4-1. Escape sequences in double-quoted strings

Escape sequence | Character represented |
---|---|
\" | Double quotes |
\n | Newline |
\r | Carriage return |
\t | Tab |
\\ | Backslash |
\$ | Dollar sign |
\{ | Left brace |
\} | Right brace |
\[ | Left bracket |
\] | Right bracket |
\0 through \777 | ASCII character represented by octal value |
\x0 through \xFF | ASCII character represented by hex value |

If a double-quoted string literal contains an unknown escape sequence (i.e., a backslash followed by a character that is not one of those in Table 4-1), PHP leaves the sequence in the string unchanged, backslash included. If you have the warning level E_NOTICE set, a warning is generated for such unknown escape sequences:

    $str = "What is \c this?"; // unknown escape sequence
    echo $str;

    What is \c this?

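To see a few of these escape sequences at work, the following sketch compares double-quoted literals against the characters they should produce. The variable names are illustrative, and `$price` is deliberately never defined, because the escaped dollar sign prevents interpolation.

```php
<?php
$tabbed  = "a\tb";     // \t is a single tab character, so 3 bytes in total
$octal   = "\101";     // octal 101 is decimal 65, the letter A
$hex     = "\x41";     // hex 41 is also decimal 65, the letter A
$dollar  = "\$price";  // escaped dollar sign: no variable is looked up
$unknown = "\c";       // unknown escape: the backslash is kept, 2 bytes

var_dump(strlen($tabbed));     // int(3)
var_dump($octal === $hex);     // bool(true)
var_dump($dollar);             // string(6) "$price"
var_dump(strlen($unknown));    // int(2)
```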
You can easily put multiline strings into your program with a heredoc, as follows:

    $clerihew = <<< End_Of_Quote
    Sir Humphrey Davy
    Abominated gravy.
    He lived in the odium
    Of having discovered sodium.
    End_Of_Quote;
    echo $clerihew;

    Sir Humphrey Davy
    Abominated gravy.
    He lived in the odium
    Of having discovered sodium.

The <<< Identifier tells the PHP parser that you're writing a heredoc. A space between the <<< and the identifier is allowed but not required. You get to pick the identifier. The next line starts the text being quoted by the heredoc, which continues until it reaches a line that consists of nothing but the identifier.
As a special case, you can put a semicolon after the terminating identifier to end the statement, as shown in the previous code. If you are using a heredoc in a more complex expression, you need to continue the expression on the next line, as shown here:

    printf(<<< Template
    %s is %d years old.
    Template
    , "Fred", 35);

Single and double quotes in a heredoc are passed through:

    $dialogue = <<< No_More
    "It's not going to happen!" she fumed.
    He raised an eyebrow. "Want to bet?"
    No_More;
    echo $dialogue;

    "It's not going to happen!" she fumed.
    He raised an eyebrow. "Want to bet?"

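Because a heredoc is subject to variable interpolation in the same way as a double-quoted string, quotes and variables can be mixed freely. This is a minimal sketch; `$customer` and `$order` are hypothetical values invented for the example.

```php
<?php
// Hypothetical order data, used only for illustration.
$customer = 'Fred';
$order = ['item' => 'gravy boat', 'qty' => 2];

// Both the simple form ($customer) and the brace form work in a heredoc,
// and the double quotes around the item name need no escaping.
$note = <<< EOT
Dear $customer,
Your order of {$order['qty']} x "{$order['item']}" has shipped.
EOT;

echo $note, "\n";
```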
Whitespace in a heredoc is also preserved:

    $ws = <<< Enough
      boo
      hoo
    Enough;
    // $ws = "  boo\n  hoo";

The newline before the trailing terminator is removed, so these two assignments are identical:

    $s = 'Foo';
    // same as
    $s = <<< End_of_pointless_heredoc
    Foo
    End_of_pointless_heredoc;

If you want a newline to end your heredoc-quoted string, you'll need to add an extra one yourself:

    $s = <<< End
    Foo

    End;

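The newline rule can be checked directly by comparing heredocs against ordinary string literals. The sketch below assumes nothing beyond the behavior described above; the variable names are illustrative.

```php
<?php
// The newline just before the terminator is dropped.
$plain = <<< End
Foo
End;

// A blank line before the terminator leaves one newline in the string.
$withNewline = <<< End
Foo

End;

var_dump($plain === 'Foo');          // bool(true)
var_dump($withNewline === "Foo\n");  // bool(true)
```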
Copyright © 2003 O'Reilly & Associates. All rights reserved.