PHP Cookbook

1.12. Taking Strings Apart

1.12.1. Problem

You need to break a string into pieces. For example, you want to access each line that a user enters in a <textarea> form field.

1.12.2. Solution

Use explode( ) if what separates the pieces is a constant string:

$words = explode(' ','My sentence is not very complicated');

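Use split( ) or preg_split( ) if you need a POSIX or Perl regular expression to describe the separator. Note that split( ) and spliti( ) were deprecated in PHP 5.3 and removed in PHP 7.0, so preg_split( ) is the only one of these available on current versions: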

$words = split(' +','This sentence  has  some extra whitespace  in it.');
$words = preg_split('/\d\. /','my day: 1. get up 2. get dressed 3. eat toast');
$lines = preg_split('/[\n\r]+/',$_REQUEST['textarea']);

Use spliti( ) or the /i flag to preg_split( ) for case-insensitive separator matching:

$words = spliti(' x ','31 inches x 22 inches X 9 inches');
$words = preg_split('/ x /i','31 inches x 22 inches X 9 inches');
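For the <textarea> case in the Solution, it may matter whether blank lines should survive the split. The sketch below uses a hypothetical $input string in place of real form data and compares the pattern shown above with one that matches a single line ending at a time:

```php
<?php
// Hypothetical form input: mixed line endings and one blank line.
$input = "first line\r\nsecond line\n\nfourth line";

// The pattern from the Solution treats a run of newline characters
// as one separator, so the blank line disappears.
$collapsed = preg_split('/[\n\r]+/', $input);

// Matching exactly one line ending per separator keeps the blank line
// as an empty string in the result.
$preserved = preg_split('/\r\n|\r|\n/', $input);

print_r($collapsed);
print_r($preserved);
```

Here $collapsed holds three elements and $preserved holds four, the third being an empty string.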

1.12.3. Discussion

The simplest solution of the bunch is explode( ). Pass it your separator string, the string to be separated, and an optional limit on how many elements should be returned:

$dwarves = 'dopey,sleepy,happy,grumpy,sneezy,bashful,doc';
$dwarf_array = explode(',',$dwarves);

Now $dwarf_array is a seven-element array:

print_r($dwarf_array);
Array
(
    [0] => dopey
    [1] => sleepy
    [2] => happy
    [3] => grumpy
    [4] => sneezy
    [5] => bashful
    [6] => doc
)

If the specified limit is less than the number of possible chunks, the last chunk contains the remainder:

$dwarf_array = explode(',',$dwarves,5);
print_r($dwarf_array);
Array
(
    [0] => dopey
    [1] => sleepy
    [2] => happy
    [3] => grumpy
    [4] => sneezy,bashful,doc
)
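One common use of the limit is to split a string at only the first occurrence of the separator. A minimal sketch, assuming a hypothetical key=value line whose value itself contains the separator:

```php
<?php
// Hypothetical configuration line; the value contains "=" as well.
$line = 'options=a=1;b=2';

// A limit of 2 splits only at the first "=", so the remainder
// of the string stays intact in the second element.
list($key, $value) = explode('=', $line, 2);

print "$key\n";
print "$value\n";
```

With the limit in place, $key is "options" and $value is "a=1;b=2".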

The separator is treated literally by explode( ). If you specify a comma and a space as a separator, it breaks the string only on a comma followed by a space, not on a comma alone or a space alone.
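A short illustration of this literal matching, using a made-up $list string. The preg_split( ) call is included only for contrast, to show what splitting on either character would look like:

```php
<?php
// Made-up input with three different separators between items.
$list = 'apple, banana,cherry date';

// explode( ) splits only where a comma is immediately followed by a space.
$literal = explode(', ', $list);

// A character class splits on any run of commas and spaces instead.
$either = preg_split('/[, ]+/', $list);

print_r($literal);
print_r($either);
```

$literal has two elements ("apple" and "banana,cherry date"), while $either has four.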

With split( ), you have more flexibility. Instead of a string literal as a separator, it uses a POSIX regular expression:

$more_dwarves = 'cheeky,fatso, wonder boy, chunky,growly, groggy, winky';
$more_dwarf_array = split(', ?',$more_dwarves);

This regular expression splits on a comma followed by an optional space, which handles all the new dwarves properly. Names that contain a space aren't broken up, and the names are separated correctly whether the delimiter is "," or ", ":

print_r($more_dwarf_array);
Array
(
    [0] => cheeky
    [1] => fatso
    [2] => wonder boy
    [3] => chunky
    [4] => growly
    [5] => groggy
    [6] => winky
)
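On PHP versions where split( ) is not available, the same result can be had from preg_split( ). A minimal sketch, assuming the only change needed is wrapping the pattern in / delimiters:

```php
<?php
$more_dwarves = 'cheeky,fatso, wonder boy, chunky,growly, groggy, winky';

// Same pattern as the split( ) call, wrapped in / delimiters for PCRE:
// a comma followed by an optional space.
$more_dwarf_array = preg_split('/, ?/', $more_dwarves);

print_r($more_dwarf_array);
```

This produces the same seven-element array shown above.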

Similar to split( ) is preg_split( ), which uses a Perl-compatible regular-expression engine instead of a POSIX regular-expression engine. With preg_split( ), you can take advantage of various Perlish regular-expression extensions, as well as tricks such as including the separator text in the returned array of strings:

$math = "3 + 2 / 7 - 9";
$stack = preg_split('/ *([+\-\/*]) */',$math,-1,PREG_SPLIT_DELIM_CAPTURE);
print_r($stack);
Array
(
    [0] => 3
    [1] => +
    [2] => 2
    [3] => /
    [4] => 7
    [5] => -
    [6] => 9
)

The separator regular expression looks for the four mathematical operators (+, -, /, *), optionally surrounded by spaces. The PREG_SPLIT_DELIM_CAPTURE flag tells preg_split( ) to include in the returned array the text matched by any parenthesized part of the separator regular expression. Only the character class of mathematical operators is in parentheses, so the returned array doesn't contain any spaces.
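To see what the flag changes, it can help to run the same pattern with and without it. The variable names $pattern, $operands, and $tokens below are chosen for illustration:

```php
<?php
$math = "3 + 2 / 7 - 9";
$pattern = '/ *([+\-\/*]) */';

// Without the flag, the operators are discarded along with the spaces.
$operands = preg_split($pattern, $math);

// With the flag, each captured operator becomes its own array element.
$tokens = preg_split($pattern, $math, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($operands);
print_r($tokens);
```

$operands contains only the four numbers, while $tokens interleaves the numbers with the operators between them.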

1.12.4. See Also

Regular expressions are discussed in more detail in Chapter 13; documentation on explode( ) at http://www.php.net/explode, split( ) at http://www.php.net/split, and preg_split( ) at http://www.php.net/preg-split.




Copyright © 2003 O'Reilly & Associates. All rights reserved.