4.6 Sequence Operations

Python supports a variety of operations that can be applied to sequence types, including strings, lists, and tuples.

4.6.1 Sequences in General

Sequences are containers with items accessible by indexing or slicing, as we'll discuss shortly. The built-in len function takes a container as an argument and returns the number of items in the container. The built-in min and max functions take one argument, a non-empty sequence (or other iterable) whose items are comparable, and they return the smallest and largest items in the sequence, respectively. You can also call min and max with multiple arguments, in which case they return the smallest and largest arguments, respectively.
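As a brief illustration of the three built-ins just described, the following sketch uses a made-up list of numbers (the name temps and its values are assumptions chosen only for the example):

```python
# Hypothetical sample data, chosen only for illustration.
temps = [18, 25, 9, 30]

count = len(temps)           # number of items: 4
lowest = min(temps)          # smallest item: 9
highest = max(temps)         # largest item: 30

# With several arguments, min and max compare the arguments themselves.
lowest_arg = min(7, 3, 5)    # 3
highest_arg = max(7, 3, 5)   # 7
```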

4.6.1.1 Coercion and conversions

There is no implicit coercion between different sequence types except that normal strings are coerced to Unicode strings if needed. Conversion to strings is covered in detail in Chapter 9. You can call the built-in tuple and list functions with a single argument (a sequence or other iterable) to get an instance of the type you're calling, with the same items in the same order as in the argument.
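A minimal sketch of explicit conversion with tuple and list follows. The variable names are illustrative assumptions, not part of any API:

```python
letters = list("abc")     # a string is a sequence: ['a', 'b', 'c']
pair = tuple([1, 2])      # list to tuple: (1, 2)
back = list(pair)         # tuple to list: [1, 2]

# Conversion builds a new object; the argument itself is left unchanged.
source = [1, 2, 3]
copied = list(source)
is_same_object = copied is source    # False
has_same_items = copied == source    # True
```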

4.6.1.2 Concatenation

You can concatenate sequences of the same type with the + operator. You can also multiply any sequence S by an integer n with the * operator. The result of S*n or n*S is the concatenation of n copies of S. If n is zero or less than zero, the result is an empty sequence of the same type as S.
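The rules above can be sketched as follows. The last part shows what happens when the two operands of + are sequences of different types; the flag name mixed_ok is an assumption introduced for the example:

```python
a = [1, 2] + [3]      # [1, 2, 3]
b = (0,) * 3          # (0, 0, 0)
c = 2 * "ab"          # 'abab'
d = [1, 2] * 0        # [] (n is zero)
e = "ab" * -1         # '' (n is negative)

# Concatenating a list and a tuple is not allowed.
try:
    [1, 2] + (3,)
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```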

4.6.1.3 Sequence membership

The x in S operator tests to see whether object x equals any item in the sequence S. It returns True if it does and False if it doesn't. Similarly, the x not in S operator is just like not (x in S).
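A short sketch of membership tests follows; the tuple of primes is sample data chosen for illustration. Note that the test is based on equality, so an equal object of a different type also counts as a member:

```python
primes = (2, 3, 5, 7)

has_five = 5 in primes           # True
has_four = 4 in primes           # False
lacks_four = 4 not in primes     # True, same as not (4 in primes)

# Membership compares with ==, and 2.0 == 2.
has_float_two = 2.0 in primes    # True
```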

4.6.1.4 Indexing a sequence

The nth item of a sequence S is denoted by an indexing: S[n]. Indexing in Python is zero-based (i.e., the first item in S is S[0]). If S has L items, the index n may be 0, 1, ... up to and including L-1, but no larger. n may also be -1, -2, ... down to and including -L, but no smaller. A negative n indicates the same item in S as L+n does. In other words, S[-1] is the last element of S, S[-2] is the next-to-last one, and so on. For example:

x = [1,2,3,4]
x[1]                  # 2
x[-1]                 # 4

Using an index greater than or equal to L or less than -L raises an IndexError exception. Assigning to an item with an invalid index also raises an IndexError exception. You can add elements to a list, but to do so you assign to a slice, not an item, as we'll discuss shortly.
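The following sketch shows invalid indices being rejected, both when reading and when assigning, and then a slice assignment that does add an item. The flag names are assumptions introduced for the example:

```python
x = [1, 2, 3, 4]

try:
    x[4]                 # index equal to len(x): too large
except IndexError:
    read_failed = True

try:
    x[-5]                # index less than -len(x): too small
except IndexError:
    negative_failed = True

try:
    x[4] = 5             # item assignment cannot grow a list
except IndexError:
    write_failed = True

x[len(x):] = [5]         # assigning to a slice can: x is now [1, 2, 3, 4, 5]
```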

4.6.1.5 Slicing a sequence

You can denote a subsequence of S with a slicing, using the syntax S[i:j], where i and j are integers. S[i:j] is the subsequence of S from the ith item, included, to the jth item, excluded. Note that in Python, all ranges include the lower bound and exclude the upper bound. A slice is an empty subsequence if j is less than or equal to i (once any negative index has been converted as described below) or if i is greater than or equal to L, the length of S. You can omit i if it is equal to 0, so that the slice begins from the start of S, and you can omit j if it is greater than or equal to L, so that the slice extends all the way to the end of S. You can even omit both indices to mean the entire sequence: S[:]. Either or both indices may be less than 0. A negative index n indicates the same spot in S as L+n, just as in indexing. An index greater than or equal to L means the end of S, while a negative index less than or equal to -L means the start of S. Here are some examples:

x = [1,2,3,4]
x[1:3]                 # [2,3]
x[1:]                  # [2,3,4]
x[:2]                  # [1,2]
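A few more cases, sketched under the same rules, cover negative and out-of-range slice indices. Unlike indexing, slicing does not raise an exception for indices outside the sequence:

```python
x = [1, 2, 3, 4]

middle = x[-3:-1]        # same as x[1:3]: [2, 3]
to_end = x[2:100]        # upper index beyond the end: [3, 4]
from_start = x[-100:2]   # lower index before the start: [1, 2]
empty = x[3:1]           # upper index below lower index: []

whole = x[:]             # a new list with the same items as x
is_copy = whole == x and whole is not x    # True
```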

Slicing can also use the extended syntax S[i:j:k]. In Python 2.2, built-in sequences do not support extended-form slicing, but in Python 2.3 they do. Even in Python 2.2 and earlier, however, user-defined sequences can optionally support extended-form slicing. k is the stride of the slice, or the distance between successive indices. For example, S[i:j] is equivalent to S[i:j:1], S[::2] is the subsequence of S that includes all items that have an even index in S, and S[::-1] has the same items as S, but in reverse order.
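The stride can be sketched as follows, on an interpreter whose built-in sequences support extended-form slicing. The list s is sample data chosen so that each item equals its own index:

```python
s = [0, 1, 2, 3, 4, 5, 6]

evens = s[::2]               # items at even indices: [0, 2, 4, 6]
backward = s[::-1]           # [6, 5, 4, 3, 2, 1, 0]
every_other = s[1:6:2]       # indices 1, 3, 5: [1, 3, 5]
same = s[1:5:1] == s[1:5]    # a stride of 1 is the default: True

reversed_text = "abcdef"[::-1]   # strings work the same way: 'fedcba'
```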

4.6.2 Strings

String objects are immutable, so attempting to rebind or delete an item or slice of a string raises an exception. The items of a string object are strings of length 1. The slices of a string object are its substrings. String objects have several methods, which are covered in Chapter 9.
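As a small sketch of these points, the following indexes and slices a string, then shows that rebinding or deleting an item fails. The flag names are assumptions introduced for the example:

```python
s = "python"

first = s[0]       # an item is a string of length 1: 'p'
part = s[1:4]      # a slice is a substring: 'yth'

try:
    s[0] = "P"     # rebinding an item is not allowed
except TypeError:
    rebind_failed = True

try:
    del s[0]       # neither is deleting one
except TypeError:
    delete_failed = True
```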

4.6.3 Tuples

Tuple objects are immutable, so attempting to rebind or delete an item or slice of a tuple raises an exception. The items of a tuple are arbitrary objects, and may be of different types. The slices of a tuple are also tuples. Tuples have no normal methods.

4.6.4 Lists

List objects are mutable, so you may rebind or delete items and slices of a list. The items of a list are arbitrary objects, and may be of different types. The slices of a list are also lists.

4.6.4.1 Modifying a list

You can modify a list by assigning to an indexing. For instance:

x = [1,2,3,4]
x[1] = 42                # x is now [1,42,3,4]

Another way to modify a list object L is to use a slice of L as the target (left-hand side) of an assignment statement. The right-hand side of the assignment must also be a list. The left-hand side slice and the right-hand side list may each be of any length, which means that assigning to a slice can add items to the list or remove items from the list. For example:

x = [1,2,3,4]
x[1:3] = [22,33,44]      # x is now [1,22,33,44,4]
x[1:4] = [2,3]           # x back to [1,2,3,4]

Here are some important special cases:

  • Using the empty list [ ] as the right-hand side expression removes the target slice from L. In other words, L[i:j]=[ ] has the same effect as del L[i:j].

  • Using an empty slice of L as the left-hand side target inserts the items of the right-hand side list at the appropriate spot in L. In other words, L[i:i]=['a','b'] inserts the items 'a' and 'b' before the item that was at index i in L.

  • Using a slice that covers the entire list object, L[:], as the left-hand side target totally replaces the content of L.
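The three special cases can be sketched in sequence on one list. The snapshot names (after_remove, after_insert) and the name alias are assumptions introduced so that each intermediate state can be inspected:

```python
L = [1, 2, 3, 4, 5]

L[1:3] = []               # removes items 2 and 3
after_remove = L[:]       # [1, 4, 5]

L[1:1] = ['a', 'b']       # empty target slice: pure insertion
after_insert = L[:]       # [1, 'a', 'b', 4, 5]

alias = L                 # a second reference to the same list object
L[:] = [7, 8]             # replaces the content, not the object
still_same = alias is L   # True, and alias now shows [7, 8]
```

Because L[:] = ... changes the existing object rather than rebinding the name L, every other reference to the list sees the new content.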

You can delete an item or a slice from a list with del. For instance:

x = [1,2,3,4,5]
del x[1]                 # x is now [1,3,4,5]
del x[1:3]               # x is now [1,5]

4.6.4.2 In-place operations on a list

List objects define in-place versions of the + and * operators, which are used via augmented assignment statements. The augmented assignment statement L+=L1 has the effect of adding the items of list L1 to the end of L, while L*=n has the effect of adding n-1 copies of L to the end of L, so that L ends up holding n copies of its original items (if n is zero or less than zero, L becomes empty).
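A brief sketch of the in-place operators follows, contrasted with plain + and rebinding. The names alias, M, and other are assumptions introduced to show which list object each name refers to:

```python
L = [1, 2]
alias = L                    # second reference to the same list

L += [3]                     # L is now [1, 2, 3]
L *= 2                       # L is now [1, 2, 3, 1, 2, 3]
in_place = alias is L        # True: the original object was modified

# By contrast, plain + builds a new list and rebinds the name.
M = [1, 2]
other = M
M = M + [3]                  # M is a new list [1, 2, 3]
rebound = other is not M     # True: other still refers to [1, 2]
```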

4.6.4.3 List methods

List objects provide several methods, as shown in Table 4-3. Non-mutating methods return a result without altering the object to which they apply, while mutating methods may alter the object to which they apply. Many of the mutating methods behave like assignments to appropriate slices of the list. In Table 4-3, L and l indicate any list object, i any valid index in L, and x any object.

Table 4-3. List object methods

Method            Description

Non-mutating methods
L.count(x)        Returns the number of occurrences of x in L
L.index(x)        Returns the index of the first occurrence of item x in L or raises an exception if L has no such item

Mutating methods
L.append(x)       Appends item x to the end of L
L.extend(l)       Appends all the items of list l to the end of L
L.insert(i,x)     Inserts item x at index i in L
L.remove(x)       Removes the first occurrence of item x from L
L.pop([i])        Returns the value of the item at index i and removes it from L; if i is omitted, removes and returns the last item
L.reverse()       Reverses, in-place, the items of L
L.sort([f])       Sorts, in-place, the items of L, comparing items by f; if f is omitted, cmp is used as comparison function
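The methods in Table 4-3 can be exercised one after another on a small list, as in this sketch. The starting values are sample data, and sort is called without an argument:

```python
L = [3, 1, 2, 1]

ones = L.count(1)        # 2
where = L.index(1)       # 1 (first occurrence)

L.append(4)              # [3, 1, 2, 1, 4]
L.extend([5, 6])         # [3, 1, 2, 1, 4, 5, 6]
L.insert(0, 9)           # [9, 3, 1, 2, 1, 4, 5, 6]
L.remove(1)              # [9, 3, 2, 1, 4, 5, 6]
last = L.pop()           # returns 6; L is [9, 3, 2, 1, 4, 5]
first = L.pop(0)         # returns 9; L is [3, 2, 1, 4, 5]
L.reverse()              # [5, 4, 1, 2, 3]
result = L.sort()        # L is [1, 2, 3, 4, 5]; result is None
```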

All mutating methods of list objects except pop return None. The sort method takes one optional argument. If present, the argument must be a function that, when called with any two list items as arguments, returns -1, 0, or 1, depending on whether the first item is to be considered less than, equal to, or greater than the second item for sorting purposes. Passing the argument slows down the sort, although it makes it easy to sort small lists in flexible ways. The decorate-sort-undecorate idiom, presented in Chapter 17, is faster (and often less error-prone) than passing an argument to sort, and it's at least as flexible.
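As a rough sketch of the decorate-sort-undecorate idea (not necessarily the form presented in Chapter 17), the following sorts words without regard to case. The sample words and the choice of a lowercase key are assumptions made for the example, and enumerate requires Python 2.3 or later:

```python
words = ["banana", "Cherry", "apple"]

# Decorate: pair each item with its sort key. The original position i
# breaks ties so that the items themselves are never compared.
decorated = [(w.lower(), i, w) for i, w in enumerate(words)]

# Sort: a plain sort with no argument compares the tuples.
decorated.sort()

# Undecorate: keep only the original items, now in key order.
result = [w for key, i, w in decorated]    # ['apple', 'banana', 'Cherry']
```

The sort itself runs with the default comparison, which is why this approach avoids the cost of calling a comparison function for each pair of items.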
