14.8 The mmap Module
The mmap module supplies
memory-mapped file objects. An mmap object behaves
similarly to a plain (not Unicode) string, so you can often pass an
mmap object where a plain string is expected.
However, there are differences:
An mmap object does not supply the methods of a string object.
An mmap object is mutable, while string objects are immutable.
An mmap object also corresponds to an open file and behaves
polymorphically to a Python file object (as covered in Chapter 10).
An mmap object m can be
indexed or sliced, yielding plain strings. Since
m is mutable, you can also assign to an
indexing or slicing of m. However, when
you assign to a slice of m, the right-hand
side of the assignment statement must be a string of exactly the same
length as the slice you're assigning to. Therefore,
many of the useful tricks available with list slice assignment
(covered in Chapter 4) do not apply to
mmap slice assignment.
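The slicing rules above can be sketched as follows. This is a minimal illustration, assuming a Python 3 interpreter (where slices of an mmap object are bytes objects rather than plain strings) and a hypothetical ten-byte scratch file created with tempfile.

```python
import mmap
import tempfile

# Hypothetical scratch file holding ten bytes to map.
with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    m = mmap.mmap(f.fileno(), 10)

    first = m[0:4]        # slicing yields an immutable copy of the bytes
    m[0:4] = b"abcd"      # right-hand side has the same length: allowed
    after = m[:]

    try:
        m[0:4] = b"xy"    # different length: rejected
        resized = True
    except (IndexError, ValueError):
        resized = False

    m.close()
```

Because the length check rejects the second assignment, the mapping keeps its ten bytes throughout.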
Module mmap supplies a factory function that is
different on Unix-like systems and Windows.
mmap(filedesc,length,tagname='')  # Windows
mmap(filedesc,length,flags=MAP_SHARED,prot=PROT_READ|PROT_WRITE)  # Unix
Creates and returns an mmap object
m that maps into memory the first
length bytes of the file indicated by file
descriptor filedesc.
filedesc must normally be a file
descriptor opened for both reading and writing (except, on Unix-like
platforms, when argument prot requests
only reading or only writing). File descriptors are covered in
Section 10.2.8. To get an mmap object m that refers to a Python file
object f, use m=mmap.mmap(f.fileno( ),length).
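As a hedged sketch of the portable two-argument call, the helper below wraps a Python file object in a mapping. The name map_file and the temporary file are illustrative assumptions, and the code assumes Python 3, where mapped data are bytes.

```python
import mmap
import tempfile

def map_file(f, length):
    """Return an mmap object covering the first `length` bytes of file object f.

    map_file is a hypothetical helper, not part of the mmap module.
    """
    f.flush()                          # make sure buffered data reached the file
    return mmap.mmap(f.fileno(), length)

with tempfile.TemporaryFile() as f:    # opened for both reading and writing
    f.write(b"hello, mmap")
    m = map_file(f, 5)
    mapped = m[:]                      # only the first five bytes are mapped
    mapped_len = len(m)
    m.close()
```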
On Windows only, you can pass a
string tagname to give an explicit tag
name for the memory mapping. This tag name lets you have several
memory mappings on the same file, but this functionality is rarely
necessary. Calling mmap with only two arguments
has the advantage of keeping your code portable between Windows and
Unix-like platforms. On Windows, all memory mappings are readable and
writable and shared between processes, so that all processes with a
memory mapping on a file can see changes made by each such process.
On Unix-like platforms only, you can
pass mmap.MAP_PRIVATE as the
flags argument to get a mapping that is
private to your process and copy-on-write.
mmap.MAP_SHARED, the default, gets a mapping that
is shared with other processes, so that all processes mapping the
file can see changes made by one process (same as on Windows). You
can pass mmap.PROT_READ as the
prot argument to get a mapping that you
can only read, not write. Passing mmap.PROT_WRITE
gets a mapping that you can only write, not read. The bitwise-OR
mmap.PROT_READ|mmap.PROT_WRITE, the default, gets
a mapping that you can both read and write (same as on Windows).
14.8.1 Methods of mmap Objects
An mmap object m
supplies the following methods.
m.close( )
Closes the file of m.
m.find(str,start=0)
Returns the lowest index i greater than or equal to start such that
str==m[i:i+len(str)]. If no such i exists, m.find returns -1. This is
the same functionality as for the find method of string objects,
covered in Chapter 9.
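A short sketch of find, assuming Python 3 (so the argument is a bytes object) and an illustrative temporary file:

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"spam and eggs and spam")
    f.flush()
    m = mmap.mmap(f.fileno(), 22)

    first = m.find(b"spam")        # search from the start of the mapping
    second = m.find(b"spam", 1)    # search from index 1 onward
    missing = m.find(b"ham")       # no match anywhere

    m.close()
```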
m.flush([offset,n])
Ensures that all changes made to m also exist on m's file. Until you
call m.flush, it's uncertain whether the file reflects the current
state of m. You can pass a starting byte offset offset and a byte
count n to limit the flushing effect's guarantee to a slice of m. You
must pass both arguments, or neither: it is an error to call m.flush
with exactly one argument.
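The following sketch, assuming Python 3 and a hypothetical temporary file, changes a mapping, flushes it with no arguments, and then reads the file back through ordinary file I/O to confirm that the change reached the file.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()      # hypothetical scratch file
try:
    os.write(fd, b"----------")
    m = mmap.mmap(fd, 10)
    m[0:5] = b"HELLO"              # change the mapping in memory
    m.flush()                      # no arguments: flush the whole mapping
    with open(path, "rb") as check:
        on_disk = check.read()     # what ordinary file I/O sees afterward
    m.close()
finally:
    os.close(fd)
    os.remove(path)
```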
m.move(dstoff,srcoff,n)
Like the slice assignment m[dstoff:dstoff+n]=m[srcoff:srcoff+n], but
potentially faster. The source and destination slices can overlap.
Apart from such potential overlap, move does not affect the source
slice (i.e., the move method copies bytes but does not move them,
despite the method's name).
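To illustrate that move copies rather than moves, here is a small sketch assuming Python 3 and an illustrative ten-byte temporary file. The second call uses overlapping source and destination ranges.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"abcdefghij")
    f.flush()
    m = mmap.mmap(f.fileno(), 10)

    m.move(5, 0, 3)          # copy bytes 0..2 over bytes 5..7
    after_copy = m[:]        # the source bytes 0..2 are still in place

    m.move(1, 0, 4)          # overlapping ranges: copy bytes 0..3 to 1..4
    after_overlap = m[:]

    m.close()
```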
m.read(n)
Reads and returns a string s containing up to n bytes starting from
m's file pointer, then advances m's file pointer by len(s). If there
are fewer than n bytes between m's file pointer and m's length,
returns the bytes available. In particular, if m's file pointer is at
the end of m, returns the empty string ''.
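A brief sketch of how read behaves as the file pointer approaches the end of the mapping, assuming Python 3 (where read returns bytes) and an illustrative six-byte temporary file:

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"abcdef")
    f.flush()
    m = mmap.mmap(f.fileno(), 6)

    head = m.read(4)     # four bytes are available, pointer moves to 4
    tail = m.read(4)     # only two bytes remain, so only two are returned
    empty = m.read(4)    # pointer is at the end: empty result

    m.close()
```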
m.read_byte( )
Returns a string of length 1 containing the character at m's file
pointer, then advances m's file pointer by 1. m.read_byte( ) is
similar to m.read(1). However, if m's file pointer is at the end of
m, m.read(1) returns the empty string '', while m.read_byte( ) raises
a ValueError exception.
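The end-of-mapping difference between read and read_byte can be sketched as below. The sketch assumes Python 3, where read_byte returns an integer rather than a one-character string, and uses an illustrative one-byte temporary file.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"A")
    f.flush()
    m = mmap.mmap(f.fileno(), 1)

    value = m.read_byte()    # in Python 3 this is the integer 65, not "A"
    at_end = m.read(1)       # pointer is now at the end: empty result

    try:
        m.read_byte()        # nothing left to read
        raised = False
    except ValueError:
        raised = True

    m.close()
```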
m.readline( )
Reads and returns one line from the file of m, from m's current file
pointer up to the next '\n', included (or up to the end of m, if
there is no '\n'), then advances m's file pointer to point just past
the bytes just read. If m's file pointer is at the end of m, readline
returns the empty string ''.
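As a sketch, a loop that calls readline until it returns an empty result collects every line of the mapping, including a last line with no trailing newline. This assumes Python 3 and an illustrative temporary file.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"one\ntwo\nlast")       # the last line has no trailing newline
    f.flush()
    m = mmap.mmap(f.fileno(), 12)

    lines = []
    while True:
        line = m.readline()
        if not line:                 # empty result: pointer is at the end of m
            break
        lines.append(line)

    m.close()
```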
m.resize(n)
Changes the length of m, so that len(m) becomes n. Does not affect
the size of m's file. m's length and the file's size are independent.
To set m's length to be equal to the file's size, call
m.resize(m.size( )). If m's length is larger than the file's size, m
is padded with null bytes (\x00).
m.seek(pos,how=0)
Sets the file pointer of m to the integer byte offset pos. how
indicates the reference point (point 0): when how is 0, the reference
point is the start of the file; when 1, m's current file pointer;
when 2, the end of m. A seek that tries to set m's file pointer to a
negative byte offset, or to a positive offset beyond m's length,
raises a ValueError exception.
m.size( )
Returns the length (number of bytes) of the file of m, not the length
of m itself. To get the length of m, use len(m).
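The distinction between the file's size and the mapping's length can be sketched by mapping only part of a file. This assumes Python 3 and an illustrative ten-byte temporary file of which four bytes are mapped.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")           # the file holds ten bytes
    f.flush()
    m = mmap.mmap(f.fileno(), 4)     # but only the first four are mapped

    mapping_length = len(m)          # length of the mapping itself
    file_size = m.size()             # size of the underlying file

    m.close()
```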
m.tell( )
Returns the current position of the file pointer of m, as a byte
offset from the start of m's file.
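A sketch of the three reference points accepted by seek, with tell reporting the resulting position. It assumes Python 3 and an illustrative ten-byte temporary file.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    m = mmap.mmap(f.fileno(), 10)

    m.seek(3)                # how defaults to 0: offset from the start
    from_start = m.tell()
    m.seek(2, 1)             # how is 1: offset from the current pointer
    from_current = m.tell()
    m.seek(-1, 2)            # how is 2: offset from the end of m
    from_end = m.tell()

    try:
        m.seek(11)           # beyond the length of m
        out_of_range = False
    except ValueError:
        out_of_range = True

    m.close()
```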
m.write(str)
Writes the bytes in str into m at the current position of m's file
pointer, overwriting the bytes that were there, and then advances m's
file pointer by len(str). If there aren't at least len(str) bytes
between m's file pointer and the length of m, write raises a
ValueError exception.
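The following sketch shows a write that fits exactly and one that does not fit. It assumes Python 3 (so the data written are bytes) and an illustrative ten-byte temporary file.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"..........")
    f.flush()
    m = mmap.mmap(f.fileno(), 10)

    m.seek(6)
    m.write(b"abcd")         # exactly fills bytes 6..9
    position = m.tell()      # the pointer advanced by len(b"abcd")

    m.seek(8)
    try:
        m.write(b"xyz")      # three bytes do not fit in the two remaining
        overflow = False
    except ValueError:
        overflow = True

    content = m[:]
    m.close()
```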
m.write_byte(byte)
Writes byte, which must be a single-character string, into mapping m
at the current position of m's file pointer, overwriting the byte
that was there, and then advances m's file pointer by 1. When x is a
single-character string, m.write_byte(x) is similar to m.write(x).
However, if m's file pointer is at the end of m, m.write_byte(x)
silently does nothing, while m.write(x) raises a ValueError
exception. Note that this is the reverse of the relationship between
read and read_byte at end-of-file: write and read_byte raise
ValueError, while read and write_byte don't.
14.8.2 Using mmap Objects for IPC
The
way in which processes communicate using mmap is
similar to IPC using files: one process can write data, and another
process can later read the same data back. Since an
mmap object rests on an underlying file, you can
also have some processes doing I/O directly on the file, as covered
in Chapter 10, while others use
mmap to access the same file. You can choose
between mmap and I/O on file objects on the basis
of convenience: the functionality is the same. For example, here is a
simple program that uses file I/O to make the contents of a file
equal to the last line interactively typed by the user:
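fileob = open('xxx','w')
while True:
    data = raw_input('Enter some text:')
    fileob.seek(0)        # rewind to the start of the file
    fileob.write(data)    # overwrite with the line just typed
    fileob.truncate( )    # drop leftovers from a longer previous line
    fileob.flush( )       # make the change visible to other processes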
And here is another simple program that, when run in the same
directory as the former, uses mmap (and the
time.sleep function, covered in Chapter 12) to check every second for changes to the file
and print out the file's new contents:
import mmap, os, time
mx = mmap.mmap(os.open('xxx',os.O_RDWR), 1)
last = None
while True:
    mx.resize(mx.size( ))    # make the mapping as long as the file
    data = mx[:]
    if data != last:         # print only when the contents changed
        print data
        last = data
    time.sleep(1)
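The same idea can be sketched within a single process, with one side using ordinary file I/O and the other a mapping on the same file. This is only an illustration under stated assumptions: Python 3, a hypothetical temporary file standing in for 'xxx', and a platform (such as Linux) where a shared mapping and file I/O on the same file see each other's changes promptly.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()          # hypothetical stand-in for file 'xxx'
try:
    os.write(fd, b"first")
    m = mmap.mmap(fd, 5)               # "reader" side: a shared mapping
    seen_before = m[:]

    with open(path, "r+b") as writer:  # "writer" side: ordinary file I/O
        writer.write(b"again")         # overwrite the same five bytes
        writer.flush()

    seen_after = m[:]                  # the mapping shows the new contents
    m.close()
finally:
    os.close(fd)
    os.remove(path)
```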