Mapping files into memory with Python's mmap module: a detailed usage guide

Shared by feiwen

Reposted from: http://blog.chinaunix.net/uid-20393955-id-1645587.html



The examples use the following text, saved as lorem.txt:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec
egestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo,
a elementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla
facilisi. Sed tristique eros eu libero. Pellentesque vel
arcu. Vivamus purus orci, iaculis ac, suscipit sit amet, pulvinar eu,
lacus. Praesent placerat tortor sed nisl. Nunc blandit diam egestas
dui. Pellentesque habitant morbi tristique senectus et netus et
malesuada fames ac turpis egestas. Aliquam viverra fringilla
leo. Nulla feugiat augue eleifend nulla. Vivamus mauris. Vivamus sed
mauris in nibh placerat egestas. Suspendisse potenti. Mauris
massa. Ut eget velit auctor tortor blandit sollicitudin. Suspendisse
imperdiet justo.

        
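Reading data:
The mmap() function creates a memory-mapped file. Its first argument is a file descriptor, obtained either from a file object's fileno() method or from os.open(). The caller is responsible for opening the file before calling mmap() and for closing it once the mapping is no longer needed. The second argument is the size, in bytes, of the portion of the file to map. A value of 0 maps the whole file. A value larger than the current file size extends the file on Windows; on Unix the call fails with an error instead. The optional access argument selects the kind of mapping: ACCESS_READ (read-only), ACCESS_WRITE (write-through, changes go to the file) or ACCESS_COPY (copy-on-write, changes stay in memory).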
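As a small illustration of the two positional arguments described above, the sketch below maps a file through a descriptor obtained from os.open() and returns either the whole file (length 0) or only a prefix. The helper name read_mapped and the temporary sample file are made up for this example, and the code assumes Python 3, where a mapping holds bytes.

```python
import mmap
import os
import tempfile


def read_mapped(path, length=0):
    """Return the mapped bytes of `path`; length 0 maps the whole file.

    `read_mapped` is a hypothetical helper, not part of the mmap module.
    """
    # O_BINARY only exists on Windows; elsewhere it contributes nothing.
    fd = os.open(path, os.O_RDONLY | getattr(os, 'O_BINARY', 0))
    try:
        with mmap.mmap(fd, length, access=mmap.ACCESS_READ) as m:
            return m[:]  # slicing copies the mapped bytes out
    finally:
        os.close(fd)  # the caller closes the descriptor it opened


# Build a throwaway file so the sketch does not depend on lorem.txt.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Lorem ipsum dolor sit amet')
    sample_path = tmp.name

whole = read_mapped(sample_path)      # length 0: the entire file
prefix = read_mapped(sample_path, 5)  # only the first 5 bytes
os.remove(sample_path)

print(whole)
print(prefix)
```

Passing a length smaller than the file simply yields a shorter mapping, which can be useful when only a header needs to be inspected.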


import mmap

# Open in binary mode: a mapping exposes raw bytes, not text.
with open('lorem.txt', 'rb') as f:
    # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        print('First 10 bytes via read :', m.read(10))
        print('First 10 bytes via slice:', m[:10])
        print('2nd   10 bytes via read :', m.read(10))

Output:

$ python mmap_read.py
First 10 bytes via read : b'Lorem ipsu'
First 10 bytes via slice: b'Lorem ipsu'
2nd   10 bytes via read : b'm dolor si'
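The output above suggests that read() moves an internal position while slicing does not. A minimal sketch that makes this visible with tell() follows; it assumes Python 3, and the temporary sample file stands in for lorem.txt.

```python
import mmap
import os
import tempfile

# Throwaway sample file; its content is an assumption for this sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Lorem ipsum dolor sit amet')
    sample_path = tmp.name

positions = []
with open(sample_path, 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        positions.append(m.tell())  # position before any read
        first = m.read(10)          # read() advances the position
        positions.append(m.tell())
        sliced = m[:10]             # slicing leaves the position alone
        positions.append(m.tell())
        second = m.read(10)         # continues where the first read stopped
        positions.append(m.tell())

os.remove(sample_path)
print(positions)
```

A mapping therefore behaves like a file object and like a byte string at the same time, and the two styles can be mixed freely.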
 
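Writing data:
With the default access mode, assigning to a slice of the mapping changes the file itself, so the file must be opened for update ('r+').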

import mmap
import shutil

# Copy the example file so the original stays untouched
shutil.copyfile('lorem.txt', 'lorem_copy.txt')

word = b'consectetuer'
reversed_word = word[::-1]  # renamed to avoid shadowing the built-in reversed()
print('Looking for    :', word)
print('Replacing with :', reversed_word)

with open('lorem_copy.txt', 'r+b') as f:
    with mmap.mmap(f.fileno(), 0) as m:
        print('Before:')
        print(m.readline().rstrip())
        m.seek(0)  # rewind

        loc = m.find(word)
        # The replacement must be exactly as long as the slice it overwrites.
        m[loc:loc + len(word)] = reversed_word
        m.flush()

        m.seek(0)  # rewind
        print('After :')
        print(m.readline().rstrip())

        f.seek(0)  # rewind
        print('File  :')
        print(f.readline().rstrip())

Output:

$ python mmap_write_slice.py
Looking for    : b'consectetuer'
Replacing with : b'reutetcesnoc'
Before:
b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec'
After :
b'Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec'
File  :
b'Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec'
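The in-place replacement above only works because the new word has the same length as the old one, and because the word is actually found. A hedged sketch of a helper that checks both conditions is shown below; replace_in_place is a hypothetical name, the sample file is temporary, and Python 3 is assumed.

```python
import mmap
import os
import tempfile


def replace_in_place(path, old, new):
    """Overwrite the first occurrence of `old` with `new`; return its offset or -1.

    Hypothetical helper: slice assignment cannot grow or shrink a mapping,
    so `new` must be exactly as long as `old`.
    """
    if len(old) != len(new):
        raise ValueError('replacement must have the same length')
    with open(path, 'r+b') as f:
        with mmap.mmap(f.fileno(), 0) as m:
            loc = m.find(old)
            if loc != -1:  # find() returns -1 when nothing matches
                m[loc:loc + len(old)] = new
                m.flush()  # push the change out to the file
            return loc


# Throwaway sample file; its content is an assumption for this sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit.')
    sample_path = tmp.name

found_at = replace_in_place(sample_path, b'consectetuer', b'reutetcesnoc')
missing = replace_in_place(sample_path, b'not-there', b'123456789')
with open(sample_path, 'rb') as f:
    result = f.read()
os.remove(sample_path)

print(found_at, missing)
print(result)
```

Checking the return value of find() matters: with -1 the slice bounds would no longer describe the word, and the assignment would fail or touch the wrong bytes.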
 

With ACCESS_COPY, changes are made only in memory and the file stored on disk is left unchanged:

import mmap
import shutil

# Copy the example file so the original stays untouched
shutil.copyfile('lorem.txt', 'lorem_copy.txt')

word = b'consectetuer'
reversed_word = word[::-1]  # renamed to avoid shadowing the built-in reversed()

with open('lorem_copy.txt', 'r+b') as f:
    # ACCESS_COPY: writes go to a private in-memory copy, never to the file.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
        print('Memory Before:')
        print(m.readline().rstrip())
        print('File Before  :')
        print(f.readline().rstrip())
        print()

        m.seek(0)  # rewind
        loc = m.find(word)
        m[loc:loc + len(word)] = reversed_word

        m.seek(0)  # rewind
        print('Memory After :')
        print(m.readline().rstrip())

        f.seek(0)  # rewind
        print('File After   :')
        print(f.readline().rstrip())

Output:

$ python mmap_write_copy.py
Memory Before:
b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec'
File Before  :
b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec'

Memory After :
b'Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec'
File After   :
b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec'
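To compare write-through and copy-on-write side by side, the sketch below applies the same edit under ACCESS_WRITE and ACCESS_COPY and reports what ends up in memory and on disk. The function name file_after_edit and the one-word sample file are assumptions made for illustration; Python 3 is assumed.

```python
import mmap
import os
import tempfile


def file_after_edit(access):
    """Reverse a mapping opened with `access`; return (bytes in memory, bytes on disk).

    Hypothetical helper for comparing access modes.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b'consectetuer')
        path = tmp.name
    try:
        with open(path, 'r+b') as f:
            with mmap.mmap(f.fileno(), 0, access=access) as m:
                data = m[:]
                m[:] = data[::-1]  # same length, so the assignment is allowed
                in_memory = m[:]
        # Re-open the file to see what was actually stored.
        with open(path, 'rb') as f:
            on_disk = f.read()
    finally:
        os.remove(path)
    return in_memory, on_disk


write_result = file_after_edit(mmap.ACCESS_WRITE)
copy_result = file_after_edit(mmap.ACCESS_COPY)
print('ACCESS_WRITE:', write_result)
print('ACCESS_COPY :', copy_result)
```

ACCESS_COPY is handy for trying out modifications on a large file without risking the original data.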
 

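Regular expressions:
A memory-mapped file can also be searched with regular expressions, because the re module accepts a mapping in the same way as a byte string: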

import mmap
import re

# The pattern must be a bytes pattern because the mapping holds bytes.
pattern = re.compile(rb'(\.\W+)?([^.]?nulla[^.]*?\.)',
                     re.DOTALL | re.IGNORECASE | re.MULTILINE)

with open('lorem.txt', 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Each match is a tuple of groups; group 2 is the sentence containing "nulla".
        for match in pattern.findall(m):
            print(match[1].replace(b'\n', b' '))

Output:

$ python mmap_regex.py
b'Nulla facilisi.'
b'Nulla feugiat augue eleifend nulla.'
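When the position of each match matters more than the matched text, finditer() can be used on the mapping in the same way. The following is a minimal sketch under the assumption of Python 3; the pattern and the short sample file are invented for the example.

```python
import mmap
import os
import re
import tempfile

# A bytes pattern is required because the mapping holds bytes, not str.
word = re.compile(rb'nulla', re.IGNORECASE)

# Throwaway sample file; its content is an assumption for this sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Nulla facilisi. Sed tristique. Nulla feugiat nulla.')
    sample_path = tmp.name

with open(sample_path, 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # finditer() scans the mapping directly; each match carries its byte offset.
        hits = [(match.start(), match.group()) for match in word.finditer(m)]

os.remove(sample_path)
print(hits)
```

The offsets are byte offsets into the file, so they can be passed straight to seek() or used as slice bounds on the mapping.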
 
Reference: mmap (http://docs.python.org/lib/module-mmap.html)