Google's Python Class 五 (Python Dict and File)

2012-12-24

Googles Python Class 5 (Python Dict and File)原文:http://code.google.com/edu/languages/google-pyth

Google's Python Class 5 (Python Dict and File)

原文:http://code.google.com/edu/languages/google-python-class/dict-files.html

? ## Can build up a dict by starting with the the empty dict {}? ## and storing key/value pairs into the dict like this:? ## dict[key] = value-for-that-key? dict = {}? dict['a'] = 'alpha'? dict['g'] = 'gamma'? dict['o'] = 'omega'? print dict ?## {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}? print dict['a'] ? ? ## Simple lookup, returns 'alpha'? dict['a'] = 6 ? ? ? ## Put new key/value into dict? 'a' in dict ? ? ? ? ## True? ## print dict['z'] ? ? ? ? ? ? ? ? ?## Throws KeyError? if 'z' in dict: print dict['z'] ? ? ## Avoid KeyError? print dict.get('z') ?## None (instead of KeyError)

Google's Python Class 五 (Python Dict and File)

for循环默认会遍历字典的key. 这些key的顺序是不确定的. dict.keys() 和 dict.values()会返回键或值的序列. 字典有一个items()方法，可以把key-value转换成一个数组，这个数组中的每个元素都是两个元素形式的元组，这是遍历操作所有元素的最有效方法。

? ## By default, iterating over a dict iterates over its keys.? ## Note that the keys are in a random order.? for key in dict: print key? ## prints a g o? ? ## Exactly the same as above? for key in dict.keys(): print key? ## Get the .keys() list:? print dict.keys() ?## ['a', 'o', 'g']? ## Likewise, there's a .values() list of values? print dict.values() ?## ['alpha', 'omega', 'gamma']? ## Common case -- loop over the keys in sorted order,? ## accessing each key/value? for key in sorted(dict.keys()):? ? print key, dict[key]? ? ## .items() is the dict expressed as (key, value) tuples? print dict.items() ?## ?[('a', 'alpha'), ('o', 'omega'), ('g', 'gamma')]? ## This loop syntax accesses the whole dict by looping? ## over the .items() tuple list, accessing one (key, value)? ## pair on each iteration.? for k, v in dict.items(): print k, '>', v? ## a > alpha ? ?o > omega ? ? g > gamma

有很多"iter"形式的变体函数，如iterkeys(), itervalues() 和 iteritems(),他们可以避免构造新的list -- 当数据量比较大的时候效果明显. 但我依然建议使用keys() 和 values() ,因为他们的名字比较有意义. 并且在Python3000以后性能有了较大改善，iterkeys不再需要了。

提示:从性能上考虑，字典是你应当使用的最有效的工具，在任何需要数据操作、传递的时候尽量使用.

Dict 格式化

% 可以方便的通过字典进行别名调用赋值:

? hash = {}? hash['word'] = 'garfield'? hash['count'] = 42? s = 'I want %(count)d copies of %(word)s' % hash ?# %d for int, %s for string? # 'I want 42 copies of garfield'

Del

"del" 可以删除变量定义，同时也可以删除list或其他集合中的数据:

? var = 6? del var ?# var no more!? ? list = ['a', 'b', 'c', 'd']? del list[0] ? ? ## Delete first element? del list[-2:] ? ## Delete last two elements? print list ? ? ?## ['b']? dict = {'a':1, 'b':2, 'c':3}? del dict['b'] ? ## Delete 'b' entry? print dict ? ? ?## {'a':1, 'c':3}

Files

open()函数会打开一个文件，按照正常的方法我们可以进行读写操作. f = open('name', 'r') 按照只读方式打开文件并把指针交给f，在结束操作后调用f.close()关闭文件。'r'、'w'、'a'分别表示读写和追加。'rU'被称作"Universal"模式，它可以自动在每一行结尾添加一个简单的 '\n'来进行换行，这样你就不需要手动追加了.for循环是最有效的遍历文件所有行的方法:

? # Echo the contents of a file? f = open('foo.txt', 'rU')? for line in f: ? ## iterates over the lines of the file? ? print line, ? ?## trailing , so print does not add an end-of-line char? ? ? ? ? ? ? ? ? ?## since 'line' already includes the end-of line.? f.close()

Reading one line at a time has the nice quality that not all the file needs to fit in memory at one time -- handy if you want to look at every line in a 10 gigabyte file without using 10 gigabytes of memory. The f.readlines() method reads the whole file into memory and returns its contents as a list of its lines. The f.read() method reads the whole file into a single string, which can be a handy way to deal with the text all at once, such as with regular expressions we'll see later.

For writing, f.write(string) method is the easiest way to write data to an open output file. Or you can use "print" with an open file, but the syntax is nasty: "print >> f, string". In python 3000, the print syntax will be fixed to be a regular function call with a file= optional argument: "print(string, file=f)".

Unicode文件

"codecs"模块提供访问unicode编码文件的功能:

import codecsf = codecs.open('foo.txt', 'rU', 'utf-8')for line in f:? # 这里的line表示了一个unicode编码字符串

For writing, use f.write() since print does not fully support unicode.

持续改进编程

编写一个Python程序的时候，不要尝试一次性把所有东西写完. 试着把程序分成很多里程碑, 比如 "第一步是从这些词中提取list" ，这样逐步对程序进行完善并测试，这是python的编写后直接运行带给我们的优势，我们要尽情使用。

热点排行

perl python

Google's Python Class 五 (Python Dict and File)