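Decoding problem when reading from a file in Python 3
The string decodes correctly in the Python interpreter, but when the same text is stored in a file and read back, it no longer decodes correctly.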
>>> ss='<strong>\u96c5\u864e\u90ae\u7bb1\u6700\u4fbf\u4e8e\u60a8\u4e0e\u4eb2\u670b\u597d\u53cb\u4fdd\u6301\u8054\u7cfb\u3002</strong>'
>>> s2=bytes(ss,'gbk').decode('gb18030')
>>> s2
'<strong>雅虎邮箱最便于您与亲朋好友保持联系。</strong>'
But when the content of ss is saved in a file and read back, the decoding is wrong.
try:
    fobj = open('cn001.txt', 'r', encoding='utf-8')  # cn001.txt has the same content as ss
except IOError as e:
    print("open error", e)
for eachline in fobj:
    eachline = bytes(eachline, 'utf-8').decode('gb18030')
    print(eachline)
fobj.close()
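[Solution]
Specify a suitable encoding in open(). In the interpreter, the parser has already turned the \uXXXX escapes in the string literal into characters, so the gbk/gb18030 round trip changes nothing. In the file, the escapes are literal text (a backslash, a u and four hex digits), so they have to be decoded with the unicode_escape codec: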
with open('1.txt', encoding='unicode_escape') as f:
    for line in f:
        print(line)
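To check this without the original file, here is a minimal self-contained sketch. The helper names decode_escaped and read_escaped, the shortened sample string and the file name cn001_demo.txt are made up for illustration. It assumes the file contains only ASCII text with literal backslash-u escapes; as far as I know, unicode_escape treats any non-ASCII bytes as Latin-1, so a file that mixes real UTF-8 text with escapes would need different handling.

```python
import os
import tempfile

def decode_escaped(text):
    # Turn literal backslash-u escapes inside an ASCII-only str into characters.
    # Raises UnicodeEncodeError if text already contains non-ASCII characters.
    return text.encode('ascii').decode('unicode_escape')

def read_escaped(path):
    # Let open() apply the unicode_escape codec while reading the file.
    with open(path, encoding='unicode_escape') as f:
        return f.read()

# Hypothetical sample data: the raw string keeps the backslashes literal,
# which mirrors what is actually stored in the file.
sample = r'<strong>\u96c5\u864e\u90ae\u7bb1</strong>'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'cn001_demo.txt')  # hypothetical file name
    with open(path, 'w', encoding='ascii') as f:
        f.write(sample)
    from_file = read_escaped(path)

# Both routes should give the same decoded text.
print(from_file == decode_escaped(sample))
print(from_file)
```

If the line has already been read as an ordinary str (for example with encoding='utf-8', as in the question), decode_escaped can be applied to it afterwards instead of reopening the file.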