Python 2.7 on Windows: CM

Published: 2019-05-30 20:55:28 | Editor: auto | Views: 2247

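    When handling unicode strings in Python 2.7, with the environment variable PYTHONIOENCODING set to utf-8 and the cmd code page set to UTF-8, printing a unicode string fails with [Errno 0] or [Errno 2]. (The problem did not occur under Python 3.6.)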

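    #coding:utf-8
    import os
    # Switch the cmd code page to UTF-8 (65001)
    os.system("chcp 65001")
    a = u"你好こんにちは"
    # Python 2 print statement; this is the line that raises the error
    print a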

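    This raises the error. If the string contains only ASCII characters, no error occurs.

    Investigation shows that this comes from how Windows implements a C function: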

    https://bugs.python.org/issue1602#msg148990

    The underlying cause of Python's write exceptions with cp65001 is:
    
    The ANSI C write() function as implemented by the Windows console returns the number of _characters_ written rather than the number of _bytes_, which Python reasonably interprets as a "short write error". It then consults errno, which gives the effectively random error message seen.
    
    This can be bypassed by using os.write(sys.stdout.fileno(), utf8str), which will a) succeed and b) return a count <= len(utf8str).
    
    With os.write() and an appropriate font, the Windows console will correctly display a large number of characters.
    
    Possible workaround: clear errno before calling write, check for non-zero errno after. The vast majority of (non-Python) applications never check the return value of write, so don't encounter this problem.
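To make the mismatch described in the quote concrete, here is a minimal sketch that compares the character count with the UTF-8 byte count of the sample string used above. It does not touch the console at all; it only prints the two numbers that, according to the quoted analysis, get confused. The variable names are illustrative.

```python
# -*- coding: utf-8 -*-
# Runs on Python 2.7 and Python 3.
text = u"你好こんにちは"
encoded = text.encode("utf-8")

char_count = len(text)      # number of characters in the unicode string
byte_count = len(encoded)   # number of bytes a correct write() should report

print("characters: %d" % char_count)
print("utf-8 bytes: %d" % byte_count)
```

For an ASCII-only string the two counts are equal, which would be consistent with the observation that ASCII-only strings print without error.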

    Solutions

    Method 1: use the win_unicode_console module

    1. Install

    pip install win_unicode_console

    2. Usage

    It is simple: import the module and call enable().

    #coding:utf-8
    import os
    import win_unicode_console

    # Enable the module before anything is printed
    win_unicode_console.enable()

    os.system("chcp 65001")
    a = u"你好こんにちは"
    print a
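If the same script also has to run on other platforms or on Python 3, the call can be guarded. The following is only a sketch under that assumption: enable_unicode_console is a hypothetical helper name, not part of the module, and the only win_unicode_console call it uses is enable(), as shown above.

```python
# -*- coding: utf-8 -*-
import sys


def enable_unicode_console():
    """Enable win_unicode_console only on Windows with Python 2.

    Hypothetical helper. Returns True if the module was enabled,
    False if the platform does not match or the module is missing.
    """
    if sys.platform != "win32" or sys.version_info[0] != 2:
        return False  # not the environment this article is about
    try:
        import win_unicode_console
    except ImportError:
        return False  # module not installed; leave the console as it is
    win_unicode_console.enable()
    return True


enabled = enable_unicode_console()
```

With this guard, a missing module or a different platform leaves the script's behavior unchanged instead of failing at import time.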

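    Method 2: do not use print

    As the issue describes, the problem can be bypassed with os.write(sys.stdout.fileno(), utf8str).

    In this case the string has no u prefix; a str (byte string) is written directly. Provided the source file is saved as UTF-8, the literal already holds UTF-8 bytes.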

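    #coding:utf-8
    import os
    import sys
    os.system("chcp 65001")
    # A plain str literal: the UTF-8 bytes from this source file
    a = "你好こんにちは"
    # Write the bytes straight to stdout's file descriptor, bypassing print
    os.write(sys.stdout.fileno(), a)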
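When the text is already a unicode object, it has to be encoded before it goes to os.write. A minimal sketch, assuming UTF-8 output is wanted: write_utf8 is a hypothetical helper name, and the return value of os.write is passed through unchecked because, per the quoted issue, the count reported by the Windows console can be smaller than the number of bytes.

```python
# -*- coding: utf-8 -*-
import os
import sys


def write_utf8(text, fd=None):
    """Write text to a file descriptor as UTF-8 bytes via os.write.

    Hypothetical helper. Accepts unicode text or already-encoded bytes.
    Defaults to the file descriptor of sys.stdout.
    """
    if fd is None:
        fd = sys.stdout.fileno()
    if not isinstance(text, bytes):   # unicode on Python 2, str on Python 3
        text = text.encode("utf-8")
    return os.write(fd, text)


if __name__ == "__main__":
    write_utf8(u"你好こんにちは\n")
```

Because this bypasses sys.stdout's own buffer, output written with print and output written with this helper may appear out of order unless sys.stdout is flushed in between.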

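    Lazy approaches

    1. Running the script in PyCharm does not raise the error; presumably PyCharm works around the problem itself.

    2. If you only output Chinese, skip UTF-8 altogether: run chcp 936 and output a.encode("gbk", "ignore").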
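A short sketch of what the "ignore" error handler does in that second approach: any character that GBK cannot encode is dropped silently instead of raising an error. The Hangul syllable below is only an illustrative character outside GBK, and the variable names are assumptions.

```python
# -*- coding: utf-8 -*-
# Runs on Python 2.7 and Python 3.
chinese = u"你好"
mixed = u"你好한"   # the last character has no GBK encoding

gbk_bytes = chinese.encode("gbk", "ignore")
lossy_bytes = mixed.encode("gbk", "ignore")

# The unencodable character is dropped; no exception is raised.
print(lossy_bytes == gbk_bytes)
print(lossy_bytes.decode("gbk") == chinese)
```

This keeps the script from failing, at the cost of silently losing characters, so it only suits output that is known to be representable in GBK.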
