I'm reading a config file in Python, getting its sections, and creating a new config file for each section.
However, I'm getting a decode error because one of the strings contains Español=spain
self.output_file.write( what.replace( " = ", "=", 1 ) )
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
How would I adjust my code to allow for encoded characters such as these? I'm very new to this, so please excuse me if this is something simple.
class EqualsSpaceRemover:
    output_file = None

    def __init__( self, new_output_file ):
        self.output_file = new_output_file

    def write( self, what ):
        self.output_file.write( what.replace( " = ", "=", 1 ) )

def get_sections():
    configFilePath = 'C:\\test.ini'
    config = ConfigParser.ConfigParser()
    config.optionxform = str
    config.read(configFilePath)
    for section in config.sections():
        configdata = {k:v for k,v in config.items(section)}
        confignew = ConfigParser.ConfigParser()
        cfgfile = open("C:\\" + section + ".ini", 'w')
        confignew.add_section(section)
        for x in configdata.items():
            confignew.set(section,x[0],x[1])
        confignew.write( EqualsSpaceRemover( cfgfile ) )
        cfgfile.close()
If you use Python 2 with `from __future__ import unicode_literals`, then every string literal you write is a unicode literal, as if you had prefixed it with `u"..."`, unless you explicitly write `b"..."`.

This explains why you get a UnicodeDecodeError on the line `self.output_file.write( what.replace( " = ", "=", 1 ) )`: what you actually do is `what.replace( u" = ", u"=", 1 )`.
`ConfigParser` uses plain old `str` for its items when it reads a file using the `parser.read()` method, which means `what` will be a `str`. If you pass unicode arguments to `str.replace()`, the string is converted (decoded) to unicode, the replacement is applied, and the result is returned as unicode. But if `what` contains bytes that can't be decoded to unicode using the default encoding (ASCII), you get a UnicodeDecodeError where you wouldn't expect one.

So to make this work you can either use explicit byte-string literals, as in `what.replace(b" = ", b"=", 1)`, or remove the `unicode_literals` future import.

Generally you shouldn't mix `unicode` and `str` (Python 3 fixes this by making it an error in almost every case). Be aware that `from __future__ import unicode_literals` changes every non-prefixed literal to unicode but doesn't automatically make your code work with unicode in all cases. Quite the opposite in many cases.
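
To see the mechanism in isolation, here is a minimal sketch that runs on both Python 2 and Python 3. It assumes the .ini file is UTF-8 encoded (the 0xc3 byte in the traceback is consistent with that, but it is an assumption). The `implicit_decode` helper is hypothetical: it only spells out the ASCII decode that Python 2 does silently. The reworked `EqualsSpaceRemover` with its `encoding` parameter is not one of the two options above, just one possible way to follow the "don't mix `unicode` and `str`" advice by decoding explicitly.

```python
# -*- coding: utf-8 -*-
import io

# UTF-8 bytes for the line from the question, roughly what Python 2's
# ConfigParser would pass to write() (assumption: the file is UTF-8).
raw = b"Espa\xc3\xb1ol = spain\n"


def implicit_decode(value):
    """Hypothetical helper: make explicit the ASCII decode that Python 2
    performs silently when a str meets a unicode argument."""
    return value.decode("ascii")


try:
    implicit_decode(raw)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)  # names the offending byte and its position

# Option from the answer: stay in bytes, so nothing is ever decoded.
fixed_bytes = raw.replace(b" = ", b"=", 1)


class EqualsSpaceRemover(object):
    """Alternative sketch: decode explicitly, then work in unicode only."""

    def __init__(self, output_file, encoding="utf-8"):
        self.output_file = output_file
        self.encoding = encoding  # assumed encoding of incoming byte strings

    def write(self, what):
        if isinstance(what, bytes):  # bytes is an alias of str on Python 2
            what = what.decode(self.encoding)
        self.output_file.write(what.replace(u" = ", u"=", 1))


buffer = io.StringIO()  # stands in for a real text-mode output file
EqualsSpaceRemover(buffer).write(raw)
fixed_text = buffer.getvalue()
```

If you go the explicit-decode route with a real file, open it with `io.open(path, "w", encoding="utf-8")` instead of plain `open(path, 'w')`, so the unicode text is encoded again on the way out.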