Python and Unicode
cebix (Member) Christian Bauer | Organization: AREVA | Project: Diagnostics frontend for industrial I&C
"For languages other than C++, Ice encodes strings in their native Unicode representation, so applications can transparently use characters from non-English alphabets."

...says chapter 32.21 of the Ice manual, but if that is the concept, shouldn't IcePy map Slice strings to Python Unicode strings instead of 8-bit strings?
For example, the Python implementation of the "Hello World" demo server:
    class PrinterI(Demo.Printer):
        def printString(self, s, current=None):
            print s

only works with a UTF-8 locale. 's' is an 8-bit string which, as long as the printString() operation is called from a correctly written client, will always be in UTF-8. A more correct server implementation should look something like:

    class PrinterI(Demo.Printer):
        def printString(self, s, current=None):
            print s.decode('utf8')

which would, however, generate a UnicodeEncodeError if a client sends a string with characters that are not representable in the server's locale, so more effort is needed to have robust printing in the server (the best I've been able to come up with is

    print s.decode('utf8').encode(locale.getpreferredencoding(), 'replace')

which is not that trivial any more...).
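To make the decode/encode step reusable, it could be pulled into a small helper. The following is only a sketch, written in Python 3 terms (where `bytes` stands in for the 8-bit string discussed above); the name `to_locale_bytes` is hypothetical and not part of Ice. As an extra assumption beyond the one-liner above, it also decodes with 'replace' so that malformed UTF-8 from a badly written client does not raise.

```python
import locale


def to_locale_bytes(wire, encoding=None):
    """Convert UTF-8 bytes from the wire into bytes in the local encoding.

    `encoding` defaults to the locale's preferred encoding. Characters that
    cannot be represented (and malformed UTF-8 input) are replaced rather
    than raising an exception.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    text = wire.decode("utf-8", "replace")    # tolerate malformed input
    return text.encode(encoding, "replace")   # unrepresentable chars -> "?"


# Example: a Latin-1 server receiving a string that contains a euro sign.
printable = to_locale_bytes("Hällö €".encode("utf-8"), "latin-1")
```

With a Latin-1 target the umlauts survive and the euro sign, which Latin-1 cannot represent, is turned into a question mark.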
Likewise, in a Python client, I would like to be able to
    printer.printString(u"Hällö Wörld!")

directly (after setting the proper coding for the Python script, of course) instead of

    printer.printString(u"Hällö Wörld!".encode('utf8'))

but this gives me a "ValueError: invalid value for argument 1 in operation `printString'" from Ice.
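Until the mapping accepts Unicode strings itself, one workaround on the client side would be to funnel every outgoing string through a single normalizing function. This is a rough sketch in Python 3 terms (`str` plays the role of the Python 2 `unicode` type, `bytes` the 8-bit string); `as_slice_string` is a made-up name, not an IcePy function.

```python
def as_slice_string(value):
    """Return `value` as UTF-8 bytes suitable for passing as a Slice string.

    Text is encoded to UTF-8. Bytes are passed through, but only after
    checking that they really are valid UTF-8, so that Latin-1 data does
    not slip onto the wire unnoticed.
    """
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, bytes):
        value.decode("utf-8")  # validation only; raises UnicodeDecodeError
        return value
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


# Hypothetical call site: printer.printString(as_slice_string("Hällö Wörld!"))
encoded = as_slice_string("Hällö Wörld!")
```

The validation on the bytes branch is a design choice: it costs one extra decode per call but catches strings that were never converted from the local encoding.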
Alternatively, if IcePy uses 8-bit strings for Slice strings, it should provide an automatic string conversion facility as in C++. Our applications have to run with a Latin1 locale for legacy reasons. In C++ this works very nicely and transparently after installing a UTF-8 <-> Latin1 StringConverter, but in Python it gets ugly and increases the potential for mistakes (are encode/decode correctly applied to all strings that go over the Ice interface?).
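To illustrate what such a conversion facility might look like if done by hand in Python, here is a sketch of a decorator that converts string arguments and return values at the servant boundary. Everything in it is an assumption for illustration: `to_wire`, `from_wire`, `converts_strings` and the `EchoI` class are invented names, the servant is a plain class rather than a generated Ice skeleton, and it is written in Python 3 terms with `bytes` standing in for 8-bit strings.

```python
import functools

NATIVE_ENCODING = "latin-1"  # assumption: the legacy locale mentioned above
WIRE_ENCODING = "utf-8"      # Slice strings are UTF-8 on the wire


def to_wire(native):
    """Latin-1 bytes -> UTF-8 bytes (always representable)."""
    return native.decode(NATIVE_ENCODING).encode(WIRE_ENCODING)


def from_wire(wire):
    """UTF-8 bytes -> Latin-1 bytes, replacing unrepresentable characters."""
    return wire.decode(WIRE_ENCODING).encode(NATIVE_ENCODING, "replace")


def converts_strings(method):
    """Convert bytes arguments on the way in and a bytes result on the way out."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        converted = [from_wire(a) if isinstance(a, bytes) else a for a in args]
        result = method(self, *converted, **kwargs)
        return to_wire(result) if isinstance(result, bytes) else result
    return wrapper


class EchoI(object):
    """Hypothetical servant; records what it saw and echoes it back."""

    @converts_strings
    def echo(self, s, current=None):
        self.seen = s  # already in the native (Latin-1) encoding here
        return s


servant = EchoI()
reply = servant.echo("hällö".encode("utf-8"))
```

One caveat with this blunt approach: it converts every bytes argument, so an operation that carries binary data in the same type would need to be left undecorated or handled separately.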
I guess the best option I currently have is to patch the C++ code of the IcePy module to install a StringConverter there?
In any case, it would be nice if IcePy could marshal Unicode strings to Slice strings instead of raising a ValueError.