When Duck Typing is Dangerous
Python’s duck typing is very useful. It’s good to be able to treat something as a file or a string even if it’s not exactly a file or a string. But when two methods share the same signature(*) and their parameters mean different things, you’ve got a problem.
When you have a string, you can use the str.translate(mapping) method to translate characters, for example to force all whitespace to be a real space, or to drop everything in the 8-bit zone. Unicode strings have a similar method, unicode.translate(mapping).
Except that the mapping is totally different for strings and unicode strings. The string translate takes a 256-character string as its table, while the unicode one takes a dictionary keyed by Unicode ordinal. Pass the string-style table to a unicode string and you get the error:
TypeError: character mapping must return integer, None or unicode
which is especially fun when you don’t know if you have a string or a unicode string, and stuff just worked before.
This is a working solution for the unicode case. Instead of mapping = string.maketrans('\r\n\t', '   '), you need mapping = dict.fromkeys(map(ord, u'\r\n\t'), u' ').
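If you really don't know which kind of string you're holding, one option is to build both tables up front and pick one by type. The following is only a sketch: normalize_whitespace, BYTE_TABLE and UNICODE_TABLE are made-up names for illustration, and the import fallback assumes you want the same file to run on Python 2 (str/unicode) and Python 3 (where the same split shows up as bytes/str).

```python
try:
    from string import maketrans  # Python 2: builds a 256-character str table
except ImportError:
    maketrans = bytes.maketrans   # Python 3: builds a 256-byte bytes table

# Table for byte strings: both arguments must have the same length.
BYTE_TABLE = maketrans(b'\r\n\t', b'   ')

# Table for unicode strings: a dict keyed by code point.
UNICODE_TABLE = dict.fromkeys(map(ord, u'\r\n\t'), u' ')


def normalize_whitespace(text):
    """Replace CR, LF and TAB with a plain space, whatever kind of string it is."""
    # On Python 2, bytes is an alias for str, so this check covers both versions.
    if isinstance(text, bytes):
        return text.translate(BYTE_TABLE)
    return text.translate(UNICODE_TABLE)
```

The explicit isinstance check is exactly what duck typing is supposed to spare you, but here the two translate methods only look alike, so checking the type is the honest fix.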
(*) Well, technically the parameter type changes, but you don’t see that in a dynamically typed language.