#53 closed defect (fixed)
FTPHost.walk fails when the argument is a unicode string and the tree contains non-ASCII characters
Reported by: | schwa | Owned by: | schwa |
---|---|---|---|
Priority: | major | Milestone: | 2.6 |
Component: | Library | Version: | 2.4.2 |
Keywords: | unicode, walk, UnicodeEncodeError | Cc: |
Description
When FTPHost.walk is used to examine a filesystem tree that contains a non-7-bit-ASCII character in any name, and the argument passed in is a unicode string, the walk method will implicitly raise a UnicodeEncodeError.
Imagine this directory structure:
some_dir
  some_stränge_file
Then ftp_host.walk(u"some_dir") will cause a UnicodeEncodeError in posixpath, like:
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xdf in position 41: ordinal not in range(128)
(original report by Henning Hraban Ramm - thanks!)
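As a rough illustration of the failure mode, the following Python 3 sketch emulates what Python 2 did implicitly when a unicode string was joined with a byte string: it decoded the byte string with the ASCII codec. The helper name py2_style_join and the Latin-1 encoding of the file name are assumptions made for this example; they are not part of ftputil or posixpath.

```python
def py2_style_join(a, b):
    """Join two path components the way Python 2 did when `a` was unicode.

    Hypothetical helper: it makes Python 2's implicit ASCII decoding
    of byte strings explicit so the failure can be seen on Python 3.
    """
    if isinstance(b, bytes):
        # Python 2 performed this step silently during `a + '/' + b`.
        b = b.decode("ascii")
    return a + "/" + b


# Assumption: the server returned the name encoded as Latin-1.
remote_name = b"some_str\xe4nge_file"

try:
    py2_style_join(u"some_dir", remote_name)
except UnicodeDecodeError as exc:
    print("join failed:", exc)
```

A pure-ASCII name such as b"plain" passes through the same helper without error, which is why the problem only shows up once the tree contains a non-ASCII name.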
Change History (4)
comment:1 Changed 10 years ago by
Status: | new → assigned |
---|
comment:2 Changed 10 years ago by
Milestone: | 2.5 → 2.5.1 |
---|
I put this off until ftputil 2.5.1. I'll have to go through all the methods and see how they're affected, and I don't want to delay the release of ftputil 2.5 final even more after the apparently long beta phase.
comment:3 Changed 10 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed in commits [7ee81a2ca43a] and [8c59e4da5479].
comment:4 Changed 10 years ago by
Milestone: | 2.5.1 → 2.6 |
---|
As pointed out in this mail, FTP has no concept of encodings. As the encoding of the directories and files on the remote side is unknown, there's no convenient solution.
At the moment, I think the most appropriate approach is to have a method fail as early as possible if it accepts remote paths and gets a unicode string for them.
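A minimal sketch of the fail-early idea, assuming a hypothetical helper named ensure_byte_path that a method could call on each remote path argument. It is not ftputil's actual code, and the choice of TypeError is also an assumption.

```python
# `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def ensure_byte_path(path):
    """Return `path` unchanged, or fail immediately for unicode strings.

    Hypothetical helper: rejecting the argument up front avoids a
    confusing error deep inside posixpath.join later in the walk.
    """
    if isinstance(path, text_type):
        raise TypeError(
            "remote paths must be byte strings, got unicode: %r" % (path,)
        )
    return path


print(ensure_byte_path(b"some_dir"))

try:
    ensure_byte_path(u"some_dir")
except TypeError as exc:
    print("rejected:", exc)
```

With a check like this, the caller sees the error at the call site, before any directory listing has been requested from the server.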
A solution might be something like: