I'd like to second the request for supporting the MLSD and MLST commands, if available, as discussed at http://thread.gmane.org/gmane.comp.python.ftputil/440
Even though the LIST parser works fine, its time resolution is limited to days for all files that weren't modified today. The MLS* commands solve that problem.
Hello Andi,
Thanks for entering the ticket.
Given what I wrote in my reply to the mail you mention I couldn't yet make up my mind on whether I want to implement MLSD/MLST support for stat/lstat.
How badly are you affected by the current admittedly rather limited precision for older directories and files? What workaround(s) do you use? I'd like to get a feeling for how important the feature is in practice.
I've looked a bit more into this.
From now on, I'm always talking about getting information for a complete directory per request because that's the basis for the stat caching.
The response to MLSD for the PureFTPd server installed here looks like this:
type=cdir;sizd=4096;modify=20111120085556;UNIX.mode=0755;UNIX.uid=1004;UNIX.gid=1004;unique=805g41401c2; .
type=pdir;sizd=4096;modify=20111120085556;UNIX.mode=0755;UNIX.uid=1004;UNIX.gid=1004;unique=805g41401c2; ..
type=file;size=641;modify=20090722070212;UNIX.mode=0644;UNIX.uid=0;UNIX.gid=0;unique=805g41401d4; CONTENTS
type=OS.unix=slink:;size=11;modify=20090722070039;UNIX.mode=0777;UNIX.uid=0;UNIX.gid=0;unique=805g41401d2; broken_link
type=file;size=13705975;modify=20060817084836;UNIX.mode=0644;UNIX.uid=1004;UNIX.gid=1004;unique=805g41401d3; debian-keyring.tar.gz
type=dir;sizd=4096;modify=20070603143530;UNIX.mode=0755;UNIX.uid=1004;UNIX.gid=1004;unique=805g41401e3; dir with spaces
type=dir;sizd=4096;modify=20090722080040;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=0;unique=805g41401e9; dir_with_broken_link
type=dir;sizd=4096;modify=20060128191352;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=0;unique=805g41401e7; rootdir1
type=dir;sizd=4096;modify=20060128191443;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=0;unique=805g41401d0; rootdir2
type=OS.unix=slink:;size=21;modify=20090722070029;UNIX.mode=0777;UNIX.uid=0;UNIX.gid=0;unique=805g41401e8; valid_link
type=dir;sizd=4096;modify=20090405163514;UNIX.mode=0755;UNIX.uid=1004;UNIX.gid=1004;unique=805g41401d5; walk_test
The corresponding DIR response is
-rw-r--r-- 1 0 0 641 Jul 22 2009 CONTENTS
lrwxrwxrwx 1 0 0 11 Jul 22 2009 broken_link -> nonexistent
-rw-r--r-- 1 1004 1004 13705975 Aug 17 2006 debian-keyring.tar.gz
drwxr-xr-x 3 1004 1004 4096 Jun 3 2007 dir with spaces
drwxr-xr-x 2 0 0 4096 Jul 22 2009 dir_with_broken_link
drwxr-xr-x 2 0 0 4096 Jan 28 2006 rootdir1
drwxr-xr-x 3 0 0 4096 Jan 28 2006 rootdir2
lrwxrwxrwx 1 0 0 21 Jul 22 2009 valid_link -> debian-keyring.tar.gz
drwxr-xr-x 5 1004 1004 4096 Apr 5 2009 walk_test
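To make the precision difference concrete, here is a minimal sketch of how a single MLSD line like the ones above could be split into its facts and the item name. The function names `parse_mlsd_line` and `modification_time` are hypothetical and not part of ftputil. The sketch assumes that the facts are separated from the name by the first space and that the `modify` value is a UTC timestamp of the form YYYYMMDDHHMMSS, optionally followed by a fraction, as described in RFC 3659.

```python
from datetime import datetime, timezone


def parse_mlsd_line(line):
    """Split one MLSD line into a facts dictionary and the item name.

    Hypothetical helper, not ftputil API. Fact names are lowercased
    because they are case-insensitive.
    """
    # Facts can't contain spaces, so the first space ends the facts part.
    facts_part, _, name = line.partition(" ")
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            # Skip the empty string after the trailing semicolon.
            continue
        # Only split at the first "=", values such as "OS.unix=slink:"
        # contain further "=" characters.
        key, _, value = fact.partition("=")
        facts[key.lower()] = value
    return facts, name


def modification_time(facts):
    """Return the `modify` fact as an aware UTC datetime, or None."""
    value = facts.get("modify")
    if value is None:
        return None
    # Ignore an optional fractional part after the first 14 digits.
    parsed = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


facts, name = parse_mlsd_line(
    "type=file;size=641;modify=20090722070212;UNIX.mode=0644;"
    "UNIX.uid=0;UNIX.gid=0;unique=805g41401d4; CONTENTS"
)
print(name, modification_time(facts))
```

For the CONTENTS entry this yields a timestamp with a resolution of seconds, whereas the DIR line for the same file only carries the date.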
Notable differences between the MLSD and the DIR response:
- The specification for the MLSD and MLST commands doesn't seem to specify which facts (the term is from the RFC) should or even may be included in the response.
- I was unable to find out whether the user id in the MLSD response is always or even usually the same as the corresponding column in the DIR output. More precisely, I wonder whether the user id may be a number in the MLSD response but a string in the DIR response.
- At least in this specific example, the MLSD response doesn't contain link targets which are required for FTPHost.stat (which follows links) to work.
These differences mean that even if the remote server supports the MLSD command, the DIR request and the parsing of its response probably are still required. In other words, ftputil needs to send two directory requests and retrieve and process their responses. If the code using ftputil does relatively little file transfer in comparison to getting stat information this will slow down the FTP-related code. So the use of MLSD, even if it's supported by the server, may have to be optional.
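As a rough illustration of why the DIR response is still needed, the following sketch extracts the link targets from Unix-style DIR lines such as the ones shown earlier. The function `link_targets` is a hypothetical helper, not ftputil's actual parser, and it assumes the column layout from the example (eight metadata columns followed by the name). Names that themselves contain " -> " would be ambiguous here.

```python
def link_targets(dir_lines):
    """Map symlink names to their targets for Unix-style DIR lines.

    Hypothetical helper for illustration; assumes the layout
    "mode links uid gid size month day year-or-time name -> target".
    """
    targets = {}
    for line in dir_lines:
        if not line.startswith("l"):
            # Only lines whose mode string starts with "l" are links.
            continue
        # Split off the eight metadata columns; the rest is "name -> target".
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        name, separator, target = fields[8].partition(" -> ")
        if separator:
            targets[name] = target
    return targets


dir_lines = [
    "-rw-r--r-- 1 0 0 641 Jul 22 2009 CONTENTS",
    "lrwxrwxrwx 1 0 0 11 Jul 22 2009 broken_link -> nonexistent",
    "lrwxrwxrwx 1 0 0 21 Jul 22 2009 valid_link -> debian-keyring.tar.gz",
]
print(link_targets(dir_lines))
```

The MLSD lines for broken_link and valid_link in the example carry no comparable information, so this part would have to come from the DIR response.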
If MLSD support is to be implemented, it will most likely be done as follows:
- Add an MLSD parser in ftp_stat.py.
- Extend ftp_stat._Stat._stat_results_from_dir so that after using DIR it tries to use MLSD to get more precise values for the time of the item's last modification. If the MLSD command turns out to be unsupported by the server or the MLSD response doesn't contain the modification time, set a flag so that MLSD isn't even tried again for this connection.
- Add unit tests (usually in the form of test-driven development during the implementation).
- If the use of MLSD should be configurable by the user, extend FTPHost accordingly and describe the API in the documentation.
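The second step of this plan could look roughly like the sketch below. It is not ftputil code. The class `MlsdTimeRefiner` and the callable `fetch_mlsd` are assumptions made for illustration; `fetch_mlsd(path)` is expected to return `(name, facts)` pairs with lowercased fact names, which is what `ftplib.FTP.mlsd` yields in Python 3.3 and later. The sketch treats an `ftplib.error_perm` (a 5xx reply) as "MLSD unsupported", and it ignores the time zone difference between the UTC `modify` values and the server-local DIR times, which a real implementation would have to handle.

```python
import ftplib
from datetime import datetime


class MlsdTimeRefiner:
    """Hypothetical helper: refine DIR-based mtimes with MLSD `modify` facts."""

    def __init__(self, fetch_mlsd):
        # Assumed callable, e.g. a thin wrapper around `ftplib.FTP.mlsd`.
        self._fetch_mlsd = fetch_mlsd
        # Cleared once MLSD turns out to be useless for this connection.
        self.mlsd_usable = True

    def refine(self, path, mtimes_from_dir):
        """Return a copy of `mtimes_from_dir` (name -> naive datetime)
        with values replaced by MLSD modification times where available."""
        refined = dict(mtimes_from_dir)
        if not self.mlsd_usable:
            return refined
        try:
            # `list` forces the request, `FTP.mlsd` returns a lazy generator.
            entries = list(self._fetch_mlsd(path))
        except ftplib.error_perm:
            # The server rejected the command, so don't try it again.
            self.mlsd_usable = False
            return refined
        if entries and not any("modify" in facts for _, facts in entries):
            # The server supports MLSD but doesn't report modification times.
            self.mlsd_usable = False
            return refined
        for name, facts in entries:
            modify = facts.get("modify")
            if modify is not None and name in refined:
                refined[name] = datetime.strptime(modify[:14], "%Y%m%d%H%M%S")
        return refined


def fake_fetch(path):
    # Stand-in for a real server, using the CONTENTS entry from above.
    return [("CONTENTS", {"type": "file", "modify": "20090722070212"})]


refiner = MlsdTimeRefiner(fake_fetch)
print(refiner.refine("/", {"CONTENTS": datetime(2009, 7, 22)}))
```

Keeping the flag on the object that belongs to one connection means an unsupported MLSD costs only a single failed request per connection, at the price of the second directory request whenever MLSD does work.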
Meanwhile, I got some feedback from the original ticket author that he can work around the absence of MLSD/MLST with sufficient ease.
I'm closing the ticket as "wontfix" for now. If someone runs into actual problems because of the lack of precision in the modification timestamps, please reopen the ticket and describe your problem in a comment.