source: ftputil.txt @ 795:743125307fe7

Last change on this file since 795:743125307fe7 was 795:743125307fe7, checked in by Stefan Schwarzer <sschwarzer@…>, 12 years ago
Note that the time shift can be different from zero even if client and server are in the same time zone.
``ftputil`` -- a high-level FTP client library
==============================================

:Version:   2.4
:Date:      2009-02-15
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


16Introduction
17------------
18
19The ``ftputil`` module is a high-level interface to the ftplib_
20module. The `FTPHost objects`_ generated from it allow many operations
21similar to those of os_, `os.path`_ and `shutil`_.
22
23.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
24.. _os: http://www.python.org/doc/current/lib/module-os.html
25.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
26.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html
27
28Examples::
29
30    import ftputil
31
32    # download some files from the login directory
33    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
34    names = host.listdir(host.curdir)
35    for name in names:
36        if host.path.isfile(name):
37            host.download(name, name, 'b')  # remote, local, binary mode
38
39    # make a new directory and copy a remote file into it
40    host.mkdir('newdir')
41    source = host.file('index.html', 'r')         # file-like object
42    target = host.file('newdir/index.html', 'w')  # file-like object
43    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
44    source.close()
45    target.close()
46
47Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
48modification time of a file. The latter can also follow links, similar
49to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.
50
51.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698
52
53
54``ftputil`` features
55--------------------
56
57* Method names are familiar from Python's ``os``, ``os.path`` and
58  ``shutil`` modules
59
60* Remote file system navigation (``getcwd``, ``chdir``)
61
62* Upload and download files (``upload``, ``upload_if_newer``,
63  ``download``, ``download_if_newer``)
64
65* Time zone synchronization between client and server (needed
66  for ``upload_if_newer`` and ``download_if_newer``)
67
68* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
69  ``rmtree``) and remove files (``remove``)
70
71* Get information about directories, files and links (``listdir``,
72  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
73  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)
74
75* Iterate over remote file systems (``walk``)
76
77* Local caching of results from ``lstat`` and ``stat`` calls to reduce
78  network access (also applies to ``exists``, ``getmtime`` etc.).
79
80* Read files from and write files to remote hosts via
81  file-like objects (``FTPHost.file``; the generated file-like objects
82  have many common methods like ``read``, ``readline``, ``readlines``,
83  ``write``, ``writelines``, ``close`` and can do automatic line
84  ending conversions on the fly, i. e. text/binary mode).
85
86
87Exception hierarchy
88-------------------
89
90The exceptions are in the namespace of the ``ftp_error`` module, e. g.
91``ftp_error.TemporaryError``. Getting the exception classes from the
92"package module" ``ftputil`` is deprecated and will no longer be
93supported in ``ftputil`` version 2.5.
94
95The exceptions are organized as follows::
96
97    FTPError
98        FTPOSError(FTPError, OSError)
99            PermanentError(FTPOSError)
100                CommandNotImplementedError(PermanentError)
101            TemporaryError(FTPOSError)
102        FTPIOError(FTPError)
103        InternalError(FTPError)
104            InaccessibleLoginDirError(InternalError)
105            ParserError(InternalError)
106            RootDirError(InternalError)
107            TimeShiftError(InternalError)
108
109and are described here:
110
111- ``FTPError``
112
113  is the root of the exception hierarchy of the module.
114
115- ``FTPOSError``
116
117  is derived from ``OSError``. This is for similarity between the
118  os module and ``FTPHost`` objects. Compare
119
120  ::
121
122    try:
123        os.chdir('nonexisting_directory')
124    except OSError:
125        ...
126
127  with
128
129  ::
130
131    host = ftputil.FTPHost('host', 'user', 'password')
132    try:
133        host.chdir('nonexisting_directory')
134    except OSError:
135        ...
136
137  Imagine a function
138
139  ::
140
141    def func(path, file):
142        ...
143
144  which works on the local file system and catches ``OSErrors``. If you
145  change the parameter list to
146
147  ::
148
149    def func(path, file, os=os):
150        ...
151
  where ``os`` denotes the ``os`` module, you can also call the function as
153
154  ::
155
156    host = ftputil.FTPHost('host', 'user', 'password')
157    func(path, file, os=host)
158
159  to use the same code for a local and remote file system. Another
160  similarity between ``OSError`` and ``FTPOSError`` is that the latter
161  holds the FTP server return code in the ``errno`` attribute of the
162  exception object and the error text in ``strerror``.
163
164- ``PermanentError``
165
166  is raised for 5xx return codes from the FTP server. This
167  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
168  ``ftplib.error_perm`` are *not* identical).
169
170- ``CommandNotImplementedError``
171
172  indicates that an underlying command the code tries to use is not
173  implemented. For an example, see the description of the
174  `FTPHost.chmod`_ method.
175
176- ``TemporaryError``
177
178  is raised for FTP return codes from the 4xx category. This
179  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
180  ``ftplib.error_temp`` are *not* identical).
181
182- ``FTPIOError``
183
184  denotes an I/O error on the remote host. This appears
185  mainly with file-like objects which are retrieved by invoking
186  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare
187
188  ::
189
190    >>> try:
191    ...     f = open('not_there')
192    ... except IOError, obj:
193    ...     print obj.errno
194    ...     print obj.strerror
195    ...
196    2
197    No such file or directory
198
199  with
200
201  ::
202
203    >>> host = ftputil.FTPHost('host', 'user', 'password')
204    >>> try:
205    ...     f = host.open('not_there')
206    ... except IOError, obj:
207    ...     print obj.errno
208    ...     print obj.strerror
209    ...
210    550
211    550 not_there: No such file or directory.
212
213  As you can see, both code snippets are similar. However, the error
214  codes aren't the same.
215
216- ``InternalError``
217
218  subsumes exception classes for signaling errors due to limitations
219  of the FTP protocol or the concrete implementation of ``ftputil``.
220
221- ``InaccessibleLoginDirError``
222
223  This exception is only raised if *both* of the following conditions
224  are met:
225
226  - The directory in which "you" are placed upon login is not
227    accessible, i. e. a ``chdir`` call with the directory as
228    argument would fail.
229
230  - You try to access a path which contains whitespace.
231
232- ``ParserError``
233
234  is used for errors during the parsing of directory
235  listings from the server. This exception is used by the ``FTPHost``
236  methods ``stat``, ``lstat``, and ``listdir``.
237
238- ``RootDirError``
239
  Because of the implementation of the ``lstat`` method it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)
243
244  This problem does *not* affect stat calls on items *in* the root
245  directory.
246
247- ``TimeShiftError``
248
  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a full
  hour.
252
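The practical consequences of this hierarchy can be tried out without a
server. The sketch below rebuilds a few of the classes as local stand-ins
(they are *not* imported from ``ftputil``, and the function ``classify`` is
a name made up for this example). It shows that ``except OSError`` also
catches a ``PermanentError``, and that the more specific class has to be
tested first:

```python
# Stand-in classes that mirror part of the hierarchy shown above. They are
# NOT the real ftputil classes; they only show how the inheritance behaves.
class FTPError(Exception):
    pass


class FTPOSError(FTPError, OSError):
    pass


class PermanentError(FTPOSError):
    pass


class CommandNotImplementedError(PermanentError):
    pass


def classify(exc):
    """Return a label for `exc`, testing the most specific class first."""
    try:
        raise exc
    except CommandNotImplementedError:
        return "command not implemented"
    except PermanentError:
        return "permanent error"
    except OSError:
        # Also reached for plain `OSError`s from the local file system.
        return "other OS error"


print(classify(CommandNotImplementedError(502, "command not implemented")))
print(classify(PermanentError(550, "no such file")))
print(classify(OSError(2, "No such file or directory")))
```

If the ``PermanentError`` clause came first, the ``CommandNotImplementedError``
clause would never be reached.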
253
254``FTPHost`` objects
255-------------------
256
257.. _`FTPHost construction`:
258
259Construction
260~~~~~~~~~~~~
261
262Basics
263``````
264
265``FTPHost`` instances can be generated with the following call::
266
267    host = ftputil.FTPHost(host, user, password, account,
268                           session_factory=ftplib.FTP)
269
270The first four parameters are strings with the same meaning as for the
271FTP class in the ``ftplib`` module.
272
273Session factories
274`````````````````
275
276The keyword argument ``session_factory`` may be used to generate FTP
277connections with other factories than the default ``ftplib.FTP``. For
278example, the M2Crypto distribution uses a secure FTP class which is
279derived from ``ftplib.FTP``.
280
281In fact, all positional and keyword arguments other than
282``session_factory`` are passed to the factory to generate a new background
283session (which happens for every remote file that is opened; see
284below).
285
This behavior of the constructor also makes it possible to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one. ``ftplib.FTP`` offers this only through its ``connect``
method, not through its constructor. The solution is to use a wrapper
class::
294
    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to another port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # pass the class itself as the factory, not an instance (`MySession()`)
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual
311
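If the port should not travel through the ``FTPHost`` argument list, one
possible variation is to build the session class in a helper function.
This is only a sketch: ``make_session_class`` and the port number 2121 are
names and values chosen for this example, they are not part of ``ftputil``
or ``ftplib``:

```python
import ftplib


def make_session_class(port):
    """Return an `ftplib.FTP` subclass whose constructor connects to `port`."""

    class PortSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            # Calling the base constructor without arguments does not
            # open a connection yet.
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    return PortSession


# A class, not an instance, so it can serve as a session factory.
SessionOnPort2121 = make_session_class(2121)

# Hypothetical use (needs a reachable server, so it is left commented out):
# host = ftputil.FTPHost("ftp.example.com", "user", "password",
#                        session_factory=SessionOnPort2121)
```

The factory is still a class, so it can be called again with the same
arguments whenever a new session is needed.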
312On login, the format of the directory listings (needed for stat'ing
313files and directories) should be determined automatically. If not,
314please `file a bug report`_.
315
316.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes
317
318Support for the ``with`` statement
319``````````````````````````````````
320
321If you are sure that all the users of your code use at least Python
3222.5, you can use Python's `with statement`_::
323
324    # not needed for Python 2.6 and later
325    from __future__ import with_statement
326
327    import ftputil
328
329    with ftputil.FTPHost(host, user, password) as host:
330        print host.listdir(host.curdir)
331
332After the ``with`` block, the ``FTPHost`` instance and the
333associated FTP sessions will be closed automatically.
334
335If something goes wrong during the ``FTPHost`` construction or in the
336body of the ``with`` statement, the instance is closed as well.
337Exceptions will be propagated (as with ``try ... finally``).
338
339.. _`with statement`: http://www.python.org/dev/peps/pep-0343/
340
341``FTPHost`` attributes and methods
342~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
343
344Attributes
345``````````
346
347- ``curdir``, ``pardir``, ``sep``
348
349  are strings which denote the current and the parent directory on the
350  remote server. ``sep`` identifies the path separator. Though `RFC 959`_
351  (File Transfer Protocol) notes that these values may depend on the
352  FTP server implementation, the Unix counterparts seem to work well
353  in practice, even for non-Unix servers.
354
355Remote file system navigation
356`````````````````````````````
357
358- ``getcwd()``
359
  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.
362
363- ``chdir(directory)``
364
365  sets the current directory on the FTP server. This resembles
366  ``os.chdir``, as you may have expected.
367
368Uploading and downloading files
369```````````````````````````````
370
371- ``upload(source, target, mode='')``
372
373  copies a local source file (given by a filename, i. e. a string)
374  to the remote host under the name target. Both ``source`` and
375  ``target`` may be absolute paths or relative to their corresponding
376  current directory (on the local or the remote host, respectively).
377  The mode may be "" or "a" for ASCII uploads or "b" for binary
378  uploads. ASCII mode is the default (again, similar to regular local
379  file objects).
380
381- ``download(source, target, mode='')``
382
383  performs a download from the remote source to a target file. Both
384  ``source`` and ``target`` are strings. The description of the
385  upload method applies here, too.
386
387.. _`upload_if_newer`:
388
389- ``upload_if_newer(source, target, mode='')``
390
391  is similar to the ``upload`` method. The only difference is that the
392  upload is only invoked if the time of the last modification for the
393  source file is more recent than that of the target file or the
394  target doesn't exist at all. If an upload actually happened, the
395  return value is a true value, else a false value.
396
397  Note that this method only checks the existence and/or the
398  modification time of the source and target file; it can't recognize
399  a change in the transfer mode, e. g.
400
401  ::
402
403    # transfer in ASCII mode
404    host.upload_if_newer('source_file', 'target_file', 'a')
405    # won't transfer the file again, which is bad!
406    host.upload_if_newer('source_file', 'target_file', 'b')
407
408  Similarly, if a transfer is interrupted, the remote file will have a
409  newer modification time than the local file, and thus the transfer
410  won't be repeated if ``upload_if_newer`` is used a second time.
411  There are (at least) two possibilities after a failed upload:
412
413  - use ``upload`` instead of ``upload_if_newer``, or
414
415  - remove the incomplete target file with ``FTPHost.remove``, then
416    use ``upload`` or ``upload_if_newer`` to transfer it again.
417
418  If it seems that a file is uploaded unnecessarily, read the
419  subsection on `time shift`_ settings.
420
421.. _`download_if_newer`:
422
423- ``download_if_newer(source, target, mode='')``
424
425  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.
429
430  If it seems that a file is downloaded unnecessarily, read the
431  subsection on `time zone correction`_.
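The rule behind the ``..._if_newer`` methods can be illustrated with local
files only. The function ``copy_if_newer`` below is a made-up local
analogue, not ``ftputil`` code, and it ignores time shift completely:

```python
import os
import shutil


def copy_if_newer(source, target):
    """Copy `source` to `target` only if `target` is missing or older.

    Return True if a copy actually happened, else False.
    """
    if os.path.exists(target):
        if os.path.getmtime(source) <= os.path.getmtime(target):
            # The target looks up to date. Note that this says nothing
            # about whether its *content* is complete or correct.
            return False
    shutil.copyfile(source, target)
    return True
```

Because only the existence and the modification times are compared, a
truncated target from an interrupted transfer counts as up to date. This
is the situation described for ``upload_if_newer`` above.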
432
433.. _`time shift`:
434.. _`time zone correction`:
435
436Time zone correction
437````````````````````
438
439If the client (where ``ftputil`` runs) and the server have a different
440understanding of local time, this has to be taken into account for
441``upload_if_newer`` and ``download_if_newer`` to work correctly.
442
443Note that even if the client and the server are in the same time zone
444(or even on the same computer), the time shift value (see below) may
445be different from zero. For example, my computer is set to use local
446time whereas the server insists on using UTC time.
447
448.. _`set_time_shift`:
449
450- ``set_time_shift(time_shift)``
451
452  sets the so-called time shift value (measured in seconds). The time
453  shift is the difference between the local time of the server and the
454  local time of the client at a given moment, i. e. by definition
455
456  ::
457
458    time_shift = server_time - client_time
459
460  Setting this value is important if `upload_if_newer`_ and
461  `download_if_newer`_ have to work correctly even if the time zone of
462  the FTP server differs from that of the client (where ``ftputil``
463  runs). Note that the time shift value *can be negative*.
464
  If the time shift value is invalid, e. g. not a multiple of a full
  hour or with an absolute (unsigned) value larger than 24 hours, a
  ``TimeShiftError`` is raised.
468
469  See also `synchronize_times`_ for a way to set the time shift with a
470  simple method call.
471
472- ``time_shift()``
473
474  returns the currently-set time shift value. See ``set_time_shift``
475  (above) for its definition.
476
477.. _`synchronize_times`:
478
479- ``synchronize_times()``
480
481  synchronizes the local times of the server and the client, so that
482  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
483  if the client and the server are in different time zones. For this
484  to work, *all* of the following conditions must be true:
485
486  - The connection between server and client is established.
487
488  - The client has write access to the directory that is current when
489    ``synchronize_times`` is called.
490
491  If you can't fulfill these conditions, you can nevertheless set the
492  time shift value manually with `set_time_shift`_. Trying to call
493  ``synchronize_times`` if the above conditions aren't true results in
494  a ``TimeShiftError`` exception.
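To make the definition ``time_shift = server_time - client_time`` more
concrete, here is a small sketch of the arithmetic. Both function names
are made up for this example, ``ValueError`` stands in for
``TimeShiftError``, and the checks only follow the rules stated above;
``ftputil``'s own checks may differ in detail:

```python
HOUR = 60 * 60  # seconds


def check_time_shift(time_shift):
    """Raise ValueError if `time_shift` (in seconds) breaks the rules above."""
    if time_shift % HOUR != 0:
        raise ValueError("time shift %r is not a multiple of a full hour"
                         % time_shift)
    if abs(time_shift) > 24 * HOUR:
        raise ValueError("time shift %r is larger than 24 hours" % time_shift)


def server_to_client_time(server_time, time_shift):
    """Convert a server timestamp to client time.

    This follows from time_shift = server_time - client_time.
    """
    return server_time - time_shift


# Example: the server clock shows 10:00 when the client clock shows 12:00,
# so the time shift is negative.
time_shift = 10 * HOUR - 12 * HOUR
check_time_shift(time_shift)
print(time_shift)
print(server_to_client_time(10 * HOUR, time_shift))
```

With this convention, a timestamp from a server listing has to be reduced
by the time shift before it is compared with a local modification time.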
495
496Creating and removing directories
497`````````````````````````````````
498
499- ``mkdir(path, [mode])``
500
501  makes the given directory on the remote host. This doesn't construct
502  "intermediate" directories which don't already exist. The ``mode``
503  parameter is ignored; this is for compatibility with ``os.mkdir`` if
504  an ``FTPHost`` object is passed into a function instead of the
505  ``os`` module. See the explanation in the subsection `Exception hierarchy`_.
506
507- ``makedirs(path, [mode])``
508
  works similarly to ``mkdir`` (see above), but also makes intermediate
  directories like ``os.makedirs``. The ``mode`` parameter is only
  there for compatibility with ``os.makedirs`` and is ignored.
512
513- ``rmdir(path)``
514
  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.
517
518- ``rmtree(path, ignore_errors=False, onerror=None)``
519
520  removes the given remote, possibly non-empty, directory tree.
521  The interface of this method is rather complex, in favor of
522  compatibility with ``shutil.rmtree``.
523
524  If ``ignore_errors`` is set to a true value, errors are ignored.
525  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
526  set, all exceptions occurring during the tree iteration and
527  processing are raised. These exceptions are all of type
528  ``PermanentError``.
529
  To distinguish between error causes, pass in a callable for
  ``onerror``. This callable must accept three arguments: ``func``,
  ``path`` and ``exc_info``. ``func`` is a bound method object,
  *for example* ``your_host_object.listdir``. ``path`` is the path
  that was passed to the method that failed (``listdir``,
  ``remove`` or ``rmdir``). ``exc_info`` is the exception info as
  returned by ``sys.exc_info``.
537
538  The code of ``rmtree`` is taken from Python's ``shutil`` module
539  and adapted for ``ftputil``.
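An ``onerror`` callable for ``rmtree`` could look like the following
sketch. The name ``collect_error`` and the list ``errors`` are made up for
this example. To keep it runnable without a server, one failing step is
simulated with ``os.rmdir`` on the local file system:

```python
import os
import sys

errors = []


def collect_error(func, path, exc_info):
    """Record which operation failed on which path, and why."""
    errors.append((func.__name__, path, exc_info[1]))


# Simulate a single failing step locally, reported in the same
# (func, path, exc_info) form that `onerror` receives.
missing = os.path.join(os.curdir, "no_such_directory_for_this_example")
try:
    os.rmdir(missing)
except OSError:
    collect_error(os.rmdir, missing, sys.exc_info())

for name, path, error in errors:
    print("%s failed for %s: %s" % (name, path, error))

# Hypothetical use with a connected FTPHost object:
# host.rmtree("some_directory", onerror=collect_error)
```

Collecting the errors instead of raising lets the tree removal continue,
so all problems can be inspected afterwards.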
540
541Removing files and links
542````````````````````````
543
544- ``remove(path)``
545
546  removes a file or link on the remote host, similar to ``os.remove``.
547
548- ``unlink(path)``
549
550  is an alias for ``remove``.
551
552Retrieving information about directories, files and links
553`````````````````````````````````````````````````````````
554
555- ``listdir(path)``
556
557  returns a list containing the names of the files and directories
558  in the given path, similar to ``os.listdir``. The special names
559  ``.`` and ``..`` are not in the list.
560
561The methods ``lstat`` and ``stat`` (and others) rely on the directory
562listing format used by the FTP server. When connecting to a host,
563``FTPHost``'s constructor tries to guess the right format, which
564succeeds in most cases. However, if you get strange results or
565``ParserError`` exceptions by a mere ``lstat`` call, please
566`file a bug report`_.
567
568If ``lstat`` or ``stat`` yield wrong modification dates or times, look
569at the methods that deal with time zone differences (`time zone
570correction`_).
571
572.. _`FTPHost.lstat`:
573
574- ``lstat(path)``
575
576  returns an object similar to that from ``os.lstat`` (a "tuple" with
577  additional attributes; see the documentation of the ``os`` module for
578  details). However, due to the nature of the application, there are
579  some important aspects to keep in mind:
580
581  - The result is derived by parsing the output of a ``DIR`` command on
582    the server. Therefore, the result from ``FTPHost.lstat`` can not
583    contain more information than the received text. In particular:
584
585  - User and group ids can only be determined as strings, not as
586    numbers, and that only if the server supplies them. This is
587    usually the case with Unix servers but maybe not for other FTP
588    server programs.
589
590  - Values for the time of the last modification may be rough,
591    depending on the information from the server. For timestamps
592    older than a year, this usually means that the precision of the
593    modification timestamp value is not better than days. For newer
594    files, the information may be accurate to a minute.
595
596  - Links can only be recognized on servers that provide this
597    information in the ``DIR`` output.
598
  - Stat attributes that can't be determined at all are set to
    ``None``. For example, a line of a directory listing may not
    contain the date/time of a directory's last modification.
602
603  - There's a special problem with stat'ing the root directory.
604    (Stat'ing things *in* the root directory is fine though.) In
605    this case, a ``RootDirError`` is raised. This has to do with the
606    algorithm used by ``(l)stat``, and I know of no approach which
607    mends this problem.
608
609..
610
  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. Alternatively, you can `write your own parser`_.
615
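Why ``lstat`` results are limited by the listing text becomes clearer when
you look at how such a line can be taken apart. The following sketch is
deliberately simplistic and is *not* ``ftputil``'s parser: the function
name and the two sample lines are made up, and links as well as names with
leading spaces are not handled:

```python
def parse_unix_line(line):
    """Split one Unix-style listing line into named fields."""
    parts = line.split(None, 8)
    if len(parts) != 9:
        raise ValueError("unexpected listing format: %r" % line)
    mode, nlink, user, group, size, month, day, year_or_time, name = parts
    # Recent entries carry a time ("HH:MM"), older ones only a year, so
    # the timestamp of older entries is no more precise than a day.
    precision = "minute" if ":" in year_or_time else "day"
    return {
        "mode": mode,
        "user": user,    # a string, even if it looks like a number
        "group": group,
        "size": int(size),
        "date": (month, day, year_or_time),
        "precision": precision,
        "name": name,
    }


# Made-up sample lines in the common Unix style.
old_entry = parse_unix_line(
    "drwxr-sr-x   2 45854    200           512 May  4  2000 chemeng")
new_entry = parse_unix_line(
    "-rw-r--r--   1 schwa    users        4604 Jan 19 23:11 index.html")

print(old_entry["user"], old_entry["precision"])
print(new_entry["user"], new_entry["precision"])
```

Everything the result contains comes from the text of the line; there is
no second source for numeric ids or more exact timestamps.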
616.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
617.. _`write your own parser`: `Writing directory parsers`_
618
619.. _`FTPHost.stat`:
620
621- ``stat(path)``
622
623  returns ``stat`` information also for files which are pointed to by a
624  link. This method follows multiple links until a regular file or
625  directory is found. If an infinite link chain is encountered or the
626  target of the last link in the chain doesn't exist, a
627  ``PermanentError`` is raised.
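Following a chain of links while guarding against endless loops can be
sketched as below. This is an illustration of the idea, not the code used
by ``FTPHost.stat``: the link structure is given as a plain dictionary,
``resolve_links`` is a made-up name, ``ValueError`` stands in for
``PermanentError``, and the check for a missing final target is left out:

```python
def resolve_links(path, links):
    """Follow `links` (a dict: link path -> target path) starting at `path`.

    Return the first path that is not itself a link. Raise ValueError
    if the chain of links runs in a circle.
    """
    visited = set()
    while path in links:
        if path in visited:
            raise ValueError("recursive link structure at %r" % path)
        visited.add(path)
        path = links[path]
    return path


links = {"/current": "/releases/latest", "/releases/latest": "/releases/2.4"}
print(resolve_links("/current", links))
```

Remembering the visited paths turns an infinite chain into an error
instead of a loop that never ends.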
628
629.. _`FTPHost.path`:
630
631``FTPHost`` objects contain an attribute named ``path``, similar to
632`os.path`_. The following methods can be applied to the remote host
633with the same semantics as for ``os.path``:
634
635::
636
637    abspath(path)
638    basename(path)
639    commonprefix(path_list)
640    dirname(path)
641    exists(path)
642    getmtime(path)
643    getsize(path)
644    isabs(path)
645    isdir(path)
646    isfile(path)
647    islink(path)
648    join(path1, path2, ...)
649    normcase(path)
650    normpath(path)
651    split(path)
652    splitdrive(path)
653    splitext(path)
654    walk(path, func, arg)
655
656Like Python's counterparts under `os.path`_, ``ftputil``'s ``is...``
657methods return ``False`` if they can't find the given path.
658
659Local caching of file system information
660````````````````````````````````````````
661
Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would have to fetch a directory listing from the server, which can
make the program *very* slow. This effect is more pronounced for
operations which mostly scan the file system rather than transfer
file data.
669
670For this reason, ``ftputil`` by default saves (caches) the results
671from directory listings locally and reuses those results. This reduces
672network accesses and so speeds up the software a lot. However, since
673data is more rarely fetched from the server, the risk of obsolete data
674also increases. This will be discussed below.
675
676Caching can -- if necessary at all -- be controlled via the
677``stat_cache`` object in an ``FTPHost``'s namespace. For example,
678after calling
679
680::
681
682    host = ftputil.FTPHost(host, user, password, account,
683                           session_factory=ftplib.FTP)
684
685the cache can be accessed as ``host.stat_cache``.
686
While ``ftputil`` usually manages the cache quite well, there are two
situations in which you may want to modify the cache parameters.
689The first is when the number of possible entries is too low. You may
690notice that when you are processing very large directories (e. g.
691containing more than 1000 directories or files) and the program
692becomes much slower than before. It's common for code to read a
693directory with ``listdir`` and then process the found directories and
694files. For this application, it's a good rule of thumb to set the
695cache size to somewhat more than the number of directory entries
696fetched with ``listdir``. This is done by the ``resize`` method::
697
698    host.stat_cache.resize(2000)
699
700where the argument is the maximum number of ``lstat`` results to store
701(the default is 1000). Note that each path on the server, e. g.
702"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
703like ``exists`` or ``getmtime`` all derive their results from a
704previously fetched ``lstat`` result.)
705
The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.
710
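The eviction rule (drop the entry that has not been used for the longest
time) can be sketched in a few lines. The class below is an assumption
made for illustration, not the implementation behind ``stat_cache``; only
the method name ``resize`` is taken from the text above:

```python
from collections import OrderedDict


class LRUCache(object):
    """Keep at most `max_size` entries, dropping the least recently used."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._entries = OrderedDict()  # oldest use first, newest use last

    def __setitem__(self, path, value):
        self._entries.pop(path, None)
        self._entries[path] = value
        self._shrink()

    def __getitem__(self, path):
        value = self._entries.pop(path)  # raises KeyError if missing
        self._entries[path] = value      # mark as most recently used
        return value

    def __contains__(self, path):
        return path in self._entries

    def resize(self, new_size):
        self._max_size = new_size
        self._shrink()

    def _shrink(self):
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop least recently used


cache = LRUCache(2)
cache["/home/schwa/a"] = "stat result a"
cache["/home/schwa/b"] = "stat result b"
cache["/home/schwa/a"]                      # "a" is now the newest entry
cache["/home/schwa/c"] = "stat result c"    # evicts "b"
print("/home/schwa/b" in cache)
```

If a directory has more entries than the cache can hold, entries get
evicted before they are used again. This is why a cache that is too small
slows down the processing of large directories.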
711Caching is so effective because it reduces network accesses. This can
712also be a disadvantage if the file system data on the remote server
713changes after a stat result has been retrieved; the client, when
714looking at the cached stat data, will use obsolete information.
715
There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. The second way is a change that the ``FTPHost`` object reading
its cache doesn't know about. The obvious example is a change made by
a program running on the remote host. However, cache inconsistencies
can also occur if two ``FTPHost`` objects change a file system
simultaneously::
725
726    host1 = ftputil.FTPHost(server, user1, password1)
727    host2 = ftputil.FTPHost(server, user1, password1)
728    try:
729        stat_result1 = host1.stat("some_file")
730        stat_result2 = host2.stat("some_file")
731        host2.remove("some_file")
732        # `host1` will still see the obsolete cache entry!
733        print host1.stat("some_file")
734        # will raise an exception since an `FTPHost` object
735        #  knows of its own changes
736        print host2.stat("some_file")
737    finally:
738        host1.close()
739        host2.close()
740
741At first sight, it may appear to be a good idea to have a shared cache
742among several ``FTPHost`` objects. After some thinking, this turns out
743to be very error-prone. For example, it won't help with different
744processes using ``ftputil``. So, if you have to deal with concurrent
745write accesses to a server, you have to handle them explicitly.
746
747The most useful tool for this probably is the ``invalidate`` method.
748In the example above, it could be used as::
749
750    host1 = ftputil.FTPHost(server, user1, password1)
751    host2 = ftputil.FTPHost(server, user1, password1)
752    try:
753        stat_result1 = host1.stat("some_file")
754        stat_result2 = host2.stat("some_file")
755        host2.remove("some_file")
756        # invalidate using an absolute path
757        absolute_path = host1.path.abspath(
758                        host1.path.join(host1.curdir, "some_file"))
759        host1.stat_cache.invalidate(absolute_path)
760        # will now raise an exception as it should
761        print host1.stat("some_file")
762        # would raise an exception since an `FTPHost` object
763        #  knows of its own changes, even without `invalidate`
764        print host2.stat("some_file")
765    finally:
766        host1.close()
767        host2.close()
768
769The method ``invalidate`` can be used on any *absolute* path, be it a
770directory, a file or a link.
771
By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days, a stat call may still access cache data that is three days old.
To avoid this, you can set the ``max_age`` attribute::
776
777    host = ftputil.FTPHost(server, user, password)
778    host.stat_cache.max_age = 60 * 60  # = 3600 seconds
779
This sets the maximum age of entries in the cache to an hour. Any
older entry won't be retrieved from the cache; instead, its data is
fetched again from the remote host and then stored again for up to an
hour. To reset ``max_age`` to the default of unlimited age, i. e.
cache entries never expire, use ``None`` as value.
785
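How an age limit interacts with lookups can be sketched as follows. Again,
this is an assumption made for illustration and not ``ftputil``'s code;
only the attribute name ``max_age`` and the meaning of ``None`` are taken
from the text above. The ``clock`` argument exists so that the behavior
can be tried without waiting:

```python
import time


class AgingCache(object):
    """Cache whose entries are ignored once they are older than `max_age`."""

    def __init__(self, max_age=None, clock=time.time):
        self.max_age = max_age  # in seconds; None means "never expire"
        self._clock = clock
        self._entries = {}      # path -> (value, time when stored)

    def store(self, path, value):
        self._entries[path] = (value, self._clock())

    def lookup(self, path):
        """Return the cached value, or None if it is missing or too old."""
        if path not in self._entries:
            return None
        value, stored_at = self._entries[path]
        too_old = (self.max_age is not None and
                   self._clock() - stored_at > self.max_age)
        if too_old:
            del self._entries[path]
            return None
        return value


now = [0.0]  # a fake clock that can be advanced by hand
cache = AgingCache(max_age=60 * 60, clock=lambda: now[0])
cache.store("/some_file", "stat result")
now[0] = 30 * 60
print(cache.lookup("/some_file"))   # still fresh after half an hour
now[0] = 2 * 60 * 60
print(cache.lookup("/some_file"))   # expired after two hours
```

A caller that gets ``None`` back would fetch the data from the server
again and store the fresh result.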
786If you are certain that the cache will be in the way, you can disable
787and later re-enable it completely with ``disable`` and ``enable``::
788
789    host = ftputil.FTPHost(server, user, password)
790    host.stat_cache.disable()
791    ...
792    host.stat_cache.enable()
793
794During that time, the cache won't be used; all data will be fetched
795from the network. After enabling the cache, its entries will be the
796same as when the cache was disabled, that is, entries won't get
797updated with newer data during this period. Note that even when the
798cache is disabled, the file system data in the code can become
799inconsistent::
800
801    host = ftputil.FTPHost(server, user, password)
802    host.stat_cache.disable()
803    if host.path.exists("some_file"):
804        mtime = host.path.getmtime("some_file")
805
806In that case, the file ``some_file`` may have been removed by another
807process between the calls to ``exists`` and ``getmtime``!
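One way to avoid the gap between the two calls is to skip the ``exists``
test and handle the failure instead. The helper below is a sketch with a
made-up name. It is shown with the local ``os`` module; since
``FTPOSError`` is derived from ``OSError``, passing an ``FTPHost`` object
as ``os_module`` should behave the same way, though that is not tested
here:

```python
import os


def mtime_or_none(path, os_module=os):
    """Return the modification time of `path`, or None if it is gone."""
    try:
        return os_module.path.getmtime(path)
    except OSError:
        # The path vanished (or never existed); there is no gap between
        # a separate existence test and the actual access.
        return None


print(mtime_or_none("surely_missing_file_for_this_example"))
```

A single call either succeeds or fails as a whole, so there is no moment
between a test and an access in which another process can interfere.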
808
809Iteration over directories
810``````````````````````````
811
812.. _`FTPHost.walk`:
813
814- ``walk(top, topdown=True, onerror=None)``
815
816  iterates over a directory tree, similar to `os.walk`_. Actually,
817  ``FTPHost.walk`` uses the code from Python with just the necessary
818  modifications, so see the linked documentation.
819
820.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707
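A typical use of ``walk`` is to visit every file below a directory. The
function below is a sketch with a made-up name. It is written against the
local ``os`` module; because ``FTPHost.walk`` takes the same arguments as
``os.walk``, passing an ``FTPHost`` object as ``os_module`` should work as
well, which is an assumption not tested here:

```python
import os


def total_size(top, os_module=os):
    """Return the sum of the sizes of all files below `top`."""
    total = 0
    for dirpath, dirnames, filenames in os_module.walk(top):
        for filename in filenames:
            path = os_module.path.join(dirpath, filename)
            total += os_module.path.getsize(path)
    return total


print(total_size(os.curdir))
```

On a remote host, each ``getsize`` call is answered from the stat cache
where possible, so the cache size matters for large trees.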
821
822.. _`FTPHost.path.walk`:
823
824- ``path.walk(path, func, arg)``
825
826  Similar to ``os.path.walk``, the ``walk`` method in
827  `FTPHost.path`_ can be used, though ``FTPHost.walk`` is probably
828  easier to use.

Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

.. _`FTPHost.chmod`:

- ``chmod(path, mode)``

  sets the access mode (permission flags) for the given path. The mode
  is an integer as returned for the mode by the ``stat`` and ``lstat``
  methods. Be careful: Usually, mode values are written as octal
  numbers, for example 0755 to make a directory readable, writable and
  searchable for the owner, but not writable for the group and others.
  If you want to use such octal values, rely on Python's support for
  them::

    host.chmod("some_directory", 0755)

  *Note the leading zero.*

  Not all FTP servers support the ``chmod`` command. In case of
  an exception, how do you know if the path doesn't exist or if
  the command itself is invalid? If the FTP server complies with
  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
  command isn't allowed. ``ftputil`` maps this special error
  response to a ``CommandNotImplementedError`` which is derived from
  ``PermanentError``.

  So you need to write code like this::

    import ftputil
    from ftputil import ftp_error

    host = ftputil.FTPHost(server, user, password)
    try:
        host.chmod("some_file", 0644)
    except ftp_error.CommandNotImplementedError:
        # chmod not supported
        ...
    except ftp_error.PermanentError:
        # possibly a non-existent file
        ...

  Because the ``CommandNotImplementedError`` is more specific, you
  have to test for it first.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object source to the
  file-like object target. The only difference from
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you. Possibly you are
  lucky and someone has already written a parser you can use. Please
  ask on the `mailing list`_.
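
Regarding the mode values for ``chmod`` described above: the mode is
an ordinary integer, so it doesn't matter how it's written in the
source code. The lines below are only an illustration using Python's
standard library, not ``ftputil`` code; they show different ways to
arrive at the value which is usually written as octal 755::

    import stat

    # parse the octal digits explicitly (works in any Python version)
    mode_from_digits = int("755", 8)

    # combine the permission flags from the `stat` module
    mode_from_flags = (stat.S_IRWXU |                   # rwx for owner
                       stat.S_IRGRP | stat.S_IXGRP |    # r-x for group
                       stat.S_IROTH | stat.S_IXOTH)     # r-x for others

    # the decimal value 493 is the same number as octal 755, whereas
    #  the decimal value 755 is a different mode altogether
    assert mode_from_digits == mode_from_flags == 493
    assert mode_from_digits != 755

Either value could be passed as the ``mode`` argument of ``chmod``.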

.. _`extra section`: `Writing directory parsers`_
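
To illustrate the ``copyfileobj`` method described above, here's a
sketch of copying a remote file to a local one via file-like objects.
The helper name ``fetch`` is hypothetical; the code assumes nothing
but the ``file`` and ``copyfileobj`` methods listed in this document::

    def fetch(host, remote_path, local_path):
        """
        Copy the remote file `remote_path` on `host` to the local
        file `local_path`, using binary mode on both sides.
        """
        source = host.file(remote_path, "rb")   # remote file-like object
        try:
            target = open(local_path, "wb")     # local file
            try:
                host.copyfileobj(source, target)
            finally:
                target.close()
        finally:
            source.close()

For plain downloads, ``FTPHost.download`` is simpler; the approach
above is mainly of interest if one side is some other file-like
object.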


File-like objects
-----------------

Construction
~~~~~~~~~~~~

Basics
``````

``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.

- ``FTPHost.file(path, mode='r')``

  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the
  ``getcwd`` method). As with local file objects, the default mode is
  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
  "wb".

- ``FTPHost.open(path, mode='r')``

  is an alias for ``file`` (see above).

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can also use Python's `with statement`_ when creating
``FTPFile`` objects::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")

At the end of the ``with`` block, the file will be closed
automatically.

If something goes wrong during the construction of the file or in the
body of the ``with`` statement, the file will be closed as well.
Exceptions will be propagated (as with ``try ... finally``).
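
If the code has to run on Python versions before 2.5, the ``with``
block above can be approximated with ``try ... finally``. This is just
a sketch; the function name ``write_text`` is made up for the
example::

    def write_text(host, path, text):
        """Write `text` to the file `path` on `host`."""
        # if `host.file` itself fails, there's nothing to close
        f = host.file(path, "w")
        try:
            f.write(text)
        finally:
            # executed whether `write` succeeded or raised an exception
            f.close()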

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods

::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is
supported as well, i. e. you can use a loop to read a file line by
line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

This feature obsoletes the ``xreadlines`` method which is deprecated
and will be removed in ``ftputil`` version 2.5.

For more on file objects, see the section `File objects`_ in the
Python Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.


Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in your own custom parser and have it used
with a single method call.

For this, you need to write a parser class by inheriting from the
class ``Parser`` in the ``ftp_stat`` module. Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like

::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in
the following table.

===== ========== ============ =============== =======================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== =======================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== =======================

If you can't extract all the desired data from a line (for
example, the MS format doesn't contain any information about the
owner of a file), set the corresponding values in the ``StatResult``
instance to ``None``.
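
As an illustration of filling the tuple, assume a made-up listing
format in which each line reads ``<type> <size> <seconds since the
epoch> <name>``, with the type being ``d`` for directories and ``-``
for files. Neither this format nor the helper name ``xyz_stat_tuple``
exists in ``ftputil``; the sketch only shows how the fields end up at
the indices from the table, with ``None`` for data the listing doesn't
provide::

    import stat

    def xyz_stat_tuple(line):
        """
        Split a line in the hypothetical XYZ format and return a
        tuple `(stat_tuple, name)`. `stat_tuple` has the layout
        described in the table above.
        """
        parts = line.split(None, 3)
        if len(parts) != 4 or parts[0] not in ("d", "-"):
            # a real parser would raise `ftp_error.ParserError` here
            raise ValueError("can't parse line %r" % line)
        type_char, size, mtime, name = parts
        if type_char == "d":
            st_mode = stat.S_IFDIR
        else:
            st_mode = stat.S_IFREG
        stat_tuple = (st_mode,       # 0: st_mode
                      None,          # 1: st_ino
                      None,          # 2: st_dev
                      None,          # 3: st_nlink
                      None,          # 4: st_uid
                      None,          # 5: st_gid
                      int(size),     # 6: st_size
                      None,          # 7: st_atime
                      float(mtime),  # 8: st_mtime
                      None)          # 9: st_ctime
        return stat_tuple, name

In a ``parse_line`` method, the tuple would be passed to the
``StatResult`` constructor and the name assigned to ``_st_name``.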

Parser classes can use several helper methods which are defined in
the class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
  an appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float like the one returned by ``time.mktime``. Note that
  the method expects the timestamp string already split at whitespace.
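
To give an idea of what a helper like ``parse_unix_mode`` has to do,
here's an independent sketch written with the standard ``stat``
module. It is *not* the code from ``ftputil`` and, as a
simplification, doesn't handle the setuid, setgid and sticky flags.
The names ``_FILE_TYPES`` and ``mode_from_string`` are made up for the
example::

    import stat

    _FILE_TYPES = {"d": stat.S_IFDIR, "l": stat.S_IFLNK, "-": stat.S_IFREG}

    def mode_from_string(mode_string):
        """
        Return an `st_mode` value for a string like "drwxr-xr-x".
        Raise a `ValueError` if the string can't be interpreted.
        """
        if len(mode_string) != 10 or mode_string[0] not in _FILE_TYPES:
            raise ValueError("invalid mode string %r" % mode_string)
        # the nine permission characters correspond to the nine
        #  lowest bits of the mode, from the highest to the lowest
        permissions = 0
        for char, expected in zip(mode_string[1:], "rwxrwxrwx"):
            if char not in (expected, "-"):
                raise ValueError("invalid mode string %r" % mode_string)
            permissions = permissions << 1
            if char == expected:
                permissions = permissions | 1
        return _FILE_TYPES[mode_string[0]] | permissions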

Additionally, there's an attribute ``_month_numbers`` which maps
three-letter month abbreviations to integers.
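
The following sketch shows how a month mapping and ``time.mktime`` can
be combined to turn the parts of a Unix-style timestamp into a float.
It's an illustration under simplifying assumptions, not the
implementation of ``parse_unix_time``: the names ``MONTH_NUMBERS`` and
``unix_time`` are made up, and for the form without a year the code
simply falls back to the previous year if the date would otherwise be
in the future::

    import time

    MONTH_NUMBERS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4,
                     "may": 5, "jun": 6, "jul": 7, "aug": 8,
                     "sep": 9, "oct": 10, "nov": 11, "dec": 12}

    def unix_time(month_abbr, day, year_or_time, now=None):
        """
        Return a float for arguments like "Nov"/"23"/"02:33" or
        "May"/"26"/"2005", interpreted as local time. `now` is the
        current time in seconds since the epoch (default:
        `time.time()`).
        """
        if now is None:
            now = time.time()
        month = MONTH_NUMBERS[month_abbr.lower()]
        day = int(day)
        has_time = ":" in year_or_time
        if has_time:
            # no year given; start with the current year
            hour, minute = [int(part) for part in year_or_time.split(":")]
            year = time.localtime(now)[0]
        else:
            hour, minute = 0, 0
            year = int(year_or_time)
        # the -1 lets `mktime` find out whether DST is in effect
        result = time.mktime((year, month, day, hour, minute, 0, 0, 0, -1))
        if has_time and result > now:
            # the date would be in the future; assume the previous year
            result = time.mktime((year - 1, month, day, hour, minute,
                                  0, 0, 0, -1))
        return result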

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help you write one.


FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_.

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you have already tried the
`latest version`_ of ``ftputil``. The bug might already be fixed
there.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you
are unsure whether the behaviour you found is a bug or not, you can
write to the `ftputil mailing list`_. In *either* case you *must not*
include confidential information (user id, password, file names, etc.)
in the problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. On the other hand,
you can use M2Crypto_ (in the source code archive, look for the
file ``M2Crypto/ftpslib.py``) which has a class derived from
``ftplib.FTP`` that supports SSL. You then can use a class
(not an object of it) similar to the following as a "session
factory" in ``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password, port=21):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            # `port` defaults to 21, the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password, port=21):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # `port` defaults to 21, the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.
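
If several combinations of port and transfer mode are needed, writing
one class per combination gets tedious. As a sketch, a small function
can build the session classes; the name ``make_session_factory`` and
its parameters are made up for this example and aren't part of
``ftputil``::

    import ftplib

    def make_session_factory(port=21, passive=True):
        """
        Return a class that can be passed as `session_factory`. Its
        instances connect to `port` and use passive mode if `passive`
        is true, else active mode.
        """
        class Session(ftplib.FTP):
            def __init__(self, host, userid, password):
                ftplib.FTP.__init__(self)
                self.connect(host, port)
                self.login(userid, password)
                self.set_pasv(passive)
        return Session

For instance, the result of ``make_session_factory(port=2121,
passive=False)`` could be passed as the ``session_factory`` argument
to connect on port 2121 in active mode.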

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on `time zone correction`_.
It may even be sufficient to call `synchronize_times`_.

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which converts anything that
looks like a line ending and thus corrupts binary files. Pass "b" as
the ``mode`` argument (see `Uploading and downloading files`_).
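
To see why a line ending conversion damages binary data, consider the
following illustration. It doesn't use ``ftputil`` and merely mimics
the kind of replacement a text mode transfer performs; the actual
conversion depends on the platform and on the transfer direction::

    # some binary data which happens to contain the bytes CR and LF
    binary_data = b"\x89PNG\r\n\x1a\n\x00\x00\r\n\x00"

    # replace CR/LF pairs with a single LF, as a text mode
    #  conversion might do
    converted = binary_data.replace(b"\r\n", b"\n")

    # the "converted" data is shorter and no longer the same file
    assert len(converted) == len(binary_data) - 2
    assert converted != binary_data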

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_

``isdir``, ``isfile`` or ``islink`` incorrectly return ``False``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like Python's counterparts under `os.path`_, ``ftputil``'s methods
return ``False`` if they can't find the given path.

Probably you used ``listdir`` on a directory and called ``is...()`` on
the returned names. ``listdir`` returns bare names without the
directory part, so if its argument wasn't the current directory, the
names are looked up in the wrong place, won't be found, and all
``is...()`` variants will return ``False``.
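
A sketch of the usual remedy is to join the directory and each name
before calling the ``is...()`` methods. The helper name
``files_in_directory`` is made up for this example::

    def files_in_directory(host, directory):
        """
        Return the names of the entries in `directory` on `host`
        for which `host.path.isfile` returns a true value.
        """
        names = []
        for name in host.listdir(directory):
            # `name` alone would be interpreted relative to the
            #  current directory, not relative to `directory`
            path = host.path.join(directory, name)
            if host.path.isfile(path):
                names.append(name)
        return names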

I can't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/`` though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads. If in doubt whether your approach will work, ask on the
  mailing list.

- ``FTPFile`` objects in text mode *may not* support charsets with more
  than one byte per character. Please email me your experiences
  (address above) if you work with multibyte text streams in FTP
  sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.


Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net>, in part based on suggestions
from users.

The ``lrucache`` module is written by Evan Prodromou
<evan@prodromou.name>.

Feedback is appreciated. :-)