``ftputil`` -- a high-level FTP client library
==============================================

:Version:   2.4
:Date:      2009-02-15
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')         # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698


``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.)

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode)


Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated and will no longer be
supported in ``ftputil`` version 2.5.

The exceptions are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
                CommandNotImplementedError(PermanentError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This is for similarity between the
  ``os`` module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSErrors``. If you
  change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the function as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``PermanentError``

  is raised for 5xx return codes from the FTP server. This
  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
  ``ftplib.error_perm`` are *not* identical).

- ``CommandNotImplementedError``

  indicates that an underlying command the code tries to use is not
  implemented. For an example, see the description of the
  `FTPHost.chmod`_ method.

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. However, the error
  codes aren't the same.

- ``InternalError``

  subsumes exception classes for signaling errors due to limitations
  of the FTP protocol or the concrete implementation of ``ftputil``.

- ``InaccessibleLoginDirError``

  This exception is only raised if *both* of the following conditions
  are met:

  - The directory in which "you" are placed upon login is not
    accessible, i. e. a ``chdir`` call with the directory as
    argument would fail.

  - You try to access a path which contains whitespace.

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``RootDirError``

  Because of the implementation of the ``lstat`` method it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)

  This problem does *not* affect stat calls on items *in* the root
  directory.

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a full
  hour.

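
The ``os=os`` parameter idea from the ``FTPOSError`` entry can be tried
without a server. The following is a minimal sketch, not part of
``ftputil``: ``count_regular_files`` is a hypothetical helper that only
relies on ``listdir``, ``path.join`` and ``path.isfile``. Since this
document lists the same names for ``FTPHost`` objects, passing ``os=host``
should behave the same way, but that is an assumption which the local
demonstration below does not exercise.

```python
import os
import tempfile


def count_regular_files(directory, os=os):
    """Count the regular files directly inside `directory`.

    `os` may be the `os` module or any object that offers `listdir`,
    `path.join` and `path.isfile` (an `FTPHost` instance, by assumption).
    """
    try:
        names = os.listdir(directory)
    except OSError:
        # FTPOSError is documented as derived from OSError, so this clause
        # is meant to cover both the local and the remote case.
        return 0
    count = 0
    for name in names:
        if os.path.isfile(os.path.join(directory, name)):
            count += 1
    return count


# Local demonstration in a temporary directory: two files, one directory.
demo_dir = tempfile.mkdtemp()
for file_name in ("a.txt", "b.txt"):
    open(os.path.join(demo_dir, file_name), "w").close()
os.mkdir(os.path.join(demo_dir, "subdir"))

local_count = count_regular_files(demo_dir)
missing_count = count_regular_files(os.path.join(demo_dir, "not_there"))
```

Here ``local_count`` counts only the two files, and the non-existing
directory is reported as empty because the ``OSError`` is caught.
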

``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

Basics
``````

``FTPHost`` instances can be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module.

Session factories
`````````````````

The keyword argument ``session_factory`` may be used to generate FTP
connections with factories other than the default ``ftplib.FTP``. For
example, the M2Crypto distribution uses a secure FTP class which is
derived from ``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new background
session (which happens for every remote file that is opened; see
below).

This functionality of the constructor also allows wrapping
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one, but ``ftplib.FTP`` only offers this by means of its
``connect`` method, not via its constructor. The solution is to use a
wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to another port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # pass the class itself as factory, not an instance like MySession()
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug report`_.

.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    with ftputil.FTPHost(host, user, password) as host:
        print host.listdir(host.curdir)

After the ``with`` block, the ``FTPHost`` instance and the
associated FTP sessions will be closed automatically.

If something goes wrong during the ``FTPHost`` construction or in the
body of the ``with`` statement, the instance is closed as well.
Exceptions will be propagated (as with ``try ... finally``).

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  are strings which denote the current and the parent directory on the
  remote server. ``sep`` identifies the path separator. Though `RFC 959`_
  (File Transfer Protocol) notes that these values may depend on the
  FTP server implementation, the Unix counterparts seem to work well
  in practice, even for non-Unix servers.

Remote file system navigation
`````````````````````````````

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).
  The mode may be "" or "a" for ASCII uploads or "b" for binary
  uploads. ASCII mode is the default (again, similar to regular local
  file objects).

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  ``source`` and ``target`` are strings. The description of the
  ``upload`` method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the ``upload`` method. The only difference is that the
  upload is only invoked if the time of the last modification for the
  source file is more recent than that of the target file or the
  target doesn't exist at all. If an upload actually happened, the
  return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`time shift`:

Time zone correction
````````````````````

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value (measured in seconds). The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ have to work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can be negative*.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or larger than 24 hours in absolute value, a ``TimeShiftError``
  is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently-set time shift value. See ``set_time_shift``
  (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server are in different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.

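
The rules for a valid time shift can be written down in a few lines. The
sketch below is not ``ftputil`` code: ``check_time_shift`` and
``rounded_time_shift`` are hypothetical helpers, ``ValueError`` stands in
for ``TimeShiftError``, and rounding the measured difference to the
nearest full hour is an assumption made here to absorb small clock
differences.

```python
ONE_HOUR = 60 * 60  # seconds


def check_time_shift(time_shift):
    """Return `time_shift` if it follows the rules described above.

    Raises ValueError (a stand-in for TimeShiftError) otherwise.
    """
    if time_shift % ONE_HOUR != 0:
        raise ValueError("time shift %r is not a multiple of a full hour"
                         % time_shift)
    if abs(time_shift) > 24 * ONE_HOUR:
        raise ValueError("absolute time shift %r is larger than 24 hours"
                         % time_shift)
    return time_shift


def rounded_time_shift(server_time, client_time):
    """Compute `server_time - client_time`, rounded to full hours."""
    difference = server_time - client_time
    return int(round(difference / float(ONE_HOUR))) * ONE_HOUR


# Server clock two hours ahead of the client, plus seven seconds of skew.
shift_ahead = rounded_time_shift(1000000 + 2 * ONE_HOUR + 7, 1000000)
# Server clock roughly one hour behind the client: the shift is negative.
shift_behind = rounded_time_shift(1000000 - 3590, 1000000)
```

Both results pass ``check_time_shift``, whereas a value such as 1800
(half an hour) is rejected.
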
Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. This doesn't construct
  "intermediate" directories which don't already exist. The ``mode``
  parameter is ignored; this is for compatibility with ``os.mkdir`` if
  an ``FTPHost`` object is passed into a function instead of the
  ``os`` module. See the explanation in the subsection `Exception hierarchy`_.

- ``makedirs(path, [mode])``

  works similarly to ``mkdir`` (see above), but also makes intermediate
  directories like ``os.makedirs``. The ``mode`` parameter is only
  there for compatibility with ``os.makedirs`` and is ignored.

- ``rmdir(path)``

  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)``

  removes the given remote, possibly non-empty, directory tree.
  The interface of this method is rather complex, in favor of
  compatibility with ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored.
  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
  set, all exceptions occurring during the tree iteration and
  processing are raised. These exceptions are all of type
  ``PermanentError``.

  To distinguish between error causes, pass in a callable for
  ``onerror``. This callable must accept three arguments: ``func``,
  ``path`` and ``exc_info``. ``func`` is a bound method object,
  *for example* ``your_host_object.listdir``. ``path`` is the path
  that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module
  and adapted for ``ftputil``.

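
An ``onerror`` callable for ``rmtree`` could look like the following
sketch. ``record_failure`` and the ``failures`` list are hypothetical
names chosen for this illustration. Since no FTP server is involved here,
a failing local ``os.rmdir`` call stands in for a bound method such as
``your_host_object.rmdir``.

```python
import os
import sys
import tempfile

failures = []


def record_failure(func, path, exc_info):
    """Collect failures instead of raising them.

    The parameters follow the description above: the callable that
    failed, the path it was called with and a `sys.exc_info()` tuple.
    """
    exception_type = exc_info[0]
    failures.append((func.__name__, path, exception_type.__name__))


# Simulate a single failing step of a tree removal.
missing_path = os.path.join(tempfile.mkdtemp(), "missing_dir")
try:
    os.rmdir(missing_path)
except OSError:
    record_failure(os.rmdir, missing_path, sys.exc_info())
```

Following the signature shown above, such a callable would be passed as
``host.rmtree(path, onerror=record_failure)``; afterwards ``failures``
tells which operation failed on which path.
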
Removing files and links
````````````````````````

- ``remove(path)``

  removes a file or link on the remote host, similar to ``os.remove``.

- ``unlink(path)``

  is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path, similar to ``os.listdir``. The special names
  ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and others) rely on the directory
listing format used by the FTP server. When connecting to a host,
``FTPHost``'s constructor tries to guess the right format, which
succeeds in most cases. However, if you get strange results or
``ParserError`` exceptions by a mere ``lstat`` call, please
`file a bug report`_.

If ``lstat`` or ``stat`` yield wrong modification dates or times, look
at the methods that deal with time zone differences (`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat`` (a "tuple" with
  additional attributes; see the documentation of the ``os`` module for
  details). However, due to the nature of the application, there are
  some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` cannot
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but maybe not for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Stat attributes that can't be determined at all are set to
    ``None``. For example, a line of a directory listing may not
    contain the date/time of a directory's last modification.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat``, and I know of no approach which
    mends this problem.

..

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. Alternatively, you can `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered or the
  target of the last link in the chain doesn't exist, a
  ``PermanentError`` is raised.

.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program *very* slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.

For this reason, ``ftputil`` by default saves (caches) the results
from directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can -- if necessary at all -- be controlled via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example,
after calling

::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons for modifying the cache parameters.
The first is when the number of possible entries is too low. You may
notice that when you are processing very large directories (e. g.
containing more than 1000 directories or files) and the program
becomes much slower than before. It's common for code to read a
directory with ``listdir`` and then process the found directories and
files. For this application, it's a good rule of thumb to set the
cache size to somewhat more than the number of directory entries
fetched with ``listdir``. This is done by the ``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximum number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.)

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.

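
The eviction rule just described is commonly called "least recently
used" (LRU). The following sketch illustrates that rule with an
``OrderedDict``. It is not the cache implementation used by ``ftputil``;
the class name ``LRUCacheSketch`` and its methods are assumptions made
for this example only.

```python
from collections import OrderedDict


class LRUCacheSketch(object):
    """Keep at most `size` entries; evict the least recently used one."""

    def __init__(self, size):
        self.size = size
        self._entries = OrderedDict()

    def _evict_surplus(self):
        while len(self._entries) > self.size:
            # The first item is the one unused for the longest time.
            self._entries.popitem(last=False)

    def __setitem__(self, path, stat_result):
        # Removing and re-inserting moves the path to the "recent" end.
        self._entries.pop(path, None)
        self._entries[path] = stat_result
        self._evict_surplus()

    def __getitem__(self, path):
        stat_result = self._entries.pop(path)  # KeyError if not cached
        self._entries[path] = stat_result      # mark as recently used
        return stat_result

    def __contains__(self, path):
        return path in self._entries

    def resize(self, new_size):
        self.size = new_size
        self._evict_surplus()


cache = LRUCacheSketch(2)
cache["/home/schwa/a"] = "stat of a"
cache["/home/schwa/b"] = "stat of b"
cache["/home/schwa/a"]                 # use "a"; "b" is now the oldest
cache["/home/schwa/c"] = "stat of c"   # exceeds the size, evicts "b"
```

After these steps, the entries for ``a`` and ``c`` are still cached,
while the entry for ``b`` has been dropped.
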
Caching is so effective because it reduces network accesses. This can
also be a disadvantage if the file system data on the remote server
changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. Changes unknown to the ``FTPHost`` object which reads its cache
are a different matter. The obvious example is changes made by
programs running on the remote host. However, cache inconsistencies
can also occur if two ``FTPHost`` objects change a file system
simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        #  knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. On closer inspection, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write accesses to a server, you have to handle them explicitly.

The most useful tool for this is probably the ``invalidate`` method.
In the example above, it could be used as::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        #  knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.

By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days, a stat call may still access cache data that old. To avoid this,
you can set the ``max_age`` attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. Any
older entry won't be retrieved from the cache; instead, its data is
fetched again from the remote host and then stored again for up to an
hour. To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

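
How a ``max_age`` limit can work is shown in the following sketch. It
is an illustration of the idea only, not the ``ftputil`` cache:
``AgingCacheSketch``, ``store``, ``lookup`` and the injectable ``clock``
parameter are all hypothetical. The clock is replaced by a fake one so
that the example doesn't have to wait for an hour.

```python
import time


class AgingCacheSketch(object):
    """Forget entries that are older than `max_age` seconds."""

    def __init__(self, max_age=None, clock=time.time):
        self.max_age = max_age   # None means entries never expire
        self._clock = clock
        self._entries = {}       # path -> (stat_result, time_stored)

    def store(self, path, stat_result):
        self._entries[path] = (stat_result, self._clock())

    def lookup(self, path):
        """Return the cached result, or None if missing or too old."""
        if path not in self._entries:
            return None
        stat_result, time_stored = self._entries[path]
        age = self._clock() - time_stored
        if self.max_age is not None and age > self.max_age:
            # Too old: drop it, so the caller has to fetch fresh data.
            del self._entries[path]
            return None
        return stat_result


fake_now = [0.0]  # mutable cell read by the fake clock below
cache = AgingCacheSketch(max_age=60 * 60, clock=lambda: fake_now[0])
cache.store("/some_file", "stat result")
fresh = cache.lookup("/some_file")   # still within one hour
fake_now[0] = 60 * 60 + 1            # a bit more than an hour later
stale = cache.lookup("/some_file")   # expired by now
```

With ``max_age`` set to ``None``, the age check is skipped and entries
stay available indefinitely.
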
If you are certain that the cache will be in the way, you can disable
and later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get
updated with newer data during this period. Note that even when the
cache is disabled, the file system data in the code can become
inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!

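
One way to avoid the gap between ``exists`` and ``getmtime`` is to skip
the check and handle the failure instead. The sketch below demonstrates
this on the local file system; ``mtime_or_none`` is a hypothetical
helper. According to the exception hierarchy described earlier,
``PermanentError`` is derived from ``OSError`` via ``FTPOSError``, so the
same ``except`` clause should also apply when an ``FTPHost`` object is
passed as ``os``. That remote case is an assumption and isn't tested
here.

```python
import os
import tempfile


def mtime_or_none(path, os=os):
    """Return the modification time of `path`, or None if it's gone."""
    try:
        # A single call: there's no window between a check and its use.
        return os.path.getmtime(path)
    except OSError:
        return None


demo_path = os.path.join(tempfile.mkdtemp(), "some_file")
open(demo_path, "w").close()
mtime_while_present = mtime_or_none(demo_path)

os.remove(demo_path)  # plays the role of "another process"
mtime_after_removal = mtime_or_none(demo_path)
```
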
Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)``

  iterates over a directory tree, similar to `os.walk`_. Actually,
  ``FTPHost.walk`` uses the code from Python with just the necessary
  modifications, so see the linked documentation.

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)``

  Similar to ``os.path.walk``, the ``walk`` method in
  `FTPHost.path`_ can be used, though ``FTPHost.walk`` is probably
  easier to use.

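
As a brief illustration of the ``walk`` interface, the following sketch
sums up the sizes of all files below a directory. ``total_size`` is a
hypothetical helper and the demonstration runs against a local temporary
directory with ``os.walk``. Because this document describes ``walk``,
``path.join`` and ``path.getsize`` for ``FTPHost`` objects too, calling
``total_size(top, os=host)`` is expected to work for a remote tree, but
this is not verified here.

```python
import os
import tempfile


def total_size(top, os=os):
    """Sum the sizes (in bytes) of all files below `top`."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            total += os.path.getsize(os.path.join(dirpath, filename))
    return total


# Build a small local tree: one file at the top, one in a subdirectory.
demo_top = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_top, "sub"))
with open(os.path.join(demo_top, "top.bin"), "wb") as top_file:
    top_file.write(b"abc")        # 3 bytes
with open(os.path.join(demo_top, "sub", "nested.bin"), "wb") as nested_file:
    nested_file.write(b"12345")   # 5 bytes

demo_total = total_size(demo_top)
```
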
816Other methods
817`````````````
818
819- ``close()``
820
821  closes the connection to the remote host. After this, no more
822  interaction with the FTP server is possible without using a new
823  ``FTPHost`` object.
824
825- ``rename(source, target)``
826
827  renames the source file (or directory) on the FTP server.

.. _`FTPHost.chmod`:

- ``chmod(path, mode)``

  sets the access mode (permission flags) for the given path. The mode
  is an integer as returned for the mode by the ``stat`` and ``lstat``
  methods. Be careful: Usually, mode values are written as octal
  numbers, for example 0755 to make a directory readable and writable
  for the owner, but not writable for the group and others. If you
  want to use such octal values, rely on Python's support for them::

    host.chmod("some_directory", 0755)

  *Note the leading zero.*

  Not all FTP servers support the ``chmod`` command. In case of
  an exception, how do you know if the path doesn't exist or if
  the command itself is invalid? If the FTP server complies with
  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
  command isn't allowed. ``ftputil`` maps this special error
  response to a ``CommandNotImplementedError`` which is derived from
  ``PermanentError``.

  So you need to write code like this::

    import ftputil
    from ftputil import ftp_error

    host = ftputil.FTPHost(server, user, password)
    try:
        host.chmod("some_file", 0644)
    except ftp_error.CommandNotImplementedError:
        # chmod not supported
        ...
    except ftp_error.PermanentError:
        # possibly a non-existent file
        ...

  Because the ``CommandNotImplementedError`` is more specific, you
  have to test for it first.
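
  If you'd rather not depend on the octal notation, the mode can also
  be assembled from the constants in Python's ``stat`` module. This is
  only a sketch; the variable names are chosen for this example::

    import stat

    # read/write for the owner, read-only for group and others
    file_mode = (stat.S_IRUSR | stat.S_IWUSR |
                 stat.S_IRGRP | stat.S_IROTH)
    # read/write/search for the owner, read/search for the rest
    directory_mode = (stat.S_IRWXU |
                      stat.S_IRGRP | stat.S_IXGRP |
                      stat.S_IROTH | stat.S_IXOTH)

    # the same values, parsed from their octal string form
    assert file_mode == int("644", 8)
    assert directory_mode == int("755", 8)

  These values can then be passed as the ``mode`` argument, as in
  ``host.chmod("some_file", file_mode)``.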

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object source to the
  file-like object target. The only difference to
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you. Possibly you are
  lucky and someone has already written a parser you can use. Please
  ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_
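
As a rough illustration of ``copyfileobj`` (see above), the following
helper copies a local file to the server through a remote file-like
object. The function name and its arguments are invented for this
sketch; it relies only on ``FTPHost.file`` and ``FTPHost.copyfileobj``
as described in this document::

    def copy_local_file_to_host(host, local_path, remote_path):
        """
        Copy the local file `local_path` to `remote_path` on `host`
        in binary mode. `host` is expected to be an `FTPHost`
        instance.
        """
        source = open(local_path, "rb")
        try:
            target = host.file(remote_path, "wb")
            try:
                host.copyfileobj(source, target)
            finally:
                # close the remote file even if the copy failed
                target.close()
        finally:
            source.close()

The nested ``try ... finally`` statements make sure that both files
are closed even if the transfer fails.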


File-like objects
-----------------

Construction
~~~~~~~~~~~~

Basics
``````

``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.

- ``FTPHost.file(path, mode='r')``

  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the
  ``getcwd`` method). As with local file objects, the default mode is
  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
  "wb".

- ``FTPHost.open(path, mode='r')``

  is an alias for ``file`` (see above).

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can also use Python's `with statement`_ with the file-like
objects returned by ``FTPHost.file``::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")

At the end of the ``with`` block, the file will be closed
automatically.

If something goes wrong during the construction of the file or in the
body of the ``with`` statement, the file will be closed as well.
Exceptions will be propagated (as with ``try ... finally``).

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods

::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is
supported as well, i. e. you can use a loop to read a file line by
line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

This feature obsoletes the ``xreadlines`` method which is deprecated
and will be removed in ``ftputil`` version 2.5.

For more on file objects, see the section `File objects`_ in the
Python Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.
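
A small sketch of the iterator protocol in use: the helper below
counts the lines of a remote text file that contain a given string.
The function name and its arguments are assumptions made for this
example; ``host`` stands for an ``FTPHost`` object::

    def count_matching_lines(host, path, text):
        """
        Return the number of lines in the remote text file `path`
        that contain the string `text`.
        """
        input_file = host.file(path, "r")
        try:
            count = 0
            for line in input_file:
                if text in line:
                    count += 1
            return count
        finally:
            # close the remote file even if reading failed
            input_file.close()

Closing the file in a ``finally`` clause works on Python versions
without the ``with`` statement as well.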


Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in your own custom parser with a single
method call.

For this, you need to write a parser class by inheriting from the
class ``Parser`` in the ``ftp_stat`` module. Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like

::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in
the following table.

===== ========== ============ =============== ================================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== ================================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== ================================

If you can't extract all the desired data from a line (for
example, the MS format doesn't contain any information about the
owner of a file), set the corresponding values in the ``StatResult``
instance to ``None``.

Parser classes can use several helper methods which are defined in
the class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
  an appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number like the one returned by ``time.mktime``.
  Note that the method expects the timestamp string already split at
  whitespace.

Additionally, there's an attribute ``_month_numbers`` which maps
three-letter month abbreviations to integers.

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help you write one.
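
To make the line-splitting step more concrete, here is a sketch for a
made-up listing format of the form ``<type> <size> <seconds since the
epoch> <name>``. The format, the function name and the use of
``ValueError`` are assumptions for this example only. It deliberately
avoids importing ``ftputil``, so it returns the plain values in the
order of the table above instead of a ``StatResult`` object::

    import stat

    def parse_xyz_line(line):
        """
        Parse a line of the made-up format described above, where
        <type> is "d" for directories and "-" for regular files.
        Return a tuple of the ten stat values and the file name.
        Raise `ValueError` if the line can't be parsed.
        """
        parts = line.rstrip("\r\n").split(None, 3)
        if len(parts) != 4 or parts[0] not in ("d", "-"):
            raise ValueError("can't parse line %r" % line)
        type_, size, mtime, name = parts
        if type_ == "d":
            st_mode = stat.S_IFDIR
        else:
            st_mode = stat.S_IFREG
        # values the listing doesn't provide are set to `None`
        st_ino = st_dev = st_nlink = st_uid = st_gid = None
        st_atime = st_ctime = None
        # `int` and `float` raise `ValueError` for malformed numbers
        stat_values = (st_mode, st_ino, st_dev, st_nlink, st_uid,
                       st_gid, int(size), st_atime, float(mtime),
                       st_ctime)
        return stat_values, name

In a real ``parse_line`` method you would raise
``ftp_error.ParserError`` instead of ``ValueError``, pass the tuple to
``ftp_stat.StatResult`` and set ``_st_name`` on the result, as shown
in the ``XyzParser`` example.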


FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_.

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you have already tried the
`latest version`_ of ``ftputil``. The bug might already have been
fixed there.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you
are unsure whether the behaviour you found is a bug or not, you can
write to the `ftputil mailing list`_. In *either* case you *must not*
include confidential information (user id, password, file names, etc.)
in the problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. On the other hand,
you can use M2Crypto_ (in the source code archive, look for the
file ``M2Crypto/ftpslib.py``) which has a class derived from
``ftplib.FTP`` that supports SSL. You can then pass a class
(not an instance of it) similar to the following as a "session
factory" to ``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password, port=21):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            # `port` defaults to 21, the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password, port=21):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # `port` defaults to 21, the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on setting the
`time shift`_. It may even be sufficient to call `synchronize_times`_.

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode which will try to convert
presumed line endings and thus corrupt binary files. Pass "b" as the
``mode`` argument (see `Uploading and downloading files`_).

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in an own parser`_, or preferably ask on the `mailing list`_ for
help.

.. _`plug in an own parser`: `Writing directory parsers`_

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to the implementation of ``lstat`` it cannot return a sensible
  value for the root directory ``/`` though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads. If in doubt whether your approach will work, ask on the
  mailing list.

- ``FTPFile`` objects in text mode *may not* support charsets with more
  than one byte per character. Please email me your experiences
  (address above), if you work with multibyte text streams in FTP
  sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.


Files
-----

Unless overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net>, in part based on suggestions
from users.

The ``lrucache`` module is written by Evan Prodromou
<evan@prodromou.name>.

Feedback is appreciated. :-)