``ftputil`` - a high-level FTP client library
=============================================

:Version:   2.3
:Date:      2008-09-06
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')         # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698


``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.).

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode)


Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated.

The exceptions are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This is for similarity between the
  ``os`` module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSError``
  exceptions. If you change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the
  function as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and a remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``PermanentError``

  is raised for 5xx return codes from the FTP server
  (that's similar but *not* identical to ``ftplib.error_perm``).

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. (However, the error
  codes aren't the same.)

- ``InternalError``

  subsumes exception classes for signaling errors due to limitations
  of the FTP protocol or the concrete implementation of ``ftputil``.

- ``InaccessibleLoginDirError``

  This exception is only raised if *both* of the following conditions
  are met:

  - The directory in which "you" are placed upon login is not
    accessible, i. e. a ``chdir`` call fails.

  - You try to access a path which contains whitespace.

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``RootDirError``

  Because of the implementation of the ``lstat`` method it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)

  This problem does *not* affect stat calls on items *in* the root
  directory.

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a
  full hour.


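The substitution idea described for ``FTPOSError`` can be sketched as
follows. This is only an illustration, not code from ``ftputil``: the
function name ``remove_if_present`` and the parameter name ``fs`` are
hypothetical, and the demonstration runs against the local ``os``
module. Since ``FTPOSError`` is derived from ``OSError`` and
``FTPHost`` offers a ``remove`` method, passing an ``FTPHost``
instance as ``fs`` should behave in the same way.

```python
import os
import tempfile


def remove_if_present(path, fs=os):
    """Remove `path`; return True on success, False if removal failed.

    `fs` is any object with an os-like `remove` method that raises
    OSError (or a subclass of it) on failure, e.g. the `os` module.
    """
    try:
        fs.remove(path)
    except OSError:
        return False
    return True


# Demonstration on the local file system.
handle, temp_path = tempfile.mkstemp()
os.close(handle)
first_attempt = remove_if_present(temp_path)   # the file exists
second_attempt = remove_if_present(temp_path)  # the file is already gone
```

With a connected ``FTPHost`` object ``host``, the call would read
``remove_if_present(remote_path, fs=host)``.
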
``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

Basics
``````

``FTPHost`` instances may be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module.

Session factories
`````````````````

The keyword argument ``session_factory`` may be used to generate FTP
connections with factories other than the default ``ftplib.FTP``. For
example, the M2Crypto distribution uses a secure FTP class which is
derived from ``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new background
session (which happens for every remote file that is opened; see
below).

This functionality of the constructor also allows you to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one, but ``ftplib.FTP`` only offers this by means of its
``connect`` method and not via its constructor. The solution is to
provide a wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # don't pass an instance, MySession(), as factory; pass the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug report`_.

.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    with ftputil.FTPHost(host, user, password) as host:
        print host.listdir(host.curdir)

After the ``with`` block, the ``FTPHost`` instance and the
associated FTP sessions will be closed automatically.

If something goes wrong during the ``FTPHost`` construction or in the
body of the ``with`` statement, the instance is closed as well.
Exceptions will be propagated (as with ``try ... finally``).

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  ``curdir`` and ``pardir`` are strings which denote the current and
  the parent directory on the remote server. ``sep`` identifies the
  path separator. Though `RFC 959`_ (File Transfer Protocol) notes
  that these values may depend on the FTP server implementation, the
  Unix counterparts seem to work well in practice, even for non-Unix
  servers.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

Remote file system navigation
`````````````````````````````

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).
  The mode may be ``''`` or ``'a'`` for ASCII uploads or ``'b'`` for
  binary uploads. ASCII mode is the default (again, similar to regular
  local file objects).

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  ``source`` and ``target`` are strings. The description of the
  ``upload`` method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the ``upload`` method. The only difference is that the
  upload is only invoked if the time of the last modification for
  the source file is more recent than that of the target file, or
  the target doesn't exist at all. If an upload actually happened,
  the return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.

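The decision rule shared by ``upload_if_newer`` and
``download_if_newer`` can be sketched as a small function. This is a
simplified illustration under stated assumptions, not the code used by
``ftputil``: the name ``should_transfer`` is hypothetical, a missing
target is represented by ``None``, and corrections for time zone
differences or imprecise server timestamps are left out.

```python
def should_transfer(source_mtime, target_mtime):
    """Return True if the source should be copied to the target.

    Both arguments are modification times in seconds; `target_mtime`
    is None if the target doesn't exist.
    """
    if target_mtime is None:
        # no target yet, so always transfer
        return True
    # transfer only if the source is more recent than the target
    return source_mtime > target_mtime


missing_target = should_transfer(1000.0, None)
newer_source = should_transfer(2000.0, 1000.0)
# An interrupted upload leaves a target that is newer than the source,
# so the rule declines to transfer the file again.
after_interrupted_upload = should_transfer(1000.0, 1500.0)
```

The last case shows why an incomplete target file has to be removed,
or ``upload`` has to be used, after a failed transfer.
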
.. _`time shift`:

Time zone correction
````````````````````

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value (measured in seconds). The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ should work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can* be negative.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or its absolute (unsigned) value larger than 24 hours, a
  ``TimeShiftError`` is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently set time shift value. See ``set_time_shift``
  (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server are in different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.

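The definition of the time shift and the validity rule stated above
can be illustrated with a short calculation. This is a sketch under
assumptions: the function names are hypothetical, both timestamps are
taken to be seconds observed at the same moment, and rounding to the
nearest full hour is merely one way to get rid of a small clock skew.
It does not show how ``synchronize_times`` works internally.

```python
SECONDS_PER_HOUR = 3600


def rounded_time_shift(server_time, client_time):
    """Return `server_time - client_time`, rounded to a full hour."""
    raw_shift = server_time - client_time
    hours = int(round(raw_shift / float(SECONDS_PER_HOUR)))
    return hours * SECONDS_PER_HOUR


def is_valid_time_shift(time_shift):
    """Check the rule from the text: a multiple of a full hour with an
    absolute value of at most 24 hours."""
    return (time_shift % SECONDS_PER_HOUR == 0
            and abs(time_shift) <= 24 * SECONDS_PER_HOUR)


client_time = 1000000.0
# server clock two hours ahead, plus 12 seconds of skew
shift_ahead = rounded_time_shift(client_time + 2 * 3600 + 12, client_time)
# server clock five hours behind, plus 30 seconds of skew
shift_behind = rounded_time_shift(client_time - 5 * 3600 - 30, client_time)
```

A value obtained this way could then be passed to ``set_time_shift``.
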
Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. This doesn't construct
  "intermediate" directories which don't already exist. The ``mode``
  parameter is ignored; this is for compatibility with ``os.mkdir`` if
  an ``FTPHost`` object is passed into a function instead of the
  ``os`` module (see the section on the exception hierarchy above for
  an explanation).

- ``makedirs(path, [mode])``

  works similarly to ``mkdir`` (see above), but also makes
  intermediate directories, like ``os.makedirs``. The ``mode``
  parameter is only there for compatibility with ``os.makedirs`` and
  is ignored.

- ``rmdir(path)``

  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)``

  removes the given remote, possibly non-empty, directory tree.
  The interface of this method is rather complex, in favor of
  compatibility with ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored.
  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
  set, all exceptions occurring during the tree iteration and
  processing are raised. These exceptions are all of type
  ``PermanentError``.

  To distinguish between error situations, pass in a callable
  for ``onerror``. This callable must accept three arguments:
  ``func``, ``path`` and ``exc_info``. ``func`` is a bound method
  object, *for example* ``your_host_object.listdir``. ``path`` is
  the path that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module
  and adapted for ``ftputil``.

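The three-argument ``onerror`` callable described for ``rmtree`` might
look like the following sketch. The names ``record_failure`` and
``failures`` are hypothetical. To keep the example runnable without an
FTP server, the failing step is simulated with the local ``os.rmdir``;
an ``FTPHost`` would pass one of its own bound methods as ``func``.

```python
import os
import sys
import tempfile

failures = []


def record_failure(func, path, exc_info):
    """Collect information on a failed step instead of raising."""
    exc_type, exc_value, exc_traceback = exc_info
    failures.append((func.__name__, path, exc_type))


# Create and remove a directory so that the path is known not to exist.
missing_dir = tempfile.mkdtemp()
os.rmdir(missing_dir)

# Simulate a failing removal step and report it as described above.
try:
    os.rmdir(missing_dir)
except OSError:
    record_failure(os.rmdir, missing_dir, sys.exc_info())
```

With a connected ``FTPHost`` object ``host``, the callable would be
passed as ``host.rmtree(path, onerror=record_failure)``.
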
Removing files and links
````````````````````````

- ``remove(path)``

  removes a file or link on the remote host (similar to ``os.remove``).

- ``unlink(path)``

  is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path; similar to ``os.listdir``. The special names
  ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and others) rely on the directory
listing format used by the FTP server. When connecting to a host,
``FTPHost``'s constructor tries to guess the right format, which
mostly succeeds. However, if you get strange results or
``ParserError`` exceptions from a mere ``lstat`` call, please `file a
bug report`_.

If ``lstat`` or ``stat`` yields wrong modification dates or times, look
at the methods that deal with time zone differences (`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat`` (a "tuple" with
  additional attributes; see the documentation of the ``os`` module for
  details). However, due to the nature of the application, there are
  some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` cannot
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but may not be for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Items that can't be determined at all are set to ``None``.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat`` and I know of no approach which
    mends this problem.

..

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. You may also want to `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered, a
  ``PermanentError`` is raised.

.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program very slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.

For this reason, ``ftputil`` by default saves (caches) the results
from directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can - if necessary at all - be controlled via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example,
after calling

::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons for modifying cache parameters. The first is when the
number of possible entries is too low. You may notice this when you
are processing very large directories (e. g. above 1000 directories or
files) and the program becomes much slower than before. It's common
for code to read a directory with ``listdir`` and then process the
found directories and files. For this application, it's a good rule of
thumb to set the cache size to somewhat more than the number of
directory entries fetched with ``listdir``. This is done by the
``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximal number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.)

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.

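The eviction rule just described (drop the entries which have not been
used for the longest time) is commonly called "least recently used".
The following sketch shows the idea with an ``OrderedDict``. It is an
illustration only: the class ``LRUCacheSketch`` is hypothetical and is
not the cache implementation used by ``ftputil``.

```python
from collections import OrderedDict


class LRUCacheSketch(object):
    """Keep at most `size` entries; evict the least recently used."""

    def __init__(self, size):
        self.size = size
        self._entries = OrderedDict()

    def __contains__(self, path):
        return path in self._entries

    def __getitem__(self, path):
        # Reading an entry marks it as most recently used.
        value = self._entries.pop(path)
        self._entries[path] = value
        return value

    def __setitem__(self, path, value):
        self._entries.pop(path, None)
        self._entries[path] = value
        self._evict()

    def resize(self, new_size):
        self.size = new_size
        self._evict()

    def _evict(self):
        # The oldest entries are at the front of the ordered dict.
        while len(self._entries) > self.size:
            self._entries.popitem(last=False)


cache = LRUCacheSketch(2)
cache["/home/schwa/a"] = "stat result for a"
cache["/home/schwa/b"] = "stat result for b"
cache["/home/schwa/a"]                        # use `a`; `b` is now oldest
cache["/home/schwa/c"] = "stat result for c"  # evicts `b`
```

After the last assignment, the entries for ``a`` and ``c`` are still
present, whereas the entry for ``b`` has been dropped.
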
Caching is so effective because it reduces network accesses. This can
also be a disadvantage if the file system data on the remote server
changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. Changes unknown to the ``FTPHost`` object which reads its cache
are a different matter. An obvious example is changes made by programs
running on the remote host. Cache inconsistencies can also occur if
two ``FTPHost`` objects change a file system simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        #  knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. After some thinking, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write accesses to a server, you have to handle them explicitly.

The most useful tool for this probably is the ``invalidate`` method.
In the example above, it could be used as::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        #  knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.

By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days, a stat call may still access cache data that old. To avoid this,
you can set the ``max_age`` attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. This
means any older entry won't be retrieved from the cache; its data is
instead fetched again from the remote host (and then again stored for
up to an hour). To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

If you are certain that the cache is in the way, you can disable and
later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get
updated with newer data during this period. Note that even when the
cache is disabled, the file system data in the code can become
inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!

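One way to sidestep the gap between ``exists`` and ``getmtime`` is to
skip the check and handle the failure instead. The sketch below shows
this with the local ``os`` module; the function name ``mtime_or_none``
and the parameter ``fs`` are hypothetical. For an ``FTPHost`` passed
as ``fs``, it assumes that a missing remote path surfaces as an
exception derived from ``OSError``, which fits the exception hierarchy
described earlier but isn't spelled out for ``getmtime`` here.

```python
import os
import tempfile


def mtime_or_none(path, fs=os):
    """Return the modification time of `path`, or None on failure.

    There is no separate `exists` check, so no other process can
    remove the file between a check and the actual request.
    """
    try:
        return fs.path.getmtime(path)
    except OSError:
        return None


# Demonstration on the local file system.
handle, temp_path = tempfile.mkstemp()
os.close(handle)
existing_mtime = mtime_or_none(temp_path)
os.remove(temp_path)
missing_mtime = mtime_or_none(temp_path)
```
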
Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)``

  iterates over a directory tree, similar to `os.walk`_ in Python 2.3
  and above. Actually, ``FTPHost.walk`` uses the code from Python with
  just the necessary modifications, so see the linked documentation.

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)``

  Similar to ``os.path.walk``, the ``walk`` method in
  `FTPHost.path`_ can be used.

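Because ``FTPHost.walk`` follows ``os.walk``, tree-walking code can be
written once and pointed at either file system. The following sketch
runs against the local ``os`` module only; the function name
``collect_file_paths`` and the parameter ``fs`` are hypothetical.
It relies solely on ``walk`` and ``path.join``, both of which are
listed above for ``FTPHost`` objects.

```python
import os
import shutil
import tempfile


def collect_file_paths(top, fs=os):
    """Return a sorted list of the paths of all files below `top`.

    `fs` needs an os-like `walk` method and a `path.join` function.
    """
    paths = []
    for dirpath, dirnames, filenames in fs.walk(top):
        for name in filenames:
            paths.append(fs.path.join(dirpath, name))
    return sorted(paths)


# Build a small local tree to walk: top/a.txt and top/sub/b.txt
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
for relative_path in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(top, relative_path), "w") as file_object:
        file_object.write("data")

found = collect_file_paths(top)
expected = [os.path.join(top, "a.txt"),
            os.path.join(top, "sub", "b.txt")]
shutil.rmtree(top)
```

With a connected ``FTPHost`` object ``host``, the call would read
``collect_file_paths(remote_top, fs=host)``.
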
Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object ``source`` to the
  file-like object ``target``. The only difference from
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you. Possibly you are
  lucky and someone has already written a parser you can use. Please
  ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_


842 File-like objects
843 -----------------
844
845 Construction
846 ~~~~~~~~~~~~
847
848 Basics
849 ``````
850
``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.
853
854 - ``FTPHost.file(path, mode='r')``
855
  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the
  ``getcwd`` method). As with local file objects, the default mode is
  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
  "wb".
861
862 - ``FTPHost.open(path, mode='r')``
863
864   is an alias for ``file`` (see above).
865
866 Support for the ``with`` statement
867 ``````````````````````````````````
868
If you are sure that all the users of your code use at least Python
2.5, you can also use Python's `with statement`_ with the file-like
objects returned by ``FTPHost.file``::
872
    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")
883
884 At the end of the ``with`` block, the file will be closed
885 automatically.
886
887 If something goes wrong during the construction of the file or in the
888 body of the ``with`` statement, the file will be closed as well.
889 Exceptions will be propagated (as with ``try ... finally``).
890
891 .. _`with statement`: http://www.python.org/dev/peps/pep-0343/
892
893 Attributes and methods
894 ~~~~~~~~~~~~~~~~~~~~~~
895
896 The methods
897
898 ::
899
    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()
907
908 and the attribute ``closed`` have the same semantics as for file
909 objects of a local disk file system. The iterator protocol is also
910 supported, i. e. you can use a loop to read a file line by line::
911
    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()
918
919 For more on file objects, see the section `File objects`_ in the
920 Python Library Reference.
921
922 .. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html
923
924 Note that ``ftputil`` supports both binary mode and text mode with the
925 appropriate line ending conversions.
926
927
928 Writing directory parsers
929 -------------------------
930
``ftputil`` recognizes the two most widely used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in your own parser with a single method
call.
936
937 For this, you need to write a parser class by inheriting from the
938 class ``Parser`` in the ``ftp_stat`` module. Here's an example::
939
    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test
980
981 A ``StatResult`` object is similar to the value returned by
982 `os.stat`_ and is usually built with statements like
983
984 ::
985
    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...
991
992 with the arguments of the ``StatResult`` constructor described in
993 the following table.
994
995 ===== ========== ============ =============== =======================
996 Index Attribute  os.stat type StatResult type Notes
997 ===== ========== ============ =============== =======================
998 0     st_mode    int          int
999 1     st_ino     long         long
1000 2     st_dev     long         long
1001 3     st_nlink   int          int
1002 4     st_uid     int          str             usually only available as string
1003 5     st_gid     int          str             usually only available as string
1004 6     st_size    long         long
1005 7     st_atime   int/float    float
1006 8     st_mtime   int/float    float
1007 9     st_ctime   int/float    float
1008 \-    _st_name   \-           str             file name without directory part
1009 \-    _st_target \-           str             link target
1010 ===== ========== ============ =============== =======================
1011
1012 If you can't extract all the desirable data from a line (for
1013 example, the MS format doesn't contain any information about the
1014 owner of a file), set the corresponding values in the ``StatResult``
1015 instance to ``None``.
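
The following sketch shows how the fields of such a tuple might be
collected for a made-up listing format. Neither the format
(``name;size;date time;type``) nor the helper name
``stat_tuple_from_line`` comes from ``ftputil``; both are assumptions
for illustration. In a real parser you would pass the resulting tuple
to ``ftp_stat.StatResult`` inside ``parse_line``::

    import stat
    import time

    def stat_tuple_from_line(line):
        """
        Return a 10-tuple in `os.stat` order for a line in the
        hypothetical format "name;size;YYYY-MM-DD HH:MM;type", where
        type is "d" for directories and "f" for regular files.
        Values the format doesn't provide are set to `None`.
        """
        # in a real parser, `name` would be stored in `_st_name`
        name, size, timestamp, entry_type = line.split(";")
        if entry_type == "d":
            st_mode = stat.S_IFDIR
        else:
            st_mode = stat.S_IFREG
        # interpret the timestamp as local time, yielding a float
        st_mtime = time.mktime(time.strptime(timestamp, "%Y-%m-%d %H:%M"))
        # order: st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid,
        #  st_size, st_atime, st_mtime, st_ctime
        return (st_mode, None, None, None, None, None,
                int(size), None, st_mtime, None)

    fields = stat_tuple_from_line("readme.txt;1024;2008-09-06 12:30;f")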
1016
1017 Parser classes can use several helper methods which are defined in
1018 the class ``Parser``:
1019
1020 - ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
1021   an appropriate ``st_mode`` value.
1022
1023 - ``parse_unix_time`` returns a float number usable for the
1024   ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
1025   "May"/"26"/"2005". Note that the method expects the timestamp string
1026   already split at whitespace.
1027
- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number as ``time.mktime`` does. Note that the
  method expects the timestamp string already split at whitespace.
1031
1032 Additionally, there's an attribute ``_month_numbers`` which maps
1033 three-letter month abbreviations to integers.
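
To illustrate what a helper like ``parse_unix_mode`` has to do, here
is an independent sketch of such a conversion. It is not the code from
``ftputil`` and ignores the special bits (setuid, setgid, sticky); the
function name ``mode_from_string`` is made up for this example::

    import stat

    def mode_from_string(mode_string):
        """
        Convert a string like "drwxr-xr-x" into a value suitable for
        `st_mode`. Only the file type and the nine permission bits
        are considered.
        """
        file_types = {"d": stat.S_IFDIR, "l": stat.S_IFLNK,
                      "-": stat.S_IFREG}
        permissions = 0
        # each of the nine characters contributes one bit, from the
        #  owner's read bit down to the execute bit for others
        for char in mode_string[1:10]:
            permissions = permissions << 1
            if char != "-":
                permissions = permissions | 1
        return file_types[mode_string[0]] | permissions

    st_mode = mode_from_string("drwxr-xr-x")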
1034
1035 For more details, see the two "standard" parsers ``UnixParser`` and
1036 ``MSParser`` in the module ``ftp_stat.py``.
1037
1038 To actually *use* the parser, call the method `set_parser`_ of the
1039 ``FTPHost`` instance.
1040
1041 If you can't write a parser or don't want to, please ask on the
1042 `ftputil mailing list`_. Possibly someone has already written a parser
1043 for your server or can help to do it.
1044
1045
1046 FAQ / Tips and tricks
1047 ---------------------
1048
1049 Where can I get the latest version?
1050 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1051
See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_.
1055
1056 .. _`download page`: http://ftputil.sschwarzer.net/download
1057 .. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
1058 .. _`comp.lang.python`: news:comp.lang.python
1059
1060 Is there a mailing list on ``ftputil``?
1061 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1062
1063 Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
1064 subscribe or read the archives.
1065
1066 I found a bug! What now?
1067 ~~~~~~~~~~~~~~~~~~~~~~~~
1068
Before reporting a bug, make sure that you have tried the `latest
version`_ of ``ftputil``. The bug may already have been fixed there.
1071
1072 .. _`latest version`: http://ftputil.sschwarzer.net/download
1073
1074 Please see http://ftputil.sschwarzer.net/issuetrackernotes for
1075 guidelines on entering a bug in ``ftputil``'s ticket system. If you
1076 are unsure if the behaviour you found is a bug or not, you can write
1077 to the `ftputil mailing list`_. In *either* case you *must not*
1078 include confidential information (user id, password, file names, etc.)
1079 in the problem report! Be careful!
1080
1081 Does ``ftputil`` support SSL?
1082 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1083
1084 ``ftputil`` has no *built-in* SSL support. On the other hand,
1085 you can use M2Crypto_ (in the source code archive, look for the
1086 file ``M2Crypto/ftpslib.py``) which has a class derived from
1087 ``ftplib.FTP`` that supports SSL. You then can use a class
1088 (not an object of it) similar to the following as a "session
1089 factory" in ``ftputil.FTPHost``'s constructor::
1090
    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            # connect on the default port; pass a port number as the
            #  second argument if the server uses another one
            self.connect(host)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual
1112
1113 .. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads
1114
1115 Connecting on another port
1116 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1117
1118 By default, an instantiated ``FTPHost`` object connects on the usual
1119 FTP ports. If you have to use a different port, refer to the
1120 section `FTPHost construction`_.
1121
1122 You can use the same approach to connect in active or passive mode, as
1123 you like.
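
A session class for a non-default port could look like the following
sketch. It uses only ``ftplib`` from the standard library; the class
name ``PortFTPSession`` and the port number 2121 are placeholders
chosen for this example::

    import ftplib

    EXAMPLE_PORT = 2121  # placeholder, replace with your server's port

    class PortFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """
            Act like ftplib.FTP's constructor but connect on the
            port `EXAMPLE_PORT` instead of the default port.
            """
            ftplib.FTP.__init__(self)
            self.connect(host, EXAMPLE_PORT)
            self.login(userid, password)

As with the other session classes in this document, you would pass the
class itself (not an instance) as the ``session_factory`` argument of
``FTPHost``'s constructor.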
1124
1125 Using active or passive connections
1126 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1127
1128 Use a wrapper class for ``ftplib.FTP``, as described in section
1129 `FTPHost construction`_::
1130
    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # connect on the default FTP port
            self.connect(host)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)
1144
1145 Use this class as the ``session_factory`` argument in ``FTPHost``'s
1146 constructor.
1147
1148 Conditional upload/download to/from a server in a different time zone
1149 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1150
1151 You may find that ``ftputil`` uploads or downloads files
1152 unnecessarily, or not when it should. This can happen when the FTP
1153 server is in a different time zone than the client on which
1154 ``ftputil`` runs. Please see the section on setting the
1155 `time shift`_. It may even be sufficient to call `synchronize_times`_.
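
As a rough sketch, a conditional download that first lets ``ftputil``
determine the time shift might look like this. The helper name
``download_newer_files`` is hypothetical; it assumes that ``host`` is
an ``FTPHost`` object and that the current remote directory is
writable, which ``synchronize_times`` needs::

    def download_newer_files(host, names):
        """
        Download each remote file in `names` to the current local
        directory if it is newer than an existing local copy.
        `host` is expected to be an `FTPHost` object.
        """
        # let ftputil work out the time difference to the server
        host.synchronize_times()
        for name in names:
            # remote, local, binary mode
            host.download_if_newer(name, name, 'b')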
1156
1157 Wrong dates or times when stat'ing on a server
1158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1159
1160 Please see the previous tip.
1161
1162 I tried to upload or download a file and it's corrupt
1163 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1164
Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which converts whatever looks
like a line ending and thus corrupts binary files. Pass "b" as the
``mode`` argument (see `Uploading and downloading files`_).
1170
1171 When I use ``ftputil``, all I get is a ``ParserError`` exception
1172 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1173
The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_
1180
1181 I don't find an answer to my problem in this document
1182 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1183
1184 Please send an email with your problem report or question to the
1185 `ftputil mailing list`_, and we'll see what we can do for you. :-)
1186
1187
1188 Bugs and limitations
1189 --------------------
1190
1191 - ``ftputil`` needs at least Python 2.3 to work.
1192
- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.
1198
1199 - Timeouts of individual child sessions currently are not handled.
1200   This is only a problem if your ``FTPHost`` object or the generated
1201   ``FTPFile`` objects are inactive for about ten minutes or longer.
1202
1203 - Until now, I haven't paid attention to thread safety. In principle,
1204   at least, different ``FTPFile`` objects should be usable in different
1205   threads.
1206
1207 - ``FTPFile`` objects in text mode *may not* support charsets with more
1208   than one byte per character. Please email me your experiences
1209   (address above), if you work with multibyte text streams in FTP
1210   sessions.
1211
1212 - Currently, it is not possible to continue an interrupted upload or
1213   download. Contact me if you have problems with that.
1214
1215 - There's exactly one cache for lstat results for each ``FTPHost``
1216   object, i. e. there's no sharing of cache results determined by
1217   several ``FTPHost`` objects.
1218
1219
1220 Files
1221 -----
1222
If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.
1226
1227 .. _`reStructuredText`: http://docutils.sourceforge.net/rst.html
1228
1229 The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
1230 If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
1231 delete these files.
1232
1233
1234 References
1235 ----------
1236
1237 - Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
1238   Unit Testing with Mock Objects`_.
1239
1240 - Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.
1241
1242 - Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.
1243
1244 .. _`Endo-Testing: Unit Testing with Mock Objects`:
1245    http://www.connextra.com/aboutUs/mockobjects.pdf
1246 .. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
1247 .. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html