``ftputil`` - a high-level FTP client library
=============================================

:Version:   2.2
:Date:      2006-12-24
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')         # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698


``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.).

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode)


Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated.

The exceptions are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This is for similarity between the
  os module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSErrors``. If you
  change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can call the function also as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``PermanentError``

  is raised for 5xx return codes from the FTP server. This
  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
  ``ftplib.error_perm`` are *not* identical).

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. (However, the error
  codes aren't the same.)

- ``InternalError``

  subsumes exception classes for signaling errors due to limitations
  of the FTP protocol or the concrete implementation of ``ftputil``.

- ``InaccessibleLoginDirError``

  This exception is only raised if *both* of the following conditions
  are met:

  - The directory in which "you" are placed upon login is not
    accessible, i. e. a ``chdir`` call fails.

  - You try to access a path which contains whitespace.

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``RootDirError``

  is raised when ``stat`` or ``lstat`` is called on the root directory
  ``/``. Because of the implementation of the ``lstat`` method, such
  a call is not possible. If you know *any* way to do it, please let
  me know. :-)

  This problem does *not* affect stat calls on items *in* the root
  directory.

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a
  full hour.


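The ``os=host`` pattern described under ``FTPOSError`` above can be
tried without a server. The following sketch is not part of
``ftputil``; the function name ``list_regular_files`` is made up for
illustration. It only uses names which, according to this document,
exist both in the ``os`` module and on ``FTPHost`` objects
(``listdir``, ``path.join``, ``path.isfile``), and it catches
``OSError``, which also covers ``FTPOSError``::

    import os
    import tempfile

    def list_regular_files(directory, os=os):
        """Return the sorted names of the regular files in `directory`.

        `os` may be the `os` module or any object offering the same
        names, for example an `FTPHost` instance.
        """
        try:
            names = os.listdir(directory)
        except OSError:
            # For an `FTPHost`, this also catches `FTPOSError`.
            return []
        return sorted(name for name in names
                      if os.path.isfile(os.path.join(directory, name)))

    # Local demonstration with a temporary directory.
    demo_dir = tempfile.mkdtemp()
    open(os.path.join(demo_dir, "a.txt"), "w").close()
    os.mkdir(os.path.join(demo_dir, "subdir"))
    print(list_regular_files(demo_dir))
    print(list_regular_files(os.path.join(demo_dir, "missing")))

With a connected ``FTPHost`` object ``host``, the same function would
presumably be called as ``list_regular_files(path, os=host)``.
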
``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

``FTPHost`` instances may be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module. The keyword argument
``session_factory`` may be used to generate FTP connections with other
factories than the default ``ftplib.FTP``. For example, the M2Crypto
distribution uses a secure FTP class which is derived from
``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new background
session (which happens for every remote file that is opened; see
below).

This functionality of the constructor also allows you to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one. ``ftplib.FTP`` offers this only by means of its
``connect`` method, not via its constructor. The solution is to
provide a wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # pass the class itself as the factory, not an instance (`MySession()`)
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

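How the constructor forwards its arguments can be pictured with a
small model. This is a simplified sketch, not the actual ``FTPHost``
code; ``HostModel``, ``RecordingSession``, ``new_session`` and the
host name ``ftp.example.com`` are hypothetical. The model stores all
arguments except ``session_factory`` and passes them to the factory
each time a new session is needed::

    import ftplib

    class HostModel(object):
        """Hypothetical stand-in for the argument handling of `FTPHost`."""

        def __init__(self, *args, **kwargs):
            # Take out the factory; everything else is for the factory.
            self._session_factory = kwargs.pop("session_factory", ftplib.FTP)
            self._args = args
            self._kwargs = kwargs

        def new_session(self):
            """Create a session, e. g. for a newly opened remote file."""
            return self._session_factory(*self._args, **self._kwargs)

    class RecordingSession(object):
        """Fake session which only remembers how it was constructed."""

        def __init__(self, *args, **kwargs):
            self.args = args
            self.kwargs = kwargs

    model = HostModel("ftp.example.com", "user", "password", port=50001,
                      session_factory=RecordingSession)
    session = model.new_session()
    print(session.args)
    print(session.kwargs)

The fake session receives the three positional arguments and the
``port`` keyword argument, but not ``session_factory`` itself.
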
On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug`_.

.. _`file a bug`: http://ftputil.sschwarzer.net/issuetrackernotes

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  are strings. ``curdir`` and ``pardir`` denote the current and the
  parent directory on the remote server; ``sep`` is the path
  separator. Though `RFC 959`_ (File Transfer Protocol) notes that
  these values may depend on the FTP server implementation, the Unix
  counterparts seem to work well in practice, even for non-Unix
  servers.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

Remote file system navigation
`````````````````````````````

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).
  The mode may be "" or "a" for ASCII uploads or "b" for binary
  uploads. ASCII mode is the default (again, similar to regular local
  file objects).

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  ``source`` and ``target`` are strings. The description of the
  ``upload`` method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the ``upload`` method. The only difference is that the
  upload is only invoked if the time of the last modification for
  the source file is more recent than that of the target file, or
  the target doesn't exist at all. If an upload actually happened,
  the return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.

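The decision rule described for ``upload_if_newer`` and
``download_if_newer`` can be written down in a few lines. The function
below is an illustration, not code from ``ftputil``;
``source_is_newer`` is a made-up name, and a missing target is
represented by ``None``. It also shows why an interrupted transfer
isn't repeated: the incomplete target carries the newer timestamp::

    def source_is_newer(source_mtime, target_mtime):
        """Return true if a conditional transfer should take place.

        Both arguments are modification times in seconds since the
        epoch; `target_mtime` is `None` if the target doesn't exist.
        """
        if target_mtime is None:
            return True
        return source_mtime > target_mtime

    # Target missing: transfer.
    print(source_is_newer(1000.0, None))
    # Source modified after the target was written: transfer.
    print(source_is_newer(2000.0, 1500.0))
    # Incomplete target left by an interrupted transfer: it is newer
    # than the source, so the transfer isn't repeated.
    print(source_is_newer(2000.0, 2100.0))

The comparison is only meaningful if both timestamps refer to the same
clock, which is what the `time shift`_ setting is for.
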
.. _`time shift`:

Time zone correction
````````````````````

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value (measured in seconds). The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ should work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can* be negative.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or its absolute (unsigned) value larger than 24 hours, a
  ``TimeShiftError`` is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently set time shift value. See ``set_time_shift``
  (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server are in different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.

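The rules for valid time shift values can be summarized in code. The
sketch below follows the description above (multiple of a full hour,
absolute value of at most 24 hours) but is not ``ftputil``'s
implementation; the function names are invented, and a plain
``ValueError`` stands in for ``TimeShiftError``. The rounding step is
an assumption about how a measured difference might be turned into a
valid value::

    def check_time_shift(time_shift):
        """Raise `ValueError` if `time_shift` (in seconds) is invalid."""
        if time_shift % 3600 != 0:
            raise ValueError("time shift is not a multiple of a full hour")
        if abs(time_shift) > 24 * 3600:
            raise ValueError("absolute time shift is larger than 24 hours")

    def rounded_time_shift(server_time, client_time):
        """Return `server_time - client_time`, rounded to full hours."""
        difference = server_time - client_time
        return int(round(difference / 3600.0)) * 3600

    # The server clock is about two hours behind the client clock.
    time_shift = rounded_time_shift(server_time=992803.5,
                                    client_time=1000000.0)
    check_time_shift(time_shift)
    print(time_shift)

Here the measured difference of -7196.5 seconds is rounded to -7200
seconds, i. e. a time shift of minus two hours.
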
Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. This doesn't construct
  "intermediate" directories which don't already exist. The ``mode``
  parameter is ignored; this is for compatibility with ``os.mkdir`` if
  an ``FTPHost`` object is passed into a function instead of the
  ``os`` module (see the description of ``FTPOSError`` in the section
  `Exception hierarchy`_ for an explanation).

- ``makedirs(path, [mode])``

  works similarly to ``mkdir`` (see above), but also makes
  intermediate directories, like ``os.makedirs``. The ``mode``
  parameter is only there for compatibility with ``os.makedirs`` and
  is ignored.

- ``rmdir(path)``

  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)``

  removes the given remote, possibly non-empty, directory tree.
  The interface of this method is rather complex, in favor of
  compatibility with ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored.
  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
  set, all exceptions occurring during the tree iteration and
  processing are raised. These exceptions are all of type
  ``PermanentError``.

  To distinguish between error situations, pass in a callable
  for ``onerror``. This callable must accept three arguments:
  ``func``, ``path`` and ``exc_info``. ``func`` is a bound method
  object, *for example* ``your_host_object.listdir``. ``path`` is
  the path that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module
  and adapted for ``ftputil``.

  **Note: I find this interface rather complicated and would like to
  simplify it without making error handling too difficult. Possible
  changes to ``rmtree`` will depend on the discussion between the
  versions 2.1b and 2.1.**

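Since ``rmtree`` follows the interface of ``shutil.rmtree``, an
``onerror`` callable can be tried out locally. The example below uses
``shutil.rmtree`` itself on a path that doesn't exist;
``collect_error`` and ``errors`` are illustrative names. The
assumption is that the same callable could be passed to
``FTPHost.rmtree``, where ``func`` would be a bound method of the
host object. (Recent Python versions prefer the newer ``onexc``
parameter of ``shutil.rmtree``; ``onerror`` is used here because it
matches the interface documented above.)

::

    import os
    import shutil
    import tempfile

    errors = []

    def collect_error(func, path, exc_info):
        """Record the failed call instead of raising an exception."""
        exception_type, exception, traceback = exc_info
        errors.append((func.__name__, path, exception_type))

    # A path that doesn't exist, so the first file system call fails.
    missing = os.path.join(tempfile.mkdtemp(), "missing")
    shutil.rmtree(missing, onerror=collect_error)

    for function_name, path, exception_type in errors:
        print(function_name + " failed for " + path)

Because the callable only records the problem, ``rmtree`` returns
normally and the caller can inspect ``errors`` afterwards.
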
Removing files and links
````````````````````````

- ``remove(path)``

  removes a file or link on the remote host (similar to ``os.remove``).

- ``unlink(path)``

  is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path; similar to ``os.listdir``. The special names
  ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and others) rely on the directory
listing format used by the FTP server. When connecting to a host,
``FTPHost``'s constructor tries to guess the right format, which
mostly succeeds. However, if you get strange results or
``ParserError`` exceptions by a mere ``lstat`` call, please `file a
bug`_.

If ``lstat`` or ``stat`` yield wrong modification dates or times, look
at the methods that deal with time zone differences (`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat`` (a "tuple" with
  additional attributes; see the documentation of the ``os`` module for
  details). However, due to the nature of the application, there are
  some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` cannot
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but may not be for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Items that can't be determined at all are set to ``None``.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat`` and I know of no approach which
    mends this problem.

..

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. You may also want to `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered, a
  ``PermanentError`` is raised.

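What a "tuple" with additional attributes means can be seen with a
local ``os.lstat`` call. The snippet below only uses the standard
library; according to the description above, a result from
``FTPHost.lstat`` should allow the same two kinds of access, with the
restriction that values the server doesn't supply are ``None``::

    import os
    import stat
    import tempfile

    # Create a small local file to look at.
    file_descriptor, path = tempfile.mkstemp()
    os.write(file_descriptor, b"12345")
    os.close(file_descriptor)

    stat_result = os.lstat(path)
    # Access by attribute name ...
    print(stat_result.st_size)
    # ... or by tuple index; `stat.ST_SIZE` is the index of the size.
    print(stat_result[stat.ST_SIZE])
    os.remove(path)

Both ``print`` calls show the file size of 5 bytes.
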
.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program very slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.

For this reason, ``ftputil`` by default saves (caches) the results
from directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can - if necessary at all - be controlled via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example,
after calling

::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons for modifying cache parameters. The first is when the
number of possible entries is too low. You may notice that when you
are processing very large directories (e. g. above 1000 directories or
files) and the program becomes much slower than before. It's common
for code to read a directory with ``listdir`` and then process the
found directories and files. For this application, it's a good rule of
thumb to set the cache size to somewhat more than the number of
directory entries fetched with ``listdir``. This is done by the
``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximal number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.)

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.

Caching is so effective because it reduces network accesses. This can
also be a disadvantage if the file system data on the remote server
changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. A different matter are changes unknown to the ``FTPHost``
object which reads its cache. The obvious example are changes by
programs running on the remote host. However, cache inconsistencies
can also occur if two ``FTPHost`` objects change a file system
simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        #  knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. After some thinking, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write accesses to a server, you have to handle them explicitly.

The most useful tool for this probably is the ``invalidate`` method.
In the example above, it could be used as::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        #  knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.

By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days, a stat call may still access cache data that old. To avoid this,
you can set the ``max_age`` attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. This
means any older entry won't be retrieved from the cache; instead, its
data is fetched again from the remote host (and then stored again for
up to an hour). To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

If you are certain that the cache is in the way, you can disable and
later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get
updated with newer data during this period. Note that even when the
cache is disabled, the file system data in the code can become
inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!

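The cache behavior described in this section (a size limit with
removal of the least recently used entries, an optional maximum age,
explicit invalidation, and disabling) can be modeled in a few lines.
The class below is a toy model for illustration and not ``ftputil``'s
cache implementation; only the names ``resize``, ``invalidate``,
``enable``, ``disable`` and ``max_age`` are taken from the text above,
everything else (including ``store`` and ``lookup``) is an
assumption::

    import time
    from collections import OrderedDict

    class StatCacheModel(object):
        """Toy model of a stat cache; the real cache may work differently."""

        def __init__(self, size=1000, max_age=None):
            self._entries = OrderedDict()  # path -> (timestamp, result)
            self._size = size
            self.max_age = max_age         # seconds; `None`: never expire
            self._enabled = True

        def enable(self):
            self._enabled = True

        def disable(self):
            self._enabled = False

        def resize(self, new_size):
            self._size = new_size
            self._shrink()

        def invalidate(self, path):
            self._entries.pop(path, None)

        def _shrink(self):
            # Drop the entries which haven't been used for the longest time.
            while len(self._entries) > self._size:
                self._entries.popitem(last=False)

        def store(self, path, stat_result):
            if not self._enabled:
                return
            self._entries.pop(path, None)
            self._entries[path] = (time.time(), stat_result)
            self._shrink()

        def lookup(self, path):
            """Return the cached result or `None` if a fetch is needed."""
            if not self._enabled or path not in self._entries:
                return None
            timestamp, stat_result = self._entries.pop(path)
            too_old = (self.max_age is not None and
                       time.time() - timestamp > self.max_age)
            if too_old:
                return None  # the expired entry has been removed
            # Reinsert at the end to mark the entry as recently used.
            self._entries[path] = (timestamp, stat_result)
            return stat_result

    cache = StatCacheModel(size=2)
    cache.store("/dir/a", "stat of a")
    cache.store("/dir/b", "stat of b")
    cache.lookup("/dir/a")              # "/dir/a" is now the most recent
    cache.store("/dir/c", "stat of c")  # evicts "/dir/b"
    print(cache.lookup("/dir/b"))
    cache.invalidate("/dir/a")
    print(cache.lookup("/dir/a"))
    print(cache.lookup("/dir/c"))

The first two lookups print ``None`` (evicted and invalidated,
respectively), the third prints the stored value for ``/dir/c``.
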
Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)``

  iterates over a directory tree, similar to `os.walk`_ in Python 2.3
  and above. Actually, ``FTPHost.walk`` uses the code from Python with
  just the necessary modifications, so see the linked documentation.

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)``

  Similar to ``os.path.walk``, the ``walk`` method in
  `FTPHost.path`_ can be used.

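Because ``FTPHost.walk`` is modeled on ``os.walk``, code written
against ``os.walk`` can be prepared and tested locally. The helper
below is a sketch with an invented name, ``find_files``; it takes the
object providing ``walk`` and ``path.join`` as its first argument, so
that (assuming the interface described above) an ``FTPHost`` instance
could be passed instead of the ``os`` module::

    import os
    import tempfile

    def find_files(filesystem, top, suffix):
        """Return sorted paths below `top` whose names end with `suffix`."""
        found = []
        for dirpath, dirnames, filenames in filesystem.walk(top):
            for name in filenames:
                if name.endswith(suffix):
                    found.append(filesystem.path.join(dirpath, name))
        return sorted(found)

    # Local demonstration; for a remote host, pass an `FTPHost` object
    # instead of the `os` module.
    top = tempfile.mkdtemp()
    os.mkdir(os.path.join(top, "sub"))
    for relative_path in ("index.html", "notes.txt", "sub/page.html"):
        open(os.path.join(top, *relative_path.split("/")), "w").close()

    for path in find_files(os, top, ".html"):
        print(path[len(top):])

The loop prints the two HTML files relative to the temporary
directory and skips ``notes.txt``.
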
Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object ``source`` to the
  file-like object ``target``. The only difference to
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers. Possibly
  you are lucky and someone has already written a parser you can use.
  Please ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_


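The role of the ``length`` argument of ``copyfileobj`` can be
illustrated with in-memory file-like objects. The function below is a
simplified re-implementation for demonstration, not the code used by
``ftputil`` or ``shutil``; ``copy_in_chunks`` is a made-up name. It
shows that the data is moved in chunks of at most ``length`` bytes, so
a large file never has to fit into memory at once::

    import io

    def copy_in_chunks(source, target, length=64*1024):
        """Copy from `source` to `target`; return the number of chunks."""
        chunks = 0
        while True:
            data = source.read(length)
            if not data:
                break
            target.write(data)
            chunks += 1
        return chunks

    source = io.BytesIO(b"x" * 150000)
    target = io.BytesIO()
    print(copy_in_chunks(source, target))
    print(len(target.getvalue()))

With the default buffer size of 64 KiB, the 150000 bytes are copied in
three chunks.
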
816 File-like objects
817 -----------------
818
819 Construction
820 ~~~~~~~~~~~~
821
822 ``FTPFile`` objects are returned by a call to ``FTPHost.file`` (or
823 ``FTPHost.open``).
824
825 - ``FTPHost.file(path, mode='r')``
826
827   returns a file-like object that refers to the path on the remote
828   host. This path may be absolute or relative to the current directory
829   on the remote host (this directory can be determined with the getcwd
830   method). As with local file objects the default mode is "r", i. e.
831   reading text files. Valid modes are "r", "rb", "w", and "wb".
832
833 - ``FTPHost.open(path, mode='r')``
834
835   is an alias for ``file`` (see above).
836
837 Attributes and methods
838 ~~~~~~~~~~~~~~~~~~~~~~
839
840 The methods
841
842 ::
843
844     close()
845     read([count])
846     readline([count])
847     readlines()
848     write(data)
849     writelines(string_sequence)
850     xreadlines()
851
852 and the attribute ``closed`` have the same semantics as for file
853 objects of a local disk file system. The iterator protocol is also
854 supported, i. e. you can use a loop to read a file line by line::
855
856     host = ftputil.FTPHost(...)
857     input_file = host.file("some_file")
858     for line in input_file:
859         # do something with the line, e. g.
860         print line.strip().replace("ftplib", "ftputil")
861     input_file.close()
862
863 For more on file objects, see the section `File objects`_ in the
864 Library Reference.
865
866 .. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html
867
868 Note that ``ftputil`` supports both binary mode and text mode with the
869 appropriate line ending conversions.
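
As a rough illustration of what "line ending conversions" means, the
sketch below assumes that text transferred over FTP uses CR/LF line
endings while the program works with plain newline characters. The two
helper names are hypothetical and not part of ``ftputil``; the
library's own conversion may differ in detail.

```python
def to_local_text(data):
    """Turn CR/LF line endings from the server into plain newlines."""
    return data.replace("\r\n", "\n")


def to_remote_text(data):
    """Turn plain newlines into CR/LF line endings for the server."""
    # normalize first so that existing CR/LF pairs aren't doubled
    return data.replace("\r\n", "\n").replace("\n", "\r\n")


received = to_local_text("first line\r\nsecond line\r\n")
sent = to_remote_text("first line\nsecond line\n")
```

In binary mode ("rb", "wb") no such conversion would take place; the
data is passed through unchanged.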
870
871
872 Writing directory parsers
873 -------------------------
874
``ftputil`` recognizes the two most widely used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format different from these two, you can plug in your
own parser with a single method call.
880
881 For this, you need to write a parser class by inheriting from the
882 class ``Parser`` in the ``ftp_stat`` module::
883
884     from ftputil import ftp_error
885     from ftputil import ftp_stat
886
887     class XyzParser(ftp_stat.Parser):
888         """
889         Parse the default format of the FTP server of the XYZ
890         corporation.
891         """
892         def parse_line(self, line, time_shift=0.0):
893             """
894             Parse a `line` from the directory listing and return a
895             corresponding `StatResult` object. If the line can't
896             be parsed, raise `ftp_error.ParserError`.
897
898             The `time_shift` argument can be used to fine-tune the
899             parsing of dates and times. See the class
900             `ftp_stat.UnixParser` for an example.
901             """
902             # split the `line` argument and examine it further; if
903             #  something goes wrong, raise an `ftp_error.ParserError`
904             ...
905             # make a `StatResult` object from the parts above
906             stat_result = ftp_stat.StatResult(...)
907             # `_st_name` and `_st_target` are optional
908             stat_result._st_name = ...
909             stat_result._st_target = ...
910             return stat_result
911
912         # define `ignores_line` only if the default in the base class
913         #  doesn't do enough!
914         def ignores_line(self, line):
915             """
916             Return a true value if the line should be ignored. For
917             example, the implementation in the base class handles
918             lines like "total 17". On the other hand, if the line
919             should be used for stat'ing, return a false value.
920             """
921             is_total_line = super(XyzParser, self).ignores_line(line)
922             my_test = ...
923             return is_total_line or my_test
924
925 A ``StatResult`` object is similar to the value returned by
926 `os.stat`_ and is usually built with statements like
927
928 ::
929
930     stat_result = StatResult(
931                   (st_mode, st_ino, st_dev, st_nlink, st_uid,
932                    st_gid, st_size, st_atime, st_mtime, st_ctime) )
933     stat_result._st_name = ...
934     stat_result._st_target = ...
935
936 with the arguments of the ``StatResult`` constructor described in
937 the following table.
938
939 ===== ========== ============ =============== =======================
940 Index Attribute  os.stat type StatResult type Notes
941 ===== ========== ============ =============== =======================
942 0     st_mode    int          int
943 1     st_ino     long         long
944 2     st_dev     long         long
945 3     st_nlink   int          int
946 4     st_uid     int          str             usually only available as string
947 5     st_gid     int          str             usually only available as string
948 6     st_size    long         long
949 7     st_atime   int/float    float
950 8     st_mtime   int/float    float
951 9     st_ctime   int/float    float
952 \-    _st_name   \-           str             file name without directory part
953 \-    _st_target \-           str             link target
954 ===== ========== ============ =============== =======================
955
956 If you can't extract all the desirable data from a line (for
957 example, the MS format doesn't contain any information about the
958 owner of a file), set the corresponding values in the ``StatResult``
959 instance to ``None``.
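
To make the table and the ``None`` convention more concrete, here is a
self-contained sketch that turns an MS-style directory line into the
ten values in the order listed above. It deliberately doesn't import
``ftputil``: the function name ``parse_ms_style_line`` is hypothetical,
a plain ``ValueError`` stands in for ``ftp_error.ParserError``, and the
two-digit year handling is an assumption of this example. In a real
parser the tuple would be passed to ``ftp_stat.StatResult`` and the
name stored in ``_st_name``.

```python
import stat
import time


def parse_ms_style_line(line):
    """
    Parse a line like
    "10-23-01  03:25PM       <DIR>          WindowsXP"
    and return a pair `(stat_tuple, name)`. Values which the line
    doesn't provide are `None`.
    """
    try:
        date, clock, dir_or_size, name = line.split(None, 3)
        month, day, year = [int(part) for part in date.split("-")]
        hour, minute = int(clock[0:2]), int(clock[3:5])
        am_pm = clock[5:].upper()
        if am_pm not in ("AM", "PM"):
            raise ValueError("missing AM/PM marker")
        if dir_or_size == "<DIR>":
            st_mode, st_size = stat.S_IFDIR, None
        else:
            st_mode, st_size = stat.S_IFREG, int(dir_or_size)
    except ValueError:
        raise ValueError("can't parse line %r" % line)
    # convert the 12-hour clock to a 24-hour clock
    hour = hour % 12
    if am_pm == "PM":
        hour += 12
    # assumption: two-digit years 70-99 mean 19xx, 00-69 mean 20xx
    if year >= 70:
        year += 1900
    else:
        year += 2000
    st_mtime = time.mktime((year, month, day, hour, minute, 0, 0, 0, -1))
    # order: st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid,
    #  st_size, st_atime, st_mtime, st_ctime
    stat_tuple = (st_mode, None, None, None, None, None,
                  st_size, None, st_mtime, None)
    return stat_tuple, name


dir_tuple, dir_name = parse_ms_style_line(
    "10-23-01  03:25PM       <DIR>          WindowsXP")
file_tuple, file_name = parse_ms_style_line(
    "07-17-00  02:08PM             12266720 test.exe")
```

Owner and group (indices 4 and 5) stay ``None`` here because the MS
format carries no ownership information.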
960
961 Parser classes can use several helper methods which are defined in
962 the class ``Parser``:
963
964 - ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
965   an appropriate ``st_mode`` value.
966
967 - ``parse_unix_time`` returns a float number usable for the
968   ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
969   "May"/"26"/"2005". Note that the method expects the timestamp string
970   already split at whitespace.
971
- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float like the one returned by ``time.mktime``. Note that
  the method expects the timestamp string already split at whitespace.
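
The following sketch shows the kind of conversion a helper such as
``parse_unix_mode`` has to perform. It is only an illustration built on
the standard ``stat`` module, not the code from ``ftp_stat``; the
function name is hypothetical, and special bits (setuid, setgid,
sticky) are rejected rather than handled.

```python
import stat

# file type characters understood by this sketch
FILE_TYPES = {"d": stat.S_IFDIR, "l": stat.S_IFLNK, "-": stat.S_IFREG}

# permission bits in the order they appear in the mode string
PERMISSION_BITS = (
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
)


def mode_string_to_st_mode(mode_string):
    """Return an `st_mode` value for a string like "drwxr-xr-x"."""
    if len(mode_string) != 10 or mode_string[0] not in FILE_TYPES:
        raise ValueError("invalid mode string %r" % mode_string)
    st_mode = FILE_TYPES[mode_string[0]]
    for index, char in enumerate(mode_string[1:]):
        expected = "rwx"[index % 3]
        if char == expected:
            st_mode |= PERMISSION_BITS[index]
        elif char != "-":
            raise ValueError("invalid mode string %r" % mode_string)
    return st_mode


directory_mode = mode_string_to_st_mode("drwxr-xr-x")
file_mode = mode_string_to_st_mode("-rw-r--r--")
```

The result can be examined with the usual ``stat`` helpers, for
example ``stat.S_ISDIR`` or ``stat.S_IMODE``.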
975
976 Additionally, there's an attribute ``_month_numbers`` which maps
977 three-letter month abbreviations to integers.
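
A month mapping of this kind is what makes Unix-style timestamps
convertible. The sketch below is an illustration only: the dictionary
``MONTH_NUMBERS`` and the function are hypothetical stand-ins (the
exact keys of ``_month_numbers`` aren't shown in this document), and
the caller has to supply the year to assume when the listing contains a
time of day instead of a year.

```python
import time

# hypothetical stand-in for a month abbreviation mapping
MONTH_NUMBERS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


def unix_time_parts_to_float(month_abbreviation, day, year_or_time,
                             assumed_year):
    """
    Convert parts like "Nov"/"23"/"02:33" or "May"/"26"/"2005" to a
    float as returned by `time.mktime`. If the third part is a time
    of day, use `assumed_year` as the year.
    """
    month = MONTH_NUMBERS[month_abbreviation.lower()]
    day = int(day)
    if ":" in year_or_time:
        hour, minute = [int(part) for part in year_or_time.split(":")]
        year = assumed_year
    else:
        year, hour, minute = int(year_or_time), 0, 0
    return time.mktime((year, month, day, hour, minute, 0, 0, 0, -1))


with_year = unix_time_parts_to_float("May", "26", "2005", 2006)
with_time = unix_time_parts_to_float("Nov", "23", "02:33", 2006)
```

A real parser also has to decide which year a time-only entry belongs
to; this sketch leaves that decision to the caller.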
978
979 For more details, see the two "standard" parsers ``UnixParser`` and
980 ``MSParser`` in the module ``ftp_stat.py``.
981
982 To actually *use* the parser, call the method `set_parser`_ of the
983 ``FTPHost`` instance.
984
985 If you can't write a parser or don't want to, please ask on the
986 `ftputil mailing list`_. Possibly someone has already written a parser
987 for your server or can help to do it.
988
989
990 FAQ / Tips and tricks
991 ---------------------
992
993 Where can I get the latest version?
994 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
995
996 See the `download page`_. Announcements will be sent to the `mailing
997 list`_. Announcements on major updates will also be posted to the
998 newsgroup `comp.lang.python`_ .
999
1000 .. _`download page`: http://ftputil.sschwarzer.net/download
1001 .. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
1002 .. _`comp.lang.python`: news:comp.lang.python
1003
1004 Is there a mailing list on ``ftputil``?
1005 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1006
1007 Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
1008 subscribe or read the archives.
1009
1010 I found a bug! What now?
1011 ~~~~~~~~~~~~~~~~~~~~~~~~
1012
Before reporting a bug, make sure that you have already tried the `latest
version`_ of ``ftputil``. The bug might already have been fixed there.
1015
1016 .. _`latest version`: http://ftputil.sschwarzer.net/download
1017
1018 Please see http://ftputil.sschwarzer.net/issuetrackernotes for
1019 guidelines on entering a bug in ``ftputil``'s ticket system. If you
1020 are unsure if the behaviour you found is a bug or not, you can write
1021 to the `ftputil mailing list`_. In *either* case you *must not*
1022 include confidential information (user id, password, file names, etc.)
1023 in the problem report! Be careful!
1024
1025 Does ``ftputil`` support SSL?
1026 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1027
1028 ``ftputil`` has no *built-in* SSL support. On the other hand,
1029 you can use M2Crypto_ (in the source code archive, look for the
1030 file ``M2Crypto/ftpslib.py``) which has a class derived from
1031 ``ftplib.FTP`` that supports SSL. You then can use a class
1032 (not an object of it) similar to the following as a "session
1033 factory" in ``ftputil.FTPHost``'s constructor::
1034
    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # `port` isn't a parameter of this constructor, so
            #  connect on the default FTP port
            self.connect(host)
            self.login(userid, password)
            # do anything necessary to set up the SSL connection

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual
1054
1055 .. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads
1056
1057 Connecting on another port
1058 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1059
1060 By default, an instantiated ``FTPHost`` object connects on the usual
1061 FTP ports. If you have to use a different port, refer to the
1062 section `FTPHost construction`_.
1063
1064 You can use the same approach to connect in active or passive mode, as
1065 you like.
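
As a minimal sketch of this approach, a session class can wrap
``ftplib.FTP`` and connect on a fixed port. The names ``EXAMPLE_PORT``
and ``PortFTPSession`` and the port number are made up for this
illustration; the section `FTPHost construction`_ remains the
authoritative description.

```python
import ftplib

# hypothetical port number, replace it with the one your server uses
EXAMPLE_PORT = 50001


class PortFTPSession(ftplib.FTP):
    def __init__(self, host, userid, password):
        """
        Act like ftplib.FTP's constructor but connect on the port
        `EXAMPLE_PORT` instead of the default port.
        """
        ftplib.FTP.__init__(self)
        self.connect(host, EXAMPLE_PORT)
        self.login(userid, password)

# The class (not an instance) would then be passed to `FTPHost`:
#   host = ftputil.FTPHost(host, userid, password,
#                          session_factory=PortFTPSession)
```

Defining the class doesn't open a connection; that only happens once
the class is instantiated with host, user id and password.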
1066
1067 Using active or passive connections
1068 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1069
1070 Use a wrapper class for ``ftplib.FTP``, as described in section
1071 `FTPHost construction`_::
1072
    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # `port` isn't a parameter of this constructor, so
            #  connect on the default FTP port
            self.connect(host)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)
1086
1087 Use this class as the ``session_factory`` argument in ``FTPHost``'s
1088 constructor.
1089
1090 Conditional upload/download to/from a server in a different time zone
1091 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1092
1093 You may find that ``ftputil`` uploads or downloads files
1094 unnecessarily, or not when it should. This can happen when the FTP
1095 server is in a different time zone than the client on which
1096 ``ftputil`` runs. Please see the section on setting the
1097 `time shift`_. It may even be sufficient to call `synchronize_times`_.
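
Why a wrong time shift leads to unnecessary or missing transfers can
be seen from a small calculation. The sketch assumes that the time
shift is the server time minus the client time, in seconds (please
check the section on the `time shift`_ for the authoritative
definition); the function names are hypothetical and not part of
``ftputil``.

```python
def server_to_client_time(server_mtime, time_shift):
    """Express a modification time reported by the server in client time."""
    return server_mtime - time_shift


def remote_is_newer(server_mtime, local_mtime, time_shift):
    """
    Return a true value if the remote file was modified after the
    local file, taking the time shift into account.
    """
    return server_to_client_time(server_mtime, time_shift) > local_mtime


# The server clock is two hours ahead of the client clock.
TWO_HOURS = 2 * 60 * 60
# The remote file was written at client time 10000, which the server
#  reports as 10000 + TWO_HOURS; the local copy was written one
#  minute later.
server_mtime = 10000 + TWO_HOURS
local_mtime = 10060

ignoring_shift = remote_is_newer(server_mtime, local_mtime, 0)
with_shift = remote_is_newer(server_mtime, local_mtime, TWO_HOURS)
```

Without the correction the remote file looks two hours newer than it
is, so a conditional download would fetch it again although the local
copy is the more recent one.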
1098
1099 Wrong dates or times when stat'ing on a server
1100 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1101
1102 Please see the previous tip.
1103
1104 When I use ``ftputil``, all I get is a ``ParserError`` exception
1105 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1106
The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_ for
help.

.. _`plug in your own parser`: `Writing directory parsers`_
1113
1114 I don't find an answer to my problem in this document
1115 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1116
1117 Please send an email with your problem report or question to the
1118 `ftputil mailing list`_, and we'll see what we can do for you. :-)
1119
1120
1121 Bugs and limitations
1122 --------------------
1123
1124 - ``ftputil`` needs at least Python 2.3 to work.
1125
- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.
1131
1132 - Timeouts of individual child sessions currently are not handled.
1133   This is only a problem if your ``FTPHost`` object or the generated
1134   ``FTPFile`` objects are inactive for about ten minutes or longer.
1135
1136 - Until now, I haven't paid attention to thread safety. In principle,
1137   at least, different ``FTPFile`` objects should be usable in different
1138   threads.
1139
1140 - ``FTPFile`` objects in text mode *may not* support charsets with more
1141   than one byte per character. Please email me your experiences
1142   (address above), if you work with multibyte text streams in FTP
1143   sessions.
1144
1145 - Currently, it is not possible to continue an interrupted upload or
1146   download. Contact me if you have problems with that.
1147
1148 - There's exactly one cache for lstat results for each ``FTPHost``
1149   object, i. e. there's no sharing of cache results determined by
1150   several ``FTPHost`` objects.
1151
1152
1153 Files
1154 -----
1155
Unless overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.
1159
1160 .. _`reStructuredText`: http://docutils.sourceforge.net/rst.html
1161
1162 The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
1163 If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
1164 delete these files.
1165
1166
1167 References
1168 ----------
1169
1170 - Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
1171   Unit Testing with Mock Objects`_.
1172
1173 - Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.
1174
1175 - Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.
1176
1177 .. _`Endo-Testing: Unit Testing with Mock Objects`:
1178    http://www.connextra.com/aboutUs/mockobjects.pdf
1179 .. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
1180 .. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html
1181
1182
1183 Authors
1184 -------
1185
1186 ``ftputil`` is written by Stefan Schwarzer
1187 <sschwarzer@sschwarzer.net>, in part based on suggestions
1188 from users.
1189
1190 The ``lrucache`` module is written by Evan Prodromou
1191 <evan@bad.dynu.ca>.
1192
1193 Feedback is appreciated. :-)
1194