``ftputil`` - a high-level FTP client library
=============================================

:Version:   2.4
:Date:      2009-02-15
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')         # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698


``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.).

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode)


Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated and will no longer be
supported in ``ftputil`` version 2.5.

The exceptions are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
                CommandNotImplementedError(PermanentError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This is for similarity between the
  ``os`` module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSError``
  exceptions. If you change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the
  function as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``PermanentError``

  is raised for 5xx return codes from the FTP server. This
  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
  ``ftplib.error_perm`` are *not* identical).

- ``CommandNotImplementedError``

  indicates that an underlying command the code tries to use is not
  implemented. For an example, see the description of the
  `FTPHost.chmod`_ method.

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. (However, the error
  codes aren't the same.)

- ``InternalError``

  subsumes exception classes for signaling errors due to limitations
  of the FTP protocol or the concrete implementation of ``ftputil``.

- ``InaccessibleLoginDirError``

  This exception is only raised if *both* of the following conditions
  are met:

  - The directory in which "you" are placed upon login is not
    accessible, i. e. a ``chdir`` call fails.

  - You try to access a path which contains whitespace.

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``RootDirError``

  Because of the implementation of the ``lstat`` method it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)

  This problem does *not* affect stat calls on items *in* the root
  directory.

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a
  full hour.
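
The ``os=os`` parameter idea described for ``FTPOSError`` above can be
sketched as follows. The function name ``directory_exists`` is made up
for this illustration; the only assumption is that the object passed
as ``os`` offers ``listdir`` and raises ``OSError`` (or a subclass such
as ``FTPOSError``) for a missing directory::

    import os as local_os

    def directory_exists(directory, os=local_os):
        """Return True if `directory` can be listed, else False."""
        try:
            os.listdir(directory)
        except OSError:
            # also catches `FTPOSError`, which is derived from `OSError`
            return False
        return True

    # use the local file system by default
    print(directory_exists("nonexisting_directory"))

With a connected ``FTPHost`` instance ``host``, the same function
would be called as ``directory_exists("some_dir", os=host)``.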


``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

Basics
``````

``FTPHost`` instances may be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module.

Session factories
`````````````````

The keyword argument ``session_factory`` may be used to generate FTP
connections with other factories than the default ``ftplib.FTP``. For
example, the M2Crypto distribution uses a secure FTP class which is
derived from ``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new background
session (which happens for every remote file that is opened; see
below).

This functionality of the constructor also allows wrapping
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one. ``ftplib.FTP`` offers this only by means of its
``connect`` method, not via its constructor. The solution is to
provide a wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # don't pass a MySession() instance as the factory; pass the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug report`_.

.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    with ftputil.FTPHost(host, user, password) as host:
        print host.listdir(host.curdir)

After the ``with`` block, the ``FTPHost`` instance and the
associated FTP sessions will be closed automatically.

If something goes wrong during the ``FTPHost`` construction or in the
body of the ``with`` statement, the instance is closed as well.
Exceptions will be propagated (as with ``try ... finally``).

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  ``curdir`` and ``pardir`` are strings which denote the current and
  the parent directory on the remote server. ``sep`` identifies the
  path separator. Though `RFC 959`_ (File Transfer Protocol) notes
  that these values may depend on the FTP server implementation, the
  Unix counterparts seem to work well in practice, even for non-Unix
  servers.

Remote file system navigation
`````````````````````````````

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name target. Both source and target
  may be absolute paths or relative to their corresponding current
  directory (on the local or the remote host, respectively). The
  mode may be "" or "a" for ASCII uploads or "b" for binary uploads.
  ASCII mode is the default (again, similar to regular local file
  objects).

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  source and target are strings. The description of the ``upload``
  method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the ``upload`` method. The only difference is that the
  upload is only invoked if the time of the last modification for
  the source file is more recent than that of the target file, or
  the target doesn't exist at all. If an upload actually happened,
  the return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.
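
The second possibility after a failed upload (remove the incomplete
target, then transfer again) might be wrapped like this. This is only
a sketch: ``upload_or_clean_up`` is a hypothetical helper, not part of
``ftputil``, and it assumes that ``host`` is a connected ``FTPHost``
instance which is still usable after the failed transfer::

    def upload_or_clean_up(host, source, target, mode=''):
        """Upload `source`; on failure, remove a possibly incomplete target."""
        try:
            host.upload(source, target, mode)
        except Exception:
            # Remove the partial remote file so that a later
            # `upload_if_newer` isn't fooled by its newer timestamp.
            if host.path.exists(target):
                host.remove(target)
            # let the caller decide whether to retry
            raise

After the exception has been handled, a later call of ``upload`` or
``upload_if_newer`` starts from a clean state.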

.. _`time shift`:

Time zone correction
````````````````````

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value (measured in seconds). The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ should work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can* be negative.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or with an absolute (unsigned) value larger than 24 hours, a
  ``TimeShiftError`` is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently set time shift value. See ``set_time_shift``
  (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server are in different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.
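
If you have to set the time shift manually, the definition above can
be turned into a small helper. ``rounded_time_shift`` is a
hypothetical name, not part of ``ftputil``; rounding to the nearest
full hour is an assumption made here so that small clock deviations
don't violate the "multiple of a full hour" rule::

    def rounded_time_shift(server_time, client_time):
        """Return `server_time - client_time`, rounded to full hours.

        Both arguments are timestamps in seconds.
        """
        difference = server_time - client_time
        hours = int(round(difference / 3600.0))
        time_shift = hours * 3600
        if abs(time_shift) > 24 * 3600:
            raise ValueError("time shift of %d seconds is larger than 24 hours"
                             % time_shift)
        return time_shift

The result would then be passed on with
``host.set_time_shift(rounded_time_shift(server_time, client_time))``.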

Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. This doesn't construct
  "intermediate" directories which don't already exist. The ``mode``
  parameter is ignored; this is for compatibility with ``os.mkdir`` if
  an ``FTPHost`` object is passed into a function instead of the
  ``os`` module (see the description of ``FTPOSError`` in the section
  on the exception hierarchy above for an explanation).

- ``makedirs(path, [mode])``

  works similarly to ``mkdir`` (see above), but also makes
  intermediate directories, like ``os.makedirs``. The ``mode``
  parameter is only there for compatibility with ``os.makedirs`` and
  is ignored.

- ``rmdir(path)``

  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)``

  removes the given remote, possibly non-empty, directory tree.
  The interface of this method is rather complex, in favor of
  compatibility with ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored.
  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
  set, all exceptions occurring during the tree iteration and
  processing are raised. These exceptions are all of type
  ``PermanentError``.

  To distinguish between error situations, pass in a callable for
  ``onerror``. This callable must accept three arguments: ``func``,
  ``path`` and ``exc_info``. ``func`` is a bound method object, *for
  example* ``your_host_object.listdir``. ``path`` is the path that was
  most recently passed to the respective method (``listdir``,
  ``remove``, ``rmdir``). ``exc_info`` is the exception info as
  returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module
  and adapted for ``ftputil``.
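
A callable for the ``onerror`` argument of ``rmtree`` could look like
the following sketch. The names ``failures`` and ``remember_error``
are made up for this example; the callable just records what went
wrong instead of raising an exception::

    failures = []

    def remember_error(func, path, exc_info):
        """Record which method failed on which path, and why."""
        # `exc_info` is a tuple (type, value, traceback)
        exception_value = exc_info[1]
        failures.append((func.__name__, path, exception_value))

With a connected ``FTPHost`` instance ``host``, this would be used as
``host.rmtree("some_directory", onerror=remember_error)``. Afterwards,
``failures`` lists every step that couldn't be carried out.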

Removing files and links
````````````````````````

- ``remove(path)``

  removes a file or link on the remote host (similar to ``os.remove``).

- ``unlink(path)``

  is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path; similar to ``os.listdir``. The special names
  ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and others) rely on the directory
listing format used by the FTP server. When connecting to a host,
``FTPHost``'s constructor tries to guess the right format, which
mostly succeeds. However, if you get strange results or
``ParserError`` exceptions by a mere ``lstat`` call, please `file a
bug report`_.

If ``lstat`` or ``stat`` yield wrong modification dates or times, look
at the methods that deal with time zone differences (`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat`` (a "tuple" with
  additional attributes; see the documentation of the ``os`` module for
  details). However, due to the nature of the application, there are
  some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` cannot
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but may not be for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Items that can't be determined at all are set to ``None``.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat`` and I know of no approach which
    mends this problem.

..

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. You may also want to `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered, a
  ``PermanentError`` is raised.
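
Because items that can't be determined are ``None``, code which
evaluates ``lstat`` or ``stat`` results should be prepared for that.
The following sketch assumes that the result offers the attributes
``st_size`` and ``st_mtime``, named as for ``os.stat`` results;
``describe`` is a hypothetical helper::

    def describe(stat_result):
        """Return a short text for a stat result; tolerate `None` items."""
        size = stat_result.st_size
        mtime = stat_result.st_mtime
        if size is None:
            size_text = "size unknown"
        else:
            size_text = "%d bytes" % size
        if mtime is None:
            mtime_text = "modification time unknown"
        else:
            mtime_text = "modified at %d" % mtime
        return "%s, %s" % (size_text, mtime_text)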

.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)
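
Since the ``path`` methods follow ``os.path``, code can be written
against either an ``FTPHost`` object or the ``os`` module. As a rough
sketch (``total_file_size`` is a name chosen for this example)::

    def total_file_size(host, directory):
        """Sum the sizes of all regular files directly in `directory`."""
        total = 0
        for name in host.listdir(directory):
            path = host.path.join(directory, name)
            if host.path.isfile(path):
                total += host.path.getsize(path)
        return total

For a local directory, ``total_file_size(os, ".")`` works after
``import os``; for a remote one, pass a connected ``FTPHost`` instance
instead.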

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program very slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.

For this reason, ``ftputil`` by default saves (caches) the results
from directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can - if necessary at all - be controlled via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example,
after calling

::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons for modifying cache parameters. The first is when the
number of possible entries is too low. You may notice that when you
are processing very large directories (e. g. above 1000 directories or
files) and the program becomes much slower than before. It's common
for code to read a directory with ``listdir`` and then process the
found directories and files. For this application, it's a good rule of
thumb to set the cache size to somewhat more than the number of
directory entries fetched with ``listdir``. This is done by the
``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximum number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.)

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.
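
To illustrate this replacement strategy, here is a tiny
least-recently-used cache. It is *not* the code used by ``ftputil``;
the class name ``LRUCache`` and its interface are assumptions made for
this sketch, which needs ``collections.OrderedDict`` (Python 2.7 or
later)::

    from collections import OrderedDict

    class LRUCache(object):
        """Tiny least-recently-used cache, for illustration only."""

        def __init__(self, size):
            self.size = size
            self._entries = OrderedDict()

        def __setitem__(self, path, stat_result):
            # Drop an old entry for the same path so that the new one
            # counts as most recently used.
            self._entries.pop(path, None)
            self._entries[path] = stat_result
            while len(self._entries) > self.size:
                # discard the entry which was unused for the longest time
                self._entries.popitem(last=False)

        def __getitem__(self, path):
            # re-insert the entry to mark it as recently used
            stat_result = self._entries.pop(path)
            self._entries[path] = stat_result
            return stat_result

        def __contains__(self, path):
            return path in self._entries

With a size of 2, storing entries for ``/a`` and ``/b``, reading
``/a`` and then storing ``/c`` discards ``/b``, because ``/a`` was
used more recently.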

Caching is so effective because it reduces network accesses. This can
also be a disadvantage if the file system data on the remote server
changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. Changes unknown to the ``FTPHost`` object which reads its cache
are a different matter. An obvious example are changes by programs
running on the remote host. On the other hand, cache inconsistencies
can also occur if two ``FTPHost`` objects change a file system
simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        #  knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. After some thinking, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write accesses to a server, you have to handle them explicitly.

The most useful tool for this probably is the ``invalidate`` method.
In the example above, it could be used as::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        #  knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.
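
If you need this in several places, the invalidation step can be put
into a small helper. ``fresh_stat`` is a hypothetical function, not
part of ``ftputil``; the sketch assumes that ``invalidate`` may also
be called for a path which currently isn't in the cache::

    def fresh_stat(host, path):
        """Return a stat result for `path`, bypassing any cached entry."""
        # `invalidate` expects an absolute path
        absolute_path = host.path.abspath(path)
        host.stat_cache.invalidate(absolute_path)
        return host.stat(path)

This costs a directory listing from the server on every call, so it
makes sense only for paths which other clients are likely to change.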

By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days, a stat call may still access cache data that old. To avoid this,
you can set the ``max_age`` attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. This
means any older entry won't be retrieved from the cache; instead, its
data is fetched again from the remote host (and then again stored for
up to an hour). To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

If you are certain that the cache is in the way, you can disable and
later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get
updated with newer data during this period. Note that even when the
cache is disabled, the file system data in the code can become
inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!

Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)``

  iterates over a directory tree, similar to `os.walk`_ in Python 2.3
  and above. Actually, ``FTPHost.walk`` uses the code from Python with
  just the necessary modifications, so see the linked documentation.

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)``

  Similar to ``os.path.walk``, the ``walk`` method in
  `FTPHost.path`_ can be used.
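
As ``FTPHost.walk`` follows ``os.walk``, a generator over all files
below a directory can be sketched like this (``iter_files`` is a name
chosen for this example, and it is assumed that ``walk`` yields the
same kind of ``(dirpath, dirnames, filenames)`` tuples as
``os.walk``)::

    def iter_files(host, top):
        """Yield the paths of all files below the directory `top`."""
        for dirpath, dirnames, filenames in host.walk(top):
            for filename in filenames:
                yield host.path.join(dirpath, filename)

The function accepts a connected ``FTPHost`` instance as well as the
``os`` module, which is handy for trying out code on the local file
system first.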

Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

.. _`FTPHost.chmod`:

- ``chmod(path, mode)``

  sets the access mode (permission flags) for the given path. The mode
  is an integer as returned for the mode by the ``stat`` and ``lstat``
  methods. Be careful: Usually, mode values are written as octal
  numbers, for example 0755 to make a directory readable and writable
  for the owner, but not writable for the group and others. If you
  want to use such octal values, rely on Python's support for them::

    host.chmod("some_directory", 0755)

  Note the leading zero, which marks the literal as octal. Without
  it, 755 would be read as a decimal number.

  Not all FTP servers support the ``chmod`` command. In case of
  an exception, how do you know if the path doesn't exist or if
  the command itself is invalid? If the FTP server complies with
  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
  command isn't allowed. ``ftputil`` maps this special error
  response to a ``CommandNotImplementedError`` which is derived from
  ``PermanentError``.

  So you need code like this::

    import ftputil
    from ftputil import ftp_error

    host = ftputil.FTPHost(server, user, password)
    try:
        host.chmod("some_file", 0644)
    except ftp_error.CommandNotImplementedError:
        # chmod not supported
        ...
    except ftp_error.PermanentError:
        # possibly a non-existent file
        ...

  Because the ``CommandNotImplementedError`` is more specific, you
  have to test for it first.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object source to the
  file-like object target. The only difference from
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you. Possibly you are
  lucky and someone has already written a parser you can use. Please
  ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_
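
To make the ``mode`` argument of ``chmod`` less error-prone, the
permission bits can be assembled from the constants in Python's
``stat`` module instead of being written as an octal literal. The
sketch below only computes integers and needs no FTP connection; the
names ``DIR_MODE`` and ``FILE_MODE`` are illustrative:

```python
import stat

# rwxr-xr-x: full access for the owner, read and search for the rest
DIR_MODE = (stat.S_IRWXU |
            stat.S_IRGRP | stat.S_IXGRP |
            stat.S_IROTH | stat.S_IXOTH)

# rw-r--r--: read and write for the owner, read-only for the rest
FILE_MODE = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH

# the same values, parsed from their usual octal notation
assert DIR_MODE == int("755", 8)
assert FILE_MODE == int("644", 8)
```

Either constant could then be passed as the second argument of
``host.chmod``.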


File-like objects
-----------------

Construction
~~~~~~~~~~~~

Basics
``````

``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.

- ``FTPHost.file(path, mode='r')``

  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the
  ``getcwd`` method). As with local file objects, the default mode is
  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
  "wb".

- ``FTPHost.open(path, mode='r')``

  is an alias for ``file`` (see above).
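
A minimal sketch of working with such objects, assuming ``host`` is an
``FTPHost`` instance: the hypothetical helper ``copy_remote_to_local``
opens a remote file in binary mode and copies it into an already opened
local file with ``shutil.copyfileobj``, closing the remote file even if
the copy fails:

```python
import shutil

def copy_remote_to_local(host, remote_path, local_file):
    """
    Copy the remote file `remote_path` into the open, writable
    file-like object `local_file`.
    """
    remote_file = host.file(remote_path, "rb")
    try:
        shutil.copyfileobj(remote_file, local_file)
    finally:
        # close the remote file whether or not the copy succeeded
        remote_file.close()
```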

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can also use Python's `with statement`_ with the ``FTPFile``
objects returned by ``FTPHost.file``::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")

At the end of the ``with`` block, the file will be closed
automatically.

If something goes wrong during the construction of the file or in the
body of the ``with`` statement, the file will be closed as well.
Exceptions will be propagated (as with ``try ... finally``).
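
For code that must still run on Python versions before 2.5, roughly
the same guarantee can be spelled out with ``try ... finally``. This is
a sketch of the equivalent control flow, not code taken from
``ftputil``; ``write_text`` is a hypothetical helper name:

```python
def write_text(host, path, text):
    """Write `text` to the remote file `path` and always close it."""
    f = host.file(path, "w")
    try:
        f.write(text)
    finally:
        # runs whether or not `write` raised an exception
        f.close()
```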

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods

::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is also
supported, i. e. you can use a loop to read a file line by line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

This feature obsoletes the ``xreadlines`` method which is deprecated
and will be removed in ``ftputil`` version 2.5.

For more on file objects, see the section `File objects`_ in the
Python Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.


Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in your own custom parser with a single
method call.

For this, you need to write a parser class by inheriting from the
class ``Parser`` in the ``ftp_stat`` module. Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like

::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in
the following table.

===== ========== ============ =============== ================================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== ================================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== ================================

If you can't extract all the desired data from a line (for
example, the MS format doesn't contain any information about the
owner of a file), set the corresponding values in the ``StatResult``
instance to ``None``.

Parser classes can use several helper methods which are defined in
the class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
  an appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number like the one returned by ``time.mktime``.
  Note that the method expects the timestamp string already split at
  whitespace.
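
To illustrate what a helper like ``parse_unix_mode`` has to do, here
is an independent sketch that converts a ten-character mode string into
an ``st_mode`` integer with the help of the ``stat`` module. It is not
the ``ftputil`` implementation: it knows only three file types, treats
every character other than ``-`` as a set permission bit (so setuid,
setgid and sticky flags are not handled), and the name
``mode_from_string`` is made up for this example:

```python
import stat

# file type character -> file type bits
_FILE_TYPES = {
    "-": stat.S_IFREG,
    "d": stat.S_IFDIR,
    "l": stat.S_IFLNK,
}

def mode_from_string(mode_string):
    """Convert a string like "drwxr-xr-x" to an `st_mode` integer."""
    if len(mode_string) != 10:
        raise ValueError("invalid mode string %r" % mode_string)
    try:
        mode = _FILE_TYPES[mode_string[0]]
    except KeyError:
        raise ValueError("unsupported file type %r" % mode_string[0])
    # the nine permission characters map to the bits from octal 400
    # (owner may read) down to octal 1 (others may execute)
    for position, char in enumerate(mode_string[1:]):
        if char != "-":
            mode |= 1 << (8 - position)
    return mode
```

The result can be checked with the predicates from the ``stat`` module,
for example ``stat.S_ISDIR(mode_from_string("drwxr-xr-x"))``.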

Additionally, there's an attribute ``_month_numbers`` which maps
three-letter month abbreviations to integers.
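
The following sketch shows how such a month table can be combined with
``time.mktime`` to turn the already split parts of a Unix-style
timestamp into a float. The year handling for entries without a year is
a simplifying assumption of this example (use the current year, or the
previous one if the result would lie in the future), and the names
``MONTH_NUMBERS`` and ``unix_time`` are hypothetical:

```python
import time

MONTH_NUMBERS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

def unix_time(month_abbreviation, day, year_or_time, now=None):
    """
    Return seconds since the epoch (in the client's local time) for
    arguments like "May", "26", "2005" or "Nov", "23", "02:33".
    """
    if now is None:
        now = time.time()
    month = MONTH_NUMBERS[month_abbreviation.lower()]
    day = int(day)
    has_time = ":" in year_or_time
    if has_time:
        hour, minute = [int(part) for part in year_or_time.split(":")]
        year = time.localtime(now)[0]
    else:
        hour, minute = 0, 0
        year = int(year_or_time)
    # -1 lets `mktime` decide whether daylight saving time applies
    timestamp = time.mktime((year, month, day, hour, minute, 0, 0, 0, -1))
    if has_time and timestamp > now:
        # assumption: a listing without a year doesn't lie in the future
        timestamp = time.mktime(
            (year - 1, month, day, hour, minute, 0, 0, 0, -1))
    return timestamp
```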

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help you write one.


FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_.

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you have already tried the
`latest version`_ of ``ftputil``. The bug might already be fixed
there.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you
are unsure if the behaviour you found is a bug or not, you can write
to the `ftputil mailing list`_. In *either* case you *must not*
include confidential information (user id, password, file names, etc.)
in the problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. On the other hand,
you can use M2Crypto_ (in the source code archive, look for the
file ``M2Crypto/ftpslib.py``) which has a class derived from
``ftplib.FTP`` that supports SSL. You can then use a class
(not an object of it) similar to the following as a "session
factory" in ``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        # 21 is the standard FTP control port
        def __init__(self, host, userid, password, port=21):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            self.connect(host, port)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.
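
As an illustration of that approach, a session factory for a
non-standard port can be a small subclass of ``ftplib.FTP``. The class
name ``PortFTPSession`` and the port number 2121 are assumptions of
this sketch; the section `FTPHost construction`_ remains the
authoritative description:

```python
import ftplib

class PortFTPSession(ftplib.FTP):
    """Session class that connects to a fixed, non-standard port."""

    # hypothetical port; replace it with the one your server uses
    port = 2121

    def __init__(self, host, userid, password):
        # no host argument here, so nothing is connected yet
        ftplib.FTP.__init__(self)
        self.connect(host, self.port)
        self.login(userid, password)
```

The class itself (not an instance of it) would then be passed as the
``session_factory`` argument of the ``FTPHost`` constructor.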

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        # 21 is the standard FTP control port
        def __init__(self, host, userid, password, port=21):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on setting the
`time shift`_. It may even be sufficient to call `synchronize_times`_.
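
A sketch of that order of operations, assuming ``host`` is an
``FTPHost`` instance that offers ``synchronize_times`` and
``download_if_newer`` as described in the sections on the time shift
and on conditional transfers; the helper name ``fetch_if_newer`` is
hypothetical:

```python
def fetch_if_newer(host, remote_path, local_path):
    """
    Let `host` determine the time shift between server and client
    first, then request a conditional download in binary mode.
    """
    host.synchronize_times()
    return host.download_if_newer(remote_path, local_path, "b")
```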

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which converts whatever it
presumes to be line endings and thus corrupts binary files. Pass "b"
as the ``mode`` argument (see `Uploading and downloading files`_).
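
The effect can be reproduced without any FTP connection. The sketch
below merely imitates a text-mode line ending conversion on a few bytes
that resemble the start of a PNG file; it is not the conversion code
used by ``ftputil``:

```python
def convert_line_endings(data):
    """Imitate a text-mode transfer: turn CR LF pairs into LF."""
    return data.replace(b"\r\n", b"\n")

# binary data that happens to contain the byte pair CR LF
binary_data = b"\x89PNG\r\n\x1a\n"
converted = convert_line_endings(binary_data)

# one byte got lost, so the "file" is no longer the same
assert converted != binary_data
assert len(converted) == len(binary_data) - 1
```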

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to the implementation of ``lstat`` it cannot return a sensible
  value for the root directory ``/`` though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads.

- ``FTPFile`` objects in text mode *may not* support charsets with more
  than one byte per character. Please email me your experiences
  (address above), if you work with multibyte text streams in FTP
  sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.


Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net>, in part based on suggestions
from users.

The ``lrucache`` module is written by Evan Prodromou
<evan@bad.dynu.ca>.

Feedback is appreciated. :-)