``ftputil`` -- a high-level FTP client library
==============================================

:Version:   2.4
:Date:      2009-02-15
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html
12
13.. contents::
14
15
16Introduction
17------------
18
19The ``ftputil`` module is a high-level interface to the ftplib_
20module. The `FTPHost objects`_ generated from it allow many operations
21similar to those of os_, `os.path`_ and `shutil`_.
22
23.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
24.. _os: http://www.python.org/doc/current/lib/module-os.html
25.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
26.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html
27
28Examples::
29
30    import ftputil
31
32    # download some files from the login directory
33    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
34    names = host.listdir(host.curdir)
35    for name in names:
36        if host.path.isfile(name):
37            host.download(name, name, 'b')  # remote, local, binary mode
38
39    # make a new directory and copy a remote file into it
40    host.mkdir('newdir')
41    source = host.file('index.html', 'r')         # file-like object
42    target = host.file('newdir/index.html', 'w')  # file-like object
43    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
44    source.close()
45    target.close()
46
47Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
48modification time of a file. The latter can also follow links, similar
49to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.
50
51.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698
52
53
54``ftputil`` features
55--------------------
56
57* Method names are familiar from Python's ``os``, ``os.path`` and
58  ``shutil`` modules
59
60* Remote file system navigation (``getcwd``, ``chdir``)
61
62* Upload and download files (``upload``, ``upload_if_newer``,
63  ``download``, ``download_if_newer``)
64
65* Time zone synchronization between client and server (needed
66  for ``upload_if_newer`` and ``download_if_newer``)
67
68* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
69  ``rmtree``) and remove files (``remove``)
70
71* Get information about directories, files and links (``listdir``,
72  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
73  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)
74
75* Iterate over remote file systems (``walk``)
76
77* Local caching of results from ``lstat`` and ``stat`` calls to reduce
78  network access (also applies to ``exists``, ``getmtime`` etc.).
79
80* Read files from and write files to remote hosts via
81  file-like objects (``FTPHost.file``; the generated file-like objects
82  have many common methods like ``read``, ``readline``, ``readlines``,
83  ``write``, ``writelines``, ``close`` and can do automatic line
84  ending conversions on the fly, i. e. text/binary mode).
85
86
87Exception hierarchy
88-------------------
89
90The exceptions are in the namespace of the ``ftp_error`` module, e. g.
91``ftp_error.TemporaryError``. Getting the exception classes from the
92"package module" ``ftputil`` is deprecated and will no longer be
93supported in ``ftputil`` version 2.5.
94
95The exception classes are organized as follows::
96
97    FTPError
98        FTPOSError(FTPError, OSError)
99            PermanentError(FTPOSError)
100                CommandNotImplementedError(PermanentError)
101            TemporaryError(FTPOSError)
102        FTPIOError(FTPError)
103        InternalError(FTPError)
104            InaccessibleLoginDirError(InternalError)
105            ParserError(InternalError)
106            RootDirError(InternalError)
107            TimeShiftError(InternalError)
108
109and are described here:
110
111- ``FTPError``
112
113  is the root of the exception hierarchy of the module.
114
115- ``FTPOSError``
116
117  is derived from ``OSError``. This is for similarity between the
118  os module and ``FTPHost`` objects. Compare
119
120  ::
121
122    try:
123        os.chdir('nonexisting_directory')
124    except OSError:
125        ...
126
127  with
128
129  ::
130
131    host = ftputil.FTPHost('host', 'user', 'password')
132    try:
133        host.chdir('nonexisting_directory')
134    except OSError:
135        ...
136
137  Imagine a function
138
139  ::
140
141    def func(path, file):
142        ...
143
  which works on the local file system and catches ``OSError``
  exceptions. If you change the parameter list to
146
147  ::
148
149    def func(path, file, os=os):
150        ...
151
  where ``os`` denotes the ``os`` module, you can also call the function as
153
154  ::
155
156    host = ftputil.FTPHost('host', 'user', 'password')
157    func(path, file, os=host)
158
159  to use the same code for both a local and remote file system.
160  Another similarity between ``OSError`` and ``FTPOSError`` is that
161  the latter holds the FTP server return code in the ``errno``
162  attribute of the exception object and the error text in
163  ``strerror``.
164
165- ``PermanentError``
166
167  is raised for 5xx return codes from the FTP server. This
168  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
169  ``ftplib.error_perm`` are *not* identical).
170
171- ``CommandNotImplementedError``
172
173  indicates that an underlying command the code tries to use is not
174  implemented. For an example, see the description of the
175  `FTPHost.chmod`_ method.
176
177- ``TemporaryError``
178
179  is raised for FTP return codes from the 4xx category. This
180  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
181  ``ftplib.error_temp`` are *not* identical).
182
183- ``FTPIOError``
184
185  denotes an I/O error on the remote host. This appears
186  mainly with file-like objects which are retrieved by invoking
187  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare
188
189  ::
190
191    >>> try:
192    ...     f = open('not_there')
193    ... except IOError, obj:
194    ...     print obj.errno
195    ...     print obj.strerror
196    ...
197    2
198    No such file or directory
199
200  with
201
202  ::
203
204    >>> host = ftputil.FTPHost('host', 'user', 'password')
205    >>> try:
206    ...     f = host.open('not_there')
207    ... except IOError, obj:
208    ...     print obj.errno
209    ...     print obj.strerror
210    ...
211    550
212    550 not_there: No such file or directory.
213
214  As you can see, both code snippets are similar. However, the error
215  codes aren't the same.
216
217- ``InternalError``
218
219  subsumes exception classes for signaling errors due to limitations
220  of the FTP protocol or the concrete implementation of ``ftputil``.
221
222- ``InaccessibleLoginDirError``
223
224  This exception is only raised if *both* of the following conditions
225  are met:
226
227  - The directory in which "you" are placed upon login is not
228    accessible, i. e. a ``chdir`` call with the directory as
229    argument would fail.
230
231  - You try to access a path which contains whitespace.
232
233- ``ParserError``
234
235  is used for errors during the parsing of directory
236  listings from the server. This exception is used by the ``FTPHost``
237  methods ``stat``, ``lstat``, and ``listdir``.
238
239- ``RootDirError``
240
  Because of the implementation of the ``lstat`` method, it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)
244
245  This problem does *not* affect stat calls on items *in* the root
246  directory.
247
248- ``TimeShiftError``
249
  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a full
  hour.
253
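The consequences of this hierarchy for ``except`` clauses can be tried out without an FTP server. The classes below are stand-ins that only copy the inheritance tree shown above; they are not imported from ``ftputil``, and ``classify`` is a hypothetical helper for this sketch.

```python
# Stand-in classes mirroring the documented hierarchy (assumption:
# they only copy the inheritance tree, not ftputil's real classes).
class FTPError(Exception):
    pass

class FTPOSError(FTPError, OSError):
    pass

class PermanentError(FTPOSError):
    pass

class CommandNotImplementedError(PermanentError):
    pass

class TemporaryError(FTPOSError):
    pass


def classify(exc):
    """Return a label for `exc`, testing the most specific class first."""
    try:
        raise exc
    except CommandNotImplementedError:
        return "command not implemented"
    except PermanentError:
        return "permanent (5xx)"
    except TemporaryError:
        return "temporary (4xx)"
    except OSError:
        # Also reached by ordinary local `OSError`s.
        return "other OSError"


print(classify(CommandNotImplementedError(502, "not implemented")))
print(classify(PermanentError(550, "no such file")))
print(classify(TemporaryError(421, "service not available")))
print(classify(FileNotFoundError(2, "No such file or directory")))
```

Because the stand-in ``FTPOSError`` derives from ``OSError``, code that already catches ``OSError`` for local file operations also catches the FTP-related classes, which is the similarity described for ``FTPOSError`` above.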
254
255``FTPHost`` objects
256-------------------
257
258.. _`FTPHost construction`:
259
260Construction
261~~~~~~~~~~~~
262
263Basics
264``````
265
266``FTPHost`` instances can be generated with the following call::
267
268    host = ftputil.FTPHost(host, user, password, account,
269                           session_factory=ftplib.FTP)
270
271The first four parameters are strings with the same meaning as for the
272FTP class in the ``ftplib`` module.
273
274Session factories
275`````````````````
276
The keyword argument ``session_factory`` may be used to generate FTP
connections with factories other than the default ``ftplib.FTP``. For
example, the M2Crypto distribution uses a secure FTP class which is
derived from ``ftplib.FTP``.
281
282In fact, all positional and keyword arguments other than
283``session_factory`` are passed to the factory to generate a new
284background session. This happens for every remote file that is opened;
285see below.
286
This functionality of the constructor also allows you to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default, but ``ftplib.FTP`` only offers this by means of its
``connect`` method, not via its constructor. The solution is to use a
wrapper class::
295
296    import ftplib
297    import ftputil
298
299    EXAMPLE_PORT = 50001
300
301    class MySession(ftplib.FTP):
302        def __init__(self, host, userid, password, port):
303            """Act like ftplib.FTP's constructor but connect to another port."""
304            ftplib.FTP.__init__(self)
305            self.connect(host, port)
306            self.login(userid, password)
307
    # pass the class itself as the factory, not an instance like MySession()
309    host = ftputil.FTPHost(host, userid, password,
310                           port=EXAMPLE_PORT, session_factory=MySession)
311    # use `host` as usual
312
313On login, the format of the directory listings (needed for stat'ing
314files and directories) should be determined automatically. If not,
315please `file a bug report`_.
316
317.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes
318
319Support for the ``with`` statement
320``````````````````````````````````
321
If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_::
324
325    # not needed for Python 2.6 and later
326    from __future__ import with_statement
327
328    import ftputil
329
330    with ftputil.FTPHost(host, user, password) as host:
331        print host.listdir(host.curdir)
332
333After the ``with`` block, the ``FTPHost`` instance and the
334associated FTP sessions will be closed automatically.
335
336If something goes wrong during the ``FTPHost`` construction or in the
337body of the ``with`` statement, the instance is closed as well.
338Exceptions will be propagated (as with ``try ... finally``).
339
340.. _`with statement`: http://www.python.org/dev/peps/pep-0343/
341
342``FTPHost`` attributes and methods
343~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
344
345Attributes
346``````````
347
348- ``curdir``, ``pardir``, ``sep``
349
350  are strings which denote the current and the parent directory on the
351  remote server. ``sep`` holds the path separator. Though `RFC 959`_
352  (File Transfer Protocol) notes that these values may depend on the
353  FTP server implementation, the Unix variants seem to work well in
354  practice, even for non-Unix servers.
355
356Remote file system navigation
357`````````````````````````````
358
359- ``getcwd()``
360
  returns the absolute path of the current directory on the remote
  host. This method acts similarly to ``os.getcwd``.
363
364- ``chdir(directory)``
365
366  sets the current directory on the FTP server. This resembles
367  ``os.chdir``, as you may have expected.
368
369Uploading and downloading files
370```````````````````````````````
371
372- ``upload(source, target, mode='')``
373
  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).
  The mode may be ``''`` or ``'a'`` for ASCII uploads or ``'b'`` for
  binary uploads. ASCII mode is the default, similar to regular local
  file objects.
381
382- ``download(source, target, mode='')``
383
  performs a download from the remote source to a local target file.
  Both ``source`` and ``target`` are strings. Most of the description
  of the ``upload`` method applies here, too.
387
388.. _`upload_if_newer`:
389
390- ``upload_if_newer(source, target, mode='')``
391
392  is similar to the ``upload`` method. The only difference is that the
393  upload is only invoked if the time of the last modification for the
394  source file is more recent than that of the target file or the
395  target doesn't exist at all. If an upload actually happened, the
396  return value is a true value, else a false value.
397
398  Note that this method only checks the existence and/or the
399  modification time of the source and target file; it can't recognize
400  a change in the transfer mode, e. g.
401
402  ::
403
404    # transfer in ASCII mode
405    host.upload_if_newer('source_file', 'target_file', 'a')
406    # won't transfer the file again, which is bad!
407    host.upload_if_newer('source_file', 'target_file', 'b')
408
409  Similarly, if a transfer is interrupted, the remote file will have a
410  newer modification time than the local file, and thus the transfer
411  won't be repeated if ``upload_if_newer`` is used a second time.
412  There are at least two possibilities after a failed upload:
413
414  - use ``upload`` instead of ``upload_if_newer``, or
415
416  - remove the incomplete target file with ``FTPHost.remove``, then
417    use ``upload`` or ``upload_if_newer`` to transfer it again.
418
  If it seems that a file is uploaded unnecessarily, or not uploaded
  when it should be, read the subsection on `time shift`_ settings.
421
422.. _`download_if_newer`:
423
424- ``download_if_newer(source, target, mode='')``
425
  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, or not
  downloaded when it should be, read the subsection on `time zone
  correction`_.
433
434.. _`time shift`:
435.. _`time zone correction`:
436
437Time zone correction
438````````````````````
439
440If the client where ``ftputil`` runs and the server have a different
441understanding of their local times, this has to be taken into account
442for ``upload_if_newer`` and ``download_if_newer`` to work correctly.
443
444Note that even if the client and the server are in the same time zone
445(or even on the same computer), the time shift value (see below) may
446be different from zero. For example, my computer is set to use local
447time whereas the server running on the very same host insists on using
448UTC time.
449
450.. _`set_time_shift`:
451
452- ``set_time_shift(time_shift)``
453
454  sets the so-called time shift value, measured in seconds. The time
455  shift is the difference between the local time of the server and the
456  local time of the client at a given moment, i. e. by definition
457
458  ::
459
460    time_shift = server_time - client_time
461
462  Setting this value is important for `upload_if_newer`_ and
463  `download_if_newer`_ to work correctly even if the time zone of the
464  FTP server differs from that of the client. Note that the time shift
465  value *can be negative*.
466
  If the time shift value is invalid, e. g. not a multiple of a full
  hour or with an absolute value larger than 24 hours, a
  ``TimeShiftError`` is raised.
470
471  See also `synchronize_times`_ for a way to set the time shift with a
472  simple method call.
473
474- ``time_shift()``
475
476  returns the currently-set time shift value. See ``set_time_shift``
477  above for its definition.
478
479.. _`synchronize_times`:
480
481- ``synchronize_times()``
482
483  synchronizes the local times of the server and the client, so that
484  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
485  if the client and the server use different time zones. For this
486  to work, *all* of the following conditions must be true:
487
488  - The connection between server and client is established.
489
490  - The client has write access to the directory that is current when
491    ``synchronize_times`` is called.
492
493  If you can't fulfill these conditions, you can nevertheless set the
494  time shift value explicitly with `set_time_shift`_. Trying to call
495  ``synchronize_times`` if the above conditions aren't met results in
496  a ``TimeShiftError`` exception.
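
The definition ``time_shift = server_time - client_time`` and the validity rules above can be sketched in plain Python. The helper names, the rounding to the nearest full hour and the use of ``ValueError`` are assumptions of this sketch, not part of ``ftputil``.

```python
def rounded_time_shift(server_time, client_time):
    """Return `server_time - client_time`, rounded to full hours.

    Both arguments are in seconds. Rounding and the 24-hour limit
    follow the validity rules in the text; raising `ValueError` is an
    assumption of this sketch.
    """
    hour = 60 * 60
    raw_shift = server_time - client_time
    shift = int(round(raw_shift / hour)) * hour
    if abs(shift) > 24 * hour:
        raise ValueError("time shift larger than 24 hours: %r" % shift)
    return shift


def to_client_time(server_mtime, time_shift):
    """Convert a server-side timestamp to the client's local time."""
    return server_mtime - time_shift


# The server clock is roughly two hours ahead of the client clock.
shift = rounded_time_shift(server_time=1000000 + 7195, client_time=1000000)
print(shift)
print(to_client_time(1007200, shift))

# A server that is behind the client yields a negative time shift.
print(rounded_time_shift(server_time=1000000 - 3605, client_time=1000000))
```

A value computed this way could then be handed to ``set_time_shift``, which is the documented alternative when ``synchronize_times`` can't be used.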
497
498Creating and removing directories
499`````````````````````````````````
500
501- ``mkdir(path, [mode])``
502
503  makes the given directory on the remote host. This doesn't construct
504  "intermediate" directories which don't already exist. The ``mode``
505  parameter is ignored; this is for compatibility with ``os.mkdir`` if
506  an ``FTPHost`` object is passed into a function instead of the
507  ``os`` module. See the explanation in the subsection `Exception
508  hierarchy`_.
509
510- ``makedirs(path, [mode])``
511
  works similarly to ``mkdir`` (see above), but also makes intermediate
  directories like ``os.makedirs``. The ``mode`` parameter is only
  there for compatibility with ``os.makedirs`` and is ignored.

- ``rmdir(path)``

  removes the given remote directory. If it's not empty, a
  ``PermanentError`` is raised.
520
521- ``rmtree(path, ignore_errors=False, onerror=None)``
522
523  removes the given remote, possibly non-empty, directory tree.
524  The interface of this method is rather complex, in favor of
525  compatibility with ``shutil.rmtree``.
526
527  If ``ignore_errors`` is set to a true value, errors are ignored.
528  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
529  set, all exceptions occurring during the tree iteration and
530  processing are raised. These exceptions are all of type
531  ``PermanentError``.
532
  To distinguish between different kinds of errors, pass in a callable
  for ``onerror``. This callable must accept three arguments:
  ``func``, ``path`` and ``exc_info``. ``func`` is a bound method
  object, *for example* ``your_host_object.listdir``. ``path`` is the
  path that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.
540
541  The code of ``rmtree`` is taken from Python's ``shutil`` module
542  and adapted for ``ftputil``.
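
The three-argument ``onerror`` callable described for ``rmtree`` might look like the following sketch. To stay runnable without a server, a local stand-in function named ``remove`` simulates the failing call; ``make_error_collector`` is a hypothetical helper.

```python
import sys


def make_error_collector():
    """Return an `onerror` callable and the list it appends to."""
    errors = []

    def onerror(func, path, exc_info):
        # `func` is the callable that failed, `path` its argument and
        # `exc_info` the tuple as returned by `sys.exc_info()`.
        exc_type, exc_value, _traceback = exc_info
        errors.append((func.__name__, path, exc_type.__name__))

    return onerror, errors


# Stand-in for a failing remote operation (no server is contacted).
def remove(path):
    raise OSError(550, "permission denied", path)


onerror, errors = make_error_collector()
try:
    remove("/some_dir/some_file")
except OSError:
    onerror(remove, "/some_dir/some_file", sys.exc_info())

print(errors)
```

With a connected host, the same callable would be passed as ``host.rmtree(path, onerror=onerror)``, following the signature given above.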
543
544Removing files and links
545````````````````````````
546
547- ``remove(path)``
548
549  removes a file or link on the remote host, similar to ``os.remove``.
550
551- ``unlink(path)``
552
553  is an alias for ``remove``.
554
555Retrieving information about directories, files and links
556`````````````````````````````````````````````````````````
557
558- ``listdir(path)``
559
560  returns a list containing the names of the files and directories
561  in the given path, similar to ``os.listdir``. The special names
562  ``.`` and ``..`` are not in the list.
563
The methods ``lstat`` and ``stat`` (and some others) rely on the
directory listing format used by the FTP server. When connecting to a
host, ``FTPHost``'s constructor tries to guess the right format, which
succeeds in most cases. However, if you get strange results or
``ParserError`` exceptions from a mere ``lstat`` call, please `file a
bug report`_.

If ``lstat`` or ``stat`` yields wrong modification dates or times, look
at the methods that deal with time zone differences (`time zone
correction`_).
574
575.. _`FTPHost.lstat`:
576
577- ``lstat(path)``
578
579  returns an object similar to that from ``os.lstat``. This is a
580  "tuple" with additional attributes; see the documentation of the
581  ``os`` module for details.
582
583  The result is derived by parsing the output of a ``DIR`` command on
584  the server. Therefore, the result from ``FTPHost.lstat`` can not
585  contain more information than the received text. In particular:
586
587  - User and group ids can only be determined as strings, not as
588    numbers, and that only if the server supplies them. This is
589    usually the case with Unix servers but maybe not for other FTP
590    server programs.
591
592  - Values for the time of the last modification may be rough,
593    depending on the information from the server. For timestamps
594    older than a year, this usually means that the precision of the
595    modification timestamp value is not better than days. For newer
596    files, the information may be accurate to a minute.
597
598  - Links can only be recognized on servers that provide this
599    information in the ``DIR`` output.
600
  - Stat attributes that can't be determined at all are set to
    ``None``. For example, a line of a directory listing may not
    contain the date/time of a directory's last modification.
604
605  - There's a special problem with stat'ing the root directory.
606    (Stat'ing things *in* the root directory is fine though.) In
607    this case, a ``RootDirError`` is raised. This has to do with the
608    algorithm used by ``(l)stat``, and I know of no approach which
609    mends this problem.
610
  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. You may also want to `write your own parser`_.
615
616.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
617.. _`write your own parser`: `Writing directory parsers`_
618
619.. _`FTPHost.stat`:
620
621- ``stat(path)``
622
623  returns ``stat`` information also for files which are pointed to by a
624  link. This method follows multiple links until a regular file or
625  directory is found. If an infinite link chain is encountered or the
626  target of the last link in the chain doesn't exist, a
627  ``PermanentError`` is raised.
628
629.. _`FTPHost.path`:
630
631``FTPHost`` objects contain an attribute named ``path``, similar to
632`os.path`_. The following methods can be applied to the remote host
633with the same semantics as for ``os.path``:
634
635::
636
637    abspath(path)
638    basename(path)
639    commonprefix(path_list)
640    dirname(path)
641    exists(path)
642    getmtime(path)
643    getsize(path)
644    isabs(path)
645    isdir(path)
646    isfile(path)
647    islink(path)
648    join(path1, path2, ...)
649    normcase(path)
650    normpath(path)
651    split(path)
652    splitdrive(path)
653    splitext(path)
654    walk(path, func, arg)
655
Like their Python counterparts in `os.path`_, ``ftputil``'s ``is...``
methods return ``False`` if they can't find the path given by their
argument.
659
660Local caching of file system information
661````````````````````````````````````````
662
Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program *very* slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.
670
671For this reason, ``ftputil`` by default saves the results from
672directory listings locally and reuses those results. This reduces
673network accesses and so speeds up the software a lot. However, since
674data is more rarely fetched from the server, the risk of obsolete data
675also increases. This will be discussed below.
676
677Caching can be controlled -- if necessary at all -- via the
678``stat_cache`` object in an ``FTPHost``'s namespace. For example,
679after calling
680
681::
682
683    host = ftputil.FTPHost(host, user, password, account,
684                           session_factory=ftplib.FTP)
685
686the cache can be accessed as ``host.stat_cache``.
687
688While ``ftputil`` usually manages the cache quite well, there are two
689possible reasons that may suggest modifying cache parameters.
690The first is when the number of possible entries is too low. You may
691notice that when you are processing very large directories, e. g.
692containing more than 1000 directories or files, and the program
693becomes much slower than before. It's common for code to read a
694directory with ``listdir`` and then process the found directories and
695files. For this application, it's a good rule of thumb to set the
696cache size to somewhat more than the number of directory entries
697fetched with ``listdir``. This is done by the ``resize`` method::
698
699    host.stat_cache.resize(2000)
700
701where the argument is the maximum number of ``lstat`` results to store
702(the default is 1000). Note that each path on the server, e. g.
703"/home/schwa/some_dir", corresponds to a single cache entry. Methods
704like ``exists`` or ``getmtime`` all derive their results from a
705previously fetched ``lstat`` result.
706
The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which haven't
been used for the longest time will be deleted to make room for newer
entries.
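
The eviction rule just described (drop the entries that haven't been used for the longest time) can be illustrated with a small least-recently-used cache. This is a sketch of the general technique only; ``LRUCacheSketch`` is a hypothetical class and says nothing about how ``ftputil`` implements ``stat_cache`` internally.

```python
from collections import OrderedDict


class LRUCacheSketch:
    """Keep at most `size` entries; drop the least recently used."""

    def __init__(self, size):
        self.size = size
        self._entries = OrderedDict()

    def __setitem__(self, path, stat_result):
        self._entries[path] = stat_result
        self._entries.move_to_end(path)
        self._shrink()

    def __getitem__(self, path):
        stat_result = self._entries[path]
        self._entries.move_to_end(path)  # mark as recently used
        return stat_result

    def __contains__(self, path):
        return path in self._entries

    def resize(self, new_size):
        self.size = new_size
        self._shrink()

    def _shrink(self):
        while len(self._entries) > self.size:
            # Remove the entry that hasn't been used for the longest time.
            self._entries.popitem(last=False)


cache = LRUCacheSketch(2)
cache["/a"] = "stat of /a"
cache["/b"] = "stat of /b"
cache["/a"]                 # "/a" is now more recently used than "/b"
cache["/c"] = "stat of /c"  # exceeds the size, so "/b" is dropped
print("/a" in cache, "/b" in cache, "/c" in cache)
```

The sketch also shows why a cache that is smaller than the directory being processed helps little: by the time the first entries are needed again, they have already been evicted.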
711
712Caching is so effective because it reduces network accesses. This can
713also be a disadvantage if the file system data on the remote server
714changes after a stat result has been retrieved; the client, when
715looking at the cached stat data, will use obsolete information.
716
There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. The second way is changes that are unknown to the ``FTPHost``
object which inspects its cache. The obvious example is changes made
by programs running on the remote host. Cache inconsistencies can also
occur if two ``FTPHost`` objects change a file system simultaneously::
726
727    host1 = ftputil.FTPHost(server, user1, password1)
728    host2 = ftputil.FTPHost(server, user1, password1)
729    try:
730        stat_result1 = host1.stat("some_file")
731        stat_result2 = host2.stat("some_file")
732        host2.remove("some_file")
733        # `host1` will still see the obsolete cache entry!
734        print host1.stat("some_file")
735        # will raise an exception since an `FTPHost` object
736        #  knows of its own changes
737        print host2.stat("some_file")
738    finally:
739        host1.close()
740        host2.close()
741
742At first sight, it may appear to be a good idea to have a shared cache
743among several ``FTPHost`` objects. After some thinking, this turns out
744to be very error-prone. For example, it won't help with different
745processes using ``ftputil``. So, if you have to deal with concurrent
746write/read accesses to a server, you have to handle them explicitly.
747
748The most useful tool for this is the ``invalidate`` method. In the
749example above, it could be used like this::
750
751    host1 = ftputil.FTPHost(server, user1, password1)
752    host2 = ftputil.FTPHost(server, user1, password1)
753    try:
754        stat_result1 = host1.stat("some_file")
755        stat_result2 = host2.stat("some_file")
756        host2.remove("some_file")
757        # invalidate using an absolute path
758        absolute_path = host1.path.abspath(
759                        host1.path.join(host1.curdir, "some_file"))
760        host1.stat_cache.invalidate(absolute_path)
761        # will now raise an exception as it should
762        print host1.stat("some_file")
763        # would raise an exception since an `FTPHost` object
764        #  knows of its own changes, even without `invalidate`
765        print host2.stat("some_file")
766    finally:
767        host1.close()
768        host2.close()
769
770The method ``invalidate`` can be used on any *absolute* path, be it a
771directory, a file or a link.
772
By default, the cache entries (if not replaced by newer ones) are
stored for an infinite time. That is, if you start your Python process
using ``ftputil`` and let it run for three days, a stat call may still
access cache data that old. To avoid this, you can set the ``max_age``
attribute::
778
779    host = ftputil.FTPHost(server, user, password)
780    host.stat_cache.max_age = 60 * 60  # = 3600 seconds
781
This sets the maximum age of entries in the cache to an hour. Any
older entry won't be retrieved from the cache; instead, its data is
fetched again from the remote host and then stored again for up to an
hour. To reset ``max_age`` to the default of unlimited age, i. e. cache
entries never expire, use ``None`` as value.
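
The effect of ``max_age`` can be pictured with a small sketch of an age check. ``AgingEntrySketch`` and its injectable ``clock`` are hypothetical and only illustrate the rule that ``None`` means "never expire"; they are not taken from ``ftputil``.

```python
import time


class AgingEntrySketch:
    """Store one value together with the time it was cached."""

    def __init__(self, value, clock=time.time):
        self.value = value
        self._clock = clock
        self._stored_at = clock()

    def is_fresh(self, max_age):
        # `max_age=None` means that entries never expire.
        if max_age is None:
            return True
        return self._clock() - self._stored_at <= max_age


# A fake clock makes the example independent of real time.
now = [0.0]
entry = AgingEntrySketch("stat result", clock=lambda: now[0])

now[0] = 1800.0   # half an hour later
fresh_after_30_min = entry.is_fresh(max_age=60 * 60)
now[0] = 7200.0   # two hours later
fresh_after_2_hours = entry.is_fresh(max_age=60 * 60)
fresh_without_limit = entry.is_fresh(max_age=None)
print(fresh_after_30_min, fresh_after_2_hours, fresh_without_limit)
```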
787
788If you are certain that the cache will be in the way, you can disable
789and later re-enable it completely with ``disable`` and ``enable``::
790
791    host = ftputil.FTPHost(server, user, password)
792    host.stat_cache.disable()
793    ...
794    host.stat_cache.enable()
795
796During that time, the cache won't be used; all data will be fetched
797from the network. After enabling the cache, its entries will be the
798same as when the cache was disabled, that is, entries won't get
799updated with newer data during this period. Note that even when the
800cache is disabled, the file system data in the code can become
801inconsistent::
802
803    host = ftputil.FTPHost(server, user, password)
804    host.stat_cache.disable()
805    if host.path.exists("some_file"):
806        mtime = host.path.getmtime("some_file")
807
808In that case, the file ``some_file`` may have been removed by another
809process between the calls to ``exists`` and ``getmtime``!
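
One way to avoid the check-then-use race is to attempt the operation and handle the failure. The sketch below uses the ``os=os`` parameter pattern from the exception section so that it runs against the local file system; ``getmtime_or_none`` is a hypothetical helper, and passing an ``FTPHost`` object relies on the documented fact that ``FTPOSError`` is derived from ``OSError``.

```python
import os
import tempfile


def getmtime_or_none(path, os=os):
    """Return the modification time of `path`, or None if it is gone.

    `os` may be the `os` module or any object with a compatible
    `path` attribute (assumption: an `FTPHost` instance qualifies).
    """
    try:
        return os.path.getmtime(path)
    except OSError:
        # The path doesn't exist or vanished in the meantime.
        return None


# Local demonstration; no FTP server is contacted.
with tempfile.TemporaryDirectory() as directory:
    existing = os.path.join(directory, "some_file")
    with open(existing, "w") as fobj:
        fobj.write("data")
    found = getmtime_or_none(existing) is not None
    missing = getmtime_or_none(os.path.join(directory, "not_there"))

print(found, missing)
```

For a remote host, the call would presumably read ``getmtime_or_none("some_file", os=host)``.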
810
811Iteration over directories
812``````````````````````````
813
814.. _`FTPHost.walk`:
815
816- ``walk(top, topdown=True, onerror=None)``
817
818  iterates over a directory tree, similar to `os.walk`_. Actually,
819  ``FTPHost.walk`` uses the code from Python with just the necessary
820  modifications, so see the linked documentation.
821
822.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707
823
824.. _`FTPHost.path.walk`:
825
826- ``path.walk(path, func, arg)``
827
828  Similar to ``os.path.walk``, the ``walk`` method in
829  `FTPHost.path`_ can be used, though ``FTPHost.walk`` is probably
830  easier to use.
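
Since ``FTPHost.walk`` is described as following `os.walk`_, code can be written against the walk interface and tried locally first. In this sketch ``count_files`` is a hypothetical helper; that ``host.walk`` yields the same ``(dirpath, dirnames, filenames)`` triples is an assumption based on the description above.

```python
import os
import tempfile


def count_files(top, walk=os.walk):
    """Count the files below `top` using an `os.walk`-style callable."""
    total = 0
    for dirpath, dirnames, filenames in walk(top):
        total += len(filenames)
    return total


# Local demonstration; no FTP server is contacted.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub", "subsub"))
    for relative in ("a.txt", "sub/b.txt", "sub/subsub/c.txt"):
        with open(os.path.join(top, *relative.split("/")), "w") as fobj:
            fobj.write("x")
    local_count = count_files(top)

print(local_count)
```

For a remote tree, the call would presumably be ``count_files(remote_directory, walk=host.walk)``.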

Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

.. _`FTPHost.chmod`:

- ``chmod(path, mode)``

  sets the access mode (permission flags) for the given path. The mode
  is an integer as returned for the mode by the ``stat`` and ``lstat``
  methods. Be careful: Usually, mode values are written as octal
  numbers, for example 0755 to make a directory readable and writable
  for the owner, but not writable for the group and others. If you
  want to use such octal values, rely on Python's support for them::

    host.chmod("some_directory", 0755)

  *Note the leading zero.*

  Not all FTP servers support the ``chmod`` command. In case of
  an exception, how do you know if the path doesn't exist or if
  the command itself is invalid? If the FTP server complies with
  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
  command isn't allowed. ``ftputil`` maps this special error
  response to a ``CommandNotImplementedError`` which is derived from
  ``PermanentError``.

  So you need code like this::

    host = ftputil.FTPHost(server, user, password)
    try:
        host.chmod("some_file", 0644)
    except ftp_error.CommandNotImplementedError:
        # chmod not supported
        ...
    except ftp_error.PermanentError:
        # possibly a non-existent file
        ...

  Because the ``CommandNotImplementedError`` is more specific, you
  have to test for it first.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object ``source`` to the
  file-like object ``target``. The only difference from
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you. Possibly you are
  lucky and someone has already written a parser you can use. Please
  ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_
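
Coming back to ``chmod``: if you prefer not to write octal numbers,
the mode can also be assembled from the constants in Python's ``stat``
module. This is only a sketch. The variable names are made up for the
example, and the constants are the ones documented for the standard
library.

```python
import stat

# rwx for the owner, r-x for the group and others; same as octal 755
directory_mode = (stat.S_IRWXU |
                  stat.S_IRGRP | stat.S_IXGRP |
                  stat.S_IROTH | stat.S_IXOTH)

# rw- for the owner, r-- for the group and others; same as octal 644
file_mode = (stat.S_IRUSR | stat.S_IWUSR |
             stat.S_IRGRP |
             stat.S_IROTH)
```

The resulting integers can be passed as the ``mode`` argument, as in
``host.chmod("some_directory", directory_mode)``.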


File-like objects
-----------------

Construction
~~~~~~~~~~~~

Basics
``````

``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.

- ``FTPHost.file(path, mode='r')``

  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the
  ``getcwd`` method). As with local file objects the default mode is
  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
  "wb".

- ``FTPHost.open(path, mode='r')``

  is an alias for ``file`` (see above).

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_ with the ``FTPFile``
objects returned by ``FTPHost.file``::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")

At the end of the ``with`` block, the file will be closed
automatically.

If something goes wrong during the construction of the file or in the
body of the ``with`` statement, the file will be closed as well.
Exceptions will be propagated as with ``try ... finally``.
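
For code that also has to run on Python versions before 2.5, the same
guarantee can be written out by hand. The following is a minimal
sketch; the helper name ``write_remote_text`` is an assumption made
for this example.

```python
def write_remote_text(host, path, text):
    """
    Write `text` to the remote file `path` and close the file even
    if writing fails.
    """
    # if opening the file fails, there's nothing to close
    remote_file = host.file(path, "w")
    try:
        remote_file.write(text)
    finally:
        # runs for normal completion and for exceptions alike; an
        #  exception is propagated afterwards
        remote_file.close()
```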

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods

::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is
supported as well, i. e. you can use a loop to read a file line by
line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

This feature obsoletes the ``xreadlines`` method, which is deprecated
and will be removed in ``ftputil`` version 2.5.

For more on file objects, see the section `File objects`_ in the
Python Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.


Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in a custom parser and activate it with
a single method call.

For this, you need to write a parser class by inheriting from the
class ``Parser`` in the ``ftp_stat`` module. Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like

::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in
the following table.

===== ========== ============ =============== =======================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== =======================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== =======================

If you can't extract all the desirable data from a line (for
example, the MS format doesn't contain any information about the
owner of a file), set the corresponding values in the ``StatResult``
instance to ``None``.
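
To make the table more concrete, here is a sketch of the core of a
``parse_line`` method, written as a plain function so that it runs
without ``ftputil``. Everything specific in it is an assumption for
illustration only: the function name ``parse_xyz_line`` and the
made-up line format ``<type>;<size>;<YYYY-MM-DD HH:MM>;<name>``. In a
real parser, the tuple would be passed to ``StatResult`` and failures
would raise ``ftp_error.ParserError`` instead of ``ValueError``.

```python
import stat
import time

def parse_xyz_line(line):
    """
    Parse a line like "d;4096;2009-02-15 10:30;pub" (a hypothetical
    format) and return a tuple `(stat_tuple, name)`.
    """
    try:
        kind, size, timestamp, name = line.rstrip("\r\n").split(";", 3)
        st_size = int(size)
        parsed_time = time.strptime(timestamp, "%Y-%m-%d %H:%M")
    except ValueError:
        # a real parser would raise `ftp_error.ParserError` here
        raise ValueError("can't parse line %r" % line)
    if kind == "d":
        st_mode = stat.S_IFDIR
    elif kind == "f":
        st_mode = stat.S_IFREG
    else:
        raise ValueError("unknown entry type %r" % kind)
    # interpret the timestamp as local time
    st_mtime = time.mktime(parsed_time)
    # same order as in the table; data that isn't in the line is `None`
    stat_tuple = (st_mode, None, None, None, None,
                  None, st_size, None, st_mtime, None)
    return stat_tuple, name
```

The returned ``name`` corresponds to the ``_st_name`` attribute that
is set on the ``StatResult`` object after construction.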

Parser classes can use several helper methods which are defined in
the class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
  an appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number like the one returned by ``time.mktime``.
  Note that the method expects the timestamp string already split at
  whitespace.

Additionally, there's an attribute ``_month_numbers`` which maps
lowercase three-letter month abbreviations to integers.

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help to do it.


FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_.

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

Though you can *technically* post without subscribing first, I can't
recommend that: The mails from non-subscribers have to be approved by
me, and because the arriving mails contain *lots* of spam, I rarely go
through this bunch of mails.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you already tried the `latest
version`_ of ``ftputil``. The bug might already have been fixed there.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you
are unsure if the behaviour you found is a bug or not, you should write
to the `ftputil mailing list`_. In *either* case you *must not*
include confidential information (user id, password, file names, etc.)
in the problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. On the other hand,
you can use M2Crypto_ (in the source code archive, look for the
file ``M2Crypto/ftpslib.py``) which has a class derived from
``ftplib.FTP`` that supports SSL. You then can use a class
(not an object of it) similar to the following as a "session
factory" in ``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        # `port` defaults to 21, the standard FTP control port
        def __init__(self, host, userid, password, port=21):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            self.connect(host, port)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        # `port` defaults to 21, the standard FTP control port
        def __init__(self, host, userid, password, port=21):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on `time zone correction`_.
It may even be sufficient to call `synchronize_times`_.

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which tries to convert
anything that looks like a line ending and thus corrupts binary files.
Pass "b" as the ``mode`` argument (see `Uploading and downloading
files`_).
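
To see why a text-mode transfer damages binary data, consider what a
line ending conversion does to the first eight bytes of a PNG file.
The function below is only a stand-in for such a conversion, not
``ftputil``'s actual code, and its name is made up for this example.

```python
def convert_line_endings(data):
    """Replace CR LF pairs with LF, as a text-mode transfer might."""
    return data.replace(b"\r\n", b"\n")

# the signature at the start of every PNG file
png_signature = b"\x89PNG\r\n\x1a\n"
converted = convert_line_endings(png_signature)

# the converted data is one byte shorter and no longer a valid
#  PNG signature
is_corrupt = converted != png_signature
```

The same happens to every byte pair in a binary file that merely
*looks* like a line ending, which is why "b" mode is needed for
anything but text.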

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_

``isdir``, ``isfile`` or ``islink`` incorrectly return ``False``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like Python's counterparts under `os.path`_, ``ftputil``'s methods
return ``False`` if they can't find the given path.

Probably you used ``listdir`` on a directory and called ``is...()`` on
the returned names. But if the argument for ``listdir`` wasn't the
current directory, the paths won't be found and so all ``is...()``
variants will return ``False``.
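
The usual remedy is the same as for local directories: join the
directory with each name before testing it. The following sketch
shows this for ``isdir``; the helper name ``subdirectories`` is an
assumption for the example. Since it uses only ``listdir``,
``path.join`` and ``path.isdir``, it should behave the same way for
an ``FTPHost`` object and for Python's ``os`` module.

```python
def subdirectories(host, directory):
    """
    Return the names of the entries in `directory` which are
    themselves directories.
    """
    names = host.listdir(directory)
    # `listdir` returns bare names, so `isdir(name)` alone would look
    #  in the current directory, not in `directory`
    return [name for name in names
            if host.path.isdir(host.path.join(directory, name))]
```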

I can't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to its implementation, ``lstat`` cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads. If in doubt whether your approach will work, ask on the
  mailing list.

- ``FTPFile`` objects in text mode *may not* support charsets with
  more than one byte per character. Please e-mail your experiences to
  the mailing list (see above), if you work with multibyte text
  streams in FTP sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.


Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation in
`reStructuredText`_ and in HTML format is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil``, i. e. *don't* modify it, you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net>, in part based on suggestions
from users.

The ``lrucache`` module is written by Evan Prodromou
<evan@prodromou.name>.

Feedback is appreciated. :-)