1``ftputil`` -- a high-level FTP client library
2==============================================
3
4:Version:   3.2
5:Date:      2014-10-12
6:Summary:   high-level FTP client library for Python
7:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
8:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
9
10.. contents::
11
12
13Introduction
14------------
15
16The ``ftputil`` module is a high-level interface to the ftplib_
17module. The `FTPHost objects`_ generated from it allow many operations
18similar to those of os_, `os.path`_ and `shutil`_.
19
20.. _ftplib: https://docs.python.org/library/ftplib.html
21.. _os: https://docs.python.org/library/os.html
22.. _`os.stat`: https://docs.python.org/library/os.html#os.stat
23.. _`os.path`: https://docs.python.org/library/os.path.html
24.. _`shutil`: https://docs.python.org/library/shutil.html
25
Example::

    import ftputil

    # Download some files from the login directory.
    with ftputil.FTPHost("ftp.domain.com", "user", "password") as ftp_host:
        names = ftp_host.listdir(ftp_host.curdir)
        for name in names:
            if ftp_host.path.isfile(name):
                ftp_host.download(name, name)  # remote, local
        # Make a new directory and copy a remote file into it.
        ftp_host.mkdir("newdir")
        with ftp_host.open("index.html", "rb") as source:
            with ftp_host.open("newdir/index.html", "wb") as target:
                ftp_host.copyfileobj(source, target)  # similar to shutil.copyfileobj
41
42Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
43modification time of a file. The latter can also follow links, similar
44to `os.stat`_. `FTPHost.walk`_ and `FTPHost.path.walk`_ work, too.
45
46
47``ftputil`` features
48--------------------
49
50* Method names are familiar from Python's ``os``, ``os.path`` and
51  ``shutil`` modules. For example, use ``os.path.join`` to join
52  paths for a local file system and ``ftp_host.path.join`` to join
53  paths for a remote FTP file system.
54
55* Remote file system navigation (``getcwd``, ``chdir``)
56
57* Upload and download files (``upload``, ``upload_if_newer``,
58  ``download``, ``download_if_newer``)
59
60* Time zone synchronization between client and server (needed
61  for ``upload_if_newer`` and ``download_if_newer``)
62
63* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
64  ``rmtree``) and remove files (``remove``)
65
66* Get information about directories, files and links (``listdir``,
67  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
68  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)
69
70* Iterate over remote file systems (``walk``)
71
72* Local caching of results from ``lstat`` and ``stat`` calls to reduce
73  network access (also applies to ``exists``, ``getmtime`` etc.).
74
* Read files from and write files to remote hosts via file-like
  objects (``FTPHost.open``). The generated file-like objects have
  the familiar methods ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines`` and ``close``. You can also iterate over
  these files line by line in a ``for`` loop.
80
81
82Exception hierarchy
83-------------------
84
85The exceptions are in the namespace of the ``ftputil.error`` module, e. g.
86``ftputil.error.TemporaryError``.
87
The exception classes are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
                CommandNotImplementedError(PermanentError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)
101
102and are described here:
103
104- ``FTPError``
105
106  is the root of the exception hierarchy of the module.
107
108- ``FTPOSError``
109
  is derived from ``OSError``. This is for similarity between the
  ``os`` module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir("nonexisting_directory")
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost("host", "user", "password")
    try:
        host.chdir("nonexisting_directory")
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSError``
  exceptions. If you change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the
  function as

  ::

    host = ftputil.FTPHost("host", "user", "password")
    func(path, file, os=host)

  to use the same code for both a local and remote file system.
  Another similarity between ``OSError`` and ``FTPOSError`` is that
  the latter holds the FTP server return code in the ``errno``
  attribute of the exception object and the error text in
  ``strerror``.
157
158- ``PermanentError``
159
160  is raised for 5xx return codes from the FTP server. This
161  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
162  ``ftplib.error_perm`` are *not* identical).
163
164- ``CommandNotImplementedError``
165
166  indicates that an underlying command the code tries to use is not
167  implemented. For an example, see the description of the
168  `FTPHost.chmod`_ method.
169
170- ``TemporaryError``
171
172  is raised for FTP return codes from the 4xx category. This
173  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
174  ``ftplib.error_temp`` are *not* identical).
175
176- ``FTPIOError``
177
178  denotes an I/O error on the remote host. This appears
179  mainly with file-like objects that are retrieved by calling
180  ``FTPHost.open``. Compare
181
  ::

    >>> try:
    ...     f = open("not_there")
    ... except IOError as obj:
    ...     print(obj.errno)
    ...     print(obj.strerror)
    ...
    2
    No such file or directory
192
193  with
194
  ::

    >>> ftp_host = ftputil.FTPHost("host", "user", "password")
    >>> try:
    ...     f = ftp_host.open("not_there")
    ... except IOError as obj:
    ...     print(obj.errno)
    ...     print(obj.strerror)
    ...
    550
    550 not_there: No such file or directory.
206
207  As you can see, both code snippets are similar. However, the error
208  codes aren't the same.
209
210- ``InternalError``
211
212  subsumes exception classes for signaling errors due to limitations
213  of the FTP protocol or the concrete implementation of ``ftputil``.
214
215- ``InaccessibleLoginDirError``
216
217  This exception is raised if the directory in which "you" are placed
218  upon login is not accessible, i. e. a ``chdir`` call with the
219  directory as argument would fail.
220
221- ``ParserError``
222
223  is used for errors during the parsing of directory
224  listings from the server. This exception is used by the ``FTPHost``
225  methods ``stat``, ``lstat``, and ``listdir``.
226
227- ``RootDirError``
228
229  Because of the implementation of the ``lstat`` method it is not
230  possible to do a ``stat`` call on the root directory ``/``.
231  If you know *any* way to do it, please let me know. :-)
232
233  This problem does *not* affect stat calls on items *in* the root
234  directory.
235
236- ``TimeShiftError``
237
238  is used to denote errors which relate to setting the `time shift`_,
239  *for example* trying to set a value which is no multiple of a full
240  hour.
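
Because ``FTPOSError`` is derived from ``OSError``, a function written
against the ``os`` module can often be reused for a remote host without
changing its error handling. The following is a minimal sketch of that
idea. ``list_names`` is a hypothetical helper, not part of ``ftputil``,
and it relies only on ``listdir``, which both the ``os`` module and
``FTPHost`` objects provide.

```python
import os


def list_names(path, os=os):
    """Return the sorted entry names in `path`, or an empty list on error.

    `os` may be the `os` module or any object with a compatible
    `listdir` method, for example an `ftputil.FTPHost` instance.
    """
    try:
        return sorted(os.listdir(path))
    except OSError:
        # For an `FTPHost` object, this clause also catches
        # `ftputil.error.FTPOSError` (including `PermanentError` and
        # `TemporaryError`) because those classes derive from `OSError`.
        return []
```

Called as ``list_names(path)``, the helper works on the local file
system. Called as ``list_names(path, os=ftp_host)``, the same code
should work on the remote one.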
241
242
243Directory and file names
244------------------------
245
246.. note::
247
248   Keep in mind that this section only applies to directory and file
249   *names*, not file *contents*. Encoding and decoding for file
250   contents is handled by the ``encoding`` argument for
251   `FTPHost.open`_.
252
253First off: If your directory and file names (both as
254arguments and on the server) contain only ISO 8859-1 (latin-1)
255characters, you can use such names in the form of byte strings or
256unicode strings. However, you can't mix different string types (bytes
257and unicode) in one call (for example in ``FTPHost.path.join``).
258
259If you have directory or file names with characters that aren't in
260latin-1, it's recommended to use byte strings. In that case,
261returned paths will be byte strings, too.
262
263Read on for details.
264
265.. note::
266
267   The approach described below may look awkward and in a way it is.
268   The intention of ``ftputil`` is to behave like the local file
269   system APIs of Python 3 as far as it makes sense. Moreover, the
270   taken approach makes sure that directory and file names that were
271   used with Python 3's native ``ftplib`` module will be compatible
272   with ``ftputil`` and vice versa. Otherwise you may be able to use a
273   file name with ``ftputil``, but get an exception when trying to
274   read the same file with Python 3's ``ftplib`` module.
275
276Methods that take names of directories and/or files can take either
277byte or unicode strings. If a method got a string argument and returns
278one or more strings, these strings will have the same string type as
279the argument(s). Mixing different string arguments in one call (for
280example in ``FTPHost.path.join``) isn't allowed and will cause a
281``TypeError``. These rules are the same as for local file system
282operations in Python 3. Since ``ftputil`` uses the same API for Python
2832, ``ftputil`` will do the same when run on Python 2.
284
285Byte strings for directory and file names will be sent to the server
286as-is. On the other hand, unicode strings will be encoded to byte
287strings, assuming latin-1 encoding. This implies that such unicode
288strings must only contain code points 0-255 for the latin-1 character
289set. Using any other characters will result in a
290``UnicodeEncodeError`` exception.
291
292If you have directory or file names as unicode strings with non-latin-1
293characters, encode the unicode strings to byte strings yourself, using
294the encoding you know the server uses. Decode received paths with the
295same encoding. Encapsulate these conversions as far as you can.
296Otherwise, you'd have to adapt potentially a lot of code if the server
297encoding changes.
298
If you *don't* know the encoding on the server side, it's probably
best to use only byte strings for directory and file names. That
said, as soon as you *show* the names to a user, you -- or the
library you use for displaying the names -- have to guess an
encoding.
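
The encoding rules above can be checked without a server. The sketch
below uses only built-in string methods. The helper names
(``fits_latin1``, ``encode_name``, ``decode_name``) are hypothetical
and not part of ``ftputil``, and the choice of UTF-8 in the usage note
is only an assumption about what a particular server might use.

```python
def fits_latin1(name):
    """Return `True` if the unicode string `name` can be encoded as latin-1."""
    try:
        name.encode("latin-1")
    except UnicodeEncodeError:
        # At least one code point is outside the range 0-255.
        return False
    return True


def encode_name(name, server_encoding):
    """Encode a unicode directory or file name to a byte string."""
    return name.encode(server_encoding)


def decode_name(raw_name, server_encoding):
    """Decode a byte string received from the server to a unicode string."""
    return raw_name.decode(server_encoding)
```

For a name where ``fits_latin1`` returns ``False``, you would pass
``encode_name(name, "utf-8")`` (or whatever encoding the server really
uses) to the ``FTPHost`` method and apply ``decode_name`` to the byte
strings that come back. Keeping both conversions in one place limits
the changes needed if the server encoding turns out to be different.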
304
305
306``FTPHost`` objects
307-------------------
308
309.. _`FTPHost construction`:
310
311Construction
312~~~~~~~~~~~~
313
314Basics
315``````
316
``FTPHost`` instances can be generated with the following call::

    ftp_host = ftputil.FTPHost(server, user, password, account,
                               session_factory=ftplib.FTP)
321
322The first four parameters are strings with the same meaning as for the
323FTP class in the ``ftplib`` module.
324
``FTPHost`` objects can also be used in a ``with`` statement::

    import ftputil

    with ftputil.FTPHost(server, user, password) as ftp_host:
        print(ftp_host.listdir(ftp_host.curdir))
331
332After the ``with`` block, the ``FTPHost`` instance and the
333associated FTP sessions will be closed automatically.
334
335If something goes wrong during the ``FTPHost`` construction or in the
336body of the ``with`` statement, the instance is closed as well.
337Exceptions will be propagated (as with ``try ... finally``).
338
339Session factories
340`````````````````
341
342The keyword argument ``session_factory`` may be used to generate FTP
343connections with other factories than the default ``ftplib.FTP``. For
344example, the standard library of Python 2.7 contains a class
345``ftplib.FTP_TLS`` which extends ``ftplib.FTP`` to use an encrypted
346connection.
347
348In fact, all positional and keyword arguments other than
349``session_factory`` are passed to the factory to generate a new
350background session. This also happens for every remote file that is
351opened; see below.
352
This functionality of the constructor also allows you to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default, but ``ftplib.FTP`` only offers this by means of its
``connect`` method, not via its constructor. One solution is to use a
custom class as a session factory::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):

        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to another port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # Don't pass an _instance_ `MySession()` as the factory;
    # pass the class itself.
    with ftputil.FTPHost(host, userid, password, port=EXAMPLE_PORT,
                         session_factory=MySession) as ftp_host:
        # Use `ftp_host` as usual.
        ...
381
382On login, the format of the directory listings (needed for stat'ing
383files and directories) should be determined automatically. If not,
384please `file a bug report`_.
385
386.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes
387
For the most common uses you don't need to create your own session
factory class though. The ``ftputil.session`` module has a function
``session_factory`` that can create session factories for a variety
of parameters::

    session_factory(base_class=ftplib.FTP,
                    port=21,
                    use_passive_mode=None,
                    encrypt_data_channel=True,
                    debug_level=None)
398
399with
400
401- ``base_class`` is a base class to inherit a new session factory
402  class from. By default, this is ``ftplib.FTP`` from the Python
403  standard library.
404
405- ``port`` is the command channel port. The default is 21, used in most
406  FTP server configurations.
407
408- ``use_passive_mode`` is either a boolean that determines whether
409  passive mode should be used or ``None``. ``None`` means to let the
410  base class choose active or passive mode.
411
- ``encrypt_data_channel`` defines whether to encrypt the data channel
  for secure connections. This is only supported for the base classes
  ``ftplib.FTP_TLS`` and ``M2Crypto.ftpslib.FTP_TLS``; otherwise the
  parameter is ignored.
416
417- ``debug_level`` sets the debug level for FTP session instances. The
418  semantics is defined by the base class. For example, a debug level
419  of 2 causes the most verbose output for Python's ``ftplib.FTP``
420  class.
421
All of these parameters can be combined. For example, you could use

::

    import ftplib

    import ftputil
    import ftputil.session


    my_session_factory = ftputil.session.session_factory(
                           base_class=ftplib.FTP_TLS,
                           port=31,
                           encrypt_data_channel=True,
                           debug_level=2)

    with ftputil.FTPHost(server, user, password,
                         session_factory=my_session_factory) as ftp_host:
        ...
441
442to create and use a session factory derived from ``ftplib.FTP_TLS``
443that connects on command channel 31, will encrypt the data channel and
444print output for debug level 2.
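
To make the idea of a generated session factory more concrete, here is
a rough sketch of how such a function *could* be written by hand with
nothing but ``ftplib``. It is not the code of
``ftputil.session.session_factory``. ``make_session_factory`` is a
hypothetical name, and the sketch leaves out data channel encryption
and the ``M2Crypto`` workaround.

```python
import ftplib


def make_session_factory(base_class=ftplib.FTP, port=21,
                         use_passive_mode=None, debug_level=None):
    """Return a session class that remembers the given settings.

    Hypothetical helper for illustration; not part of ftputil.
    """

    class Session(base_class):

        def __init__(self, host, user, password):
            # Don't pass `host` to the base class constructor because
            # that would immediately connect to the default port.
            base_class.__init__(self)
            if debug_level is not None:
                self.set_debuglevel(debug_level)
            self.connect(host, port)
            self.login(user, password)
            if use_passive_mode is not None:
                self.set_pasv(use_passive_mode)

    # Return the class itself, not an instance.
    return Session
```

The returned class takes only the arguments that ``FTPHost`` forwards
(here host, user and password), so it could be passed as the
``session_factory`` argument in the same way as ``MySession`` above.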
445
446Note: Generally, you can achieve everything you can do with
447``ftputil.session.session_factory`` with an explicit session factory
448as described at the start of this section. However, the class
449``M2Crypto.ftpslib.FTP_TLS`` has a limitation so that you can't use
450it with ftputil out of the box. The function ``session_factory``
451contains a workaround for this limitation. For details refer to `this
452bug report`_.
453
454.. _`this bug report`: http://ftputil.sschwarzer.net/trac/ticket/78
455
456Hidden files and directories
457~~~~~~~~~~~~~~~~~~~~~~~~~~~~
458
459Whether ftputil sees "hidden" files and directories (usually files or
460directories whose names start with a dot) depends on the FTP server
461configuration. By default, ftputil uses the ``-a`` option in the FTP
462``LIST`` command to find hidden files. However, the server may ignore
463this.
464
If using the ``-a`` option leads to problems, for example if an
FTP server causes an exception, you may switch off the use of the
option::

    ftp_host = ftputil.FTPHost(server, user, password, account,
                               session_factory=ftplib.FTP)
    ftp_host.use_list_a_option = False
472
473``FTPHost`` attributes and methods
474~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
475
476Attributes
477``````````
478
479- ``curdir``, ``pardir``, ``sep``
480
481  are strings which denote the current and the parent directory on the
482  remote server. ``sep`` holds the path separator. Though `RFC 959`_
483  (File Transfer Protocol) notes that these values may depend on the
484  FTP server implementation, the Unix variants seem to work well in
485  practice, even for non-Unix servers.
486
487  Nevertheless, it's recommended that you don't hardcode these values
488  for remote paths, but use `FTPHost.path`_ as you would use
489  ``os.path`` to write platform-independent Python code for local
490  filesystems. Keep in mind that most, *but not all*, arguments of
491  ``FTPHost`` methods refer to remote directories or files. For
492  example, in `FTPHost.upload`_, the first argument is a local
493  path and the second a remote path. Both of these should use their
494  respective path separators.
495
496.. _`FTPHost.upload`: `Uploading and downloading files`_
497
498Remote file system navigation
499`````````````````````````````
500
501- ``getcwd()``
502
503  returns the absolute current directory on the remote host. This
504  method works like ``os.getcwd``.
505
506- ``chdir(directory)``
507
508  sets the current directory on the FTP server. This resembles
509  ``os.chdir``, as you may have expected.
510
511.. _`callback function`:
512
513Uploading and downloading files
514```````````````````````````````
515
516- ``upload(source, target, callback=None)``
517
  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).

  The file content is always transferred in binary mode.

  The callback, if given, will be invoked for each transferred chunk
  of data::

    callback(chunk)

  where ``chunk`` is a bytestring. An example usage of a callback
  method is to display a progress indicator.
532
533- ``download(source, target, callback=None)``
534
535  performs a download from the remote source file to a local target
536  file. Both ``source`` and ``target`` are strings. See the
537  description of ``upload`` for more details.
538
539.. _`upload_if_newer`:
540
541- ``upload_if_newer(source, target, callback=None)``
542
543  is similar to the ``upload`` method. The only difference is that the
544  upload is only invoked if the time of the last modification for the
545  source file is more recent than that of the target file or the
546  target doesn't exist at all. The check for the last modification
547  time considers the precision of the timestamps and transfers a file
548  "if in doubt". Consequently the code
549
  ::

    ftp_host.upload_if_newer("source_file", "target_file")
    time.sleep(10)
    ftp_host.upload_if_newer("source_file", "target_file")
555
556  might upload the file again if the timestamp of the target file is
557  precise up to a minute, which is typically the case because the
558  remote datetime is determined by parsing a directory listing from
559  the server. To avoid unnecessary transfers, wait at least a minute
560  between calls of ``upload_if_newer`` for the same file. If it still
561  seems that a file is uploaded unnecessarily (or not when it should),
562  read the subsection on `time shift`_ settings.
563
  If an upload actually happened, the return value of
  ``upload_if_newer`` is ``True``, else ``False``.
566
567  Note that the method only checks the existence and/or the
568  modification time of the source and target file; it doesn't
569  compare any other file properties, say, the file size.
570
571  This also means that if a transfer is interrupted, the remote file
572  will have a newer modification time than the local file, and thus
573  the transfer won't be repeated if ``upload_if_newer`` is used a
574  second time. There are at least two possibilities after a failed
575  upload:
576
577  - use ``upload`` instead of ``upload_if_newer``, or
578
579  - remove the incomplete target file with ``FTPHost.remove``, then
580    use ``upload`` or ``upload_if_newer`` to transfer it again.
581
582.. _`download_if_newer`:
583
584- ``download_if_newer(source, target, callback=None)``
585
  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more information. If a download actually
  happened, the return value is ``True``, else ``False``.
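
The ``callback`` argument of the transfer methods above accepts any
callable that takes a single bytestring. As a small illustration, the
following sketch counts the transferred chunks and bytes.
``ProgressCounter`` is a hypothetical class, not part of ``ftputil``.

```python
class ProgressCounter(object):
    """Count transferred chunks and bytes; usable as a `callback` argument."""

    def __init__(self):
        self.chunks = 0
        self.transferred = 0

    def __call__(self, chunk):
        # `chunk` is the bytestring that was just transferred.
        self.chunks += 1
        self.transferred += len(chunk)
```

An instance could be passed as
``ftp_host.upload(source, target, callback=progress)`` and inspected
after the call returns. A real progress indicator would print or update
a display inside ``__call__`` instead of only counting.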
590
591.. _`time shift`:
592.. _`time zone correction`:
593
594Time zone correction
595````````````````````
596
597If the client where ``ftputil`` runs and the server have a different
598understanding of their local times, this has to be taken into account
599for ``upload_if_newer`` and ``download_if_newer`` to work correctly.
600
601Note that even if the client and the server are in the same time zone
602(or even on the same computer), the time shift value (see below) may
603be different from zero. For example, my computer is set to use local
604time whereas the server running on the very same host insists on using
605UTC time.
606
607.. _`set_time_shift`:
608
609- ``set_time_shift(time_shift)``
610
611  sets the so-called time shift value, measured in seconds. The time
612  shift is the difference between the local time of the server and the
613  local time of the client at a given moment, i. e. by definition
614
  ::

    time_shift = server_time - client_time
618
619  Setting this value is important for `upload_if_newer`_ and
620  `download_if_newer`_ to work correctly even if the time zone of the
621  FTP server differs from that of the client. Note that the time shift
622  value *can be negative*.
623
624  If the time shift value is invalid, e. g. no multiple of a full hour
625  or its absolute value larger than 24 hours, a ``TimeShiftError`` is
626  raised.
627
628  See also `synchronize_times`_ for a way to set the time shift with a
629  simple method call.
630
631- ``time_shift()``
632
633  returns the currently-set time shift value. See ``set_time_shift``
634  above for its definition.
635
636.. _`synchronize_times`:
637
638- ``synchronize_times()``
639
640  synchronizes the local times of the server and the client, so that
641  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
642  if the client and the server use different time zones. For this
643  to work, *all* of the following conditions must be true:
644
645  - The connection between server and client is established.
646
647  - The client has write access to the directory that is current when
648    ``synchronize_times`` is called.
649
650  If you can't fulfill these conditions, you can nevertheless set the
651  time shift value explicitly with `set_time_shift`_. Trying to call
652  ``synchronize_times`` if the above conditions aren't met results in
653  a ``TimeShiftError`` exception.
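
If ``synchronize_times`` can't be used, the time shift has to be
worked out by hand. The sketch below applies the definition
``time_shift = server_time - client_time`` and rounds the result to a
full hour, since other values are rejected. ``rounded_time_shift`` is a
hypothetical helper, not part of ``ftputil``, and raising
``ValueError`` for out-of-range values is a choice made for this
example only.

```python
def rounded_time_shift(server_time, client_time):
    """Return `server_time - client_time`, rounded to a full hour.

    Both arguments and the result are in seconds. A `ValueError` is
    raised if the rounded shift is larger than 24 hours.
    """
    hour = 3600
    raw_shift = server_time - client_time
    # Small deviations, for example from clock drift or network
    # delay, are removed by rounding to the nearest full hour.
    shift = int(round(raw_shift / float(hour))) * hour
    if abs(shift) > 24 * hour:
        raise ValueError("time shift of %d s exceeds 24 hours" % shift)
    return shift
```

The result can be negative, for instance if the server clock is behind
the client clock. It could then be passed to ``set_time_shift``.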
654
655Creating and removing directories
656`````````````````````````````````
657
658- ``mkdir(path, [mode])``
659
660  makes the given directory on the remote host. This does *not*
661  construct "intermediate" directories that don't already exist. The
662  ``mode`` parameter is ignored; this is for compatibility with
663  ``os.mkdir`` if an ``FTPHost`` object is passed into a function
664  instead of the ``os`` module. See the explanation in the subsection
665  `Exception hierarchy`_.
666
667- ``makedirs(path, [mode])``
668
  works similarly to ``mkdir`` (see above), but also makes intermediate
  directories like ``os.makedirs``. The ``mode`` parameter is only
  there for compatibility with ``os.makedirs`` and is ignored.
672
673- ``rmdir(path)``
674
  removes the given remote directory. If the directory isn't empty, a
  ``PermanentError`` is raised.
677
678- ``rmtree(path, ignore_errors=False, onerror=None)``
679
680  removes the given remote, possibly non-empty, directory tree.
681  The interface of this method is rather complex, in favor of
682  compatibility with ``shutil.rmtree``.
683
684  If ``ignore_errors`` is set to a true value, errors are ignored.
685  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
686  set, all exceptions occurring during the tree iteration and
687  processing are raised. These exceptions are all of type
688  ``PermanentError``.
689
  To distinguish between different kinds of errors, pass in a callable
  for ``onerror``. This callable must accept three arguments:
  ``func``, ``path`` and ``exc_info``. ``func`` is a bound method
  object, *for example* ``your_host_object.listdir``. ``path`` is the
  path that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.
697
698  The code of ``rmtree`` is taken from Python's ``shutil`` module
699  and adapted for ``ftputil``.
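
As an illustration of the ``onerror`` interface of ``rmtree``, the
following sketch records each failure instead of raising it.
``ErrorCollector`` is a hypothetical class, not part of ``ftputil``. It
only assumes the three arguments described above.

```python
class ErrorCollector(object):
    """Collect errors reported through an `onerror` callable."""

    def __init__(self):
        self.errors = []

    def __call__(self, func, path, exc_info):
        # `exc_info` has the same layout as the result of
        # `sys.exc_info()`: (type, value, traceback).
        _exc_type, exc_value, _traceback = exc_info
        # `func` is a bound method such as `host.listdir`; its name
        # tells which operation failed.
        self.errors.append((func.__name__, path, exc_value))
```

After a call like ``ftp_host.rmtree(path, onerror=collector)``, the
``errors`` list would show which operation failed for which path, so
the caller can decide how to react.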
700
701Removing files and links
702````````````````````````
703
704- ``remove(path)``
705
706  removes a file or link on the remote host, similar to ``os.remove``.
707
708- ``unlink(path)``
709
710  is an alias for ``remove``.
711
712Retrieving information about directories, files and links
713`````````````````````````````````````````````````````````
714
715- ``listdir(path)``
716
717  returns a list containing the names of the files and directories
718  in the given path, similar to `os.listdir`_. The special names
719  ``.`` and ``..`` are not in the list.
720
721The methods ``lstat`` and ``stat`` (and some others) rely on the
722directory listing format used by the FTP server. When connecting to a
723host, ``FTPHost``'s constructor tries to guess the right format, which
724succeeds in most cases. However, if you get strange results or
725``ParserError`` exceptions by a mere ``lstat`` call, please `file a
726bug report`_.
727
728If ``lstat`` or ``stat`` give wrong modification dates or times, look
729at the methods that deal with time zone differences (`time zone
730correction`_).
731
732.. _`FTPHost.lstat`:
733
734- ``lstat(path)``
735
736  returns an object similar to that from `os.lstat`_. This is a
737  "tuple" with additional attributes; see the documentation of the
738  ``os`` module for details.
739
740  The result is derived by parsing the output of a ``LIST`` command on
741  the server. Therefore, the result from ``FTPHost.lstat`` can not
742  contain more information than the received text. In particular:
743
744  - User and group ids can only be determined as strings, not as
745    numbers, and that only if the server supplies them. This is
746    usually the case with Unix servers but maybe not for other FTP
747    server programs.
748
749  - Values for the time of the last modification may be rough,
750    depending on the information from the server. For timestamps
751    older than a year, this usually means that the precision of the
752    modification timestamp value is not better than days. For newer
753    files, the information may be accurate to a minute.
754
    If the time of the last modification is before the epoch (usually
    1970-01-01 UTC), the time of the last modification is set to 0.0.
757
758  - Links can only be recognized on servers that provide this
759    information in the ``LIST`` output.
760
  - Stat attributes that can't be determined at all are set to
    ``None``. For example, a line of a directory listing may not
    contain the date/time of a directory's last modification.
764
765  - There's a special problem with stat'ing the root directory.
766    (Stat'ing things *in* the root directory is fine though.) In
767    this case, a ``RootDirError`` is raised. This has to do with the
768    algorithm used by ``(l)stat``, and I know of no approach which
769    mends this problem.
770
771  Currently, ``ftputil`` recognizes the common Unix-style and
772  Microsoft/DOS-style directory formats. If you need to parse output
773  from another server type, please write to the `ftputil mailing
774  list`_. You may consider `writing your own parser`_.
775
776.. _`os.listdir`: https://docs.python.org/library/os.html#os.listdir
777.. _`os.lstat`: https://docs.python.org/library/os.html#os.lstat
778.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
779.. _`writing your own parser`: `Writing directory parsers`_
780
781.. _`FTPHost.stat`:
782
783- ``stat(path)``
784
785  returns ``stat`` information also for files which are pointed to by a
786  link. This method follows multiple links until a regular file or
787  directory is found. If an infinite link chain is encountered or the
788  target of the last link in the chain doesn't exist, a
789  ``PermanentError`` is raised.
790
791  The limitations of the ``lstat`` method also apply to ``stat``.
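
Since stat attributes that can't be determined are ``None``, code that
compares modification times should be prepared for missing values. The
following sketch shows one way to do that. ``newest_entry`` is a
hypothetical helper, not part of ``ftputil``. It uses only
``listdir``, ``lstat`` and ``path.join``, so the ``host`` argument can
be the ``os`` module or an ``FTPHost`` object.

```python
import os


def newest_entry(directory, host=os):
    """Return the name in `directory` with the latest modification time.

    Return `None` if no entry has a usable modification time.
    """
    newest_name = None
    newest_mtime = None
    for name in host.listdir(directory):
        stat_result = host.lstat(host.path.join(directory, name))
        mtime = stat_result.st_mtime
        if mtime is None:
            # The directory listing had no timestamp for this entry.
            continue
        if newest_mtime is None or mtime > newest_mtime:
            newest_name, newest_mtime = name, mtime
    return newest_name
```

Keep in mind that remote timestamps may only be precise to a minute or
a day, so entries modified close together may not be ordered reliably.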
792
793.. _`FTPHost.path`:
794
795``FTPHost`` objects contain an attribute named ``path``, similar to
796`os.path`_. The following methods can be applied to the remote host
797with the same semantics as for ``os.path``:
798
::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)
819
820Like Python's counterparts under `os.path`_, ``ftputil``'s ``is...``
821methods return ``False`` if they can't find the path given by their
822argument.
823
824Local caching of file system information
825````````````````````````````````````````
826
Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program *very* slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.
834
835For this reason, ``ftputil`` by default saves the results from
836directory listings locally and reuses those results. This reduces
837network accesses and so speeds up the software a lot. However, since
838data is more rarely fetched from the server, the risk of obsolete data
839also increases. This will be discussed below.
840
841Caching can be controlled -- if necessary at all -- via the
842``stat_cache`` object in an ``FTPHost``'s namespace. For example,
843after calling
844
845::
846
847    ftp_host = ftputil.FTPHost(host, user, password)
848
849the cache can be accessed as ``ftp_host.stat_cache``.
850
851While ``ftputil`` usually manages the cache quite well, there are two
852possible reasons for modifying cache parameters.
853
The first is when the number of possible entries is too low. You may
notice this when you process very large directories and the
program becomes much slower than before. It's common for code to read
857a directory with ``listdir`` and then process the found directories
858and files. This can also happen implicitly by a call to
859``FTPHost.walk``. Since version 2.6 ``ftputil`` automatically
860increases the cache size if directories with more entries than the
861current maximum cache size are to be scanned. Most of the time, this
862works fine.
863
864However, if you need access to stat data for several directories at
865the same time, you may need to increase the cache explicitly. This is
866done by the ``resize`` method::
867
868    ftp_host.stat_cache.resize(20000)
869
870where the argument is the maximum number of ``lstat`` results to store
871(the default is 5000, in versions before 2.6 it was 1000). Note that
872each path on the server, e. g. "/home/schwa/some_dir", corresponds to
873a single cache entry. Methods like ``exists`` or ``getmtime`` all
874derive their results from a previously fetched ``lstat`` result.
875
876The value 5000 above means that the cache will hold *at most* 5000
877entries (unless increased automatically by an explicit or implicit
878``listdir`` call, see above). If more are about to be stored, the
entries which haven't been used for the longest time will be deleted
to make room for newer entries.
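
The eviction rule can be illustrated with a toy model. This is only a
sketch of the behavior described above, not ``ftputil``'s actual cache
implementation; the class name ``LRUCacheSketch`` is made up for this
example::

    import collections


    class LRUCacheSketch(object):
        """Toy model of a size-limited cache with LRU eviction."""

        def __init__(self, max_size):
            self.max_size = max_size
            self._entries = collections.OrderedDict()

        def __setitem__(self, path, stat_result):
            # Re-inserting moves the path to the "most recently used" end.
            self._entries.pop(path, None)
            self._entries[path] = stat_result
            while len(self._entries) > self.max_size:
                # Drop the entry that hasn't been used for the longest time.
                self._entries.popitem(last=False)

        def __getitem__(self, path):
            # Raises `KeyError` if the path isn't in the cache.
            stat_result = self._entries.pop(path)
            # Mark the entry as recently used.
            self._entries[path] = stat_result
            return stat_result

        def __contains__(self, path):
            return path in self._entries


    cache = LRUCacheSketch(max_size=2)
    cache["/a"] = "stat result for a"
    cache["/b"] = "stat result for b"
    cache["/a"]                        # "/a" is now the most recently used
    cache["/c"] = "stat result for c"  # evicts "/b"

After these statements, the model holds entries for ``/a`` and ``/c``
only.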
881
882The second possible reason to change the cache parameters is to avoid
883stale cache data. Caching is so effective because it reduces network
884accesses. This can also be a disadvantage if the file system data on
885the remote server changes after a stat result has been retrieved; the
886client, when looking at the cached stat data, will use obsolete
887information.
888
889There are two potential ways to get such out-of-date stat data. The
890first happens when an ``FTPHost`` instance modifies a file path for
891which it has a cache entry, e. g. by calling ``remove`` or ``rmdir``.
892Such changes are handled transparently; the path will be deleted from
the cache. Changes unknown to the ``FTPHost`` object which inspects
its cache are a different matter. An obvious example is changes made
by programs running on the remote host. However, cache
inconsistencies can also occur if two ``FTPHost`` objects change a
file system simultaneously::
898
    with ftputil.FTPHost(server, user1, password1) as ftp_host1:
        with ftputil.FTPHost(server, user1, password1) as ftp_host2:
            stat_result1 = ftp_host1.stat("some_file")
            stat_result2 = ftp_host2.stat("some_file")
            ftp_host2.remove("some_file")
            # `ftp_host1` will still see the obsolete cache entry!
            print(ftp_host1.stat("some_file"))
            # Will raise an exception since an `FTPHost` object
            # knows of its own changes.
            print(ftp_host2.stat("some_file"))
911
912At first sight, it may appear to be a good idea to have a shared cache
913among several ``FTPHost`` objects. After some thinking, this turns out
914to be very error-prone. For example, it won't help with different
915processes using ``ftputil``. So, if you have to deal with concurrent
916write/read accesses to a server, you have to handle them explicitly.
917
918The most useful tool for this is the ``invalidate`` method. In the
919example above, it could be used like this::
920
    with ftputil.FTPHost(server, user1, password1) as ftp_host1:
        with ftputil.FTPHost(server, user1, password1) as ftp_host2:
            stat_result1 = ftp_host1.stat("some_file")
            stat_result2 = ftp_host2.stat("some_file")
            ftp_host2.remove("some_file")
            # Invalidate using an absolute path.
            absolute_path = ftp_host1.path.abspath(
                              ftp_host1.path.join(ftp_host1.getcwd(), "some_file"))
            ftp_host1.stat_cache.invalidate(absolute_path)
            # Will now raise an exception as it should.
            print(ftp_host1.stat("some_file"))
            # Would raise an exception since an `FTPHost` object
            # knows of its own changes, even without `invalidate`.
            print(ftp_host2.stat("some_file"))
937
938The method ``invalidate`` can be used on any *absolute* path, be it a
939directory, a file or a link.
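
Since ``invalidate`` needs an absolute path, it may be convenient to
wrap the path conversion in a small function. The following is a
sketch; ``invalidate_path`` is a hypothetical helper, not part of
``ftputil``. It uses only the calls shown in the example above::

    def invalidate_path(ftp_host, path):
        """
        Remove the cache entry for `path`, which may be relative to
        the current remote directory. Return the absolute path that
        was invalidated.
        """
        absolute_path = ftp_host.path.abspath(
                          ftp_host.path.join(ftp_host.getcwd(), path))
        ftp_host.stat_cache.invalidate(absolute_path)
        return absolute_path

If ``path`` is already absolute, ``join`` leaves it unchanged, so the
helper can be used for both kinds of paths.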
940
941By default, the cache entries (if not replaced by newer ones) are
942stored for an infinite time. That is, if you start your Python process
943using ``ftputil`` and let it run for three days a stat call may still
944access cache data that old. To avoid this, you can set the ``max_age``
945attribute::
946
947    with ftputil.FTPHost(server, user, password) as ftp_host:
948        ftp_host.stat_cache.max_age = 60 * 60  # = 3600 seconds
949
950This sets the maximum age of entries in the cache to an hour. This
951means any entry older won't be retrieved from the cache but its data
952instead fetched again from the remote host and then again stored for
up to an hour. To reset ``max_age`` to the default of unlimited age,
954i. e. cache entries never expire, use ``None`` as value.
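
The rule for ``max_age`` can be written down as a small function. This
models the behavior described above and is not taken from the
``ftputil`` source; the name ``is_expired`` and its arguments are
assumptions made for this sketch::

    import time


    def is_expired(stored_at, max_age, now=None):
        """
        Return `True` if an entry stored at time `stored_at` (seconds
        since the epoch) is older than `max_age` seconds.
        """
        if max_age is None:
            # Default: cache entries never expire.
            return False
        if now is None:
            now = time.time()
        return (now - stored_at) > max_age

With ``max_age = 60 * 60``, an entry stored 3601 seconds ago counts as
expired, whereas with ``max_age = None`` no entry ever does.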
955
956If you are certain that the cache will be in the way, you can disable
957and later re-enable it completely with ``disable`` and ``enable``::
958
959    with ftputil.FTPHost(server, user, password) as ftp_host:
960        ftp_host.stat_cache.disable()
961        ...
962        ftp_host.stat_cache.enable()
963
964During that time, the cache won't be used; all data will be fetched
965from the network. After enabling the cache again, its entries will be
966the same as when the cache was disabled, that is, entries won't get
967updated with newer data during this period. Note that even when the
968cache is disabled, the file system data in the code can become
969inconsistent::
970
971    with ftputil.FTPHost(server, user, password) as ftp_host:
972        ftp_host.stat_cache.disable()
973        if ftp_host.path.exists("some_file"):
974            mtime = ftp_host.path.getmtime("some_file")
975
976In that case, the file ``some_file`` may have been removed by another
977process between the calls to ``exists`` and ``getmtime``!
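
If the cache should be switched off only for a part of the code, a
context manager can make sure that it's enabled again even if an
exception occurs. This is a sketch; ``stat_cache_disabled`` is a
hypothetical helper built on the ``disable`` and ``enable`` methods
shown above::

    import contextlib


    @contextlib.contextmanager
    def stat_cache_disabled(ftp_host):
        """Disable the stat cache of `ftp_host` inside a `with` block."""
        ftp_host.stat_cache.disable()
        try:
            yield ftp_host
        finally:
            # Re-enable the cache even if the block raised an exception.
            ftp_host.stat_cache.enable()

Note that the helper enables the cache unconditionally at the end of
the block, even if the cache had been disabled before the block was
entered.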
978
979Iteration over directories
980``````````````````````````
981
982.. _`FTPHost.walk`:
983
984- ``walk(top, topdown=True, onerror=None, followlinks=False)``
985
986  iterates over a directory tree, similar to `os.walk`_. Actually,
987  ``FTPHost.walk`` uses the code from Python with just the necessary
988  modifications, so see the linked documentation.
989
990.. _`os.walk`: https://docs.python.org/2/library/os.html#os.walk
991
992.. _`FTPHost.path.walk`:
993
994- ``path.walk(path, func, arg)``
995
996  Similar to ``os.path.walk``, the ``walk`` method in
997  `FTPHost.path`_ can be used, though ``FTPHost.walk`` is probably
998  easier to use.
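
As a short illustration of ``FTPHost.walk``, the following sketch
collects the paths of all files below a directory. The function name
``remote_files`` is made up for this example; the loop has the same
shape as one over ``os.walk``::

    def remote_files(ftp_host, top):
        """Return a sorted list of the paths of all files below `top`."""
        paths = []
        for dirpath, dirnames, filenames in ftp_host.walk(top):
            for filename in filenames:
                paths.append(ftp_host.path.join(dirpath, filename))
        return sorted(paths)

Keep in mind that walking a large tree fetches one directory listing
per directory from the server.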
999
1000Other methods
1001`````````````
1002
1003- ``close()``
1004
1005  closes the connection to the remote host. After this, no more
1006  interaction with the FTP server is possible with this ``FTPHost``
1007  object. Usually you don't need to close an ``FTPHost`` instance
1008  with ``close`` if you set up the instance in a ``with`` statement.
1009
1010- ``rename(source, target)``
1011
1012  renames the source file (or directory) on the FTP server.
1013
1014.. _`FTPHost.chmod`:
1015
1016- ``chmod(path, mode)``
1017
1018  sets the access mode (permission flags) for the given path. The mode
1019  is an integer as returned for the mode by the ``stat`` and ``lstat``
1020  methods. Be careful: Usually, mode values are written as octal
1021  numbers, for example 0755 to make a directory readable and writable
1022  for the owner, but not writable for the group and others. If you
1023  want to use such octal values, rely on Python's support for them::
1024
1025    ftp_host.chmod("some_directory", 0o755)
1026
1027  Not all FTP servers support the ``chmod`` command. In case of
1028  an exception, how do you know if the path doesn't exist or if
1029  the command itself is invalid? If the FTP server complies with
1030  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
1031  command isn't allowed. ``ftputil`` maps this special error
1032  response to a ``CommandNotImplementedError`` which is derived from
1033  ``PermanentError``.
1034
1035  So you need to code like this::
1036
1037    with ftputil.FTPHost(server, user, password) as ftp_host:
1038        try:
1039            ftp_host.chmod("some_file", 0o644)
1040        except ftputil.error.CommandNotImplementedError:
1041            # `chmod` not supported
1042            ...
1043        except ftputil.error.PermanentError:
1044            # Possibly a non-existent file
1045            ...
1046
1047  Because the ``CommandNotImplementedError`` is more specific, you
1048  have to test for it first.
1049
1050.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_
1051
1052- ``copyfileobj(source, target, length=64*1024)``
1053
1054  copies the contents from the file-like object ``source`` to the
1055  file-like object ``target``. The only difference to
1056  ``shutil.copyfileobj`` is the default buffer size. Note that
1057  arbitrary file-like objects can be used as arguments (e. g. local
1058  files, remote FTP files).
1059
1060  However, the interfaces of ``source`` and ``target`` have to match;
1061  the string type read from ``source`` must be an accepted string type
1062  when written to ``target``. For example, if you open ``source`` in
1063  Python 3 as a local text file and ``target`` as a remote file object
1064  in binary mode, the transfer will fail since ``source.read`` gives
1065  unicode strings whereas ``target.write`` only accepts byte strings.
1066
1067  See `File-like objects`_ for the construction and use of remote
1068  file-like objects.
1069
1070.. _`set_parser`:
1071
1072- ``set_parser(parser)``
1073
1074  sets a custom parser for FTP directories. Note that you have to pass
1075  in a parser *instance*, not the class.
1076
  An `extra section`_ shows how to write your own parsers if the
  default parsers in ``ftputil`` don't work for you.
1079
1080.. _`extra section`: `Writing directory parsers`_
1081
1082.. _`keep_alive`:
1083
1084- ``keep_alive()``
1085
1086  attempts to keep the connection to the remote server active in order
1087  to prevent timeouts from happening. This method is primarily
1088  intended to keep the underlying FTP connection of an ``FTPHost``
1089  object alive while a file is uploaded or downloaded. This will
1090  require either an extra thread while the upload or download is in
1091  progress or calling ``keep_alive`` from a `callback function`_.
1092
  The ``keep_alive`` method won't help if the connection has already
  timed out. In this case, an ``ftputil.error.TemporaryError`` is raised.
1095
1096  If you want to use this method, keep in mind that FTP servers define
1097  a timeout for a reason. A timeout prevents running out of server
1098  connections because of clients that never disconnect on their own.
1099
1100  Note that the ``keep_alive`` method does *not* affect the "hidden"
1101  FTP child connections established by ``FTPHost.open`` (see section
1102  `FTPHost instances vs. FTP connections`_ for details). You *can't*
1103  use ``keep_alive`` to avoid a timeout in a stalling transfer like
1104  this::
1105
      import time

      with ftputil.FTPHost(server, userid, password) as ftp_host:
          with ftp_host.open("some_remote_file", "rb") as fobj:
              data = fobj.read(100)
              # _Futile_ attempt to avoid file connection timeout.
              for i in range(15):
1111                  time.sleep(60)
1112                  ftp_host.keep_alive()
1113              # Will raise an `ftputil.error.TemporaryError`.
1114              data += fobj.read()
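
  For the intended use case, keeping the main connection alive during
  an upload or download, ``keep_alive`` can be called from a callback.
  The following is a sketch; ``make_keep_alive_callback`` is a
  hypothetical helper, and it assumes that the callback is invoked
  with each transferred chunk of data, as described in the section on
  callback functions::

      import time


      def make_keep_alive_callback(ftp_host, interval=60.0):
          """
          Return a callback that calls `ftp_host.keep_alive()` at most
          once every `interval` seconds. The callback ignores its
          argument (the chunk of transferred data).
          """
          state = {"last_call": time.time()}

          def callback(chunk):
              now = time.time()
              if now - state["last_call"] >= interval:
                  ftp_host.keep_alive()
                  state["last_call"] = now

          return callback

  The returned function could then be passed as the callback of an
  upload or download. Choose the interval well below the server's
  idle timeout, but not so low that every chunk causes a round trip
  to the server.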
1115
1116
1117.. _`FTPHost.open`:
1118
1119File-like objects
1120-----------------
1121
1122Construction
1123~~~~~~~~~~~~
1124
1125Basics
1126``````
1127
1128``FTPFile`` objects are returned by a call to ``FTPHost.open``;
1129never use the ``FTPFile`` constructor directly.
1130
The API of remote file-like objects is modeled after the API of
the io_ module in Python 3, which has also been backported to Python
2.6 and 2.7.
1134
1135.. _io: http://docs.python.org/library/io.html
1136
1137- ``FTPHost.open(path, mode="r", buffering=None, encoding=None,
1138  errors=None, newline=None)``
1139
1140  returns a file-like object that refers to the path on the remote
1141  host. This path may be absolute or relative to the current directory
1142  on the remote host (this directory can be determined with the
1143  ``getcwd`` method). As with local file objects, the default mode is
1144  "r", i. e. reading text files. Valid modes are "r", "rb", "w", and
1145  "wb".
1146
1147  If a file is opened in binary mode, you *must not* specify an
1148  encoding. On the other hand, if you open a file in text mode, an
1149  encoding is used. By default, this is the return value of
1150  ``locale.getpreferredencoding``, but you can (and probably should)
1151  specify a distinct encoding.
1152
1153  If you open a file in binary mode, the read and write operations use
1154  byte strings (``str`` in Python 2, ``bytes`` in Python 3). That is,
1155  read operations return byte strings and write operations only accept
1156  byte strings.
1157
1158  Similarly, text files always work with unicode strings (``unicode``
1159  in Python 2, ``str`` in Python 3). Here, read operations return
1160  unicode strings and write operations only accept unicode strings.
1161
1162  The arguments ``errors`` and ``newline`` have the same semantics as
1163  in `io.open`_. The argument ``buffering`` currently is ignored.
1164  It's only there for compatibility with the ``io.open`` interface.
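
The strict separation of byte strings and unicode strings can be tried
out locally with the ``io`` module, whose API the remote file objects
follow. The sketch below uses in-memory objects as stand-ins for
remote files and ``shutil.copyfileobj`` as a stand-in for
``FTPHost.copyfileobj``::

    import io
    import shutil

    text_source = io.StringIO(u"some text\n")

    # Mismatch: a text source yields unicode strings, which a binary
    # target rejects.
    try:
        shutil.copyfileobj(text_source, io.BytesIO())
        mismatch_detected = False
    except TypeError:
        mismatch_detected = True

    # Match: encode explicitly, then copy between two binary objects.
    binary_source = io.BytesIO(u"some text\n".encode("utf-8"))
    binary_target = io.BytesIO()
    shutil.copyfileobj(binary_source, binary_target)

The same rule applies to remote files: open both sides in binary mode,
or both in text mode with suitable encodings.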
1165
1166.. _`io.open`: http://docs.python.org/library/io.html#io.open
1167
Note that the semantics of "text mode" differ fundamentally from those
in ftputil 2.8 and earlier. Previously, "text mode" implied converting
1170newline characters to ``\r\n`` when writing remote files and
1171converting newlines to ``\n`` when reading remote files. This is
1172in line with the "text mode" notion of FTP command line clients.
1173Now, "text mode" follows the semantics in Python's ``io`` module.
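
What the ``io`` semantics mean for newlines can be seen with local
in-memory objects. This sketch doesn't involve ``ftputil`` at all; it
only shows the effect of the ``newline`` argument when reading::

    import io

    raw = b"first line\r\nsecond line\n"

    # Default `newline=None`: universal newlines, "\r\n" is read as "\n".
    translated = io.TextIOWrapper(
                   io.BytesIO(raw), encoding="ascii").read()

    # `newline=""`: line endings are passed through unchanged.
    untranslated = io.TextIOWrapper(
                     io.BytesIO(raw), encoding="ascii", newline="").read()

Here, ``translated`` contains only ``\n`` line endings whereas
``untranslated`` still contains the ``\r\n`` from the raw data.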
1174
1175``FTPHost.open`` can also be used in a ``with`` statement::
1176
1177    import ftputil
1178
1179    with ftputil.FTPHost(...) as ftp_host:
1180        ...
1181        with ftp_host.open("new_file", "w", encoding="utf8") as fobj:
1182            fobj.write("This is some text.")
1183
1184At the end of the ``with`` block, the remote file will be closed
1185automatically.
1186
1187If something goes wrong during the construction of the file or in the
1188body of the ``with`` statement, the file will be closed as well.
1189Exceptions will be propagated as with ``try ... finally``.
1190
1191Attributes and methods
1192~~~~~~~~~~~~~~~~~~~~~~
1193
1194The methods
1195
1196::
1197
1198    close()
1199    read([count])
1200    readline([count])
1201    readlines()
1202    write(data)
1203    writelines(string_sequence)
1204
1205and the attribute ``closed`` have the same semantics as for file
1206objects of a local disk file system. The iterator protocol is
1207supported as well, i. e. you can use a loop to read a file line by
1208line::
1209
1210    with ftputil.FTPHost(server, user, password) as ftp_host:
1211        with ftp_host.open("some_file") as input_file:
1212            for line in input_file:
1213                # Do something with the line, e. g.
                print(line.strip().replace("ftplib", "ftputil"))
1215
1216For more on file objects, see the section `File objects`_ in the
1217Python Library Reference.
1218
1219.. _`file objects`: https://docs.python.org/2.7/library/stdtypes.html#file-objects
1220
1221
1222.. _`child_connections`:
1223
1224``FTPHost`` instances vs. FTP connections
1225-----------------------------------------
1226
1227This section explains why keeping an ``FTPHost`` instance "alive"
1228without timing out sometimes isn't trivial. If you always finish your
1229FTP operations in time, you don't need to read this section.
1230
1231The file transfer protocol is a stateful protocol. That means an FTP
1232connection always is in a certain state. Each of these states can only
1233change to certain other states under certain conditions triggered by
1234the client or the server.
1235
1236One of the consequences is that a single FTP connection can't be used
1237at the same time, say, to transfer data on the FTP data channel and to
1238create a directory on the remote host.
1239
1240For example, consider this::
1241
1242    >>> import ftplib
1243    >>> ftp = ftplib.FTP(server, user, password)
1244    >>> ftp.pwd()
1245    '/'
1246    >>> # Start transfer. `CONTENTS` is a text file on the server.
1247    >>> socket = ftp.transfercmd("RETR CONTENTS")
1248    >>> socket
1249    <socket._socketobject object at 0x7f801a6386e0>
1250    >>> ftp.pwd()
1251    Traceback (most recent call last):
1252      File "<stdin>", line 1, in <module>
1253      File "/usr/lib64/python2.7/ftplib.py", line 578, in pwd
1254        return parse257(resp)
1255      File "/usr/lib64/python2.7/ftplib.py", line 842, in parse257
1256        raise error_reply, resp
1257    ftplib.error_reply: 226-File successfully transferred
1258    226 0.000 seconds (measured here), 5.60 Mbytes per second
1259    >>>
1260
1261Note that ``ftp`` is a single FTP connection, represented by an
1262``ftplib.FTP`` instance, not an ``ftputil.FTPHost`` instance.
1263
1264On the other hand, consider this::
1265
1266    >>> import ftputil
1267    >>> ftp_host = ftputil.FTPHost(server, user, password)
    >>> ftp_host.getcwd()
    u'/'
1269    >>> fobj = ftp_host.open("CONTENTS")
1270    >>> fobj
1271    <ftputil.file.FTPFile object at 0x7f8019d3aa50>
1272    >>> ftp_host.getcwd()
1273    u'/'
1274    >>> fobj.readline()
1275    u'Contents of FTP test directory\n'
1276    >>> fobj.close()
1277    >>>
1278
1279To be able to start a file transfer (i. e. open a remote file for
1280reading or writing) and still be able to use other FTP commands,
1281ftputil uses a trick. For every remote file, ftputil creates a new FTP
1282connection, called a child connection in the ftputil source code.
1283(Actually, FTP connections belonging to closed remote files are
1284re-used if they haven't timed out yet.)
1285
1286In most cases this approach isn't noticeable by code using ftputil.
1287However, the nice abstraction of dealing with a single FTP connection
1288falls apart if one of the child connections times out. For example, if
1289you open a remote file and work only with the initial "main"
1290connection to navigate the file system, the FTP connection for the
1291remote file may eventually time out.
1292
While it's often relatively easy to prevent the "main" connection from
timing out, it's unfortunately practically impossible to do this for a
remote file connection (apart from transferring some data, of course).
1296For this reason, `FTPHost.keep_alive`_ affects only the main
1297connection. Child connections may still time out if they're idle for
1298too long.
1299
1300.. _`FTPHost.keep_alive`: `keep_alive`_
1301
1302Some more details:
1303
1304- A kind of "straightforward" way of keeping the main connection alive
1305  would be to call ``ftp_host.getcwd()``. However, this doesn't work
1306  because ftputil caches the current directory and returns it without
1307  actually contacting the server. That's the main reason why there's
1308  a ``keep_alive`` method since it calls ``pwd`` on the FTP connection
1309  (i. e. the session object), which isn't a public attribute.
1310
1311- Some servers define not only an idle timeout but also a transfer
1312  timeout. This means the connection times out unless there's some
1313  transfer on the data channel for this connection. So ftputil's
1314  ``keep_alive`` doesn't prevent this timeout, but an
1315  ``ftp_host.listdir(ftp_host.curdir)`` call should do it. However,
1316  this transfers the data for the whole directory listing which might
1317  take some time if the directory has many entries.
1318
1319Bottom line: If you can, you should organize your FTP actions so that
1320you finish everything before a timeout happens.
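
One way to follow this advice is to keep remote files open only as
long as data is actually transferred. A minimal sketch, assuming the
file is small enough to fit into memory; ``read_remote_file`` is a
hypothetical helper::

    def read_remote_file(ftp_host, path):
        """Return the contents of the remote file `path` as a byte string."""
        # The child connection is only in use inside this `with` block.
        with ftp_host.open(path, "rb") as fobj:
            return fobj.read()

Any time-consuming processing can then work on the returned data
without a remote file being open in the meantime.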
1321
1322
1323Writing directory parsers
1324-------------------------
1325
1326``ftputil`` recognizes the two most widely-used FTP directory formats,
1327Unix and MS style, and adjusts itself automatically. Almost every FTP
1328server uses one of these formats.
1329
1330However, if your server uses a format which is different from the two
1331provided by ``ftputil``, you can plug in a custom parser with a single
1332method call and have ``ftputil`` use this parser.
1333
1334For this, you need to write a parser class by inheriting from the
1335class ``Parser`` in the ``ftputil.stat`` module. Here's an example::
1336
1337    import ftputil.error
1338    import ftputil.stat
1339
1340    class XyzParser(ftputil.stat.Parser):
1341        """
1342        Parse the default format of the FTP server of the XYZ
1343        corporation.
1344        """
1345
1346        def parse_line(self, line, time_shift=0.0):
1347            """
1348            Parse a `line` from the directory listing and return a
1349            corresponding `StatResult` object. If the line can't
1350            be parsed, raise `ftputil.error.ParserError`.
1351
1352            The `time_shift` argument can be used to fine-tune the
1353            parsing of dates and times. See the class
1354            `ftputil.stat.UnixParser` for an example.
1355            """
1356            # Split the `line` argument and examine it further; if
1357            # something goes wrong, raise an `ftputil.error.ParserError`.
1358            ...
1359            # Make a `StatResult` object from the parts above.
1360            stat_result = ftputil.stat.StatResult(...)
1361            # `_st_name`, `_st_target` and `_st_mtime_precision` are optional.
1362            stat_result._st_name = ...
1363            stat_result._st_target = ...
1364            stat_result._st_mtime_precision = ...
1365            return stat_result
1366
1367        # Define `ignores_line` only if the default in the base class
1368        # doesn't do enough!
1369        def ignores_line(self, line):
1370            """
1371            Return a true value if the line should be ignored. For
1372            example, the implementation in the base class handles
1373            lines like "total 17". On the other hand, if the line
1374            should be used for stat'ing, return a false value.
1375            """
1376            is_total_line = super(XyzParser, self).ignores_line(line)
1377            my_test = ...
1378            return is_total_line or my_test
1379
1380A ``StatResult`` object is similar to the value returned by
1381`os.stat`_ and is usually built with statements like
1382
1383::
1384
1385    stat_result = StatResult(
1386                    (st_mode, st_ino, st_dev, st_nlink, st_uid,
1387                     st_gid, st_size, st_atime, st_mtime, st_ctime))
1388    stat_result._st_name = ...
1389    stat_result._st_target = ...
1390    stat_result._st_mtime_precision = ...
1391
1392with the arguments of the ``StatResult`` constructor described in
1393the following table.
1394
===== =================== ============ =================== =========================================
Index Attribute           os.stat type ``StatResult`` type Notes
===== =================== ============ =================== =========================================
0     st_mode             int          int
1     st_ino              long         long
2     st_dev              long         long
3     st_nlink            int          int
4     st_uid              int          str                 usually only available as string
5     st_gid              int          str                 usually only available as string
6     st_size             long         long
7     st_atime            int/float    float
8     st_mtime            int/float    float
9     st_ctime            int/float    float
\-    _st_name            \-           str                 file name without directory part
\-    _st_target          \-           str                 link target (may be absolute or relative)
\-    _st_mtime_precision \-           int                 ``st_mtime`` precision in seconds
===== =================== ============ =================== =========================================
1412
1413If you can't extract all the desirable data from a line (for
1414example, the MS format doesn't contain any information about the
1415owner of a file), set the corresponding values in the ``StatResult``
1416instance to ``None``.
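
As an illustration, assume a made-up listing format in which each line
has the form ``<name>;<type>;<size>;<mtime>``. This format, the
function name ``parse_xyz_fields`` and the field meanings are
assumptions for this sketch only. The function builds the tuple for
the ``StatResult`` constructor and uses ``None`` for everything the
line doesn't tell::

    import stat


    def parse_xyz_fields(line):
        """
        Split a line of the (made-up) XYZ listing format and return
        a tuple `(name, stat_tuple)`.
        """
        name, kind, size, mtime = line.rstrip("\r\n").split(";")
        # Only the file type is known, not the permissions.
        st_mode = stat.S_IFDIR if kind == "d" else stat.S_IFREG
        stat_tuple = (
            st_mode,       # st_mode
            None,          # st_ino
            None,          # st_dev
            None,          # st_nlink
            None,          # st_uid
            None,          # st_gid
            int(size),     # st_size
            None,          # st_atime
            float(mtime),  # st_mtime
            None,          # st_ctime
        )
        return name, stat_tuple

In a ``parse_line`` method, the tuple would be passed to the
``StatResult`` constructor and the name assigned to ``_st_name``. A
real parser should also raise ``ftputil.error.ParserError`` for lines
that don't have the expected number of fields.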
1417
1418Parser classes can use several helper methods which are defined in
1419the class ``Parser``:
1420
1421- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
1422  an appropriate ``st_mode`` integer value.
1423
1424- ``parse_unix_time`` returns a float number usable for the
1425  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
1426  "May"/"26"/"2005". Note that the method expects the timestamp string
1427  already split at whitespace.
1428
1429- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
1430  returns a float number like from ``time.mktime``. Note that the
1431  method expects the timestamp string already split at whitespace.
1432
1433Additionally, there's an attribute ``_month_numbers`` which maps
1434lowercase three-letter month abbreviations to integers.
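
To give an idea of what a helper like ``parse_unix_mode`` has to
compute, here's a simplified stand-alone sketch. It's not the code
from ``ftputil``; the function name ``unix_mode_to_int`` is made up,
and special flags like setuid or sticky bits are ignored::

    import stat


    def unix_mode_to_int(mode_string):
        """
        Convert a string like "drwxr-xr-x" to an integer usable as
        `st_mode`. Only the file types "d", "l" and "-" and the
        permission characters "r", "w" and "x" are handled.
        """
        file_type_bits = {
            "d": stat.S_IFDIR,
            "l": stat.S_IFLNK,
            "-": stat.S_IFREG,
        }
        mode = file_type_bits[mode_string[0]]
        permission_bits = [
            stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
            stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
            stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
        ]
        for char, bit in zip(mode_string[1:10], permission_bits):
            if char in "rwx":
                mode |= bit
        return mode

In a custom parser, prefer the inherited ``parse_unix_mode`` method
over a hand-written conversion like this one.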
1435
1436For more details, see the two "standard" parsers ``UnixParser`` and
1437``MSParser`` in the module ``ftputil/stat.py``.
1438
1439To actually *use* the parser, call the method `set_parser`_ of the
1440``FTPHost`` instance.
1441
1442If you can't write a parser or don't want to, please ask on the
1443`ftputil mailing list`_. Possibly someone has already written a parser
1444for your server or can help with it.
1445
1446
1447FAQ / Tips and tricks
1448---------------------
1449
1450Where can I get the latest version?
1451~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1452
1453See the `download page`_. Announcements will be sent to the `mailing
1454list`_. Announcements on major updates will also be posted to the
1455newsgroup `comp.lang.python.announce`_ .
1456
1457.. _`download page`: http://ftputil.sschwarzer.net/download
1458.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
1459.. _`comp.lang.python.announce`: news:comp.lang.python.announce
1460
1461Is there a mailing list on ``ftputil``?
1462~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1463
1464Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
1465subscribe or read the archives.
1466
Though you can *technically* post without subscribing first, I can't
recommend it: mails from non-subscribers have to be approved by me,
and because the arriving mails contain *lots* of spam, I rarely go
through them.
1471
1472I found a bug! What now?
1473~~~~~~~~~~~~~~~~~~~~~~~~
1474
Before reporting a bug, make sure that you have already read this
manual and tried the `latest version`_ of ``ftputil``. The bug might
already have been fixed there.
1478
1479.. _`latest version`: http://ftputil.sschwarzer.net/download
1480
1481Please see http://ftputil.sschwarzer.net/issuetrackernotes for
1482guidelines on entering a bug in ``ftputil``'s ticket system. If you
1483are unsure if the behaviour you found is a bug or not, you should write
1484to the `ftputil mailing list`_. In *either* case you *must not*
1485include confidential information (user id, password, file names, etc.)
1486in the problem report! Be careful!
1487
1488Does ``ftputil`` support SSL/TLS?
1489~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1490
1491``ftputil`` has no *built-in* SSL/TLS support.
1492
1493On the other hand, there are two ways to get TLS support with
1494ftputil:
1495
1496- In Python 2.7 and Python 3.2 and up, the ``ftplib`` library has a
1497  class ``FTP_TLS`` that you can use for the ``session_factory``
1498  keyword argument in the ``FTPHost`` constructor. You can't use the
1499  class directly though if you need additional setup code in
1500  comparison to ``ftplib.FTP``, for example calling ``prot_p``, to
1501  secure the data connection. On the other hand,
1502  `ftputil.session.session_factory`_ can be used to create a custom
1503  session factory.
1504 
1505  If you have other requirements that ``session_factory`` can't
1506  fulfill, you may create your own session factory by inheriting from
1507  ``ftplib.FTP_TLS``::
1508
1509    import ftplib
1510
1511    import ftputil
1512
1513
1514    class FTPTLSSession(ftplib.FTP_TLS):
1515
        def __init__(self, host, user, password):
            ftplib.FTP_TLS.__init__(self)
            # Change the port number if needed.
            self.connect(host, 21)
1519            self.login(user, password)
1520            # Set up encrypted data connection.
1521            self.prot_p()
1522            ...
1523
1524    # Note the `session_factory` parameter. Pass the class, not
1525    # an instance.
    with ftputil.FTPHost(server, user, password,
                         session_factory=FTPTLSSession) as ftp_host:
        # Use `ftp_host` as usual.
        ...

.. _`ftputil.session.session_factory`: `Session factories`_

- If you need to work with Python 2.6, you can use the
  ``ftpslib.FTP_TLS`` class from the M2Crypto_ project. Again, you
  can't use the class directly but need to use
  ``ftputil.session.session_factory`` or a recipe similar to that
  above.

  Unfortunately, ``M2Crypto.ftpslib.FTP_TLS`` (at least in version
  0.22.3) doesn't work correctly if you pass Unicode strings to its
  methods. Since ``ftputil`` does exactly that at some point (even if
  you used byte strings in ``ftputil`` calls), you need a workaround
  in the session factory class::

    import M2Crypto

    import ftputil
    import ftputil.tool


    class M2CryptoSession(M2Crypto.ftpslib.FTP_TLS):

        def __init__(self, host, user, password):
            M2Crypto.ftpslib.FTP_TLS.__init__(self)
            # Change the port number if needed.
            self.connect(host, 21)
            self.auth_tls()
            self.login(user, password)
            self.prot_p()
            self._fix_socket()
            ...

        def _fix_socket(self):
            """
            Change the socket object so that arguments to `sendall`
            are converted to byte strings before being used.
            """
            original_sendall = self.sock.sendall
            # Bound method, therefore no `self` argument.
            def sendall(data):
                data = ftputil.tool.as_bytes(data)
                return original_sendall(data)
            self.sock.sendall = sendall

    # Note the `session_factory` parameter. Pass the class, not
    # an instance.
    with ftputil.FTPHost(server, user, password,
                         session_factory=M2CryptoSession) as ftp_host:
        # Use `ftp_host` as usual.
        ...

  That said, ``session_factory`` has this workaround built in, so
  normally you don't need to define the session factory yourself!

.. _M2Crypto: https://github.com/martinpaljak/M2Crypto

How do I connect to a non-default port?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the
standard FTP port (21). If you have to use a different port, refer to
the section `Session factories`_.
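
As a rough sketch of the recipe approach, a session class derived from
``ftplib.FTP`` could connect on a custom port. The class name
``CustomPortSession`` and the port number 2121 are made up for this
illustration; they aren't part of ``ftputil``::

    import ftplib

    # Hypothetical port number; use the port your server listens on.
    PORT = 2121


    class CustomPortSession(ftplib.FTP):
        """Session class that connects on `PORT` instead of port 21."""

        def __init__(self, host, user, password):
            ftplib.FTP.__init__(self)
            self.connect(host, PORT)
            self.login(user, password)

To use it, pass the class itself (not an instance) as
``session_factory=CustomPortSession`` when creating the ``FTPHost``
object.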

How do I set active or passive mode?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the section `Session factories`_.

How can I debug an FTP connection problem?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can do this with a session factory. See `Session factories`_.

If you want to change the debug level only temporarily after the
connection is established, you can reach the `session object`_ as the
``_session`` attribute of the ``FTPHost`` instance and call
``_session.set_debuglevel``. Note that the ``_session`` attribute
should *only* be accessed for debugging. Calling arbitrary
``ftplib.FTP`` methods on the session object may *cause* bugs!
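
A minimal sketch of such a temporary change might look like this. The
helper ``temporary_debug_level`` is hypothetical (it isn't part of
``ftputil``). It assumes that the session object offers
``set_debuglevel`` like ``ftplib.FTP`` does and that the debug level
was 0 before::

    import contextlib


    @contextlib.contextmanager
    def temporary_debug_level(ftp_host, level=2):
        """Raise the session's debug level inside a `with` block."""
        # Only touch `_session` for debugging, as explained above.
        session = ftp_host._session
        session.set_debuglevel(level)
        try:
            yield ftp_host
        finally:
            # Assumption: debugging was switched off (level 0) before.
            session.set_debuglevel(0)

Inside ``with temporary_debug_level(ftp_host):`` the FTP commands and
replies are logged; after the block the level is reset to 0.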

.. _`session object`: `Session factories`_

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on `time zone correction`_.
It may even be sufficient to call `synchronize_times`_.
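
For illustration, a conditional upload with a preceding time
synchronization could be sketched as follows. The function name
``upload_newer_files`` and its ``pairs`` argument are invented for
this example; ``synchronize_times`` and ``upload_if_newer`` are the
``FTPHost`` methods described in this document::

    def upload_newer_files(ftp_host, pairs):
        """
        Upload each local file in `pairs` (a sequence of
        `(source, target)` tuples) if it's newer than the remote file.
        """
        # Let ftputil determine the time shift once per connection.
        # This needs write access to the current remote directory.
        ftp_host.synchronize_times()
        for source, target in pairs:
            ftp_host.upload_if_newer(source, target)

If the login directory isn't writable, change to a writable directory
before calling the function or set the time shift explicitly as
described in the section on time zone correction.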

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to may use a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_ or ask on the `mailing list`_ for
help.

.. _`plug in your own parser`: `Writing directory parsers`_

``isdir``, ``isfile`` or ``islink`` incorrectly return ``False``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like Python's counterparts under `os.path`_, ``ftputil``'s methods
return ``False`` if they can't find the given path.

Probably you used ``listdir`` on a directory and called ``is...()`` on
the returned names. ``listdir`` returns bare names without the
directory part. So if the argument for ``listdir`` wasn't the current
directory, the paths won't be found and all ``is...()`` variants will
return ``False``. Join each name with the directory, for example with
``FTPHost.path.join``, before you call ``is...()`` on it.
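
The following sketch shows the pattern. The function name
``files_in_directory`` is made up for this example; ``listdir``,
``path.join`` and ``path.isfile`` are the usual ``FTPHost`` methods::

    def files_in_directory(ftp_host, directory):
        """Return the names in `directory` that are regular files."""
        result = []
        for name in ftp_host.listdir(directory):
            # `name` has no directory part, so build the full path
            # before testing it.
            path = ftp_host.path.join(directory, name)
            if ftp_host.path.isfile(path):
                result.append(name)
        return result

Calling ``ftp_host.path.isfile(name)`` directly would only work if
``directory`` were the current directory.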

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.6 to work.

- Whether ``ftputil`` "sees" "hidden" directory and file names (i.e.
  names starting with a dot) depends on the configuration of the FTP
  server. See `Hidden files and directories`_ for details.

- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- In multithreaded programs, you can have each thread use one or more
  ``FTPHost`` instances as long as no instance is shared with other
  threads.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if this causes problems for you.

- There's exactly one cache for ``lstat`` results for each ``FTPHost``
  object, i.e. there's no sharing of cache results determined by
  several ``FTPHost`` objects. See `Local caching of file system
  information`_ for the reasons.
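
The multithreading rule from the list above can be sketched like this.
The names ``list_directories`` and ``make_host`` are hypothetical;
``make_host`` stands for any callable that returns a *new* ``FTPHost``
object, for example
``lambda: ftputil.FTPHost(server, user, password)``::

    import threading


    def list_directories(make_host, directories):
        """List each directory in its own thread and `FTPHost`."""
        results = {}
        lock = threading.Lock()

        def worker(directory):
            # Each thread creates its own instance; nothing is shared.
            with make_host() as ftp_host:
                names = ftp_host.listdir(directory)
            with lock:
                results[directory] = names

        threads = [threading.Thread(target=worker, args=(directory,))
                   for directory in directories]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return results

Only the ``results`` dictionary is shared between the threads, and
access to it is protected by a lock.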


Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. There's also documentation in
`reStructuredText`_ and in HTML format. The location of these
files after installation is system-dependent.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``test_*.py`` and ``mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil``, i.e. *don't* modify it, you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G et al. 2013. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: https://docs.python.org/library/index.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net> and contributors (see
``doc/contributors.txt``).

The original ``lrucache`` module was written by Evan Prodromou
<evan@prodromou.name>.

Feedback is appreciated. :-)