Changes between Version 39 and Version 40 of Documentation

Feb 17, 2021, 7:53:35 PM (10 months ago)



  • Documentation

    v39 v40  
    6 :Version:   4.0.0
    7 :Date:      2020-06-13
     6:Version:   5.0.0
     7:Date:      2021-02-17
    88:Summary:   high-level FTP client library for Python
    99:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
    9898        InternalError(FTPError)
    9999            InaccessibleLoginDirError(InternalError)
     100            NoEncodingError(InternalError)
    100101            ParserError(InternalError)
    101102            RootDirError(InternalError)
    221222  directory as argument would fail.
     224- ``NoEncodingError``
     226  is raised if an FTP session instance doesn't have an ``encoding``
     227  attribute (see also `session factories`_).
    223229- ``ParserError``
    243 Directory and file names
    244 ------------------------
    246 .. note::
    248    Keep in mind that this section only applies to directory and file
    249    *names*, not file *contents*. Encoding and decoding for file
    250    contents are handled by the ``encoding`` argument for
    251    ``FTPHost.open``.
    253 First off: If your directory and file names (both as arguments and on
    254 the server) contain only ISO 8859-1 (latin-1) characters, you can use
    255 such names in the form of ``bytes`` or ``str`` objects. However, you
    256 can't mix different string types (``bytes`` and ``str``) in one call
    257 (for example in ``FTPHost.path.join``).
    259 If you have directory or file names with characters that aren't in
    260 latin-1, it's recommended to use ``bytes`` objects. In that case,
    261 returned paths will be ``bytes`` objects, too.
    263 Read on for details.
    265 .. note::
    267    The approach described below may look awkward and in a way it is.
    268    The intention of ``ftputil`` is to behave like the local file
    269    system APIs of Python 3 as far as it makes sense. Moreover, the
    270    approach taken makes sure that directory and file names that were
    271    used with Python 3's native ``ftplib`` module will be compatible
    272    with ``ftputil`` and vice versa. Otherwise you may be able to use a
    273    file name with ``ftputil``, but get an exception when trying to
    274    read the same file with Python 3's ``ftplib`` module.
    276 Methods that take paths of directories and/or files can take either
    277 ``bytes`` or ``str`` objects, or `PathLike`_ objects that can be
    278 converted to ``bytes`` or ``str``.
    280 .. _PathLike:
    282 If a method gets a string argument (or a string argument wrapped in a
    283 PathLike_ object) and returns one or more strings, these strings will
    284 have the same string type (``bytes`` or ``str``) as the argument(s).
    285 Mixing different string types in one call (for example in
    286 ``FTPHost.path.join``) isn't allowed and will cause a ``TypeError``.
    287 These rules are the same as for local file system operations in Python 3.
    289 ``bytes`` objects for directory and file names will be sent to the
    290 server as-is. On the other hand, ``str`` objects will be encoded to
    291 ``bytes`` objects, assuming latin-1 encoding. This implies that such
    292 ``str`` objects must only contain code points 0-255 for the latin-1
    293 character set. Using any other characters will result in a
    294 ``UnicodeEncodeError`` exception.
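The latin-1 restriction described above can be tried out with Python's built-in ``str.encode`` alone. This is only a minimal sketch: the file names are made up, and no FTP connection or ftputil call is involved.

```python
# A name that contains only latin-1 characters encodes to one byte per character.
latin1_name = "müller.txt"
encoded = latin1_name.encode("latin-1")
print(encoded)

# The euro sign is not part of latin-1, so the same encoding step fails.
encode_error = None
try:
    "preis_€.txt".encode("latin-1")
except UnicodeEncodeError as exc:
    encode_error = exc
    print("cannot encode:", exc.reason)
```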
    296 If you have directory or file names as ``str`` objects with
    297 non-latin-1 characters, encode the strings to ``bytes`` yourself,
    298 using the encoding you know the server uses for its file system.
    299 Decode received paths with the same encoding. Encapsulate these
    300 conversions as far as you can. Otherwise, you'd have to adapt
    301 potentially a lot of code if the server encoding changes.
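One way to encapsulate the conversions is to keep the server encoding in a single place. The following is a sketch under assumptions: the names ``SERVER_PATH_ENCODING``, ``to_server_path`` and ``from_server_path`` are hypothetical helpers, not part of ftputil, and UTF-8 is only an assumed server encoding.

```python
# Assumption: the server stores names as UTF-8. Change this one constant
# if the server encoding turns out to be different.
SERVER_PATH_ENCODING = "utf-8"


def to_server_path(path: str) -> bytes:
    """Encode a ``str`` path to the ``bytes`` form passed to the FTP API."""
    return path.encode(SERVER_PATH_ENCODING)


def from_server_path(path: bytes) -> str:
    """Decode a ``bytes`` path received from the FTP API for display."""
    return path.decode(SERVER_PATH_ENCODING)


remote_name = to_server_path("日本語.txt")
print(remote_name)
print(from_server_path(remote_name))
```

With helpers like these, the rest of the code never mentions the encoding directly, so a change of server encoding touches only the constant.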
    303 If you *don't* know the encoding on the server side, it's probably
    304 best to use only ``bytes`` for directory and file names. That said, as
    305 soon as you *show* the names to a user, you -- or the library you use
    306 for displaying the names -- have to guess an encoding.
    308 If you can decide about paths yourself, it's generally safest to use
    309 only ASCII characters in FTP paths.
    312249``FTPHost`` objects
    343280body of the ``with`` statement, the instance is closed as well.
    344281Exceptions will be propagated (as with ``try ... finally``).
     283.. _`session factory`:
    346285Session factories
    402341                    use_passive_mode=None,
    403342                    encrypt_data_channel=True,
     343                    encoding=None,
    404344                    debug_level=None)
    422362  parameter is ignored.
     364- ``encoding`` can be a string to set the encoding of directory and
     365  file paths on the remote server. (This has nothing to do with the
     366  encoding of file contents!) If you pass a string and your base class
     367  is neither ``ftplib.FTP`` nor ``ftplib.FTP_TLS``, the heuristic used
     368  in ``session_factory`` may not work reliably. Therefore, if in
     369  doubt, leave ``encoding`` as ``None`` and define your ``base_class``
     370  so that it sets the encoding you want.
     372  Note: In Python 3.9, the default path encoding for ``ftplib.FTP``
     373  and ``ftplib.FTP_TLS`` changed from "latin-1" to "utf-8".
     374  Hence, if you don't pass an ``encoding`` to ``session_factory``,
     375  you'll get different path encodings for Python 3.8 and earlier vs.
     376  Python 3.9 and later.
     378  If you're sure that you always use only ASCII characters in your
     379  remote paths, you don't need to worry about the path encoding and
     380  don't need to use the ``encoding`` argument.
    424382- ``debug_level`` sets the debug level for FTP session instances. The
    425383  semantics is defined by the base class. For example, a debug level
    441399                           port=31,
    442400                           encrypt_data_channel=True,
     401                           encoding="UTF-8",
    443402                           debug_level=2)
    449408to create and use a session factory derived from ``ftplib.FTP_TLS``
    450 that connects on command channel 31, will encrypt the data channel and
    451 print output for debug level 2.
     409that connects on command channel 31, will encrypt the data channel,
     410use the UTF-8 encoding for remote paths and print output for debug
     411level 2.
    453413Note: Generally, you can achieve everything you can do with
    454414``ftputil.session.session_factory`` with an explicit session factory
    455 as described at the start of this section. However, the class
    456 ``M2Crypto.ftpslib.FTP_TLS`` has a limitation so that you can't use
    457 it with ftputil out of the box. The function ``session_factory``
    458 contains a workaround for this limitation. For details refer to `this
    459 ticket`_.
    461 .. _`this ticket`:
     415as described at the start of this section.
     418Directory and file names
     421.. note::
     423   Keep in mind that this section only applies to directory and file
     424   *names*, not file *contents*. Encoding and decoding for file
     425   contents are handled by the ``encoding`` argument for
     426   ``FTPHost.open``.
     428Generally, paths can be ``str`` or ``bytes`` objects (or `PathLike`_
     429objects wrapping ``str`` or ``bytes``). If a method gets a string
     430argument (or a string argument wrapped in a PathLike_ object) and
     431returns one or more strings, these strings will have the same string
     432type (``bytes`` or ``str``) as the argument(s). Mixing different
     433string types in one call (for example in ``FTPHost.path.join``) isn't
     434allowed and will cause a ``TypeError``. These rules are the same as
     435for local file system operations.
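Since the text states that these rules match local file system operations, the behavior can be illustrated with the standard library's ``posixpath`` module as a stand-in. This sketch does not call ``FTPHost.path.join`` itself; it only shows the local rule that ftputil is described as following.

```python
import posixpath

# The return type follows the argument type.
print(posixpath.join("dir", "file.txt"))    # str in, str out
print(posixpath.join(b"dir", b"file.txt"))  # bytes in, bytes out

# Mixing ``str`` and ``bytes`` in one call is rejected.
mixing_error = None
try:
    posixpath.join("dir", b"file.txt")
except TypeError as exc:
    mixing_error = exc
    print("TypeError:", exc)
```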
     439.. _PathLike:
     441Although you can pass paths as ``str`` or ``bytes``, the former is
     442recommended. See below for the reason.
     444*If* you have directory or file names with non-ASCII characters, you
     445need to be aware of the encoding the `session factory`_ (e. g.
     446``ftplib.FTP``) uses. This needs to be the same encoding that the FTP
     447server uses for the paths.
     449The following diagram shows string conversions on the way from your
     450code to the remote FTP server. The opposite way works analogously, so
     451encoding steps in the diagram become decoding steps and decoding steps
     452in the diagram become encoding steps.
     454The "branching points" in the upper and lower parts of the diagram are
     455independent, so depending on how you pass paths to ftputil and which
     456file system API the FTP server uses, there are four possible combinations.
     461     +-----------+       +-----------+
     462     | Your code |       | Your code |
     463     +-----------+       +-----------+
     464          |                    |
     465          |  str               |  bytes
     466          v                    v
     467    +-------------+     +-------------+  decode with encoding of session,
     468    | ftputil API |     | ftputil API |  e. g. `ftplib.FTP` instance
     469    +-------------+     +-------------+
     470            \               /
     471             \     str     /
     472              v           v
     473            +---------------+  encode with encoding
     474            |  ftplib API   |  specified in `FTP` instance
     475            +---------------+
     476                    |
     477                    |  bytes
     478                    v
     479             +-------------+
     480             | socket API  |
     481             +-------------+
     482                /       \
     483               /         \                 local / client
     484    - - - - - / - - - - - \ - - - - - - - - - - - - - - - - - - - - - -
     485             /             \              remote / server
     486            /     bytes     \
     487           v                 v
     488    +------------+      +------------+  decode with encoding from
     489    | FTP server |      | FTP server |  FTP server configuration
     490    +------------+      +------------+
     491          |                   |
     492          |  bytes            |  str
     493          v                   v
     494   +-------------+      +-------------+
     495   | remote file |      | remote file |
     496   | system API  |      | system API  |
     497   +-------------+      +-------------+
     498           \                 /
     499            \      bytes    /
     500             v             v
     501          +-------------------+
     502          |    file system    |
     503          +-------------------+
     505As you can see at the top of the diagram, if you use ``str`` objects
     506(regular unicode strings), there's one fewer decoding step, and so one
     507fewer source of problems. If you use ``bytes`` objects for paths,
     508ftputil tries to get the encoding for the FTP server from the
     509``encoding`` attribute of the session instance (say, an instance of
     510``ftplib.FTP``). If no ``encoding`` attribute is present, a
     511``NoEncodingError`` is raised.
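The extra decoding step for ``bytes`` paths can be pictured roughly as follows. This is a simplified sketch, not ftputil's actual implementation: ``as_str_path``, ``MissingEncodingError`` and ``FakeSession`` are hypothetical names used only for illustration.

```python
class MissingEncodingError(Exception):
    """Hypothetical stand-in for ftputil's ``NoEncodingError``."""


def as_str_path(path, session):
    """Return `path` as ``str``, decoding ``bytes`` with the session's encoding."""
    if isinstance(path, str):
        # ``str`` paths need no decoding step at all.
        return path
    try:
        encoding = session.encoding
    except AttributeError:
        raise MissingEncodingError("session has no `encoding` attribute") from None
    return path.decode(encoding)


class FakeSession:
    """Minimal object that only carries an ``encoding`` attribute."""
    encoding = "utf-8"


print(as_str_path(b"caf\xc3\xa9.txt", FakeSession()))
```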
     513All encoding/decoding steps must use the same encoding, namely the
     514encoding the server uses (at the bottom of the diagram). If the server uses the
     515bytes from the socket directly, i. e. without an encoding step, you
     516have to use the encoding of the server's file system.
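What goes wrong when the encodings differ can be shown with plain string methods. In this sketch the client side is assumed to encode with UTF-8 while the other side decodes with latin-1; the file name is made up.

```python
name = "über.txt"

# Client side: the path is encoded with UTF-8 before it is sent.
sent = name.encode("utf-8")

# Other side: the same bytes are decoded with a different encoding.
misread = sent.decode("latin-1")
print(misread)

# Decoding with the matching encoding restores the original name.
roundtrip = sent.decode("utf-8")
print(roundtrip)
```

No exception is raised in the mismatched case, because every byte is valid latin-1. The name is silently corrupted instead, which is why such mismatches can go unnoticed.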
     518Up to and including Python 3.8, the encoding implicitly assumed by
     519the ``ftplib`` module was latin-1, so using ``bytes`` was the safest
     520strategy. However, Python 3.9 made the encoding configurable via the
     521``encoding`` argument of the ``ftplib.FTP`` constructor, *with UTF-8
     522as the default*.
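The version-dependent default can be inspected without connecting to a server, because ``ftplib.FTP()`` called without a host does not open a connection. A small sketch, with the keyword argument guarded for Python versions before 3.9:

```python
import ftplib
import sys

# No host is given, so no connection is attempted here.
default_session = ftplib.FTP()
print(default_session.encoding)

# The ``encoding`` keyword argument exists only in Python 3.9 and later.
if sys.version_info >= (3, 9):
    latin1_session = ftplib.FTP(encoding="latin-1")
    print(latin1_session.encoding)
```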
     524If you don't pass a `session factory`_ to the ``ftputil.FTPHost``
     525constructor, ftputil will use latin-1 encoding for the paths. This is
     526the same value as in earlier ftputil versions in combination with
     527Python 3.8 and earlier.
     531- If possible, use only ASCII characters in paths.
     532- If possible, pass paths to ftputil as ``str``, not ``bytes``.
     533- If you use a custom session factory, the session instances created
     534  by the factory must have an ``encoding`` attribute with the name of
     535  the path encoding to use. If your session instances don't have an
     536  ``encoding`` attribute, ftputil raises a ``NoEncodingError`` when
     537  the session is created.
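For the last point, a custom session class might pin the path encoding explicitly, so that it does not depend on the Python version. This is a sketch under assumptions: the class name ``Utf8Session`` and the constant ``PATH_ENCODING`` are hypothetical, UTF-8 is an assumed server encoding, and the standard port 21 is assumed.

```python
import ftplib


class Utf8Session(ftplib.FTP):
    """Hypothetical session class that always uses UTF-8 for paths."""

    # Assumption: the server expects UTF-8 path names.
    PATH_ENCODING = "utf-8"

    def __init__(self, host, user, password):
        super().__init__()
        # Set the encoding before connecting, so that it applies to all
        # commands and responses of the session.
        self.encoding = self.PATH_ENCODING
        self.connect(host, 21)
        self.login(user, password)
```

Such a class would then be passed as the ``session_factory`` argument of the ``ftputil.FTPHost`` constructor.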
    463540Hidden files and directories