Changes between Version 39 and Version 40 of Documentation


Timestamp:
Feb 17, 2021, 7:53:35 PM (10 months ago)
Author:
schwa
Comment:

--

  • Documentation

    v39 v40  
    44==============================================
    55
    6 :Version:   4.0.0
    7 :Date:      2020-06-13
     6:Version:   5.0.0
     7:Date:      2021-02-17
    88:Summary:   high-level FTP client library for Python
    99:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
     
    9898        InternalError(FTPError)
    9999            InaccessibleLoginDirError(InternalError)
     100            NoEncodingError(InternalError)
    100101            ParserError(InternalError)
    101102            RootDirError(InternalError)
     
    221222  directory as argument would fail.
    222223
     224- ``NoEncodingError``
     225
     226  is raised if an FTP session instance doesn't have an ``encoding``
     227  attribute (see also `session factories`_).
     228
    223229- ``ParserError``
    224230
     
    241247
    242248
    243 Directory and file names
    244 ------------------------
    245 
    246 .. note::
    247 
    248    Keep in mind that this section only applies to directory and file
    249    *names*, not file *contents*. Encoding and decoding for file
    250    contents is handled by the ``encoding`` argument for
    251    `FTPHost.open`_.
    252 
    253 First off: If your directory and file names (both as arguments and on
    254 the server) contain only ISO 8859-1 (latin-1) characters, you can use
    255 such names in the form of ``bytes`` or ``str`` objects. However, you
    256 can't mix different string types (``bytes`` and ``str``) in one call
    257 (for example in ``FTPHost.path.join``).
    258 
    259 If you have directory or file names with characters that aren't in
    260 latin-1, it's recommended to use ``bytes`` objects. In that case,
    261 returned paths will be ``bytes`` objects, too.
    262 
    263 Read on for details.
    264 
    265 .. note::
    266 
    267    The approach described below may look awkward and in a way it is.
    268    The intention of ``ftputil`` is to behave like the local file
    269    system APIs of Python 3 as far as it makes sense. Moreover, the
    270    approach taken makes sure that directory and file names that were
    271    used with Python 3's native ``ftplib`` module will be compatible
    272    with ``ftputil`` and vice versa. Otherwise you may be able to use a
    273    file name with ``ftputil``, but get an exception when trying to
    274    read the same file with Python 3's ``ftplib`` module.
    275 
    276 Methods that take paths of directories and/or files can take either
    277 ``bytes`` or ``str`` objects, or `PathLike`_ objects that can be
    278 converted to ``bytes`` or ``str``.
    279 
    280 .. _PathLike: https://docs.python.org/3/library/os.html#os.PathLike
    281 
    282 If a method gets a string argument (or a string argument wrapped in a
    283 PathLike_ object) and returns one or more strings, these strings will
    284 have the same string type (``bytes`` or ``str``) as the argument(s).
    285 Mixing different string types in one call (for example in
    286 ``FTPHost.path.join``) isn't allowed and will cause a ``TypeError``.
    287 These rules are the same as for local file system operations in Python 3.
    288 
    289 ``bytes`` objects for directory and file names will be sent to the
    290 server as-is. On the other hand, ``str`` objects will be encoded to
    291 ``bytes`` objects, assuming latin-1 encoding. This implies that such
    292 ``str`` objects must only contain code points 0-255 for the latin-1
    293 character set. Using any other characters will result in a
    294 ``UnicodeEncodeError`` exception.
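
As a small illustration of the latin-1 rule above, the following sketch uses only Python's built-in ``str.encode``. It is not ftputil code; the helper ``fits_latin1`` and the file names are made up for this example.

```python
def fits_latin1(name: str) -> bool:
    """Return True if every character of `name` can be encoded as latin-1."""
    try:
        name.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical file names, chosen only for illustration.
latin1_name = "r\u00e9sum\u00e9.txt"  # all code points are below 256
other_name = "\u4f60\u597d.txt"       # CJK characters, outside latin-1

# Each latin-1 character maps to exactly one byte.
encoded = latin1_name.encode("latin-1")
print(encoded)
print(fits_latin1(latin1_name), fits_latin1(other_name))
```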
    295 
    296 If you have directory or file names as ``str`` objects with
    297 non-latin-1 characters, encode the strings to ``bytes`` yourself,
    298 using the encoding you know the server uses for its file system.
    299 Decode received paths with the same encoding. Encapsulate these
    300 conversions as far as you can. Otherwise, you'd have to adapt
    301 potentially a lot of code if the server encoding changes.
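
One possible way to encapsulate these conversions is a pair of helper functions that share a single constant. This is only a sketch: the names ``SERVER_PATH_ENCODING``, ``to_server_path`` and ``from_server_path`` are hypothetical, and UTF-8 is an assumed server encoding.

```python
# Assumption: the server stores names as UTF-8. If the server encoding
# changes, only this constant needs to be adapted.
SERVER_PATH_ENCODING = "utf-8"


def to_server_path(path: str) -> bytes:
    """Encode a path for use as a `bytes` argument."""
    return path.encode(SERVER_PATH_ENCODING)


def from_server_path(path: bytes) -> str:
    """Decode a `bytes` path received from the server."""
    return path.decode(SERVER_PATH_ENCODING)


remote_name = to_server_path("\u4f60\u597d.txt")  # made-up file name
print(remote_name)
print(from_server_path(remote_name) == "\u4f60\u597d.txt")
```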
    302 
    303 If you *don't* know the encoding on the server side, it's probably
    304 best to use only ``bytes`` for directory and file names. That said, as
    305 soon as you *show* the names to a user, you -- or the library you use
    306 for displaying the names -- have to guess an encoding.
    307 
    308 If you can decide about paths yourself, it's generally safest to use
    309 only ASCII characters in FTP paths.
    310 
    311 
    312249``FTPHost`` objects
    313250-------------------
     
    343280body of the ``with`` statement, the instance is closed as well.
    344281Exceptions will be propagated (as with ``try ... finally``).
     282
     283.. _`session factory`:
    345284
    346285Session factories
     
    402341                    use_passive_mode=None,
    403342                    encrypt_data_channel=True,
     343                    encoding=None,
    404344                    debug_level=None)
    405345
     
    422362  parameter is ignored.
    423363
     364- ``encoding`` can be a string to set the encoding of directory and
     365  file paths on the remote server. (This has nothing to do with the
     366  encoding of file contents!) If you pass a string and your base class
     367  is neither ``ftplib.FTP`` nor ``ftplib.FTP_TLS``, the heuristic used
     368  in ``session_factory`` may not work reliably. Therefore, if in
     369  doubt, let ``encoding`` be ``None`` and define your ``base_class``
     370  so that it sets the encoding you want.
     371
     372  Note: In Python 3.9, the default path encoding for ``ftplib.FTP``
     373  and ``ftplib.FTP_TLS`` changed from "latin-1" to "utf-8".
     374  Hence, if you don't pass an ``encoding`` to ``session_factory``,
     375  you'll get different path encodings for Python 3.8 and earlier vs.
     376  Python 3.9 and later.
     377
     378  If you're sure that you always use only ASCII characters in your
     379  remote paths, you don't need to worry about the path encoding and
     380  don't need to use the ``encoding`` argument.
     381
    424382- ``debug_level`` sets the debug level for FTP session instances. The
    425383  semantics is defined by the base class. For example, a debug level
     
    441399                           port=31,
    442400                           encrypt_data_channel=True,
     401                           encoding="UTF-8",
    443402                           debug_level=2)
    444403
     
    448407
    449408to create and use a session factory derived from ``ftplib.FTP_TLS``
    450 that connects on command channel 31, will encrypt the data channel and
    451 print output for debug level 2.
     409that connects to the command channel on port 31, encrypts the data
     410channel, uses the UTF-8 encoding for remote paths and prints output
     411for debug level 2.
    452412
    453413Note: Generally, you can achieve everything you can do with
    454414``ftputil.session.session_factory`` with an explicit session factory
    455 as described at the start of this section. However, the class
    456 ``M2Crypto.ftpslib.FTP_TLS`` has a limitation so that you can't use
    457 it with ftputil out of the box. The function ``session_factory``
    458 contains a workaround for this limitation. For details refer to `this
    459 ticket`_.
    460 
    461 .. _`this ticket`: https://ftputil.sschwarzer.net/trac/ticket/78
     415as described at the start of this section.
     416
     417
     418Directory and file names
     419~~~~~~~~~~~~~~~~~~~~~~~~
     420
     421.. note::
     422
     423   Keep in mind that this section only applies to directory and file
     424   *names*, not file *contents*. Encoding and decoding for file
     425   contents is handled by the ``encoding`` argument for
     426   `FTPHost.open`_.
     427
     428Generally, paths can be ``str`` or ``bytes`` objects (or `PathLike`_
     429objects wrapping ``str`` or ``bytes``). If a method gets a string
     430argument (or a string argument wrapped in a PathLike_ object) and
     431returns one or more strings, these strings will have the same string
     432type (``bytes`` or ``str``) as the argument(s). Mixing different
     433string types in one call (for example in ``FTPHost.path.join``) isn't
     434allowed and will cause a ``TypeError``. These rules are the same as
     435for local file system operations.
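
Since these rules are described as the same as for local file system operations, the standard library can serve as an illustration. The sketch below uses ``posixpath.join`` as a stand-in, not ``FTPHost.path.join`` itself; the directory and file names are made up.

```python
import pathlib
import posixpath

# The result has the same string type as the arguments.
str_result = posixpath.join("dir", "file.txt")
bytes_result = posixpath.join(b"dir", b"file.txt")

# A PathLike object wrapping a `str` behaves like a `str` argument.
pathlike_result = posixpath.join(pathlib.PurePosixPath("dir"), "file.txt")

print(type(str_result).__name__, type(bytes_result).__name__)
print(pathlike_result)

# Mixing `str` and `bytes` in one call raises a `TypeError`.
mixing_failed = False
try:
    posixpath.join("dir", b"file.txt")
except TypeError as exc:
    mixing_failed = True
    print("TypeError:", exc)
```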
     438
     439.. _PathLike: https://docs.python.org/3/library/os.html#os.PathLike
     440
     441Although you can pass paths as ``str`` or ``bytes``, the former is
     442recommended. See below for the reason.
     443
     444*If* you have directory or file names with non-ASCII characters, you
     445need to be aware of the encoding the `session factory`_ (e. g.
     446``ftplib.FTP``) uses. This needs to be the same encoding that the FTP
     447server uses for the paths.
     448
     449The following diagram shows string conversions on the way from your
     450code to the remote FTP server. The opposite way works analogously, so
     451encoding steps in the diagram become decoding steps and decoding steps
     452in the diagram become encoding steps.
     453
     454The "branching points" in the upper and lower parts of the diagram are
     455independent, so depending on how you pass paths to ftputil and which
     456file system API the FTP server uses, there are four possible
     457combinations.
     458
     459::
     460
     461     +-----------+       +-----------+
     462     | Your code |       | Your code |
     463     +-----------+       +-----------+
     464          |                    |
     465          |  str               |  bytes
     466          v                    v
     467    +-------------+     +-------------+  decode with encoding of session,
     468    | ftputil API |     | ftputil API |  e. g. `ftplib.FTP` instance
     469    +-------------+     +-------------+
     470            \               /
     471             \     str     /
     472              v           v
     473            +---------------+  encode with encoding
     474            |  ftplib API   |  specified in `FTP` instance
     475            +---------------+
     476                    |
     477                    |  bytes
     478                    v
     479             +-------------+
     480             | socket API  |
     481             +-------------+
     482                /       \
     483               /         \                 local / client
     484    - - - - - / - - - - - \ - - - - - - - - - - - - - - - - - - - - - -
     485             /             \              remote / server
     486            /     bytes     \
     487           v                 v
     488    +------------+      +------------+  decode with encoding from
     489    | FTP server |      | FTP server |  FTP server configuration
     490    +------------+      +------------+
     491          |                   |
     492          |  bytes            |  str
     493          v                   v
     494   +-------------+      +-------------+
     495   | remote file |      | remote file |
     496   | system API  |      | system API  |
     497   +-------------+      +-------------+
     498           \                 /
     499            \      bytes    /
     500             v             v
     501          +-------------------+
     502          |    file system    |
     503          +-------------------+
     504
     505As you can see at the top of the diagram, if you use ``str`` objects
     506(regular unicode strings), there's one fewer decoding step, and so one
     507fewer source of problems. If you use ``bytes`` objects for paths,
     508ftputil tries to get the encoding for the FTP server from the
     509``encoding`` attribute of the session instance (say, an instance of
     510``ftplib.FTP``). If no ``encoding`` attribute is present, a
     511``NoEncodingError`` is raised.
     512
     513All encoding/decoding steps must use the same encoding, the encoding
     514the server uses (at the bottom of the diagram). If the server uses the
     515bytes from the socket directly, i. e. without an encoding step, you
     516have to use the file system encoding.
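
To see why the encodings have to match, consider what happens to a single name when the two ends disagree. The sketch below only simulates the encode and decode steps with plain Python strings; no FTP connection is involved, and the file name and the choice of UTF-8 and latin-1 are assumptions for illustration.

```python
# Made-up file name with two non-ASCII characters.
name = "Gr\u00fc\u00dfe.txt"

# Client side: the `str` path is encoded before it goes to the socket.
sent = name.encode("utf-8")

# Server side, same encoding: the original name is recovered.
same_encoding = sent.decode("utf-8")

# Server side, different encoding: decoding succeeds, but the name differs.
other_encoding = sent.decode("latin-1")

print(same_encoding == name)
print(other_encoding == name)
print(len(name), len(other_encoding))
```

Note that the mismatch doesn't raise an exception here, so a wrong encoding may go unnoticed until a file can't be found or a listing shows unexpected names.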
     517
     518Up to and including Python 3.8, the encoding implicitly assumed by
     519the ``ftplib`` module was latin-1, so using ``bytes`` was the safest
     520strategy. However, Python 3.9 made the encoding configurable via the
     521``encoding`` argument of the ``ftplib.FTP`` constructor, *but the
     522default is UTF-8*.
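
The difference between Python versions can be checked directly with ``ftplib``. This is a minimal sketch using only the standard library; no host is passed, so no connection is opened.

```python
import ftplib
import sys

# Without a host argument the constructor doesn't connect to a server.
session = ftplib.FTP()
print(sys.version_info[:2], session.encoding)

# On Python 3.9 and later, the encoding can be set explicitly.
if sys.version_info >= (3, 9):
    latin1_session = ftplib.FTP(encoding="latin-1")
    print(latin1_session.encoding)
```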
     523
     524If you don't pass a `session factory`_ to the ``ftputil.FTPHost``
     525constructor, ftputil will use latin-1 encoding for the paths. This is
     526the same value as in earlier ftputil versions in combination with
     527Python 3.8 and earlier.
     528
     529Summary:
     530
     531- If possible, use only ASCII characters in paths.
     532- If possible, pass paths to ftputil as ``str``, not ``bytes``.
     533- If you use a custom session factory, the session instances created
     534  by the factory must have an ``encoding`` attribute with the name of
     535  the path encoding to use. If your session instances don't have an
     536  ``encoding`` attribute, ftputil raises a ``NoEncodingError`` when
     537  the session is created.
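
As a rough sketch of the last point, a custom session class could set the ``encoding`` attribute itself. The class name ``Latin1Session``, its constructor signature and the choice of latin-1 are assumptions for illustration, not code from ftputil.

```python
import ftplib


class Latin1Session(ftplib.FTP):
    """Hypothetical session class that always uses latin-1 for paths."""

    def __init__(self, host="", user="", password=""):
        # No host is passed to the base class, so it doesn't connect yet.
        super().__init__()
        # Set the attribute after the base constructor has run, so that
        # the base class default doesn't overwrite it.
        self.encoding = "latin-1"
        # Connect only if a host was given.
        if host:
            self.connect(host)
            self.login(user, password)


# Created without a host, so no connection is attempted here.
session = Latin1Session()
print(hasattr(session, "encoding"), session.encoding)
```

Such a class would presumably be passed as the session factory to the ``ftputil.FTPHost`` constructor; the attribute is set in ``__init__`` so that the value is the same on Python 3.8 and on Python 3.9 and later.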
     538
    462539
    463540Hidden files and directories