Changeset 1887:31c01275f303


Timestamp:
Jan 1, 2020, 8:15:53 PM
Author:
Stefan Schwarzer <sschwarzer@…>
Branch:
default
histedit_source:
d9a2ae6dfaaf05530d3e072125d8de25f85de51d
Message:
Document that `PathLike` objects are allowed

Avoid "byte strings" and "unicode strings". Use `bytes` and `str`
instead since this is the (more) common terminology with Python 3.
File:
1 edited

  • doc/ftputil.txt

    r1886 r1887
      249   249      `FTPHost.open`_.
      250   250
      251       - First off: If your directory and file names (both as
      252       - arguments and on the server) contain only ISO 8859-1 (latin-1)
      253       - characters, you can use such names in the form of byte strings or
      254       - unicode strings. However, you can't mix different string types (bytes
      255       - and unicode) in one call (for example in ``FTPHost.path.join``).
            251 + First off: If your directory and file names (both as arguments and on
            252 + the server) contain only ISO 8859-1 (latin-1) characters, you can use
            253 + such names in the form of ``bytes`` or ``str`` objects. However, you
            254 + can't mix different string types (``bytes`` and ``str``) in one call
            255 + (for example in ``FTPHost.path.join``).
      256   256
      257   257   If you have directory or file names with characters that aren't in
      258       - latin-1, it's recommended to use byte strings. In that case,
      259       - returned paths will be byte strings, too.
            258 + latin-1, it's recommended to use ``bytes`` objects. In that case,
            259 + returned paths will be ``bytes`` objects, too.
      260   260
      261   261   Read on for details.
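
The rule about not mixing ``bytes`` and ``str`` in one call can be tried out locally. The sketch below is only an illustration: it uses the standard library's ``posixpath`` module as a stand-in for ``FTPHost.path``, on the assumption (stated in the documentation text of this changeset) that the rules match those for local file system operations in Python 3. It does not call ``ftputil`` itself.

```python
import posixpath

# Stand-in for FTPHost.path (assumption): posixpath follows the same
# "don't mix bytes and str" rule for local paths.

# All-str arguments give a str result.
str_result = posixpath.join("dir", "file.txt")

# All-bytes arguments give a bytes result.
bytes_result = posixpath.join(b"dir", b"file.txt")

# Mixing the two types in one call raises TypeError.
try:
    posixpath.join("dir", b"file.txt")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc
```

The result type follows the argument type, so code that starts with ``bytes`` paths keeps getting ``bytes`` back.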
     
      272   272      read the same file with Python 3's ``ftplib`` module.
      273   273
      274       - Methods that take names of directories and/or files can take either
      275       - byte or unicode strings. If a method got a string argument and returns
      276       - one or more strings, these strings will have the same string type as
      277       - the argument(s). Mixing different string arguments in one call (for
      278       - example in ``FTPHost.path.join``) isn't allowed and will cause a
      279       - ``TypeError``. These rules are the same as for local file system
      280       - operations in Python 3. Since ``ftputil`` uses the same API for Python
      281       - 2, ``ftputil`` will do the same when run on Python 2.
      282       -
      283       - Byte strings for directory and file names will be sent to the server
      284       - as-is. On the other hand, unicode strings will be encoded to byte
      285       - strings, assuming latin-1 encoding. This implies that such unicode
      286       - strings must only contain code points 0-255 for the latin-1 character
      287       - set. Using any other characters will result in a
            274 + Methods that take paths of directories and/or files can take either
            275 + ``bytes`` or ``str`` objects, or `PathLike`_ objects that can be
            276 + converted to ``bytes`` or ``str``.
            277 +
            278 + .. _PathLike: https://docs.python.org/3/library/os.html#os.PathLike
            279 +
            280 + If a method gets a string argument (or a string argument wrapped in a
            281 + PathLike_ object) and returns one or more strings, these strings will
            282 + have the same string type (``bytes`` or ``str``) as the argument(s).
            283 + Mixing different string types in one call (for example in
            284 + ``FTPHost.path.join``) isn't allowed and will cause a ``TypeError``.
            285 + These rules are the same as for local file system operations in Python 3.
            286 +
            287 + ``bytes`` objects for directory and file names will be sent to the
            288 + server as-is. On the other hand, ``str`` objects will be encoded to
            289 + ``bytes`` objects, assuming latin-1 encoding. This implies that such
            290 + ``str`` objects must only contain code points 0-255 for the latin-1
            291 + character set. Using any other characters will result in a
      288   292   ``UnicodeEncodeError`` exception.
      289   293
      290       - If you have directory or file names as unicode strings with non-latin-1
      291       - characters, encode the unicode strings to byte strings yourself, using
      292       - the encoding you know the server uses. Decode received paths with the
      293       - same encoding. Encapsulate these conversions as far as you can.
      294       - Otherwise, you'd have to adapt potentially a lot of code if the server
      295       - encoding changes.
      296       -
      297       - If you *don't* know the encoding on the server side,
      298       - it's probably the best to only use byte strings for directory and file
      299       - names. That said, as soon as you *show* the names to a user, you -- or
      300       - the library you use for displaying the names -- has to guess an
      301       - encoding.
            294 + If you have directory or file names as ``str`` objects with
            295 + non-latin-1 characters, encode the strings to ``bytes`` yourself,
            296 + using the encoding you know the server uses for its file system.
            297 + Decode received paths with the same encoding. Encapsulate these
            298 + conversions as far as you can. Otherwise, you'd have to adapt
            299 + potentially a lot of code if the server encoding changes.
            300 +
            301 + If you *don't* know the encoding on the server side, it's probably the
            302 + best to only use ``bytes`` for directory and file names. That said, as
            303 + soon as you *show* the names to a user, you -- or the library you use
            304 + for displaying the names -- has to guess an encoding.
            305 +
            306 + If you can decide about paths yourself, it's generally safest to use
            307 + only ASCII characters in FTP paths.
      302   308
      303   309
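
The advice to encode non-latin-1 names yourself and to encapsulate the conversions could look roughly like the following sketch. The names ``SERVER_ENCODING``, ``to_server_path`` and ``from_server_path`` are hypothetical helpers introduced here for illustration, not part of ``ftputil``, and UTF-8 is only an assumed server encoding. ``os.fspath`` is the standard library function that unwraps ``PathLike`` objects.

```python
import os
from pathlib import PurePosixPath

# Assumption: the server stores file names as UTF-8. Keeping the value in
# one place means only this line changes if the server encoding changes.
SERVER_ENCODING = "utf-8"


def to_server_path(path):
    """Return `path` (str, bytes or PathLike) as bytes for the server."""
    path = os.fspath(path)  # unwraps PathLike; str and bytes pass through
    if isinstance(path, bytes):
        return path  # bytes are used as-is
    return path.encode(SERVER_ENCODING)


def from_server_path(raw):
    """Decode a bytes path received from the server for display."""
    return raw.decode(SERVER_ENCODING)


# "č" (U+010D) lies outside latin-1, so a latin-1 encode attempt fails.
name = "čaj.txt"
try:
    name.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Encoding explicitly works, also when the name is wrapped in a PathLike.
encoded = to_server_path(PurePosixPath(name))
round_trip = from_server_path(encoded)
```

The resulting ``bytes`` value would then be passed to the path-taking methods, and any ``bytes`` paths that come back would go through ``from_server_path`` before being shown to a user.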