Changeset 1948:2ec437f9c646


Timestamp:
Jun 1, 2020, 5:33:09 PM (14 months ago)
Author:
Stefan Schwarzer <sschwarzer@…>
Branch:
default
Message:
Add draft of "What's new in ftputil 4.0.0?"
File:
1 copied

  • doc/whats_new_in_ftputil_4.0.0.txt

    r1565 r1948  
    1 What's new in ftputil 3.0?
     1What's new in ftputil 4.0?
    22==========================
    33
    4 :Version:   3.0
    5 :Date:      2013-09-29
     4:Version:   4.0
     5:Date:      2020-06-01
    66:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
    77
     
    99
    1010
    11 Added support for Python 3
    12 --------------------------
     11Supported Python versions
     12-------------------------
    1313
    14 This ftputil release adds support for Python 3.0 and up.
     14Support for Python 2 is dropped. The minimum required Python 3 version
     15is 3.6.
    1516
    16 Python 2 and 3 are supported with the same source code. Also, the API
    17 including the semantics is the same. As for Python 3 code, in ftputil
    18 3.0 unicode is somewhat preferred over byte strings. On the other
    19 hand, in line with the file system APIs of both Python 2 and 3,
    20 methods take either byte strings or unicode strings. Methods that take
    21 and return strings (for example, ``FTPHost.path.abspath`` or
    22 ``FTPHost.listdir``), return the same string type they get.
    23 
    24 .. Note::
    25 
    26     Both Python 2 and 3 have two "string" types where one type represents a
    27     sequence of bytes and the other type character (text) data.
    28 
    29     ============== =========== =========== ===========================
    30     Python version Binary type Text type   Default string literal type
    31     ============== =========== =========== ===========================
    32     2              ``str``     ``unicode`` ``str`` (= binary type)
    33     3              ``bytes``   ``str``     ``str`` (= text type)
    34     ============== =========== =========== ===========================
    35 
    36     So both lines of Python have an ``str`` type, but in Python 2 it's
    37     the byte type and in Python 3 the text type. The ``str`` type is
    38     also what you get when you write a literal string without any
    39     prefixes. For example ``"Python"`` is a binary string in Python 2
    40     and a text (unicode) string in Python 3.
    41 
    42     If this seems confusing, please read `this description`_ in the Python
    43     documentation for more details.
    44 
    45     .. _`this description`: http://docs.python.org/3.0/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit
     17Find more details in `Questions and answers`_.
    4618
    4719
    48 Dropped support for Python 2.4 and 2.5
    49 --------------------------------------
     20Time shift handling
     21-------------------
    5022
    51 To make it easier to use the same code for Python 2 and 3, I decided
    52 to use the Python 3 features backported to Python 2.6. As a
    53 consequence, ftputil 3.0 doesn't work with Python 2.4 and 2.5.
     23ftputil uses the notion of "time shift" to deal with time zone
     24differences between client and server. This is important for
     25the methods ``upload_if_newer`` and ``download_if_newer``.
     26
     27The definition of "time shift" changed from earlier ftputil versions to
     28ftputil 4.0.0.
     29
     30Previously, the time shift was defined as
     31
     32  time_used_by_server - local_time_used_by_client
     33
     34The new definition is
     35
     36  time_used_by_server - `UTC`_
     37
     38.. _`UTC`: https://en.wikipedia.org/wiki/Coordinated_Universal_Time
     39
     40Both definitions have their pros and cons (detailed in the `Questions
     41and answers`_).
     42
     43Find more details in `Porting to ftputil 4.0.0`_.
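
As an illustration of the two definitions (a minimal sketch, not ftputil code: the function names and the fixed example offsets are assumptions made up for this example), consider a server running at UTC+2 and a client at UTC+1:

```python
import datetime

UTC = datetime.timezone.utc


def old_time_shift(server_tz, client_tz, moment):
    """Old definition: time_used_by_server - local_time_used_by_client, in seconds."""
    server_offset = moment.astimezone(server_tz).utcoffset()
    client_offset = moment.astimezone(client_tz).utcoffset()
    return (server_offset - client_offset).total_seconds()


def new_time_shift(server_tz, moment):
    """New definition: time_used_by_server - UTC, in seconds."""
    return moment.astimezone(server_tz).utcoffset().total_seconds()


# Hypothetical example values: server at UTC+2, client at UTC+1.
moment = datetime.datetime(2020, 6, 1, 12, 0, tzinfo=UTC)
server_tz = datetime.timezone(datetime.timedelta(hours=2))
client_tz = datetime.timezone(datetime.timedelta(hours=1))

print(old_time_shift(server_tz, client_tz, moment))  # 3600.0
print(new_time_shift(server_tz, moment))             # 7200.0
```

With the new definition, the value no longer depends on the client's timezone, only on the server's offset from UTC.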
    5444
    5545
    56 Newlines and encoding of remote file content
    57 --------------------------------------------
     46ftputil no longer uses the ``-a`` option by default
     47---------------------------------------------------
    5848
    59 Traditionally, "text mode" for FTP transfers meant translation to
    60 ``\r\n`` newlines, even between transfers of Unix clients and Unix
    61 servers. Since this presumably most of the time is neither the expected
    62 nor the desired behavior, the ``FTPHost.open`` method now has the API
    63 and semantics of the built-in ``open`` function in Python 3. If you
    64 want the same API for *local* files in Python 2.6 and 2.7, you can use
    65 the ``open`` function from the ``io`` module.
     49Earlier ftputil versions by default sent an ``-a`` option with the
     50FTP ``DIR`` command to include "hidden" directories and files (names
     51starting with a dot) in the listing.
    6652
    67 Thus, when opening remote files in *binary* mode, the new API does
    68 *not* accept an encoding argument. On the other hand, opening a file
    69 in text mode always implies an encoding step when writing and decoding
    70 step when reading files. If the ``encoding`` argument isn't specified,
    71 it defaults to the value of ``locale.getpreferredencoding(False)``.
     53That led to problems when the server `didn't understand the option`_
     54and treated it as a directory or file name.
    7255
    73 Also as with Python 3's ``open`` builtin, opening a file in binary
    74 mode for reading will give you byte string data. If you write to a
    75 file opened in binary mode, you must write byte strings. Along the
    76 same lines, files opened in text mode will give you unicode strings
    77 when read, and require unicode strings to be passed to write
    78 operations.
     56.. _`didn't understand the option`: https://ftputil.sschwarzer.net/trac/ticket/110
     57
     58Therefore, ftputil no longer uses the ``-a`` option by default, but
     59you can enable it by setting ``use_list_a_option`` on the
     60``FTPHost`` instance to ``True``::
     61
     62    with ftputil.FTPHost(host, user, password) as ftp_host:
     63        ftp_host.use_list_a_option = True
     64        ...
     65
     66However, do this only if you're *sure* the server interprets the
     67option correctly!
    7968
    8069
    81 Module and method name changes
    82 ------------------------------
     70Porting to ftputil 4.0.0
     71------------------------
    8372
    84 In earlier ftputil versions, most module names had a redundant
    85 ``ftp_`` prefix. In ftputil 3.0, these prefixes are removed. Of the
    86 module names that are part of the public ftputil API, this affects
    87 only ``ftputil.error`` and ``ftputil.stat``.
     73Using the correct time shift value
     74~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    8875
    89 In Python 2.2, ``file`` became an alias for ``open``, and previous
    90 ftputil versions also had an ``FTPHost.file`` besides the
    91 ``FTPHost.open`` method. In Python 3.0, the ``file`` builtin was
    92 removed and the return values from the built-in ``open`` methods
    93 are no longer ``file`` instances. Along the same lines, ftputil 3.0
    94 also drops the ``FTPHost.file`` alias and requires ``FTPHost.open``.
     76As said above, the time shift value is now defined as
     77
     78  time_used_by_server - `UTC`_
     79
     80If you don't use any functionality that deals with timestamps from the FTP
     81server, you don't need to change anything. Affected are the methods
     82``upload_if_newer`` and ``download_if_newer`` as well as any use of the
     83timestamp values returned by the ``stat`` and ``lstat`` methods.
     84
     85Ideally, you have write access on the server in the current directory.
     86In this case you can call ``synchronize_times``::
     87
     88    with ftputil.FTPHost(host, user, password) as ftp_host:
     89        ftp_host.synchronize_times()
     90        ...
     91
     92If using ``synchronize_times`` isn't an option, you have to set the
     93time shift explicitly with ``set_time_shift``::
     94
     95    with ftputil.FTPHost(host, user, password) as ftp_host:
     96        ftp_host.set_time_shift(new_time_shift)
     97        ...
     98
     99*If* you're sure the server uses the same timezone as the client,
     100you can use
     101
     102::
     103
     104    with ftputil.FTPHost(host, user, password) as ftp_host:
     105        ftp_host.set_time_shift(
     106          round((datetime.datetime.now() - datetime.datetime.utcnow()).total_seconds(),
     107                -2)
     108        )
     109        ...
     110
     111This is roughly equivalent to the old ftputil behavior. The only
     112difference is that the new behavior requires that you adapt the time
     113shift value if there's a switch to or from daylight saving time.
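
When computing the client's UTC offset by hand, note that the ``seconds`` attribute of a ``timedelta`` is never negative, so it gives a misleading value for timezones west of UTC, whereas ``total_seconds()`` keeps the sign. The following is a minimal sketch using only the standard library. The helper name ``local_utc_offset_seconds`` is an assumption for this example and not part of ftputil:

```python
import datetime


def local_utc_offset_seconds():
    """Return the client's current offset from UTC in seconds (may be negative)."""
    # An aware "now" converted to the local timezone carries the UTC offset.
    local_now = datetime.datetime.now(datetime.timezone.utc).astimezone()
    return local_now.utcoffset().total_seconds()


# Why `.seconds` is a pitfall: a negative timedelta is normalized to
# days=-1 plus a positive number of seconds.
west_of_utc = datetime.timedelta(hours=-5)
print(west_of_utc.seconds)          # 68400
print(west_of_utc.total_seconds())  # -18000.0

print(local_utc_offset_seconds())
```

If the server is known to use the same timezone as the client, the value returned by such a helper could be passed to ``set_time_shift``. As noted above, it has to be recomputed after a switch to or from daylight saving time.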
    95114
    96115
    97 Upload and download modes
    98 -------------------------
     116Using the ``-a`` option for ``DIR`` commands
     117~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    99118
    100 The ``FTPHost`` methods for downloading and uploading files
    101 (``download``, ``download_if_newer``, ``upload`` and
    102 ``upload_if_newer``) now always use binary mode; a ``mode`` argument
    103 is no longer needed or even allowed. Although this behavior makes
    104 downloads and uploads slightly less flexible, it should cover almost
    105 all use cases.
     119If you need to list hidden directories and files (names starting
     120with a dot) *and* are sure the server understands the ``-a`` option
     121for ``DIR`` commands, you now need to tell the ``FTPHost`` instance
     122explicitly to use the option::
    106123
    107 If you *really* want to do a transfer involving files opened in text
    108 mode, you can still do::
    109 
    110     import ftputil.file_transfer
    111 
    112     ...
    113 
    114     with FTPHost.open("source.txt", "r", encoding="UTF-8") as source, \
    115          FTPHost.open("target.txt", "w", encoding="latin1") as target:
    116         ftputil.file_transfer.copyfileobj(source, target)
    117 
    118 Note that it's not possible anymore to open one file in binary
    119 mode and the other file in text mode and transfer data between
    120 them with ``copyfileobj``. For example, opening the source in
    121 binary mode will read byte strings, but a target file opened in
    122 text mode will only allow writing of unicode strings. Then again,
    123 I assume that the cases where you want a mixed binary/text mode
    124 transfer should be *very* rare.
    125 
    126 
    127 Custom parsers receive lines as unicode strings
    128 -----------------------------------------------
    129 
    130 Custom parsers, as described in the documentation_, receive a text
    131 line for each directory entry in the methods ``ignores_line`` and
    132 ``parse_line``. In previous ftputil versions, the ``line`` arguments
    133 were byte strings; now they're unicode strings.
    134 
    135 .. _documentation: http://ftputil.sschwarzer.net/documentation#writing-directory-parsers
    136 
    137 If you aren't sure what this is about, this may help: If you never
    138 used the ``FTPHost.set_parser`` method, you can ignore this section.
    139 :-)
    140 
    141 
    142 Porting to ftputil 3.0
    143 ----------------------
    144 
    145 - It's likely that you catch an ftputil exception here and there.
    146   In that case, you need to change ``import ftputil.ftp_error``
    147   to ``import ftputil.error`` and modify the uses of the module
    148   accordingly. If you used ``from ftputil import ftp_error``, you can
    149   change this to ``from ftputil import error as ftp_error`` without
    150   changing the code using the module.
    151 
    152 - If you use the download or upload methods, you need to remove
    153   the ``mode`` argument from the call. If you used something
    154   else than ``"b"`` for binary mode (which I assume to be unlikely),
    155   you'll need to adapt the code that calls the download or upload
    156   methods.
    157 
    158 - If you use custom parsers, you'll need to change ``import
    159   ftputil.ftp_stat`` to ``import ftputil.stat`` and adapt your code in
    160   the module. Moreover, you might need to change your ``ignores_line``
    161   or ``parse_line`` calls if they rely on their ``line`` argument
    162   being a byte string.
    163 
    164 - If you use remote files, especially ones opened in text mode, you
    165   may need to change your code to adapt to the changes in newline
    166   conversion, encoding and/or string type (see above sections).
    167 
    168 .. Note::
    169 
    170     In the root directory of the installed ftputil package is a script
    171     ``find_invalid_code.py`` which, given a start directory as
    172     argument, will scan that directory tree for code that may need to
    173     be fixed. However, this script uses very simple heuristics, so it
    174     may miss some problematic code or list perfectly valid code.
    175 
    176     In particular, you may want to change the regular expression
    177     string ``HOST_REGEX`` for the names you usually use for
    178     ``FTPHost`` objects.
     124    with ftputil.FTPHost(host, user, password) as ftp_host:
     125        ftp_host.use_list_a_option = True
     126        ...
    179127
    180128
     
    182130---------------------
    183131
    184 The advice to "adapt code to the new string types" is rather vague. Can't you be more specific?
    185 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     132Why is Python 2 no longer supported?
     133~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    186134
    187 It's difficult to be more specific without knowing your application.
    188 
    189 That said, best practices nowadays are:
    190 
    191 - If you're dealing with character data, use unicode strings whenever
    192   possible. In Python 2, this means the ``unicode`` type and in Python
    193   3 the ``str`` type.
    194 
    195 - Whenever you deal with binary data which is actually character data,
    196   decode it as *soon* as possible when *reading* data. Encode the data
    197   as *late* as possible when *writing* data.
    198 
    199 Yes, I know that's not much more specific.
     135Python 2 is officially no longer maintained and supporting it in
     136combination with Python 3 led to lots of extra work. Therefore, I
     137decided to drop support for Python 2.
    200138
    201139
    202 Why don't you use a "Python 2 API" for Python 2 and a "Python 3 API" for Python 3?
    203 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     140Why is the minimum version Python 3.6 and not 3.5?
     141~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    204142
    205 (What's meant here is, for example, that if you opened a remote file
    206 as text, the read data could be of byte string type in Python 2 and of
    207 unicode type in Python 3. Similarly, under Python 2 a text file opened
    208 for writing could accept both byte strings and unicode strings in the
    209 ``write*`` methods.)
     143My plan *was* to support Python 3.5 since it's not yet end-of-life'd
     144and it's used in some LTS Linux/Unix distributions. However, dropping
     145Python 3.5 support made it much easier to implement `ticket #119`_,
     146support for `path-like objects`_. Python 3.6 introduced some
     147infrastructure so that code that used to use only ``str`` and
     148``bytes`` paths can now use path-like objects as well. I considered it
     149more important to support path-like objects than Python 3.5. I guess
     150it might have been possible to add support for path-like objects on
     151top of Python 3.5, but it would have been a hassle and Python 3.5
     152support officially ends in `just a few months`_.
    210153
    211 Actually, I had at first thought of implementing this but dropped the
    212 idea because it has several problems:
    213 
    214 - Basically, I would have to support two APIs for the same set of
    215   methods. I can imagine that some things can be simplified by just
    216   using ``str`` to convert to the "right" string type automatically,
    217   but I assume these opportunities would be rather the exception than
    218   the rule. I'd certainly not look forward to maintaining such code.
    219 
    220 - Using two different APIs might require people to change their code
    221   if they move from using ftputil 3.x in Python 2 to using it in
    222   Python 3.
    223 
    224 - Developers who want to support both Python 2 and 3 with the same
    225   source code (as I do now in ftputil) would "inherit" the "dual API"
    226   and would have to use different wrapper code depending on the Python
    227   version their code is run under.
    228 
    229 For these reasons, I `ended up`_ choosing the same API semantics for
    230 Python 2 and 3.
    231 
    232 .. _`ended up`: https://groups.google.com/forum/?fromgroups=#!topic/comp.lang.python/XKof6DpNyH4
    233 
    234 Why don't you use the six_ module to be able to support Python 2.4 and 2.5?
    235 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    236 
    237 .. _six: https://pypi.python.org/pypi/six/
    238 
    239 There are two reasons:
    240 
    241 - ftputil so far has no dependencies other than the Python standard
    242   library, and I think that's a nice feature.
    243 
    244 - Although ``six`` makes it easier to support Python 2.4/2.5 and
    245   Python 3 at the same time, the resulting code is somewhat awkward. I
    246   wanted a code base that feels more like "modern Python"; I wanted to
    247   use the Python 3 features backported to Python 2.6 and 2.7.
    248 
    249 Why don't you use 2to3_ to generate the Python 3 version of ftputil?
    250 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    251 
    252 .. _2to3: http://docs.python.org/2/library/2to3.html
    253 
    254 I had considered this when I started adapting the ftputil source code
    255 for Python 3. On the other hand, although using 2to3 used to be the
    256 recommended approach for Python 3 support, even `rather large
    257 projects`_ have chosen the route of having one code base and using it
    258 unmodified for Python 2 and 3.
    259 
    260 .. _`rather large projects`: https://docs.djangoproject.com/en/dev/topics/python3/
    261 
    262 When I looked into this approach for ftputil 3.0, it became quickly
    263 obvious that it would be easier and I found it worked out very well.
     154.. _`ticket #119`: https://ftputil.sschwarzer.net/trac/ticket/119
     155.. _`path-like objects`: https://docs.python.org/3/library/os.html#os.PathLike
     156.. _`just a few months`: https://devguide.python.org/#status-of-python-branches
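
To illustrate the Python 3.6 infrastructure mentioned above: ``os.fspath`` converts ``str``, ``bytes`` and any object implementing ``__fspath__`` to a plain path. This is a minimal sketch of the standard library protocol only. The class ``RemotePath`` is hypothetical, and the example makes no claim about how ftputil uses the protocol internally:

```python
import os
import pathlib


class RemotePath:
    """Hypothetical path-like class; any object with `__fspath__` qualifies."""

    def __init__(self, path):
        self._path = path

    def __fspath__(self):
        return self._path


# `str` and `bytes` pass through unchanged.
print(os.fspath("dir/file.txt"))                         # dir/file.txt
print(os.fspath(b"dir/file.txt"))                        # b'dir/file.txt'
# `pathlib` paths and custom path-like objects are converted.
print(os.fspath(pathlib.PurePosixPath("dir/file.txt")))  # dir/file.txt
print(os.fspath(RemotePath("/pub/readme")))              # /pub/readme
print(isinstance(RemotePath("/pub"), os.PathLike))       # True
```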