Changeset 1526:9218cbe591b0


Timestamp:
Jun 8, 2014, 11:26:20 PM (5 years ago)
Author:
Stefan Schwarzer <sschwarzer@…>
Branch:
default
Message:
Added a section on child sessions/connections.

This should explain why calling `FTPHost.keep_alive` may not be
enough if there are open remote files.
File:
1 edited

  • doc/ftputil.txt

    r1525 r1526  
    10501050
    10511051  Note that the ``keep_alive`` method does *not* affect the "hidden"
    1052   FTP connections established by ``FTPHost.open`` (see below). You
    1053   *can't* use ``keep_alive`` to avoid a timeout in a stalling transfer
    1054   like this::
     1052  FTP child connections established by ``FTPHost.open`` (see section
     1053  `FTPHost instances vs. FTP connections`_ for details). You *can't*
     1054  use ``keep_alive`` to avoid a timeout in a stalling transfer like
     1055  this::
    10551056
    10561057      with ftputil.FTPHost(server, userid, password) as ftp_host:
     
    11561157line::
    11571158
    1158     with ftputil.FTPHost(...) as ftp_host:
     1159    with ftputil.FTPHost(server, user, password) as ftp_host:
    11591160        with ftp_host.open("some_file") as input_file:
    11601161            for line in input_file:
     
    11661167
    11671168.. _`file objects`: https://docs.python.org/2.7/library/stdtypes.html#file-objects
     1169
     1170
     1171.. _`child_connections`:
     1172
     1173``FTPHost`` instances vs. FTP connections
     1174-----------------------------------------
     1175
     1176This section explains why keeping an ``FTPHost`` instance "alive"
     1177without timing out sometimes isn't trivial. If you always finish your
     1178FTP operations in time, you don't need to read this section.
     1179
     1180The file transfer protocol is a stateful protocol. That means an FTP
     1181connection is always in a certain state. Each of these states can only
     1182change to certain other states under certain conditions triggered by
     1183the client or the server.
     1184
     1185One of the consequences is that a single FTP connection can't be used
     1186at the same time, say, to transfer data on the FTP data channel and to
     1187create a directory on the remote host.
     1188
     1189For example, consider this::
     1190
     1191    >>> import ftplib
     1192    >>> ftp = ftplib.FTP(server, user, password)
     1193    >>> ftp.pwd()
     1194    '/'
     1195    >>> # Start transfer. `CONTENTS` is a text file on the server.
     1196    >>> socket = ftp.transfercmd("RETR CONTENTS")
     1197    >>> socket
     1198    <socket._socketobject object at 0x7f801a6386e0>
     1199    >>> ftp.pwd()
     1200    Traceback (most recent call last):
     1201      File "<stdin>", line 1, in <module>
     1202      File "/usr/lib64/python2.7/ftplib.py", line 578, in pwd
     1203        return parse257(resp)
     1204      File "/usr/lib64/python2.7/ftplib.py", line 842, in parse257
     1205        raise error_reply, resp
     1206    ftplib.error_reply: 226-File successfully transferred
     1207    226 0.000 seconds (measured here), 5.60 Mbytes per second
     1208    >>>
     1209
     1210Note that ``ftp`` is a single FTP connection, represented by an
     1211``ftplib.FTP`` instance, not an ``ftputil.FTPHost`` instance.
     1212
     1213On the other hand, consider this::
     1214
     1215    >>> import ftputil
     1216    >>> ftp_host = ftputil.FTPHost(server, user, password)
     1217    >>> ftp_host.getcwd()
     1218    >>> fobj = ftp_host.open("CONTENTS")
     1219    >>> fobj
     1220    <ftputil.file.FTPFile object at 0x7f8019d3aa50>
     1221    >>> ftp_host.getcwd()
     1222    u'/'
     1223    >>> fobj.readline()
     1224    u'Contents of FTP test directory\n'
     1225    >>> fobj.close()
     1226    >>>
     1227
     1228To be able to start a file transfer (i. e. open a remote file for
     1229reading or writing) and still be able to use other FTP commands,
     1230ftputil uses a trick. For every remote file, ftputil creates a new FTP
     1231connection, called a child connection in the ftputil source code.
     1232(Actually, FTP connections belonging to closed remote files are
     1233re-used if they haven't timed out yet.)
     1234
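The child-connection idea can be illustrated with a toy model. This is only a sketch of the concept described above, not ftputil's actual implementation: the class `ToyHost`, its methods and the `connect` factory are hypothetical names, and the sketch leaves out the check whether a closed child connection has timed out.

```python
class ToyHost:
    """Toy model of child connections; not ftputil's real code."""

    def __init__(self, connect):
        # `connect` is a hypothetical factory returning a new connection.
        self._connect = connect
        # The "main" connection, used for directory operations.
        self.main = connect()
        # One entry per connection created for remote files.
        self._children = []

    def open_file(self):
        # Reuse a child connection whose remote file was closed.
        for child in self._children:
            if not child["in_use"]:
                child["in_use"] = True
                return child
        # Otherwise create a new child connection for this remote file.
        child = {"connection": self._connect(), "in_use": True}
        self._children.append(child)
        return child

    def close_file(self, child):
        # Mark the child connection as available for reuse.
        child["in_use"] = False

    def connection_count(self):
        # Main connection plus all child connections.
        return 1 + len(self._children)
```

In this model, opening two files at once needs three connections in total, while closing one file and opening another keeps the count at three.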
     1235In most cases this approach isn't noticeable by code using ftputil.
     1236However, the nice abstraction of dealing with a single FTP connection
     1237falls apart if one of the child connections times out. For example, if
     1238you open a remote file and work only with the initial "main"
     1239connection to navigate the file system, the FTP connection for the
     1240remote file may eventually time out.
     1241
     1242While it's often relatively easy to prevent the "main" connection from
     1243timing out, it's unfortunately practically impossible to do this for a
     1244remote file connection (apart from transferring some data, of course).
     1245For this reason, `FTPHost.keep_alive`_ affects only the main
     1246connection. Child connections may still time out if they're idle for
     1247too long.
     1248
     1249.. _`FTPHost.keep_alive`: `keep_alive`_
     1250
     1251Some more details:
     1252
     1253- A kind of "straightforward" way of keeping the main connection alive
     1254  would be to call ``ftp_host.getcwd()``. However, this doesn't work
     1255  because ftputil caches the current directory and returns it without
     1256  actually contacting the server. That's the main reason why there's
     1257  a ``keep_alive`` method: it calls ``pwd`` on the FTP connection
     1258  (i. e. the session object), which isn't a public attribute.
     1259
     1260- Some servers define not only an idle timeout but also a transfer
     1261  timeout. This means the connection times out unless there's some
     1262  transfer on the data channel for this connection. So ftputil's
     1263  ``keep_alive`` doesn't prevent this timeout, but an
     1264  ``ftp_host.listdir(ftp_host.curdir)`` call should do it. However,
     1265  this transfers the data for the whole directory listing which might
     1266  take some time if the directory has many entries.
     1267
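The two points above could be combined in a small helper like the following. This is a minimal sketch: `refresh_main_connection` and its `with_transfer` parameter are hypothetical names, and whether a directory listing actually resets a transfer timeout depends on the server. The helper affects only the main connection, never the child connections of open remote files.

```python
def refresh_main_connection(ftp_host, with_transfer=False):
    """Try to reset timeouts on the main connection of `ftp_host`.

    Hypothetical helper; `ftp_host` is expected to behave like an
    `ftputil.FTPHost` instance.
    """
    if with_transfer:
        # A directory listing is sent over the data channel, so it
        # should also count as a transfer. It may be slow for
        # directories with many entries.
        return ftp_host.listdir(ftp_host.curdir)
    # Only contacts the server on the control channel.
    ftp_host.keep_alive()
    return None
```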
     1268Bottom line: If you can, you should organize your FTP actions so that
     1269you finish everything before a timeout happens.
    11681270
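One way to follow this advice is to keep each remote file open only for as long as the transfer takes and to do any slow processing afterwards. The sketch below assumes that the files are small enough to hold in memory; `fetch_texts` is a hypothetical helper, not part of ftputil.

```python
def fetch_texts(ftp_host, names):
    """Read each remote file completely and close it right away.

    Hypothetical helper; `ftp_host` is expected to behave like an
    `ftputil.FTPHost` instance.
    """
    contents = {}
    for name in names:
        # The child connection is busy only while `read` runs, so it
        # has little chance to sit idle and time out.
        with ftp_host.open(name) as remote_file:
            contents[name] = remote_file.read()
    return contents
```

Time-consuming work on the returned dictionary then happens while no remote file is open.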
    11691271