ftputil - a high-level FTP client library for Python

Version: 2.0.3
Date: 2004-07-29
Summary: high-level FTP client library for Python
Keywords: FTP, ftplib substitute, virtual filesystem
Author: Stefan Schwarzer <sschwarzer@sschwarzer.net>

Introduction

The ftputil module is a high-level interface to the ftplib module. The FTPHost objects generated from it allow many operations similar to those of os and os.path.

Examples:

import ftputil

# download some files from the login directory
host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
names = host.listdir(host.curdir)
for name in names:
    if host.path.isfile(name):
        host.download(name, name, 'b')  # remote, local, binary mode

# make a new directory and copy a remote file into it
host.mkdir('newdir')
source = host.file('index.html', 'r')  # file-like object
target = host.file('newdir/index.html', 'w')  # file-like object
host.copyfileobj(source, target)  # similar to shutil.copyfileobj
source.close()
target.close()

Also, there are FTPHost.lstat and FTPHost.stat to request size and modification time of a file. The latter can also follow links, similar to os.stat. Even FTPHost.path.walk works.

The distribution contains a custom UserTuple module to provide stat results with Python versions 2.0 and 2.1.

Exception hierarchy

The exceptions are in the namespace of the ftputil package (e. g. ftputil.TemporaryError). They are organized as follows:

FTPError
    FTPOSError(FTPError, OSError)
        TemporaryError(FTPOSError)
        PermanentError(FTPOSError)
        ParserError(FTPOSError)
    FTPIOError(FTPError)
    InternalError(FTPError)
        RootDirError(InternalError)
        InaccessibleLoginDirError(InternalError)
    TimeShiftError(FTPError)

and are described here:

  • FTPError

    is the root of the exception hierarchy of the module.

  • FTPOSError

    is derived from OSError. This is for similarity between the os module and FTPHost objects. Compare

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...
    

    with

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...
    

    Imagine a function

    def func(path, file):
        ...
    

    which works on the local file system and catches OSErrors. If you change the parameter list to

    def func(path, file, os=os):
        ...
    

    where os denotes the os module, you can call the function also as

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)
    

    to use the same code for a local and remote file system. Another similarity between OSError and FTPOSError is that the latter holds the FTP server return code in the errno attribute of the exception object and the error text in strerror.

  • TemporaryError

    is raised for FTP return codes from the 4xx category. This corresponds to ftplib.error_temp (though TemporaryError and ftplib.error_temp are not identical).

  • PermanentError

    is raised for 5xx return codes from the FTP server (again, that's similar but not identical to ftplib.error_perm).

  • ParserError

    is used for errors during the parsing of directory listings from the server. This exception is used by the FTPHost methods stat, lstat, and listdir.

  • FTPIOError

    denotes an I/O error on the remote host. This appears mainly with file-like objects which are retrieved by invoking FTPHost.file (FTPHost.open is an alias). Compare

    >>> try:
    ...     f = open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory
    

    with

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 notthere: No such file or directory.
    

    As you can see, both code snippets are similar. (However, the error codes aren't the same.)

  • InternalError

    subsumes exception classes for signaling errors due to limitations of the FTP protocol or the concrete implementation of ftputil.

  • RootDirError

    Because of the implementation of the lstat method it is not possible to do a stat call on the root directory /. If you know any way to do it, please let me know. :-)

  • InaccessibleLoginDirError

    This exception is only raised if both of the following conditions are met:

    • The directory in which "you" are placed upon login is not accessible, i. e. a chdir call fails.
    • You try to access a path which contains whitespace.
  • TimeShiftError

    is used to denote errors which relate to setting the time shift, e. g. trying to set a value which is not a multiple of a full hour.
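
The practical effect of deriving FTPOSError from OSError can be sketched without a server. The classes below are stand-ins that only mirror the inheritance shown in the hierarchy above; they are not ftputil's own classes, and FakeHost and try_chdir are hypothetical names introduced for illustration.

```python
import os


# Stand-in classes mirroring part of the documented hierarchy.
# They are NOT ftputil's classes, only an illustration of the inheritance.
class FTPError(Exception):
    pass


class FTPOSError(FTPError, OSError):
    pass


class PermanentError(FTPOSError):
    pass


class FakeHost:
    """Hypothetical os-like object whose chdir always fails with a 5xx code."""

    def chdir(self, path):
        raise PermanentError(550, "550 %s: No such file or directory." % path)


def try_chdir(path, os=os):
    """Return None on success, else (errno, strerror) of the failure.

    `os` may be the os module or any object with a compatible chdir.
    """
    try:
        os.chdir(path)
    except OSError as exc:
        return exc.errno, exc.strerror
    return None
```

Because the stand-in PermanentError is also an OSError, the single except clause in try_chdir handles both the local and the simulated remote case.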

FTPHost objects

Construction

FTPHost instances may be generated with the following call:

host = ftputil.FTPHost(host, user, password, account,
                       session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the FTP class in the ftplib module. The keyword argument session_factory may be used to generate FTP connections with other factories than the default ftplib.FTP. For example, the M2Crypto distribution uses a secure FTP class which is derived from ftplib.FTP.

In fact, all positional and keyword arguments other than session_factory are passed to the factory to generate a new background session (which happens for every remote file that is opened; see below).

This behavior of the constructor also makes it possible to wrap ftplib.FTP objects to do something that wouldn't be possible with the ftplib.FTP constructor alone.

As an example, assume you want to connect to a port other than the default. ftplib.FTP offers this only through its connect method, not through its constructor. The solution is to provide a wrapper class:

import ftplib
import ftputil

EXAMPLE_PORT = 50001

class MySession(ftplib.FTP):
    def __init__(self, host, userid, password, port):
        """Act like ftplib.FTP's constructor but connect to other port."""
        ftplib.FTP.__init__(self)
        self.connect(host, port)
        self.login(userid, password)

# don't pass an instance (MySession()) as the factory; pass the class itself
host = ftputil.FTPHost(host, userid, password,
                       port=EXAMPLE_PORT, session_factory=MySession)
# use `host` as usual

On login, the format of the directory listings (needed for stat'ing files and directories) should be determined automatically. If not, you may use the method set_directory_format to set the format "manually".

FTPHost attributes and methods

Attributes

  • curdir, pardir, sep

    are strings which denote the current and the parent directory on the remote server. sep identifies the path separator. Though RFC 959 (File Transfer Protocol) notes that these values may be server dependent, the Unix counterparts seem to work well in practice, even for non-Unix servers.

Time zone correction

  • set_time_shift(time_shift)

    sets the so-called time shift value (measured in seconds). The time shift is the difference between the local time of the server and the local time of the client at a given moment, i. e. by definition

    time_shift = server_time - client_time
    

    Setting this value is important if upload_if_newer and download_if_newer should work correctly even if the time zone of the FTP server differs from that of the client (where ftputil runs). Note that the time shift value can be negative.

    If the time shift value is invalid, e. g. not a multiple of a full hour, or with an absolute (unsigned) value larger than 24 hours, a TimeShiftError is raised.

    See also synchronize_times for a way to set the time shift with a simple method call.

  • time_shift()

    returns the currently set time shift value. See set_time_shift (above) for its definition.

  • synchronize_times()

    synchronizes the local times of the server and the client, so that upload_if_newer and download_if_newer work as expected, even if the client and the server are in different time zones. For this to work, all of the following conditions must be true:

    • The connection between server and client is established.
    • The client has write access to the directory that is current when synchronize_times is called.
    • That directory is not the root directory (i. e. /) of the FTP server.

    If you can't fulfill these conditions, you can nevertheless set the time shift value manually with set_time_shift. Trying to call synchronize_times if the above conditions aren't true results in a TimeShiftError exception.
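
As a rough illustration of the definition time_shift = server_time - client_time and of the validity rules above, the following sketch derives a whole-hour value from two clock readings. The helper name rounded_time_shift is hypothetical, rounding to the nearest hour is an assumption about how one might obtain a valid value by hand, and ValueError stands in for ftputil's TimeShiftError.

```python
HOUR = 3600  # seconds


def rounded_time_shift(server_time, client_time):
    """Return server_time - client_time, rounded to the nearest full hour.

    Both arguments are timestamps in seconds taken at the same moment.
    Raises ValueError (a stand-in for TimeShiftError) if the rounded
    value is larger than 24 hours in either direction.
    """
    raw_shift = server_time - client_time
    shift = int(round(raw_shift / float(HOUR))) * HOUR
    if abs(shift) > 24 * HOUR:
        raise ValueError("time shift of %d seconds exceeds 24 hours" % shift)
    return shift
```

A value computed this way could then be handed to set_time_shift when synchronize_times can't be used.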

Files and directories

  • file(path, mode='r')

    returns a file-like object that is connected to the path on the remote host. This path may be absolute or relative to the current directory on the remote host (this directory can be determined with the getcwd method). As with local file objects, the default mode is "r", i. e. reading text files. Valid modes are "r", "rb", "w", and "wb".

  • open(path, mode='r')

    is an alias for file (see above).

  • copyfileobj(source, target, length=64*1024)

    copies the contents from the file-like object source to the file-like object target. The only difference to shutil.copyfileobj is the default buffer size.

  • close()

    closes the connection to the remote host. After this, no more interaction with the FTP server is possible without using a new FTPHost object.

  • getcwd()

    returns the absolute current directory on the remote host. This method acts similarly to os.getcwd.

  • chdir(directory)

    sets the current directory on the FTP server. This resembles os.chdir, as you may have expected. :-)

  • mkdir(path, [mode])

    makes the given directory on the remote host. In the current implementation, this doesn't construct "intermediate" directories which don't already exist. The mode parameter is ignored; it is accepted for compatibility with os.mkdir, so that an FTPHost object can be passed into a function instead of the os module (see the description of FTPOSError in the section on the exception hierarchy above for an explanation).

  • rmdir(path)

    removes the given remote directory.

    In previous versions of ftputil, it depended on the remote server whether non-empty directories could be deleted. By default, ftputil version 2.0 and up only allow empty directories to be deleted.

    If you want to update to ftputil 2.0 with minimal changes to your source code, pass in an additional parameter _remove_only_empty=False. Note that this is deprecated and will probably be unsupported in future ftputil versions.

  • remove(path)

    removes a file on the remote host (similar to os.remove).

  • unlink(path)

    is an alias for remove.

  • rename(source, target)

    renames the source file (or directory) on the FTP server.

  • listdir(path)

    returns a list containing the names of the files and directories in the given path; similar to os.listdir.
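
The copyfileobj method described above works on any pair of file-like objects. A minimal sketch of such a chunked copy loop follows; copy_in_chunks is a hypothetical name, the loop is not taken from ftputil's source, and in-memory buffers stand in for remote files.

```python
import io


def copy_in_chunks(source, target, length=64 * 1024):
    """Copy everything from file-like `source` to file-like `target`.

    Data is read in pieces of at most `length` bytes, so large files
    never have to fit into memory at once.
    """
    while True:
        chunk = source.read(length)
        if not chunk:  # empty result signals end of file
            break
        target.write(chunk)


# In-memory stand-ins for a remote source and target file.
payload = b"x" * (150 * 1024 + 7)
source = io.BytesIO(payload)
target = io.BytesIO()
copy_in_chunks(source, target)
```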

Uploading and downloading files

  • upload(source, target, mode='')

    copies a local source file (given by a filename, i. e. a string) to the remote host under the name target. Both source and target may be absolute paths or relative to their corresponding current directory (on the local or the remote host, respectively). The mode may be "" or "a" for ASCII uploads or "b" for binary uploads. ASCII mode is the default (again, similar to regular local file objects).

  • download(source, target, mode='')

    performs a download from the remote source to a target file. Both source and target are strings. Additionally, the description of the upload method applies here, too.

  • upload_if_newer(source, target, mode='')

    is similar to the upload method. The only difference is that the upload is only invoked if the time of the last modification for the source file is more recent than that of the target file, or the target doesn't exist at all. If an upload actually happened, the return value is a true value, else a false value.

    Note that this method only checks the existence and/or the modification time of the source and target file; it can't recognize a change in the transfer mode, e. g.

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again
    host.upload_if_newer('source_file', 'target_file', 'b')
    

    Similarly, if a transfer is interrupted, the remote file will have a newer modification time than the local file, and thus the transfer won't be repeated if upload_if_newer is used a second time. There are (at least) two possibilities after a failed upload:

    • use upload instead of upload_if_newer, or
    • remove the incomplete target file with FTPHost.remove, then use upload or upload_if_newer to transfer it again.

    If it seems that a file is uploaded unnecessarily, read the subsection on time shift settings.

  • download_if_newer(source, target, mode='')

    corresponds to upload_if_newer but performs a download from the server to the local host. Read the descriptions of download and upload_if_newer for more. If a download actually happened, the return value is a true value, else a false value.

    If it seems that a file is downloaded unnecessarily, read the subsection on time shift settings.
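
The decision rule described for upload_if_newer can be sketched as a small function. This is an illustration only, not ftputil's implementation: should_upload is a hypothetical name, and converting the server timestamp to client time by subtracting the time shift is an assumption that follows from the definition time_shift = server_time - client_time.

```python
def should_upload(local_mtime, remote_mtime, time_shift=0):
    """Decide whether a conditional upload would transfer the file.

    local_mtime  -- modification time of the local source (client clock)
    remote_mtime -- modification time of the remote target (server clock),
                    or None if the target doesn't exist
    time_shift   -- server_time - client_time in seconds
    """
    if remote_mtime is None:
        # A missing target is always uploaded.
        return True
    # Assumption: express the server timestamp in client time first.
    remote_in_client_time = remote_mtime - time_shift
    return local_mtime > remote_in_client_time
```

With a server one hour ahead, a local file that is in fact newer only compares as newer once the shift is taken into account, which is why an unset time shift can lead to skipped or unnecessary transfers.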

Stat'ing files and directories

The methods lstat and stat (and others) rely on the directory listing format used by the FTP server. When connecting to a host, FTPHost's constructor tries to guess the right format, which mostly succeeds. However, if you get strange results (or exceptions), it may be necessary to set the directory format "manually". This is done - immediately after connecting - with

  • set_directory_format(server_format)

    server_format is one of the strings "unix" or "ms". To choose the correct format, you have to start a command line FTP client and request a directory listing (most clients do this with the DIR command).

    If the resulting lines look like

    drwxr-sr-x   2 45854    200           512 Jul 30 17:14 image
    -rw-r--r--   1 45854    200          4604 Jan 19 23:11 index.html
    

    use "unix" as the argument.

    If the output looks like

    12-07-01  02:05PM       <DIR>          XPLaunch
    07-17-00  02:08PM             12266720 digidash.exe
    

    use "ms" as the argument string server_format.

    If none of the above settings help, contact me. It would be very helpful if you could provide an example listing (DIR's output).
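
Choosing between the two format strings by looking at a listing line can also be sketched in code. The patterns below are assumptions derived from the two sample listings above, and guess_directory_format is a hypothetical helper; this is not the detection that FTPHost's constructor performs.

```python
import re

# Pattern for lines like "12-07-01  02:05PM       <DIR>          XPLaunch".
_MS_LINE = re.compile(r"^\d{2}-\d{2}-\d{2}\s+\d{2}:\d{2}[AP]M\s+(<DIR>|\d+)\s+\S")
# Pattern for lines starting with a mode string like "drwxr-sr-x".
_UNIX_LINE = re.compile(r"^[-dlbcps][-rwxsStT]{9}\s")


def guess_directory_format(line):
    """Return "ms", "unix", or None for a single directory listing line."""
    if _MS_LINE.match(line):
        return "ms"
    if _UNIX_LINE.match(line):
        return "unix"
    return None
```

The returned string is meant to be passed to set_directory_format; a result of None means the line matches neither sample.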

If calling lstat or stat yields wrong modification dates or times, look at the methods that deal with time zone differences (time shift).

  • lstat(path)

    returns an object similar to that from os.lstat (a "tuple" with additional attributes; see the documentation of the os module for details). However, due to the nature of the application, there are some important aspects to keep in mind:

    • The result is derived by parsing the output of a DIR command on the server. Therefore, the result from FTPHost.lstat can not contain more information than the received text. In particular:
      • User and group ids can only be determined as strings, not as numbers, and that only if the server supplies them. This is usually the case with Unix servers but may not be for other FTP server programs.
      • Values for the time of the last modification may be rough, depending on the information from the server. For timestamps older than a year, this usually means that the precision of the modification timestamp value is not better than days. For newer files, the information may be accurate to a minute.
      • Links can only be recognized on servers that provide this information in the DIR output.
    • Items that can't be determined at all are set to None.
    • There's a special problem with stat'ing the root directory. In this case, a RootDirError is raised. This has to do with the algorithm used by (l)stat and I know of no approach which solves this problem.

    Currently, ftputil recognizes the MS Robin FTP server. Otherwise, a format commonly used by Unix servers is assumed. If you need to parse output from another server type, please contact me under the email address at the end of this text.

  • stat(path)

    returns stat information also for files which are pointed to by a link. This method follows multiple links until a regular file or directory is found. If an infinite link chain is encountered, a PermanentError is raised.
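
The precision limits noted for lstat follow from what a listing line contains: Unix-style listings commonly print a time of day for recent entries and only a year for older ones. The sketch below shows how that affects the parsed value. parse_listing_time is a hypothetical helper, not ftputil's parser, and it simply assumes that an entry without a year belongs to the given current_year (a real parser also has to handle the turn of the year).

```python
import datetime

_MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_listing_time(month, day, year_or_time, current_year):
    """Return (datetime, precision) for the date fields of a listing line.

    `year_or_time` is either "HH:MM" (recent entry) or a four-digit year
    (older entry); precision is "minute" or "day" accordingly.
    """
    month_number = _MONTHS[month]
    if ":" in year_or_time:
        hour, minute = [int(part) for part in year_or_time.split(":")]
        # Simplification: assume the entry is from `current_year`.
        moment = datetime.datetime(current_year, month_number, int(day),
                                   hour, minute)
        return moment, "minute"
    moment = datetime.datetime(int(year_or_time), month_number, int(day))
    return moment, "day"
```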

FTPHost.path

FTPHost objects contain an attribute named path, similar to os.path. The following methods can be applied to the remote host with the same semantics as for os.path:

abspath(path)
basename(path)
commonprefix(path_list)
dirname(path)
exists(path)
getmtime(path)
getsize(path)
isabs(path)
isdir(path)
isfile(path)
islink(path)
join(path1, path2, ...)
normcase(path)
normpath(path)
split(path)
splitdrive(path)
splitext(path)
walk(path, func, arg)
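
Because these methods share the semantics of os.path, a function written against the os module can take the module as a parameter, as in the FTPOSError discussion. The sketch below exercises only the local case; regular_files is a hypothetical name, and passing an FTPHost object as os is expected to behave the same way according to the description above but is not tested here.

```python
import os
import shutil
import tempfile


def regular_files(directory, os=os):
    """Return the sorted names of regular files in `directory`.

    `os` may be the os module or an os-like object offering listdir,
    path.join and path.isfile.
    """
    names = os.listdir(directory)
    return sorted(name for name in names
                  if os.path.isfile(os.path.join(directory, name)))


# Local demonstration in a scratch directory.
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, "index.html"), "w").close()
os.mkdir(os.path.join(scratch, "image"))
local_result = regular_files(scratch)
shutil.rmtree(scratch)
```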

FTPFile objects

FTPFile objects as returned by a call to FTPHost.file (or FTPHost.open) have the following methods - with the same arguments and semantics as for local files:

close()
read([count])
readline([count])
readlines()
write(data)
writelines(string_sequence)
xreadlines()

and the attribute closed. For details, see the section File objects in the Library Reference.

Note that ftputil supports both binary mode and text mode with the appropriate line ending conversions.
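
What a line ending conversion in text mode amounts to can be sketched as follows. This is an assumption-laden illustration, not ftputil's code: it assumes that text travels over the FTP connection with CRLF line endings (as in ASCII transfers) and that the local convention is a bare newline; both function names are hypothetical.

```python
def wire_to_local(text):
    """Convert CRLF line endings received from the server to newlines."""
    return text.replace("\r\n", "\n")


def local_to_wire(text):
    """Convert newlines to CRLF before sending text to the server.

    Existing CRLF pairs are normalized first so they aren't doubled.
    """
    return text.replace("\r\n", "\n").replace("\n", "\r\n")
```

In binary mode no such conversion takes place, and the bytes are transferred unchanged.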

Tips and tricks / FAQ

Where can I get the latest version?

See http://www.sschwarzer.net/python/python_software.html#ftputil . Announcements will also be sent to the mailing list (see the question below). Announcements on major updates will also be posted to the newsgroup comp.lang.python .

I found a bug! What now?

Before reporting a bug, make sure that you have already tried the latest version of ftputil. The bug might already be fixed there.

Please send your bug report to the ftputil mailing list (see the last question in this section for the address) or to me (email address at the top of this file). In either case, you must not include confidential information (user id, password, file names, etc.) in the mail.

When reporting a bug, please provide the following information:

  • version of ftputil
  • version of Python
  • type and version of FTP server (should be visible in its "welcome message")
  • operating system and version for server and client (output of uname -a on Unix)
  • bug description
  • if possible, a short code example which reproduces the bug
  • if possible, ideas that might help to find the cause of the bug
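
Part of the client-side information in this list can be collected with the standard library. The helper below is a hypothetical convenience, not something shipped with ftputil; the ftputil version and the server details still have to be added by hand.

```python
import platform


def bug_report_environment():
    """Return Python and operating system details of the client."""
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
    }
```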

Connecting on another port

By default, an instantiated FTPHost object connects on the usual FTP ports. If you have to use a different port, refer to the section FTPHost construction.

You can use the same approach to connect in active or passive mode, as you like.
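
A session class for active mode might look like the sketch below. The class name ActiveModeSession is hypothetical; set_pasv, connect and login are regular ftplib.FTP methods. Since a real login needs a server, the FTPHost call is only shown as a comment.

```python
import ftplib


class ActiveModeSession(ftplib.FTP):
    """Session factory that uses active instead of passive transfers."""

    def __init__(self, host, userid, password):
        # Don't pass the host here, so no connection is made yet.
        ftplib.FTP.__init__(self)
        self.set_pasv(False)  # False selects active mode
        self.connect(host)
        self.login(userid, password)


# Pass the class itself, not an instance, as the factory:
# host = ftputil.FTPHost('ftp.domain.com', 'user', 'password',
#                        session_factory=ActiveModeSession)
```

For passive mode, the same pattern applies with set_pasv(True).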

Using active or passive connections

Please see the previous tip.

Conditional upload/download to/from a server in a different time zone

You may find that ftputil uploads or downloads files unnecessarily, or not when it should. This can happen when the FTP server is in a different time zone than the client on which ftputil runs. Please see the section on setting the time shift. It may even be sufficient to call synchronize_times.

Wrong dates or times when stat'ing on a server

Please see the previous and the next tip.

I don't find an answer to my problem in this document

Please send me an email with your question, and I'll see what I can do for you. :-) Probably a better way is to send your question to the mailing list at ftputil@codespeak.net; potentially more people might be able to help you.

Bugs and limitations

  • ftputil needs at least Python 2.0 to work.
  • Due to the implementation of lstat it can not return a sensible value for the root directory /. If you know an implementation that can do this, please let me know. The root directory is handled appropriately in FTPHost.path.exists/isfile/isdir/islink, though.
  • Timeouts of individual child sessions currently are not handled. This is only a problem if your FTPHost object or the generated FTPFile objects are inactive for about ten minutes or longer.
  • Until now, I haven't paid attention to thread safety. In principle, at least, different FTPFile objects should be usable in different threads.
  • FTPFile objects in text mode may not support charsets with more than one byte per character. Please email me your experiences (address above), if you work with multibyte text streams in FTP sessions.
  • Currently, it is not possible to continue an interrupted upload or download. Contact me if you have problems with that.
  • The UserTuple class, provided in UserTuple.py, is not thoroughly tested. If you encounter problems, please notify me.

Files

If not overridden via installation options, the ftputil files reside in the ftputil package. The documentation (in reStructuredText and in HTML format) is in the same directory.

The files _test_*.py and _mock_ftplib.py are for unit-testing. If you only use ftputil (i. e. not modifying it), you can delete these files.

Author

ftputil is written by Stefan Schwarzer <sschwarzer@sschwarzer.net>.

Feedback is appreciated. :-)