Opened 2 years ago

Closed 14 months ago

#107 closed defect (fixed)

untrapped exception

Reported by: ftputiluser Owned by: schwa
Priority: major Milestone: 3.4
Component: Library Version: 3.3.1
Keywords: os.walk exception followlinks Cc:

Description (last modified by schwa)

     for root,dirs,files in ftp_host.walk(ftp_host.curdir, topdown=True, onerror=None, followlinks=False):
  File "/usr/local/lib/python3.4/dist-packages/ftputil/host.py", line 917, in walk
    for item in self.walk(path, topdown, onerror, followlinks):
  File "/usr/local/lib/python3.4/dist-packages/ftputil/host.py", line 917, in walk
    for item in self.walk(path, topdown, onerror, followlinks):
  File "/usr/local/lib/python3.4/dist-packages/ftputil/host.py", line 917, in walk
    for item in self.walk(path, topdown, onerror, followlinks):
  File "/usr/local/lib/python3.4/dist-packages/ftputil/host.py", line 908, in walk
    if self.path.isdir(self.path.join(top, name)):
  File "/usr/local/lib/python3.4/dist-packages/ftputil/path.py", line 159, in isdir
    path, _exception_for_missing_path=False)
  File "/usr/local/lib/python3.4/dist-packages/ftputil/host.py", line 889, in stat
    return self._stat._stat(path, _exception_for_missing_path)
  File "/usr/local/lib/python3.4/dist-packages/ftputil/stat.py", line 747, in _stat
    _exception_for_missing_path)
  File "/usr/local/lib/python3.4/dist-packages/ftputil/stat.py", line 701, in __call_with_parser_retry
    result = method(*args, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/ftputil/stat.py", line 685, in _real_stat
    format(original_path))
ftputil.error.PermanentError: recursive link structure detected for remote path './old-gnu/Manuals/automake-1.3/automake-1.4'
Debugging info: ftputil 3.3.1, Python 3.4.3 (linux)
Uncaught exception. Entering post mortem debugging

wrapped by
try:
            result = method(*args, **kwargs)
            # If a `listdir` call didn't find anything, we can't
            # say anything about the usefulness of the parser.
            if (method is not self._real_listdir) and result:
                self._allow_parser_switching = False
            return result
        except ftputil.error.ParserError:
            if self._allow_parser_switching:
                self._allow_parser_switching = False
                self._parser = MSParser()
                return method(*args, **kwargs)
            else:
                raisetry:
            result = method(*args, **kwargs)
            # If a `listdir` call didn't find anything, we can't
            # say anything about the usefulness of the parser.
            if (method is not self._real_listdir) and result:
                self._allow_parser_switching = False
            return result
        except ftputil.error.ParserError:
            if self._allow_parser_switching:
                self._allow_parser_switching = False
                self._parser = MSParser()
                return method(*args, **kwargs)
            else:
                raise

so no handler for Permanent error

Change History (9)

comment:1 Changed 2 years ago by ftputiluser

adam.rothschild@…

comment:2 Changed 2 years ago by schwa

  • Description modified (diff)

comment:3 Changed 2 years ago by schwa

  • Status changed from new to assigned

Hi Adam, thanks for your report.

I think you're right. With the given arguments, walk should ignore errors.

On a related note, I read the documentation on os.walk and it says that if followlinks is True - unlike in this ticket - os.walk may end up in an infinite recursion. On the other hand, ftputil does keep track of the inspected directories and would raise an exception. Unless somebody comes up with a convincing argument, I'd like to keep the behavior for followlinks=True.

comment:4 Changed 2 years ago by schwa

  • Keywords os.walk exception followlinks added
  • Milestone set to 3.3.2
  • Version set to 3.3.1

comment:5 Changed 14 months ago by schwa

I'm preparing a new ftputil version 3.4 and just had another look at this ticket again.

My thoughts:

  • If followlinks was True and walk would encounter a recursive directory structure, I think the PermanentError is justified even if onerror is None. After all, how should ftputil otherwise behave if the only alternative is an infinite recursion? According to the Python documentation, the os.walk implementation could indeed run into an infinite recursion.
  • However, in the above case (in the ticket description) followlinks is False. Therefore ftputil shouldn't ever run into a recursive structure in the first place. It seems that ftputil's FTPHost.stat call follows symbolic links (at least in certain cases) even though it shouldn't.

comment:6 Changed 14 months ago by schwa

The current implementation of FTPHost.walk looks like this:

    def walk(self, top, topdown=True, onerror=None, followlinks=False):
        """
        Iterate over directory tree and return a tuple (dirpath,
        dirnames, filenames) on each iteration, like the `os.walk`
        function (see https://docs.python.org/library/os.html#os.walk ).
        """
        top = ftputil.tool.as_unicode(top)
        # The following code is copied from `os.walk` in Python 2.4
        # and adapted to ftputil.
        try:
            names = self.listdir(top)
        except ftputil.error.FTPOSError as err:
            if onerror is not None:
                onerror(err)
            return
        dirs, nondirs = [], []
        for name in names:
            if self.path.isdir(self.path.join(top, name)):
                dirs.append(name)
            else:
                nondirs.append(name)
        if topdown:
            yield top, dirs, nondirs
        for name in dirs:
            path = self.path.join(top, name)
            if followlinks or not self.path.islink(path):
                for item in self.walk(path, topdown, onerror, followlinks):
                    yield item
        if not topdown:
            yield top, dirs, nondirs

The above exception (in the ticket description) is triggered by the isdir call

self.path.isdir(self.path.join(top, name))

From here the call chain is FTPHost.path.isdir -> FTPHost.stat -> ftputil.stat._Stat._stat -> ftputil.stat._Stat._real_stat. Now, to find out whether the path is a link that points to a directory, the link, if any, has to be followed to the "real" directory or file. This is why walk ends up following links even though it shouldn't.

I need to look into this to find out if we can implement walk in a way that doesn't need to follow symlinks. Probably, if the path self.path.join(top, name) is a link, walk should either ignore it or it should be added to the nondirs list.

comment:7 Changed 14 months ago by schwa

  • Milestone changed from 3.3.2 to 3.4

The behavior of os.path.isdir for recursive symlinks is this:

$ ls -l bad_link
lrwxrwxrwx. 1 schwa schwa 8 Oct 28 21:45 bad_link -> bad_link

$ python
Python 3.5.4 (default, Aug 23 2017, 18:32:05) 
[GCC 6.4.1 20170727 (Red Hat 6.4.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.isdir("bad_link")
False
>>> os.path.isfile("bad_link")
False
>>> os.path.islink("bad_link")
True

It seems that os.path.isdir and os.path.isfile detect the link chain and return False if they "notice" they won't find a directory or file. Therefore, FTPHost.path.isdir and FTPHost.path.isfile should do the same. With this change, walk should behave as expected without any additional changes.

comment:8 Changed 14 months ago by schwa

For completeness, the behavior of os.stat is as the corresponding function in ftputil:

>>> os.stat("bad_link")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 40] Too many levels of symbolic links: 'bad_link'
Last edited 14 months ago by schwa (previous) (diff)

comment:9 Changed 14 months ago by schwa

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in [ec059f9d9dc8].

FTPHost.path.isdir and FTPHost.path.isfile now return False if they encounter an infinite link chain.

Note: See TracTickets for help on using tickets.