technaut (technaut) wrote,

Dependency Hell

There is a major problem I have with every single Linux distribution I've ever used. Before very long I've managed to get myself into what is known as dependency hell.

It always starts innocently enough. I have some project in mind that requires the latest feature of some piece of open-source software. By far the easiest way to install software in Linux is to use one of the many package management tools such as apt-get, yum, yast, urpmi or emerge. These automate the process of installing pre-compiled files and save the user the difficulty of knowing how best to compile a given piece of software for their particular distribution.

The trouble with these tools is that they often insist that a given piece of software can only be installed in an environment that has a particular set of libraries available. This should not really be a problem in Linux, which can happily host multiple versions of a given library side by side. Theoretically, if you need a library you don't have, you can simply install it and you're good to go.

In practice it's much harder than that. None of the package managers I've ever used has a way of saying that you want to install a second (or third or fourth) version of a library. They always want to remove the old version before installing the new one. This, I suppose, reduces the problem of having a dozen different (and presumably unused) versions of a single library on a machine.

In practice though, it leads to a far worse problem. You often discover that your current versions of a dozen different tools rely on a currently installed library. If you want to upgrade the library, you have to upgrade all of these tools as well. Of course, these new tools may themselves depend on other new libraries that need to be installed and so on and so on.
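To make the cascade concrete, here is a minimal sketch of how one upgrade can pull in many others. It is only a toy model: the package names and the "forces" graph are invented for illustration and are not taken from any real distribution or package manager.

```python
from collections import deque

# Hypothetical "forces" graph: upgrading the key forces an upgrade of each
# listed package, either because that package was built against the old
# version of the key, or because the new key needs a newer version of it.
FORCES = {
    "editor": ["libgui"],
    "libgui": ["mailer", "browser", "libc"],
    "browser": ["libssl"],
    "libc": ["shell", "compiler"],
    "mailer": [],
    "libssl": [],
    "shell": [],
    "compiler": [],
}


def upgrade_set(start, forces):
    """Return every package that has to change if `start` is upgraded."""
    seen = {start}
    queue = deque([start])
    while queue:
        pkg = queue.popleft()
        for other in forces.get(pkg, []):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


if __name__ == "__main__":
    # Asking for one new editor touches everything in this toy system.
    print(sorted(upgrade_set("editor", FORCES)))
```

In this made-up graph, upgrading the single "editor" package ends up touching all eight packages, which is roughly what it feels like when a small upgrade drags the whole system along with it.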

I once installed a (I thought) simple upgrade to a program because it had a new feature I desperately needed. By the time all of the dependencies were resolved, my entire operating system had jumped two versions. I had started out running Fedora Core 2 and was now running Fedora Core 4. Or rather, a horrible mongrelized version of Fedora Core 2 and Fedora Core 4 which happened to think it was Fedora Core 4.

This is the equivalent in the Microsoft world of trying to install a new program in Windows 95, only to discover yourself running Windows 2000 when you're done.

There are many ways that one might try to combat the problem of dependency hell. The current system that most distributions use is a system of repositories. These hold versions of the software that are all consistent, and can be safely installed together. Of course that means that if a new piece of software requires a new library feature, one that isn't supported by the base libraries of the repository, then that version of the software doesn't get into the repository.

When one is facing a software problem that is fixed by the latest version of some piece of software, but that isn't in the repository, that can be very frustrating. The temptation is to link to one of the experimental or 'unstable' repositories to grab the version of the software you need. Three times in four, it works. You install the program and a couple of supplemental libraries and you've happily solved your problem. The fourth time, you end up back in dependency hell.

There are other obvious ways to deal with aspects of the problem, from having an explicit language to describe how library APIs change, so that library dependencies can be derived rather than being set by the software author, to having a far more strongly-typed and object-oriented operating system that could more easily deal with these issues. Both of these are difficult and manpower-intensive changes to implement, and what you'd have when you were done would arguably not be Linux any more.

A far simpler solution and one that would go a long way toward solving the problem would be to add a pair of features to most package managers. The first would be a flag to specify that you wish to install a parallel version of a package, be it a software tool or just a library. Thus, I would finally have a simple way of telling my distro that I need 3 different versions of GCC installed for development testing purposes.
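As a rough sketch of what such a flag might mean, here is a toy package database in which a `parallel` option keeps the old versions instead of replacing them. The `PackageDB` class and its `parallel` parameter are hypothetical names made up for this example; they do not correspond to any existing package manager.

```python
class PackageDB:
    """Toy package database; not modelled on any real package manager."""

    def __init__(self):
        # Maps a package name to the set of installed version strings.
        self.installed = {}

    def install(self, name, version, parallel=False):
        versions = self.installed.setdefault(name, set())
        if not parallel:
            # Conventional behaviour: the new version replaces the old one.
            versions.clear()
        versions.add(version)

    def versions(self, name):
        return sorted(self.installed.get(name, set()))


if __name__ == "__main__":
    db = PackageDB()
    db.install("gcc", "3.3")
    db.install("gcc", "3.4", parallel=True)
    db.install("gcc", "4.0", parallel=True)
    print(db.versions("gcc"))
```

Without the flag, each install would leave only the newest version behind; with it, all three compilers stay available side by side.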

The second feature would be a way of installing just a library. Often you'll install a package and discover that it needs Library X. Library X is part of Program X and you have to install that program (which requires you to satisfy its dependencies first), even if you never want to use that program. Sometimes you have the option of installing Development Kit X instead, which also contains Library X and is designed to let you write programs that use Library X. The thing is, you don't want to write programs that use Library X; you already have one that you simply want to install.
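One way to picture a library-only install is a package whose files are tagged by component, so the installer can pick out just the part that is needed. The manifest, paths and component names below are all invented for illustration; this is a sketch of the idea, not how any particular package format works.

```python
# Hypothetical manifest for "Program X": each file is tagged with the
# component it belongs to. Paths and component names are made up.
MANIFEST = {
    "/usr/bin/programx": "program",
    "/usr/share/man/man1/programx.1": "program",
    "/usr/lib/libx.so.1": "library",
    "/usr/include/x.h": "devel",
}


def files_for(manifest, components):
    """Return the files that belong to the requested components only."""
    return sorted(path for path, comp in manifest.items() if comp in components)


if __name__ == "__main__":
    # Install only what another program needs at run time.
    print(files_for(MANIFEST, {"library"}))
```

Here asking for just the "library" component selects a single file, rather than the program, its man page and the development headers.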

By paring down the number of files you need to install to satisfy a given set of dependencies, and by having tools that are willing to explicitly manage multiple versions of the same installed library, we could go a long way to eliminating dependency hell.