“We have developed a new package-management tool, called Opium, that improves on current tools in two ways: Opium is complete, in that if there is a solution, Opium is guaranteed to find it, and Opium can optimize a user-provided objective function, which could for example state that smaller packages should be preferred over larger ones. We performed a comparative study of our tool against Debian’s apt-get on 600 traces of real-world package installations. We show that Opium runs fast enough to be usable, and that its completeness and optimality guarantees provide concrete benefits to end users.”
Just what the world needs, another package manager. Save all the “choice is good” crap. A uniform way of installing/removing/upgrading would go a long way in making Linux a more enticing offer to enterprise customers. So much time, money, and energy is wasted on creating a new (probably buggy when compared to Linux mainstays like apt-get) way of doing something for which there already exist several quality candidates. Why not really make a difference and start lobbying hardware companies to open their specs, help code KDE 4, improve suspend/resume for the ever increasing number of laptop configurations, or any of the other worthy causes in the FOSS community.
OPIUM isn’t a full package manager, it’s a dependency analysis algorithm designed to plug in to existing package managers (in particular, APT). The paper is a research piece exploring the algorithm; hopefully the community will find value in it and we can look at integrating it into APT in the future. It certainly doesn’t aim to replace apt-get entirely, only the portion of it that computes dependencies. Because the paper was targeted at an academic, not an industrial, audience, this kind of implementation mechanism was not covered in it. However, if you read the paper you’ll be able to see very quickly what we do and do not deal with.
Why not really make a difference and …
And start reading the linked article, instead of the OSAlert blurb before spewing misinformed comments and dissing the work people have done? You don’t even need to look beyond the introduction to see the goals of OPIUM.
Confused… was this a study done on *spire or on Debian? I could have sworn that *spire used to claim that CNR did not even use apt – so which is it? I think any study related to apt should be done on Debian.
In our experiments, we discovered a real user trace where an install attempt for OCaml using apt-get caused 61 packages to be removed, including the Linux kernel. This poor user would not be able to reboot their machine after installing OCaml.
I am not familiar with any issue on Debian where installing OCaml removes the kernel and 60 other packages. Anyone else? Oh, that poor user!
Was this done using Debian’s repository or Linspire’s warehouse? I am not sure if using the Linspire repository is the best test case for a study since iirc they like to mix packages from various Debian flavors and make it a bit messy. Then again, maybe it is the perfect test for OPIUM but I am not sure it is reflective of the nature of apt either. Anyone else?
He also has a pdf slideshow presentation on his site –
http://www.cjtucker.com/ICSE_Presentation.pdf
Oh and it was also a Linspire Letter
http://www.linspire.com/linspire_letter_archives.php?id=46
Is he basically saying that apt is broken? I wonder why so many distros use apt and go for a Debian base if apt is such a horrible tool?
Is this just a marketing tool to sell the *new and improved* CNR?
On Debian if you try to remove Evolution (obviously a Gnome desktop) then apt-get will try to remove Gnome. I emailed somebody at Debian about this, and he said that the next release will fix whatever caused this. Perhaps somebody has the technical knowledge to fill in the gaps of my very vague description. I don’t know how Ubuntu responds under the same circumstances.
It will remove a meta-package (wrapper) called gnome that is used to install a bunch of programs that they call (as a whole) Gnome. In other words, it will not remove ANY programs (that I can think of), only the wrapping paper for the package called gnome.
Hi Dean. We address this on the forum posts over at Freespire related to the Linspire Letter that was sent out. In particular, my comment at:
http://forum.freespire.org/showthread.php?p=62301#post62301
The study was done on Linspire’s distribution (warehouse). The distribution obviously does make a difference to what specific issues you’ll see using APT, but I guarantee you no current distribution can protect you from APT’s problems with dependency resolution… the problem of removals and failures to install can’t be resolved with distribution-side QA checks because it’s heavily dependent on the user’s machine configuration. Furthermore, we offer optimization of the solution: we can find you the newest packages, the smallest packages, the highest rated packages, the most stable packages, whatever you can come up with to assign value to packages. This alone, even without the completeness and optimal removal guarantees, is a significant improvement on APT’s heuristic approach.
So many distros go for an APT basis because APT is a whole lot better than a lot of the other options. Just because something is better than the other options doesn’t mean we should decide against improving it for the future though: that’s the anti-innovation mindset that OSS tries so hard to rebel against.
Another package system? Sometimes I wonder what these guys are smoking… Oh wait!
I always wondered about this sort of thing, but given that a package may have somewhat complicated dependencies, I ended up assuming that a “complete” solver wasn’t possible.
Consider that you might have two applications installed that depend on packages that are incompatible with each other. How could this new OPIUM system solve this, other than by not including incompatible applications?
For an end-user Linux distribution it might be a good idea not to include end applications with incompatible dependency trees, because for an end user this situation is absolute hell. For a server administrator, these are all important decisions, and with appropriate planning, apt’s dependency resolving “problems” shouldn’t provide much of a headache.
So does anyone know how OPIUM deals with a situation where A depends on X, B depends on Y, X conflicts with Y, and you try to install both?
Assuming you meant that none of A, B, X, Y is installed and you tried to install A and B, it would say:
Either: {A,X} (i.e. install A and X)
Or: {B,Y} (i.e. install B and Y)
Or: {} (i.e. do nothing) – but I think they probably filter out that option early on
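To make the constraint view concrete, here is a toy sketch (my own illustration, not code from the paper; the names are just the A, B, X, Y from the question, and brute force stands in for a real solver) showing why no valid state contains both A and B:

```python
from itertools import product

packages = ["A", "B", "X", "Y"]

def valid(installed):
    if installed["A"] and not installed["X"]:
        return False        # A depends on X
    if installed["B"] and not installed["Y"]:
        return False        # B depends on Y
    if installed["X"] and installed["Y"]:
        return False        # X conflicts with Y
    return True

states = [dict(zip(packages, bits))
          for bits in product([False, True], repeat=len(packages))]

# No rule-respecting state has both A and B installed...
print(any(valid(s) for s in states if s["A"] and s["B"]))   # -> False

# ...so the best any solver can offer is one of the partial options:
for s in states:
    if valid(s) and (s["A"] or s["B"]):
        print(sorted(p for p in packages if s[p]))           # ['A', 'X'] or ['B', 'Y']
```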
From the paper:
The solution isn’t a better solver (although that’s nice to have anyway); the solution is to remove the possibility of conflicts.
Why do I need to use a massively complex algorithm to find the single version of some libfoo that works with all of the one thousand packages on my machine? What the system needs is a bit of slack. If there’s a period of a few months when one program needs libfoo > 3 and another needs libfoo == 3 that shouldn’t matter. Just keep two copies of libfoo on my system until the problem is fixed.
This is how Zero Install works:
http://0install.net
I completely agree.
Package managers seem to get increasingly complex while they should strive for simplicity.
Zeroinstall has my support.
I would argue that OPIUM is much simpler than APT in the way it solves dependencies. We make a trivial rewriting of the problem into a logical form, and then just throw it at an off-the-shelf solver to give us an answer. No mucking about with traversing graphs, choosing which path to take, back-tracking, or anything. Sometimes simplicity comes from an elegant new representation, not from a reduction in flexibility or power.
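As a toy illustration of that rewriting (invented package names and sizes, and a brute-force search standing in for the off-the-shelf solver, so this sketches the idea rather than reproducing OPIUM’s implementation): the package rules become boolean constraints and the user’s preference becomes an objective to minimize.

```python
from itertools import product

# Hypothetical universe: "app" has a disjunctive dependency on libsmall OR libbig.
packages = ["app", "libsmall", "libbig"]
size_kb = {"app": 500, "libsmall": 80, "libbig": 4000}

def valid(installed):
    # The single rule: if app is installed, one of its alternatives must be too.
    return not installed["app"] or installed["libsmall"] or installed["libbig"]

def objective(installed):
    # "Prefer smaller packages": total size of everything installed.
    return sum(size_kb[p] for p in packages if installed[p])

best = None
for bits in product([False, True], repeat=len(packages)):
    installed = dict(zip(packages, bits))
    if valid(installed) and installed["app"]:        # must satisfy the user's request
        if best is None or objective(installed) < objective(best):
            best = installed

print(sorted(p for p in packages if best[p]))        # -> ['app', 'libsmall']
```

Swap the objective for package age, rating, or number of removals and the same search machinery applies; the point is that the preference lives in one function rather than being baked into a graph-traversal heuristic.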
Simpler, but with the same problems.
As I understand it, your success criterion is that the new package is installed and a minimum of other packages are removed. So, if I ask OPIUM to install “abiword” and it removes “gimp” in the process then that is a successful (and optimal) installation as far as the paper is concerned.
To a user, that probably looks more like failure! The real optimal solution is to install abiword without uninstalling anything.
OPIUM is optimal within the bounds of the packaging system. If the Abiword package says it conflicts with the gimp package then we can’t install Abiword on a system with Gimp. I agree that to the user this looks like a failure, but it’s a failure independent of the algorithm used to install software: it’s a failure in the packaging of Abiword and Gimp, or a failure in the packaging system. One thing OPIUM can do that APT can’t, though, is say something like: “I don’t consider it a solution if I have to remove something that the user explicitly installed (i.e. didn’t come in as a dependency)”. In general, we can guide the solver in any way we like to prefer removing certain packages in the case where packages have to be removed. Ideally this situation (needing to remove packages) would never occur, but given that in both of the major packaging formats we do encounter this problem, it’s valuable to address it.
Can’t you place a hold or possibly pin packages with apt/aptitude so that those will not be removed? Now getting past that may be tricky.
I am also unsure how much room there is to “guide the solver” towards solutions. Sure, you can choose a different MTA or browser, but if gstreamer wants libxine1.4 then you are getting libxine1.4.
I would like to see opium in action – hard to say how it will work just by someone telling me about it. Or maybe, some examples that I can try and prove to myself that apt is actually adding/removing stuff incorrectly?
Pinning certainly would be an option for APT that would give a similar behavior to OPIUM in not removing things (assuming pinning gives you that capacity and is fully implemented: I know some work at Linspire was put into it, but I don’t know how far it went or how effective it turned out to be). However, the more you constrain APT the more likely it is to fail to find a solution and the more valuable the completeness of OPIUM becomes.
This comment was directly addressed (after you raised it) on the Freespire forum and you never made any response. Grep the current Debian repository and look at how many packages specify disjunctive dependencies. Remember that each disjunction doubles the number of possible install choices. Try taking a package (let’s say kpoker) and walking its dependency tree… now tell me if you had to make any choices along the way. You’ll notice that there’s a whole lot of room to guide the solver (regardless of whether you’re using Linspire, Debian, Red Hat, SuSE, or any other distro you care to name).
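If anyone wants to check the scale of this for themselves, here is a rough sketch (not from the paper; the “Packages” path is a placeholder — point it at an index fetched from any mirror or at one of the files under /var/lib/apt/lists/, and note that folded continuation lines are ignored, so treat the count as approximate):

```python
# Rough count of packages in a Debian-style Packages index whose Depends
# field contains a disjunction ("|").
total = disjunctive = 0
with open("Packages", encoding="utf-8", errors="replace") as index:
    for line in index:
        if line.startswith("Package:"):
            total += 1
        elif line.startswith("Depends:") and "|" in line:
            disjunctive += 1

print(f"{disjunctive} of {total} packages declare a disjunctive dependency")
```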
Read the paper. Much as you might think otherwise, we actually didn’t lie in it: we ran APT on 50,000 real-world installation attempts (i.e. attempted installations that happened on a user’s machine) and found that 1/4th of those users hit a completeness problem. Again, as I said to you on the Freespire forums, if you don’t like the math just read the intro and the evaluation.
Finally, might I suggest that just reposting concerns of yours that have been addressed on the Freespire forums over here is rather pointless? If you’re given an answer you don’t like then contest it, don’t just repeat the question. It just wastes everybody’s time.
Well, it is a bit confusing on the Freespire forums since everything I post gets turned into a negative bashing of spire… and I have to fend off ten people nit-picking everything I say… and might I suggest that if you do not care to discuss something then just ignore it – otherwise you are wasting everyone’s time. God knows that nobody on a forum or an internet news site has time to waste…
I appreciate the assumption that I haven’t read the paper…..real nice. Nice way to keep it on a sane level and not cause someone to be upset and argumentative.
I appreciate that you read into my words that I think you are lying….nice way to discuss something.
Yes, you ran apt, but I didn’t. Show ME, don’t just tell me about it.
The things I read seemed to suggest it was Linspire and their warehouse – is that a good measure of apt performance, considering they mix old/new packages which apt was likely never meant to handle?
Maybe answer a few of my questions specifically and I may not repeat them – or just keep repeating how you can guide the solver… yada yada…
I apologise if my comments offend you. I should reiterate something that I’ve said on the Freespire forums and thought you had already seen: this is a research paper, not a product. I can’t “show” you a running copy of OPIUM because what I have is a prototype tool hard-coded to instrument the distributions. The purpose of a paper like this is to get the algorithm out there, not to give you a running piece of code. What I can do is show you the results of the experiments we ran; if you require that you be able to run it to believe it works, how can I assume anything other than that you think I’m lying?
We do use Linspire and the Linspire distributions deployed over a two year period for our investigation. Your suggestion that mixing old and new packages is not something APT is designed to handle is odd, and I believe incorrect. APT is designed to handle Debian distributions, which allow for any mixing of packages you like. I’m also not sure what you mean by old and new packages: are you talking about having multiple versions of a package in a distribution (something that happens if you point APT at multiple distributions, which happens regularly with Debian)?
I’ve answered all of your questions directly as far as I can tell. If you have specific questions to be answered, please write out a numbered list of them and I’ll address them each in turn in a response.
If there is a way to install abiword without uninstalling any package then OPIUM will do this.
If apt can do it, OPIUM is not a big win.
But in the cases where apt can’t do it, OPIUM is a win-win.
Exactly! The only other thing that OPIUM gives you is the optimality: if apt can do it and OPIUM can do it, OPIUM can also guarantee that you’re getting the best choice of packages (smallest, newest, highest rated, best tested, whatever you want to go by). I think that’s really valuable too, but even without that OPIUM is a win.
There are several alternative packaging mechanisms that do this kind of partitioning of your library space, and I agree that they are interesting. However, the vast majority of the Linux world currently runs on Debian or RPM; switching to something like Zero Install requires replacing this architecture. Beyond that, even with Zero Install OPIUM offers the optimality benefit: if you eliminate conflicts you don’t have to worry about the uninstall problem, but you do still have to figure out what is the best set of packages (or equivalent) to install to, for example, download the fewest bytes or get the most up-to-date system.
Sure, putting a solver like this in Zero Install would be interesting. The overhead might be a bit high (Zero Install runs its solver* each time a program is run; I guess we’d have to cache the result instead if using this).
* The current solver is very naive, but we mainly get away with it because we only worry about conflicts between the libraries required by a single program.
Good paper, BTW.
Glad you liked the paper. Although the performance results in the paper show us running about 3-4 times slower than APT the majority of this time is spent in distribution read and slice time, both of which can be improved a lot by using less naive approaches (they weren’t the focus of the research, so we kept them simple). Also, bear in mind that the dist read time only happens when you ask it to: dist read time == apt-get update time. Realistically, I would expect to be able run a solve fast enough to be performance-comparable with APT (within 1-2x the runtime of APT) and guarantee all the optimality stuff. Caching results is certainly a good option for already-installed programs, though. Additionally, results in the paper are prior to a major optimization to conflict resolution, which we only have a preliminary idea of numbers for but which gives us comparable performance whether or not there is a conflict. This may not be an issue anyway for Zero Install as there are so few (if any) conflicts.
The solution isn’t a better solver (although that’s nice to have anyway); the solution is to remove the possibility of conflicts.
Why do I need to use a massively complex algorithm to find the single version of some libfoo that works with all of the one thousand packages on my machine? What the system needs is a bit of slack. If there’s a period of a few months when one program needs libfoo > 3 and another needs libfoo == 3 that shouldn’t matter. Just keep two copies of libfoo on my system until the problem is fixed.
Exactly! Correct the actual problem instead of adding complexity to work around the already complex nature of something. The versioning system and model is a rough spot, to my untrained eye.
That is along a similar line of thought to what I TRIED to discuss with Chris at the forums. Not that I have used 0install or anything, but the development model’s versioning system needs to be standardized, or some slack put into it, to allow some flex and maintain some backward/forward compatibility/flexibility. If we could assure that packageB1.2.x would work with the handful of packages that rely on packageB, instead of requiring packageB1.2.8 or greater, then that IMO would alleviate a lot of the dependency issues and conflicts.
IOW – if something requires libfoo4 and you have stuff that requires libfoo3, then libfoo4 SHOULD be all that is required: it should automatically register as libfoo4 but also inform the package manager that libfoo3 is satisfied too. Of course, that isn’t always possible depending on changes, but often it IS possible to use a newer library for something that thinks it needs an older one.
Heck, how many of us haven’t symlinked a mismatched version and had it work fine, or simply gone in and fudged the requirements listed in a deb?
But I am not a big fan of standardizing anything either…
I am just not sure that apt has as many failures and problems as this study makes it out to have.
There are many different problems in any system. I don’t disagree that there are serious complexities and difficulties in the packaging system, and in particular in creating and testing packages. I’m afraid we can’t solve all the world’s problems in one go, though: the OPIUM paper addresses three clearly identified (and demonstrated) problems; it doesn’t preclude you or anyone else from tackling the others (see the comments on Zero Install and my responses: all this work is good and productive!).
You’re describing a packaging problem. If X depends on libfoo it has a couple of choices:
1) Depends: libfoo == 3
2) Depends: libfoo
3) Depends: libfoo > 3
Either of the last two will work when installing X on a system with libfoo3. The first should only be used if X is known to only work with version 3 of libfoo. If something declares it explicitly needs libfoo version 3 when in fact it doesn’t, there is *nothing* that APT or OPIUM or any other package installation algorithm can do to fix it. We have to be able to rely on the rules the packages lay down because they’re all we have to go on: imagine if in a real-world system we said “Oh, it says it needs libc6 v.2.2, but we’re going to upgrade to 2.3 because we think everything will be OK”. You’d have a lot of broken systems and a lot of angry people.
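To spell out how those three forms behave, here is a toy check (versions reduced to integers, availability invented for the sketch; real Debian control files write version constraints in forms like libfoo (= 3) or libfoo (>= 3)) against a system that has libfoo 3 installed and libfoo 4 available as an upgrade:

```python
# Which available versions of libfoo can satisfy each dependency form?
available_versions = [3, 4]   # 3 is installed, 4 exists as an upgrade

constraints = {
    "Depends: libfoo == 3": lambda v: v == 3,   # only version 3 will do
    "Depends: libfoo":      lambda v: True,     # any version will do
    "Depends: libfoo > 3":  lambda v: v > 3,    # needs the upgrade to 4
}

for declaration, satisfied_by in constraints.items():
    versions = [v for v in available_versions if satisfied_by(v)]
    print(f"{declaration:22} can be met by libfoo version(s) {versions}")
```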
Yawn….yes please explain to me how I do not understand anything….again….
Yes, a packaging problem that is at the root of a lot of dependency issues/conflicts, and that requires a package manager to be smart and try to resolve all the issues that occur.
I am saying: what if you COULD ensure that newer versions (at least point or sub-point releases) of libfoo provided for old version requirements as well? No more conflict between having libfoo1.2.9 and some older app needing libfoo1.2.2, because the package manager KNEW that 1.2.9 provides 1.2.2 and so created a symlink from 1.2.2 to 1.2.9 for the older app that needs 1.2.2.
Not that this is probable, for various reasons, but only that it is possible and would alleviate some issues at the CORE of the problem.
I’m not sure if you’re a software developer or not, but what you’re effectively suggesting is that we change development practices to guarantee that our library APIs do not change in point releases. In general this is a good idea, and one adhered to by most developers. But, let’s say you find a bug in your API and fix it because it’s hurting a bunch of programs that depend on you. It may turn out (and often does) that one or more things that depend on you actually rely on that bug, and so if they get given the new version of the library they’ll fail.
Libraries with different version numbers are by definition different, and if a program depends on something that is changed it may break between versions. Attempting to write better software that has to change less and fails less is a laudable goal and one we all strive for, as is maintaining API consistency and minimizing disruption when we fix bugs and change behavior. Unfortunately, software is immensely complex and constantly evolving so the problem is here to stay. Sometimes you’ll get lucky and the symlinking will work (in which case the package was incorrectly packaged in declaring it doesn’t work with the newer library), and sometimes you’ll get unlucky and there’ll be a reason why that dependency was specified as it was. As I’ve said before in these comments, there are a lot of problems in the world (even restricted to software!) and OPIUM tries only to solve a finite, concrete set of them. If you want to take on some of those other issues that clearly vex you then more power to you, and I look forward to reading your results.
Sounds like you RE-created the idea used in the Smart package manager http://labix.org/smart
Show me the performance numbers and the optimality guarantees of SMART. Also, describe the algorithm to me and explain how it appropriately deals with NP-completeness. I’ve looked at SMART and it looks like it still treats dependencies as a graphing problem, and uses an enumerative approach (as far as I can tell) to establishing optimality. This is going to be problematic in large-optimization scenarios (with lots of possible choices: e.g. installing KDE on a system without X). If the authors of SMART or someone more familiar with its algorithms is around, though, I’d love to learn more about its inner workings: the only way I could establish what it does is by reading the code, which is never the best way to get a high-level algorithmic-approach takeaway from a project. Oh, and if the SMART guys are interested, we’d love to work with you (or anyone else!) on integrating our approaches into your existing tools. We’re not here to compete with what’s already out there, just to offer ways to collaborate and improve on it!
I’m sold: pack it up and ship it in Ubuntu
I would love to see an equivalent of thinstall.com for Linux/*nix.
http://klik.atekon.de/ looks a whole lot like what you describe.
That topic is a slightly different can of worms, but one worth mentioning.
Take the average Suse or Ubuntu desktop that is a year or two old. Look at every program that is installed.
Re-install every program in its own directory with its own copy of every dependency except for giant things like Java.
How much extra hard drive space have you used? That is the price of never having to deal with a dependency conflict ever again.
Unfortunately disk space is only half of the problem. The other issue is that if you have multiple copies of a library on disk you will load it into memory multiple times. In some cases this is completely appropriate (single-use servers I run tend to use more statically linked binaries because I know what is getting loaded into memory), but for end-user systems the memory overhead when running multiple applications can kill you. There may also be issues with security updates etc having to be applied in multiple places. Still, static linking is certainly a widely used solution in many contexts, and should always be considered as another option to dealing with the complexities of dynamic dependencies.
I agree with what you’re saying
Re: downloading redundant updates, and storing redundant data in memory
It’s the same as hard disk space. Look at the present and the future, not the past.
Explicitly: Hard drive space, memory space, and bandwidth are all dirt cheap. Some people get ripped off more than others, but they are all dirt cheap in reality.
Today: $100 for 500 GB of hard drive space. $20 for 1 GB of RAM. $10 a month for 3000 GB of website transfers. $40 a month for 5 Mbps download speed to your home. And you can find even better rates when you leave North America and enter small European countries where they rip you off less.
So keep that in mind when you think of how much hard drive space, memory space, and bandwidth would be used by statically linking all those small libraries.
If this was evaluated on Linspire and the Linspire warehouse then is the performance of apt really applicable since Linspire mixes a lot of old/new packages in the warehouse that apt was never intended to handle?
Don’t get me wrong – it is probably a GREAT test to prove OPIUM but is it really a good example of what apt expects and was designed to provide?
I would concede that with a mixed warehouse OPIUM would beat apt to death – heck, SmartPM seems to do well with some mixing – but with a good repository that is well maintained I would assert that apt performs perfectly well.
Please back that assertion up with some data. Also, take a look at the EDOS project work that has done a lot to show that distribution management is, in fact, a tremendously difficult problem that is very much not solved yet. Linspire’s repositories are as valid as any other: APT is designed to work with any distribution, not just the official Debian ones! Perhaps you could also offer some concrete rules for what makes one distribution “good” while another is “bad”, and explain why those rules are useful/valid?
So how about letting us at that prototype tool? Not lying, only highlighting the parts you have to prove your point. That is what you do with an experiment or a case study – you prove what you set out to prove, and I doubt there has ever been a case of a study not proving what its purpose was.
Nah, I don’t think it is recommended in Debian that you keep stable/testing/unstable sources in your sources.list and use them willy-nilly, if that is what you are suggesting. Yet that is almost exactly what you are doing with Linspire’s repo.
No, I do not consider Linspire’s warehouse to be an acceptable test case for apt since they F*CK up everything and have packages from 2005 in the same warehouse with mono. I am not surprised you ended up with a shitty apt result.
I have installed and removed ocaml in Debian and nothing bad happened – sorry if Linspire fubar’d something and you think it reflects on apt.
I don’t know how much clearer I can make this. This is not a tool that an end user can use at the moment. It’s not something you can run on your system: there’s a big ol’ pile of instrumentation surrounding an automated testing suite surrounding the algorithm. There’s nowhere you, as a user, can tell it what you want to install, where the sources.list is, any of that stuff. The paper describes and evaluates the algorithm; if you really, really want to test it you can either (1) write it up yourself as an end-user usable program or (2) get someone else to do so. All you need to implement the algorithm is in the paper, and it’s really remarkably simple. You even know how it’ll behave, because we thoroughly evaluate it in the paper! What exactly do you think we’re hiding from you? That APT on other distributions suddenly doesn’t fail when it encounters complex problems? Or that other distributions magically don’t have these problems because Debian is smarter than Linspire and the devs somehow solve NP-complete problems in their heads?
I don’t believe I recommended that anywhere. I’d be surprised if you’ve never installed something from somewhere outside the official Debian repos, though, and I doubt you’d find that the APT developers expect APT to fail if you do so.
If this is really what it comes down to then there’s not much conversation to have here: you have a problem with Linspire and aren’t going to trust anything that I put forward if we used their data (which, obviously, we did). Many, many people use the Linspire repos successfully, and the process by which packages are moved into the repos is well thought out and tested. I’m not quite sure why having a package from 2005 in a repo with mono is any kind of a problem for you at all: do you have an example where this caused APT to fail? We can argue all day about the quality of the Linspire distribution, but none of that changes whether OPIUM does a better job than APT. Do you think it’s bad to let users specify which packages should be preferred for installation or removal? Or do you think that the Debian repos somehow magically make it a non-issue? If the latter, again, please look at the number of disjunctive dependencies in any Debian distro and tell me that there aren’t choices to be made.
How is it Linspire’s fault that APT removed the kernel when it didn’t need to? APT finds a solution that removes the kernel; OPIUM finds one that doesn’t. How is APT better?
Your comments seem to have degenerated into bashing Linspire rather than offering anything substantive for us to discuss. Take Linspire out of the equation and do some research on the Debian distros; you’ll quickly see that making choices is a huge part of the package selection process. Are you saying even in that context OPIUM is not worth the effort?
How is it Linspire’s fault that APT removed the kernel when it didn’t need to?
Same reason that removing Jackd will remove every damn thing from your system – the warehouse is a fubar mess.
How much clearer can I make that?
If the package/repo data says that removing ocaml means the kernel is ripped out, then apt is following directions – no more and no less. How is that apt’s fault? As you said, you have to trust the package data. So was it being installed or removed? Because the thread I remember was about removing ocaml – not installing it, as your paper states.
Remove jackd and watch everything disappear as well – why? Who knows why – it isn’t apt’s fault that someone screwed up and somehow wove jackd into almost every package on the system.
That’s the whole point! The package rules did not say that the kernel had to be ripped out! When you install something using Debian there are many (often hundreds) of ways of performing that install: add these things, remove these things, upgrade these things, etc. What APT did was grab the first solution it found where one of the things to remove was the kernel. What OPIUM did was find the best solution, which did not remove the kernel! The kernel was removed when ocaml was installed by APT because some dependency of ocaml conflicted in some way with something the kernel depended on; choosing a different package to satisfy that dependency meant that ocaml could be installed with the kernel. Both APT’s and OPIUM’s solutions adhere to the package rules, but one (OPIUM’s) is objectively better than the other. Does that make sense? Perhaps this whole thread has been a colossal misunderstanding!
This thread
http://forum.linspire.com/viewtopic.php?t=415761
is just one of many about dependency problems on linspire. Linspire forums are full of threads that apt-get -f had to be used and sometimes repeatedly – other times CNR had to be removed and reinstalled just to get it straight.
Here is another thread
http://forum.linspire.com/viewtopic.php?t=428171
Notice that it says:
“Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.”
Notice the comment about using packages from unstable/Incoming. You don’t get messages like that in Debian unless you mix repos or are using testing/unstable and the package versions are all screwed.
What about jackd? Ever try to remove that on Linspire? Ever try to remove any of the LOS packages or jiffy-stuff? All those want to rip out loads of packages – why? IMO it would have to be because they are threaded as a dependency through everything or are part of some sort of meta-package.
Why do these things occur on Linspire but not on Debian, if one repo is as good as another? Certainly apt did not just decide to rip out the kernel because it felt like it. It was a package conflict – they happen on Linspire all the time. Something was being upgraded, which preferred something else being upgraded, which preferred a newer something else. It is great that you could have told the *solver* to accomplish it some other way, but I question how long you can steer the solver before it freaks out.
The fact that apt does perform so well on Debian is pretty good proof to me. I am not saying it is perfect. Many times – especially during upgrades – it gets freaked out because of the mix of packages you are dealing with, and that is exactly the problem on Linspire.
IMO apt is poor on Linspire because they mix packages, from last week’s latest to packages from three years ago. I mean, they have Yahoo Messenger and whatever dependencies it needs – but Yahoo Messenger for Linux hasn’t been updated since Debian Woody!!! Same with other packages.
And no, I don’t think the apt guys would be mad at me for saying it – apt was never made to handle a cluster-fluck of packages, with stuff from three years ago alongside stuff from only months ago, along with proprietary and commercial packages, without some good usage of pinning/holds and knowing what you are doing. IMO apt is the package manager to be used on Debian stable – anything else is asking for trouble, not from apt but from the package data that apt deals with.
I do agree that you likely built an awesome tool to be used on a mixed warehouse/repository, but in a *sane* warehouse with a proper flavor structure I would bet that apt would perform just as well. But I guess we will never know…
But since you are so concerned about wasting everyone’s time – please feel free to stop replying to me, because you won’t prove anything about apt to me using the Linspire repo.
I’ll try to keep this short so I don’t, as you say, waste everybody’s time. Dependency errors are not unique to Linspire and to suggest they are is ridiculous. Take a look at the EDOS project’s work on Debian:
http://debian.edos-project.org/anla/list_bundles
Every single Debian distribution has at least one broken package in it, where “broken” is defined as “doesn’t have the dependencies needed in the distribution”. Given that that is the most trivial of errors to find, do you really think you can claim that more complex errors related to system state and interaction with those repositories don’t exist? You might also like to read some of the Debian bug lists and google about a bit to see people hitting these problems. Maybe they don’t count, who knows.
I don’t know how to explain this to you any better; you seem unwilling to listen, unwilling to research this on your own (did you ever look at the different ways you could install, say, kpoker on a system?), and unwilling to believe that Linspire isn’t some evil monster distribution that deliberately sets things up so APT will fail.
For the very last time, the particular distribution we point OPIUM at does not change the fact that:
1) It guarantees it will find a way to install any valid package (i.e. one whose dependencies are available in the distribution). APT offers no such guarantee.
2) It guarantees the solution it will find will be the best according to a user-supplied metric: it can be anything, but we find size, age, and rating to be particularly interesting ones. APT offers no optimal install functionality.
3) It guarantees that, should a package need to be removed, the solution will remove the best packages to remove, again according to any user-supplied metric: we find the raw number of packages removed, and a weighting that counts packages installed as dependencies as less valuable, to be two interesting ones. APT offers no such optimal removal functionality.
We demonstrated using a perfectly legitimate distribution with several hundred real-world user cases from actual users using the system day-to-day that these are valuable properties to have. We obeyed all of the rules of Debian, as did the distribution and user systems we tested. Work by the EDOS project demonstrates that distribution correctness problems are certainly not isolated to Linspire. We’re not aiming to replace APT, we’re aiming to improve it in cooperation with anyone who’s interested. Anyone out there with an open mind who’s interested in learning more about what we did and how we improved on the algorithms at the heart of APT, thank you for enduring this flame fest to get this far and I’d love to hear your ideas.
Dean, with you, I’m done. I’ve tried to be patient and civil in my responses to your comments (and believe me, that has been difficult). I’ve pointed you at numerous resources, given you plenty of examples and explanations of why and how things happen, detailed what the benefits and costs of OPIUM are, and you just don’t seem to give a damn. You just bitch about Linspire and complain that it’s not a real-world example because everyone should be using Debian stable (because that would obviously be more realistic…right).
As Duane said to you over on the Freespire forums “How are your contributions to the Free Software pool going?”
I thought so.
I think Kevin Carmony himself said that CNR has around a 99% success rate, and I assume that was CNR using apt. Seems like some numbers are confusing or I am missing some math (which is possible, since I suck at math).
Yes, the math is off. 99% (or something close) of CNRs are successful. Some percentage fail for sundry reasons (network connection problems etc.). About 0.25% fail because of APT’s incompleteness. This is the “install attempt failure rate”. However, bear in mind that most users install a lot of software on their system: our 600 sample users installed between about 50 and 200 additional packages on their systems, and of those 600 about one quarter experienced an install attempt failure at least once in their use of their machine. If you’re trying to offer any kind of guarantee to your users, it sucks when a quarter of them are hitting problems that you can’t identify or solve because of the state of their machine and APT’s behavior. That’s where OPIUM comes in.
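As a back-of-the-envelope check of how those two numbers fit together (assuming, as a simplification, that install attempts fail independently; the 115-install midpoint is just an illustrative value):

```python
# Per-install failure rate of ~0.25% turned into "chance a user hits at least
# one failure" over n installs, assuming independent attempts.
p = 0.0025
for n in (50, 115, 200):
    at_least_one = 1 - (1 - p) ** n
    print(f"{n:3d} installs -> {at_least_one:.0%} chance of at least one failure")
```

That works out to roughly 12%, 25%, and 39% respectively, so a ~99% per-install success rate and “about a quarter of users hit a failure” are entirely consistent.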
IMO if you want to prove something about apt, then use the distro that created it and where they know how to manage it, not the one where it isn’t even a supported method of software installation and the warehouse is a mix of packages…
I’m afraid we can only test it for the distro we have data for. If you can get me installation histories (including starting state and packages files at the time of every installation attempt) for 600 users of Debian, I’m sure we could look into re-running the experiments. As far as I know there’s no way to get this data, though: for reference, the data we ran on came to some 4 GB of package descriptions and such, so it’s not a trivial task to collect and use it!
For the research questions asked in the paper, whatever software is in the repository does not really matter; it could be all empty packages with nothing in them except the metadata.
The paper is a rather interesting contribution, given that completeness & optimality are not in general questions that are on the feature list of package/repository management tools. And I think the use of formal methods is rather novel in the field, as well.
Chris: you seem involved in this thing. I must say I’m totally impressed by the way you tackled the problem, and how you and your mates found a solution.
To me the details of the solution do not really matter; I’m not too good at maths. Anyway, I like coding, and I’m really interested in your work. So maybe one day I will try to implement it. But I think someone else will do it before I can understand the apt source code and how to patch that “thing” in.
I hope someone will do that soon enough!
I’m really glad you enjoyed it. The goal, after all, was to get this stuff out into the public so that people can read it and potentially implement it for others to use! Good luck if you do decide to dive into the APT source code, though…I believe the original name of “Deity” may have been a measure of the code-mastery required to successfully hack on it.
best name ever for a package manager.
seriously, you couldn’t manufacturalize this stuffication in an upwardly directification.
It’s a computer science term. What else did you want? Optimalness? That’s not a word. Optimal? That wouldn’t fit in the sentence. Optimality? Oh wait, you think that’s too stuffy.
I’m wondering why dpkg does not support side-by-side installations (e.g. Firefox 1.5 and Firefox 2.0). This is a genuine question, not a troll.
I have seen some discussion regarding shared libs and hard disk space, but are these issues more important than flexibility?
All these issues about conflicting packages: why not install packages the way GoboLinux or Zero Install do? Would that not help?
There are a number of reasons to use dynamic rather than static linking/dependency management which are all ultimately about avoiding duplication and saving space. However, it’s really important to think about what the duplication means and exactly what space is being used.
Let’s take a library like libc. It’s used by pretty much every application on your system, and is responsible for all that low-level guck like IO. If you ever run strace on an app you’ll see a whole lot of calls going through libc. Here are a couple of reasons to prefer dynamic linking over static linking:
1) Let’s say libc has a security flaw. If everything is statically linked to it, you have to update all of your applications. If, instead, everything dynamically links you need only update libc.
2) Every time you make a call into libc the computer has to figure out where the instructions to run are and go out and get them. Your computer has a couple of key pieces of hardware to help with this: the TLB and the processor cache. Both are small because they’re on-die with your processor, but are blazingly fast (many times faster than having to go out and have a shufti at main memory, which itself is many many times faster than having to have a look at a hard disk). You want to have the stuff you use most commonly in the TLB and cache; if you have duplicates of a library loaded not only will they compete for space in memory and on disk, but also in the TLB and cache. With dynamic linking there will only be one copy of each library, whereas with static there may be many. Incidentally, managing the cache and TLB are two of the problems that make virtualization tricky, and you may notice big performance hits when switching between virtualized apps because of this.
Essentially it’ll always come down to a space and duplication argument. The downside of dynamic dependencies, of course, is that they’re complicated to manage and maintain. Like everything, the decision of whether to go with a static dependency or a dynamic dependency is a question of trade-offs, and the answer will depend heavily on your situation. Realistically, dynamic linking is here to stay, and static linking is typically most appropriate for user-level applications running in relative isolation (often on single- or limited-use boxes). Systems like Zero Install blur the line and give the user more options, and it’s great to see these things being tested and used. You just have to remember that the flexibility/ease-of-use doesn’t come for free.