Re-thinking versioning, cohesion and Require-Bundle

This post is just a reflection, based on something that was recently discussed in the Bnd developer’s Slack group. It caused us to briefly touch on an age-old argument of Import-Package vs Require-Bundle.

Historically, the Bnd team has staunchly favoured Import-Package over Require-Bundle. However, the Eclipse team has not been so purist, and the Eclipse core currently relies on Require-Bundle in order to work properly. Even @pkriens admitted (in the aforementioned Bndtools Slack group) to softening his purist approach against Require-Bundle (though I’m still not 100% sure if he was serious or joking… :smile: ). I was prompted to re-visit @pkriens’ blog post from a decade-and-a-half ago where the Import-Package approach was put forward (OSGi Blog: JSR 277 and Import-Package).

I nearly always use Import-Package for the reasons @pkriens outlined in the article. The only time I don’t is when my upstream dependencies are constructed in such a way as to force me to use Require-Bundle, often because of split packages (which is the case with iDempiere). However, at the same time, I have also seen a glimmer of when Require-Bundle might be a good idea (or at least, not a terrible idea).

In @pkriens’ original article, the argument is based on the concept of cohesion. Cohesion is roughly defined as the amount of interdependency between different parts of the code. If you draw a dependency graph, a highly cohesive block of code will have lots of links. To put it another way, there will be lots of different parts of the code that don’t really work without each other (so it is rare to import one package without the others), and changes to one more often than not require changes to the others.

The argument consists of two parts:

  1. Versioning makes most sense on aggregates of code that are highly cohesive.
  2. Packages are usually highly cohesive, but bundles are usually not so cohesive.

I think that the first part of this argument is 100% accurate. However, upon reflection, I think that the second part of the argument, while generally true, is less absolute - and it is something that is under the control of the developer.

I think it is certainly true that packages tend to be more cohesive than bundles. I think it is equally certain that it is possible to produce packages that aren’t quite so cohesive - the fact that packages can be “split” across bundles, causing the “split package” problem, is an indication of a package whose cohesion is breaking up and should probably be refactored into separate packages.

However, it is possible to have a cohesive group that spans multiple packages. Examples are some of the API specs out there like JAX-WS or the servlet APIs - groups of packages that together make up a single version of the API. In such cases, it often makes sense to use a single version number for the group of public API packages. OSGi even created the concept of “contracts” to encompass this very idea (disclaimer: I’ve never really used contracts in OSGi so this is based on my theoretical understanding). More on contracts here and here.

It is true that a bundle could theoretically contain packages that aren’t highly cohesive (indeed, in practice this is often the case). However, it is also possible that a developer could make a commitment to downstream users to be careful about how they aggregate (exported) packages into bundles - to maintain only and all cohesive packages in their bundle (ie, don’t move them out somewhere else without an appropriate major version bump, and don’t put other packages in there that aren’t closely related). If such a commitment is made, then the bundle becomes a de facto type of a “contract”. The advantage of using Require-Bundle over a contract is that you don’t have to explicitly list all of the required packages that you will inevitably require if you are using the API.

The main disadvantage of Require-Bundle is that you end up importing all packages in the bundle - including those that you don’t need. This increases dependency fanout unnecessarily. However, if the bundle you are requiring consists of only cohesive packages, chances are that you will end up needing to import all or most of them anyway, and as the packages are mutually cohesive they are likely to have more-or-less the same set of transitive dependencies, so the chances of increasing your dependency fanout are small.
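To make the fanout point a bit more concrete, here is a minimal Java sketch. Everything in it is made up for illustration: the package names, the `depsOf` map (standing in for each exported package's transitive dependencies) and the helper names. It only compares what you get wired to when you take every exported package (Require-Bundle) versus only the packages you use (Import-Package).

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FanoutSketch {
    // Union of the dependencies dragged in by a set of wired packages.
    static Set<String> fanout(Set<String> wired, Map<String, Set<String>> depsOf) {
        Set<String> result = new TreeSet<>();
        for (String pkg : wired) {
            result.addAll(depsOf.getOrDefault(pkg, Set.of()));
        }
        return result;
    }

    // Dependencies that requiring the whole bundle adds on top of
    // what the packages you actually use already need.
    static Set<String> extraFanout(Set<String> exported, Set<String> used,
                                   Map<String, Set<String>> depsOf) {
        Set<String> extra = fanout(exported, depsOf);
        extra.removeAll(fanout(used, depsOf));
        return extra;
    }

    public static void main(String[] args) {
        // Hypothetical data: api and api.spi share their dependencies, misc does not.
        Map<String, Set<String>> depsOf = Map.of(
            "com.example.api", Set.of("org.example.json"),
            "com.example.api.spi", Set.of("org.example.json"),
            "com.example.misc", Set.of("org.example.pdf"));
        Set<String> used = Set.of("com.example.api");

        // Cohesive bundle: requiring it adds nothing beyond what is needed anyway.
        System.out.println(extraFanout(
            Set.of("com.example.api", "com.example.api.spi"), used, depsOf));

        // Mixed bundle: requiring it drags in an unrelated dependency.
        System.out.println(extraFanout(
            Set.of("com.example.api", "com.example.misc"), used, depsOf));
    }
}
```

A real resolver obviously does far more than this (versions, uses constraints, optional imports); the sketch only shows why the extra fanout shrinks when the exported packages share their dependencies.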

That being said, the only advantage I can think of here is that it’s easier to write your dependencies in your OSGi manifest - you only have to put one Require-Bundle rather than several Import-Packages. This advantage isn’t much of a real issue if you’re using Bnd to build your bundles (and automatically generate the appropriate Import-Package statements).
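For anyone who hasn't written these headers by hand, the difference in effort looks roughly like the sketch below. The bundle symbolic name, package names and version range are hypothetical; the `bundle-version` and `version` attributes are the standard OSGi ones. The helpers just build the header values as strings.

```java
import java.util.List;
import java.util.stream.Collectors;

class ManifestSketch {
    // One clause covers every package the required bundle exports.
    static String requireBundle(String symbolicName, String range) {
        return symbolicName + ";bundle-version=\"" + range + "\"";
    }

    // One clause per package that is actually used.
    static String importPackage(List<String> packages, String range) {
        return packages.stream()
            .map(p -> p + ";version=\"" + range + "\"")
            .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Hypothetical names, purely for illustration.
        List<String> used = List.of(
            "com.example.api",
            "com.example.api.spi",
            "com.example.api.annotations");

        System.out.println("Require-Bundle: "
            + requireBundle("com.example.api", "[1.0,2)"));
        System.out.println("Import-Package: "
            + importPackage(used, "[1.0,2)"));
    }
}
```

With Bnd the Import-Package line is generated from the bytecode, so nobody has to maintain that list by hand.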

I’m not sure what the runtime performance overheads are of Require-Bundle vs Import-Package though. I can’t imagine there would be a great deal of measurable difference between the two.

So in the end, I think sticking with Import-Package is probably the best way to go if you’re using the Bnd toolchain. However, perhaps Require-Bundle is not (always) as bad as it seems?

Thoughts?


Thanks for this great article Father Krieg, appreciated.

The missing part here is services. The key idea of OSGi is to develop software that is based on a service model, where a service is an object with a well-defined, separate API. From the beginning we made it very clear that a service was specified in a package. A service is not only an interface; it generally uses a collaboration of multiple objects. Following this model, you’ll find that an API package is not only highly cohesive and substitutable, but most of all that a service API package tends to be extremely simple, because it does not include the implementation dependencies. The service API quite often only depends on the Java runtime and itself. Even in OSGi, where the OSGi framework could’ve caused us to include the ServiceReference in almost any spec, it is rare to see a service specification that needs it.

To me this is the biggest value of OSGi, the reduction of the public API that you get with the service model. It is not hard to get a service API obviously correct because it is so simple. Because it is so simple, it is not too hard to get the implementation right. A service API that goes over multiple packages is therefore rarely necessary.

That said, I do recognize the situation when API and implementation intersect. I called that a library in enRoute. In my experience, it is also not too hard to keep the content of a package cohesive. However, my rather sad experience is that many developers decompose by arbitrary rules and place classes in other packages without considering the effects on cohesion and coupling.

However, some things are really complex and do require more packages. Even there I think I can argue that package versioning is better than bundle versioning. If you need different packages, there should be a reason why they are separate. That reason might one day cause you to change the packaging. For example, we delivered all the service packages in a compendium JAR and then decided to deliver them as stand-alone JARs for each service. With package versioning and resolving you won’t even notice the refactoring. With Require-Bundle, however, …
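A toy illustration of that refactoring, as a Java sketch under heavy simplifying assumptions: a "deployment" is just a map from bundle symbolic name to exported packages, versions are ignored, and the two `resolves…` helpers are hypothetical stand-ins for what a real resolver does. The bundle and package names are only illustrative.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RefactorSketch {
    // Import-Package style: satisfied if some bundle, any bundle, exports each package.
    static boolean resolvesByPackage(Map<String, Set<String>> bundles, Set<String> imports) {
        Set<String> available = new HashSet<>();
        bundles.values().forEach(available::addAll);
        return available.containsAll(imports);
    }

    // Require-Bundle style: satisfied only if a bundle with that exact name is present.
    static boolean resolvesByBundle(Map<String, Set<String>> bundles, String requiredBundle) {
        return bundles.containsKey(requiredBundle);
    }

    public static void main(String[] args) {
        Set<String> imports = Set.of("org.osgi.service.event", "org.osgi.service.cm");

        // Before: one compendium bundle exporting both service packages.
        Map<String, Set<String>> before = Map.of("osgi.cmpn", imports);

        // After: the same packages delivered as one bundle per service.
        Map<String, Set<String>> after = Map.of(
            "org.osgi.service.event", Set.of("org.osgi.service.event"),
            "org.osgi.service.cm", Set.of("org.osgi.service.cm"));

        System.out.println("Import-Package, before: " + resolvesByPackage(before, imports));
        System.out.println("Import-Package, after:  " + resolvesByPackage(after, imports));
        System.out.println("Require-Bundle, before: " + resolvesByBundle(before, "osgi.cmpn"));
        System.out.println("Require-Bundle, after:  " + resolvesByBundle(after, "osgi.cmpn"));
    }
}
```

The package-based consumer is indifferent to which bundle ends up carrying a package, while the bundle-based consumer is tied to a name that the repackaging removed.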

So I do hope you understood I was joking :slight_smile:

If you need different packages, there should be a reason why they are separate.

I think this is a key point. One could make an argument that, if you have two or more packages that are cohesive/coupled enough that you might contemplate a “contract”-style implementation (whether implemented as official OSGi contract spec or using the pseudo-contract alternative I proposed in conjunction with Require-Bundle), then quite possibly one could argue that they shouldn’t be separate packages to begin with.

Sometimes the only reason that packages are separate is historical accident + inertia - ie, an original decision was made for separate packages which later turned out to be the wrong decision, but due to the difficulty of renaming/moving packages in an API the packages remain separate. But in such cases, the burden of merging the packages is probably no greater than the burden of maintaining a contract/bundle that includes them and only them.

For example, we delivered all the service packages in a compendium JAR and then decided to deliver them as stand alone JARs for each service.

To be fair, I don’t think this is a good counter-example to the use-case I was contemplating. The use-case I was contemplating was for a bundle of packages that are all mutually tightly-coupled/highly cohesive. The very fact that the service package compendium bundle was called a “compendium” indicates that it was an aggregate of packages that were only loosely coupled to each other, so it doesn’t fall into this category.

So I do hope you understood I was joking :slight_smile:

I was 99% sure. :smile: The above was something I had been contemplating for a while.

At the end of the day (aside from the performance, which I think isn’t a practical issue but which I haven’t verified), I think the only thing Require-Bundle has going for it is that it makes it easier to write your manifest by hand. So for those of us who are already using Bnd (and thanks to @laeubi’s contributions, even more in the future for those using PDE too), there is no real upside to Require-Bundle.

As an aside, it also occurred to me that, with the kind of analysis that Bnd already does (looking at dependencies at a class level and translating them to packages), it could already (in theory at least) produce some kind of metric to measure and report on inter-package cohesion. This could aid the developer in identifying pain points that could be targeted for refactoring.
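As a rough idea of what such a metric might look like, here is a minimal Java sketch. It is not anything Bnd provides: the `internalRatio` measure (the share of a package's class-level references that stay inside the package) is just one hypothetical choice, and the class names and dependency map are invented.

```java
import java.util.Map;
import java.util.Set;

class CohesionSketch {
    // Package part of a fully qualified class name ("" for the default package).
    static String packageOf(String className) {
        int dot = className.lastIndexOf('.');
        return dot < 0 ? "" : className.substring(0, dot);
    }

    // Fraction of references made by classes in pkg that point back into pkg.
    // Returns 0.0 when the package's classes reference nothing at all.
    static double internalRatio(String pkg, Map<String, Set<String>> classDeps) {
        int internal = 0;
        int total = 0;
        for (Map.Entry<String, Set<String>> entry : classDeps.entrySet()) {
            if (!packageOf(entry.getKey()).equals(pkg)) {
                continue; // only count references made from inside pkg
            }
            for (String target : entry.getValue()) {
                total++;
                if (packageOf(target).equals(pkg)) {
                    internal++;
                }
            }
        }
        return total == 0 ? 0.0 : (double) internal / total;
    }

    public static void main(String[] args) {
        // Hypothetical class-level dependencies, as a class-file analysis might report them.
        Map<String, Set<String>> classDeps = Map.of(
            "com.example.a.Foo", Set.of("com.example.a.Bar", "com.example.b.Baz"),
            "com.example.a.Bar", Set.of("com.example.a.Foo"),
            "com.example.b.Baz", Set.of("com.example.a.Foo"));

        System.out.println(internalRatio("com.example.a", classDeps));
        System.out.println(internalRatio("com.example.b", classDeps));
    }
}
```

A low ratio for a package would only be a hint to look closer, not a verdict; a small API package that references only the Java runtime would need different treatment.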


Show me :slight_smile:

In my experience developers lump stuff together because it is convenient at the time. Very few JARs show signs of well-thought-out modules, imho. They are often based more on categorizing than on following the original goal from Parnas: minimizing complexity in the light of change. And it is true, that is really hard to get right the first time. The difference I see between myself and many other developers is that I am willing to redo it until I get it right, because it is imho extremely important in the long term. Many developers are often not even aware that their composition can be improved, with beneficial long-term effects.

it could already (in theory at least) produce some kind of metric to measure and report on inter-package cohesion

It’s probably 15 years ago or so that we had a Google Summer of Code student from China with exactly this goal. If you look in aQute.libg you’ll find a Tarjan implementation that does the groundwork. It calculates groups of mutually dependent elements. At the time I thought the student would be perfect to work this out and make decomposition recommendations or reviews.

Unfortunately, he showed up about three days before he needed my ok to let Google pay him after which he disappeared again until the next session. Fell for it 2 times, but it was a total scam :frowning:

So I tried …
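For readers who haven't met it, Tarjan's algorithm finds the strongly connected components of a directed graph, which for a package dependency graph means groups of packages that all depend on each other, directly or indirectly. Below is a small stand-alone Java sketch of the textbook algorithm. It is not the aQute.libg code and makes no claim about that API; the class name, method name and the example graph are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PackageCycles {
    private final Map<String, List<String>> graph;
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> low = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<Set<String>> components = new ArrayList<>();
    private int counter = 0;

    private PackageCycles(Map<String, List<String>> graph) {
        this.graph = graph;
    }

    // Returns the strongly connected components of a "package -> packages it uses" graph.
    static List<Set<String>> stronglyConnected(Map<String, List<String>> graph) {
        PackageCycles tarjan = new PackageCycles(graph);
        for (String node : new TreeSet<>(graph.keySet())) {
            if (!tarjan.index.containsKey(node)) {
                tarjan.visit(node);
            }
        }
        return tarjan.components;
    }

    private void visit(String v) {
        index.put(v, counter);
        low.put(v, counter);
        counter++;
        stack.push(v);
        onStack.add(v);

        for (String w : graph.getOrDefault(v, List.of())) {
            if (!index.containsKey(w)) {
                visit(w); // tree edge: recurse, then inherit the lowest reachable index
                low.put(v, Math.min(low.get(v), low.get(w)));
            } else if (onStack.contains(w)) {
                low.put(v, Math.min(low.get(v), index.get(w))); // edge back into the current path
            }
        }

        // v is the root of a component: pop everything above it off the stack.
        if (low.get(v).equals(index.get(v))) {
            Set<String> component = new TreeSet<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                component.add(w);
            } while (!w.equals(v));
            components.add(component);
        }
    }

    public static void main(String[] args) {
        // Hypothetical graph: api and impl depend on each other, util stands alone.
        Map<String, List<String>> graph = Map.of(
            "com.example.api", List.of("com.example.impl"),
            "com.example.impl", List.of("com.example.api", "com.example.util"),
            "com.example.util", List.of());
        System.out.println(stronglyConnected(graph));
    }
}
```

Packages that land in the same component cannot be released or versioned independently in any meaningful way, which is the kind of signal a decomposition review could start from.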


Thanks both of you @kriegfrj and @pkriens this is very interesting.

Thanks for the read. I must admit I fall / fell in this category. But working with OSGi and with this community made me aware at least.

Oh, something like that would be cool in bndtools: right-click on a bundle or package and see some kind of cohesion metric / graph. E.g. if a package has no outgoing or incoming connections to other packages, it would mean it is pretty cohesive.

anyway, keep going :slight_smile:
