Using Eclipse JDT: AST

I’ve been writing lots of quick fixes for bndtools lately. Since there seems to be very little introductory material in this area, I decided to share my experiences here so I can use them as a reference. The code used here is currently in an upcoming PR but will soon end up in 7.1.0-SNAPSHOT.

I admit, I am not a fan of the Eclipse JDT APIs. This is partly because they have a distinctly vintage feel due to the lack of generics and enums. There is a huge amount of slippery code where it is up to you to keep track of type safety, and there is an overwhelmingly large number of strings involved. Interfaces are not used consistently, and in many places an adapter object defines the API, which makes lambdas hard, if not impossible, to use. Also, the Java language has evolved tremendously since the beginning, and the code clearly shows that an overhaul is long overdue.

However, there is also the problem that it is hard to get a good overview of what the intention was. I’ve been working with the AST API for some weeks and only now am I starting to get some respect for what the code does.

Quick fixes show up when you type Control-1. Some time ago one of our committers, Father Krieg, made it trivial to add quick fix processors in an OSGi component. So now you only have to create your component class and you can add a quick fix. This has become so easy that organizations could consider writing a plugin for their own annotation types. The work I did was adding quick fix support for gogo commands, components, literals, and some minor frustrations with missing base quick fixes. These quick fixes can save developers significant time.

A quick fix must analyze the context of where the text cursor is placed. For example, if it is placed on a String Literal, it would be possible to propose converting the text to upper case if the literal had letters that could be changed to upper case. The proposal then consists of some text to explain the proposed change, an icon, and a callback. When the user selects the proposal, the callback is called and it must modify the program.
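
The upper case example can be sketched in a few lines of plain Java. This is a toy model under loud assumptions, not the JDT or bndtools API: `Proposal` and `UpperCaseFix` are made-up names, the icon is left out, and the callback works on a plain string instead of on the program.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.function.UnaryOperator;

/** Toy proposal (hypothetical, not a JDT type): explanatory text plus a callback. */
final class Proposal {
    final String label;                // text shown to the user
    final UnaryOperator<String> apply; // callback: old literal text -> new literal text

    Proposal(String label, UnaryOperator<String> apply) {
        this.label = label;
        this.apply = apply;
    }
}

final class UpperCaseFix {
    /** Offers a proposal only when upper-casing would actually change the literal. */
    static Optional<Proposal> propose(String literal) {
        String upper = literal.toUpperCase(Locale.ROOT);
        if (upper.equals(literal)) {
            return Optional.empty(); // nothing to change, so nothing to propose
        }
        return Optional.of(new Proposal("Convert to upper case",
                s -> s.toUpperCase(Locale.ROOT)));
    }
}
```

The point of returning an empty `Optional` is that the analysis and the change are separate steps: the analysis runs every time the user asks for quick fixes, the callback only runs when the proposal is selected.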

There is a lot going on in this definition. First, what is the cursor placed on? Clearly the cursor is in the text buffer, but it would be quite hard to find out what that position really means. Anybody who has ever tried to figure out, from a random position in a text buffer, where in a Java program they are will testify how hard this is.

For this reason, Eclipse JDT has parsed the text into an AST, an Abstract Syntax Tree. In an AST, every construct is represented as a node with a given type and a set of child nodes of potentially mixed other types. That is, the nodes and the resulting tree structure are generic; the type of node provides the semantics.

At the top of the Eclipse AST we have a Compilation Unit, which represents the whole source file. For each structure in the Java language there is a specific class that extends the ASTNode class. For example, the PackageDeclaration class represents the package foo.bar; part. Every semantic aspect of the source is represented as an ASTNode. Types and names are nodes as well. There are no simple attributes or properties; everything, however small, is represented as a node. For example, a fully qualified name has a Name node for each of its segments.
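
As a rough picture of that shape, here is a toy tree for `package foo.bar;` in plain Java. It is only a sketch: `ToyNode` and `ToyAst` are invented for illustration, the type tags are plain strings, and the real JDT classes carry far more information. The `add` method also enforces the two structural rules an AST relies on: a node has at most one parent and the tree has no cycles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy AST node (hypothetical, not a JDT class): type tag, optional text, one parent, ordered children. */
final class ToyNode {
    final String type;
    final String text; // null for nodes that only group children
    private ToyNode parent;
    private final List<ToyNode> children = new ArrayList<>();

    ToyNode(String type, String text) {
        this.type = type;
        this.text = text;
    }

    /** Attaches a child; refuses nodes that already sit in a tree or would create a cycle. */
    ToyNode add(ToyNode child) {
        if (child.parent != null) {
            throw new IllegalArgumentException("node already belongs to a tree");
        }
        for (ToyNode n = this; n != null; n = n.parent) {
            if (n == child) {
                throw new IllegalArgumentException("adding this node would create a cycle");
            }
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    ToyNode parent() { return parent; }

    List<ToyNode> children() { return Collections.unmodifiableList(children); }

    /** Renders the subtree as an indented outline, one node per line. */
    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(type);
        if (text != null) sb.append(' ').append(text);
        sb.append('\n');
        for (ToyNode c : children) c.render(sb, depth + 1);
    }
}

final class ToyAst {
    /** Builds the toy tree for a source file that contains only "package foo.bar;". */
    static ToyNode packageFooBar() {
        ToyNode name = new ToyNode("QualifiedName", null)
                .add(new ToyNode("SimpleName", "foo"))
                .add(new ToyNode("SimpleName", "bar"));
        return new ToyNode("CompilationUnit", null)
                .add(new ToyNode("PackageDeclaration", null).add(name));
    }
}
```

Calling `ToyAst.packageFooBar().render()` gives an outline with the compilation unit at the top and the two name segments as leaves.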

It is important to realize that an ASTNode represents the text of the Java source code, not its semantics. Although it provides significant support for the Java semantics, it must faithfully represent the lexical structure of the Java source and also be able to represent erroneous code. For example, comments and Javadoc constructs are also represented as specifically typed nodes. The purpose of the AST is to structurally edit the content of an editor. The AST cannot have cycles, and a node is either part of exactly one tree or not connected at all.

Navigating the AST can happen in different ways. It supports the visitor pattern. With this pattern you can quickly traverse the tree and, with little effort, filter out the interesting nodes. The disadvantage is that it requires a bit of work to keep the context, because you are being called out of the blue.
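
The context problem can be shown with a toy visitor in plain Java. This is a sketch, not the JDT visitor: `Tree`, `TreeVisitor` and `LiteralCollector` are made up, and only the idea of a `visit` call before the children and an `endVisit` call after them is borrowed. Because each callback sees one node in isolation, the collector keeps its own stack to remember which method it is inside.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy tree node (hypothetical): a type tag, some text, and children. */
final class Tree {
    final String type;
    final String text;
    final List<Tree> children = new ArrayList<>();

    Tree(String type, String text, Tree... kids) {
        this.type = type;
        this.text = text;
        for (Tree k : kids) children.add(k);
    }

    /** Calls visit, descends only if it returned true, then always calls endVisit. */
    void accept(TreeVisitor v) {
        if (v.visit(this)) {
            for (Tree c : children) c.accept(v);
        }
        v.endVisit(this);
    }
}

interface TreeVisitor {
    boolean visit(Tree node);   // return false to skip the children
    void endVisit(Tree node);
}

/** Collects string literals together with the name of the enclosing method. */
final class LiteralCollector implements TreeVisitor {
    private final Deque<String> methods = new ArrayDeque<>(); // the context we track ourselves
    final List<String> found = new ArrayList<>();

    @Override
    public boolean visit(Tree node) {
        if (node.type.equals("MethodDeclaration")) {
            methods.push(node.text);
        } else if (node.type.equals("StringLiteral")) {
            String where = methods.isEmpty() ? "<no method>" : methods.peek();
            found.add(where + ": " + node.text);
        }
        return true;
    }

    @Override
    public void endVisit(Tree node) {
        if (node.type.equals("MethodDeclaration")) {
            methods.pop(); // leaving the method, so drop it from the context
        }
    }
}

final class VisitorDemo {
    static List<String> run() {
        Tree unit = new Tree("CompilationUnit", null,
                new Tree("MethodDeclaration", "greet", new Tree("StringLiteral", "hello")),
                new Tree("MethodDeclaration", "part", new Tree("StringLiteral", "bye")));
        LiteralCollector collector = new LiteralCollector();
        unit.accept(collector);
        return collector.found;
    }
}
```

The filtering is one `if` statement, which is the attraction of the pattern. The stack is the bookkeeping you pay for it.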

The other way is to traverse the tree manually. The children of an ASTNode are organized as properties. There are two types of properties, simple and list properties. Simple properties represent 0…1 nodes, list properties represent 0…n nodes. Every property has a key. A key is unique for a specific ASTNode type and property. There are simple keys and list keys. Navigating is cumbersome with this API, especially since it does not use any generics. The API sadly does not protect you from common mistakes.
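
To show why this style is easy to get wrong, here is a toy version of keyed properties in plain Java. Everything in it is an assumption made for illustration: `Key`, `PropNode` and the key constants are invented, and a plain string stands in for what would be a child node in the real tree. The getter returns `Object`, so the caller has to know which cast belongs to which key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy property key (hypothetical): belongs to one node type, either simple (0..1) or list (0..n). */
final class Key {
    final String nodeType;
    final String name;
    final boolean list;

    Key(String nodeType, String name, boolean list) {
        this.nodeType = nodeType;
        this.name = name;
        this.list = list;
    }
}

/** Toy node whose children are reachable only through keys. */
final class PropNode {
    final String type;
    private final Map<Key, Object> props = new HashMap<>();

    PropNode(String type) {
        this.type = type;
    }

    void set(Key key, Object value) {
        check(key);
        if (key.list != (value instanceof List)) {
            throw new IllegalArgumentException("wrong kind of value for key " + key.name);
        }
        props.put(key, value);
    }

    /** Untyped on purpose: the caller must pick the right cast. */
    Object get(Key key) {
        check(key);
        return props.get(key);
    }

    private void check(Key key) {
        if (!key.nodeType.equals(type)) {
            throw new IllegalArgumentException("key " + key.name + " does not belong to " + type);
        }
    }
}

final class PropertyDemo {
    static final Key METHOD_NAME = new Key("MethodDeclaration", "name", false);
    static final Key METHOD_PARAMETERS = new Key("MethodDeclaration", "parameters", true);
    static final Key VARIABLE_NAME = new Key("SingleVariableDeclaration", "name", false);

    static String describe() {
        PropNode a = new PropNode("SingleVariableDeclaration");
        a.set(VARIABLE_NAME, "a"); // a string stands in for a child node here
        PropNode b = new PropNode("SingleVariableDeclaration");
        b.set(VARIABLE_NAME, "b");

        List<PropNode> parameters = new ArrayList<>();
        parameters.add(a);
        parameters.add(b);

        PropNode method = new PropNode("MethodDeclaration");
        method.set(METHOD_NAME, "sum");
        method.set(METHOD_PARAMETERS, parameters);

        // Nothing but convention ties each key to the cast used with it.
        String name = (String) method.get(METHOD_NAME);
        @SuppressWarnings("unchecked")
        List<PropNode> ps = (List<PropNode>) method.get(METHOD_PARAMETERS);

        StringBuilder sb = new StringBuilder(name).append('(');
        for (int i = 0; i < ps.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append((String) ps.get(i).get(VARIABLE_NAME));
        }
        return sb.append(')').toString();
    }
}
```

Swapping the two casts in `describe` still compiles and only fails at run time, which is the kind of mistake a generically typed key could rule out.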

When you are asked for a quick fix, you get the context node in the AST, for example a String Literal. You can then poke around to see what the content is. If you find you can do something useful, you make a proposal that describes the change you can make, for example turning the text into upper case, and provide a callback. When your proposal gets selected, the callback is executed.

Although the AST represents the current editor’s content, you cannot directly change the AST. Instead, you use the ASTRewrite class. This class takes the existing AST and records the changes you want to make to the AST without actually making them.

For example, you can create a new String Literal node and set the content to the upper cased string. This new String Literal is not connected to the original AST and never will be. However, the ASTRewrite can make the changes to the text buffer to remove the old String Literal node and add the new String Literal node. These changes are recorded as text changes to the text buffer. This is a relatively fragile process because you are supposed to make the changes in a careful order; I found it goes wrong easily. Once all the changes are done, you can get a TextEdit object that can be used to apply those changes to the editor’s Document.
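
The record-first, apply-later idea can be shown with a toy rewriter over a plain string. This is a sketch under stated assumptions, not how the JDT rewrite class is implemented: `ToyRewrite` and `Replacement` are invented names, and it works on character offsets rather than on nodes. It records replacements against the original text, rejects overlapping ones, and applies them from the back of the buffer to the front so that earlier offsets stay valid.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** One recorded change (hypothetical type): replace length chars at offset with text. */
final class Replacement {
    final int offset;
    final int length;
    final String text;

    Replacement(int offset, int length, String text) {
        this.offset = offset;
        this.length = length;
        this.text = text;
    }
}

/** Toy rewriter: records changes against the original source, applies them all at the end. */
final class ToyRewrite {
    private final String source;
    private final List<Replacement> edits = new ArrayList<>();

    ToyRewrite(String source) {
        this.source = source;
    }

    /** Records a replacement; offsets always refer to the ORIGINAL source. */
    void replace(int offset, int length, String text) {
        if (offset < 0 || length < 0 || offset + length > source.length()) {
            throw new IndexOutOfBoundsException("range outside the source");
        }
        for (Replacement e : edits) {
            if (offset < e.offset + e.length && e.offset < offset + length) {
                throw new IllegalArgumentException("overlapping edits");
            }
        }
        edits.add(new Replacement(offset, length, text));
    }

    /** Applies the recorded edits back to front, so earlier offsets are not shifted. */
    String apply() {
        List<Replacement> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((Replacement e) -> e.offset).reversed());
        StringBuilder sb = new StringBuilder(source);
        for (Replacement e : sorted) {
            sb.replace(e.offset, e.offset + e.length, e.text);
        }
        return sb.toString();
    }
}

final class RewriteDemo {
    static String run() {
        String source = "String a = \"hello\"; String b = \"bye\";";
        ToyRewrite rewrite = new ToyRewrite(source);
        for (String literal : new String[] { "\"hello\"", "\"bye\"" }) {
            int at = source.indexOf(literal);
            rewrite.replace(at, literal.length(), literal.toUpperCase(Locale.ROOT));
        }
        return rewrite.apply();
    }
}
```

Sorting by descending offset is one simple way to avoid the ordering trouble in this toy; the real rewriter has to deal with nested and moved nodes, which is presumably where the fragility comes from.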

The JDT is a very powerful library but it is far from easy to use. Like many Eclipse APIs, its ancient roots from the late nineties show. Java has undergone a huge evolution that made the language a lot easier to use and more powerful, and those improvements are sorely missed in this complex API. That said, it is still quite impressive what it does, and maybe even more impressive that this code is already 25 years old.


There is a lot of great existing code inside Eclipse (and other brilliant code bases).
But maintainers (and modernizers) are really difficult to find for many of the projects.
Many young upcoming (5-min hello world in a new language/framework) developers are more interested in developing new functionalities, solving the already solved things, again and again, than maintaining or modernizing existing code.

There are a number of problems with the Eclipse eco-system that make me pretty sad.

I feel part of the history of Eclipse. I flew with Bjorn Freeman-Benson, who worked on Visual Age for Java on Vancouver Island, I was at OOPSLA 2001 in Tampa where IBM made Eclipse open source, and I participated in its arduous selection process for OSGi in 2003-2005. And a lot of people in Eclipse didn’t like me … Most of all, I’ve been an enthusiastic user of the Eclipse IDE since the beginning.

When I think about Eclipse, licensing comes to mind first. Open source was a huge threat to large companies like IBM because of hidden patents. If IBM was caught with its pants down, it could be sued for billions. GPL was a mortal threat. So either develop everything internally or make sure it is good open source. Eclipse was never a technically oriented open source group; it was there mostly to ensure good licenses. That is, good for IBM and other large businesses. The foundation never put the developer first; the lawyers were primary. Although I am a committer on Eclipse Equinox, it never felt like the right place to put my work due to this impedance mismatch between business goals and technical goals.

This trade-off was bearable as long as IBM and some other big companies made a lot of developers available to fight the ever-present bit rot and add new features. However, I guess somewhere around 2005 IBM significantly reduced its technical effort. This left a vacuum that was never really filled. Other parties stepped up, but often to use Eclipse as a hospice: providing a committer for the basic care while desperately hoping to pick up some committers that did not have to be on the payroll.

In 2001 the Eclipse IDE was a magnificent software product; its demise since 2005 has been slow but undeniable. The core kept on working, often because IBM, like other companies, did the minimum to keep it alive for its internal users. However, where the refactoring code was absolutely brilliant in 2001, the newer language features like generics and lambdas are just not as solidly supported. Where Eclipse was at the forefront, nowadays it is limping at the back. IntelliJ took over the role of the primary Java IDE. The Eclipse Foundation never started a Manhattan Project to remain competitive in this space. Eclipse’s build support was foolish to start with in 2001 (no good headless model), and today in 2024 it is still largely sequential. The vast majority of my 32 cores are idle when Eclipse builds the workspace. You can build a workspace faster with parallel Gradle than in Eclipse, even though Gradle has to read all the workspace metadata.

Of course the most telling aspect of the decline is that Eclipse did not initiate an emergency rescue project when the IDE was left out of the supported platforms for Microsoft’s Co-Pilot program. Jeff McAffer: really?

Around 2005 I started building a plugin for bnd. When I found out that Neil Bartlett was also working on this, I flew to London and wore him down in a hotel lounge until he agreed to collaborate. Neil did the Eclipse stuff, I did the bnd stuff. Seeing the Eclipse APIs always scared me a bit. They seemed to have created a parallel universe to the Java API, from SWT for the GUI to the IResource that models the file system. It is not that I could not understand the reasons, often simply that computers were orders of magnitude slower than today, but it made it really hard to get into making cool applications. Just making a small user interface is really hard if you care about how it looks. It requires tremendous amounts of code and I find it almost impossible to predict the layout from the code.

I don’t think I was alone. A significant contributor to the bad name Eclipse got was that the quality of the non-Eclipse plugins was almost universally awful. It should be trivial to make a plugin that edits some file format: a few regular expressions or an AST builder. But even ChatGPT will tell you that it is gonna be really hard. Xtend tries to address this but unfortunately my allergy against coupling and dependencies then kicks in. I tried many times, but when I need to create 5 projects for an itsy-bitsy editor I am physically repulsed. That should be at most one class. The complexity of the Eclipse API meant that basically no plugin was really good. Users trying to figure out which plugins to use often got disappointed to the breaking point.

The thing that changed and made me dive into some of the Eclipse APIs was ChatGPT. Creating the first draft of a GUI is now not that hard anymore. Figuring out how to use the AST was simple by asking ChatGPT about it. By learning the APIs I developed a healthy respect for what they can actually do. I hate the verbosity, wide coupling, and antiquated Java of the code, but what it achieves is often quite impressive.

I still think a tremendous number of people use Eclipse. And as a user myself, I generally like it. And with Bndtools it is of course awesome! But I do wonder more and more if it is worth the effort; feedback on bndtools has more or less dried up in the last few years. Maybe VSCode would deserve another look? Or the Eclipse Foundation could hire me to overhaul the code … nah, unlikely :)


Summary: the AST is powerful but clumsy and could benefit from some renovation to take advantage of newer Java features. I wholeheartedly agree. I didn’t quite dive as deeply as you (I was using AST in a read-only way, not to make changes, as the quick fixes I wrote edited the .bnd file not the Java code), so it was interesting to read how the other part worked.


Did you take a look at the RefactorAssistant I wrote? We can still change this one significantly before the next release. I think it does a very nice job of simplifying the analysis as well as the updating of the source document. Especially the Cursor<T extends ASTNode> is quite nice in applying rules to the AST to check if some quick fixes should be proposed.

No, I haven’t had a chance to have a close look. Though on the surface it looks like it could be useful for simplifying some of the code in the quick fix processor that I wrote. If anyone would like to give that a whirl, we have the benefit of some pretty extensive regression tests so we can refactor without significant risk of inadvertently breaking something.
