SonarQube 5.x series: It just keeps getting better and better!

by Fabrice Bellingard

    We recently wrapped up the 4.x series of the SonarQube platform by announcing its Long Term Support version: 4.5.1. At the same time, we sat down to map out the themes for the 5.x series, and we think they're pretty exciting.

    In the 5.x series, we want the SonarQube platform to become:

    • Fully operational for developers: with easy management of the daily incoming technical debt, and "real" cross-source navigation features
    • Better tailored for big companies: with great performance and more scalability for large instances, and no more DB access from an analysis

    Easy management of the daily incoming technical debt

    A central Issues page

    If you came home one day to find an ever-growing puddle of water on your floor, what's the first thing you'd do? Grab a mop, or find and fix the source of the water? It's the same with technical debt. The first thing you should care about is stopping the increase in debt (shutting off the leak) before fixing existing debt (grabbing a mop).

    Until now, the platform has been great for finding where technical debt is located, but it hasn't been a good place for developers to efficiently manage the incoming technical debt they add every day. Currently, you can subscribe to notifications of new issues, but that's all. We think that's a failing; developers should be able to rely on SonarQube to help them in this daily task.

    To accomplish this, we'll make the Issues page central. It will be redesigned to let users filter issues very efficiently thanks to "facets". For instance, it will be almost effortless to see "all critical and blocker issues assigned to me on project Foo" with a distribution per rule. Or "all JavaScript critical issues on project Foo".
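
    As a rough illustration of what faceted filtering means, the sketch below filters a small in-memory list of issues and then counts the matches per rule. The issue fields, the sample data, and the `search` and `facet` helpers are all hypothetical and exist only for this example; this is not SonarQube's API or its actual search implementation.

```python
from collections import Counter

# Hypothetical issue records; the field names are assumptions for illustration.
ISSUES = [
    {"project": "Foo", "severity": "CRITICAL", "assignee": "me", "language": "java", "rule": "null-dereference"},
    {"project": "Foo", "severity": "BLOCKER", "assignee": "me", "language": "java", "rule": "null-dereference"},
    {"project": "Foo", "severity": "CRITICAL", "assignee": "me", "language": "js", "rule": "eval-usage"},
    {"project": "Foo", "severity": "MAJOR", "assignee": "me", "language": "java", "rule": "naming-convention"},
    {"project": "Foo", "severity": "CRITICAL", "assignee": "alice", "language": "js", "rule": "eval-usage"},
    {"project": "Bar", "severity": "CRITICAL", "assignee": "me", "language": "java", "rule": "null-dereference"},
]


def search(issues, **criteria):
    """Keep issues matching every criterion; a set/list/tuple value means 'any of'."""
    def matches(issue):
        for field, wanted in criteria.items():
            allowed = wanted if isinstance(wanted, (set, list, tuple)) else {wanted}
            if issue.get(field) not in allowed:
                return False
        return True

    return [issue for issue in issues if matches(issue)]


def facet(issues, field):
    """Count how many of the given issues fall under each value of `field`."""
    return Counter(issue[field] for issue in issues)


# "All critical and blocker issues assigned to me on project Foo", per rule.
mine = search(ISSUES, project="Foo", assignee="me", severity={"CRITICAL", "BLOCKER"})
per_rule = facet(mine, "rule")

# "All JavaScript critical issues on project Foo".
js_critical = search(ISSUES, project="Foo", language="js", severity="CRITICAL")
```

    A real search engine computes such counts server-side over an index instead of scanning a list, but the idea is the same: one query returns both the matching issues and the distribution of those matches along another dimension.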

    With these new capabilities, the central Issues page will inevitably replace the old Issues drilldown page and eliminate its limitations (e.g. few filters, static counts that aren't updated when issues are changed in the UI, ...). In other words, when dealing with issues and technical debt, users will be redirected to the Issues space, and benefit from all those new features.

    Issues will also get a tagging mechanism to help developers better classify pending technical debt. Each issue will inherit the tags on its associated rule, so it will be easy to find "security" issues, for instance. And users will be able to add or remove additional tags at will. This will help give a clearer vision of what the technical debt on a project is about: is it mainly bugs or just simple naming conventions? "legacy framework"-related issues or "new-stack-that-must-be-mastered" issues?
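
    One way to picture the tagging model described above is an issue whose visible tags are the union of the tags inherited from its rule and the tags users attach to it. The `Issue` class below is a hypothetical sketch; it assumes that inherited tags cannot be removed from an individual issue, which the plan above does not specify.

```python
class Issue:
    """Hypothetical model: tags = tags inherited from the rule + user-managed tags."""

    def __init__(self, rule_tags):
        # Inherited tags are fixed by the rule (an assumption of this sketch).
        self.rule_tags = frozenset(rule_tags)
        self.user_tags = set()

    def add_tag(self, tag):
        self.user_tags.add(tag)

    def remove_tag(self, tag):
        # Only user-added tags can be removed; unknown tags are ignored.
        self.user_tags.discard(tag)

    @property
    def tags(self):
        return self.rule_tags | self.user_tags


issue = Issue(rule_tags={"security"})
issue.add_tag("legacy-framework")
issue.add_tag("to-review")
issue.remove_tag("to-review")
```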

    The developer at the center of technical debt management

    Developing those great new features on the Issues page is almost useless if you, as a developer, always have to head to the Issues page and play with "facets" to find what you're looking for. Instead, SonarQube must know what matters to you as a developer, i.e. it must be able to identify and report on "my" code. This is one reason the SCM Activity plugin will gently die, and come back to life as a core feature in SonarQube - with built-in support for Git and Subversion (other SCM providers will be supported as plugins). This will let SonarQube know which changes belong to which developer, and automatically assign new issues to the correct user. So you'll no longer need to swim through all of the incoming debt each day to find your new issues.
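
    The automatic assignment can be sketched as a lookup: for each new issue, find who last changed the offending line according to the SCM, then map that SCM author to a SonarQube user. Everything below (the data shapes, the `assign_new_issues` function, the e-mail based mapping) is an assumption made for illustration, not the actual implementation.

```python
def assign_new_issues(new_issues, blame, scm_to_user):
    """Return {issue key: user login or None}.

    blame:       {file path: {line number: SCM author}}, e.g. built from blame/annotate output
    scm_to_user: {SCM author: SonarQube login}
    """
    assignments = {}
    for issue in new_issues:
        author = blame.get(issue["file"], {}).get(issue["line"])
        # Leave the issue unassigned when the author is unknown.
        assignments[issue["key"]] = scm_to_user.get(author)
    return assignments


blame = {"src/Foo.java": {10: "alice@example.com", 42: "bob@example.com"}}
scm_to_user = {"alice@example.com": "alice"}
new_issues = [
    {"key": "ISSUE-1", "file": "src/Foo.java", "line": 10},
    {"key": "ISSUE-2", "file": "src/Foo.java", "line": 42},
    {"key": "ISSUE-3", "file": "src/Bar.java", "line": 1},
]
assignments = assign_new_issues(new_issues, blame, scm_to_user)
```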

    "Real" cross source navigation features

    For quite some time, SonarQube has been able to link files together in certain circumstances - through duplications (you can navigate to the files that have blocks in common with your code) or test methods (when coverage per test is activated, you can navigate to the test file that covers your code, and vice versa). But this navigation capability has been quite limited, and the workspace concept that goes with it is the best proof of that: it is restricted to the context of the component viewer.

    With the great progress made on the language plugin side, SonarQube will be able to know that a given variable or function is defined outside of the current file, and take you to the definition. This new functionality can help developers understand issues more quickly and thoroughly, without the need to open an IDE. You no longer have to wonder where a not-supposed-to-be-null-but-is attribute is defined. You'll be able to jump to the class definition right from the Web UI. And if you navigate far away from the initial location, SonarQube will help you remember your way, and give quick access to the files you browsed recently - wherever they were. In fact, we want SonarQube to become the central place to take a quick look at code without expending a lot of effort to do it (i.e. without the need to go to a workstation, open an IDE, pull the latest code from the SCM repository, probably build it, ...).

    Focus on scalability and performance

    SonarQube started as a "small" application and gradually progressed to become an enterprise-ready application. Still, its Achilles' heel is the underlying relational database. This is the bottleneck each time we want SonarQube to be more scalable and performant. What's more, supporting 4 different database vendors multiplies the difficulty of writing complex SQL queries efficiently. So even though the database will remain the place where we ensure data integrity, updating that data with analysis results must be done through the server, and searching must use a stack designed for performant searches across large amounts of data. We've implemented this new stack using Elasticsearch (ES) technology.

    Throughout the 5.x series, most domains will slowly get indexed in ES, giving a performance boost when requesting the data. This move will also open new doors to implementing features that were inaccessible with a relational database - like the "facets" used on the Rules or Issues pages. And because ES is designed to scale, SonarQube will benefit from its ability to give amazing performance while adding new features on large instances with millions of issues and lines of code.

    Decoupling the SonarQube analyses from the DB

    The highest-voted open ticket on JIRA is also one of the main issues when setting up SonarQube in large companies: why does project analysis make so many queries to the database? And actually, why does it even need a connection to the database at all? This causes serious performance problems (when the analysis runs far away from the DB) and security concerns (the batch must know the DB credentials, and specific ports must be opened).

    Along the way, the SonarQube 5.x releases will progressively cut dependencies to the database so that in the end, analysis simply generates a report and sends it to the server for processing. This will not only address the performance and security concerns, it will also greatly improve the design of the whole architecture, clearly carving it into different stacks with their own responsibilities. In the end, analysis will only call the analysers provided by the language plugins, making source code analysis blazing fast. Everything related to data aggregation or history computation (which once required so many database queries during analysis) will be handled by a dedicated "Compute Engine" stack on the server. Integration in the IDE will also benefit from this separation because only the language plugin analysers will be run - instead of the full process - opening up opportunities to have "on-the-fly" analyses.
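
    The split between analysis and the server can be pictured with a small sketch: the analysis side only computes per-file results and serializes them into a report, and the server side aggregates that report into project-level measures. The report layout, the toy "TODO count" analyser, and the `analyze` and `compute_engine` functions are all hypothetical and stand in for the real components.

```python
import json


def analyze(sources):
    """Analysis side: run per-file analysers and return a serialized report.

    No database access happens here; the only output is the report.
    `sources` maps a file path to its text content.
    """
    files = []
    for path, text in sources.items():
        lines = text.splitlines()
        files.append({
            "path": path,
            "lines": len(lines),
            # Toy analyser: count lines containing a TODO marker.
            "todo_count": sum("TODO" in line for line in lines),
        })
    return json.dumps({"files": files})


def compute_engine(report_json):
    """Server side: aggregate the per-file results into project-level measures."""
    report = json.loads(report_json)
    return {
        "lines": sum(f["lines"] for f in report["files"]),
        "todo_count": sum(f["todo_count"] for f in report["files"]),
    }


sources = {"a.py": "x = 1\n# TODO fix\n", "b.py": "y = 2\n"}
report = analyze(sources)           # in practice, sent to the server
measures = compute_engine(report)   # in practice, run by the server, which owns the DB
```

    In this arrangement only the server needs database credentials, and the analysis side can run anywhere that can reach the server.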

    Enhanced authentication and authorization system

    A couple of versions ago, we started an effort to break the initial coarse-grained permissions (mainly global ones) into smaller ones. The target is to be able to have more control over the different actions available in SonarQube, and to be able to define and customize the roles available on the platform. This is particularly important on the project side, where there are currently only 4 permissions, and they don't allow a lot of flexibility over what users can or cannot do on a project.

    On the authentication side, the focus will be providing a reference Single Sign-On (SSO) solution based on HTTP headers - which is a convenient and widespread way of implementing SSO in big companies. API token authentication should also come along to remove the need to pass user credentials over the wire for analysis or IDE configuration.
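
    Header-based SSO generally works by letting a trusted reverse proxy authenticate the user and pass the login to the application in an HTTP header. The sketch below shows the idea; the header name `X-Forwarded-Login`, the `authenticate` function, and the trusted-proxy check are assumptions for illustration, not SonarQube's actual configuration or code.

```python
def authenticate(headers, remote_addr, trusted_proxies, login_header="X-Forwarded-Login"):
    """Return the login asserted by a trusted proxy, or None.

    The header is honoured only when the request comes from a known proxy;
    otherwise any client could impersonate a user by setting it.
    """
    if remote_addr not in trusted_proxies:
        return None
    # HTTP header names are case-insensitive.
    normalized = {name.lower(): value for name, value in headers.items()}
    login = normalized.get(login_header.lower(), "").strip()
    return login or None


trusted = {"10.0.0.1"}
via_proxy = authenticate({"x-forwarded-login": "alice"}, "10.0.0.1", trusted)
direct = authenticate({"X-Forwarded-Login": "alice"}, "203.0.113.7", trusted)
no_header = authenticate({}, "10.0.0.1", trusted)
```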

    All this with other features along the way

    These are the main themes we want to push forward for the 5.x series, but obviously lots of other "smaller" features will come along the way. At the time I'm writing this post, we've already started working on most of those big features and we are excited about seeing them come out in upcoming versions. I'm sure you share our enthusiasm!