Static code analysis with NDepend

[Disclaimer: Patrick Smacchia from the NDepend team approached me and offered me a licence for NDepend if I would blog about my experiences with the tool. However, all opinions are my own.]

Static code analysis is a powerful tool to understand and improve your codebase, using a variety of techniques to help find issues or vulnerabilities before you execute your code. NDepend is a popular tool for doing this on .NET code, with deep Visual Studio integration. I was interested to see what I could learn about a fairly substantial commercial codebase that has been developed over a few years.

To start, I downloaded the plugin from https://www.ndepend.com/ (downloading from the Visual Studio Marketplace just points you to the website). For reference, I was using version 2018.1.1. There are a couple of installation gotchas: the download is just a zip file with no README, and when you install the VS extension, it introduces a dependency on the files you unzipped, so make sure you extract to somewhere permanent rather than a temp directory! The developer also suggests that you don’t use the Program Files folder. I’d recommend reviewing https://www.ndepend.com/docs/getting-started-with-ndepend before you get going as it gives a few useful pointers, and would probably benefit from being more visible during the download/installation process.

To begin, I created a new NDepend project and ran a full analysis of the codebase, which took roughly 45 seconds for around 32,000 lines. After that I was presented with a dashboard in Visual Studio and an HTML report to explore.

As you can see, there’s immediately plenty of data to check out! I was happy to see that our tech debt rating was looking pretty healthy at a B grade, with not too much estimated time to reach an A. I briefly scanned the HTML report, including the diagrams at the top:

  • a dependency graph showing the links between the various assemblies, although ours was too hectic to be useful;
  • a dependency matrix presenting the same information in a different way, although it didn’t include all of the assemblies; both diagrams invited me to use the full interactive versions instead;
  • a treemap metric view that quickly highlighted areas with high cyclomatic complexity, which was a great way to start identifying problem areas;
  • an abstractness vs instability chart showing which parts of the codebase might be harder to maintain. This one was interactive, which was nice to see.

I assume these are presented first to highlight some of the visualisations you can produce, but the report didn’t point me towards where to find the interactive versions, or any others that might be available, without digging through the documentation. It was also a bit of a shame that by default they opened a “scaled” version in a dialog that extended beyond the visible screen, making it harder to navigate and close than just opening the full versions, which Chrome scaled automatically for me.

However, the HTML report is only supposed to be a summary, and most of the power is in the Visual Studio extension, so I headed back there to take a look at some of the errors I’d noticed in the report. Two “quality gates” had failed, so I was keen to see what they might be. Clicking on the failed number took me straight to the Queries and Rules Explorer to show me where the failures were (“Critical Rules Violated” and “Debt Rating per Namespace” were the offenders). NDepend works by applying a series of rules against your codebase. It ships with lots of presets, but you can customise and extend the rules based on your own opinions. They’re written in a LINQ-like query format (CQLinq), which should feel quite natural to C# programmers.
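
For illustration, here’s roughly what a rule looks like in CQLinq (my own sketch rather than one of the shipped rules, and the threshold of 20 is arbitrary):

// Flag methods whose cyclomatic complexity suggests they need attention
warnif count > 0
from m in JustMyCode.Methods
where m.CyclomaticComplexity > 20
orderby m.CyclomaticComplexity descending
select new { m, m.CyclomaticComplexity }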

I really appreciated the explanations provided with each rule that had been broken; hovering over a rule brought up a summary with a description of what the rule meant, why it was important and would improve your code if you fixed it, and instructions on how to fix it. Double-clicking a rule showed a list of files where it was violated, and other relevant info. Hovering over a file revealed a nice summary of stats about that file; for example, this was one of our worst offenders for complexity:

Calculations such as the breaking point help to drive decisions about how critical it is to fix the issues that NDepend is highlighting, and the links to the documentation are useful to clarify any points you might want more information on.

I thought I would try to pick off some low-hanging fruit and fix one of the critical rules that was broken. NDepend will not automatically fix code for you, but it will take you to where you need to make changes. Unfortunately, when I fixed an issue, it didn’t disappear from the list immediately; you must first rebuild the code and then run the analysis again. A nice touch would be for the list to update in real time to show progress towards fixing problems. However, NDepend does keep a snapshot of each analysis run, so you can compare against the baseline (from when you first ran the analysis) to see how your codebase is changing over time.

There are lots of rules to explore and problems to fix, but I also wanted to see what other tools NDepend gives you to dig into your codebase. It provides various dependency graphs and matrices for understanding the relationships between classes in your code. I particularly liked the treemap for easily and quickly visualising where certain issues are in your code and how prevalent they are, and then being able to drill straight into them as well.

NDepend is clearly a very powerful tool with so many facets that it will take a while to dig into them all, but contextual help and well-linked documentation in Visual Studio help you explore your way around. It is a helpful companion for driving code quality up and improving your knowledge of potential code issues and the impact they can have. You can also integrate it with your CI to make the metrics visible, so they can be tracked and improved over time. There’s so much more that I haven’t had the chance to go into for this blog post, but hopefully this has given you a good taste of NDepend’s potential. Unfortunately it does come at a bit of a cost, but you can download it from https://www.ndepend.com/ and start a free trial to explore it for yourself before committing. Here’s to better quality code!

MongoDB doesn’t serialise C# read-only properties

C# 6 introduced getter-only auto-properties, which are a great way of conveying immutability in your classes. However, the MongoDB C# driver ignores these properties in its default class maps, as it can’t deserialise them back into a class.
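
For example, with a hypothetical class like this, Name and Value would be silently dropped when serialising:

public class Measurement
{
    public Measurement(string name, double value)
    {
        Name = name;
        Value = value;
    }

    // getter-only auto-properties: not mapped by the default conventions
    public string Name { get; }
    public double Value { get; }
}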

To automatically map read-only properties, you can use a custom convention:

/// <summary>
/// A convention to ensure that read-only properties are automatically mapped (and therefore serialised).
/// </summary>
public class MapReadOnlyPropertiesConvention : ConventionBase, IClassMapConvention
{
    private readonly BindingFlags _bindingFlags;

    public MapReadOnlyPropertiesConvention() : this(BindingFlags.Instance | BindingFlags.Public) {}

    public MapReadOnlyPropertiesConvention(BindingFlags bindingFlags)
    {
        _bindingFlags = bindingFlags | BindingFlags.DeclaredOnly;
    }

    public void Apply(BsonClassMap classMap)
    {
        var readOnlyProperties = classMap
            .ClassType
            .GetTypeInfo()
            .GetProperties(_bindingFlags)
            .Where(p => IsReadOnlyProperty(classMap, p))
            .ToList();

        foreach (var property in readOnlyProperties)
        {
            classMap.MapMember(property);
        }
    }

    private static bool IsReadOnlyProperty(BsonClassMap classMap, PropertyInfo propertyInfo)
    {
        if (!propertyInfo.CanRead) return false;
        if (propertyInfo.CanWrite) return false; // already handled by default convention
        if (propertyInfo.GetIndexParameters().Length != 0) return false; // skip indexers

        var getMethodInfo = propertyInfo.GetMethod;

        // skip overridden properties (they are already included by the base class)
        if (getMethodInfo.IsVirtual && getMethodInfo.GetBaseDefinition().DeclaringType != classMap.ClassType) return false;

        return true;
    }
}

And then register the convention on startup:

var conventionPack = new ConventionPack
{
    new MapReadOnlyPropertiesConvention()
};

ConventionRegistry.Register("Conventions", conventionPack, _ => true);

This will ensure that read-only properties are serialised, which is all we needed. However, to then deserialise documents back into a class, you may need a suitable constructor. You can find an extended example of this at https://stackoverflow.com/questions/39604820/serialize-get-only-properties-on-mongodb (which this code is based on); we had constructor chains that it didn’t seem to work with, but it might help guide you further.
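
If you hit that, one option worth trying (a sketch, not something we verified against our constructor chains) is to point the driver at a matching constructor with its [BsonConstructor] attribute:

using MongoDB.Bson.Serialization.Attributes;

public class Measurement
{
    // parameter names must match the mapped properties (case-insensitively)
    [BsonConstructor]
    public Measurement(string name, double value)
    {
        Name = name;
        Value = value;
    }

    public string Name { get; }
    public double Value { get; }
}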

Dynamically changing build numbers by branch in TeamCity

We use feature branches, and build all of them on our TeamCity CI setup. Every build can be deployed to our test servers, but it’s useful to be able to quickly distinguish which builds came from master and which came from other branches. I took a script from here and edited it to simply append -alpha to the end of the version number for any non-master build:

$branch = "%teamcity.build.branch%"

if ($branch.Contains("/")) 
{
  $branch = $branch.substring($branch.lastIndexOf("/") + 1)
}

Write-Host "Building from $branch branch"

if ($branch -ne "master") 
{
  $buildNumber = "%build.number%-alpha"
  Write-Host "Appending alpha to build number"
  Write-Host "##teamcity[buildNumber '$buildNumber']"
}
else
{
  Write-Host "Leaving build number as-is"
}

This makes it really obvious when a build has come from a feature (or other) branch, which helps avoid accidentally deploying it. You could probably extend this to pull a version number from the branch name as well, as sketched below.
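
For example (an untested sketch: it assumes release branches named like release/1.2.3, whose last segment the script above has already extracted into $branch, and uses TeamCity’s build.counter parameter):

if ($branch -match '^\d+\.\d+\.\d+$')
{
  $buildNumber = "$branch.%build.counter%"
  Write-Host "Using version number from branch name"
  Write-Host "##teamcity[buildNumber '$buildNumber']"
}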

To use this across multiple builds, I just set up a build template with the script stored in the first build step, which you can then apply to builds either individually or across a project.

MongoDB backup script

You can run this batch script as a scheduled task: it takes a backup of all databases, zips it up, copies the zip to a backup location, and automatically removes backups older than 7 days.

REM Create filename for output (assumes a dd/mm/yyyy locale; pad the hour with a zero so early-morning runs don't put a space in the name)
set hour=%TIME: =0%
set filename=mongodb-backup-%DATE:~6,4%_%DATE:~3,2%_%DATE:~0,2%__%hour:~0,2%_%TIME:~3,2%_%TIME:~6,2%

REM Export the databases
@echo Dumping databases to "%filename%"
"c:\Program Files\MongoDB\Server\3.4\bin\mongodump.exe" --username <username> --password <password> --out %filename%

REM Create backup zip file
@echo Creating backup zip file from "%filename%"
"c:\Program Files\7-Zip\7z.exe" a -tzip "%filename%.zip" "%filename%"

REM Delete the backup directory (leaving the zip file); the /q flag suppresses confirmation prompts
@echo Deleting original backup directory "%filename%"
rmdir "%filename%" /s /q

REM Move zip file to backup location
@echo Moving zip to backup location
move "%filename%.zip" E:\mongodb-backups

REM Delete files older than 7 days
@echo Deleting older backups
forfiles /P "E:\mongodb-backups" /S /M *.zip /D -7 /C "cmd /c del @PATH"

@echo BACKUP COMPLETE!

Diagnosing issues with Elasticsearch

We use Logstash in our infrastructure, which ingests logs and outputs transformed data to a store of your choice, which for us is Elasticsearch. We then use Grafana to visualise some of this data. Recently we noticed that Grafana was struggling to perform the queries it needed, which led to investigating and fixing issues with Elasticsearch.

Elasticsearch exposes a full-featured REST API for administration, which you can easily access using curl. The following examples all assume that you are running Elasticsearch locally on the default port 9200.

First things first: to get basic information about the Elasticsearch instance, run

curl http://localhost:9200/

which will give you, for example, the version number, so that you can be sure you’re looking at the right documentation. To get the status of the cluster, you can run

curl http://localhost:9200/_cluster/health?pretty

which should give you a response like this:

{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 95,
  "active_shards" : 95,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 98.95833333333334
}

Including the pretty parameter will return a more human-readable response. There are various bits of useful information here. Firstly, the status will give you a general idea of how your cluster is doing: red means there are indices that are not available to query, yellow means that there are some that are not replicated, and green means everything is good. When the cluster first loads, you will notice that the shards go through the initializing_shards phase as they are spinning up, until they become active_shards. For us, around a third of the shards would reach the active phase before the service would restart itself.

You can find out information about the individual shards by running

curl http://localhost:9200/_cat/shards?v

which should give you an output something like this:

index               shard prirep state           docs   store ip        node
logstash-2018.02.27 0     p      STARTED      6219832   7.3gb 127.0.0.1 SL-GvDk
logstash-2018.04.04 0     p      INITIALIZING                 127.0.0.1 SL-GvDk
logstash-2018.04.04 0     r      UNASSIGNED
logstash-2018.02.28 0     p      STARTED      5765860   6.8gb 127.0.0.1 SL-GvDk
logstash-2018.02.13 0     p      STARTED      6810856   7.9gb 127.0.0.1 SL-GvDk

Using the v parameter will give you headers in your output for many queries. There will be a line per shard in your cluster. INITIALIZING shards are currently spinning up on a node, and will then move to STARTED when ready to query. UNASSIGNED shards are ones which are waiting to be assigned to a node. We are only running a single-node cluster, so the replicas will never be assigned, meaning we would never get a green status. You can force all your existing indices to not have any replicas by running

curl -XPUT http://localhost:9200/_settings -d '{ "index.number_of_replicas": 0 }'

However, this does not affect indices created in the future. To ensure that new indices don’t get replicas either, you need to edit the index template being used so that they are created with the correct settings.
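
For example, using the template API (the template name and index pattern here are illustrative, and on Elasticsearch 6+ the field is index_patterns rather than template; adjust to match whatever template your Logstash setup installed):

curl -XPUT http://localhost:9200/_template/no_replicas -d '
{
  "template": "logstash-*",
  "settings": { "index.number_of_replicas": 0 }
}'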

This cleaned up all our replica shards that would never be assigned, but still didn’t solve the problem of the endless rebooting. Looking at the logs (stored at /var/log/elasticsearch by default) showed that operations were repeatedly failing with

java.lang.OutOfMemoryError: Java heap space

So I increased the heap size by updating the JVM options, which can be found at /etc/elasticsearch/jvm.options; there are two settings, the initial (-Xms) and maximum (-Xmx) heap size, which should be set to the same value:

-Xms2g
-Xmx2g

This sets the heap to 2GB, which for us was sufficient to get everything working again after restarting the service:

/etc/init.d/elasticsearch restart

Now that everything was running again, we could review whether we needed all those indices still; you can list all indices by running

curl http://localhost:9200/_cat/indices

and then close any you don’t want any more by running something like:

curl -X POST http://localhost:9200/logstash-2018.04.*/_close

where wildcards are accepted. However, this won’t delete the data on disk; it just frees up memory. If you want to permanently delete the data from disk as well, you can run:

curl -X DELETE http://localhost:9200/logstash-2018.04.*

Magic Commands – cleaning up ASP.NET temporary files

ASP.NET temporary files can quickly mount up and eat up disk space, especially if you deploy multiple times a day. This website provides a handy command that you can run to delete any unnecessary files:

Get-ChildItem "C:\Windows\Microsoft.NET\Framework*\v*\Temporary ASP.NET Files" -Recurse | Remove-Item -Recurse

You’ll get a few errors when it hits files that are in use, but apart from that it works a treat. You can run it as a scheduled task if you want to avoid doing it manually all the time, or get your deployment tool to run it on every deployment to tidy up as you go (Octopus Deploy has its own library task to do something similar, for example).
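
If you want to script the scheduled task itself, something like this should do it (a sketch: the script path, task name and schedule are all illustrative):

# Assumes the cleanup command above has been saved as C:\scripts\Clean-AspNetTempFiles.ps1
$action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -File C:\scripts\Clean-AspNetTempFiles.ps1"
$trigger = New-ScheduledTaskTrigger -Daily -At 3am
Register-ScheduledTask -TaskName "Clean ASP.NET temp files" -Action $action -Trigger $trigger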

How to time how long something takes to run in Windows

It can be useful to time how long certain commands take to run. With PowerShell, you can do this easily using the Measure-Command cmdlet:

Measure-Command { ping google.com | Out-Default }

Simply replace ping google.com with whatever you want to time, and off you go (piping to Out-Default means you still see the command’s output while it runs). You’ll get a readout similar to this when your command finishes executing, with details of how long it took:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 3
Milliseconds      : 116
Ticks             : 31167663
TotalDays         : 3.60736840277778E-05
TotalHours        : 0.000865768416666667
TotalMinutes      : 0.051946105
TotalSeconds      : 3.1167663
TotalMilliseconds : 3116.7663

Disk Cleanup on Microsoft Windows Server

Servers run out of disk space too sometimes, but the Server versions of Windows don’t have the handy Disk Cleanup utility installed by default. If you want it back, you can run this PowerShell command:

Install-WindowsFeature Desktop-Experience

or you can follow the instructions here if you’re more of a GUI person.

Both of these approaches install the “Desktop Experience”, which includes Disk Cleanup but also some other clutter that you may not want. For the purists out there who just want Disk Cleanup and nothing else: on Windows Server 2012 and earlier you can get the necessary files out of the WinSxS folder (check here for the relevant directories). However, as of Windows Server 2012 R2 this option is no longer available, so you’ll have to accept the added bloat of the Desktop Experience.

RavenDB transformers will lower-case your IDs

Here’s a nice little gotcha. When RavenDB (v3.0.30155) returns data from the database, it generally preserves the casing of the IDs (e.g. an ID of MyDocuments/1 will be returned as such), even though RavenDB itself doesn’t care about casing (e.g. if you run Session.Load<MyDocument>("mydocuments/1"), you’ll still get the right document back). However, if you write a transformer like this:

public class MyDocumentTransformer : AbstractTransformerCreationTask<MyDocument>
{
    public MyDocumentTransformer()
    {
        TransformResults = docs => docs.Select(d => new TransformedDocument
        {
            Id = d.Id
        });
    }
}

You’ll notice that when you use this transformer, any returned TransformedDocuments will have lowercase IDs like mydocuments/1. Although RavenDB doesn’t care about this, we had some code further down the line that (unfortunately) did rely on casing, which then started to fail. Fortunately, since transformers are just running C# code on the server, it’s easy to fix with something like this:

public class MyDocumentTransformer : AbstractTransformerCreationTask<MyDocument>
{
    public MyDocumentTransformer()
    {
        TransformResults = docs => docs.Select(d => new TransformedDocument
        {
            Id = d.Id.Replace("mydocuments", "MyDocuments")
        });
    }
}

This restores your IDs to their proper-cased glory.
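
For completeness, here’s roughly how the transformer gets registered and used (from memory of the 3.x API, so double-check against the RavenDB docs):

// Register the transformer against the document store on startup
new MyDocumentTransformer().Execute(documentStore);

// Then apply it when querying
var results = session.Query<MyDocument>()
    .TransformWith<MyDocumentTransformer, TransformedDocument>()
    .ToList();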

TeamCity build fails with “no net in java.library.path”

One of our builds just failed with this fairly cryptic error:

[File content replacer] no net in java.library.path
java.lang.UnsatisfiedLinkError: no net in java.library.path
    at java.lang.ClassLoader.loadLibrary(Unknown Source)
    at java.lang.Runtime.loadLibrary0(Unknown Source)
    at java.lang.System.loadLibrary(Unknown Source)

It turned out we’d manually updated Java on the machine the build agent was running on. If you do that, you should also restart the build agent service so that it picks up the new version correctly.
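
On Windows, if the agent runs as a service, that’s something like this in PowerShell (TCBuildAgent is the default service name, but yours may differ):

Restart-Service -Name TCBuildAgent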