The problem

Today I found I had a problem when trying to associate a Release Management 2013.2 release pipeline with a TFS build. When I tried to select a team project the drop down for the release properties was empty.

image

The strange thing was this installation of Release Management has been working OK last week. What had changed?

I suspected an issue connecting to TFS, so in the Release Management Client’s ‘Managing TFS’ tab I tried to verify the  active TFS server linked to the Release Management. As soon as I tried this I got the following error that the TFS server was not available.

clip_image002

I switched the TFS URL to HTTP from HTTPS and retired the verification and it worked. Going back to my release properties I could now see the build definitions again in the drop down. So I knew I had an SSL issue.

The strange thing was we use SSL as out default connection, and none of our developers were complaining they could not connect via HTTPS.

However, on checking I found on some of our build VMs there was an issue. If on those VMs I tried to connect to TFS in a browser with an HTTPS URL you got a certificate chain error.

But stranger, on my PC, where I was running the Release Management client, I could access TFS over HTTPS from a browser and Visual Studio, but the Release Management verification failed.

The solution

It turns out the issue was we had an intermediate cert issue with our TFS server. An older Digicert intermediate certificate had expired over the weekend, and though the new cert was in place, and had been for a good few months since we renewed our wildcard cert, the active wildcard cert insisted on using the old version of the intermediate cert on some machines.

As an immediate fix we ended up having to delete the old intermediate cert manually on machines showing the error. Once this was done the HTTPS connect worked again.

Turns the real culprit was a group policy used to push out intermediate certs that are required to be trusted for some document automation we use. This old group policy was pushing the wrong version of the cert to some server VMs. Once this policy was update with the correct cert and pushed out it overwrote the problem cert and the problem went away.

One potentially confusing thing here is that the ‘verity the TFS link’ in Release Management verifies that the Release Management server can see the TFS server, not the PC running the Release Management client. It was on the Release Management server I had to delete the dead cert (run a gpupdate /force to get the new policy). Hence why I was confused by my own PC working for Visual Studio and not for Release Management

So I suspect the issue with drop down being empty is always going to really mean the Release Management server cannot see the TFS server for some reason, so check certs, permissions or basic network failure.