The Issue
I recently upgraded a client’s Azure DevOps Server from 2019 to 2022. This required a new application tier VM, because the supported versions of Windows Server changed between the two releases.
Unfortunately, the client’s developers had always accessed the old Azure DevOps Server using the VM’s FQDN rather than a DNS-managed alias. Given they wanted to minimise change, the plan was to create a DNS Alias so they could continue to use the same URLs and TFVC workspace mappings.
The upgrade went well, and once it was complete I could access the upgraded server using either http://localhost:8080/tfs or http://newserver:8080/tfs.
However, once the DNS Alias (an A record) was added and we tried to access http://oldserver:8080/tfs, the developers were shown a login dialog and, even after entering valid credentials, got a TF30063 (not authorized) error.
The Solution (well, sort of)
I went down a rabbit hole of loopback check settings, but to no avail.
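For anyone retracing that rabbit hole: the usual loopback-check workarounds are registry changes on the server. A minimal sketch of the BackConnectionHostNames variant, using the alias names from this post as placeholders, looks something like this. Note that the loopback check only affects connections made from the server itself, which is probably why it made no difference for remote developers.

```powershell
# Sketch only: register the aliases that should bypass Windows' loopback check
# when a site on this machine is accessed by a name other than the machine name.
# The host names below are the aliases from this post; adjust for your environment.
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0" `
    -Name "BackConnectionHostNames" `
    -PropertyType MultiString `
    -Value @("oldserver", "devops") `
    -Force
```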
I then tried adding an entry for the new server VM in the local hosts file on my test client. Strangely, this worked.
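The exact entry isn’t important; the test amounts to mapping the name straight to the new server’s IP so the client bypasses DNS entirely, along the lines of the snippet below (the IP address is a placeholder).

```
# C:\Windows\System32\drivers\etc\hosts on the test client
# 10.0.0.5 is a placeholder for the new application tier's IP address
10.0.0.5    oldserver
```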
So my working assumption was one of the following:
- The DNS Alias was not working correctly (maybe a CNAME record was required rather than an A record?)
- As the DNS Alias was replacing what had been a ‘real’ server name, had we missed some special settings?
To work through these ideas, I had a new CNAME, ‘devops’, added to DNS, pointing at the new server (arguably the CNAME that should have been in use all along; see the sketch after this list). For some reason this DNS change took much longer to propagate than the A record had, but once it did the TF30063 error was completely gone. I could now access the new server using the URLs:
- http://localhost:8080/tfs
- http://newserver:8080/tfs
- http://oldserver:8080/tfs (the DNS A record)
- http://devops:8080/tfs (the DNS CNAME)
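As referenced above, for a Windows-based DNS zone the CNAME can be created with the DnsServer PowerShell module; the zone name and target FQDN below are placeholders, not the client’s real domain.

```powershell
# Sketch: add a 'devops' alias that points at the new application tier.
# Zone name and target FQDN are placeholders.
Add-DnsServerResourceRecordCName -ZoneName "corp.example.com" `
    -Name "devops" `
    -HostNameAlias "newserver.corp.example.com"
```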
So, as usual, the problem was DNS, which I assume was down to propagation delays and cache timeouts.
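If you hit the same propagation lag, it is worth checking (and clearing) the client-side resolver cache rather than just waiting. These are standard Windows commands, using the names from this post:

```powershell
Resolve-DnsName devops       # should now return the CNAME pointing at the new server
Resolve-DnsName oldserver    # should return the A record with the new server's IP
ipconfig /displaydns         # inspect what the local resolver has cached
Clear-DnsClientCache         # flush the cache (equivalent to 'ipconfig /flushdns')
```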
Why is it always DNS?