The Issue
Today I came across a strange issue with a reasonably old multi-stage YAML pipeline: it appeared to be cancelling itself.
The Build stage ran OK, but the Release stage kept being shown as cancelled with a strange error. The strangest thing was it did not happen all the time. I guess this is the reason the problem had not been picked up sooner.
When I looked at the logs for the Release stage, I saw that the main job, which was meant to be the only job, had completed successfully. But I had gained an extra, unexpected job that was being cancelled in 90+% of my runs.
This extra job was trying to run on an Ubuntu hosted agent and failing to make a connection. All very strange as all the jobs were meant to be using private Windows-based agents.
The Solution
Turns out, as you might expect, the issue was a typo in the YAML.
- stage: Release
  dependsOn: Build
  condition: succeeded()
  jobs:
  - job:    # <-- the stray line
  - template: releasenugetpackage.yml@YAMLTemplates
    parameters:
The problem was the stray job: line. It defined an extra, empty job, which tried to connect to a hosted agent and then check out the code. Interestingly, a hosted Ubuntu agent was requested, given there was no pool defined for that job.
As soon as the extra line was removed, the problems went away.
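
For reference, here is a sketch of how the stage looks with the stray line removed. It assumes nothing else in the stage changed; the template parameters are left out, just as in the snippet above.

```yaml
- stage: Release
  dependsOn: Build
  condition: succeeded()
  jobs:
  # The only entry under jobs is now the template, so no extra empty job is created
  - template: releasenugetpackage.yml@YAMLTemplates
    parameters:
      # (template parameters omitted, as in the original snippet)
```

With only the template listed under jobs, the stage should contain just the jobs the template defines, which in my case run on the private Windows-based agents.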