windowsworkerconcoursecf-bosh

Concourse external windows worker failed register through atc


I have spun up a bosh cluster on AWS, running a concourse deployment. For this I used a tool called concourse-up. I spun up a windows worker outside the created VPC of the cluster, and I am trying to register the worker through the atc but this step fails with an error. I have opened up all ports and both a web VM and the Worker VM. I have tried several things but there are two specific errors that I get:

  1. When I connect the worker without a --peer-ip, the worker registers so I can see it through the fly cli, but I get this error in the log(snippet below), and jobs will fail with this error: Put /volumes/47c1c26c-274b-4f04-4dea-01d476ed949e/stream-in?path=.: read tcp 10.0.0.7:59478->10.0.0.7:39198: read: connection reset by peer
    {"timestamp":"1513510128.917933226","source":"worker","message":"worker.setup.no-assets","log_level":1,"data":{"session":"1"}}
    {"timestamp":"1513510128.920933962","source":"worker","message":"worker.garden.started","log_level":1,"data":{"session":"2"}}
    {"timestamp":"1513510128.921934128","source":"baggageclaim","message":"baggageclaim.listening","log_level":1,"data":{"addr":"127.0.0.1:7788"}}
    {"timestamp":"1513510130.645173311","source":"tsa","message":"tsa.connection.channel.forward-worker.register.start","log_level":1,"data":{"remote":"34.242.192.32:56803","session":"12.1.1.5","worker-address":"10.0.0.7:38380","worker-platform":"windows","worker-tags":""}}
    {"timestamp":"1513510130.649989367","source":"tsa","message":"tsa.connection.channel.forward-worker.register.reached-worker","log_level":0,"data":{"baggageclaim-took":"2.251829ms","garden-took":"2.492218ms","remote":"34.242.192.32:56803","session":"12.1.1.5"}}
    {"timestamp":"1513510128.960758924","source":"baggageclaim","message":"baggageclaim.repository.get-volume.volume-not-found","log_level":1,"data":{"session":"1.2","volume":"resource-certs"}}
    {"timestamp":"1513510128.960758924","source":"baggageclaim","message":"baggageclaim.api.volume-server.get-volume.volume-not-found","log_level":1,"data":{"session":"2.1.2","volume":"resource-certs"}}
    {"timestamp":"1513510128.963933945","source":"baggageclaim","message":"baggageclaim.repository.create-volume.failed-to-materialize-strategy","log_level":2,"data":{"error":"mkdir C:\\Users\\Administrator\\workspace\\concourse-workspace\\volumes\\init\\resource-certs: Cannot create a file when that file already exists.","handle":"resource-certs","session":"1.3"}}

supposedly this is not the right way to do it if I follow the official docs: https://concourse-ci.org/clusters-with-bosh.html#configuring-bosh-tsa

  1. The other thing I tried is using the --peer-ip which is recommended is the official docs if you are outside a cluster and have no connection to any of the resources other than the atc. But this will not even register the worker, it fails with this error in the log:

{"timestamp":"1513510497.727977514","source":"tsa","message":"tsa.connection.channel.register-worker.register.start","log_level":1,"data":{"remote":"34.242.192.32:56828","session":"13.1.1.8","worker-address":"34.242.40.26:7777","worker-platform":"windows","worker-tags":""}}
{"timestamp":"1513510497.728802919","source":"tsa","message":"tsa.connection.channel.register-worker.register.failed-to-fetch-containers","log_level":2,"data":{"error":"Get http://api/containers: dial tcp 34.242.40.26:7777: getsockopt: connection refused","remote":"34.242.192.32:56828","session":"13.1.1.8"}}
{"timestamp":"1513510497.729249239","source":"tsa","message":"tsa.connection.channel.register-worker.register.failed-to-list-volumes","log_level":2,"data":{"error":"Get http://34.242.40.26:7788/volumes: dial tcp 34.242.40.26:7788: getsockopt: connection refused","remote":"34.242.192.32:56828","session":"13.1.1.8"}}
{"timestamp":"1513510497.729336023","source":"tsa","message":"tsa.connection.channel.register-worker.register.failed-to-reach-worker","log_level":1,"data":{"baggageclaim-took":"469.89µs","garden-took":"666.118µs","remote":"34.242.192.32:56828","session":"13.1.1.8"}}
{"timestamp":"1513510497.729401112","source":"tsa","message":"tsa.connection.channel.register-worker.register.done","log_level":1,"data":{"remote":"34.242.192.32:56828","session":"13.1.1.8","worker-address":"34.242.40.26:7777","worker-platform":"windows","worker-tags":""}}

I have used this guide to configure the worker: http://www.chrisumbel.com/article/windows_worker_to_bosh_deployed_concourse and the official docs


Solution

  • This issue was never resolved, but I believe that it is no longer relevant. Since this answer was posted concourse have added BOSH orchestrated windows workers : https://github.com/pivotal-cf-experimental/concourse-windows-worker-release

    They have also updated their documentation official documentation: https://concourse-ci.org/worker-pools.html