Tags: spring, gitlab, gitlab-ci, gitlab-runner

GitLab Runner cache is missing a file after a stage completes


Summary

My .gitlab-ci.yml has 3 stages that deploy an application to an OKD pod. The application is a Spring Boot app running on tomcat:8. Sometimes cache.zip is not updated after a stage completes, so the next stage can't run correctly.

Steps to reproduce

My pipeline runs the following stages:

Stage 1: run mvn test-compile ---> OK

Stage 2: package the war file as output for deployment ---> The GitLab CI log shows success, but cache.zip does not contain the war file (this happens only sometimes; at other times it runs correctly)

Stage 3: deploy the war file to the pod ---> Because the war file does not exist in cache.zip, the script errors -> failed

.gitlab-ci.yml

image: openshift/origin-cli

stages:
  - build
  - test
  - staging

cache:
  paths:
    - .m2/repository
    - target
    - artifact

validate:jdk8:
  stage: build
  script:
    - 'mvn test-compile'
  only:
    - master
  image: maven:3.3.9-jdk-8

verify:jdk8:
  stage: test
  script:
    - 'mvn verify'
    - 'mvn package' # =====> this command generates the war file
  only:
    - master
  image: maven:3.3.9-jdk-8

staging:
  script:
    - "mkdir -p artifact"
    - "cp ./target/*.war ./artifact/" # ======> Sometimes fails at this line because the previous stage did not add the war file to the cache
    - "oc start-build $APP"
    - "rm -rf ./target/* && rm -rf ./artifact/*" # Remove war & class files, cache only the m2 libs
  stage: staging
  variables:
    APP: $CI_PROJECT_NAME
  environment:
    name: staging
    url: http://$CI_PROJECT_NAME-staging.$OPENSHIFT_DOMAIN
  only:
    - master

Actual behavior

Sometimes the cache does not contain the war file after the test stage completes (does this depend on the war file size?)

Expected behavior

The war file is added to the cache after the test stage so that the staging stage can deploy it

Relevant logs and/or screenshots

job log

Running with gitlab-runner 13.7.0 (943fc252)
  on gitlab-runner-node1 y6awygsj
Preparing the "docker" executor
00:01
Using Docker executor with image openshift/origin-cli ...
Using locally found image version due to if-not-present pull policy
Using docker image sha256:7ebb6be01117a50344d63f77c385a13302afecd33480b97c36a518d4f5ebc25a for openshift/origin-cli with digest docker.io/openshift/origin-cli@sha256:509e052d0f2d531b666b7da9fa49c5558c76ce5d286456f0859c0a49b16d6bf2 ...
Preparing environment
00:00
Running on runner-y6awygsj-project-489-concurrent-0 via gitlab.runner.node1...
Getting source from Git repository
00:01
Fetching changes...
Reinitialized existing Git repository in /builds/my-project/.git/
Checking out b4c97428 as master...
Removing .m2/
Removing artifact/
Removing target/
Skipping Git submodules setup
Restoring cache
00:05
Checking cache for default-23...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. 
Successfully extracted cache
Executing "step_script" stage of the job script
00:01
$ mkdir -p artifact
$ cp ./target/*.war ./artifact/
cp: cannot stat './target/*.war': No such file or directory
Cleaning up file based variables
00:00
ERROR: Job failed: exit code 1

Environment description

config.toml

concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "gitlab-runner-node1"
  url = "https://gitlab.mycompany.vn/"
  token = "y6awygsj9zks18nU6PDt"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    dns = ["192.168.100.1"]
    tls_verify = false
    image = "alpine:latest"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/mnt/nfs/nfsshare-gitlab/cache:/cache"]
    shm_size = 0
    pull_policy = "if-not-present"

Used GitLab Runner version

Version: 13.7.0

Git revision: 943fc252

Git branch: 13-7-stable

GO version: go1.13.8

Built: 2020-12-21T13:47:06+0000

OS/Arch: linux/amd64

Possible fixes

Re-run the test stage until the cache contains the war file


Solution

  • Let's go step by step.

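    First, consider how files are passed between stages.

    It's true that jobs in different stages can directly access each other's files if they run in the same environment, but that's not always the case (even if both runners use the same NFS share directory). The cache is meant to speed up jobs, for example by reusing downloaded dependencies, and its contents are not guaranteed to be present. To pass build results between stages you should use artifacts.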

    When you define artifacts in a job, you specify a list of files that are attached to the job when it succeeds, when it fails, or always, depending on your configuration.
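
    As a rough illustration of the "succeeds, fails or always" options, here is a sketch of a job that uploads the war file only on success. The job name package:war and the one-week expiry are assumptions made up for this example; artifacts:when and artifacts:expire_in are standard GitLab CI keywords.

    ```yaml
    # Hypothetical job, shown only to illustrate the artifacts keywords
    package:war:
      stage: test
      image: maven:3.3.9-jdk-8
      script:
        - 'mvn verify'
      artifacts:
        when: on_success   # the default; the alternatives are on_failure and always
        expire_in: 1 week  # assumed retention period, adjust as needed
        paths:
          - target/*.war
    ```

    With when: always, the listed files would be uploaded even if the job fails, which can help when you want to inspect build output from a broken run.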

    By default, each job receives all artifacts from the previous stages, but you can use dependencies to restrict which jobs' artifacts are fetched.
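
    A minimal sketch of how dependencies narrows the download, assuming verify:jdk8 declares the war file under artifacts. The notify job is hypothetical and only shows that an empty list makes a job download no artifacts at all.

    ```yaml
    # Fetch artifacts only from verify:jdk8
    staging:
      stage: staging
      dependencies:
        - verify:jdk8
      script:
        - 'ls -l target/'   # the war file from verify:jdk8 should be listed here

    # Hypothetical job that needs nothing from earlier stages
    notify:
      stage: staging
      dependencies: []
      script:
        - 'echo "no artifacts downloaded"'
    ```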

    So basically you should use the following .gitlab-ci.yml

    image: openshift/origin-cli
    
    stages:
      - build
      - test
      - staging
    
    cache:
      paths:
        - .m2/repository
    
    validate:jdk8:
      stage: build
      script:
        - 'mvn test-compile'
      only:
        - master
      image: maven:3.3.9-jdk-8
    
    verify:jdk8:
      stage: test
      script:
        - 'mvn verify' # =====> verify already includes: validate, compile, test and package
      artifacts:
        paths:
          - target/[YOUR_APP_NAME].war
      only:
        - master
      image: maven:3.3.9-jdk-8
    
    staging:
      dependencies:
        - verify:jdk8
      script:
        - "mkdir -p artifact"
        - "cp ./target/[YOUR_APP_NAME].war ./artifact/"
        - "oc start-build $APP"
      stage: staging
      variables:
        APP: $CI_PROJECT_NAME
      environment:
        name: staging
        url: http://$CI_PROJECT_NAME-staging.$OPENSHIFT_DOMAIN
      only:
        - master
    

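    Also, notice that I removed the mvn package command: verify comes after package in Maven's default lifecycle, so mvn verify already builds the war file. I recommend taking a look at the Build Lifecycle Basics section of the Maven documentation.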
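
    To make the lifecycle point concrete, the following sketch annotates the script with the main phases of Maven's default lifecycle. The extra ls line and the target/*.war wildcard are assumptions added here as a sanity check, not part of the configuration above.

    ```yaml
    verify:jdk8:
      stage: test
      image: maven:3.3.9-jdk-8
      script:
        # Default lifecycle, main phases in order:
        #   validate -> compile -> test -> package -> verify -> install -> deploy
        # Running a phase also runs every phase before it, so verify includes package.
        - 'mvn verify'
        # Assumed sanity check: ls exits non-zero if no war file was produced,
        # which fails the job here and not later in staging
        - 'ls -l target/*.war'
      artifacts:
        paths:
          - target/*.war
    ```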