azure azure-devops azure-caching azure-devops-self-hosted-agent

Caching node_modules in Azure Pipelines takes longer than installing them


I am running a self-hosted agent (Windows Server) and am trying to reduce my pipeline build time by caching node_modules. However, restoring the node_modules cache takes just as long as installing the packages from scratch. The log also gives me the impression that the cache is being downloaded and uploaded externally rather than kept on the VM. If so, caching node_modules would mean transferring ~1GB of data on every build.

What am I doing wrong?

My goal is to simply maintain/keep the node_modules between builds on my self-hosted agent for the following reasons:

  1. To avoid installing the node_modules on every build
  2. To keep the node_modules/.cache folder for computational caching purposes


My Pipeline YML File:

trigger:
  - develop
  - master

variables:
  nodeModulesCache: $(System.DefaultWorkingDirectory)/node_modules

stages:
  - stage: client_qa
    displayName: Client code QA
    dependsOn: []
    pool:
      name: Default
    jobs:
      - job:
        displayName: Lint & test client code
        steps:
          # Use NodeJS.
          - task: UseNode@1
            inputs:
              version: "12.x"

          # Restore cache.
          - task: Cache@2
            inputs:
              key: 'npm | "$(Agent.OS)" | client/package-lock.json'
              restoreKeys: |
                npm | "$(Agent.OS)"
              path: $(nodeModulesCache)
            displayName: Cache Node modules

          # Install dependencies.
          - script: npm install
            workingDirectory: client
            displayName: "Install packages"

          # Lint affected code.
          - script: npm run lint:affected:ci
            workingDirectory: client
            displayName: "Lint affected code"

          # Test affected code.
          - script: npm run test:affected:ci
            workingDirectory: client
            displayName: "Run affected unit tests"


Solution

  • You cache node_modules at $(System.DefaultWorkingDirectory)/node_modules, which resolves to _\agent_work\1\s\node_modules. On the self-hosted agent, the checkout step runs git clean -ffdx && git reset --hard HEAD before fetching. This deletes the node_modules folder, so the packages are installed again on every build. See this doc for more details.

    To prevent this, add a checkout step (- checkout: self) with clean: false at the steps level.

    YAML definition

    trigger:
      - develop
      - master
    
    variables:
      nodeModulesCache: $(System.DefaultWorkingDirectory)/node_modules
    
    stages:
      - stage: client_qa
        displayName: Client code QA
        dependsOn: []
        pool:
          name: Default
        jobs:
          - job:
            displayName: Lint & test client code
            steps:
              - checkout: self
                clean: false 
              # Use NodeJS.
              - task: UseNode@1
                inputs:
                  version: "12.x"
    
              # Restore cache.
              - task: Cache@2
                inputs:
                  key: 'npm | "$(Agent.OS)" | client/package-lock.json'
                  restoreKeys: |
                    npm | "$(Agent.OS)"
                  path: $(nodeModulesCache)
                displayName: Cache Node modules
    
              # Install dependencies.
              - script: npm install
                workingDirectory: client
                displayName: "Install packages"
    
              # Lint affected code.
              - script: npm run lint:affected:ci
                workingDirectory: client
                displayName: "Lint affected code"
    
              # Test affected code.
              - script: npm run test:affected:ci
                workingDirectory: client
                displayName: "Run affected unit tests"
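
    If the goal is also to skip the install step when the cache key matches exactly, the Cache@2 task's cacheHitVar input can be combined with a step condition. The fragment below is only a sketch of the steps section, not a tested pipeline: the variable name CACHE_RESTORED is an arbitrary choice, and the key and path are copied unchanged from the pipeline above. On a partial match through restoreKeys the variable is not 'true', so npm install still runs in that case.

    ```yaml
    steps:
      - checkout: self
        clean: false

      # Restore cache. cacheHitVar records whether the key matched exactly.
      - task: Cache@2
        inputs:
          key: 'npm | "$(Agent.OS)" | client/package-lock.json'
          restoreKeys: |
            npm | "$(Agent.OS)"
          path: $(nodeModulesCache)
          cacheHitVar: CACHE_RESTORED # assumed name, any variable name works
        displayName: Cache Node modules

      # Install dependencies only when there was no exact cache hit.
      - script: npm install
        workingDirectory: client
        condition: and(succeeded(), ne(variables.CACHE_RESTORED, 'true'))
        displayName: "Install packages"
    ```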