We have a single repo of our source code which, when downloaded in full, is around 2.8 GB. We have 4 self-hosted agents and over 100 build pipelines. With that, it is not feasible to download the entire source code for each build/agent.
The approach I've gone with is to disable the checkout for these pipelines and then run a command-line script to perform a Git sparse checkout. However, this is taking around 15 minutes to get ~100 MB worth of source code.
We are using self-hosted Linux agents.
steps:
- checkout: none
- task: CmdLine@2
  displayName: "Project Specific Checkout"
  inputs:
    script: |
      cd $(Build.SourcesDirectory)
      git init
      git config --global user.email ""
      git config --global user.name ""
      git config --global core.sparsecheckout true
      echo STARS/Source/A/ >> .git/info/sparse-checkout
      echo STARS/Source/B/ >> .git/info/sparse-checkout
      echo STARS/Source/C/ >> .git/info/sparse-checkout
      git remote rm origin
      git remote add origin https://service:$(Service.Account.Personal.Access.Token)@dev.azure.com/Organization/Project/_git/STARS
      git reset --hard
      git pull origin $(Build.SourceBranch)
Is there anything I'm doing wrong here that is causing it to take so long to pull this data?
1. Since you use a self-hosted agent, you could go to the agent machine and run the git commands manually, to see whether you get the same performance.
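One minimal way to try that, assuming a throwaway test directory and substituting your own branch name and PAT (both are placeholders below), is to repeat the pipeline's commands with time in front of the expensive one:

mkdir /tmp/sparse-test && cd /tmp/sparse-test
git init
git config core.sparsecheckout true
echo STARS/Source/A/ >> .git/info/sparse-checkout
git remote add origin https://service:<PAT>@dev.azure.com/Organization/Project/_git/STARS
time git pull origin <branch>   # if this alone takes ~15 minutes, the fetch is the bottleneck, not the pipeline
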
2. Set the variable system.debug to true, to check which command costs more time (sketched together with suggestion 3 below).
3. Instead of a Git sparse checkout, you may specify path in the checkout step: https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema?view=azure-devops&tabs=schema%2Cparameter-schema#checkout
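For illustration, a minimal YAML sketch of suggestions 2 and 3 together; the stars value is just a placeholder, and path is relative to $(Agent.BuildDirectory):

variables:
  system.debug: true        # suggestion 2: adds timestamped debug output to every step

steps:
- checkout: self            # suggestion 3: use the built-in checkout task
  path: stars               # sources are placed under $(Agent.BuildDirectory)/stars
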
4. Since you run the pipeline on a self-hosted agent, by default none of the subdirectories are cleaned in between two consecutive runs. As a result, you can do incremental builds and deployments, provided that your tasks are implemented to make use of that. So you can set the Clean option to false:
https://learn.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml#workspace
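In YAML, assuming you use the built-in checkout step rather than checkout: none, a sketch of that option is:

steps:
- checkout: self
  clean: false    # keep the sources from the previous run for incremental builds

The job-level workspace setting (see the link above) controls the same cleanup behaviour; leaving it unset preserves the self-hosted default of not cleaning between runs.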