The typical way to use Gradle is to download all needed packages in one step, then build in the next. But suppose you have hundreds of Gradle projects and don't want to download all needed packages in one step. I assume there is a way Bazel can cache per package, right? If so, how does it handle downloading packages in parallel that might share the same parent package? They could potentially try to write the same files at the same time and conflict, right? The parents are not listed in the lock file, so without hand-parsing the pom file, it's hard to know what order to download things in.
It's a two-step process:
First, Bazel doesn't natively understand pom.xml files, so it has to use an external Maven-aware tool to resolve the top-level pom.xml files and produce a description of the full package graph. This can result in diamond dependencies, as you said. It's possible to extend Bazel with bzlmod to integrate with these tools. For example, rules_jvm_external is a bzlmod module that uses Aether to resolve the structure without downloading the artifacts.
Second, the tool generates a BUILD file that mirrors the structure, like this:
jvm_import(
    name = "org_hamcrest_hamcrest_core",
    visibility = ["//visibility:public"],
    tags = ["maven_coordinates=org.hamcrest:hamcrest-core:1.3", "maven_repository=https://maven.google.com", "maven_sha256=66fdef91e9739348df7a096aa384a5685f4e875584cce89386a7a47251c4d8e9", "maven_url=https://maven.google.com/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar"],
    jars = ["@maven//:v1/https/repo1.maven.org/maven2/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar"],
    deps = [],
)

jvm_import(
    name = "org_hamcrest_hamcrest_integration",
    visibility = ["//visibility:public"],
    tags = ["maven_coordinates=org.hamcrest:hamcrest-integration:1.3", "maven_repository=https://maven.google.com", "maven_sha256=70f418efbb506c5155da5f9a5a33262ea08a9e4d7fea186aa9015c41a7224ac2", "maven_url=https://maven.google.com/org/hamcrest/hamcrest-integration/1.3/hamcrest-integration-1.3.jar"],
    jars = ["@maven//:v1/https/repo1.maven.org/maven2/org/hamcrest/hamcrest-integration/1.3/hamcrest-integration-1.3.jar"],
    deps = ["@maven//:org_hamcrest_hamcrest_library"],
)

jvm_import(
    name = "org_hamcrest_hamcrest_library",
    visibility = ["//visibility:public"],
    tags = ["maven_coordinates=org.hamcrest:hamcrest-library:1.3", "maven_repository=https://maven.google.com", "maven_sha256=711d64522f9ec410983bd310934296da134be4254a125080a0416ec178dfad1c", "maven_url=https://maven.google.com/org/hamcrest/hamcrest-library/1.3/hamcrest-library-1.3.jar"],
    jars = ["@maven//:v1/https/repo1.maven.org/maven2/org/hamcrest/hamcrest-library/1.3/hamcrest-library-1.3.jar"],
    deps = ["@maven//:org_hamcrest_hamcrest_core"],
)
...
Bazel then analyzes this BUILD file to create its own internal graph structure. Depending on which build target is requested, it fetches the JARs as necessary while it traverses and evaluates the graph in topological order. Because the graph evaluator and its cache live in memory, a package with many dependents is fetched and written to disk exactly once, so there's no possibility of files being clobbered.
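To make the "fetched exactly once" point concrete, here is a rough Python sketch of a memoizing graph evaluator. It is only an illustration of the idea, not Bazel's actual implementation. The names Evaluator, fetch_all, DEPS and the app_a target are hypothetical; the hamcrest entries just mirror the deps attributes from the BUILD file above, and the "fetch" is simulated by appending to a log rather than downloading anything.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical dependency graph. The hamcrest entries mirror the deps in the
# BUILD file above; app_a is made up so that several targets share a parent.
DEPS = {
    "hamcrest_core": [],
    "hamcrest_library": ["hamcrest_core"],
    "hamcrest_integration": ["hamcrest_library"],
    "app_a": ["hamcrest_library", "hamcrest_core"],
}


class Evaluator:
    """Evaluates each node at most once, even under concurrent requests."""

    def __init__(self, deps):
        self._deps = deps
        self._lock = threading.Lock()
        self._futures = {}   # node name -> Future holding its result
        self.fetch_log = []  # order in which simulated fetches happened

    def evaluate(self, name):
        # The first caller to ask for a node becomes its owner; everyone
        # else finds the existing Future and simply waits on it.
        with self._lock:
            future = self._futures.get(name)
            is_owner = future is None
            if is_owner:
                future = self._futures[name] = Future()
        if is_owner:
            try:
                # Dependencies are evaluated before the node itself.
                for dep in self._deps[name]:
                    self.evaluate(dep)
                with self._lock:
                    self.fetch_log.append(name)  # stand-in for the download
                future.set_result(name + ".jar")
            except Exception as exc:
                future.set_exception(exc)
        return future.result()


def fetch_all(targets, deps=DEPS, workers=4):
    """Requests all targets in parallel and returns the fetch log."""
    evaluator = Evaluator(deps)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(evaluator.evaluate, targets))
    return evaluator.fetch_log


if __name__ == "__main__":
    print(fetch_all(["hamcrest_integration", "app_a", "hamcrest_library"]))
```

Even though three targets are requested in parallel and all of them reach hamcrest_core, each name shows up in the log only once, and always after its dependencies. The exact order can vary between runs. The sketch assumes the graph is acyclic; with a cycle, two owners would wait on each other forever.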
If you are interested in learning more, see Bazel's evaluation model.