-
-
Notifications
You must be signed in to change notification settings - Fork 287
Refactor scripts/create_repository.py
, improve efficiency and correctness, and bump Scalatest, protobuf-java, kind-projector
#1642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
liucijus
merged 20 commits into
bazel-contrib:master
from
mbland:bzlmod-update-create_repository
Nov 12, 2024
Merged
Refactor scripts/create_repository.py
, improve efficiency and correctness, and bump Scalatest, protobuf-java, kind-projector
#1642
liucijus
merged 20 commits into
bazel-contrib:master
from
mbland:bzlmod-update-create_repository
Nov 12, 2024
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
No code changes other than the version bumps.
Improves readability by associating several functions directly with the `MavenCoordinates` class. Doesn't produce any changes to the `third_party/repositories/scala_*.bzl` files.
Improves readability and maintainability. No change in `third_party/repositories/scala_*.bzl`.
Uses a list comprehension in `get_json_dependencies` as well, instead of the `list(map(...))` composition. No changes in `third_party/repositories/scala_*.bzl`.
This will hopefully help future contributors discover how to properly update these files via `scripts/create_repository.py`.
Continuation of refactoring to objects to improve maintainability.
I noticed that `org.typelevel:kind-projector_2.{11.12,12.12,13.15}` was always updated after putting `print()` statements in `_map_to_resolved_artifacts`. Arguably if it's a root artifact (which `kind-projector` is), or dependency of one, it shouldn't be 'testonly'.
It seems version 0.13.3 is available for Scala 2.11.12, so no need to restrain its version number.
Also hoisted select_root_artifacts to the top of the file, closer to the actual root artifact version constant declarations.
Also added docstrings to all public methods.
It turns out it's in all the existing `third_party/repositories/scala_*.bzl` files anyway. Also removed `EXCLUDED_ARTIFACTS` from create_repository.py.
Slight efficiency improvement to avoid having to recompute the same labels over and over.
Avoids having `_get_json_dependencies` read the file multiple times, and allows us to clean it up as soon as it's read.
Starting to update repo names pulled on a thread that led to adding several new root artifacts and updating others. However, these are all good changes that pass all tests.
I noticed while running the command directly that we could use the JSON output file to run through all resolved artifacts and their dependencies directly. There was no need for separate a separate `cs resolve` step and `cs fetch` steps for each resolved artifact. Nor was there a need for a separate `_get_json_dependencies` function. Most significantly, there was no need to download each artifact for checksumming. `cs fetch` already downloaded them and listed their paths in the JSON file. The original code was parsing existing file paths to generate URLs to download them again. This change vastly improves performance. Here are the times for creating a fresh output dir and then updating it before this change: ```txt $ /usr/bin/time ./scripts/create_repository.py --output_dir before [ ...snip... ] 66.01 real 13.25 user 9.99 sys $ /usr/bin/time ./scripts/create_repository.py --output_dir before [ ...snip... ] 1.76 real 1.41 user 0.49 sys ``` Here are the times after: ```txt $ /usr/bin/time ./scripts/create_repository.py --output_dir after [ ...snip... ] 1.16 real 0.84 user 0.39 sys $ /usr/bin/time ./scripts/create_repository.py --output_dir after [ ...snip... ] 0.96 real 0.72 user 0.27 sys ``` Comparing the two output directories via `diff -uNr before after` shows the later having reordered some dependencies, otherwise the output is identical. This change also introduces a metadata cache. It might not help much overall, but it feels right not to recompute data if possible. Also decided to emit the Maven coordinates for each artifact actually updated by the script.
It's only the right thing to do for our fellow humans.
Also: - Renames `SCALAPB_RUNTIME_VERSION` to `SCALAPB_VERSION`, since it affects other artifacts in the ScalaPB suite. - Updates `ArtifactLabelMaker.get_label` so that it doesn't call `self._get_label_impl` unless a label isn't already present. The previous `self._cache.setdefault` call took the result of `self._get_label_impl` as an argument, which defeated the purpose.
The new mechanism is more thorough and robust than the version from "Make scala-parser-combinators a root artifact". It will not only exclude artifacts from the newly generated dict, but will remove `deps` labels for newly excluded artifacts from the existing dict. There are no artifacts in the `EXCLUDED_ARTIFACTS` set yet, but we'll add `com.google.guava:listenablefuture` to it once we get to updating gRPC and Guava. I wanted this mechanism to stand on its own for clarity.
Replaces `SPECIAL_CASE_GROUP_LABELS`. Even though keying on the group label worked for now, keying on the artifact seems more sound.
Adds this new private helper function to `ArtifactLabelMaker` to make stripping Scala version suffixes from artifact names more robust.
This was referenced Nov 8, 2024
simuons
approved these changes
Nov 12, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
liucijus
approved these changes
Nov 12, 2024
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Nov 15, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Nov 27, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Nov 27, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Dec 8, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Dec 10, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
mbland
added a commit
to mbland/rules_scala
that referenced
this pull request
Dec 16, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.)
liucijus
pushed a commit
that referenced
this pull request
Jan 14, 2025
* Toolchainize all testing toolchains Moves the Scalatest, JUnit, and Specs2 toolchains into `@io_bazel_rules_scala_toolchains//testing`. Part of #1482. Updates all `WORKSPACE` files to set the appropriate `scala_toolchains` parameters and to remove the unnecessary repository import and toolchain registration macros. Adds a `fetch_sources_by_id` parameter to `repositories` from `third_party/repositories/repositories.bzl`. This enables `scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping from artifact ID lists returned by new macros extracted from `WORKSPACE` macros. The values assigned to each id match the original `fetch_sources` settings in the corresponding original `WORKSPACE` macros. Updates `scala/scala_maven_import_external.bzl` to generate a `load` line for `//scala:scala_import.bzl` based on the repo's canonical name, not `@io_bazel_rules_scala`. As usual, includes several other opportunistic removals of the `@io_bazel_rules_scala` repo name prefix to avoid an internal dependency on that name. This means Bzlmod users won't necessarily have to set the `repo_name` parameter of `bazel_dep` when using `rules_scala`. --- Introduces more macros to return a framework's Maven artifact dependencies, rather than inlining them in a `repositories` call. These inlined lists are replaced by macro invocations, and now the `scala_toolchains` macro can invoke these macros to collect artifact IDs to pass to `repositories`. This also allows for future changes to introduce a `scala_version` parameter if necessary, similar to how `scalafmt_artifact_ids` already works. This is important to avoid collisions when creating repositories for artifacts upon which more than one framework depends under Bzlmod. `WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will produce errors like the following, where both `scala_proto` and `twitter_scrooge` depend upon `io_bazel_rules_scala_guava`: ```txt $ bazel build //src/... ERROR: .../scala/scala_maven_import_external.bzl:299:24: Traceback (most recent call last): File ".../scala/extensions/deps.bzl", line 140, column 21, in _scala_deps_impl scala_toolchains( File ".../scala/private/macros/toolchains.bzl", line 140, column 17, in scala_toolchains _scrooge( File ".../twitter_scrooge/twitter_scrooge.bzl", line 96, column 17, in twitter_scrooge repositories( File ".../third_party/repositories/repositories.bzl", line 113, column 37, in repositories _scala_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 263, column 30, in scala_maven_import_external jvm_maven_import_external( File ".../scala/scala_maven_import_external.bzl", line 299, column 24, in jvm_maven_import_external jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs) Error in repository_rule: A repo named io_bazel_rules_scala_guava_2_13_15 is already generated by this module extension at .../scala/scala_maven_import_external.bzl:299:24 ERROR: Analysis of target '//src/java/io/bazel/rulesscala/worker:worker_test' failed; build aborted: error evaluating module extension scala_deps in //scala/extensions:deps.bzl ``` Recent updates to `scripts/create_repository.py` (#1639, #1642) make it easy to emit full direct dependency lists for artifacts included in `third_party/repositories/scala_*.bzl`. This increases the likelihood of collisions, since this expanded metadata forces the macros that instantiate artifact repos to instantiate even more repos. By fetching list of artifact IDs from these macros, `scala_toolchains` can now consolidate them into dictionary keys. Then it passes these unique keys to `repositories` directly, avoiding the problem of instantiating the same repo multiple times in the same module extension. This, in turn, also avoids the need to add parameters to the original `WORKSPACE` macros that instantiate dependencies to avoid collisions under Bzlmod. The `scala_toolchains` macro never needs to call these original macros, under either `WORKSPACE` or Bzlmod. Finally, it also reduces duplication between these artifact ID lists and the `_*_DEPS` symbols originally from `testing/BUILD` (and now in `testing/deps.bzl`). The dependency labels are now generated programatically. (Aside: As I mentioned, we may eventually need to pass a Scala version argument to these macros. It will be possible to cross that bridge without too much trouble if and when that day comes. Or I can try to future proof it in a follow up pull request.) * Move `scala_toolchains` to `scala/toolchains.bzl` Removes this symbol from `scala/scala.bzl` as well as `setup_scala_testing_toolchain`, and deletes `scala/private/macros/toolchains.bzl`. Part of #1482 and #1652. This is required for Bazel 8 and `rules_java` 8 compatibility, but is also compatible with Bazel 6 and 7. In #1652, @hvadehra suggested partitioning the `.bzl` files such that `WORKSPACE` doesn't `load` a file that tries to `load` symbols from `rules_java`. I successfully did so in a separate branch, and along with other minor changes, got `rules_scala` to build with `rules_java` 8.5.1. The other changes will come in separate pull requests, but it makes sense to land this change now before adding any other toolchains to `scala_toolchains`. --- Arguably, we should remove all macros exported from `scala/scala.bzl` that only instantiate toolchain dependencies and define toolchains. That may be a breaking change for some users, but will ultimately be necessary for these macros to remain compatible with Bazel 8. * Extract versioned `_JUNIT_DEPS` in `test/BUILD` Eliminates reliance on the default `@io_bazel_rules_scala_junit_junit` artifact repository. * Update `{junit,specs2_junit}_toolchain()` These macros now mirror the implementation of `scalatest_toolchain()`. However, I realized that the old pattern of calling `scalatest_repositories()` followed by `scalatest_toolchain()` will no longer work without first calling `scala_toolchains(scalatest = True)`. This is because the `alias` targets in `testing/BUILD` that replace the previous implementations all point to `@io_bazel_rules_scala_toolchains`. So if we want to keep these macros, it seems like we should maybe restore the original toolchain targets in `testing/BUILD`. If we don't, we can remove these macros, but we can document these as breaking changes, and update other documentation accordingly. * Move `scala/{private/macros/,}toolchains_repo.bzl` Like "Move `scala_toolchains` to `scala/toolchains.bzl`", removes the `scala_toolchains_repo` symbol from `scala/scala.bzl` and makes it available from `scala/toolchains_repo.bzl`. This avoids a future `test_scala_version 2.12.20` failure during Bazel 8 builds after adding `twitter_scrooge` toolchain support in the new `test_version/version_specific_tests_dir/scrooge_repositories.bzl` file. Otherwise, this new file would load `toolchains_repo` from `scala/scala.bzl`. The `test_version/test_scala_version_.../WORKSPACE` file generated from `test_version/WORKSPACE.template` would then transitively load `.bzl` files with `rules_java` symbols, breaking the test. ```txt $ RULES_SCALA_TEST_ONLY="test_scala_version 2.12.20" ./test_version.sh ERROR: Traceback (most recent call last): File ".../external/rules_scala/scala/private/common_attributes.bzl", line 18, column 28, in <toplevel> "deps": attr.label_list( Error in label_list: Illegal argument: element in 'providers' is of unexpected type. Either all elements should be providers, or all elements should be lists of providers, but got list with an element of type NoneType. ERROR: Error computing the main repository mapping: at test_version/test_scala_version_.../scrooge_repositories.bzl:1:6: at .../scala/scala.bzl:30:5: at .../scala/private/rules/scala_junit_test.bzl:5:5: initialization of module 'scala/private/common_attributes.bzl' failed ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Refactors
scripts/create_repository.py
to make more modular, robust, and efficient, and update Scalatest, protobuf-java, and kind-projector jars. Part of #1482.There are a lot of changes from individual commits in this one large PR, since @simuons said it wasn't necessary to break up #1639, another large PR to refactor
scripts/create_repository.py
, and all tests continue to pass. That said, I'm always more than happy to break this one apart if it really is too much.The highlights are:
EXCLUDED_ARTIFACTS
, the all script will update existing dictionary data appropriately.deps
labels referring to it, or remove the artifact and its references altogether.Perhaps most significantly, this cuts the number of Coursier commands and artifact downloads significantly, leading to a very large performance improvement. Here are the times for creating a fresh output dir and then updating it before this change (all with a warm Coursier download cache):
Here are the times after:
Motivation
Overall, this script is making my further work to update dependencies to ease the Bzlmod migration in #1482 so much faster, easier, and safer. But I found ways to improve it further, after #1639, to make the script itself more maintainable, more correct and robust, and much faster.
I'm now using it to bump ScalaPB, protoc-bridge, gRPC, guava, etc. dependencies, and resolving any breakages, in an effort to make rules_scala able to use newer versions of the Protobuf library. Though independent from (and not strictly required by) the Bzlmod migration work, this may be especially important to future Bzlmod users that include later Protobuf versions in their dependencies.
The script is now in a state where I really trust it to do the right thing as I continue this work. This makes this complex upgrade work far more tractable than before.