Skip to content

Add a working introduction to Bazel. #17

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 12, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
235 changes: 235 additions & 0 deletions working/bazel/module-structure.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,235 @@
The following is a reference for [`Bazel`](https://bazel.build/) (internally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internally -> internally in Google

(This repository is not aimed at Googlers only).
Consider whether it's important to external readers at all, it probably isn't.

Much of the document seems to be about Google-only design choices, so maybe instead make it explicit that this document is mainly about Blaze at Google, but that it also applies to Bazel build systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(Bazel is currently Google only for the most part, but that is rapidly changing. For example SpaceX uses it as their own build system, and many other medium/larger companies are considering it)

called `Blaze` at Google) structures modules of Dart code, especially when
compared to external structures (i.e. via `pub`).

This is *not* a proposal, nor a commitment to build out further Bazel support
externally, but rather a reference to help guide discussions around visibility
modifiers and terminology around modular/reusable Dart code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who is this document intended for?
That is, who would care about Bazel modules at all? And why should they? (So, I guess I'm asking for motivation - to make Bazel compatibility visible in language discussions - and expected audience to be spelled out).

I'm guess it is for people like me, who has no working knowledge of Bazel or Blaze, but who work with Dart pub and language packages all day long, and who might affect how well future language changes matches the Bazel model.

## Monolothic Repository

In ["Why Google Stores Billions of Lines of Code in a Single Repository"][1]
the author explain the benefits of a _Monolothic Repository_ ("Mono Repo").
Many of the constraints described in this document are due to this decision,
and be expected to be _unchangeable_ - i.e. it is quite outside of the scope
of any language (including Dart) to change these requirements.

[1]: https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that Bazel use is a Google-only requirement that nobody else should worry about? (Again, who is the target audience?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Clarified.

## Typical Structure

In the Google mono-repository, there is a single root directory. Let's call it
`google` - and it is commonly referenced by tools and Bazel with a prefix of
`//`, so `//ninjacat` is logically located at `/google/ninjacat`. This root
directory is called a _workspace_.

Bazel [defines a _package_][2] as so:

> The primary unit of code organization in a workspace is the _package_. A
> package is a collection of related files and a specification of the
> dependencies among them.
>
> A package is defined as a directory containing a file named `BUILD`,
> residing beneath the top-level directory in the workspace. A package includes
> all files in its directory, plus all subdirectories beneath it, except those
> which themselves contain a BUILD file.

[2]: https://docs.bazel.build/versions/master/build-ref.html#packages_targets

As one can see, already _package_ is very different from a [`pub` package][3].

[3]: https://www.dartlang.org/guides/libraries/create-library-packages

We map the concept of `pub` packages with the following rules:

* A `package:<a>.<b>/uri.dart` resolves to `//a/b/lib/uri.dart`.
* Any other `package:<name>/uri.dart` resolves to `//third_party/dart/<name>`,
i.e. without a `.` in the name.

### Inside a Package

Inside a package, such as `//ninjacat/app`, you can expect the following:

```
> ninjacat/
> app/
> lib/
> test/
> BUILD
```

(If the package happens to be a Dart web _entrypoint_, you might also see `web/`
and for VM binaries, you might also see `bin/`.)

However, there is another important concept, [`targets`][4]:

> A package is a container. The elements of a package are called _targets_. Most
> targets are one of two principal kinds, files and rules. Additionally, there
> is another kind of target, package groups, but they are far less numerous.

[4]: https://docs.bazel.build/versions/master/build-ref.html#targets

So, imagine the following, in the `BUILD` file:

```
dart_library(
name = "app",
srcs = ["lib/app.dart"],
deps = [
":flags",
],
)

dart_library(
name = "flags",
srcs = ["lib/flags.dart"],
)

dart_test(
name = "flags_test",
srcs = ["test/flags_test.dart"],
deps = [
":flags",
"//third_party/dart/test",
],
)
```

Here we have _3_ targets:
* `app`, which potentially is code that wraps together application-specific
code before being used later in the `main()` function of something in either
`web/` or `bin/`.
* `flags`, which contains some common code for setting/getting flags.
* `flags_test`, which tests that `flags` is working-as-intended.

This concept already is quite different than a `pub` package, where all of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

than -> from
?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

files in `lib/` are accessible once you have a dependency on that package. In
fact, a common issue externally is that `pubspec.yaml` (sort of similar to
`BUILD`) is not granular enough, leading to the creation of "micro packages"
that have a single file or capability.

### Common Patterns

Based on the above, teams tend to structure their projects _hierarchically_,
with more specific code living deeper in a sub-package. For example, imagine
the `ninjacat` project after a few more weeks:

```
> ninjacat/
> app/
> views/
> checkout/
> lib/
> test/
> BUILD
> home/
> lib/
> test/
> BUILD
> common/
> widgets/
> login/
> lib/
> test/
> testing/
> lib/
> BUILD
> BUILD
```

Already we have the following "packages":

* `package:ninjacat.app.views.checkout`
* `package:ninjacat.app.views.home`
* `package:ninjacat.common.widgets.login`
* `package:ninjacat.common.widgets.login.testing`

It's common to use the [`visibility`][5] property to setup the concept of
"friend" packages, i.e. packages that are only accessible to _specific_ other
packages. In the following code, we ensure that the `login` "package" is only
accessible to packages under `app/`:

```
# //ninjacat/common/widgets/login/BUILD

dart_library(
name = "login",
srcs = glob(["lib/**.dart"]),
visibility = [
"//ninajacat/app:__subpackages__",
],
)
```

[5]: https://docs.bazel.build/versions/master/be/common-definitions.html#common-attributes

Another common pattern is the concept of `testonly`, and `testing` packages:

```
# //ninjacat/common/widgets/login/BUILD

dart_test(
name = "login_test",
srcs = ["test/login_test.dart"],
deps = [
"//ninjacat/common/widgets/login/testing",
"//third_party/dart/test",
],
)
```

```
# //ninjacat/common/widgets/login/testing

dart_library(
name = "testing",
srcs = glob(["lib/**.dart"]),
testonly = 1,
deps = [
"//third_party/dart/pageloader",
],
)
```

In this case, `package:ninjacat.common.widgets.login.testing` exposes a set of
test-only libraries that can be used, only in _test_ targets. This pattern isn't
replicable externally at all, but is _very_ common internally to avoid bringing
in test utilities and code into production applications.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting insight - I could imagine a dev_only flag which could be set which only allows a package to be used as a dev dependency or as a direct dependency of other dev_only packages - this would be non-breaking for pub (but a breaking change per package when they opt in).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack


## Notable Differences

### No cyclical _targets_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth mentioning that this is actually quite common when you take into account dev dependencies. If you have a testing only package (angular_test, build_test), which depends on the main package but is also used within the main packages tests (very common) now you have a cyclic dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


Dart, as the language, allows cyclical dependencies between `.dart` files
(libraries). Bazel does _not_ allow cyclical dependencies between packages
(i.e. _targets_). So, the following is illegal in Bazel where it is fine
externally with `pub`:

```
# //ninjacat/common/foo (i.e. package:ninjacat.common.foo/foo.dart)

dart_library(
name = "foo",
srcs = ["lib/foo.dart"],
dependencies = [
"//ninjacat/common/bar",
],
)
```

```
# //ninjacat/common/bar (i.e. package:ninjacat.common.bar/bar.dart)

dart_library(
name = "bar",
srcs = ["lib/bar.dart"],
dependencies = [
"//ninjacat/common/foo",
],
)
```

Unfortunately this is quite common when you take into account the concept of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only unfortunate if you want to treat an entire pub package as a single unit of compilation/modularity, which was never the goal.

A pub package's granularity is the unit of "shippable API product". It's a version manager, not a build tool. Finding a pub package lacking as the unit of compilation is like complaining that you can't recline an entire car. That's not the point. It's a container around things that can be reclined. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you can't say micro packages are an anti-pattern and say pubspec.yaml is only about version solving :-/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we probably want are three levels of granularity:

  1. Shippable versioned API product. These have names, release schedules, dependencies on other products. Cycles are allowed. Pub packages.

  2. Build target. This is the unit of compilation and modularity. Likely the unit of encapsulation for things like "sealed". Possibly having a privacy control at this level. Acyclic. This doesn't currently exist in a user-facing way.

  3. Source file. A single file of source code with its own top-level scope. Privacy and namespacing works at this level. Dart libraries. Cycles are allowed (and heavily used in practice).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The levels of granularity you describe @munificent are exactly what we have in bazel, with the only difference being that packages are defined as directories with BUILD files, instead of coming from pub. But conceptually they are the same.

What @matanlurey and probably most people who have tried to integrate with a Dart compiler wants/needs, is a formal and consistent way of defining the Build target portion externally. I believe this should be integrated with pub, for a couple reasons:

  • You need to have separate dependencies per build target in order to efficiently create an acyclic graph of build targets.
    • The only alternatives I can come up with are constantly re-analyzing your actual Dart import structure to create custom targets (what package:build does, at significant expense), or forcing all end users to create static configuration for all their transitive dependencies, which is just not realistic.
  • Allows the build target structure to be statically configured, which is a huge win for build systems. Neither bazel nor package:build_runner allow you to read files before declaring your inputs/outputs, so this is a huge pain point today.
  • Module granularity/structure is consistent across tools.
  • No extra dependencies for platforms/features you don't use.
  • Bazel has proven that configuring build targets in the package itself works well (enough). If you need a more granular target you can always send a PR to a package to create it. See my next point for where this potentially falls down externally though...

All this being said there is one potentially huge issue with trying to push the bazel model on the external world, which is figuring out how to actually enforce the acyclic requirement of the build targets. With bazel and a single version restriction for packages, this is not really an issue. But for pub packages any new non-breaking release of a package could introduce a build target cycle for any package that depends on that package accidentally. Maybe the version solver could take that into account (throw out versions if they introduce a build target cycle?), but it doesn't seem trivial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. (We should probably add a doc about build system constraints somewhere)

pub's `dev_dependencies: ...`. If you have a testing only package (
`angular_test`, `build_test`, `flutter_test`), which depends on the main package
but is also used within the main packages tests you have a cyclic dependency.