protoc-gen-go: determine plan for improving generated APIs #526

@dsnet

Background

The API for generated proto messages (i.e., .pb.go sources) as created by protoc-gen-go is not perfect. It was not perfect when it was created, it is not perfect if we hypothetically fix it today, and it will not be perfect tomorrow. The reality is that we will (with high probability) eventually come up with a new API that is sufficiently better than the one today that it is worth some amount of migration effort. As such, we should come up with a way for protoc-gen-go to generate cleaner, idiomatic, and nicer APIs.

We should come up with a migration strategy that has the following properties:

  1. By default, protoc-gen-go initially generates the API that is considered the "best", by some measure of good according to the standard of the day (e.g., if the Go language added native support for tagged unions, it would be really unfortunate to continue generating the wrapper types for proto oneofs or using pointers to primitive types).
  2. Once I start using protoc-gen-go to generate a specific message, I should have some guarantees regarding the stability of the generated API (i.e., that the generated API will not change or remove declarations). In other words, I should not need to change my .proto file at all to maintain compatibility.
  3. There should be some way to perform a multi-stage migration to the new API. For protos that are part of the public API of some widely used open-source package, this doesn't matter. However, there are a number of closed-source environments where the generated proto and all its dependencies are within the control of a single organization, and where a graceful migration is possible.

Goal 1 is about eagerly adopting the future, goal 2 is about maintaining the stability of the past, while goal 3 is about trying to migrate the past to the future. I understand that goal 3 is not possible for some use-cases, but that doesn't mean it is not worth supporting. I suspect that goal 2 and 3 are irreconcilable; if so, I propose that we favor goal 1 and firmly document that it is the user's responsibility to vendor or version control which exact version of protoc-gen-go they use.

How to accomplish this?

I don't know, but I have some rough ideas.

Some possible ways to achieve goals 1 and 2:

  • Every .proto file is required to specify the generated API version. This could be a single option go_api_version = "protoc-gen-go/v2" (whether to add the generator name and/or allow multiple generators is up for debate). Alternatively, instead of a monotonic number (like v2), it could be a set of API feature flags. If no such option is specified, you are always living on the edge (which unfortunately violates goal 2).
  • protoc-gen-go could emit an unexported constant or magic comment into the generated .pb.go file to record which API version or features it is using. When protoc-gen-go is invoked, it checks whether the output .pb.go file exists; if so, it determines which API version was previously used and generates according to it. In this approach, the API version is encoded in the output .pb.go file.
  • Your brilliant idea here.
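
To make the second idea concrete, here is a rough sketch of how a generator might recover the API version from an earlier output file. Everything named here is an assumption for illustration: the marker comment `// protoc-gen-go-api-version:`, the `detectAPIVersion` helper, and the `v2` default do not exist in protoc-gen-go.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// apiVersionMarker is a hypothetical magic-comment prefix. protoc-gen-go
// does not define such a marker; it is assumed here for illustration.
const apiVersionMarker = "// protoc-gen-go-api-version:"

// latestAPIVersion is the hypothetical version used when there is no
// previous output to stay compatible with (goal 1).
const latestAPIVersion = "v2"

// detectAPIVersion scans the text of a previously generated .pb.go file for
// the marker comment. It returns the recorded version if one is found
// (goal 2), and latestAPIVersion otherwise.
func detectAPIVersion(prevSource string) string {
	sc := bufio.NewScanner(strings.NewReader(prevSource))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, apiVersionMarker) {
			v := strings.TrimSpace(strings.TrimPrefix(line, apiVersionMarker))
			if v != "" {
				return v
			}
		}
	}
	return latestAPIVersion
}

func main() {
	old := "// Code generated by protoc-gen-go. DO NOT EDIT.\n" +
		"// protoc-gen-go-api-version: v1\n" +
		"package foo\n"
	fmt.Println(detectAPIVersion(old)) // existing output pins the version
	fmt.Println(detectAPIVersion(""))  // no previous output: use the latest
}
```

A real generator would read the existing file from disk before overwriting it. The sketch takes the file contents as a string so the version-pinning logic can be seen in isolation.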

Some possible ways to achieve goal 3:

  • Add proto options to opt in to generating a duplicate API that supports the previous API. For example, #513 (protoc-gen-go: remove type name from generated enum) proposes to remove the type prefix from enumeration names. An option would allow the generation of both forms. However, note that supporting both APIs is not possible if there is an identifier conflict, for example, if we're trying to change a function signature. Also, note that using proto options alone can only satisfy goal 1 or goal 2, but not both.
    • If you satisfy goal 1, then you are probably breaking goal 2, since I am required to add the option to my proto file to generate the old API.
    • If you satisfy goal 2, then over time your .proto files are littered with option annotations to opt in to the newer API. This does not scale, as it is easy for users to forget to specify that they want the new API semantics on a newly added proto field, which violates goal 1.
  • Your brilliant idea here.
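
As an illustration of the duplicate-API idea, the following sketch shows what generating both enum name forms might look like. The `Color` enum and its values are hypothetical, and the exact shape of the names proposed in #513 is assumed here rather than taken from that issue.

```go
package main

import "fmt"

// Color stands in for a generated enum type. The type and its values are
// hypothetical and do not come from any real .proto file.
type Color int32

// New-style names without the type prefix (assumed shape of the new API).
const (
	RED  Color = 0
	BLUE Color = 1
)

// Old-style names with the type prefix, emitted only when the hypothetical
// opt-in option is set. They are defined in terms of the new constants, so
// both spellings have the same type and value.
const (
	Color_RED  = RED
	Color_BLUE = BLUE
)

func main() {
	// Code written against either API compiles and agrees on values.
	fmt.Println(Color_RED == RED, Color_BLUE == BLUE)
}
```

This works only because the old and new identifiers differ. Go does not allow two declarations of the same identifier in one scope, so a function or method whose signature changes cannot be emitted in both forms under the same name, which is the identifier conflict mentioned above.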

Other considerations

The issue of migration is tangentially related to the desire to have added support for many Go-specific proto options. All of these need to be considered together. I believe it would be a mistake to start adding options without a cohesive idea of where we're going.