
[usage] Persist usage reports in database #11343


Merged: 8 commits merged into main on Jul 14, 2022

Conversation

@andrew-farries (Contributor) commented Jul 13, 2022

Description

Store the usage report in the database table created in #11311 whenever usage reconciliation runs.

⚠️ This PR does not yet populate the creditsUsed or generationId fields ⚠️

Related Issue(s)

Part of #10323

How to test

Run the usage component against the database in this PR's preview environment:

  • Port-forward to the database in the preview env.
  • Edit the local config components/usage/config.json so that the usage controller runs every 10s.
  • Run the usage component:

DB_USERNAME=gitpod DB_PASSWORD=<> DB_HOST=localhost DB_PORT=3306 go run . run

(get the password from the server environment)

  • Wait a while for reconciliation to run a couple of times.
  • Connect to the port-forwarded database with e.g. mycli and observe the data in d_b_workspace_instance_usage.

Unit tests.

Release Notes

NONE

Werft options:

  • /werft with-preview

@andrew-farries andrew-farries requested a review from a team July 13, 2022 10:23
@github-actions github-actions bot added the team: webapp label (Issue belongs to the WebApp team) Jul 13, 2022
return "d_b_workspace_instance_usage"
}

func CreateUsageRecords(ctx context.Context, conn *gorm.DB, report map[AttributionID][]WorkspaceInstanceForUsage) error {
Member:

I'd suggest the DB layer only takes []WorkspaceInstanceForUsage to store, rather than the map, and persists these. The re-mapping can be done in the controller layer. This allows us to keep the db package slightly more general.

Member:

Or actually, it should take []WorkspaceInstanceUsage.

Contributor Author:

Done. The conversion between reports and []WorkspaceInstanceUsage is done in the reconciler.
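A minimal sketch of the agreed-upon shape, assuming gorm v2 (the upsert clause is an assumption, suggested by the duplicate-entries test further down; the real implementation may differ):

import (
	"context"

	"gorm.io/gorm"
	"gorm.io/gorm/clause"
)

// CreateUsageRecords persists a flat slice of usage records. Updating rows on
// conflicting primary keys is an assumption here, suggested by
// TestCanHandleMultipleReportsWithDuplicateEntries below.
func CreateUsageRecords(ctx context.Context, conn *gorm.DB, records []WorkspaceInstanceUsage) error {
	return conn.WithContext(ctx).
		Clauses(clause.OnConflict{UpdateAll: true}).
		Create(&records).Error
}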

// the same workspace again
ID: instanceID,
UsageAttributionID: teamAttributionID,
WorkspaceClass: "default",
Member:

There should be a constant for this default value.

Contributor Author:

No longer required now that the CreateUsageRecords function takes WorkspaceInstanceUsage structs directly.
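For reference, the suggested pattern would have been something like this (the constant name is hypothetical):

// Hypothetical constant for the default workspace class suggested above;
// made redundant once callers pass WorkspaceInstanceUsage structs directly.
const DefaultWorkspaceClass = "default"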

}

func TestCanHandleMultipleReportsWithDuplicateEntries(t *testing.T) {
conn := dbtest.ConnectForTests(t)
Member:

I'd recommend moving the conn into each test scenario, to ensure that after each test the data is deleted before the next test run.

Contributor Author:

Done.
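A sketch of the suggested layout, assuming the scenario struct used in the excerpts below:

for _, scenario := range scenarios {
	t.Run(scenario.Name, func(t *testing.T) {
		// Connecting inside the subtest scopes the connection (and any
		// cleanup registered on t) to this scenario only.
		conn := dbtest.ConnectForTests(t)
		for _, report := range scenario.Reports {
			require.NoError(t, db.CreateUsageRecords(context.Background(), conn, report))
		}
	})
}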

@@ -85,6 +85,8 @@ func (u *UsageReconciler) Reconcile() (err error) {
}
log.Infof("Wrote usage report into %s", filepath.Join(dir, stat.Name()))

db.CreateUsageRecords(ctx, u.conn, report)
Member:

Should handle error.

Contributor Author:

Done.
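A minimal sketch of the fix (the error message wording is an assumption):

if err := db.CreateUsageRecords(ctx, u.conn, report); err != nil {
	return fmt.Errorf("failed to persist usage report: %w", err)
}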

for _, scenario := range scenarios {
t.Run(scenario.Name, func(t *testing.T) {
for _, report := range scenario.Reports {
require.NoError(t, db.CreateUsageRecords(ctx, conn, report))
Member:

You may need the helpers from dbtest which add hooks into the created records to delete them once the test completes. See dbtest.CreateWorkspace for an example.

Contributor Author:

I've added t.Cleanup calls to each subtest to delete the usage records between each subtest.

Member:
conn := dbtest.ConnectForTests(t)
t.Run(scenario.Name, func(t *testing.T) {
t.Cleanup(func() {
require.NoError(t, conn.Where("1=1").Delete(&db.WorkspaceInstanceUsage{}).Error)
This was the cause of the problems with flaky tests. When more tests exist and do this, it deletes the data needed for other runs of the test. That's why the current approach adds a hook for each ID created, to delete only the IDs created in each test.

Contributor Author:
🤔

None of these tests call t.Parallel, so AIUI the tests in this file will run in series, and within each test each subtest will also run in series. So where is the risk of having a subtest clearing the table after it finishes?

Is it tests running in different packages (which will be run in parallel by go test) that is the concern here?

@easyCZ (Member) commented Jul 13, 2022:

But tests in different files will be run concurrently. At some point, we'll add a test which also interacts with the WorkspaceInstanceUsage table and we'll be in trouble. The problem comes from the following interleaving:

test1: Create 10 records
test2: create 2 records
test1: Check we've got 10 records - fail, got 12

Member:

The exact same issue happened previously: we added tests for d_b_workspace_instance, and then we also added tests for the controller (at a higher level) which also created d_b_workspace_instance records.

Member:

#10971 has extra context on this, including a link to some literature.

Contributor Author:

But tests in different files will be run concurrently.

I don't think that's correct. See for example here:

By default, execution of test code using the testing package will be done sequentially. However, note that it is only the tests within a given package that run sequentially.

If tests from multiple packages are specified, the tests will be run in parallel at the package level. For example, imagine there are two packages, package a and package b. The test code in package a will be run sequentially, and the test code in package b will also be run sequentially. However, the tests for package a and package b will be run in parallel. Let’s look more closely into how these will be run in parallel.

So I guess I'm struggling to understand how test cleanup code within a package can interfere with other tests in the same package if there is no t.Parallel anywhere. Maybe there is a race in actually deleting the rows from the database?

Member:

The issues we observed were across different packages. And the same will happen here once we add proper tests to the controller package.

Contributor Author:

Test cleanup changed to only remove records that were created by each test.
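A sketch of the per-record cleanup pattern described here, modelled on the dbtest.CreateWorkspace idea mentioned above (the helper name and the instanceId column are assumptions):

// createUsageRecords is a hypothetical test helper: it inserts the given
// records and registers a cleanup hook that deletes only those records,
// so tests in other packages sharing the database are unaffected.
func createUsageRecords(t *testing.T, conn *gorm.DB, records []db.WorkspaceInstanceUsage) {
	t.Helper()
	require.NoError(t, db.CreateUsageRecords(context.Background(), conn, records))

	ids := make([]uuid.UUID, 0, len(records))
	for _, r := range records {
		ids = append(ids, r.InstanceID)
	}
	t.Cleanup(func() {
		require.NoError(t, conn.Where("instanceId IN ?", ids).Delete(&db.WorkspaceInstanceUsage{}).Error)
	})
}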

@werft-gitpod-dev-com:

started the job as gitpod-build-af-store-usage-data.7 because the annotations in the pull request description changed
(with .werft/ from main)

Andrew Farries added 4 commits July 13, 2022 14:07
Add a GORM type that represents the "d_b_workspace_instance_usage" table
and methods for working with it.
)

type WorkspaceInstanceUsage struct {
WorkspaceID uuid.UUID `gorm:"primary_key;column:workspaceId;type:char;size:36;" json:"workspaceId"`
Member:

Does it make sense to merge the other PR with the rename into this one...? I thought it had already been merged.

💡 Or has the table definition been merged, but not the Go struct?

@geropl (Member) commented Jul 14, 2022

Doing a quick test...

@geropl geropl self-assigned this Jul 14, 2022
@geropl (Member) commented Jul 14, 2022

Works as advertised. I understood that credits and generationId will be set later.

Only thing I noticed is that stoppedAt is not set, while the workspace_instance is definitely stopped (and has a stoppedTime).
[screenshot]

@andrew-farries Any idea why that might be?

Andrew Farries added 4 commits July 14, 2022 10:23
Rename the misnamed primary key `workspaceId` -> `instanceId`.

Usage based billing is unreleased so just drop the table and recreate it
- no data to migrate.

Also add index names.
In the case where the instance has stopped.
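For reference, a sketch of the primary key after the rename commit above (tags inferred from the pre-rename excerpt; other fields omitted):

type WorkspaceInstanceUsage struct {
	InstanceID uuid.UUID `gorm:"primary_key;column:instanceId;type:char;size:36;" json:"instanceId"`
}

// TableName maps the struct onto its table, matching the earlier excerpt.
func (u WorkspaceInstanceUsage) TableName() string {
	return "d_b_workspace_instance_usage"
}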
@andrew-farries (Contributor Author):

Only thing I noticed is that stoppedAt is not set, while the workspace_instance is definitely stopped (and has a stoppedTime).

Thanks. I've fixed this and added tests for the conversion from a usage report to instance usage records (f25999d and b9016e9).
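A sketch of the kind of fix described, with hypothetical field names on both structs (the actual conversion lives in the reconciler and the commits referenced above):

// Illustrative conversion: StoppedAt is only populated once the instance has
// actually stopped; field names here are assumptions, not the real schema.
func toUsageRecord(i WorkspaceInstanceForUsage) WorkspaceInstanceUsage {
	record := WorkspaceInstanceUsage{
		InstanceID: i.ID,
		StartedAt:  i.StartedAt,
	}
	if i.StoppedAt.Valid { // sql.NullTime-style guard
		record.StoppedAt = i.StoppedAt
	}
	return record
}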

@geropl (Member) left a comment:

Note that the current preview env is in an undefined state.

But I applied the migration manually, ran the test and 🎉 : it worked. 🧘
[screenshot]

@andrew-farries (Contributor Author) commented Jul 14, 2022

/werft run with-clean-slate-deployment=true with-preview=true

👍 started the job as gitpod-build-af-store-usage-data.12
(with .werft/ from main)

@liam-j-bennett (Contributor) commented Jul 14, 2022

/hold This is blocking the merge queue of a fix that fixes the merge queue 😢 I'll unhold when I've merged.

@liam-j-bennett (Contributor) commented Jul 14, 2022

/unhold You will need to restart the werft job. Apologies for the downtime 🤦

@andrew-farries (Contributor Author) commented Jul 14, 2022

/werft run with-clean-slate-deployment=true with-preview=true

👍 started the job as gitpod-build-af-store-usage-data.13
(with .werft/ from main)

@geropl (Member) commented Jul 14, 2022

@andrew-farries What about running with-preview=false? I already tested twice and am confident it works, so let's get this in! 🚢

@werft-gitpod-dev-com:

started the job as gitpod-build-af-store-usage-data.14 because the annotations in the pull request description changed
(with .werft/ from main)

@andrew-farries (Contributor Author):

/unhold

@roboquat roboquat merged commit 4692d0a into main Jul 14, 2022
@roboquat roboquat deleted the af/store-usage-data branch July 14, 2022 13:25
@roboquat roboquat added the labels deployed: webapp (Meta team change is running in production) and deployed (Change is completely running in production) Jul 19, 2022