MOLT Oracle documentation; refactor MOLT tutorials #19918
✅ Deploy Preview for cockroachdb-interactivetutorials-docs canceled.
✅ Deploy Preview for cockroachdb-api-docs canceled.
❌ Deploy Preview for cockroachdb-docs failed.
@@ -0,0 +1,22 @@
## Limitations

- Migrations must be performed from a single Oracle schema. You **must** include `--schema-filter` so that MOLT Fetch only loads data from the specified schema. Refer to [Schema and table filtering](#schema-and-table-filtering).
Not something for now, but we should also go through https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html to understand what other limitations we would have (since we both use LogMiner under the hood). FYI @taroface and @noelcrl
@noelcrl In case this needs additional docs work, do you mind filing a new issue for me?
For sure, I'll file a new issue if necessary here
Reviewed the rest just now. Looks good, just nits and clarifications. Agreed with Jeremy's comments. I can review again after those are all addressed. I also want to ensure the rest of the team can take a look here. @taroface it would be helpful if you could also link the exact docs preview sections so folks can see how they present on the website.
Addressed all comments. PTAL! The docs preview links are in the PR description.
Changes lgtm!
LGTM! Will let Noel sign off on the Oracle stuff
Suggestions for Bulk Load. I will give my suggestions per page.
Suggestions for Load and Replicate.
Great work so far. I will take another pass at the Oracle-specific portion next, along with the rest of the changes.
{% include_cached copy-clipboard.html %}
~~~ sql
GRANT admin TO crdb_user;
We should note that if the user doesn't want superuser given to `crdb_user`, we can also grant privileges that are less permissive and more explicit. Here are some privilege examples:
-- Fetch privileges
-- 1. Grant database-level privileges (connect, create schema, create temp tables) for schema creation within the database to migrate to
GRANT ALL ON DATABASE defaultdb TO crdb_user;
-- 2. Grant user privileges to create tables like _molt_fetch_exceptions in the public schema, make sure we are connected to the database to migrate to
GRANT CREATE ON SCHEMA public TO crdb_user;
-- Optional privileges if the schema to migrate to already exists and wasn't created by crdb_user (drop-on-target-and-recreate wasn't used):
-- 3. (Optional) if the schema to migrate to already exists and wasn't created by crdb_user, grant privileges on that schema
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "migration_schema" TO crdb_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA "migration_schema"
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;
-- 4. (Optional) as well as public for the molt exception tables:
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO crdb_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;
-- Replication mode privileges afterwards
-- 1. Allow staging database creation for replication
ALTER USER crdb_user CREATEDB;
--
Also updated the privileges instructions for the target user to accommodate these more explicit and less-permissive permissions.
Suggestions for Load and Replicate Separately.
Reviewed Resume Replication and Failback. A few questions.
~~~ shell
molt fetch \
--source $SOURCE \
--target $TARGET \
Does this command need `--schema-filter 'migration_schema' \`?
~~~ shell
molt fetch \
--source $SOURCE \
--target $TARGET \
Does the schema filter need to be included here?
I'd assume so, if the previous data load also included `--schema-filter` here.
Reviewed the rest including Oracle, please let me know when it is ready for a final review 👍
GRANT SELECT ON V_$TRANSACTION TO C##MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO C##MIGRATION_USER;
To reduce verbosity here, we can also use the shorter:
GRANT SELECT, FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;
instead of:
GRANT SELECT ON migration_schema.tbl TO C##MIGRATION_USER;
GRANT FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;
Unless we think that separating out the privileges to be more explicit would be useful here.
GRANT SELECT ON DBA_TABLES TO MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO MIGRATION_USER;
Same comment as above for potentially shortening the privileges into one statement.
GRANT SELECT ON DBA_TABLES TO C##MIGRATION_USER;
~~~

Connect to the Oracle PDB as a DBA and grant the following:
Should we emphasize the fact that we need to reconnect to the PDB and not the CDB here?
--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'
~~~

In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username in both `--source` and `--source-cdb` is the common user that owns the tables you will migrate.
Maybe something similar to:
In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username in both `--source` and `--source-cdb` is the common user that has the appropriate privileges granted to access the tables in the schema to migrate, as well as the appropriate LogMiner privileges. |
Wait for any remaining sessions to show an `INACTIVE` status, then terminate them using:

~~~ sql
ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;
Should we somehow signal that `sid` and `serial#` are placeholders for the results of the query above that finds sessions?
~~~

{{site.data.alerts.callout_info}}
With Oracle Multitenant deployments, `--source-cdb` is **not** necessary for `verify`.
`--source-cdb` is also not necessary for failback. Wondering if we should mention that here as well.
Reviewed MOLT Fetch and Migration Overview (new Migration Flows section). Some nits. Great job @taroface, very comprehensive!
~~~

In the `molt fetch` command, specify a GTID set using the [`--defaultGTIDSet` replication flag](#mysql-replication-flags) and the format `source_uuid:min(interval_start)-max(interval_end)`. For example:
- For a MySQL source, replication requires specifying a starting GTID set with the `--defaultGTIDSet` replication flag. After the initial data load completes, locate the [`cdc_cursor`](#cdc-cursor) value in the `fetch complete` log output and use it as the GTID set. For example:
Please clarify: the example has `--defaultGTIDSet b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21`, while the `cdc_cursor` link points to a message that looks like this:
{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}
So the `cdc_cursor` value of `"0/3F41E40"` is not in the same format as the example. Would it be correct to use the `cdc_cursor` value?
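To make the mismatch concrete, here is a minimal shell sketch that pulls `cdc_cursor` out of the log line quoted above and applies a rough format check. The `gtid-like` and `lsn-like` labels and the colon/slash heuristic are assumptions made up for illustration; they are not MOLT behavior, and the sketch uses `sed` only so that it has no extra dependencies.

```shell
# Sample "fetch complete" log line, copied from the message quoted above.
log_line='{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}'

# Extract the value of the cdc_cursor field.
cdc_cursor=$(printf '%s\n' "$log_line" | sed -n 's/.*"cdc_cursor":"\([^"]*\)".*/\1/p')

# Hypothetical heuristic: a GTID set looks like source_uuid:start-end (has a colon),
# while the value in this log line has a slash and no colon.
case "$cdc_cursor" in
  *:*) kind="gtid-like" ;;
  */*) kind="lsn-like" ;;
  *)   kind="unknown" ;;
esac

echo "cdc_cursor=$cdc_cursor kind=$kind"
```

Under this heuristic the linked sample value would not pass as a GTID set, which is why the docs may need a MySQL-specific `fetch complete` example next to the `--defaultGTIDSet` instructions.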
Co-authored-by: Florence Morris
DOC-14130
DOC-14259
DOC-13941
Please review:
The above pages use content from the `_includes` folder. Please review those documents as well.
Note that the PG and MySQL flows were updated as well. This is technically outside the scope of this PR, but I'd like to request a review of the SQL user creation steps for each dialect and for CRDB, as this hasn't previously been documented and I was making an educated guess in some places.