MOLT Oracle documentation; refactor MOLT tutorials #19918

Status: Open. 9 commits to merge into base branch `main`.

@taroface (Contributor) commented Jul 11, 2025

DOC-14130
DOC-14259
DOC-13941

  • Document Oracle support for MOLT (in public preview)
  • Refactor MOLT tutorials to center on user journeys
    • Make guidance clearer and more explicit, including PKs requirement
  • Document Azure Blob Storage support

Please review:

| Page preview | Review link |
| --- | --- |
| Bulk Load | https://github.com/cockroachdb/docs/pull/19918/files#diff-8b869a1172f94c4b14010f730ab39d51cdeee342497ee1aeb0ed8605c72e5e60 |
| Load and Replicate | https://github.com/cockroachdb/docs/pull/19918/files#diff-e33d9d713e7b66e47c4c9e6b1c4d696760ba4a6a56c91388371fcfb07d1b8702 |
| Load and Replicate Separately | https://github.com/cockroachdb/docs/pull/19918/files#diff-240a82fe78b31e2105a4a0882479b665d1c36a0553bbfea56dc7dba7cad1aba4 |
| Resume Replication | https://github.com/cockroachdb/docs/pull/19918/files#diff-7e788b5f71dd9e3d14bce14858a23ba0cf95efa88299496559d89ba54cab7dfb |
| Failback | https://github.com/cockroachdb/docs/pull/19918/files#diff-15e68aea37ca1e37ede383feed074bf6973b939daadb0357f7997b97f919e9ac |
| MOLT Fetch | https://github.com/cockroachdb/docs/pull/19918/files#diff-d33cc35311a672af48897ed330cc8692163816714829a86f5b667a7ebbda15f3 |
| Migration Overview (new Migration Flows section) | https://github.com/cockroachdb/docs/pull/19918/files#diff-39a1ebe415b90ca952a0d16cec1c23b6ec13ecce34d780cb27505e82c2a7e06b |

The above pages use content from the _includes folder. Please review those documents as well.

Note that the PG and MySQL flows were updated as well. This is technically outside the scope of this PR, but I'd like to request a review of the SQL user creation steps for each dialect and for CRDB, as this hasn't previously been documented and I was making an educated guess in some places.

netlify bot commented Jul 11, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit e52ffda
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/687804d15c587e00089c460b

netlify bot commented Jul 11, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit e52ffda
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-api-docs/deploys/687804d1dfde040008f91b39

github-actions bot commented Jul 11, 2025

Files changed:

netlify bot commented Jul 11, 2025

Deploy Preview for cockroachdb-docs failed.

Name Link
🔨 Latest commit 21fb832
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/6870a0d7ada63e000859a849

netlify bot commented Jul 11, 2025

Netlify Preview

Name Link
🔨 Latest commit e52ffda
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/687804d1d5571e0008d37688
😎 Deploy Preview https://deploy-preview-19918--cockroachdb-docs.netlify.app

@Jeremyyang920

[image] I saw this on the preview, but it seems like a Markdown reference isn't resolved properly.

@ryanluu12345 ryanluu12345 left a comment

Excellent work @taroface and @noelcrl. I can give this a more thorough look Monday, but I left some high-level comments for now.

@@ -0,0 +1,22 @@
## Limitations

- Migrations must be performed from a single Oracle schema. You **must** include `--schema-filter` so that MOLT Fetch only loads data from the specified schema. Refer to [Schema and table filtering](#schema-and-table-filtering).

Not something for now, but we should also parse through https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html to understand what other limitations we would have too (since we both use LogMiner under the hood). FYI @taroface and @noelcrl

Contributor Author

@noelcrl In case this needs additional docs work, do you mind filing a new issue for me?

For sure, I'll file a new issue if necessary here

@ryanluu12345 ryanluu12345 left a comment

Reviewed the rest just now. Looks good, just nits and clarifications. Agreed with Jeremy's comments. I can review again after those are all addressed. I also want to ensure the rest of the team can take a look here. @taroface it would be helpful if you could also link the exact docs preview sections so folks can see how they present on the website.

@taroface
Contributor Author

Addressed all comments. PTAL!

The docs preview links are in the PR description.

@ryanluu12345 ryanluu12345 left a comment

Changes lgtm!

@Jeremyyang920 Jeremyyang920 left a comment

LGTM! Will let Noel sign off on the Oracle stuff

Contributor

@florence-crl florence-crl left a comment

Suggestions for Bulk Load. I will give my suggestions per page.

Contributor

@florence-crl florence-crl left a comment

Suggestions for Load and Replicate.

@noelcrl noelcrl left a comment

Great work so far. I will take another look at the Oracle-specific portion next, along with the rest of the changes.


{% include_cached copy-clipboard.html %}
~~~ sql
GRANT admin TO crdb_user;

@noelcrl noelcrl Jul 16, 2025

We should note that if the user doesn't want superuser given to crdb_user, we can also grant privileges that are less-permissive and more explicit. Here are some privilege examples:

-- Fetch privileges

-- 1. Grant database-level privileges (connect, create schema, create temp tables) for schema creation within the database to migrate to
GRANT ALL ON DATABASE defaultdb TO crdb_user;

-- 2. Grant user privileges to create tables like _molt_fetch_exceptions in the public schema, make sure we are connected to the database to migrate to
GRANT CREATE ON SCHEMA public TO crdb_user;

-- Optional privileges if the schema to migrate to already exists and wasn't created by crdb_user (drop-on-target-and-recreate wasn't used):

-- 3. (Optional) if the schema to migrate to already exists and wasn't created by crdb_user, grant privileges on that schema

GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "migration_schema" TO crdb_user;

ALTER DEFAULT PRIVILEGES IN SCHEMA "migration_schema"
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;

-- 4. (Optional) as well as public for the molt exception tables:

GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO crdb_user;

ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;

-- Replication mode privileges afterwards

-- 1. Allow staging database creation for replication
ALTER USER crdb_user CREATEDB;
--

Also updated the privileges instructions for the target user to accommodate these more explicit and less-permissive permissions.

Contributor

@florence-crl florence-crl left a comment

Reviewed Resume Replication and Failback.

A few questions.

~~~ shell
molt fetch \
--source $SOURCE \
--target $TARGET \
Contributor

Does this command need:

--schema-filter 'migration_schema' \

?

~~~ shell
molt fetch \
--source $SOURCE \
--target $TARGET \
Contributor

Does the schema filter need to be included here?

I'd assume so if the previous data load also included --schema-filter here
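
One way to guard against the omission discussed in this thread is a pre-flight check on the command arguments. This is a minimal sketch, assuming a hypothetical helper named `has_schema_filter` (not part of MOLT) and the `migration_schema` name used elsewhere in this PR. It only inspects the arguments and does not invoke `molt`:

```shell
# Hypothetical pre-flight helper (not part of MOLT):
# succeeds only if the argument list includes --schema-filter.
has_schema_filter() {
  for arg in "$@"; do
    case "$arg" in
      --schema-filter|--schema-filter=*) return 0 ;;
    esac
  done
  return 1
}

# Arguments mirror the command under review; '$SOURCE' and '$TARGET'
# are kept as literal placeholders and are not expanded here.
set -- fetch --source '$SOURCE' --target '$TARGET' --schema-filter 'migration_schema'

if has_schema_filter "$@"; then
  echo "ok: --schema-filter present"
else
  echo "warning: add --schema-filter to match the initial data load" >&2
fi
```

Whether the flag is actually required for this command is still the open question in this thread; the check only makes the choice explicit.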

@noelcrl noelcrl left a comment

Reviewed the rest including Oracle, please let me know when it is ready for a final review 👍

GRANT SELECT ON V_$TRANSACTION TO C##MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO C##MIGRATION_USER;

To reduce verbosity here, we can also use the shorter:

GRANT SELECT, FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;

instead of:

GRANT SELECT ON migration_schema.tbl TO C##MIGRATION_USER;
GRANT FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;

Unless we think that separating out the privileges to be more explicit would be useful here.
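
Because the grant has to be repeated for every table, the combined form could also be generated rather than typed by hand. This is a minimal sketch, assuming a hypothetical table list (`orders`, `customers`) and reusing the `migration_schema` and `C##MIGRATION_USER` names from the snippet under review. It only prints the statements so that a DBA can review them before running them:

```shell
# Print one combined GRANT per table instead of two separate statements.
# Arguments: schema, grantee, then one or more table names.
grant_statements() {
  schema="$1"
  grantee="$2"
  shift 2
  for tbl in "$@"; do
    printf 'GRANT SELECT, FLASHBACK ON %s.%s TO %s;\n' "$schema" "$tbl" "$grantee"
  done
}

# The table names below are hypothetical placeholders.
grant_statements migration_schema 'C##MIGRATION_USER' orders customers
```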

GRANT SELECT ON DBA_TABLES TO MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO MIGRATION_USER;

Same comment as above for potentially shortening the privileges into one statement.

GRANT SELECT ON DBA_TABLES TO C##MIGRATION_USER;
~~~

Connect to the Oracle PDB as a DBA and grant the following:

Should we emphasize the fact that we need to reconnect to the PDB and not the CDB here?

--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'
~~~

In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username in both `--source` and `--source-cdb` is the common user that owns the tables you will migrate.

Maybe something similar to:

Suggested change:
- In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username in both `--source` and `--source-cdb` is the common user that owns the tables you will migrate.
+ In [Oracle Multitenant](https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt/CDBs-and-PDBs.html), `--source` specifies the connection string for the PDB. `--source-cdb` specifies the connection string for the CDB. The username in both `--source` and `--source-cdb` is the common user that has the appropriate privileges granted to access the tables in the schema to migrate, as well as the appropriate LogMiner privileges.
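
To make the PDB/CDB distinction concrete, the two connection strings could be built side by side. This is a minimal sketch following the `oracle://{username}:{password}@{host}:{port}/{service_name}` format quoted above. The host, port, password, and service names are hypothetical placeholders, and both the percent-encoding of `C##` and the shared host and port are assumptions:

```shell
# All values below are hypothetical placeholders; substitute your own.
# '#' is the URI fragment delimiter, so the C## prefix is written
# percent-encoded (%23%23). How MOLT parses this is an assumption.
ORA_USER='C%23%23MIGRATION_USER'
ORA_PASS='password'
ORA_HOST='oracle.example.com'
ORA_PORT='1521'
PDB_SERVICE='pdb_service'
CDB_SERVICE='cdb_service'

# Same common user in both strings; only the service name differs
# (assuming the PDB and CDB are reachable through the same listener).
SOURCE="oracle://${ORA_USER}:${ORA_PASS}@${ORA_HOST}:${ORA_PORT}/${PDB_SERVICE}"
SOURCE_CDB="oracle://${ORA_USER}:${ORA_PASS}@${ORA_HOST}:${ORA_PORT}/${CDB_SERVICE}"

echo "--source     $SOURCE"
echo "--source-cdb $SOURCE_CDB"
```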

Wait for any remaining sessions to show an `INACTIVE` status, then terminate them using:

~~~ sql
ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;

Should we signify here that `sid` and `serial#` are placeholders for the results of the query above that finds sessions?
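
One way to show that these are placeholders is to fill them in from concrete values. This is a minimal sketch, assuming a hypothetical helper named `kill_session_sql` and made-up session values (`123`, `45678`); the real values come from the session query referenced above. It only prints the statement:

```shell
# Build the KILL SESSION statement from a session's SID and SERIAL# values.
kill_session_sql() {
  sid="$1"
  serial="$2"
  printf "ALTER SYSTEM KILL SESSION '%s,%s' IMMEDIATE;\n" "$sid" "$serial"
}

# 123 and 45678 are hypothetical; use the values returned by the session query.
kill_session_sql 123 45678
```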

~~~

{{site.data.alerts.callout_info}}
With Oracle Multitenant deployments, `--source-cdb` is **not** necessary for `verify`.

`--source-cdb` is also not necessary for failback. Should we mention that here as well?

Contributor

@florence-crl florence-crl left a comment

Reviewed MOLT Fetch and Migration Overview (new Migration Flows section).

Some nits. Great job @taroface, very comprehensive!

~~~

In the `molt fetch` command, specify a GTID set using the [`--defaultGTIDSet` replication flag](#mysql-replication-flags) and the format `source_uuid:min(interval_start)-max(interval_end)`. For example:
- For a MySQL source, replication requires specifying a starting GTID set with the `--defaultGTIDSet` replication flag. After the initial data load completes, locate the [`cdc_cursor`](#cdc-cursor) value in the `fetch complete` log output and use it as the GTID set. For example:
Contributor

Please clarify: the example has

--defaultGTIDSet b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21"

while the cdc_cursor links to a message that looks like this:

{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}

So the cdc_cursor value of "0/3F41E40" is not in the same format as the example. Would it be correct to use the cdc_cursor value?
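
The format mismatch can be checked mechanically. This is a minimal sketch that extracts `cdc_cursor` from the sample log line quoted above and classifies it by shape. The helper name `cursor_kind` and the two patterns are assumptions made for illustration, not MOLT behavior:

```shell
# Sample "fetch complete" line quoted in the review comment above.
log_line='{"level":"info","type":"summary","fetch_id":"735a4fe0-c478-4de7-a342-cfa9738783dc","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":4879.890041,"net_duration":"000h 00m 04s","time":"2024-03-18T12:37:02-04:00","message":"fetch complete"}'

# Extract the cdc_cursor value with sed (no JSON parser required).
cdc_cursor=$(printf '%s\n' "$log_line" | sed -n 's/.*"cdc_cursor":"\([^"]*\)".*/\1/p')

# Classify a cursor by shape: uuid:start-end looks like a GTID set,
# hex/hex looks like a PostgreSQL log sequence number (LSN).
cursor_kind() {
  if printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[0-9]+(-[0-9]+)?$'; then
    echo gtid
  elif printf '%s\n' "$1" | grep -Eq '^[0-9A-Fa-f]+/[0-9A-Fa-f]+$'; then
    echo lsn
  else
    echo unknown
  fi
}

echo "$cdc_cursor -> $(cursor_kind "$cdc_cursor")"
echo "b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21 -> $(cursor_kind 'b7f9e0fa-2753-1e1f-5d9b-2402ac810003:3-21')"
```

The sample value has the shape of a PostgreSQL LSN rather than a GTID set, which is consistent with the discrepancy raised here. It suggests the linked log example may come from a PostgreSQL source.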
