Ingest: Add page about ingestion methods #230
Walkthrough
A new "Ingestion" section was added to the documentation index, introducing a comprehensive overview page for CrateDB data ingestion methods. The "Monitoring and Metrics" section was renamed to "Metrics and Telemetry," with corresponding updates to headers and reference labels in the telemetry documentation.
Sequence Diagram(s)
sequenceDiagram
participant User
participant DocsIndex
participant IngestionOverview
participant TelemetryDocs
User->>DocsIndex: Access documentation index
DocsIndex-->>User: Show updated index with "Ingestion" and "Metrics and Telemetry"
User->>IngestionOverview: Navigate to Ingestion section
IngestionOverview-->>User: Display categorized ingestion methods
User->>TelemetryDocs: Navigate to Metrics and Telemetry
TelemetryDocs-->>User: Show updated telemetry documentation
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~6 minutes
Actionable comments posted: 1
🧹 Nitpick comments (4)
docs/integrate/telemetry/index.md (1)
2-8: Update description to match widened telemetry scope.
Header and label now mention “telemetry”, yet Line 6 still limits the description to metrics.
Consider widening the wording (e.g. “metrics, logs, traces”) so the short intro aligns with the broader purpose.
docs/index.md (1)
310-311: Add a visible card for the new Ingestion section.
The hidden toctree entry surfaces the page in the side-nav but the front-page grids still lack an “Ingestion” card, unlike the other top-level sections (ETL, Metrics, BI, …).
Adding a grid-item card promotes discoverability and keeps the UX consistent.
docs/ingest/index.md (2)
25-29: Tighten wording and avoid filler phrase.
“A variety of options to connect and integrate with 3rd-party …” is verbose.
A leaner variant reads better:
- Use a variety of options to connect and integrate with 3rd-party
- change-data-capture (CDC) applications.
+ Integrate with 3rd-party change-data-capture (CDC) tools.
41-43: Fix typo “Suported” → “Supported”.
-+++
-Suported data source types when importing data into CrateDB.
+++
+Supported data source types when importing data into CrateDB.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- docs/_include/links.md (0 hunks)
- docs/index.md (1 hunks)
- docs/ingest/file.md (1 hunks)
- docs/ingest/index.md (1 hunks)
- docs/integrate/index.md (1 hunks)
- docs/integrate/telemetry/index.md (1 hunks)
💤 Files with no reviewable changes (1)
- docs/_include/links.md
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.
🪛 LanguageTool
docs/ingest/index.md
[style] ~27-~27: Consider using a more concise synonym.
Context: ...e ETL solutions. - {ref}cdc Use a variety of options to connect and integrate with 3...
(A_VARIETY_OF)
[grammar] ~41-~41: Ensure spelling is correct
Context: ... tables like regular user tables. +++ Suported data source types when importing data i...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🔇 Additional comments (3)
docs/integrate/index.md (1)
20-21: Confirm link-target consistency after renaming.
You switched from `metrics/index` to `telemetry/index`, which is fine, but the target page still exports both `(metrics)=` and `(telemetry)=` anchors.
Please double-check that every existing cross-reference (`{ref}` and `:link:`) now points to the new `(telemetry)` label; otherwise old links may silently keep using the deprecated `(metrics)` anchor and defeat the rename intention.
If the old alias is meant to stay, add an explicit comment so future maintainers know it is intentional.
docs/ingest/file.md (1)
1-4: YAML front-matter renders only with MyST-Parser ≥ 1.0.
If the build still uses an older MyST version, the `--- orphan: true ---` block will be treated as literal Markdown, not front-matter.
Verify the extension level or switch to the directive form:
```{meta}
orphan: true
```
docs/ingest/index.md (1)
30-40: Verify that `fdw` reference label exists.
The bullet links to {ref}`fdw`, but no such label is introduced in this PR. If the target page is still pending, add `orphan: true` to suppress build warnings or create a stub page with the corresponding label.
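The rename check requested for docs/integrate/index.md (20-21) above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: `find_label_uses` is a hypothetical helper, and it assumes cross-references appear only as MyST `{ref}` roles or as sphinx-design `:link:` options on a line of their own.

```python
import re


def find_label_uses(text: str, label: str) -> list[int]:
    """Return 1-based line numbers in `text` that reference `label`.

    Hypothetical helper for illustration. It recognizes two forms:
    the role {ref}`label` (optionally {ref}`Title <label>`) and a
    sphinx-design option line such as `:link: label`.
    """
    esc = re.escape(label)
    # {ref}`label` or {ref}`Some title <label>`
    ref_role = re.compile(r"\{ref\}`(?:[^`<]*<)?" + esc + r">?`")
    # A card/button option line that carries only the label
    link_opt = re.compile(r"^\s*:link:\s*" + esc + r"\s*$")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if ref_role.search(line) or link_opt.match(line)
    ]


if __name__ == "__main__":
    page = "- {ref}`metrics`\n:link: telemetry\n"
    # Lines still pointing at the deprecated anchor
    print(find_label_uses(page, "metrics"))
```

Running it over each Markdown file with the old label name lists the lines that would still need to move to `(telemetry)`.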
7257275 to 86f61d9
Actionable comments posted: 1
🧹 Nitpick comments (5)
docs/ingest/index.md (5)
7-7: Prefer “ingestion methods” (noun) over “ingest methods”
Use the noun form to improve clarity.
-All data ingest methods for CrateDB at a glance.
+All data ingestion methods for CrateDB at a glance.
27-29: Tighten wording and avoid “3rd-party”
Use “third-party” and simplify phrasing.
- Use a variety of options to connect and integrate with 3rd-party
- change-data-capture (CDC) applications.
+ Integrate with third-party change-data-capture (CDC) tools.
32-34: Improve phrasing of telemetry sentence
Remove awkward hyphenation and clarify subjects.
- Ingest telemetry data, i.e. metrics, logs, and traces originating from
- monitoring- or sensor collector systems.
+ Ingest telemetry data—metrics, logs, and traces—from monitoring systems and sensor collectors.
41-43: Fix typo and placement of footer text
“Suported” → “Supported”. If you keep a single-card design, “+++ …” is the card footer. If you adopt multiple cards (recommended), move this sentence below the grid as a normal paragraph.
Option A (single card, minimal fix):
-+++
-Suported data source types when importing data into CrateDB.
+++
+Supported data source types when importing data into CrateDB.
Option B (after refactor to multiple cards): remove these lines, and add this paragraph after the grid:
Supported data source types when importing data into CrateDB.
14-14: Confirm “material-outlined” icon role support
I’ve verified that:
- ✅ `docs/_include/styles.html` is present
- 🔍 No other `{material-outlined}` roles are used elsewhere in `docs/**/*.md`
Before merging, please ensure your CrateDB theme (crate-docs-theme) registers the `material-outlined` icon role so builds won’t warn or fail. If it isn’t supported, either switch to an icon set known to be enabled (e.g. an Octicon) or remove the icon token:
- :{material-outlined}`lightbulb;2em` Loading data from external sources
+ :octicon:`lightbulb;2em` Loading data from external sources
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- docs/index.md (1 hunks)
- docs/ingest/file.md (1 hunks)
- docs/ingest/index.md (1 hunks)
- docs/integrate/index.md (1 hunks)
- docs/integrate/telemetry/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/index.md
🚧 Files skipped from review as they are similar to previous changes (3)
- docs/integrate/telemetry/index.md
- docs/integrate/index.md
- docs/ingest/file.md
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-08T16:50:14.965Z
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.
Applied to files:
docs/ingest/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.
Applied to files:
docs/ingest/index.md
📚 Learning: 2025-08-07T23:11:09.657Z
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.
Applied to files:
docs/ingest/index.md
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Applied to files:
docs/ingest/index.md
🪛 LanguageTool
docs/ingest/index.md
[style] ~27-~27: Consider using a more concise synonym.
Context: ...e ETL solutions. - {ref}cdc Use a variety of options to connect and integrate with 3...
(A_VARIETY_OF)
[grammar] ~41-~41: Ensure spelling is correct
Context: ... tables like regular user tables. +++ Suported data source types when importing data i...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
🔇 Additional comments (2)
docs/ingest/index.md (2)
1-6: Section scaffold looks correct
Label, H1, and shared styles include are fine.
16-37: All referenced labels are defined and will resolve correctly
I verified that each label used in `docs/ingest/index.md` is defined locally:
- `(ingest-file)=` in docs/ingest/file.md
- `(etl)=` in docs/integrate/etl/index.md
- `(cdc)=` in docs/integrate/cdc/index.md
- `(telemetry)=` in docs/integrate/telemetry/index.md
- `(fdw)=` in docs/feature/fdw/index.md

No further changes are needed.
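A label check like the one described above can be approximated without running a full Sphinx build. The sketch below is an illustration under stated assumptions: `unresolved_refs` is a hypothetical function, labels are assumed to be declared as `(name)=` on their own line, and intersphinx references such as `ctk:...` are skipped on purpose because they resolve outside the repository.

```python
import re

# A MyST target such as "(etl)=" on a line of its own
LABEL_DEF = re.compile(r"^\(([A-Za-z0-9_-]+)\)=[ \t]*$", re.MULTILINE)
# {ref}`label` or {ref}`Title <label>`; names containing ":" never match,
# so intersphinx references are ignored.
REF_USE = re.compile(r"\{ref\}`(?:[^`<]*<)?([A-Za-z0-9_-]+)>?`")


def unresolved_refs(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page path to the local labels it references but nobody defines.

    `pages` maps a file path to its Markdown source. Hypothetical helper.
    """
    defined = {
        label for text in pages.values() for label in LABEL_DEF.findall(text)
    }
    missing: dict[str, list[str]] = {}
    for path, text in pages.items():
        unknown = sorted(set(REF_USE.findall(text)) - defined)
        if unknown:
            missing[path] = unknown
    return missing


if __name__ == "__main__":
    pages = {
        "docs/ingest/index.md": "(ingest)=\n- {ref}`ingest-file`\n- {ref}`fdw`\n",
        "docs/ingest/file.md": "(ingest-file)=\n# Files\n",
    }
    print(unresolved_refs(pages))
```

An empty result corresponds to the "all labels resolve" outcome reported in the comment; the Sphinx build itself remains the authoritative check.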
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/ingest/index.md (1)
49-67: Optional UX: Consider one card per link for scannability
Splitting this card into individual cards (Files, ETL, CDC, Telemetry, FDW) makes the grid responsive and easier to scan. This mirrors the earlier suggestion on this PR, now adapted to your current structure.
If you prefer the grouped approach, keep as-is; just ensure each bullet remains a direct {ref} to canonical pages.
🧹 Nitpick comments (3)
docs/ingest/index.md (3)
22-22: Clarify JSON format terminology
Use explicit wording to avoid confusion between JSON and JSON Lines.
- **Formats:** CSV, JSONL
+ **Formats:** CSV, JSON (JSON Lines)
57-58: Tighten wording and fix style in CDC section
Concise phrasing and consistent terminology.
- Use a variety of options to connect and integrate with 3rd-party
- change-data-capture (CDC) applications.
+ Use several options to integrate with third-party
+ change data capture (CDC) tools.
62-63: Improve grammar and readability in telemetry description
Replace “i.e.” and remove awkward hyphenation.
- Ingest telemetry data, i.e. metrics, logs, or traces originating from
- monitoring- or sensor-data collector systems.
+ Ingest telemetry data, such as metrics, logs, and traces from
+ monitoring systems and sensor collectors.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- docs/index.md (1 hunks)
- docs/ingest/index.md (1 hunks)
- docs/integrate/telemetry/index.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/index.md
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/integrate/telemetry/index.md
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#234
File: docs/home/index.md:47-50
Timestamp: 2025-08-08T16:50:14.965Z
Learning: In the CrateDB Guide docs (MyST), the correct intersphinx target for the CrateDB Cloud documentation homepage is `cloud:index` (not `cloud:docs-index` or `cloud-docs-index`). Use `:link: cloud:index` on cards/links. The `cloud` mapping is inherited via crate-docs-theme.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:6-8
Timestamp: 2025-05-18T13:25:11.786Z
Learning: In the CrateDB Guide documentation, particularly for MCP-related pages, the author prefers a 1-column grid layout for multiple cards to stack them vertically rather than side by side, as this is an intentional design choice.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:11-11
Timestamp: 2025-08-07T23:11:08.311Z
Learning: In the CrateDB Guide repository, intersphinx mappings like "cloud" are defined within the root project `crate-docs-theme` and inherited by the documentation projects, so they don't need to be explicitly defined in individual `docs/conf.py` files.
Learnt from: amotl
PR: crate/cratedb-guide#232
File: docs/_include/links.md:1-1
Timestamp: 2025-08-07T23:11:09.657Z
Learning: In the CrateDB Guide repository, intersphinx mappings like `crate-admin-ui` are inherited from the root project `crate-docs-theme` rather than being defined locally in `docs/conf.py`. This allows shared documentation configurations across multiple CrateDB documentation projects.
🪛 LanguageTool
docs/ingest/index.md
[style] ~57-~57: Consider using a more concise synonym.
Context: ...ete ETL solutions. - {ref}cdc Use a variety of options to connect and integrate with 3...
(A_VARIETY_OF)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
🔇 Additional comments (2)
docs/ingest/index.md (2)
1-70: Strong addition; coherent structure and accurate categorization
Overall, this page is a solid overview with clear groupings, correct use of sphinx-design grids, and helpful protocol/format hints. Address the minor link and phrasing tweaks above and it’s ready.
20-20: Confirmed FTP support for native `COPY FROM`
Per CrateDB’s documentation, FTP is natively supported alongside HTTP(S), Amazon S3, and Azure Blob Storage in the SQL `COPY FROM` statement. The existing line in docs/ingest/index.md (line 20) is accurate; no update required.
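To make the `COPY FROM` remark concrete, here is a small sketch that assembles such a statement for the source types named in the comment. It is an illustration only: `copy_from_sql` is a hypothetical helper, the `s3` and `az` scheme spellings and the `format` option values are assumptions to verify against the CrateDB reference, and the table name is passed through unquoted.

```python
from urllib.parse import urlparse

# Source types named in the review comment; the exact scheme spellings
# for Amazon S3 ("s3") and Azure Blob Storage ("az") are assumptions.
REMOTE_SCHEMES = {"http", "https", "ftp", "s3", "az"}
# Assumed values for the `format` option of COPY FROM.
FORMATS = {"csv", "json"}


def copy_from_sql(table: str, uri: str, fmt: str = "csv") -> str:
    """Build a COPY FROM statement string for a remote source URI.

    Hypothetical helper: it only assembles text and does not talk to a
    database. The caller is responsible for a valid table identifier.
    """
    scheme = urlparse(uri).scheme
    if scheme not in REMOTE_SCHEMES:
        raise ValueError(f"unsupported URI scheme: {scheme!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Escape single quotes so the URI stays a valid SQL string literal
    quoted_uri = uri.replace("'", "''")
    return f"COPY {table} FROM '{quoted_uri}' WITH (format = '{fmt}') RETURN SUMMARY;"


if __name__ == "__main__":
    print(copy_from_sql("doc.readings", "ftp://example.org/export/readings.csv"))
```

The resulting string would be sent through any CrateDB client; `example.org` and `doc.readings` are placeholders.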
docs/ingest/index.md
Outdated
Data import methods provided by CrateDB Cloud.
::::

::::{grid-item-card} {material-outlined}`arrow_circle_up;2em` Load data from external sources
Just a nit: what do you think about this "naming things" detail, to better clarify what this group is all about?
- ::::{grid-item-card} {material-outlined}`arrow_circle_up;2em` Load data from external sources
+ ::::{grid-item-card} {material-outlined}`arrow_circle_up;2em` Load data using external systems
I've used my suggestion "Load data using external systems" now.
About
Making a start on recent requests for a page that bundles all data ingestion methods in a single spot.
Source: https://cratedb.gitbook.io/cratedb-docs/K3l1K4ZBSqj0AL16dbZi/getting-started/ingesting-data
Source: https://cratedb.gitbook.io/cratedb-docs/K3l1K4ZBSqj0AL16dbZi/drivers-and-integrations/data-sources
Preview: https://cratedb-guide--230.org.readthedocs.build/ingest/
Outlook
Future patches will provide curations to recommend canonical ingest methods, preferably of polyglot nature, possibly tailored/native to CrateDB.