Conversation

@puneetsharma21
Contributor

@puneetsharma21 puneetsharma21 commented Oct 13, 2025

Changes

  • Added the following CMake flags to the PyArrow build:
    • -DARROW_S3=ON → Enables S3 filesystem support for PyArrow.
    • -DARROW_SUBSTRAIT=ON → Enables Substrait support required by Feast
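One way to confirm that the two flags took effect in a built image is to probe PyArrow from Python. The sketch below is an assumption about how such a check might look and is not part of this PR; `has_feature` and `probe_pyarrow_build` are hypothetical helper names. It relies on PyArrow raising `ImportError` for `pyarrow.fs.S3FileSystem` and for `pyarrow.substrait` when the corresponding components were not compiled in.

```python
import importlib


def has_feature(module_name, attr=None):
    """Return True if module_name imports and, optionally, exposes attr."""
    try:
        module = importlib.import_module(module_name)
        if attr is not None:
            # Attribute access fails if the optional component is absent.
            getattr(module, attr)
    except (ImportError, AttributeError):
        return False
    return True


def probe_pyarrow_build():
    """Report which optional PyArrow components appear to be available."""
    return {
        # Expected to be present only when built with -DARROW_S3=ON.
        "s3": has_feature("pyarrow.fs", "S3FileSystem"),
        # Expected to be present only when built with -DARROW_SUBSTRAIT=ON.
        "substrait": has_feature("pyarrow.substrait"),
    }


if __name__ == "__main__":
    for name, available in probe_pyarrow_build().items():
        print(f"{name}: {'available' if available else 'missing'}")
```

Running this inside the image would print one line per component. If PyArrow is not installed at all, both entries report as missing.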

Summary by CodeRabbit

  • New Features
    • Enabled S3 filesystem support in PyArrow within the datascience CPU image, allowing direct read/write to S3 from Arrow-based workflows.
    • Enabled Substrait integration in PyArrow for improved interoperability with query plan standards.

@openshift-ci openshift-ci bot requested review from atheo89 and dibryant October 13, 2025 14:06
@openshift-ci openshift-ci bot added the size/xs label Oct 13, 2025
@github-actions github-actions bot added the review-requested label (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel) Oct 13, 2025
@coderabbitai
Contributor

coderabbitai bot commented Oct 13, 2025

Walkthrough

Adds two CMake options to the pyarrow build stage in the UBI9 Python 3.12 Jupyter datascience Dockerfile to enable Arrow S3 and Substrait support. No other flags changed.

Changes

| Cohort / File(s) | Change summary |
| --- | --- |
| Docker build config: jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu | In pyarrow build stage, added -DARROW_S3=ON and -DARROW_SUBSTRAIT=ON; existing -DARROW_WITH_LZ4=OFF and -DARROW_WITH_ZSTD=OFF unchanged. |
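The flag state described above can be checked mechanically against a cmake command line. This is a minimal sketch under stated assumptions: `parse_cmake_defines` and `mismatched_defines` are hypothetical helpers, and `EXAMPLE_COMMAND` is an illustrative invocation, not the actual line from the Dockerfile; only the four `-D` flags come from the walkthrough. Typed definitions of the form `-DNAME:TYPE=VALUE` are not handled.

```python
import shlex


def parse_cmake_defines(command):
    """Map each -DNAME=VALUE token in a cmake command line to {NAME: VALUE}."""
    defines = {}
    for token in shlex.split(command):
        if token.startswith("-D") and "=" in token:
            name, value = token[2:].split("=", 1)
            defines[name] = value
    return defines


def mismatched_defines(command, expected):
    """Return {NAME: actual_value_or_None} for flags that differ from expected."""
    defines = parse_cmake_defines(command)
    return {
        name: defines.get(name)
        for name, value in expected.items()
        if defines.get(name) != value
    }


# Hypothetical invocation; only the four -D flags come from the walkthrough.
EXAMPLE_COMMAND = (
    "cmake -DARROW_S3=ON -DARROW_SUBSTRAIT=ON "
    "-DARROW_WITH_LZ4=OFF -DARROW_WITH_ZSTD=OFF .."
)

EXPECTED_DEFINES = {
    "ARROW_S3": "ON",
    "ARROW_SUBSTRAIT": "ON",
    "ARROW_WITH_LZ4": "OFF",
    "ARROW_WITH_ZSTD": "OFF",
}

if __name__ == "__main__":
    print(mismatched_defines(EXAMPLE_COMMAND, EXPECTED_DEFINES))
```

An empty dictionary means every expected flag is present with the expected value.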

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The pull request description only provides a brief change list under "### Changes" and omits required template sections such as "## Description," "## How Has This Has Been Tested?", the self checklist, and the merge criteria checklist, making it incomplete against the repository’s PR template. | Please revise the pull request description to follow the repository’s template by adding a "## Description" section describing the changes in detail and a "## How Has This Has Been Tested?" section with testing steps and environment information. Also complete the self checklist items and include the merge criteria checklist as specified in the template. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title accurately reflects the addition of Arrow S3 and Substrait support flags to the PyArrow build stage and aligns with the main change in the pull request, using clear and specific phrasing that conveys the core update. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20c5fda and 809b7cb.

📒 Files selected for processing (1)
  • jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1513
File: runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu:104-108
Timestamp: 2025-09-05T10:07:53.476Z
Learning: jiridanek requested GitHub issue creation for Arrow codec configuration problem during PR #1513 review. Issue #2305 was created addressing disabled core Arrow codecs (LZ4, Zstd, Snappy) in s390x pyarrow build that prevents reading compressed Parquet/Arrow datasets. The issue includes comprehensive problem description covering data compatibility impact, detailed solution enabling codecs with BUNDLED dependencies, clear acceptance criteria for functionality verification, and proper context linking to PR #1513 review comment, assigned to jiridanek.
⏰ Context from checks skipped due to timeout of 90000ms (3)
You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms).
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request

@openshift-ci openshift-ci bot added size/xs and removed size/xs labels Oct 13, 2025
@openshift-ci
Contributor

openshift-ci bot commented Oct 13, 2025

@puneetsharma21: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/notebooks-py312-ubi9-e2e-tests | 809b7cb | link | true | /test notebooks-py312-ubi9-e2e-tests |
| ci/prow/notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror | 809b7cb | link | true | /test notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror |
| ci/prow/images | 809b7cb | link | true | /test images |

Member

@atheo89 atheo89 left a comment

/lgtm

@openshift-ci
Contributor

openshift-ci bot commented Oct 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: atheo89

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@atheo89 atheo89 merged commit 7d644ac into opendatahub-io:main Oct 13, 2025
13 of 20 checks passed
Labels

approved, lgtm, review-requested (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel), size/xs

2 participants