mateuszkuprowski
Contributor

This PR is @teddysupercuts's contribution from his fork.
The original PR can be found here.

Re-opening this PR from origin so that our tests work correctly.

Because Sharepoint inherits from Onedrive, we originally had user_pname
as a required arg, but it doesn't need to be.

It is only used when the user authenticates with username/password,
a form of auth that Microsoft does not recommend.

SharepointConnectionConfig inherits from OnedriveConnectionConfig,
which is where this arg is declared and used. With the
changes in this PR, we override the arg to be Optional.
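
For illustration, here is a minimal sketch of the override pattern, using plain dataclasses rather than the project's actual config classes. The class names `OnedriveLikeConfig` and `SharepointLikeConfig` and the fields `client_id` and `tenant` are hypothetical stand-ins; only `user_pname` comes from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OnedriveLikeConfig:
    # Hypothetical stand-in for the parent config: user_pname is required here.
    client_id: str
    tenant: str
    user_pname: str


@dataclass
class SharepointLikeConfig(OnedriveLikeConfig):
    # Redeclaring the inherited field relaxes it to Optional with a default,
    # so callers that do not use username/password auth can omit it.
    user_pname: Optional[str] = None


parent_ok = OnedriveLikeConfig(client_id="id", tenant="t", user_pname="user@example.com")
child_ok = SharepointLikeConfig(client_id="id", tenant="t")  # no user_pname needed
```

The field keeps its original position in the generated `__init__`, so this only works cleanly when no required field follows it; the real classes may need a different approach.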


Also updated the vertexai model which was deprecated.
pawel-kmiecik and others added 7 commits May 14, 2025 15:47
According to the Google API documentation, `webContentLink` and
`exportLink` are intended to be used in browsers, not by scripts.
This leads to situations where, for example, `webContentLink` redirects to the
Google auth login page, which then gets downloaded and sent to partition.

Instead, we should use the `googleclient`'s methods, which [call
the appropriate Google Drive APIs to perform download/export
operations](https://developers.google.com/workspace/drive/api/guides/manage-downloads#python):
- `get_media` to download standalone files
- `export` to export Google Workspace native files (Google Docs, Google
Slides, Google Sheets) to the corresponding office files (docx, pptx, xlsx,
respectively)
- `download` to export Google Workspace native files whose exported
size exceeds 10MB
  - this operation uses the LRO (Long Running Operation) mechanism described
[here](https://developers.google.com/workspace/drive/api/guides/long-running-operations)
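
The selection logic in the list above can be sketched roughly as follows. This is not the code from the commits: `choose_download_method` and the `export_too_large` flag are hypothetical, and how the connector detects an oversized export is an assumption left to the caller. The function only names the operation to use and does not call the Drive API.

```python
from typing import Optional, Tuple

GOOGLE_APPS_PREFIX = "application/vnd.google-apps."

# Google Workspace native MIME types and the office formats they export to.
EXPORT_MIME_TYPES = {
    GOOGLE_APPS_PREFIX + "document":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",  # docx
    GOOGLE_APPS_PREFIX + "presentation":
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",  # pptx
    GOOGLE_APPS_PREFIX + "spreadsheet":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",  # xlsx
}


def choose_download_method(
    mime_type: str, export_too_large: bool = False
) -> Tuple[str, Optional[str]]:
    """Return (operation name, export MIME type or None) for a Drive file."""
    export_mime = EXPORT_MIME_TYPES.get(mime_type)
    if export_mime is not None:
        # Native Workspace file: regular export, or the LRO-based download
        # when the caller knows the export would exceed the size limit.
        return ("download" if export_too_large else "export", export_mime)
    if mime_type.startswith(GOOGLE_APPS_PREFIX):
        # Other native types (folders, forms, ...) are not handled in this sketch.
        raise ValueError(f"No export mapping for {mime_type}")
    # Standalone binary file: fetch the content directly.
    return ("get_media", None)
```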
Co-authored-by: Filip Knefel <[email protected]>
- cap redis client version to avoid breaking the uploader plugin
- fix azure and s3 e2e test scripts so that they no longer report errors
Fix error in the Google Drive E2E test:
"Error in downloader: 512: [ModuleNotFoundError] No module named
'tenacity'"

---------

Co-authored-by: Paweł Kmiecik <[email protected]>
pawel-kmiecik and others added 6 commits June 2, 2025 11:05
Improve `precheck` method of Confluence's Indexer.

Validate that each space provided in the configuration can be accessed, and raise an exception if at least one of them can't.
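
A minimal sketch of that validation, assuming a caller-supplied `can_access` callable in place of the real Confluence client. `precheck_spaces`, `SpaceAccessError`, and `can_access` are hypothetical names, not the Indexer's actual API.

```python
from typing import Callable, Iterable, List


class SpaceAccessError(Exception):
    """Raised when at least one configured space cannot be accessed."""


def precheck_spaces(spaces: Iterable[str], can_access: Callable[[str], bool]) -> None:
    # Check every space so the error lists all failures, not just the first one.
    inaccessible: List[str] = [space for space in spaces if not can_access(space)]
    if inaccessible:
        raise SpaceAccessError(
            "Cannot access Confluence spaces: " + ", ".join(inaccessible)
        )
```

Collecting all inaccessible spaces before raising lets a user fix the whole configuration in one pass instead of rerunning the precheck after each failure.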

---------

Co-authored-by: Rob Roskam <[email protected]>
Co-authored-by: Filip Knefel <[email protected]>
@mateuszkuprowski
Contributor Author

Closing this PR and opening another one: something did not merge correctly after the rebase, and it was faster this way.
