feat/Migration - GitHub Source to Connector V2 Structure #157
Open: unstructured-theron wants to merge 62 commits into main from DS-90-github-source-v2
Commits (62):
823dfc7 Update CHANGELOG with 0.0.23-dev0 (unstructured-theron)
32d3884 Upgrade version to 0.0.23-dev0 (unstructured-theron)
20fe615 Update expected outputs to support V2 (unstructured-theron)
25fc077 Add add_source_entry to GitHub V2 (unstructured-theron)
974ef3c Add GitHub Source V2 (unstructured-theron)
10ca416 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
9b07474 lint: updating to black pattern (unstructured-theron)
49aa703 lint: updating to flake8 pattern (unstructured-theron)
bd2f6a6 cicd: rename to access token (unstructured-theron)
204918e lint: updating to black pattern (unstructured-theron)
d5e26f1 Upgrade pygithub version to >= 2.4.0 (unstructured-theron)
e3022e8 Refactoring the precheck method (unstructured-theron)
5c282a0 lint: fix imports (unstructured-theron)
7681f92 lint: fix imports (unstructured-theron)
3855545 Update expected outputs (unstructured-theron)
1fd37d7 Reverting github.sh (unstructured-theron)
d9f0788 Reverting github.sh (unstructured-theron)
8799db8 Add exclude metadata to github.sh (unstructured-theron)
c14eff0 Rename access token (unstructured-theron)
48e74bf Updating the expected outputs (ignored the date fields) (unstructured-theron)
ce3f8a1 Update the expected outputs with permissions_data (unstructured-theron)
2d05ba1 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
01223fd GitHub: fixing commented issues (unstructured-theron)
d2711b3 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
c437139 github.sh: rename to --file-glob (unstructured-theron)
d38b8db github.sh: forcing raise exceptions (unstructured-theron)
2e81bcf Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
98ed489 Upgrading version to 0.0.26-dev5 (unstructured-theron)
e78ddc2 fix download file path (unstructured-theron)
aaf8ab6 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
1b5cdf4 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
f4c82b4 Improving description and logs (unstructured-theron)
10c9aea Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
c91531f fix lint (unstructured-theron)
423c206 fix lint (unstructured-theron)
a9ba1c5 fix lint (unstructured-theron)
0b67c65 Fixing doc methods (unstructured-theron)
902f69d Fix methods run and run_async (unstructured-theron)
85ba0d9 Ad method "is_async" (unstructured-theron)
618d89e Modify "additional_metadata" to add metadata just if the fields exist (unstructured-theron)
f35bc00 Set default value for Secret Field (unstructured-theron)
4c527b4 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
f52b7be fix black (unstructured-theron)
bce778f Change to use model_validator (unstructured-theron)
50b2db5 Change syntax code of with open file (unstructured-theron)
49ee54f Add recursive flag (unstructured-theron)
4b0f9a1 fix lint (unstructured-theron)
c8e7908 Change syntax code of with open file (unstructured-theron)
8c78370 Add metadata to new "recursive" field on Indexer (unstructured-theron)
11e119e lint: run black (unstructured-theron)
fe0d496 Move recursive to IndexerConfig (unstructured-theron)
28186f3 Fix typo connection_config (unstructured-theron)
6057310 fix to handle all expections (unstructured-theron)
88df7cc Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
6dbc4cc fix typo path (unstructured-theron)
953bb72 Merge branch 'main' into DS-90-github-source-v2 (unstructured-theron)
4ca279f Fix changelog (unstructured-theron)
d5e3b1b fix log (unstructured-theron)
a7dba0a fix log (unstructured-theron)
b5e0b74 fix docstring (unstructured-theron)
725881c Add more descriptive error message (bryan-unstructured)
44d7560 comment out the clarifai test which prevents from passing CI tests (bryan-unstructured)
@@ -1,3 +1,9 @@
+# 0.2.0-dev0
+
+### Enhancements
+
+* **Added migration for GitHub Source V2**
+
 ## 0.2.0
 
 ### Enhancements
@@ -1,5 +1,4 @@
 -c ../common/constraints.txt
 
-# NOTE - pygithub==1.58.0 fails due to https://github.com/PyGithub/PyGithub/issues/2436
-pygithub>1.58.0
+pygithub>=2.4.0
 requests
test_e2e/expected-structured-output/github/LICENSE.txt.json (100 changes: 75 additions & 25 deletions)
@@ -1,57 +1,107 @@
 [
   {
+    "type": "Title",
     "element_id": "52585ab256e2832166ca185be6c76cc9",
+    "text": "Downloadify: Client Side File Creation JavaScript + Flash Library",
     "metadata": {
-      "filetype": "text/plain",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Downloadify: Client Side File Creation JavaScript + Flash Library",
-    "type": "Title"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/2c4f1ab8689a6dfef4ee7d13d4d935cb6663a7e4",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "LICENSE.txt"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 1127
+      }
+    }
   },
   {
+    "type": "Title",
     "element_id": "107ab54e7143d022fee38d5dfe235f89",
+    "text": "Copyright (c) 2009 Douglas C. Neiner",
     "metadata": {
-      "filetype": "text/plain",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Copyright (c) 2009 Douglas C. Neiner",
-    "type": "Title"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/2c4f1ab8689a6dfef4ee7d13d4d935cb6663a7e4",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "LICENSE.txt"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 1127
+      }
+    }
   },
   {
+    "type": "NarrativeText",
     "element_id": "1cd03f5c7eea429178fc15c9d6c4cbd4",
+    "text": "Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:",
     "metadata": {
-      "filetype": "text/plain",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:",
-    "type": "NarrativeText"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/2c4f1ab8689a6dfef4ee7d13d4d935cb6663a7e4",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "LICENSE.txt"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 1127
+      }
+    }
   },
   {
+    "type": "NarrativeText",
     "element_id": "5da204497a4873a8d0f71ad7865cea7e",
+    "text": "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.",
     "metadata": {
-      "filetype": "text/plain",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.",
-    "type": "NarrativeText"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/2c4f1ab8689a6dfef4ee7d13d4d935cb6663a7e4",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "LICENSE.txt"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 1127
+      }
+    }
   },
   {
+    "type": "NarrativeText",
     "element_id": "1b454f06bfa94b6d367e0e812ae32655",
+    "text": "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
     "metadata": {
-      "filetype": "text/plain",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
-    "type": "NarrativeText"
+      ],
+      "filetype": "text/plain",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/2c4f1ab8689a6dfef4ee7d13d4d935cb6663a7e4",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "LICENSE.txt"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 1127
+      }
+    }
   }
 ]
test_e2e/expected-structured-output/github/test.html.json (86 changes: 63 additions & 23 deletions)
@@ -1,52 +1,92 @@
 [
   {
+    "type": "Title",
     "element_id": "218722ac66e142a570ab2053b430c6c4",
+    "text": "Downloadify Example",
     "metadata": {
-      "filetype": "text/html",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Downloadify Example",
-    "type": "Title"
+      ],
+      "filetype": "text/html",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/c63c8fc21d46d44de85a14a3ed4baec0348ce344",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "test.html"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 3001
+      }
+    }
   },
   {
+    "type": "Title",
     "element_id": "bf0fab1925c4b2cbb23a53afce28ebd2",
+    "text": "More info available at the Github Project Page",
     "metadata": {
-      "filetype": "text/html",
-      "languages": [
-        "eng"
-      ],
       "link_texts": [
         "Github Project Page"
       ],
       "link_urls": [
         "http://github.com/dcneiner/Downloadify"
-      ]
-    },
-    "text": "More info available at the Github Project Page",
-    "type": "Title"
+      ],
+      "languages": [
+        "eng"
+      ],
+      "filetype": "text/html",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/c63c8fc21d46d44de85a14a3ed4baec0348ce344",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "test.html"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 3001
+      }
+    }
   },
   {
+    "type": "Title",
     "element_id": "395aed29cd13842fede90a1a8677aa4b",
+    "text": "Downloadify Invoke Script For This Page",
     "metadata": {
-      "filetype": "text/html",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Downloadify Invoke Script For This Page",
-    "type": "Title"
+      ],
+      "filetype": "text/html",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/c63c8fc21d46d44de85a14a3ed4baec0348ce344",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "test.html"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 3001
+      }
+    }
   },
   {
+    "type": "NarrativeText",
     "element_id": "2e22c39e004cb7d566294080c976efc8",
+    "text": "Downloadify.create('downloadify',{\n filename: function(){\n return document.getElementById('filename').value;\n },\n data: function(){ \n return document.getElementById('data').value;\n },\n onComplete: function(){ \n alert('Your File Has Been Saved!'); \n },\n onCancel: function(){ \n alert('You have cancelled the saving of this file.');\n },\n onError: function(){ \n alert('You must put something in the File Contents or there will be nothing to save!'); \n },\n swf: 'media/downloadify.swf',\n downloadImage: 'images/download.png',\n width: 100,\n height: 30,\n transparent: true,\n append: false\n});",
     "metadata": {
-      "filetype": "text/html",
       "languages": [
         "eng"
-      ]
-    },
-    "text": "Downloadify.create('downloadify',{\n filename: function(){\n return document.getElementById('filename').value;\n },\n data: function(){ \n return document.getElementById('data').value;\n },\n onComplete: function(){ \n alert('Your File Has Been Saved!'); \n },\n onCancel: function(){ \n alert('You have cancelled the saving of this file.');\n },\n onError: function(){ \n alert('You must put something in the File Contents or there will be nothing to save!'); \n },\n swf: 'media/downloadify.swf',\n downloadImage: 'images/download.png',\n width: 100,\n height: 30,\n transparent: true,\n append: false\n});",
-    "type": "NarrativeText"
+      ],
+      "filetype": "text/html",
+      "data_source": {
+        "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/c63c8fc21d46d44de85a14a3ed4baec0348ce344",
+        "version": "W/\"bb342a3e84a4ce514665385d7d61fb2922b0705ff23ad599a3e2d355aabe3f21\"",
+        "record_locator": {
+          "repo_path": "dcneiner/Downloadify",
+          "file_path": "test.html"
+        },
+        "permissions_data": null,
+        "filesize_bytes": 3001
+      }
+    }
   }
 ]
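For anyone consuming these expected outputs, the main structural change is the new `metadata.data_source` object with its nested `record_locator`. The sketch below is only an illustration of reading that structure with the standard library. The `describe_source` helper is hypothetical and not part of this PR, and the embedded element is a trimmed copy of the first `test.html.json` entry with the `version` field left out for brevity.

```python
import json

# Trimmed copy of the first element from test.html.json in this PR
# (the "version" field is omitted here for brevity).
ELEMENT_JSON = """
{
  "type": "Title",
  "element_id": "218722ac66e142a570ab2053b430c6c4",
  "text": "Downloadify Example",
  "metadata": {
    "languages": ["eng"],
    "filetype": "text/html",
    "data_source": {
      "url": "https://api.github.com/repos/dcneiner/Downloadify/git/blobs/c63c8fc21d46d44de85a14a3ed4baec0348ce344",
      "record_locator": {
        "repo_path": "dcneiner/Downloadify",
        "file_path": "test.html"
      },
      "permissions_data": null,
      "filesize_bytes": 3001
    }
  }
}
"""


def describe_source(element: dict) -> str:
    """Hypothetical helper: summarize where an element came from."""
    data_source = element.get("metadata", {}).get("data_source")
    if data_source is None:
        # Outputs produced before this change carry no data_source block.
        return "no data_source (V1-style output)"
    locator = data_source["record_locator"]
    return (
        f'{locator["repo_path"]}:{locator["file_path"]} '
        f'({data_source["filesize_bytes"]} bytes)'
    )


element = json.loads(ELEMENT_JSON)
summary = describe_source(element)
print(summary)
```

The fallback branch is there so the same helper can be pointed at older expected-output files, which have `filetype` and `languages` in `metadata` but no `data_source`.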
@@ -1 +1 @@
-__version__ = "0.2.0" # pragma: no cover
+__version__ = "0.2.0-dev0" # pragma: no cover