Skip to content

[wip!] Add slack data importer. #114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 33 commits into from
Closed

[wip!] Add slack data importer. #114

wants to merge 33 commits into from

Conversation

rht
Copy link
Contributor

@rht rht commented Sep 13, 2017

this importer is more comprehensive than the existing one. pending tiny tiny detail such as the config param of the existing realm as an input and the 20 todo's. will fix when i am able to focus once again :(.

USAGE

See https://my.slack.com/services/export for the step to export a Slack data.
To convert a slack data, let the zip file be slack_data.zip, then
./slack2zulip.py slack_data.zip

which outputs zulip_data.tar.gz.

To deploy the data onto an existing fresh Zulip repo,
./manage.py import --destroy-rebuild-database <path to zulip_data.tar.gz>

e: add Slack URL
e: modify command

@rht
Copy link
Contributor Author

rht commented Sep 13, 2017

cc: if @rishig has requests for how this should be made.

@codecov
Copy link

codecov bot commented Sep 14, 2017

Codecov Report

Merging #114 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #114   +/-   ##
=======================================
  Coverage   53.54%   53.54%           
=======================================
  Files          59       59           
  Lines        2691     2691           
=======================================
  Hits         1441     1441           
  Misses       1250     1250

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5681b6e...c293cb0. Read the comment docs.

@timabbott
Copy link
Member

@rht some thoughts:

  • Am I correct in assuming that the format being generated is the same as the format for the existing Zulip import/export tool (manage.py import)? That seems reasonable for doing a faithful full organization import/export. At some future point, we can create a web flow for triggering that (doing so would require significant work on the Zulip side import backend, but I know how to do that).
  • Do you have thoughts on how to do automated tests for this? One idea might be to have this split into 2 stages: A first stage, which fetches data from Slack and dumps it to a file in the Slack data format. And a second that converts the Slack data format to the Zulip import data format. We could then capture as a text fixture a version of the first phase, and then have automated tests for the conversion piece (which I imagine will have all the complexity), and compare the resulting directory against what is expected.

What do you think?

@rishig
Copy link
Member

rishig commented Sep 16, 2017

This is awesome, thanks @rht! Agree with Tim's comments above.

@rht rht force-pushed the slack++ branch 3 times, most recently from e1caf2b to 265ad84 Compare September 16, 2017 05:28
@rht
Copy link
Contributor Author

rht commented Sep 17, 2017

Am I correct in assuming that the format being generated is the same as the format for the existing Zulip import/export tool (manage.py import)? That seems reasonable for doing a faithful full organization import/export.

Yes, I am following the step based on
zulip/zulip#908 (comment). And some of the
todo's requires further discussion, such as the permission hierarchy, where
Slack pins (in channels) and Zulip stars (in zerver_usermessage) are stored,
user presence timestamp, and so on. More at
https://github.com/rht/slack2zulip-data-converter#term-mapping.

At some future point, we can create a web flow for triggering that (doing so would require significant work on the Zulip side import backend, but I know how to do that).

Yes, I also had thought of this; this requires exposing only the
manage.py import command to the web via admin auth. Slack data export, for
instance, has its own web page instead of having it in the team/realm settings.

Do you have thoughts on how to do automated tests for this? One idea might be to have this split into 2 stages: A first stage, which fetches data from Slack and dumps it to a file in the Slack data format.
And a second that converts the Slack data format to the Zulip import data format. We could then capture as a text fixture a version of the first phase, and then have automated tests for the conversion piece (which I imagine will have all the complexity), and compare the resulting directory against what is expected.

This (the first stage) needs to be done once only, I think. It is not critical
to always check that the Slack data export is always up and running. For the
test, a sample fixture is needed. Also, the test needs to be run in zulip/zulip CI
since it is a manage.py command. Something along the line of
#18 (comment)
should do (yes I know how to make this work via sed).

Note: the schema of the converted data format is based on the sample export data
in https://github.com/zulip/zulip/blob/master/zerver/tests/test_export.py.

@rht rht force-pushed the slack++ branch 19 times, most recently from 0a18267 to 0ae6ed6 Compare September 18, 2017 04:31
@zulipbot
Copy link
Member

Heads up @rht, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/master branch and resolve your pull request's merge conflicts accordingly.

@rheaparekh rheaparekh mentioned this pull request Dec 1, 2017
@rht
Copy link
Contributor Author

rht commented Dec 16, 2017

This is superseded by #218. The importer script will live in zulip/zulip instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants