-
Notifications
You must be signed in to change notification settings - Fork 34
Similar series: create a table #969
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
Codecov Report
@@ Coverage Diff @@
## master #969 +/- ##
============================================
- Coverage 74.58% 74.47% -0.11%
+ Complexity 268 267 -1
============================================
Files 20 20
Lines 913 913
Branches 118 118
============================================
- Hits 681 680 -1
Misses 216 216
- Partials 16 17 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! 👍
I only have a few comments:
- please, add this migration to
src/main/resources/liquibase/version/0.4.xml
file, so it will be really executed during application startup - In order to ensure that it's impossible to add duplicated pairs, it would be good to have a constraint for that. Here is an example:
src/main/resources/liquibase/version/0.4/2016-11-29--add-unique-key-transaction_participants-table.xml
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.0.xsd"> | ||
|
||
<changeSet id="similar-series-create-table" author="azorin" context="scheme"> | ||
<comment>Creates table containing similar series</comment> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is obvious, so we can remove it.
<column name="series_id" type="INTEGER"> | ||
<constraints nullable="false" references="series(id)" foreignKeyName="fk_similar_series_series_id"/> | ||
</column> | ||
<column name="similar_series_id" type="INTEGER)"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INTEGER)
is a wrong type and it won't work.
<comment>Creates table containing similar series</comment> | ||
|
||
<createTable tableName="similar_series"> | ||
<column name="id" type="INTEGER" autoIncrement="true"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need the id
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest doing composite primary key because primary key should be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest doing composite primary key because primary key should be
Let's discuss it because I don't see why we need it.
primary key should be
Could you explain this? Why primary key should be? This table is used as a holder of the mappings. It doesn't contain data, it's rather a meta information.
Let's look on our use-case: we need to find series that are similar to specific series (SELECT similar_series_id FROM similar_series WHERE series_id = 10
and SELECT series_id FROM similar_series WHERE similar_series_id = 10
).
As we can see, we will never use a composite key because we never never both parts of it. We are using a value of a one field to find corresponding values of another field.
Thank you for brought up this idea! I read about composite keys and had to think deeply about what we're doing.
Now I have another question: executing 2 queries (see above) to find similar series looks strange. Does it mean that for each pair we should insert it twice (10-20, 20-10)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this? Why primary key should be?
I think, that all tables should have primary key else we can't determinate row of table.
We can do any field a primary key, but for this task is inncorrect.
I can’t say what the practical value of the primary key is for this case, but I’m used to using primary key everywhere because I use hibernate
As we can see, we will never use a composite key because we never never both parts of it. We are using a value of a one field to find corresponding values of another field.
We can use composite key for uniqueness data, but it is not necessary.
I thinked about this and decided that composite key is not necessary.
Now I have another question: executing 2 queries (see above) to find similar series looks strange. Does it mean that for each pair we should insert it twice (10-20, 20-10)?
We can create trigger which will be create second insert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, that all tables should have primary key else we can't determinate row of table.
My approach is to add something only when we know/understand why we need it. In this case, it's not clear and that's why I suggested to not add the id
field.
We can do any field a primary key, but for this task is inncorrect.
Yes, I agree.
I can’t say what the practical value of the primary key is for this case, but I’m used to using primary key everywhere because I use hibernate
Yes, I remember about this requirement from Hibernate. But now we don't have it and there is no such requirement. I even starting to think that changing database just for some tool isn't a good idea.
I thinked about this and decided that composite key is not necessary.
So, we came to agreement then?
Now I have another question: executing 2 queries (see above) to find similar series looks strange. Does it mean that for each pair we should insert it twice (10-20, 20-10)?
We can create trigger which will be create second insert.
Yes, we can but I think that we shouldn't.
Firstly, I'm not sure whether this works on all databases (H2 and MySQL at least, PostgreSQL later and MongoDB in future).
Secondly, the approach that I'm trying to pursue in this project, is to concentrate everything inside of the service layer. Everything else is very simple and most of the time doesn't contain any logic.
Taking back to the question. Let's insert pair only once and use more complicated SQL query instead. The rationale is to keep only data that is required. Duplicating it just because we don't like to write 2 quesries (or having one but complex) seems like a bad approach.
JFYI: master branch is broken now. I don't have time to fix it now, so it will be broken until tomorrow. |
I think me need testing database for verification this script. |
When you connect new migration, it will be executed on Travis during our integration tests. They are run on both supported databases -- H2 and MySQL. It doesn't mean that you shouldn't test migrations at all but you need to know that we have automated tests for migrations also. |
@@ -54,5 +54,5 @@ | |||
<include file="0.4/2018-06-18--test_paid_user.xml" relativeToChangelogFile="true" /> | |||
<include file="0.4/2018-07-05--series_import_parsed_data_michel_numbers_field.xml" relativeToChangelogFile="true" /> | |||
<include file="0.4/2018-07-15--series_import_parsed_data_group_id_field.xml" relativeToChangelogFile="true" /> | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next time, please, don't remove unrelated empty lines. They are here for the purpose :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good! Thank you!
Could I ask you to squash 2 commits into a single one and also rebase your branch on top of the latest master? Master has been fixed a few minutes ago, so your tests after rebase should pass.
UPDATE: I see that tests are passed anyway, so only squash left then. |
Test is failed, but this is class I'm not changed |
Are you talking about codecov check? It's not required. Thank you for the PR! 🥇 I'll merge it today on the evening. There is a new issue for you, that you can work on. |
Merged in b24e6b6 commit. |
Addressed to #968