Skip to content

Conversation

@ang-zeyu
Copy link
Contributor

@ang-zeyu ang-zeyu commented Mar 4, 2020

What is the purpose of this pull request? (put "X" next to an item, remove the rest)

• [x] Other, please explain: Code maintainability / readability

Resolves #1083

What is the rationale for this request?
To remove unnecessary complexity in the baseUrl resolving process and refactor some methods to simplify things.

What changes did you make? (Give an overview)

  • Added a simpler version of calculateNewBaseUrls that directly returns the immediate parent site path instead of { parent, relative }, as is the use case for calculateNewBaseUrls in many cases.
  • Removed some unnecessary complexity ( foundBase, see below ) and refactored _rebaseReference to use guard clauses instead of nested ifs.
  • Removed _rebaseReferenceForStaticIncludes, which is made redundant by _rebaseReference.

Is there anything you'd like reviewers to focus on?
What is the purpose of foundBase in _rebaseReference, if any?
The old _rebaseReference code calculated the new relative url from combinedBases[bases[0]], where combinedBases was the result of { ...childrenBase, ...parentBase }.

The purpose of this seems unclear, and also indeterministic. ( no issues were presented thus far as adding a console.log to combinedBases reveals that combinedBases is always of length 1 at most )

For example, suppose we have

<include src="file from main site" />
<include src="file from a sub (x2) site" />

that are siblings of one another. After we process the first include, foundBase[root] = '' would be present.
After processing the second include from a doubly nested sub site, we would have foundBase[subsite1] = 'subsite1 to 2 relative path'.
Since Object.keys() maintains the insertion order, we would wrongly retrieve the relative path of '' for the second include, causing {{baseUrl}} to be resolved wrongly.

Another irregularity was the need for childrenBase in combinedBases.
When we find the relative base url of the included file, the only concern when resolving said url is what site/subsite does the included file belong in?.
It can also cause the same insertion order problem if we repeat the above example, but shift the second include as the child of the first include.

ie.

<include src="file from main site"> // in this case the relative url for the parent include is resolved wrongly
  <include src="file from a sub (x2) site" /> // that was a result of the parent include.
</include

Testing instructions:
npm run test, and 2103 site diff should have no diffs.

Proposed commit message: (wrap lines at 72 characters)
Simplify baseUrl resolving process

BaseUrls for static includes are rendered once during pre-processing
and another time during resolveBaseUrls.
The rebaseReference subroutine used during resolveBaseUrls also has
some unnecessary complexity.
In addition, in calculateNewBaseUrls, the relative baseUrl for doubly
nested subsites are not resolved correctly.

Let’s remove the rendering during pre-processing, unifying where baseUrl
is resolved for statically included content.
Let’s also refactor the rebaseReference method, simplifying its logic.
At the same time, we can standardise usages of the calculateNewBaseUrls
method to calculateImmediateParentSitePath, and fix the relative baseUrl
returned for doubly nested subsites.

Copy link
Contributor Author

@ang-zeyu ang-zeyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some annotations:

.then(result => markbinder.resolveBaseUrl(result, {
baseUrlMap: this.baseUrlMap,
rootPath: this.rootPath,
isDynamic: true,
Copy link
Contributor Author

@ang-zeyu ang-zeyu Mar 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these variables in resolveDependency are not necessary, as in the resolveBaseUrl process, attribs[ATTRIB_CWF] = context.cwf is the same as dependency.from ( ie. parser.dynamicIncludeSrc.push({ from: context.cwf, ... }) )

if (this.isDynamic) {
// Change CWF for each top level element
if (node.attribs) {
node.attribs[ATTRIB_CWF] = this.dynamicSource;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the comment for resolveDependency

* the site they belong to, such that the final call to render baseUrl correctly
* resolves baseUrl according to the said site.
*/
_rebaseReference(node) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two changes here essentially:

  • replace nested ifs with guard clauses
  • removal of foundBase ( see the explanation in the OP )

*/


function _rebaseReferenceForStaticIncludes(pageData, element, config) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1083

/**
* Returns the absolute path of of the immediate parent site of the specified file path.
*/
function calculateImmediateParentSitePath(filePath, root, lookUp) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler version of calculateNewBaseUrls that just returns the immediate parent site path, which is what's needed for the majority of use cases

Copy link
Contributor

@crphang crphang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete review, but looks to make things much simpler. Great work so far!

: calculate(parent);
}

return calculate(filePath, undefined);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculate defined here takes only 1 parameter.

}

// Where the children are included
const newBase = urlUtils.calculateNewBaseUrls(includedSourceFile, this.rootPath, this.baseUrlMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loving the review that you did previously, really helped a lot. Will be good if you could help explain this part as well.

For newBase, is it possible to use calculateImmediateParentSitePath?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the comments slightly clearer

this.rootPath, this.baseUrlMap);

// Only re-render if include src and content are from different sites
if (currentBase !== path.resolve(newBase.parent, newBase.relative)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we can have currentBase !== newBase

Copy link
Contributor Author

@ang-zeyu ang-zeyu Mar 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the use cases for calculateNewBaseUrls in particular
relative is needed for the resolving the new relative baseurl
baseUrl: {{hostBaseUrl}}/${newBase.relative}

Edit:
Thanks for this!
I noticed another possible bug ( for doubly nested subsites ) with calculateNewBaseUrls and its usage in _rebaseReference such that calculateNewBaseUrls can be completely removed.

Will investigate and update back, and include a test for doubly nested subsites if its the case

@ang-zeyu ang-zeyu force-pushed the refactor-new-base-urls branch 3 times, most recently from 4edeb21 to 1970f4b Compare March 6, 2020 11:24
@ang-zeyu
Copy link
Contributor Author

ang-zeyu commented Mar 6, 2020

Then we can have currentBase !== newBase

I've removed calculateNewBaseUrls, but moved it to calculateImmediateParentSitePath as an option as its still necessary sometimes to have the relative base url.

Also fixed a bug with doubly nested subsites that existed before with baseurls ( baseurl simply resolves wrongly for it ), and added some functional tests for it.
Page.calculateNewBaseUrl is also standardised to use calculateImmediateParentSitePath now instead.


below commit only updates the commit message

@ang-zeyu ang-zeyu force-pushed the refactor-new-base-urls branch 2 times, most recently from 6498336 to 85e7fec Compare March 7, 2020 05:17
Copy link
Contributor

@marvinchin marvinchin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely a change that will help in de-mystifying the parser! Unfortunately, I don't have sufficient context to answer your questions - but a cursory look at the code seems like your observations are right. Left a couple of questions :)

* 2. The relative path from (1) to the immediate parent site of the specified filePath
* @param needRelative Whether the relative file path from the root site to the immediate parent site
* should be returned as well. Defaults to false if unspecified
* @return Either
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible/inexpensive enough for us to just calculate and return both the absolute and relative path every time? I think having to deal with two possible return types make this function more complicated, and can easily lead to errors especially without a type system in place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with two possible return types to avoid e.g. { absolute: parentSite } = urlUtils.calculate... all over. I merged it into one now though, will revert if preferred!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would lean towards avoiding hard to spot errors over having slightly shorter code, but I'm open to the input of others.

What do you think about splitting them up into two functions? Would help make this more explicit while reducing unnecessary destructuring at the call site? (i.e. calculateAbsoluteParentSitePath and calculateAbsoluteAndRelativeParentSitePaths)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply! I've changed it back to two functions


const includeSourceFile = node.attribs[ATTRIB_CWF];
const includedSourceFile = node.attribs[ATTRIB_INCLUDE_PATH];
delete node.attribs[ATTRIB_CWF];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering - should we delete these attributes here? It feels a little odd since they were added in _preprocess (at least, I think), and it isn't immediately intuitive that these attributes would be deleted in this function.

If it is necessary for us to delete it here, could we leave a comment explaining why we need to do this? And maybe add a comment in _preprocess to notify that this variable will be deleted here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original function already deletes both

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary to the best of my knowledge, but not having <div cwf="..."> all over the output is a nice touch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah noted I didn't notice that these outputs would be included in the output!

position: fixed;
width: 100%;
z-index: 9999;
z-index: 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this diff doesn't seem to be related to the changes in this PR.

Seems to be related to #1070. Let's fix this in another PR? Also, it seems like CSS diffs aren't being checked by our tests 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

125ac54, was indeed #1070's test file updates

Seems to be related to #1070. Let's fix this in another PR? Also, it seems like CSS diffs aren't being checked by our tests 😅

I checked the test script briefly, it seems that only siteData.json and .html files are diffed.
The likely original reason was to avoid diffing large library files ( eg bootstrap.min.css ); Perhaps a better solution would be to exclude these files explicitly. Will open as an issue!

foundBase[parent] = relative;
}
// override with parent's base
const combinedBases = { ...childrenBase, ...foundBase };
Copy link
Contributor

@marvinchin marvinchin Mar 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the PR that added the combinedBases change:
#263

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! Unfortunately the PR dosen't really reveal the workings of combinedBases, but

What is the rationale for this request?
When including files belonging to an inner website, the computation of baseUrl is wrong if the depth of the inclusion is only 1.

Going by this, it should be sufficient to resolve the baseurl only when the include's file and included file belongs to different sites. The likely cause of the issue was wrongly resolving against ATTRIB_CWF instead of ATTRIB_INCLUDE_PATH, and perhaps the above mentioned irregularities with foundBase.

I was looking at 64b2ff6 previously, where foundBase was included with the function right from the start.
It dosen't reveal much as well unfortunately, perhaps @Gisonrg could comment!

@ang-zeyu ang-zeyu force-pushed the refactor-new-base-urls branch 4 times, most recently from 340630f to 9dc1269 Compare March 15, 2020 07:09
@ang-zeyu ang-zeyu force-pushed the refactor-new-base-urls branch 2 times, most recently from 1741ad9 to 47a621f Compare March 22, 2020 05:24
BaseUrls for static includes are rendered once during pre-processing
and another time during resolveBaseUrls.
The rebaseReference subroutine used during resolveBaseUrls also has
some unnecessary complexity.
In addition, in calculateNewBaseUrls, the relative baseUrl for doubly
nested subsites are not resolved correctly.

Let’s remove the rendering during pre-processing, unifying where baseUrl
is resolved for statically included content.
Let’s also refactor the rebaseReference method, simplifying its logic.
At the same time, we can standardise usages of the calculateNewBaseUrls
method to calculateImmediateParentSitePath, and fix the relative baseUrl
returned for doubly nested subsites.
@ang-zeyu ang-zeyu force-pushed the refactor-new-base-urls branch from 47a621f to 7199199 Compare March 22, 2020 05:31
@ang-zeyu ang-zeyu requested a review from marvinchin March 22, 2020 16:19
Copy link
Contributor

@marvinchin marvinchin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Since the original authors with more context surrounding combinedBases aren't available, I think we can go ahead with our current understanding. The test sites included should give us sufficient confidence that the new implementation still gives us the behaviour we are expecting.

@marvinchin marvinchin added this to the v2.12.1 milestone Mar 23, 2020
@marvinchin marvinchin merged commit e88900f into MarkBind:master Mar 23, 2020
Tejas2805 added a commit to Tejas2805/markbind that referenced this pull request Mar 23, 2020
* 'master' of https://github.com/MarkBind/markbind:
  Update test files (MarkBind#1138)
  Remove OK button from modals (MarkBind#1134)
  Add start from line number functionality to code blocks (MarkBind#1115)
  Allow special tags to be self-closing (MarkBind#1101)
  Simplify baseUrl resolving process (MarkBind#1087)
  Remove redundant file writing (MarkBind#1090)
  Bump acorn from 5.7.3 to 5.7.4 (MarkBind#1121)
  Bump acorn from 7.1.0 to 7.1.1 in /src/lib/markbind (MarkBind#1120)
  Unify markdown-it parser variants (MarkBind#1056)
  Remove dynamic include feature (MarkBind#1037)
  Fix flex shrink not applying in content wrapper (MarkBind#1135)
  Escape Nunjucks for special tags (MarkBind#1049)
  Update documentation on icon slot for boxes (MarkBind#1123)
Tejas2805 added a commit to Tejas2805/markbind that referenced this pull request Mar 23, 2020
…x-pageNav

* 'master' of https://github.com/MarkBind/markbind:
  Update test files (MarkBind#1138)
  Remove OK button from modals (MarkBind#1134)
  Add start from line number functionality to code blocks (MarkBind#1115)
  Allow special tags to be self-closing (MarkBind#1101)
  Simplify baseUrl resolving process (MarkBind#1087)
  Remove redundant file writing (MarkBind#1090)
  Bump acorn from 5.7.3 to 5.7.4 (MarkBind#1121)
  Bump acorn from 7.1.0 to 7.1.1 in /src/lib/markbind (MarkBind#1120)
  Unify markdown-it parser variants (MarkBind#1056)
  Remove dynamic include feature (MarkBind#1037)
  Fix flex shrink not applying in content wrapper (MarkBind#1135)
  Escape Nunjucks for special tags (MarkBind#1049)
  Update documentation on icon slot for boxes (MarkBind#1123)
  Convert code in boxes to code block (MarkBind#1086)
marvinchin pushed a commit that referenced this pull request Apr 10, 2020
BaseUrls for static includes are rendered once during pre-processing
and another time during resolveBaseUrls.
The rebaseReference subroutine used during resolveBaseUrls also has
some unnecessary complexity.
In addition, in calculateNewBaseUrls, the relative baseUrl for doubly
nested subsites are not resolved correctly.

Let’s remove the rendering during pre-processing, unifying where baseUrl
is resolved for statically included content.
Let’s also refactor the rebaseReference method, simplifying its logic.
At the same time, we can standardise usages of the calculateNewBaseUrls
method to calculateImmediateParentSitePath, and fix the relative baseUrl
returned for doubly nested subsites.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate the baseUrl resolving process as a whole

3 participants