Consistent underline for Readers #2270
|
Although pandoc doesn't currently support underlines, I do think there should be an underline builder function. The readers don't currently parse underlines consistently, and it makes more sense for readers to call a builder function instead of doing their own thing. A single function would also be very easy to change in the future if we decide to create a new Underline type, or parse all underlines as Emph, etc. Would it be better to put the […] Also, if it matters, is it possible to restart the build? I don't understand why one of the jobs failed. |
|
+++ Ophir Lifshitz [Jul 16 15 10:14 ]:
Can you rebase against master? There are some Travis fixes […] Particularly […] |
|
My mistake – doing that now. If we decided to put the builder in pandoc-types, will it have to merge in first? I don't know if it's possible for Travis to know that this PR depends on another repo's PR. |
f78c09f to
2b0d4d3
Compare
|
I'm inclined to think that if we're going to mess with […]
+++ Ophir Lifshitz [Jul 16 15 10:36 ]:
|
|
I agree. It would be a half-measure to add the underline builder to pandoc-types but not a full-fledged Underline element. Is Shared.hs still a good place for it? |
|
Please let me know if you have any more feedback on this change. |
|
Now that 1.16 includes a change to pandoc-types, is it possible to consider before the next release either adding the builder (easy to change later and makes underlines consistent for the time being) or adding a full-fledged Underline type (slightly harder, affects the 9 formats that I listed in #2264)? |
|
I'm still a bit uncomfortable with the mismatch between […] Also, while I agree that we should treat underline consistently in all readers, I'm not sure it's a good idea to have an […] Perhaps this would be a good time to (re-)raise these questions on pandoc-discuss. |
|
I’d like to revisit this before 2.0 is out. (The inconsistency is a main reason I can no longer use pandoc in my project. All of our documents use semantic underlines.) Do I still need to bring it up on the pandoc-discuss list? Edit: |
|
Thanks for the reminder. One of these things should be done
(either a new AST element, or a workaround with spans,
or a warning). I'm still not sure which.
+++ Ophir Lifshitz [Feb 24 17 04:24 ]:
… I’d like to revisit this before 2.0 is out. (The inconsistency is a
main reason I can no longer use pandoc in my project. All of our
documents use semantic underlines.) Do I still need to bring it up on
the pandoc-discuss list? If nothing is done, should pandoc eventually
emit a warning as part of the upcoming lossyness report feature
(#3392)?
|
So you actually have documents that use bold, italics and underline for three semantically different purposes? (Do you even have all possible combinations?!) I tend to take the view that all three are simply a form of emphasis, with bold and italics being very common and underlining being used very little. |
|
Yes. See these documents (document A or document B) containing quizbowl questions, for example. Tens of thousands of these documents already exist (many are archived here), and hundreds more are written every year. I will explain how basic formatting elements are used in quizbowl documents.
Bonus 11 of document A demonstrates why only using bold, instead of bold+underline, for meaning 3 is insufficient. If two consecutive words of an answer are both acceptable individually, they should be separately underlined. (Thus, […] Yes, all combinations are possible. Tossup 20 of document B has almost all combinations; it is easy to find instances of the remaining combinations (bold+italic and underline) elsewhere in the same document. I hope I have made the case for why stripping underlines or converting them into other elements will not work for these documents. Even if Pandoc never supports […] |
|
Interesting, if maybe a bit unusual, use case! I can see the problem when the input format is docx/odt or a similarly unsemantic format that doesn't allow for easy preprocessing.
Agreed. To recap the three options and add a fourth:
|
|
Sorry for the additional reminder. Does there need to be further discussion on this? Has there been a decision yet to either make the AST change or the workaround? |
|
@hftf sorry this has been neglected so long. If you want to get these changes working with current dev, I'd be happy to switch to consistently parsing underlines as Span elements with class […] It might be desirable to adjust some of the writers as well so they render this sort of Span as an actual underline element. Alternatively, you could add the Underline element to pandoc-types and change all the readers and writers. This would require a new version of pandoc-types (and I suppose that it would be simplest for it to be 2.0 to match the upcoming pandoc). |
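To make the Span-based workaround in this comment concrete, here is a minimal sketch. It uses a miniature stand-in for pandoc's inline type so that it is self-contained; the names `underlineSpan` and `isUnderline` and the reduced `Inline` type are assumptions for illustration, not a copy of the code in this pull request. The class name `"underline"` is taken from the test output quoted later in the thread.

```haskell
-- Miniature stand-in for pandoc's inline AST (illustration only;
-- the real types live in pandoc-types and have many more constructors).
type Attr = (String, [String], [(String, String)])

data Inline
  = Str String
  | Space
  | Emph [Inline]
  | Span Attr [Inline]
  deriving (Eq, Show)

-- Hypothetical builder: every reader calls this one function instead
-- of choosing its own representation for underlined text.
underlineSpan :: [Inline] -> Inline
underlineSpan = Span ("", ["underline"], [])

-- Hypothetical writer-side check: does this inline follow the
-- "Span with class underline" convention?
isUnderline :: Inline -> Bool
isUnderline (Span (_, classes, _) _) = "underline" `elem` classes
isUnderline _ = False
```

Because readers would only ever go through the builder, switching later to a dedicated Underline element (or to Emph) would mean changing one definition rather than every reader.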
2b0d4d3 to
579a5d9
Compare
|
Thank you for revisiting this issue. I have rebased my changes. (There is a Travis build error, but I don't think it is my fault.) However, there is one failing test, a strange Docx test that I wasn't comfortable passing or fixing.
... Str "dans",Space,Str "le",Space,Str "film",Space,Link ("",[],[])
-[Emph [ Str "\"Le",Space,Str "nom",Space,Str "des",Space,Str "gens\""]]
+[Span ("",["underline"],[]) [Space,Str "\"Le",Space,Str "nom",Space,Str "des",Space,Str "gens\""]]
("http://www.allocine.fr/film/fichefilm_gen_cfilm=172167.html",""),Str ".",Space ...
I spent a few hours trying to investigate why another […] Returning to the underline issue – I think this is the best course of action:
I can work on and submit a PR for step 2 soon, but I will probably need some help if I were to be the one to tackle step 3, since that change will affect a lot of the codebase. |
|
The code responsible for the failure could be […] However, since it was probably done for a reason, I don't think I'm the right person to fix it. It's hard to know the intentions of this code when there are no comments or unit tests. |
|
Any updates on this would be great. |
|
@jkr do you have any insight into this? |
|
Why don't I just pass the test for now? When @jkr eventually sees this issue, then he can fix the test case and add any comments or unit tests. I'd like to get started soon on implementing Underline via the course of action that I listed above. October would be the best month for me to put time into development; after that, it would be a while until I can contribute much again. Pandoc 2.0 is apparently also supposed to come out around the end of October. However, I don't think it would be worth my time to work on this improvement if it just ends up getting neglected, since I don't have much confidence it will ever be merged if it doesn't land in 2.0. |
|
If @jkr doesn't get back to us I can look into what is needed.
I don't want this to languish much longer.
|
|
Sorry -- somehow I missed this discussion. I'll take a look at it ASAP.
|
|
Okay -- sorry about the long delay. Beginning of school email avalanche somehow hid the more recent ones from me. So, @hftf, just to clarify, you'd like to remove the space at the beginning of the […] |
|
@jkr, Yes, that's right. I didn't know if it was intentional. Thank you for taking a look. |
|
@hftf : cool, I'll take a look this afternoon. Sorry again for the delay. A couple of questions though, for all:
|
|
There are actually two issues going on here:
I'll see if I can improve what's going on there. @hftf: Do you have a rebased version of your changeset? That will help me figure this all out. |
This can be easily updated if needed. The purpose is for Readers to transform underlines consistently.
579a5d9 to
7cbfb46
Compare
|
@jkr, Thanks very much. I've just rebased again. (I haven't touched the adjacent_links test.) That description seems right to me, but I'm still skeptical that people always intend to put a space at the edge of a link just because the link visibly formats a space, while Emph does not. I believe that people are just as careless with either element. In the documents I work with, bold/italic/underline elements often include the trailing space unintentionally since it gets automatically selected when authors drag the cursor in their word processor. I simply expected to see the same behavior after changing the Emph to Span (which has no visual formatting on its own). I didn't want to make the decision to pass that test because I couldn't confirm if the behavior was intended or a bug. Would it be possible to add unit tests, comments, or some kind of documentation to cover the intended behavior of `smushInlines` and `Combine.hs`? That way, people like me who are not very familiar with the language might have an easier time debugging. |
|
hftf writes:
> @jkr, Thanks very much. I've just rebased again. (I haven't touched
> the adjacent_links test.)
Great -- I'll give it a try and see if I can track down all the issues.
By the way, I'm probably going to change that adjacent_links test. I'd
have to go back to look at the history of that test, but it doesn't seem
like the best example of the behavior (too long, links are oddly
formatted, French might be distracting).
> That description seems right to me, but I'm still skeptical that
> people always intend to put a space at the edge of a link just because
> the link visibly formats a space, while Emph does not. I believe that
> people are just as careless with either element.
I think I agree with you here. That's why I asked before whether there
are *any* spans which might want to keep a space. But generally munching
the leading space seems like a good default. If we come up with exceptions
later, we can deal with them.
> Would it be possible to add unit tests, comments, or some kind of
> documentation to cover the intended behavior of `smushInlines` and
> `Combine.hs`? That way, people like me who are not very familiar with
> the language might have an easier time debugging.
Yes -- this has been in the back of my head as a TODO for a while. I'll
at least add comments soon, and try to add unit tests in due course.
Best,
Jesse
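The "munch the leading space" default discussed in this reply can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual logic of `smushInlines` in `Combine.hs`: the reduced `Inline` type and the function name `hoistSpaces` are hypothetical, and the sketch moves both leading and trailing spaces outside the span.

```haskell
-- Miniature stand-in for pandoc's inline AST (illustration only).
type Attr = (String, [String], [(String, String)])

data Inline
  = Str String
  | Space
  | Span Attr [Inline]
  deriving (Eq, Show)

-- Hypothetical helper: move spaces at either edge of a Span outside it.
-- A Span containing only spaces disappears, leaving just the spaces.
hoistSpaces :: Inline -> [Inline]
hoistSpaces (Span attr ils) =
  let (leading, rest)         = span (== Space) ils
      (trailingRev, coreRev)  = span (== Space) (reverse rest)
      core                    = reverse coreRev
  in  leading ++ [Span attr core | not (null core)] ++ trailingRev
hoistSpaces il = [il]
```

For example, applying `hoistSpaces` to an underline Span holding `[Space, Str "x", Space]` yields a `Space`, then the Span holding only `Str "x"`, then another `Space`. Inlines that are not Spans are returned unchanged.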
|
|
Related issue:
Word (and LibreOffice) will add an underline styling to their links,
even when underline hasn't been chosen. My take is that we should *not*
read this as underlining. This does mean that if you add underlining to
a link, it might not show up in conversion, but this seems like a chance
we have to take (plus, doing so would be invisible in most styles that
people use).
In other words, current usage, which often makes links into Emphs
(inside of links), is wrong as well. This is the case in the
`adjacent_links` test.
So I propose that if the entirety of a link is an underline, in other
words, in the case that we have
Link attr [Span ("", ["underline"], []) ils] tgt
we internally convert it to
Link attr ils tgt
Any objections to this, @hftf, @jgm?
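The proposal above can be sketched as a single pattern match. As before, the reduced `Inline` type is a stand-in for the real pandoc AST and the function name `stripLinkUnderline` is hypothetical; only the case where the underline Span is the link's sole child is rewritten, so partially underlined links are left alone.

```haskell
-- Miniature stand-in for pandoc's inline AST (illustration only).
type Attr = (String, [String], [(String, String)])
type Target = (String, String) -- (URL, title)

data Inline
  = Str String
  | Space
  | Span Attr [Inline]
  | Link Attr [Inline] Target
  deriving (Eq, Show)

-- Hypothetical helper: drop the underline Span only when it wraps the
-- entire content of the link; anything else is returned unchanged.
stripLinkUnderline :: Inline -> Inline
stripLinkUnderline (Link attr [Span ("", ["underline"], []) ils] tgt) =
  Link attr ils tgt
stripLinkUnderline il = il
```

A link whose content is `[Span ("", ["underline"], []) [Str "a"], Str "b"]` does not match the first equation and keeps its Span.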
|
|
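That seems logical. Some remarks:
1. I'm in favor of changing the test as well for all the reasons you mentioned. I believe the test was added to handle the strange edge case of several consecutive links, which get combined into a single link. Let's just make sure that behavior is still tested.
2. I agree that if an entire link is underlined, then the underline can be dropped. What if only part of the link is underlined/not underlined? For now, I suppose that nothing is dropped and whatever formatting is left as is.
3. I think both leading spaces and trailing spaces should be moved outside the link. Exceptions can be dealt with later, as you say.
It might also be a good idea to move this underline-related discussion to separate issues. |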
|
+++ Jesse Rosenthal [Oct 10 17 14:20 ]:
> Related issue:
> Word (and libre office) will add an underline styling to their links,
> even when underline hasn't been chosen. My take is that we should *not*
> read this as underlining
I agree.
|
|
By the way, I tracked down the original source of that test. The reason
that test uses such a weird document is that it was hard to reproduce
the problem in a document from my copy of word. So we kept this as an
in-the-wild test case we should be robust against:
#2689
I'll see if I can create an easier document that can reproduce the issue
(probably by going from pandoc->word).
hftf writes:
… That seems logical. Some remarks:
1. I'm in favor of changing the test as well for all the reasons you mentioned. I believe the test was added to handle the strange edge case of several consecutive links, which get combined into a single link. Let's just make sure that behavior is still tested.
2. I agree that if an entire link is underlined, then the underline can be dropped. What if only part of the link is underlined/not underlined? For now, I suppose that nothing is dropped and whatever formatting is left as is.
3. I think both leading spaces and trailing spaces should be moved outside the link. Exceptions can be dealt with later, as you say.
It might also be a good idea to move this underline-related discussion to separate issues.
|
|
@jkr any success creating an alternative test? I'd like to get this merged before 2.0. |
|
Yes -- sorry, part of the difficulty was figuring out how to make a pathological case like that test case that would fail properly. The fixes will be up later this evening. |
|
Sorry -- work thing took over yesterday and today. This is still on my radar, and I plan to get to it tonight. |
|
This issue is now the only thing left on the pandoc 2.0
milestone! If reconstructing this one test case is
proving a big obstacle, should we think about just
removing it for now?
|
|
I am in favor of dropping the weird test case for now. This specific pull request is the first part of a workaround that really should be merged at this point. I'm still interested in a new Underline element. I posted a roadmap for that goal 2 months ago, but there seemed to be no interest so I did not implement it. The most opportune time to land an AST change would be 2.0, so I would push for that, but I suppose it can wait. (I also have another completely unrelated breaking change – a proposal for making "plain" output more useful – that I wanted to bring up on the mailing list before 2.0, but I procrastinated doing that.) Both items deserve to land in 2.0; however, I would hate to keep everyone else waiting. |
|
+++ hftf [Oct 27 17 05:45 ]:
> Both items deserve to land in 2.0; however, I would hate to keep
> everyone else waiting.
Well, we don't have a fixed deadline for a release, so we
could still consider proposals.
However, adding Underline would require quite a lot of work;
just about every module would need to be touched (in both
pandoc and pandoc-types). I'm not sure the benefit is worth
the cost. And I still have a vague sense of unease that
it's too "presentational" an element.
You should go ahead and make your proposal about plain
output on pandoc-discuss. Note that a modification here
could happen in a later point release. (As could the
addition of Underline, really.) The main thrust of 2.0
is moving everything into PandocMonad.
|
|
I think we should just drop the weird test case. It really was a pathological case (note that in the original issue the original reporter couldn't recreate it). Finding an alternative that recreates the behavior can definitely be a TODO for a point release. I apologize to everyone about this. There was a perfect storm of personal and professional. |
This was what I would have thought too. But the others convinced me otherwise. Even in pandoc 1.x, there were breaking changes in the point releases. However, trying to land every big change in pandoc 2.0 would make the release overwhelming, and it already seems to be. (I just finished reading the changelog of pandoc 2.0, and it is very lengthy! I'm surprised so much has been changed and fixed already. I don't recall such a lengthy changelog (1788 lines!) in the past, and a quick regex confirmed that.) There's at least one major AST change that everyone is hoping for, column/row span in tables, and that is pushed to pandoc 2.1+. So don't worry, there's always a chance to make pandoc even better. |
See #2270 for background -- this test blocked the consistent underline change and was hard to revise, so for now we are removing it.
|
I merged the changes and dropped the test case.
|
|
Is it a good idea to add a CSS rule for .underline in the default HTML template? |
|
Yes, that sounds quite reasonable. Do you want to suggest
one?
+++ hftf [Dec 03 17 10:28 ]:
… Is it a good idea to add a CSS rule for .underline in the default html
template?
|
|
I imagine this should be sufficient:
span.underline { text-decoration: underline; }
|
|
I added this.
+++ hftf [Dec 03 17 19:50 ]:
… I imagine this should be sufficient:
span.underline { text-decoration: underline; }
|
Fixes #2264.