Stop using deprecated tags #28

Merged
merged 10 commits into from
May 21, 2013
17 changes: 14 additions & 3 deletions README.md
@@ -113,8 +113,7 @@ DocxParser includes abstract methods that each parser overrides to satisfy its

@abstractmethod
def table(self, text):
- return text
-
+ return text
@abstractmethod
def table_row(self, text):
return text
@@ -161,4 +160,16 @@ OR, let's say FOO is your new favorite markup language. Simply customize your ow

def linebreak(self):
return '!!!!!!!!!!!!' # because linebreaks are denoted by '!!!!!!!!!!!!'
- # with the FOO markup language :)
+ # with the FOO markup language :)

+ #Styles
+
+ The base parser `Docx2Html` relies on certain CSS classes being defined for certain behaviour to occur. Currently these include:
+
+ * class `insert` -> Turns the text green.
+ * class `delete` -> Turns the text red and draws a line through the text.
+ * class `center` -> Aligns the text to the center.
+ * class `right` -> Aligns the text to the right.
+ * class `left` -> Aligns the text to the left.
+ * class `comment` -> Turns the text blue.
+ * class `pydocx-underline` -> Underlines the text.
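
For readers who supply their own stylesheet, the class list above could be rendered roughly as follows. This is a minimal sketch, not code from the pull request: `STYLE_RULES` and `build_stylesheet` are hypothetical names, and the declarations are inferred from the README descriptions.

```python
# Hypothetical mapping of the README's class names to CSS declarations.
# The declarations are assumptions based on the descriptions above.
STYLE_RULES = {
    'insert': 'color: green;',
    'delete': 'color: red; text-decoration: line-through;',
    'center': 'text-align: center;',
    'right': 'text-align: right;',
    'left': 'text-align: left;',
    'comment': 'color: blue;',
    'pydocx-underline': 'text-decoration: underline;',
}


def build_stylesheet(rules):
    """Render a {class name: declarations} mapping as a <style> element."""
    lines = ['.%s {%s}' % (name, body) for name, body in sorted(rules.items())]
    return '<style>\n%s\n</style>' % '\n'.join(lines)


stylesheet = build_stylesheet(STYLE_RULES)
```

Keeping the rules in a plain mapping makes it easy to override a single class without touching the others.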
14 changes: 6 additions & 8 deletions pydocx/parsers/Docx2Html.py
@@ -9,9 +9,6 @@ class Docx2Html(DocxParser):
@property
def parsed(self):
content = self._parsed
- content = content.replace('<p></p><p></p>', '<br />')
- content = content.replace('</p><br /><p>', '</p><p>')
- content = content.replace('</p><br /><ul>', '</p><ul>')
content = "<html>%(head)s<body>%(content)s</body></html>" % {
Contributor:
I know we want to do this with PolicyStat. Is this already done somewhere else, or should it be added to your transition document?

Contributor Author:
This was not really working very well to begin with. Really it should have been called in a loop until the html stopped changing. I believe that the semantinator handles some of this.

Contributor:
OK. Added that to your transition document.
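
The "call it in a loop until the HTML stops changing" idea from the thread above can be sketched as a fixed-point loop. This is an illustration only, not code from the pull request: `collapse_until_stable` and `max_passes` are hypothetical names, and the replacement pairs are the three substitutions removed from `parsed`.

```python
def collapse_until_stable(html, replacements, max_passes=100):
    """Apply each (old, new) pair repeatedly until the string stops changing."""
    for _ in range(max_passes):
        updated = html
        for old, new in replacements:
            updated = updated.replace(old, new)
        if updated == html:
            break  # fixed point reached: another pass would change nothing
        html = updated
    return html


# The three substitutions this pull request removes from `parsed`.
REPLACEMENTS = [
    ('<p></p><p></p>', '<br />'),
    ('</p><br /><p>', '</p><p>'),
    ('</p><br /><ul>', '</p><ul>'),
]

# A single pass only merges the two empty paragraphs next to each other;
# the second pass then collapses them.
messy = '<p>a</p><p></p><br /><p></p><p>b</p>'
cleaned = collapse_until_stable(messy, REPLACEMENTS)
```

The `max_passes` guard is a defensive assumption so that a pair of mutually undoing replacements cannot loop forever.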

'head': self.head(),
'content': content,
@@ -28,6 +25,7 @@ def style(self):
{{color:red; text-decoration:line-through}}.center
{{text-align:center}}.right{{text-align:right}}
.left{{text-align:left}} .comment{{color:blue}}
+ .pydocx-underline {text-decoration: underline;}
body{{width:%(width)spx; margin:0px auto;
}}</style>''') % {
'width': (self.page_width * (4 / 3)),
@@ -109,13 +107,13 @@ def unordered_list(self, text):
}

def bold(self, text):
- return '<b>' + text + '</b>'
+ return '<strong>' + text + '</strong>'

def italics(self, text):
- return '<i>' + text + '</i>'
+ return '<em>' + text + '</em>'

def underline(self, text):
- return '<u>' + text + '</u>'
+ return '<span class="pydocx-underline">' + text + '</span>'

def tab(self):
# Insert before the text right?? So got the text and just do an insert
@@ -142,7 +140,7 @@ def table_cell(self, text, col='', row=''):
}

def page_break(self):
- return '<hr>'
+ return '<hr />'

def indent(self, text, just='', firstLine='', left='', right=''):
slug = '<div'
@@ -167,4 +165,4 @@ def indent(self, text, just='', firstLine='', left='', right=''):
}

def break_tag(self):
- return '<br/>'
+ return '<br />'
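
A consumer that still needs the old `<u>` markup could, following the customization pattern described in the README, override `underline` in a subclass. The sketch below is self-contained and uses a hypothetical stand-in class rather than importing the real `Docx2Html`; both class names are assumptions for illustration.

```python
class Docx2HtmlSketch(object):
    """Hypothetical stand-in for the inline methods changed in this diff."""

    def bold(self, text):
        return '<strong>' + text + '</strong>'

    def italics(self, text):
        return '<em>' + text + '</em>'

    def underline(self, text):
        return '<span class="pydocx-underline">' + text + '</span>'


class LegacyUnderlineParser(Docx2HtmlSketch):
    """Hypothetical subclass that keeps the old <u> tag for underlines."""

    def underline(self, text):
        return '<u>' + text + '</u>'


new_style = Docx2HtmlSketch()
legacy = LegacyUnderlineParser()
new_markup = new_style.bold(new_style.underline('AAA'))
legacy_markup = legacy.bold(legacy.underline('AAA'))
```

Only the overridden method changes; `bold` and `italics` keep the new semantic tags in both parsers.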
19 changes: 12 additions & 7 deletions pydocx/tests/test_docx.py
@@ -143,8 +143,13 @@ def test_inline_tags():
'inline_tags.docx',
)
actual_html = convert(file_path)
- assert_html_equal(actual_html, '''
- <html><body><p>This sentence has some <b>bold</b>, some <i>italics</i> and some <u>underline</u>, as well as a <a href="http://www.google.com/">hyperlink</a>.</p></body></html>''') # noqa
+ assert_html_equal(actual_html, (
+ '<html><body><p>This sentence has some <strong>bold</strong>, '
+ 'some <em>italics</em> and some '
+ '<span class="pydocx-underline">underline</span>, '
+ 'as well as a <a href="http://www.google.com/">hyperlink</a>'
+ '.</p></body></html>'
+ ))


def test_unicode():
@@ -639,16 +644,16 @@ def test_shift_enter():
actual_html = convert(file_path)
assert_html_equal(actual_html, '''
<html><body>
- <p>AAA<br/>BBB</p>
+ <p>AAA<br />BBB</p>
<p>CCC</p>
<ol data-list-type="decimal">
- <li>DDD<br/>EEE</li>
+ <li>DDD<br />EEE</li>
<li>FFF</li>
</ol>
<table>
<tr>
- <td>GGG<br/>HHH</td>
- <td>III<br/>JJJ</td>
+ <td>GGG<br />HHH</td>
+ <td>III<br />JJJ</td>
</tr>
<tr>
<td>KKK</td>
@@ -767,7 +772,7 @@ def test_simple_table():
assert_html_equal(actual_html, '''
<html><body>
<table>
- <tr><td>Cell1<br/>Cell3</td><td>Cell2<br/>
+ <tr><td>Cell1<br />Cell3</td><td>Cell2<br />
And I am writing in the table</td></tr>
<tr><td></td><td>Cell4</td></tr>
</table>
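
The rewritten `test_inline_tags` assertion above drops the `# noqa` marker by relying on Python's implicit concatenation of adjacent string literals inside parentheses. A minimal sketch of that technique, using the expected string from the test (the variable names are hypothetical):

```python
# Adjacent string literals inside parentheses are joined by the compiler,
# so no '+' operators are needed and no newlines are introduced.
expected_html = (
    '<html><body><p>This sentence has some <strong>bold</strong>, '
    'some <em>italics</em> and some '
    '<span class="pydocx-underline">underline</span>, '
    'as well as a <a href="http://www.google.com/">hyperlink</a>'
    '.</p></body></html>'
)

# The same value written as one long literal, which needs a lint exemption.
single_line = '<html><body><p>This sentence has some <strong>bold</strong>, some <em>italics</em> and some <span class="pydocx-underline">underline</span>, as well as a <a href="http://www.google.com/">hyperlink</a>.</p></body></html>'  # noqa
```

Unlike a triple-quoted string, the parenthesized form carries no leading newline or indentation into the expected value.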
18 changes: 9 additions & 9 deletions pydocx/tests/test_xml.py
@@ -16,7 +16,7 @@
class BoldTestCase(_TranslationTestCase):
expected_output = """
<html><body>
- <p><b>AAA</b></p>
+ <p><strong>AAA</strong></p>
<p>BBB</p>
</body></html>
"""
@@ -121,7 +121,7 @@ class HyperlinkWithBreakTestCase(_TranslationTestCase):

expected_output = '''
<html><body>
- <p><a href="www.google.com">link<br/></a></p>
+ <p><a href="www.google.com">link<br /></a></p>
</body></html>
'''

@@ -382,7 +382,7 @@ class TableWithListAndParagraph(_TranslationTestCase):
<li>AAA</li>
<li>BBB</li>
</ol>
- CCC<br/>
+ CCC<br />
DDD
</td>
</tr>
@@ -478,7 +478,7 @@ class ListWithContinuationTestCase(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>BBB</li>
+ <li>AAA<br />BBB</li>
<li>CCC
<table>
<tr>
@@ -722,7 +722,7 @@ class DeleteTagInList(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>
+ <li>AAA<br />
<span class='delete' author='' date=''>BBB</span>
</li>
<li>CCC</li>
@@ -746,7 +746,7 @@ class InsertTagInList(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>
+ <li>AAA<br />
<span class='insert' author='' date=''>BBB</span>
</li>
<li>CCC</li>
@@ -771,7 +771,7 @@ class SmartTagInList(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>
+ <li>AAA<br />
BBB
</li>
<li>CCC</li>
@@ -850,7 +850,7 @@ class MissingIlvl(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>
+ <li>AAA<br />
BBB
</li>
<li>CCC</li>
@@ -923,7 +923,7 @@ class SDTTestCase(_TranslationTestCase):
expected_output = '''
<html><body>
<ol data-list-type="decimal">
- <li>AAA<br/>
+ <li>AAA<br />
BBB
</li>
<li>CCC</li>