Skip to content

[Bug Report] Markdown API does not parse emoji #6628

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 of 3 tasks
meteorlxy opened this issue Apr 15, 2019 · 6 comments · Fixed by #11032
Closed
1 of 3 tasks

[Bug Report] Markdown API does not parse emoji #6628

meteorlxy opened this issue Apr 15, 2019 · 6 comments · Fixed by #11032
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented modifies/api This PR adds API routes or modifies them type/enhancement An improvement of existing functionality

Comments

@meteorlxy
Copy link

  • Gitea version (or commit ref): try.gitea.io
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • Not relevant

Description

https://try.gitea.io/api/v1/markdown does not parse emoji in markdown text.

image

For reference, here is what GitLab's markdown API returns:

image

@lunny lunny added modifies/api This PR adds API routes or modifies them type/enhancement An improvement of existing functionality labels Apr 15, 2019
@mrsdizzie
Copy link
Member

I believe the two main issues here are

The library we use for Markdown processing has no type of emoji translation support: https://github.com/russross/blackfriday

That alone is probably a blocker on this (unless somebody provided that feature upstream).

We also use a Javascript library for this in the Web interface (https://github.com/joypixels/emojify.js) to render :smile: type text browser side (with static images). Making a change to the API output would alter how the Web interface currently works. I believe the goal was to move to another Javascript solution on the future, which would also maybe leave this feature out of the API.

An alternative to do this in the Markdown API would be to use a different Go Library for this type of emoji processing, and parse the text ourselves like we do for other things (issue links, sha, email address, etc...). I'm not aware of any libraries that will take a text like :smile: and output html however. Gitlab uses this Ruby Gem: https://github.com/bonusly/gemojione so there would need to be something similar for Go.

@silverwind
Copy link
Member

Imho, if unicode emoji characters work, it's fine. Things like :smile: are non-standard and only work on a few specific pieces of software, unicode is supposed to work everywhere.

@stale
Copy link

stale bot commented Jun 15, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Jun 15, 2019
@lunny lunny added the issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented label Jun 16, 2019
@stale stale bot removed the issue/stale label Jun 16, 2019
@6543
Copy link
Member

6543 commented Dec 30, 2019

@zeripath does goldmark handle this different? (#9533)

@zeripath
Copy link
Contributor

zeripath commented Dec 30, 2019

I don't think it does out of the box - however, it should be quite easy to make it do it. There's a potential time cost to doing the rendering I guess would have to look up the strings in some sort of map.

Do you know does the :wibble: syntax conflict any other markdown? I don't think it does.

@guillep2k
Copy link
Member

Do you know does the :wibble: syntax conflict any other markdown? I don't think it does.

BTW, I've been meaning to propose a new column (or table) for issues and comments containing the pre-rendered content plus references (mentions) and such. It should hold the rendered version of the content as long as it's shorter than a certain limit (the majority of comments will fall way below that limit). Anything bigger than that limit will be processed online, which is slower, but we're doing that already anyway. Thus we can worry less about the amount of processing in the output.

I'm not sure it's justified, but I thought I would mention it.

silverwind pushed a commit to mrsdizzie/gitea that referenced this issue Apr 13, 2020
This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (:smile:)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: go-gitea#9182
Fixes: go-gitea#8974
Fixes: go-gitea#8953
Fixes: go-gitea#6628
Fixes: go-gitea#5130
silverwind pushed a commit to mrsdizzie/gitea that referenced this issue Apr 18, 2020
This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (:smile:)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: go-gitea#9182
Fixes: go-gitea#8974
Fixes: go-gitea#8953
Fixes: go-gitea#6628
Fixes: go-gitea#5130
mrsdizzie added a commit to mrsdizzie/gitea that referenced this issue Apr 21, 2020
This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (:smile:)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: go-gitea#9182
Fixes: go-gitea#8974
Fixes: go-gitea#8953
Fixes: go-gitea#6628
Fixes: go-gitea#5130
guillep2k added a commit that referenced this issue Apr 28, 2020
* Support unicode emojis and remove emojify.js

This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (:smile:)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: #9182
Fixes: #8974
Fixes: #8953
Fixes: #6628
Fixes: #5130

* add new shared function emojiHTML

* don't increase emoji size in issue title

* Update templates/repo/issue/view_content/add_reaction.tmpl

Co-Authored-By: 6543 <[email protected]>

* Support for emoji rendering in various templates

* Render code and review comments as they should be

* Better way to handle mail subjects

* insert unicode from tribute selection

* Add template helper for plain text when needed

* Use existing replace function I forgot about

* Don't include emoji greater than Unicode Version 12

Only include emoji and aliases in JSON

* Update build/generate-emoji.go

* Tweak regex slightly to really match everything including random invisible characters. Run tests for every emoji we have

* final updates

* code review

* code review

* hard code gitea custom emoji to match previous behavior

* Update .eslintrc

Co-Authored-By: silverwind <[email protected]>

* disable preempt

Co-authored-by: silverwind <[email protected]>
Co-authored-by: 6543 <[email protected]>
Co-authored-by: Lauris BH <[email protected]>
Co-authored-by: guillep2k <[email protected]>
ydelafollye pushed a commit to ydelafollye/gitea that referenced this issue Jul 31, 2020
* Support unicode emojis and remove emojify.js

This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (:smile:)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: go-gitea#9182
Fixes: go-gitea#8974
Fixes: go-gitea#8953
Fixes: go-gitea#6628
Fixes: go-gitea#5130

* add new shared function emojiHTML

* don't increase emoji size in issue title

* Update templates/repo/issue/view_content/add_reaction.tmpl

Co-Authored-By: 6543 <[email protected]>

* Support for emoji rendering in various templates

* Render code and review comments as they should be

* Better way to handle mail subjects

* insert unicode from tribute selection

* Add template helper for plain text when needed

* Use existing replace function I forgot about

* Don't include emoji greater than Unicode Version 12

Only include emoji and aliases in JSON

* Update build/generate-emoji.go

* Tweak regex slightly to really match everything including random invisible characters. Run tests for every emoji we have

* final updates

* code review

* code review

* hard code gitea custom emoji to match previous behavior

* Update .eslintrc

Co-Authored-By: silverwind <[email protected]>

* disable preempt

Co-authored-by: silverwind <[email protected]>
Co-authored-by: 6543 <[email protected]>
Co-authored-by: Lauris BH <[email protected]>
Co-authored-by: guillep2k <[email protected]>
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented modifies/api This PR adds API routes or modifies them type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants