The footgun of right-to-left decorative characters
Fleurons and other printer's ornaments are decorative elements used in typography. Many have made it into fonts in the digital age: Unicode supports classic fleurons like ❦ and ❧ (these are slowly getting introduced into the blog styling, formal post Coming Soon™), but also additional symbols-turned-visual-indicators ranging from the aptly named ❀ (White Florette) to the lesser-known ᪥ (Tai Tham Sign Dokmai; dokmai means 'flower' in languages written in the Tai Tham script, like Northern Thai or Lao). I was recently exploring using 𐫱 as a delineator between the date and post categories, but found that it was unexpectedly rendering in the middle of the tag:
It should render like this. Why doesn't it?
Manichaeism and Unicode
𐫱 is the Manichaean Punctuation Fleuron, part of a Unicode block for encoding the religious texts of Manichaeism. If you care to know what Manichaeism is, it was a world religion from the 3rd century AD that collapsed through a combination of active persecution, competition from Christianity, Islam, and Buddhism, and the death of the empires that had tied themselves to it. (Also, they foolishly didn't allow post-deceased converts like some fast-growing religions.) If you don't care, you just need to know that they had a pretty flower icon... and wrote right-to-left.
While Manichaean script is written right-to-left, text on a website isn't as constrained. (Look at me Mom! עברית and عربي and 𐫖𐫍𐫗i𐫝𐫀𐫍e𐫍𐫗 all in one sentence!) Whatever rendering algorithm Chrome uses to lay out text needs to handle bidirectional text consisting of both LTR and RTL characters. Luckily Unicode is well specified, so there's a Unicode Bidirectional Algorithm that explains how to do this. However, because Unicode is complex, this requires ~20k words and 51 revisions, so let's stick to a TL;DR.
Each character has a bidirectional character type (visible in UnicodeData.txt). These can be strong types, like L (left-to-right) for stuff like the letter "A" or R (right-to-left) for Hebrew text; weak types, like EN for European Numbers such as 0; or neutral types, like WS for whitespace such as the space character. (Here's the table if you're curious.)
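If you'd rather poke at these types than read UnicodeData.txt, Python's standard `unicodedata` module exposes them. A minimal sketch; the `bidi_class` helper name and the sample characters are just for illustration:

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)


# (description, character) pairs chosen to cover strong, weak, and neutral types.
SAMPLES = [
    ("Latin letter A", "A"),
    ("Hebrew letter alef", "\u05d0"),
    ("Arabic letter ain", "\u0639"),
    ("digit zero", "0"),
    ("space", " "),
    ("Manichaean punctuation fleuron", "\U00010AF1"),
]

for description, ch in SAMPLES:
    print(f"{description:32} U+{ord(ch):04X}  {bidi_class(ch)}")
```

Arabic letters report AL rather than R; both are strong right-to-left types.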
Strong types take precedence, and weak and neutral types defer to the strong types around them.
The Manichaean fleuron is R typed, so it pulls the weak digits 400 into its right-to-left run (though they remain internally ordered as 400 instead of 004), rendering 𐫱 <span>400 Divisadero</span> with the fleuron in the middle: "400 𐫱 Divisadero".
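To see why the digits jump, here is a toy reordering in Python. It is a heavily simplified sketch, not the real Unicode Bidirectional Algorithm: it assumes a left-to-right paragraph, treats every class other than L, R, AL, and EN as neutral, and ignores embeddings, isolates, brackets, and mirroring. The function name `toy_visual_order` is made up for this example.

```python
import unicodedata


def toy_visual_order(text: str) -> str:
    """Toy bidi reordering for an LTR paragraph (NOT a full UAX #9 implementation)."""
    # Collapse bidi classes to L, R, EN, or N (everything else, treated as neutral).
    kinds = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            kinds.append("L")
        elif cls in ("R", "AL"):
            kinds.append("R")
        elif cls == "EN":
            kinds.append("EN")
        else:
            kinds.append("N")

    # Weak types: a number whose nearest preceding strong type is L
    # (or the start of the paragraph) behaves like L.
    last_strong = "L"
    for i, kind in enumerate(kinds):
        if kind in ("L", "R"):
            last_strong = kind
        elif kind == "EN" and last_strong == "L":
            kinds[i] = "L"

    # Neutrals: take the direction of both neighbours if they agree
    # (numbers count as R here), otherwise the paragraph direction, L.
    n = len(kinds)
    i = 0
    while i < n:
        if kinds[i] != "N":
            i += 1
            continue
        j = i
        while j < n and kinds[j] == "N":
            j += 1
        before = "L" if i == 0 or kinds[i - 1] == "L" else "R"
        after = "L" if j == n or kinds[j] == "L" else "R"
        fill = "R" if before == "R" and after == "R" else "L"
        for idx in range(i, j):
            kinds[idx] = fill
        i = j

    # Levels: L stays at 0, R goes to 1, remaining numbers go to 2.
    levels = [0 if k == "L" else 1 if k == "R" else 2 for k in kinds]

    # Reverse each run at or above a level, from the highest level down to 1.
    chars = list(text)
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < n and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)


logical = "\U00010AF1 400 Divisadero"
visual = toy_visual_order(logical)
print([f"U+{ord(c):04X}" for c in visual[:5]])
```

The digits are reversed once at level 2 and again as part of the level 1 run, which is why they come out as 400 rather than 004 while the fleuron ends up after them.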
Fixing with HTML and CSS
Luckily this is fixable, if we remember that we have to do it! We can wrap the fleuron in the <bdi> HTML tag to isolate its directionality from the text around it: <bdi>𐫱</bdi> 400 Divisadero. You can also do the same thing with the CSS style unicode-bidi: bidi-override; ('bidi' for bi-directional) on an element wrapping the fleuron, which has marginally better support across browsers. (Though who really cares about IE.)
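If your tags are generated by a script, the wrapping can be automated. A minimal sketch in Python, assuming it is enough to isolate runs of strong right-to-left characters (bidi classes R and AL); the `isolate_rtl` name is hypothetical, and this is not how the blog itself is built:

```python
import html
import unicodedata

# Strong right-to-left bidi classes; an assumption about what needs isolating.
RTL_CLASSES = {"R", "AL"}


def isolate_rtl(text: str) -> str:
    """Return HTML with each run of strong RTL characters wrapped in <bdi>."""
    pieces = []
    run = []  # characters of the RTL run currently being collected

    def flush() -> None:
        if run:
            pieces.append("<bdi>" + html.escape("".join(run)) + "</bdi>")
            run.clear()

    for ch in text:
        if unicodedata.bidirectional(ch) in RTL_CLASSES:
            run.append(ch)
        else:
            flush()
            pieces.append(html.escape(ch))
    flush()
    return "".join(pieces)


print(isolate_rtl("\U00010AF1 400 Divisadero"))
```

This is meant for lone decorative characters. Running it over real Hebrew or Arabic prose would isolate every word separately and break the phrase order, so don't do that.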
Just remember to do this the next time you reach for something like 𞢹 or 𐩕.


