CSS for internationalisation

By Chen Hui Jing / @hj_chen

🇲🇾 👾 🏀 🚲 🖌️ 👟 💻 🖊️ 🎙 🐈‍⬛ 🧗 🏳️‍🌈
Chen Hui Jing


internationalisation localisation

Internationalisation is the design and development of a product, application or document that enables easy localisation for target audiences that vary in culture, region, or language.

What is CSS?

A vocabulary for communicating with the browser.

To inform the browser how we want elements to be styled and laid out.

CSS is a domain-specific, declarative programming language.
Lara Schenck

A browser pondering languages

Why does lang matter?

  • Language-specific behaviour in browsers
  • Default font selection
  • Search engine optimisation
  • Spelling and grammar checks
  • Translation
  • Non-text readers
  • Parsers and scripts

Source: Why use the language attribute?

Basics of using lang

Set the default language of the document:

<html lang="zh">

If you have mixed language content on your page:

<p>The fourth animal in the Chinese Zodiac is Rabbit (<span lang="zh">兔子</span>).</p>

The :lang() pseudo-class

A selector that represents an element based on its language

We use italics to emphasise words in English, 但是中文则是用着重号.
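
A minimal sketch of how :lang() could switch emphasis styles per language. It assumes the emphasised words are wrapped in <em> and that the Chinese run carries lang="zh"; that markup is an assumption, not something shown on the slide.

```css
/* Default (English here): emphasis is shown with italics */
em {
  font-style: italic;
}

/* Chinese: drop the italics and use emphasis dots (着重号) instead.
   "under right" places the marks below horizontal text,
   which is the usual Chinese convention. */
em:lang(zh) {
  font-style: normal;
  text-emphasis: filled dot;
  text-emphasis-position: under right;
}
```

Because :lang() looks at the element's language rather than a specific attribute, the second rule also applies to an <em> nested inside a lang="zh" ancestor.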


Attribute selectors

A selector that matches elements based on the presence or value of a given attribute

/* will match only zh */
/* will match zh, zh-HK, zh-Hans, zhong, zh123…
 * basically anything with zh as the first 2 characters */
/* will match zh, zh-HK, zh-Hans, zh-amazing, zh-123 */
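
The three comments above most likely annotated the three attribute selector forms below. This is a reconstruction under that assumption; the declarations are placeholders chosen only to make each rule visible.

```css
/* Exact match: only lang="zh" */
[lang="zh"] {
  outline: 1px solid red;
}

/* Prefix match: any value that starts with the characters "zh",
   so zh, zh-HK, zh-Hans, but also zhong or zh123 */
[lang^="zh"] {
  outline: 1px solid orange;
}

/* Hyphen-separated match: exactly "zh", or "zh" followed by "-",
   so zh, zh-HK, zh-Hans, zh-amazing, zh-123 */
[lang|="zh"] {
  outline: 1px solid green;
}
```

All three only match elements that carry the lang attribute themselves. A descendant that merely inherits the language is not matched, which is one reason :lang() is often the more convenient choice.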

Some things to note…

The :lang() pseudo-class does take care of partial tag matching

In terms of selector matching, the |= operator performs a comparison against a given attribute on the element, while the :lang() pseudo-class uses the UA's knowledge of the document's semantics for the comparison.

You can always use normal classes or ids, but you would no longer be making use of the convenience of what already exists

The faux fonts problem

Faux bold effect
Spotting faux bold
Faux italic effect
Spotting faux italic
Image source: Cai


font-synthesis controls whether the browser should synthesize a missing bold or italic typeface

We use italics to emphasise words in English, 但是中文则是用着重号.
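
The property described here appears to be font-synthesis. A short sketch, assuming the Chinese text is tagged with lang="zh" and that the chosen Chinese font ships without bold or italic faces:

```css
/* Do not let the browser fake bold or italic for Chinese text */
:lang(zh) {
  font-synthesis: none;
}

/* Alternative: allow synthesized bold but never a slanted "italic" */
.allow-faux-bold:lang(zh) {
  font-synthesis: weight;
}
```

The class name allow-faux-bold is hypothetical. With font-synthesis: none, text that asks for a missing weight or style is rendered with the regular face instead of a synthesized one.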

Glyph differences




font-variant-east-asian allows control of glyph substitution and sizing in East Asian text
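
This description matches the font-variant-east-asian property. A hedged sketch of a few of its values follows; the class names are hypothetical, and each value only has an effect when the font in use contains the corresponding OpenType feature.

```css
/* Request the older JIS78 glyph forms, e.g. for names that use them */
.legacy-name:lang(ja) {
  font-variant-east-asian: jis78;
}

/* Prefer traditional character forms where the font offers them */
.prefer-traditional {
  font-variant-east-asian: traditional;
}

/* Use proportionally spaced variants instead of full-width ones */
.tight-kana {
  font-variant-east-asian: proportional-width;
}
```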



font-language-override controls the use of language-specific glyph substitutions and positioning

<!-- Macedonian lang code -->
<body lang="mk">
  <h4>Члeн 9</h4>
  <p>Никoj чoвeк нeмa дa бидe пoдлoжeн нa прoизвoлнo aпсeњe, притвoр или прoгoнувaњe.</p>
</body>

body {
  /* Serbian OpenType language tag */
  font-language-override: "SRB";
}

Example lifted from CSS Fonts Module Level 4


text-transform transforms text for styling purposes

If I want [flowers], I’m going to send them to myself.

Süße Soßen-Klöße genießen maßgeblich gefräßige preußische Nutznießer.

Ουδέν κακόν αμιγές καλού.

ァィゥェ ォヵㇰヶ
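
The four sample lines above look like demos of text-transform values. A sketch under that assumption, with hypothetical class names:

```css
/* Capitalise the first letter of each word */
.capitalise {
  text-transform: capitalize;
}

/* Uppercasing is language-aware: German ß becomes SS, and with
   lang="el" on the element the Greek accents are dropped */
.shout {
  text-transform: uppercase;
}

/* Convert small kana (ァィゥェ) to their full-size equivalents */
.big-kana {
  text-transform: full-size-kana;
}
```

The language-specific parts of uppercase depend on the content language, so the element or one of its ancestors needs a correct lang attribute.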

Line breaks for inline boxes

If an element generates zero boxes, was it really there at all?


CSS for controlling line breaks

line-break allows choosing various levels of “strictness” for line breaking restrictions
word-break controls what types of letters are glommed together to form unbreakable “words”, causing CJK characters to behave like non-CJK text or vice versa
hyphens controls whether automatic hyphenation is allowed to break words in scripts that hyphenate
overflow-wrap allows the UA to take a break anywhere in otherwise-unbreakable strings that would otherwise overflow
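
The four properties listed above could be combined as in the sketch below. The selectors and class names are assumptions chosen for illustration.

```css
/* Japanese body text: apply the strictest set of line-breaking rules */
p:lang(ja) {
  line-break: strict;
}

/* Korean: keep words together instead of breaking between any two characters */
p:lang(ko) {
  word-break: keep-all;
}

/* Languages that hyphenate: let the browser hyphenate automatically */
p:lang(en),
p:lang(de) {
  hyphens: auto;
}

/* Long URLs or tokens: allow a break anywhere rather than overflowing */
.long-token {
  overflow-wrap: anywhere;
}
```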

Line breaking by Florian Rivoal @ dotCSS


If you don't give a lang attribute, you don't get automatic hyphenation.

—Florian Rivoal

Browsers use language-specific dictionaries to figure out where the hyphenation points should be.

The hyphens property

a cute kitten

konservatorio konservatorio
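
A minimal sketch of the three hyphens values, assuming markup such as <p lang="en" class="auto-hyphens">a cute kitten</p>. The class names and the narrow width are hypothetical and only serve to force line breaks.

```css
/* A narrow column so that words have to break */
.auto-hyphens,
.manual-hyphens,
.no-hyphens {
  inline-size: 4em;
}

/* Browser picks hyphenation points from its dictionary for the
   element's language; needs a lang attribute to work */
.auto-hyphens {
  -webkit-hyphens: auto; /* older Safari */
  hyphens: auto;
}

/* Only break at hyphenation opportunities written into the text,
   such as a soft hyphen (&shy;) */
.manual-hyphens {
  hyphens: manual;
}

/* Never hyphenate, even where soft hyphens are present */
.no-hyphens {
  hyphens: none;
}
```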


ᠬᠦᠮᠦᠨ ᠪᠦᠷ ᠲᠥᠷᠥᠵᠦ ᠮᠡᠨᠳᠡᠯᠡᠬᠦ ᠡᠷᠬᠡ ᠴᠢᠯᠥᠭᠡ ᠲᠡᠢ᠂ ᠠᠳᠠᠯᠢᠬᠠᠨ ᠨᠡᠷ᠎ᠡ ᠲᠥᠷᠥ ᠲᠡᠢ᠂ ᠢᠵᠢᠯ ᠡᠷᠬᠡ ᠲᠡᠢ ᠪᠠᠢᠠᠭ᠃ ᠣᠶᠤᠨ ᠤᠬᠠᠭᠠᠨ᠂ ᠨᠠᠨᠳᠢᠨ ᠴᠢᠨᠠᠷ ᠵᠠᠶᠠᠭᠠᠰᠠᠨ ᠬᠦᠮᠦᠨ ᠬᠡᠭᠴᠢ ᠥᠭᠡᠷ᠎ᠡ ᠬᠣᠭᠣᠷᠣᠨᠳᠣ᠎ᠨ ᠠᠬᠠᠨ ᠳᠡᠭᠦᠦ ᠢᠨ ᠦᠵᠢᠯ ᠰᠠᠨᠠᠭᠠ ᠥᠠᠷ ᠬᠠᠷᠢᠴᠠᠬᠥ ᠤᠴᠢᠷ ᠲᠠᠢ᠃

writing-mode: vertical-lr

すべての人間は、生まれながらにして自由であり、かつ、尊厳と権利とについて平等である。人間は、理性と良心とを授けられており、互いに同胞の精神をもって行動しなければならない。

writing-mode: vertical-rl
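
A sketch of how the two samples above could be styled. It assumes the traditional Mongolian paragraph is tagged lang="mn-Mong" and the Japanese one lang="ja"; the vertical class is hypothetical.

```css
/* Traditional Mongolian: vertical lines, stacked left to right */
:lang(mn-Mong) {
  writing-mode: vertical-lr;
}

/* Japanese set vertically: vertical lines, stacked right to left */
.vertical:lang(ja) {
  writing-mode: vertical-rl;
}

/* The default for most scripts: horizontal lines, stacked top to bottom */
.horizontal {
  writing-mode: horizontal-tb;
}
```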

text-orientation & text-combine-upright

國家籃球協會(英語:National Basketball Association,縮寫:NBA)是北美的男子職業籃球聯盟。

text-orientation: upright

國家籃球協會(英語:National Basketball Association,縮寫:NBA)是北美的男子職業籃球聯盟。

text-combine-upright: all
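
A sketch of the two properties in vertical text, assuming markup along the lines of <p class="vertical">…<span class="tcy">NBA</span>…</p>. Both class names are hypothetical.

```css
.vertical {
  writing-mode: vertical-rl;
  /* Set Latin letters upright, one below the other,
     instead of rotating the run 90 degrees */
  text-orientation: upright;
}

/* Squeeze a short run such as "NBA" horizontally into the space
   of a single character (tate-chu-yoko) */
.vertical .tcy {
  text-combine-upright: all;
}
```

text-orientation applies to the whole block, whereas text-combine-upright is usually put on a small inline wrapper, since combining a long run into one character cell quickly becomes unreadable.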

Right-to-left languages

CSS should not be used for bi-directional styling

Because directionality is an integral part of the document structure, markup should be used to set the directionality for a document or chunk of information, or to identify places in the text where the Unicode bidirectional algorithm alone is insufficient to achieve desired directionality.

Not just for East Asian text

Gosizdat poster from 1927
Recreation of Gosizdat poster in HTML and CSS

Confusing physical directions

Physical directions for writing modes other than horizontal top-to-bottom may be confusing


Logical properties

Logical equivalent of each physical inset property, by writing-mode and direction:

writing-mode  horizontal-tb                           vertical-rl                             vertical-lr
direction     ltr                 rtl                 ltr                 rtl                 ltr                 rtl
top           inset-block-start   inset-block-start   inset-inline-start  inset-inline-end    inset-inline-start  inset-inline-end
right         inset-inline-end    inset-inline-start  inset-block-start   inset-block-start   inset-block-end     inset-block-end
bottom        inset-block-end     inset-block-end     inset-inline-end    inset-inline-start  inset-inline-end    inset-inline-start
left          inset-inline-start  inset-inline-end    inset-block-end     inset-block-end     inset-block-start   inset-block-start
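
A small sketch of logical properties in use. The badge and card class names are hypothetical; the point is that the same rules keep working when writing-mode or direction changes.

```css
.card {
  position: relative;
  /* Space on the start and end of the inline axis,
     i.e. left/right in horizontal-tb, top/bottom in vertical modes */
  padding-inline: 1em;
  /* A rule along the block-start edge: top in horizontal-tb,
     right in vertical-rl, left in vertical-lr */
  border-block-start: 2px solid currentColor;
}

.badge {
  position: absolute;
  /* Pinned to the block-start / inline-end corner:
     top right for horizontal ltr, top left for horizontal rtl */
  inset-block-start: 0.5em;
  inset-inline-end: 0.5em;
}
```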

See the Pen CSS Logical properties demo for borders by Chen Hui Jing (@huijing) on CodePen.

Lists and counters

list-style-type allows us to display ordered lists with international numbering systems

Full list of list-style-type options at MDN
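
A few list-style-type values as a sketch, assuming each ordered list is tagged with its language. The pairing of language and counter style is only an illustration.

```css
/* Chinese: 一、二、三… */
ol:lang(zh) {
  list-style-type: cjk-decimal;
}

/* Japanese: あ、い、う… */
ol:lang(ja) {
  list-style-type: hiragana;
}

/* Arabic: ١، ٢، ٣… */
ol:lang(ar) {
  list-style-type: arabic-indic;
}

/* Hebrew: א, ב, ג… */
ol:lang(he) {
  list-style-type: hebrew;
}
```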

CSS Fizzbuzz

Further reading and references






Font is IBM Plex Serif by Mike Abbink.