[css-inline] Arabic drop-caps · Issue #2399 · w3c/csswg-drafts · GitHub
[css-inline] Arabic drop-caps #2399

Closed
fantasai opened this issue Mar 5, 2018 · 14 comments
Labels
Closed Accepted by Editor Discretion css-inline-3 Current Work i18n-alreq Arabic language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@fantasai
Collaborator

fantasai commented Mar 5, 2018

@behnam provided the necessary info in #698 (comment) to define shaping behavior, so we should spec it.

@fantasai fantasai added the css-inline-3 Current Work label Mar 5, 2018
@r12a r12a added the i18n-alreq Arabic language enablement label Apr 30, 2018
@r12a r12a added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Apr 30, 2018
@fantasai
Collaborator Author

fantasai commented Dec 6, 2018

In #698 (comment) @jfkthame wrote:

Sure, I understand that, but the example you give is not a particularly useful thing to do. I suspect that much of the time one will apply first-letter in order to change the formatting (see the image at #698 (comment)).

We have https://drafts.csswg.org/css-inline-3/#initial-letter-shaping saying that

When initial-letters is not normal, shaping should still occur across an inline initial letter box’s boundaries. ... For example, if the first letter of the Farsi word “پس” were styled with initial-letters: 2 1, both letters would be styled in their joined forms, with initial-form “ﭘ” as the initial letter, followed by the normally-styled final-form “ﺲ”. Note that the two letters might not always graphically connect, even when shaped in their joining forms. (my emphasis)

But we have https://drafts.csswg.org/css-text/#boundary-shaping saying that

Text shaping must be broken at inline box boundaries when any of the following are true for any box whose boundary separates the two typographic character units: Any of margin/border/padding separating the two typographic character units in the inline axis is non-zero. ...

The two things seem to be contradictory, or at least to warrant some additional explanation.

If the first letter is styled with a different font-size, font-family, etc it's entirely possible it won't actually connect to the next letter even when the appropriate contextual forms are used; I think that's what the emphasized sentence above is pointing out.

Adding margin or border around the letter, OTOH, would be a cause for not shaping between the initial letter and following text.

Raised caps are an interesting edge case, where I guess it's reasonable to maintain joining as the baseline isn't changed and no horizontal separation is being introduced (though it seems an unlikely thing for someone to really want to do in such a script).

The few examples I've seen of traditional practice (see #698 (comment) and the linked discussion) don't appear to support shaping across more general drop-cap-like formatting: the large dropped, boxed initial in https://upload.wikimedia.org/wikipedia/commons/6/6a/Hafezeshamlu02.jpg, for example, is not shaped in an initial form.
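
For illustration only (not part of the original comment): a minimal Python sketch of the single css-text condition quoted above, under which shaping must break when any inline-axis margin, border, or padding at the box boundary is non-zero. The `BoxEdge` type and `shaping_breaks` function are hypothetical names, and the conditions elided by "..." in the quote are deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class BoxEdge:
    """Inline-axis spacing (in CSS px) on the box edge that separates
    two typographic character units. Hypothetical helper type."""
    margin: float = 0.0
    border: float = 0.0
    padding: float = 0.0


def shaping_breaks(edge: BoxEdge) -> bool:
    """Return True if shaping must be broken at this boundary.

    Models only the quoted condition: any non-zero margin, border,
    or padding separating the two characters in the inline axis.
    """
    return any(v != 0 for v in (edge.margin, edge.border, edge.padding))


print(shaping_breaks(BoxEdge()))             # False: no spacing at the edge
print(shaping_breaks(BoxEdge(padding=4.0)))  # True: boxed initial with padding
```

Under this reading, a bare enlarged initial could still shape with the following text, while a boxed initial like the one in the Hafez example could not.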

@fantasai fantasai reopened this Dec 6, 2018
@r12a
Contributor

r12a commented Dec 12, 2018

The few examples I've seen of traditional practice (see #698 (comment) and the linked discussion) don't appear to support shaping across more general drop-cap-like formatting: the large dropped, boxed initial in https://upload.wikimedia.org/wikipedia/commons/6/6a/Hafezeshamlu02.jpg, for example, is not shaped in an initial form.

Do we have a source which we believe is reliable that does show enlarged initial letters using joining forms? @behnam, @shervinafshar, @khaledhosny, @sahafshar, @ntounsi, @mostafa, any ideas?

@shervinafshar

@r12a, this topic has been bouncing around for a bit now. My personal take is that, since drop-caps are a typographic novelty in Arabic-script text, there is no reliable source. However, my research on the matter suggests that (a) most drop-caps in Arabic-script texts do not use joining forms, and (b) there is also some precedent for drop-caps in Arabic-script text using joining forms. This requires further investigation, but I reproduce below the case I'm referring to as a rare precedent.

What follows is from the volume Compendium of Latin Translations of Persian Astronomical Tables, selected portions of Zīj-i Sultānī translated into Latin and published by Oxford University Press in 1655.

image

image

Further pointers:
my thread on Persian Computing
Liam Quin's blogpost has a section with his findings, points to the thread above

@sahafshar

sahafshar commented Dec 12, 2018 via email

@r12a
Contributor

r12a commented Aug 23, 2019

I wonder whether the answer here is to assume that by default Arabic versals are not joining forms, but to allow authors to turn them into joining forms using ZWJ if desired?

In that case, would we need a special rule to say that ZWJ is kept with the preceding letter when using ::first-letter selection?

(Btw, @sahafshar your image didn't seem to make it into the thread.)

@r12a
Contributor

r12a commented Jan 23, 2020

FWIW (perhaps not much), if you open this test in Safari (the only major browser I know of that supports initial-letter behaviour), you'll see that when selecting the first letter of a word where I had inserted ZWJ after the initial letter, ::first-letter automatically picked up the ZWJ and rendered the enlarged initial as a joining form.

If there is no ZWJ (see this test) then the initial letter is unjoined, but the 2nd letter in the word is joined.
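
For illustration only (not part of the original comment): a rough Python sketch of the selection behaviour described above, where a ZWJ that directly follows the first letter is kept with it. The function name `first_letter_span` is hypothetical, and this is a simplification: the real ::first-letter rules also cover punctuation and grapheme clusters, which are ignored here.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def first_letter_span(text: str) -> str:
    """Return the first letter, any combining marks attached to it,
    and a directly following ZWJ if one is present.

    Simplified model: no punctuation handling, and only marks with a
    non-zero canonical combining class are treated as attached.
    """
    if not text:
        return ""
    end = 1
    # Keep combining marks (e.g. Arabic vowel signs) with the base letter.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    # Keep a ZWJ with the preceding letter so it can request a joining form.
    if end < len(text) and text[end] == ZWJ:
        end += 1
    return text[:end]


with_zwj = first_letter_span("\u067e\u200d\u0633")  # peh + ZWJ + seen
without_zwj = first_letter_span("\u067e\u0633")     # peh + seen
print(len(with_zwj), len(without_zwj))              # 2 1
```

With the ZWJ inside the selected span, a shaper sees a joiner after the initial and can pick its joining form; without it, the span is the bare letter.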

@svgeesus
Contributor

@faceless2 a nice drop cap testcase for you

@fantasai
Collaborator Author

I'm inclined to spec the behavior @r12a describes in #2399 (comment). Happy to do something different if there's some clear consensus as to what's correct, though. :)

@jfkthame
Contributor

If there is no ZWJ (see this test) then the initial letter is unjoined, but the 2nd letter in the word is joined.

@r12a When you say "but the 2nd letter in the word is joined", do you mean it takes a right-joining (medial or final) form, as if it were joined to the initial letter? Or just that it is joined to what follows it?

(As far as I can see when testing here, it takes an initial form; i.e. it does not shape as though joined to the initial letter.)

@fantasai
Collaborator Author

I got an iOS screenshot from a friend, and it does look like Safari is breaking the connection on both sides.

That said, it seems to me that using the correct connecting form between the initial letter and the rest of the word would be more readable than actually breaking this connection. It's a similar concept to how Latin drop-caps kern the rest of the first word into the drop-cap, to help maintain a clearer connection between the first letter and the rest of its word.

I also understand that using the isolated form for a drop-cap just looks a lot better; that's probably why it's more common, per @shervinafshar and @sahafshar’s comments.

So basically we have three options here:

  • Break the connection on both sides: isolated-form drop-cap, initial-form rest of the word.
  • Connect both sides: initial-form drop-cap, connected (medial) form rest of the word.
  • Split model: isolated-form drop-cap, connected (medial) form rest of the word.

I think the third option is actually the best one. Curious to hear what native users of the writing system think.

Here's a mockup of all three options, for reference:
arabic-drop-cap
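
For illustration only (not part of the original comment): the three options above can be sketched in Python as a table of which side of the drop-cap boundary keeps its join. The names `MODELS` and `letter_forms` are hypothetical, and the sketch assumes every letter in the word is dual-joining, which real Arabic text does not guarantee.

```python
from typing import List

# (joins to previous letter, joins to next letter) -> contextual form
FORM = {
    (False, False): "isolated",
    (False, True): "initial",
    (True, True): "medial",
    (True, False): "final",
}

# model -> (drop-cap joins forward?, second letter joins backward?)
MODELS = {
    "break": (False, False),   # option 1: break on both sides
    "connect": (True, True),   # option 2: connect both sides
    "split": (False, True),    # option 3: split model
}


def letter_forms(n_letters: int, model: str) -> List[str]:
    """Contextual form of each letter in a word whose first letter is a
    drop-cap. Assumes all letters are dual-joining (a simplification)."""
    cap_joins_next, second_joins_prev = MODELS[model]
    forms = []
    for i in range(n_letters):
        joins_prev = i > 0
        joins_next = i < n_letters - 1
        if i == 0:
            joins_next = joins_next and cap_joins_next
        if i == 1:
            joins_prev = second_joins_prev
        forms.append(FORM[(joins_prev, joins_next)])
    return forms


for model in MODELS:
    print(model, letter_forms(3, model))
# break ['isolated', 'initial', 'final']
# connect ['initial', 'medial', 'final']
# split ['isolated', 'medial', 'final']
```

Note that in a two-letter word the second letter comes out as a final form rather than a medial one, since nothing follows it.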

@sahafshar

I agree, @fantasai: the third option makes the most sense to me. Sorry @r12a, you're right, I forgot the reference picture in my original response on the thread. I've attached it here. As you can see, it also follows the same logic as the split model.
Dropcaps

@fantasai
Collaborator Author

Tentatively updated the editor's draft with text for this behavior given @sahafshar’s confirmation. I'll wait on more feedback before closing the issue, though.

@faceless2

This change isn't script-specific? Seems reasonable. Given how hard it was to find examples for Arabic etc., I'm pretty sure we'd be basing a decision on a sample size of zero for N'Ko.

@fantasai
Collaborator Author

fantasai commented Jun 5, 2020

@faceless2 I think in the absence of conclusive information to the contrary, it should apply to all scripts. :) The principles that make it the most sensible option (see #2399 (comment)) are generally applicable.

9 participants