@@ -1312,6 +1312,121 @@ Value: Brahmi_Joining_Number
 File: UnicodeData
 Property: SPECIAL
 
+File: ArabicShaping
+#
+# This file is a normative contributory data file in the
+# Unicode Character Database.
+#
+# This file defines the Joining_Type and Joining_Group property
+# values for Arabic, Syriac, N'Ko, Mandaic, and Manichaean positional
+# shaping, repeating in machine readable form the information
+# exemplified in various tables of The Unicode Standard core specification.
+#
+# This file also defines Joining_Type values for Mongolian, Phags-pa,
+# Psalter Pahlavi, Sogdian, Old Uyghur, Chorasmian, and Adlam positional
+# shaping, and Joining_Type and Joining_Group values for Hanifi Rohingya
+# positional shaping, which are not listed in tables in the core
+# specification.
+#
+# Script           Section   Table(s)
+#
+# Arabic           9.2       9-3, 9-4, 9-5, 9-7, 9-8, 9-9, 9-10, 9-11, 9-13
+# Syriac           9.3       9-15, 9-16, 9-17, 9-18, 9-19
+# Mandaic          9.5       9-22, 9-23
+# Manichaean       10.5      10-4, 10-5, 10-6, 10-7
+# Psalter Pahlavi  10.6      --
+# Chorasmian       10.8      --
+# Mongolian        13.5      --
+# Phags-pa         14.4      14-7
+# Sogdian          14.10     --
+# Old Uyghur       14.11     --
+# Hanifi Rohingya  16.14     --
+# N'Ko             19.4      19-5
+# Adlam            19.9      --
+#
+# Each line contains four fields, separated by a semicolon.
+#
+# Field 0: the code point of a character, in hexadecimal form.
+#
+# Field 1: gives a short schematic name for that character.
+# The schematic name is descriptive of the shape, based as
+# consistently as possible on a name for the skeleton and
+# then the diacritic marks applied to the skeleton, if any.
+# Note that this schematic name is considered a comment,
+# and does not constitute a formal property value.
+#
+# Field 2: defines the joining type (property name: Joining_Type)
+#    R Right_Joining
+#    L Left_Joining
+#    D Dual_Joining
+#    C Join_Causing
+#    U Non_Joining
+#    T Transparent
+#
+# See Section 9.2, Arabic for more information on these joining types.
+# Note that for cursive joining scripts which are typically rendered
+# top-to-bottom, rather than right-to-left, Joining_Type=L conventionally
+# refers to bottom joining, and Joining_Type=R conventionally refers
+# to top joining. See Section 14.4, Phags-pa for more information on the
+# interpretation of joining types in vertical layout.
+#
+# Field 3: defines the joining group (property name: Joining_Group)
+#
+# The values of the joining group are based schematically on character
+# names. Where a schematic character name consists of two or more parts
+# separated by spaces, the formal Joining_Group property value, as specified in
+# PropertyValueAliases.txt, consists of the same name parts joined by
+# underscores. Hence, the entry:
+#
+#    0629; TEH MARBUTA; R; TEH MARBUTA
+#
+# corresponds to [Joining_Group = Teh_Marbuta].
+#
+# Note: The property value now designated [Joining_Group = Teh_Marbuta_Goal]
+# used to apply to both of the following characters
+# in earlier versions of the standard:
+#
+#    U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
+#    U+06C3 ARABIC LETTER TEH MARBUTA GOAL
+#
+# However, it currently applies only to U+06C3, and *not* to U+06C2.
+# To avoid destabilizing existing Joining_Group property aliases, the
+# prior Joining_Group value for U+06C3 (Hamza_On_Heh_Goal) has been
+# retained as a property value alias, despite the fact that it
+# no longer applies to its namesake character, U+06C2.
+# See PropertyValueAliases.txt.
+#
+# When other cursive scripts are added to the Unicode Standard in the
+# future, the joining group value of all its letters will default to
+# jg=No_Joining_Group in this data file. Other, more specific
+# joining group values will be defined only if an explicit proposal
+# to define those values exactly has been approved by the UTC. This
+# is the convention exemplified by the N'Ko, Mandaic, Mongolian,
+# Phags-pa, Psalter Pahlavi, Sogdian, Old Uyghur, Chorasmian, and Adlam scripts.
+# Only the Arabic, Manichaean, and Syriac scripts currently have
+# explicit joining group values defined for all characters, including
+# those which have only a single character in a particular Joining_Group
+# class. Hanifi Rohingya has explicit Joining_Group values assigned only for
+# the few characters which share a particular Joining_Group class, but
+# assigns jg=No_Joining_Group to all the singletons.
+#
+# Note: Code points that are not explicitly listed in this file are
+# either of Joining_Type T or U:
+#
+# - Those that are not explicitly listed and that are of General_Category Mn, Me, or Cf
+#   are Joining_Type=T.
+# - All others not explicitly listed are Joining_Type=U.
+#
+# For an explicit listing of all characters of Joining_Type=T, see
+# the derived property file DerivedJoiningType.txt.
+# For an implementation that needs to parse for the values of
+# Joining_Type, it is recommended to use DerivedJoiningType.txt
+# instead of ArabicShaping.txt, to avoid the separate required step of
+# calculating the set for Joining_Type=T based on General_Category values.
+#
+# #############################################################
+Property: SPECIAL
+
 File: SpecialCasing
 Property: SPECIAL
 
@@ -1361,7 +1476,6 @@ HackName:
 FinalComments
 Note that PropertyAliases sorts by the long name, while PropertyValueAliases
 sorts by the short name
-ArabicShaping
 BidiMirroring
 CompositionExclusions
 StandardizedVariants
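For reviewers who want to sanity-check the field layout described in the added header, the following is a minimal Python sketch, not part of this commit. It splits one data line into the four fields, derives the formal Joining_Group value by joining the name parts with underscores, and applies the stated default for code points that are not listed. The function names `parse_shaping_line` and `default_joining_type` are hypothetical, and the capitalization rule (title-casing all-uppercase parts so that `TEH MARBUTA` becomes `Teh_Marbuta`) is an assumption inferred from the single example in the header. Note also that `unicodedata` reflects the Unicode version bundled with the Python interpreter, which may differ from the version of the data file.

```python
import unicodedata


def parse_shaping_line(line):
    """Parse one line in the four-field format described in the header.

    Returns (code_point, schematic_name, joining_type, joining_group),
    or None for blank and comment-only lines.
    """
    data = line.split("#", 1)[0].strip()  # drop comments
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    code_point = int(fields[0], 16)  # Field 0: hexadecimal code point
    schematic_name = fields[1]       # Field 1: a comment, not a property value
    joining_type = fields[2]         # Field 2: R, L, D, C, U, or T
    # Field 3: join the name parts with underscores. Title-casing of
    # all-uppercase parts is an assumption based on the Teh_Marbuta example;
    # values such as No_Joining_Group are left untouched.
    parts = fields[3].split()
    joining_group = "_".join(p.capitalize() if p.isupper() else p for p in parts)
    return code_point, schematic_name, joining_type, joining_group


def default_joining_type(code_point):
    """Joining_Type for a code point NOT explicitly listed in the file."""
    category = unicodedata.category(chr(code_point))
    return "T" if category in {"Mn", "Me", "Cf"} else "U"
```

As the header itself recommends, an implementation that only needs Joining_Type may prefer to read DerivedJoiningType.txt directly and skip the General_Category step sketched in `default_joining_type`.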