Alphabetical Collation Sequence of Arabic Words With Special Characters in Microsoft Office Software



Author Information

Manar Almanea, Imam Mohammad Ibn Saud Islamic University, Saudi Arabia

Abstract

Arabic script contains a large number of special characters, such as accents and symbols, beyond its alphabetic letters. While Arabic adheres to a fixed alphabetical order, the arrangement of words containing these special characters remains controversial. This study investigates the degree of sophistication of the Arabic alphabetical sorting systems operating in Microsoft Office Word and Excel documents, as well as in Python, which employs UTF-8 encoding. A list of 38 Arabic words was used for evaluation. Each group of words in the list shared almost the same consonantal root but differed in characters and diacritics. Extracting and comparing the sorted outputs from the three programs revealed marked differences among the three sorted lists, with discrepancies as high as 58% observed across the tested conditions. This is not merely a technical error; it is a linguistic and cultural oversight with consequences for data integrity and accessibility. Similarities and differences in the orders of the generated lists are then discussed. To solve this problem, this study proposes a linguistically informed secondary alphabetical order for special characters beyond the primary order of Arabic letters. The order is based on linguistic features of the special characters, such as the word’s root and the character’s phonological salience. Software developers working with Arabic script in digital applications are advised to incorporate the recommendations of this study into their work and to adjust the alphabetical collation algorithms implemented in their programs.
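
The kind of discrepancy described above can be illustrated with a minimal Python sketch. It is not the collation order proposed in the paper. It assumes only that Python's built-in `sorted` compares strings by Unicode code point, and it contrasts that default with a hypothetical two-level key (the names `strip_marks` and `two_level_key` are invented for this illustration) that ranks words by their base letters first and uses the diacritics only to break ties. The three sample words are likewise chosen for illustration and are not taken from the study's 38-word list.

```python
import unicodedata

# Sample words written as escapes to avoid right-to-left rendering issues.
KTB = "\u0643\u062A\u0628"                    # k-t-b, unvocalized
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kataba, with fatha marks
KALB = "\u0643\u0644\u0628"                   # k-l-b, unvocalized


def strip_marks(word: str) -> str:
    """Drop Unicode combining marks (this covers the Arabic short-vowel marks).

    Assumption: base letters alone define the primary order. Hamza forms,
    tatweel, and other special characters are NOT handled here.
    """
    return "".join(ch for ch in word if not unicodedata.combining(ch))


def two_level_key(word: str) -> tuple[str, str]:
    """Hypothetical key: primary = base letters, secondary = original string."""
    return (strip_marks(word), word)


words = [KALB, KATABA, KTB]

# Default: code-point order. Combining marks (U+064B and above) sort after
# all basic Arabic letters, so the vocalized word is pushed to the end.
default_order = sorted(words)

# Two-level: the vocalized word stays next to its unvocalized counterpart.
two_level_order = sorted(words, key=two_level_key)

print(default_order == [KTB, KALB, KATABA])
print(two_level_order == [KTB, KATABA, KALB])
```

In the default ordering the vocalized form is separated from the unvocalized word that shares its letters, whereas the two-level key keeps them adjacent. A production system would more likely rely on a full collation library than on a hand-written key such as this one.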


Paper Information

Conference: ACAH2025
Stream: Language

This paper is part of the ACAH2025 Conference Proceedings


To cite this article:
Almanea, M. (2025). Alphabetical Collation Sequence of Arabic Words With Special Characters in Microsoft Office Software. ISSN: 2186-229X – The Asian Conference on Arts & Humanities 2025 Official Conference Proceedings (pp. 325-340). https://doi.org/10.22492/issn.2186-229X.2025.26
To link to this article: https://doi.org/10.22492/issn.2186-229X.2025.26


Posted by James Alexander Gordon