LispAndUnicode

Overview

This topic collects issues around using and integrating the Unicode character encoding into Common Lisp. As more of the implementation is completed, we will face choices such as how to handle the numeral encodings of Devanagari. Each subtopic addresses one issue and has its own References section. Use the Example subtopic as a template.

Interpreting Digit Characters

Affected Components

This issue affects TheReader and ThePrinter.

Issue

CommonLisp defines its standard syntax only over the standard characters, which correspond to the printable 7-bit ASCII set. Some allowance is made for extending the character set via the implementation-defined ExtendedChar data type. But even under the current spec, numbers are written with the ASCII digits '0' - '9' and, depending on the radix, the letters 'A' - 'Z'. Under Unicode, however, the values 0 - 9 are represented not only by '0' - '9' but also by digit characters in many other scripts. To further muddy the waters, Unicode also encodes Roman numeral characters, so CLforJava could plausibly support a number extension in TheReader to read Roman numerals.
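
As a small illustration of the issue, the sketch below asks the Java Character class how it classifies a few code points that could appear in a numeric token. The class name DigitScripts and the helper classify are hypothetical and are not part of CLforJava; the sample code points were chosen only for this example.

```java
public class DigitScripts {
    /** Returns the Unicode number category of a code point, as reported by Java. */
    public static String classify(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd"; // decimal digit
            case Character.LETTER_NUMBER:        return "Nl"; // letter-like number
            case Character.OTHER_NUMBER:         return "No"; // other numeric character
            default:                             return "not a number";
        }
    }

    public static void main(String[] args) {
        int[] samples = {
            '7',      // DIGIT SEVEN
            0x0667,   // ARABIC-INDIC DIGIT SEVEN
            0x096D,   // DEVANAGARI DIGIT SEVEN
            0x2166,   // ROMAN NUMERAL SEVEN
            'A'       // a digit only when the radix is above 10
        };
        for (int cp : samples) {
            // Print the code point, its Unicode name, and its number category.
            System.out.printf("U+%04X %-28s %s%n", cp, Character.getName(cp), classify(cp));
        }
    }
}
```

The three decimal sevens share the category Nd, while the Roman numeral falls into Nl and the letter 'A' is not numeric at all, so a reader that wants to treat them uniformly has to decide which categories count as digits.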

  • Unaddressed
    • Unicode attribute mapping
    • #-sign naming of characters

Resolution

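  • The Java Character class provides several methods for determining whether a char is a digit character. The isDigit method identifies a decimal digit (Unicode general category Nd) in any script, but it does not accept the Roman numeral characters, which Unicode classifies as letter numbers (Nl). Its limitation to base 10 also makes it unsuited to TheReader. The digit method takes a radix parameter and returns the value the character represents in that radix, or -1 if the character is not a valid digit; it covers the decimal digits of every script plus the Latin letters, but not the Roman numerals. The getNumericValue method does return values for the Roman numeral characters, which may prove problematic if it is used to recognize digits.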

  • Unaddressed
    • An extension to the PotentialNumber algorithm to define a Roman numeral encoded with the Unicode Roman numeral characters.
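
A minimal sketch of how TheReader might use the digit method to interpret digit characters under a radix is shown below. The class ReaderDigits and its methods are hypothetical and do not describe the actual CLforJava reader; overflow handling, signs, and the rest of the potential number syntax are deliberately left out.

```java
public class ReaderDigits {
    /** Weight of codePoint as a digit in the given radix, or -1 if it is not one. */
    public static int digitWeight(int codePoint, int radix) {
        return Character.digit(codePoint, radix);
    }

    /**
     * Parses an unsigned integer token whose digits may come from any script.
     * Returns -1 if the token is empty or contains a non-digit for the radix.
     * Overflow is not checked in this sketch.
     */
    public static long parseUnsigned(String token, int radix) {
        if (token.isEmpty()) {
            return -1;
        }
        long value = 0;
        for (int i = 0; i < token.length(); ) {
            int cp = token.codePointAt(i);          // handles supplementary characters
            int weight = digitWeight(cp, radix);
            if (weight < 0) {
                return -1;
            }
            value = value * radix + weight;
            i += Character.charCount(cp);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsigned("\u0967\u0968\u0969", 10)); // Devanagari 1 2 3, prints 123
        System.out.println(parseUnsigned("FF", 16));                 // prints 255
        // U+216C is ROMAN NUMERAL FIFTY.
        System.out.println(Character.isDigit(0x216C));               // prints false
        System.out.println(Character.digit(0x216C, 10));             // prints -1
        System.out.println(Character.getNumericValue(0x216C));       // prints 50
    }
}
```

Note that this sketch accepts tokens that mix digits from different scripts, such as an ASCII '1' followed by a Devanagari digit. Whether TheReader should allow that is a design question the sketch does not settle.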

Implementation

Details of design and implementation

References

HyperSpec CLtL Java Unicode
link 1 Link 2 Link 3 Link 4


Example

Affected Components

Issue

Provide a description of the issue or conflict here

Resolution

Provide a detailed description of the resolution

Implementation

Details of design and implementation

References

HyperSpec CLtL Java Unicode
link 1 Link 2 Link 3 Link 4

Topic revision: r2 - 2009-02-11 - 03:30:45 - MeganLusher
 