Keyboard Support

Contact and Search

Keyman.com Homepage

Header bottom

Keyman.com

Index

On this page

< Previous article   Next article >


HOWTO: Map an ANSI keyboard to the same Unicode code-points

NOTE: This archived documentation has not been updated recently and may contain information that is no longer relevant

In some situations, a keyboard developer may wish to retain a legacy encoding for a keyboard. However, for compatibility reasons, they may wish to update the keyboard to use Unicode characters internally. The following table shows how to map the ANSI (Windows Western European, cp1252) characters to Unicode characters, without changing the legacy font mapping. Important Note: This does not translate an ANSI keyboard into a true Unicode keyboard. However, it does provide a pathway for transition where an application does not support ANSI input correctly. One application that has had reported issues with ANSI input in this way is Paratext 6.

This table is based on data from (http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT')


ANSI Value Unicode Value Unicode Character Name
x80 d128 U+20AC EURO_SIGN
x81 d129 *
x82 d130 U+201A SINGLE_LOW-9_QUOTATION_MARK
x83 d131 U+0192 LATIN_SMALL_LETTER_F_WITH_HOOK
x84 d132 U+201E DOUBLE_LOW-9_QUOTATION_MARK
x85 d133 U+2026 HORIZONTAL_ELLIPSIS
x86 d134 U+2020 DAGGER
x87 d135 U+2021 DOUBLE_DAGGER
x88 d136 U+02C6 MODIFIER_LETTER_CIRCUMFLEX_ACCENT
x89 d137 U+2030 PER_MILLE_SIGN
x8A d138 U+0160 LATIN_CAPITAL_LETTER_S_WITH_CARON
x8B d139 U+2039 SINGLE_LEFT-POINTING_ANGLE_QUOTATION_MARK
x8C d140 U+0152 LATIN_CAPITAL_LIGATURE_OE
x8D d141 *
x8E d142 U+017D LATIN_CAPITAL_LETTER_Z_WITH_CARON
x8F d143 *
x90 d144 *
x91 d145 U+2018 LEFT_SINGLE_QUOTATION_MARK
x92 d146 U+2019 RIGHT_SINGLE_QUOTATION_MARK
x93 d147 U+201C LEFT_DOUBLE_QUOTATION_MARK
x94 d148 U+201D RIGHT_DOUBLE_QUOTATION_MARK
x95 d149 U+2022 BULLET
x96 d150 U+2013 EN_DASH
x97 d151 U+2014 EM_DASH
x98 d152 U+02DC SMALL_TILDE
x99 d153 U+2122 TRADE_MARK_SIGN
x9A d154 U+0161 LATIN_SMALL_LETTER_S_WITH_CARON
x9B d155 U+203A SINGLE_RIGHT-POINTING_ANGLE_QUOTATION_MARK
x9C d156 U+0153 LATIN_SMALL_LIGATURE_OE
x9D d157 *
x9E d158 U+017E LATIN_SMALL_LETTER_Z_WITH_CARON
x9F d159 U+0178 LATIN_CAPITAL_LETTER_Y_WITH_DIAERESIS

  • The 5 characters x81, x8D, x8F, x90 and x9D do not have a mapping to Unicode. For compatibility reasons, you should avoid using these characters even with pure ANSI keyboards, as some applications may strip them from your data.

Example

  • The following rule is in an ANSI keyboard with a legacy font.
x9A + 'm' &gt; x9B    c 'aa' + 'm' -&gt; 'am'
  • That would be converted, according to the table above, as follows:
U+0161 + 'm' &gt; U+203A    c 'aa' + 'm' -&gt; 'am'

Applies to:

  • Keyman Developer 5.0
  • Keyman Developer Professional 6.0
  • Keyman Developer Professional 6.2
  • Keyman Developer Standard 6.0
  • Keyman Developer Standard 6.2
  • Keyman Developer Professional 7.0