
Using Unicode and Kicking ASCII


If you use a keyboard with your computer, you are familiar with the letters printed on the keys. While you are using a word processing program like LibreOffice, KWord, Calligra Words or even Microsoft Word, if you tap a key on your keyboard, the letter appears at your cursor. You already knew that. But computers are much more capable than that. Around the world, different languages have letters which aren't found on a traditional computer keyboard. Computers are not stuck with just the 26 letters used in English.

image: keyboard
(Photo by ahhyeah on flickr)

Early computers used a character code called the American Standard Code for Information Interchange (ASCII). The original 128 codes represented the basic letters, numbers and punctuation symbols on the keyboard used in the U.S., plus useful extras like the carriage return. That is enough to handle the English alphabet, 26 letters in both capital and lower case, along with common punctuation, the various math symbols and our ten digits. Later 8-bit extensions doubled the count to 256 codes, and some of those extended sets even had room for characters called "dingbats" such as the symbols for the four card suits: heart ♥, diamond ♦, spade ♠ and club ♣. The world's many other languages don't all fit into basic ASCII, though. There are letters in many different shapes needed, way beyond 128 or even 256 codes.
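To make the idea of a "code" concrete, here is a minimal Python sketch that looks up the number behind a few keyboard characters. The helper name ascii_code is made up for this illustration, and it assumes the standard 7-bit ASCII range of 0 to 127.

```python
def ascii_code(ch):
    """Return the ASCII code for ch, or None if ch falls outside 0-127."""
    code = ord(ch)  # ord() gives the numeric code of a single character
    return code if code < 128 else None

# A few characters from a U.S. keyboard, plus the carriage return.
for ch in ["A", "a", "7", "?", "\r"]:
    print(repr(ch), ascii_code(ch))

# An accented letter has no ASCII code at all.
print(repr("é"), ascii_code("é"))
```

The capital and lower case letters get separate codes (65 for "A", 97 for "a"), which is why both alphabets take up room in the table.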

Let's look at the common case of Spanish. In addition to the standard letter "e", Spanish needs an accented version, "é", and other languages such as French also need "è". The accent marks a sound or stress that differs from the basic letter. If you are going to send a letter to your friend José, you want him to feel he is truly recognized, so you wouldn't want to type his name as Jose. It would seem you mean to say "Hose" instead of the "Hose-ay" sound he uses.

So, what's the solution?

Use Unicode instead of ASCII. This document will show you three ways to do it.

Unicode is an international standard which goes far beyond the few hundred codes of ASCII and its extensions, with room for more than a million code points.
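As a rough illustration, this Python sketch prints the Unicode code point and official name of each letter in José, then checks whether a piece of text would survive being squeezed into plain ASCII. The helper name fits_in_ascii is hypothetical; the unicodedata module is part of Python's standard library.

```python
import unicodedata

def fits_in_ascii(text):
    """Return True if every character in text has a plain ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

# Show the code point (in the usual U+XXXX notation) and name of each letter.
for ch in "José":
    print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))

print(fits_in_ascii("Jose"))  # True
print(fits_in_ascii("José"))  # False
```

The "U+00E9" shown for é is the same hexadecimal number you will meet again in Method 3.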

Method 1: Special Characters Tool

Many programs have built-in access to special characters. LibreOffice is typical. The Insert menu has a choice for special characters.

image: choosing copyright

This image shows the beginning of the special characters grid. If you look carefully, you will see that the scroll control has a long way to move down. There are many characters available. In the illustration, the copyright symbol © is highlighted. You can just click on the OK button to get that symbol popped into your document at the cursor location.

To get the letter e with an acute accent, as in the José example, scroll down a bit.

image: acute e

Now José is going to be happy to hear from you.

Method 2: The Meta Key

If you plan to write to your friend José frequently, you may prefer a quicker shortcut that inserts letters like the acute-accented é straight from the keyboard, without needing to mess with the mouse and menus. The trick is to use the Meta key. (The Meta key looks like a window flag, and Microsoft calls it the Windows key. No surprise: they use that image on their start menu, too.) Remember that you should only expect to memorize these tricks if you use them over and over. Until you memorize the steps, make up an index card cheat sheet to tape by your keyboard.

image: Meta key
Photo by yum9me on flickr, license: CC-BY-NC-ND

The keyboard trick is: Hold down the Meta/Windows key and, while holding it down, tap a key for the accent. Then let go of the Meta key and tap the letter you want to accent. For José, use Meta + apostrophe, then e. For the capital version, use Meta + apostrophe, then Shift + E.

The apostrophe key is also used to make a single quote mark so you might prefer to think of the accent key that way.

Other basic accents work the same way. Just change the keys tapped. In some cases, you need to hold down Shift while holding the Meta key down. That shows that you want the top character on a key for your accent. Here are some common ones for Spanish, French and German.

Special   Key press sequence
é         Meta + apostrophe, then e
è         Meta + accent grave, then e
ñ         Meta + Shift + nyay (the tilde), then n
ç         Meta + comma, then c
ö         Meta + Shift + colon, then o
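The "accent key, then letter" idea has a close cousin inside Unicode itself: a base letter followed by a combining accent mark can be merged into one accented character. The Python sketch below imitates the sequences listed above under that assumption. The ACCENT_KEYS table and the compose function are invented for this illustration and are not how any particular keyboard driver works.

```python
import unicodedata

# Hypothetical mapping from the accent key tapped to a Unicode combining mark.
ACCENT_KEYS = {
    "'": "\u0301",  # apostrophe   -> combining acute accent
    "`": "\u0300",  # accent grave -> combining grave accent
    "~": "\u0303",  # tilde        -> combining tilde
    ",": "\u0327",  # comma        -> combining cedilla
    ":": "\u0308",  # colon        -> combining diaeresis
}

def compose(accent_key, letter):
    """Merge a base letter and the accent for accent_key into one character."""
    combined = letter + ACCENT_KEYS[accent_key]
    # NFC normalization folds letter + combining mark into a single code point.
    return unicodedata.normalize("NFC", combined)

for key, letter in [("'", "e"), ("`", "e"), ("~", "n"), (",", "c"), (":", "o")]:
    print(f"{key} then {letter} gives {compose(key, letter)}")

print("Jos" + compose("'", "e"))  # José
```

Capital letters work the same way in this sketch: compose("'", "E") gives É.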

Method 3: UTF-8 and Expanded Unicode

There are many, many more unusual characters and symbols which you might want to use. These need yet another keyboard trick.

This method is called the U-hex technique.

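Hold down the Control + Shift keys and tap the letter U, then type the character's Unicode code in hex (hexadecimal). In the programs where this works, pressing Space or Enter usually finishes the entry. For example, the code for é is e9.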
Link for ones I frequently use.
Well organized and illustrated list
Link for a more complete list.
Link for an even more comprehensive list. This link shows many different sets of codes. The list displays different codes depending on your choice from the selector tool illustrated by the next screen capture.

image: selector tool
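If you are unsure which hex code to type for the U-hex technique, a short Python sketch can convert in both directions. The function names from_hex and to_hex are made up for this example. As far as I can tell, the digits you type are the character's Unicode code point, which is not the same as its UTF-8 bytes, so the sketch prints both for comparison.

```python
def from_hex(code):
    """Turn a hex code point such as '00e9' into its character."""
    return chr(int(code, 16))

def to_hex(ch):
    """Turn a single character into a four-digit (or longer) hex code point."""
    return format(ord(ch), "04x")

for code in ["00e9", "00a9", "2665"]:
    print(code, from_hex(code))

for ch in "é©♥":
    # The code point is what you would type after Control + Shift + U;
    # the UTF-8 bytes are how the character is stored in a file.
    print(ch, "code point:", to_hex(ch), "UTF-8 bytes:", ch.encode("utf-8").hex())
```

For é, the code point is 00e9 while the UTF-8 bytes are c3a9, so the two are easy to mix up when reading code lists.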

Once you get hooked, you may want to use these tricks everywhere. Some programs allow one or more of the tricks, even if they don't offer a menu item for selecting and inserting the special characters. For example, the Unicode U-hex method works in Twitter when you tweet on the web page. If you use a separate client for Twitter, you'll need to check if it works.

Have fun.

Web pages are funny about Unicode. The original version of this page was written using "charset=ISO-8859-1", which allowed me to embed accented letters where I needed them to describe the tricks above. However, I've been updating pages to HTML5 and "charset=utf-8", which caused the embedded accented characters to break, most likely because the files themselves were still saved in the old encoding. Instead, I had to go through the page, replacing all the accented letters with HTML "entities" like "&eacute;" for é. Odd, to say the least.
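Here is a small Python sketch of what may have gone wrong, and of the entity workaround. It assumes the page file was saved as ISO-8859-1 while the browser was told to read it as UTF-8, which is only my guess at the cause. The to_entities helper is hypothetical; html.unescape and html.entities.codepoint2name come from Python's standard library.

```python
import html
from html.entities import codepoint2name

def to_entities(text):
    """Replace each non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                       # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &eacute;
        else:
            out.append(f"&#{cp};")               # numeric entity as a fallback
    return "".join(out)

# Bytes saved as ISO-8859-1 but read as UTF-8: the accented letter is lost.
saved = "José".encode("iso-8859-1")
print(saved.decode("utf-8", errors="replace"))

# The entity version is pure ASCII, so the declared charset no longer matters.
safe = to_entities("José")
print(safe)
print(html.unescape(safe))
```

Re-saving the file as UTF-8 in the editor should fix the page just as well, without needing entities.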

UPDATE 2018-04-05: I came across a neat site today, one which lets you draw a character and returns a list of Unicode characters which closely match it. Cool, if not useful. Shapecatcher.com