Here's why 'A' starts at position 65 in the ASCII table

soniel: (via Hacker News)

If you look at each byte as being 2 bits of ‘group’ and 5 bits of ‘character’:

00 11011 is Escape

10 11011 is [
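The split described above can be sketched in a few lines of Python. The function name `split_byte` is made up for illustration, and the sketch assumes plain 7-bit ASCII input.

```python
def split_byte(ch: str) -> tuple[str, str]:
    """Split a 7-bit ASCII character into its 2-bit group and 5-bit character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    group = code >> 5        # top two bits: 00, 01, 10 or 11
    position = code & 0x1F   # low five bits: 00000 to 11111
    return format(group, "02b"), format(position, "05b")


print(split_byte("\x1b"))  # Escape -> ('00', '11011')
print(split_byte("["))     # [      -> ('10', '11011')
```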

So when we do ctrl+[ for escape (e.g., in old ansi ‘escape sequences’, or in more recent discussions about the vim escape key on the ‘touchbar’ macbooks), you’re asking for the character 11011 ([) out of the control (00) set.
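As a rough sketch of that idea, "asking for the same character out of the control set" amounts to clearing the two group bits. The helper name `ctrl` is hypothetical, and real terminals may treat some key combinations differently, so take this as a model of the convention rather than a description of any particular terminal.

```python
def ctrl(key: str) -> str:
    """Return the control character conventionally produced by Ctrl+key."""
    # Keep the 5 'character' bits and zero the 2 'group' bits.
    return chr(ord(key) & 0x1F)


assert ctrl("[") == "\x1b"  # Escape
assert ctrl("J") == "\n"    # Line Feed
assert ctrl("M") == "\r"    # Carriage Return
```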

Any time you see \r represented as ^M, it’s the same thing - 01101 (M) in the control (00) set is Carriage Return.
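Going the other way, the caret notation can be sketched by flipping the 40h bit of a control character, which lands on the matching character in the uppercase (10) set. The function name `caret` is an assumption for illustration; showing DEL as `^?` follows the usual convention and falls out of the same bit flip.

```python
def caret(ch: str) -> str:
    """Render a control character in caret notation; leave others unchanged."""
    code = ord(ch)
    if code < 0x20 or code == 0x7F:
        # Flipping bit 40h maps 00 xxxxx to 10 xxxxx, and DEL (7Fh) to '?'.
        return "^" + chr(code ^ 0x40)
    return ch


print(caret("\r"))    # ^M
print(caret("\n"))    # ^J
print(caret("\x1b"))  # ^[
```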

Likewise, when you realise that the relationship between upper-case and lower-case is just the same character from sets 10 & 11, it becomes obvious that you can, e.g., translate upper case to lower case by just doing a bitwise or against 32 (0100000, or 20h).
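A minimal sketch of that case conversion follows. Note that the mask 0100000 is decimal 32 (20h). The names `CASE_BIT`, `to_lower` and `to_upper` are invented here, and the range checks are there because the trick only makes sense for the letters A to Z and a to z.

```python
CASE_BIT = 0b0100000  # decimal 32, hex 20h


def to_lower(ch: str) -> str:
    """Set the case bit on an uppercase ASCII letter."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) | CASE_BIT)
    return ch


def to_upper(ch: str) -> str:
    """Clear the case bit on a lowercase ASCII letter."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch


print(to_lower("A"), to_upper("z"))  # a Z
```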

As for 40h & 60h: having a nice round number for the offset mostly just means you can ‘read’ ascii from binary by only paying attention to the last 5 bits. A is 1 (00001), Z is 26 (11010), leaving us something we can more comfortably manipulate in our heads.
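To illustrate "only paying attention to the last 5 bits", here is a small sketch that masks a letter down to its position in the alphabet. `letter_index` is a hypothetical helper and is only meaningful for ASCII letters.

```python
def letter_index(ch: str) -> int:
    """Return the 1-based alphabet position held in the low 5 bits of a letter."""
    return ord(ch) & 0x1F


for ch in "AZaz":
    print(ch, format(ord(ch), "07b"), letter_index(ch))
```

Both cases of the same letter give the same index, because only the group bits differ.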

I won’t claim any of this is useful. But in the context of understanding why the ascii table looks the way it does, I do find four sets of 32 makes it much simpler in my head. I find it much easier to remember that A=65 (41h) and a=97 (61h) when I’m simply visualizing that A is the 1st character of the uppercase(40h) or lowercase(60h) set.

You will have to look at the pastebin attachment to understand the comment, but it's awesome when you get it. Then go to Eric Raymond’s post and read the ASCII section. You will understand it better.

I have reproduced the ascii table from the pastebin below.

       00    01    10    11
00000  NUL   Spc   @     `
00001  SOH   !     A     a
00010  STX   "     B     b
00011  ETX   #     C     c
00100  EOT   $     D     d
00101  ENQ   %     E     e
00110  ACK   &     F     f
00111  BEL   '     G     g
01000  BS    (     H     h
01001  TAB   )     I     i
01010  LF    *     J     j
01011  VT    +     K     k
01100  FF    ,     L     l
01101  CR    -     M     m
01110  SO    .     N     n
01111  SI    /     O     o
10000  DLE   0     P     p
10001  DC1   1     Q     q
10010  DC2   2     R     r
10011  DC3   3     S     s
10100  DC4   4     T     t
10101  NAK   5     U     u
10110  SYN   6     V     v
10111  ETB   7     W     w
11000  CAN   8     X     x
11001  EM    9     Y     y
11010  SUB   :     Z     z
11011  ESC   ;     [     {
11100  FS    <     \     |
11101  GS    =     ]     }
11110  RS    >     ^     ~
11111  US    ?     _     DEL
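If you would rather regenerate the four-column layout than copy it, something like the following sketch should work. The names `CONTROL_NAMES`, `cell` and `ascii_rows` are made up for this example, and the abbreviations (including `Spc` and `TAB`) simply mirror the ones used in the table above.

```python
# Abbreviations for codes 0 to 31, in order, as used in the table above.
CONTROL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS TAB LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def cell(code: int) -> str:
    """Return the printable label for one ASCII code."""
    if code < 0x20:
        return CONTROL_NAMES[code]
    if code == 0x20:
        return "Spc"
    if code == 0x7F:
        return "DEL"
    return chr(code)


def ascii_rows() -> list[list[str]]:
    """Build 32 rows: the 5 low bits, then one cell per 2-bit group."""
    rows = []
    for low in range(32):
        cells = [cell((group << 5) | low) for group in range(4)]
        rows.append([format(low, "05b")] + cells)
    return rows


print(" " * 7 + "".join(f"{g:<6}" for g in ("00", "01", "10", "11")).rstrip())
for row in ascii_rows():
    print("  ".join(f"{c:<4}" for c in row).rstrip())
```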

Pressing Ctrl + [ triggers the Escape key. You can try this in a terminal or in the vi editor. Similarly, pressing Ctrl + J inserts a new line.

Similarly, the difference between an uppercase and a lowercase character is always a single bit. The ascii value of lowercase v (1110110) can be obtained by applying the bitwise OR operation to uppercase V (1010110) and 32 (0100000).
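The single-bit claim is easy to check for all 26 letters. In this sketch the differing bit is 0100000, which is decimal 32; `swap_case_bit` is a hypothetical helper that flips it with XOR and is only meaningful for ASCII letters.

```python
import string

CASE_BIT = 0b0100000  # decimal 32


def swap_case_bit(ch: str) -> str:
    """Flip the single bit that separates upper and lower case."""
    return chr(ord(ch) ^ CASE_BIT)


print(format(ord("V"), "07b"))  # 1010110
print(format(ord("v"), "07b"))  # 1110110

# Every upper/lower pair differs in exactly that one bit.
assert all(
    ord(upper) ^ ord(lower) == CASE_BIT
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)
```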

What the advantage of this layout is, I do not know. Did we arrive at this design because it saves some CPU cycles or memory? I do not know. But in this context, my long-standing question of why the English letters are so weirdly positioned in the ascii table finally has an answer.