I found some code that sorts names. In this program there should be no difference between "a" and "A".

My question is: what does line 20 do?

#include <iostream>
#include <string>
#define MAXNAMES 100
using namespace std;

int compare(string s1, string s2){
    int i;
    for ( i = 0; s1[i] && s2[i]; ++i)
    {
        /* If characters are same or inverting the 6th bit makes them same */
        if (s1[i] == s2[i] || (s1[i] ^ 32) == s2[i])
            continue;
        else
            break;
    }

    /* Compare the last (or first mismatching in case of not same) characters */
    if (s1[i] == s2[i])
        return 0;
    if ((s1[i]|32) < (s2[i]|32)) //Set the 6th bit in both, then compare
        return -1;
    return 1;
}

For any letter (a-z, A-Z) in the ASCII character set, setting the sixth bit (the bit with value 32) high makes the letter lower case if it was upper case, and does nothing if it was already lower case.

Here's an example with "A".

32   = 0b0100000
A    = 0b1000001

32|A = 0b1100001 = "a"

Now try the same thing with "a"

32   = 0b0100000
a    = 0b1100001

32|a = 0b1100001 = "a"

So line 20 takes two characters, makes sure they're both lower case, and then does the comparison; this way, "b" will come out as lower than "C", which is what we want for a case-insensitive comparison.
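If you want to check this yourself, here is a minimal sketch that assumes an ASCII character set. The names asciiLower and checkAsciiLower are made up for illustration and are not part of the posted program.

```cpp
#include <cassert>

// Hypothetical helper: force the bit with value 32 on.
// For ASCII letters this yields the lower-case letter.
constexpr char asciiLower(char c) { return static_cast<char>(c | 32); }

// Mirrors the worked examples above (ASCII assumed).
void checkAsciiLower() {
    assert(asciiLower('A') == 'a');             // 65 | 32 == 97
    assert(asciiLower('a') == 'a');             // 97 already has the bit set
    assert(asciiLower('b') < asciiLower('C'));  // 'b' (98) < 'c' (99)
}
```

Calling checkAsciiLower() from main should pass silently on an ASCII system.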



if ((s1[i]|32) < (s2[i]|32)) //Set the 6th bit in both, then compare

It's a case insensitive comparison.

Have a look at this ASCII table. The difference between 'A' and 'a' is 32 = 2^5, and uppercase letters all have that bit (the sixth, counting from 1) clear. So if you set it, you get the lowercase version either way.

Note that this only works for the ASCII (or equivalent encoding) characters [A-Za-z]... if the strings have other things in them, this comparison will not behave as expected.
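To make that caveat concrete, here is a small sketch, again assuming ASCII. The names setBit32 and checkNonLetters are hypothetical. It shows that setting the same bit turns some punctuation into different characters, so the posted comparison would order them as if they were the same or related characters.

```cpp
#include <cassert>

// Same trick as in the thread: set the bit with value 32.
constexpr char setBit32(char c) { return static_cast<char>(c | 32); }

// Non-letters are silently changed or left alone (ASCII assumed).
void checkNonLetters() {
    assert(setBit32('@') == '`');  // 64 | 32 == 96
    assert(setBit32('[') == '{');  // 91 | 32 == 123
    assert(setBit32('1') == '1');  // 49 already has the bit set
}
```

So with the posted code "@" and "`" would compare as equal, which is probably not what anyone wants.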


That code is weak in that it makes an assumption about the bit representation of characters. In particular ASCII upper and lower case letters differ only in whether the 6th bit is set, so that code ensures that the bit is the same before doing a comparison.

A much better approach would be to do the conversion with toupper or tolower instead as those don't assume ASCII.
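As a rough sketch of that suggestion (not the original poster's code), the comparison could look something like this. The name compareNoCase is made up, and it keeps the same -1 / 0 / 1 convention as the posted compare().

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical replacement for compare(): returns -1, 0 or 1.
int compareNoCase(const std::string& s1, const std::string& s2) {
    const std::string::size_type n = s1.size() < s2.size() ? s1.size() : s2.size();
    for (std::string::size_type i = 0; i < n; ++i) {
        // Cast to unsigned char first: passing a negative char value
        // to std::tolower is undefined behaviour.
        const int c1 = std::tolower(static_cast<unsigned char>(s1[i]));
        const int c2 = std::tolower(static_cast<unsigned char>(s2[i]));
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    // Common prefix is equal: the shorter string sorts first.
    if (s1.size() == s2.size())
        return 0;
    return s1.size() < s2.size() ? -1 : 1;
}

void checkCompareNoCase() {
    assert(compareNoCase("Apple", "apple") == 0);
    assert(compareNoCase("b", "C") == -1);
    assert(compareNoCase("abc", "ab") == 1);
}
```

std::tolower follows the current C locale, so this does not depend on the bit layout of ASCII, and non-letters such as "@" are left unchanged in the default "C" locale.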
