Is this in the documentation somewhere?
>>> a = b'\x01\x02'
>>> type(a[0])
<class 'int'>
>>> type(a[0:1])
<class 'bytes'>
>>> a = '\x01\x02'
>>> type(a[0])
<class 'str'>
>>> type(a[0:1])
<class 'str'>
Note that in the byte string version, a[0] and a[0:1] return different types, while in the regular string version, both return the same type. Why does it make sense to treat the two cases differently?
I would have expected Python to be more consistent.
Here's a similar example:
>>> s = "hello"
>>> type(s)
<class 'str'>
>>> b = s.encode()
>>> type(b)
<class 'bytes'>
>>> s[0]
'h'
>>> b[0]
104
>>> s[0:1]
'h'
>>> b[0:1]
b'h'
I happen to think this behavior is consistent (with indexing and slicing rules).
Here's why it makes sense:
bytes and str are not the same. They aren't even conceptually the same.
A bytes object (byte string) is a sequence of bytes (I assume you know what a byte is, and that it isn't a "character"). It is data, not text. A string, on the other hand, is a sequence of characters and is text.
As I mentioned earlier, a byte string is a sequence of bytes, so indexing it gives you a single byte, which makes perfect sense. Since a byte is a number (and not a character), what you get is an int in the range 0 to 255. Slicing a sequence gives you the portion of the sequence from a start index up to (but not including) a stop index, as a new sequence. Note that slicing a sequence always gives a sequence. Why? Because you are asking for a portion of the sequence and not just a single element of it. This is why the result is a byte string and not an int. You may notice that while the two results hold the same data (in essence, at least), their types (and representations) are completely different, and reasonably so. This is why b[0] != b[0:1] where b is a byte string.
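To make that concrete, here is a small sketch (the variable names are only for illustration) that checks both results and shows one way to get a one-byte bytes object back from the int, if that is what you actually want:

```python
b = "hello".encode()   # b'hello'

first = b[0]           # indexing: a single element, which is an int
chunk = b[0:1]         # slicing: a new bytes object of length 1

print(type(first), first)   # <class 'int'> 104
print(type(chunk), chunk)   # <class 'bytes'> b'h'

# Wrap the int in a list to build a one-byte bytes object from it.
as_bytes = bytes([first])
print(as_bytes == chunk)    # True
```

Note that bytes(first) without the list would not do the same thing: passing a bare int to bytes() creates that many zero bytes.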
str, on the other hand, is a sequence of characters (text). Indexing a str (which I'll call a string from now on) gives you a character (makes perfect sense again). However, Python has no separate character type: a character is just a str instance of length one, so indexing a string gives you another string. Slicing a string gives you a portion of the string as a new string. If you slice out just one element, what you get is a new string with just one element. This is why s[0] == s[0:1] where s is a string.
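A quick sketch of the string side, again with illustrative names only. It also shows that iteration follows the same rule as indexing, so looping over a string yields one-character strings while looping over a byte string yields ints:

```python
s = "hello"

print(type(s[0]) is type(s[0:1]))   # True: both are str
print(s[0] == s[0:1])               # True: 'h' == 'h'

# Iterating gives the same element type that indexing does.
chars = list(s)
byte_values = list(s.encode())
print(chars)         # ['h', 'e', 'l', 'l', 'o']
print(byte_values)   # [104, 101, 108, 108, 111]
```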