As far as I know, UTF-8 is a variable-length encoding, i.e. a character can be represented as 1, 2, 3 or 4 bytes.
For example, the Unicode character U+00A9 (binary 10101001) is encoded in UTF-8 as
11000010 10101001, i.e. 0xC2 0xA9
The prefix 110 in the first byte indicates that the character is stored in two bytes (I count two ones before the first zero in the prefix 110).
Each of the following bytes starts with the prefix 10.
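To check my understanding of the example, here is a minimal Python sketch that prints the UTF-8 bytes of a character in binary. The helper name `utf8_bits` is my own invention, not a library function; it only relies on the built-in `str.encode`.

```python
def utf8_bits(ch: str) -> list[str]:
    """Return the UTF-8 bytes of a string as 8-digit binary strings.

    Hypothetical helper for illustration only.
    """
    return [format(b, "08b") for b in ch.encode("utf-8")]


# U+00A9 (copyright sign): a 110 lead byte followed by one 10 byte.
print(utf8_bits("\u00a9"))  # ['11000010', '10101001']
```

The first byte carries the 110 prefix and the second byte carries the 10 prefix, which matches the bit pattern written out above.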
A 4-byte UTF-8 encoding would look like
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
The prefix 11110 (four ones and a zero) indicates four bytes, and so on.
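The "count the ones until the first zero" rule can be sketched in Python as follows. This is only my reading of the scheme, not a full validator: `sequence_length` is a hypothetical name, and the function does not reject lead bytes that are invalid for other reasons (such as overlong encodings).

```python
def sequence_length(lead: int) -> int:
    """Number of bytes announced by a UTF-8 lead byte.

    Hypothetical helper; checks only the prefix bits.
    """
    if lead & 0b10000000 == 0b00000000:
        return 1  # 0xxxxxxx: single-byte (ASCII) character
    if lead & 0b11100000 == 0b11000000:
        return 2  # 110xxxxx
    if lead & 0b11110000 == 0b11100000:
        return 3  # 1110xxxx
    if lead & 0b11111000 == 0b11110000:
        return 4  # 11110xxx
    # 10xxxxxx is a following byte, and 11111xxx is not used at all.
    raise ValueError(f"not a UTF-8 lead byte: {lead:#04x}")


print(sequence_length(0xC2))  # 2
print(sequence_length(0xF0))  # 4
```

A byte such as 0xA9 (10101001) makes the function raise `ValueError`, because its 10 prefix marks it as a following byte rather than a first byte.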
Now my question:
Why is the prefix 10 used in the following bytes? What is the advantage of such a prefix? Without the 10 prefix in the following bytes I could use 3*2=6 more bits if I wrote:
11110xxx xxxxxxxx xxxxxxxx xxxxxxxx