Let’s try to insert a 6th character. It should work fine, right?
Well, not really 🙂
By default, SQL Server uses UCS-2 encoding for nvarchar columns.
UCS-2 represents each character in 16 bits (2 bytes). 65,536 chars should be enough for everybody, right? 🙂
Well, not exactly 🙂 Since 2001, many more characters have been added to the Unicode standard, reaching a total of 120,737 chars today (2015, Unicode 8.0). These clearly can’t all be represented in only 2 bytes, so the ones beyond the first 65,536 need more: 4 bytes in UTF-16.
In our case, A, B, C, D… are not letters from the Latin alphabet, but… ‘MATHEMATICAL BOLD CAPITAL A, B, C…’: http://unicode.org/cldr/utility/character.jsp?a=1D400
In UTF-16, MATHEMATICAL BOLD CAPITAL A (U+1D400) is represented in 4 bytes, as two 16-bit code units: 0xD835 0xDC00 (hex).
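If you want to see where those two values come from, here is a minimal Python sketch of the standard UTF-16 surrogate-pair arithmetic. The helper names `utf16_surrogates` and `utf16_code_units` are made up for this illustration, and the six-character sample string (bold A through F) is an assumption standing in for the test string used in this post.

```python
def utf16_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


def utf16_code_units(text: str) -> int:
    """Count 16-bit UTF-16 code units, i.e. what a UCS-2 style length reports."""
    return len(text.encode("utf-16-le")) // 2


high, low = utf16_surrogates(0x1D400)
print(hex(high), hex(low))  # 0xd835 0xdc00

# Hypothetical sample: MATHEMATICAL BOLD CAPITAL A..F (six characters).
sample = "".join(chr(0x1D400 + i) for i in range(6))
print(len(sample), utf16_code_units(sample))  # 6 12
```

Python counts code points, so it reports 6 here, while anything that counts 16-bit units sees 12.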
MS SQL Server will happily accept it, but by default it will treat it as 2 chars. The same happens in the .NET Framework, which returns a length of 12 for the above string: