Determine if a string has all the same characters
The last string demonstrates how Raku can recognize that glyphs made up of different combinations of combining characters can compare the same. It is built up from explicit codepoints to show that each of the glyphs is made up of different combinations.
-> $str {
my $i = 0;
print "\n{$str.raku} (length: {$str.chars}), has ";
my %m;
%m{$_}.push: ++$i for $str.comb;
if %m > 1 {
say "different characters:";
say "'{.key}' ({.key.uninames}; hex ordinal: {(.key.ords).fmt: "0x%X"})" ~
" in positions: {.value.join: ', '}" for %m.sort( *.value[0] );
} else {
say "the same character in all positions."
}
} for
'',
' ',
'2',
'333',
'.55',
'tttTTT',
'4444 444k',
'๐ฌ๐ง๐ฌ๐ง๐ฌ๐ง๐ฌ๐ง',
"\c[LATIN CAPITAL LETTER A]\c[COMBINING DIAERESIS]\c[COMBINING MACRON]" ~
"\c[LATIN CAPITAL LETTER A WITH DIAERESIS]\c[COMBINING MACRON]" ~
"\c[LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON]",
'Aะฮ๊ฎ๐๐ แช'
Output:
"" (length: 0), has the same character in all positions.
" " (length: 3), has the same character in all positions.
"2" (length: 1), has the same character in all positions.
"333" (length: 3), has the same character in all positions.
".55" (length: 3), has different characters:
'.' (FULL STOP; hex ordinal: 0x2E) in positions: 1
'5' (DIGIT FIVE; hex ordinal: 0x35) in positions: 2, 3
"tttTTT" (length: 6), has different characters:
't' (LATIN SMALL LETTER T; hex ordinal: 0x74) in positions: 1, 2, 3
'T' (LATIN CAPITAL LETTER T; hex ordinal: 0x54) in positions: 4, 5, 6
"4444 444k" (length: 9), has different characters:
'4' (DIGIT FOUR; hex ordinal: 0x34) in positions: 1, 2, 3, 4, 6, 7, 8
' ' (SPACE; hex ordinal: 0x20) in positions: 5
'k' (LATIN SMALL LETTER K; hex ordinal: 0x6B) in positions: 9
"๐ฌ๐ง๐ฌ๐ง๐ฌ๐ง๐ฌ๐ง" (length: 4), has the same character in all positions.
"ววว" (length: 3), has the same character in all positions.
"Aะฮ๊ฎ๐๐ แช" (length: 7), has different characters:
'A' (LATIN CAPITAL LETTER A; hex ordinal: 0x41) in positions: 1
'ะ' (CYRILLIC CAPITAL LETTER A; hex ordinal: 0x410) in positions: 2
'ฮ' (GREEK CAPITAL LETTER ALPHA; hex ordinal: 0x391) in positions: 3
'๊ฎ' (LISU LETTER A; hex ordinal: 0xA4EE) in positions: 4
'๐' (OLD ITALIC LETTER A; hex ordinal: 0x10300) in positions: 5
'๐ ' (CARIAN LETTER A; hex ordinal: 0x102A0) in positions: 6
'แช' (CHEROKEE LETTER GO; hex ordinal: 0x13AA) in positions: 7
Last updated
Was this helpful?