It’s sometimes useful, or even necessary, to represent strings containing accented letters, or other letters outside the US-ASCII set, as pure ASCII. For instance, “déjà vu” would become “deja vu”.
This transliteration can be desirable for various reasons, mainly so that the string can be used somewhere only ASCII is supported (or preferred). Some folks call this process deaccenting, as it’s commonly used to remove accents from words in order to make comparisons possible. In practice, accents are not the only problem, and you’ll also want to handle things like ligatures and special letters (such as “ß” or “æ”) and text in non-Latin scripts.
There’s a CPAN module which can help here: Text::Unidecode by Sean M. Burke.
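A minimal sketch of how the module is typically used, assuming Text::Unidecode is installed from CPAN. The sample words are illustrative choices and not taken from the module's documentation; `unidecode()` is the function the module exports by default.

```perl
use strict;
use warnings;
use utf8;                   # the string literals below are written in UTF-8
use Text::Unidecode;        # exports unidecode() by default

# Sample words chosen for illustration only.
for my $word ('déjà vu', 'Málaga', 'Straße') {
    # unidecode() returns a pure-ASCII approximation of its argument
    print unidecode($word), "\n";
}
# Output:
#   deja vu
#   Malaga
#   Strasse
```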
As expected, the output contains only ASCII characters.
As you can see in the module documentation, it’s not meticulous, so it doesn’t always do a good job. However, Text::Unidecode works nicely with Western European languages, along with some others.
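For the comparison use case mentioned earlier, one possible approach is to reduce both strings to a lowercase ASCII key before comparing them. The sketch below assumes that simple approach; `ascii_key` is a hypothetical helper name, not part of Text::Unidecode.

```perl
use strict;
use warnings;
use utf8;                   # the string literals below are written in UTF-8
use Text::Unidecode;

# Hypothetical helper: build an accent-insensitive, case-insensitive key.
sub ascii_key {
    my ($text) = @_;
    # unidecode() yields plain ASCII, so lc() is enough for case folding here
    return lc unidecode($text);
}

print ascii_key('Émile Zola') eq ascii_key('emile zola')
    ? "same\n"
    : "different\n";
# prints "same"
```

Keep in mind that such a key deliberately throws information away: words that differ only by their accents will compare as equal, which may or may not be what you want.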