A really useful tool:
https://github.com/rspeer/python-ftfy
"Here are some examples (found in the real world) of what ftfy can do:
ftfy can fix mojibake (encoding mix-ups), by detecting patterns of characters
that were clearly meant to be UTF-8 but were decoded as something else:
import ftfy
ftfy.fix_text('✔ No problems')
'✔ No problems'
Does this sound impossible? It's really not. UTF-8 is a well-designed encoding
that makes it obvious when it's being misused, and a string of mojibake usually
contains all the information we need to recover the original string."
Share and enjoy,
*** Xanni ***
--
mailto:xanni@xanadu.net Andrew Pam
http://xanadu.com.au/ Chief Scientist, Xanadu
https://glasswings.com.au/ Partner, Glass Wings
https://sericyb.com.au/ Manager, Serious Cybernetics