Encode Unicode strings in a silly way, based on glounicode
Basic structure for machine learning projects