Duplicate and near-duplicate documents in the web: detection by means of fuzzy-hash techniques (December 31, 2011)