I don't think this means it has to scan the file directly. It could instead keep the hashes of all files that were taken down and stop those files from being uploaded again. This would also be fairly unintrusive, but could add a few false positives.
That is not acceptable.
This would also be fairly unintrusive, but could add a few false positives.
If this were the case, we'd have a much bigger problem on our hands.
Even considering the birthday problem, the chance of such a collision is astronomically small, especially if you combine the hash with the file size, which you always have anyway.
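For a rough sense of scale, here is a back-of-the-envelope sketch of the birthday bound. It assumes a 256-bit hash that behaves like a uniform random function; the thread does not say which hash a site would use, so the bit width and the function name `collision_probability` are only illustrative.

```python
import math

def collision_probability(num_files: int, hash_bits: int = 256) -> float:
    """Approximate chance that any two of num_files share a hash.

    Uses the standard birthday approximation
    p ~= 1 - exp(-n(n-1) / (2 * 2**bits)),
    assuming hash outputs are uniformly distributed (an assumption).
    """
    exponent = num_files * (num_files - 1) / (2 * 2 ** hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)

# A trillion stored files, which is likely far more than any single host has.
p = collision_probability(10 ** 12)
print(f"{p:.3e}")
```

Even at a trillion files the result stays well below 1e-50, which is why accidental false positives from the hash alone are not a realistic concern.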
In fact I'd guess that sites like these already do exactly that in order to avoid hosting duplicates (if not handled at the file system level).
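As a minimal sketch of what such a check could look like, assuming SHA-256 and an in-memory set (a real site would presumably use a database, and nothing here is taken from any actual host's implementation). The names `fingerprint` and `TakedownBlocklist` are hypothetical.

```python
import hashlib
import io

def fingerprint(stream, chunk_size=1 << 20):
    """Return (sha256 hex digest, size in bytes) for a binary stream.

    Reads in chunks so large uploads are never held in memory at once.
    """
    digest = hashlib.sha256()  # hash choice is an assumption
    size = 0
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
        size += len(chunk)
    return digest.hexdigest(), size

class TakedownBlocklist:
    """Remembers fingerprints of removed files and rejects re-uploads."""

    def __init__(self):
        self._blocked = set()

    def block(self, stream):
        # Store hash and size together, so a match needs both to agree.
        self._blocked.add(fingerprint(stream))

    def is_blocked(self, stream):
        return fingerprint(stream) in self._blocked

blocklist = TakedownBlocklist()
blocklist.block(io.BytesIO(b"contents of a removed file"))
print(blocklist.is_blocked(io.BytesIO(b"contents of a removed file")))
print(blocklist.is_blocked(io.BytesIO(b"some unrelated upload")))
```

Note that this only catches byte-identical re-uploads. Changing a single byte produces a different fingerprint, so it never has to inspect what the file actually contains.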