Finding similarities between unstructured documents