A framework for obtaining structurally complex condensed representations of document sets in the biomedical domain