Frequency Balanced Datasets Lead to Better Language Models