Corpus representativeness for syntactic information acquisition