Word-based self-indexes for natural language text