The Getty Thesaurus of Geographic Names (TGN) is “a hierarchical vocabulary of around 1.1 million names, and coordinates and other information for around 892,000 geographic places.” (From Getty Vocabularies Download Center)
In other words, it is a controlled vocabulary of place names that can be searched online or, with permission, downloaded in XML form (or as a relational database or MARC). I wonder if this could be used to create text engines that search by place and use the TGN records (which contain hierarchical information) to provide context. To put it another way, is TGN an ontology?
A quote on its hierarchical structure:
More about scope and structure: The TGN is a hierarchical database; its trees branch from a root called Top of the TGN hierarchies (Subject_ID: 1000000); it currently has one published facet, World. Under the World, the places are generally arranged in hierarchies representing the current political and physical world, although some historical nations and empires are also included. There may be multiple broader contexts, making the TGN polyhierarchical. In addition to the hierarchical relationships, the TGN has equivalent and associative relationships; thus it is a thesaurus, in compliance with ISO and NISO standards. (From About the TGN)
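To make the idea of using the hierarchy for context a little more concrete, here is a minimal Python sketch of a polyhierarchy in which a place can have more than one broader context. Only the root Subject_ID (1000000) comes from the quote above. Every other ID is a made-up placeholder, the handful of places is illustrative rather than copied from TGN records, and the function names are my own, not part of any Getty tool.

```python
# Root of the hierarchy, as given in the quote above.
ROOT_ID = 1000000

# Hypothetical IDs and sample places, NOT real TGN records.
NAMES = {
    ROOT_ID: "Top of the TGN hierarchies",
    1: "World",
    2: "Europe",
    3: "Roman Empire",   # a historical context
    4: "Italy",
    5: "Rome",
}

# Each place lists its broader contexts; more than one makes it polyhierarchical.
BROADER = {
    ROOT_ID: [],
    1: [ROOT_ID],
    2: [1],
    3: [1],
    4: [2],
    5: [4, 3],           # Rome sits under both a current and a historical parent
}


def ancestors(subject_id):
    """Return the set of all broader contexts reachable from a place."""
    seen = set()
    stack = list(BROADER[subject_id])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(BROADER[current])
    return seen


def places_within(region_id):
    """Return the sorted names of every place that has region_id as a broader context."""
    return sorted(
        NAMES[pid] for pid in BROADER if region_id in ancestors(pid)
    )


if __name__ == "__main__":
    print(places_within(2))   # places under "Europe"
    print(places_within(3))   # places under "Roman Empire"
```

A search engine built along these lines could expand a query for a region into the names of the places beneath it, or attach the broader contexts to a matched place name to help disambiguate it.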
Another source of geographic names is the NGA GEOnet Names Server (GNS). There you can download the names, without permission, in a format suitable for information systems.