Using Concept Maps as a Cross-Language Resource Discovery Tool


Ryan Richardson

Virginia Tech


Concept maps, introduced by Novak, are an aid to learners’ understanding. I hypothesize that concept maps can also serve as summaries of large documents (e.g., electronic theses and dissertations, or ETDs). At the Digital Library Research Laboratory at Virginia Tech, we have developed a system that automatically generates concept maps from English-language ETDs in the computing field. The system will also provide Spanish translations of these concept maps for native Spanish speakers. Based on the results of our enhanced machine translation techniques, we believe concept maps could allow researchers to discover pertinent dissertations in languages they cannot read, helping them decide whether to have a potentially relevant dissertation translated. Our system uses Relex, a state-of-the-art natural language processing system, first to extract noun phrases and noun-verb-noun relations from electronic dissertations, and then to produce concept maps automatically. We have also incorporated information from the tables of contents of ETDs to create novel styles of concept maps. Currently, we are producing concept maps for the Virginia Tech CS collection (175 ETDs), which covers a broad range of computer science topics. We have amassed a collection of about 580 Spanish-language ETDs from Scirus and two Mexican universities, and we are using this corpus to mine phrase translations that would not be found in online dictionaries or phrase lists. We have also tested the usefulness of the automatically generated and translated concept maps in a user experiment conducted at Universidad de las Americas (UDLA) in Puebla, Mexico. This experiment provides insight into whether concept maps can augment abstracts (translated using a standard machine translation package) in helping Spanish-speaking users find ETDs of interest.
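
As a rough illustration of the pipeline described above, the following minimal Python sketch groups noun-verb-noun triples into a concept map and then relabels it using a phrase table. It does not call Relex and is not the system's actual implementation. The function names, the sample triples, and the small English-to-Spanish phrase table are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict


def build_concept_map(triples):
    """Group (concept, linking phrase, concept) triples by source concept.

    The triples stand in for noun-verb-noun relations that an NLP
    extraction step would produce (assumed input format).
    """
    concept_map = defaultdict(list)
    for source, link, target in triples:
        concept_map[source].append((link, target))
    return dict(concept_map)


def translate_concept_map(concept_map, phrase_table):
    """Relabel concepts and linking phrases using a phrase table.

    Labels missing from the table are kept in the source language.
    Source concepts that share a translation are merged.
    """
    def translate(label):
        return phrase_table.get(label, label)

    translated = defaultdict(list)
    for source, edges in concept_map.items():
        for link, target in edges:
            translated[translate(source)].append(
                (translate(link), translate(target))
            )
    return dict(translated)


# Hypothetical triples, as might be extracted from an ETD.
triples = [
    ("digital library", "stores", "electronic thesis"),
    ("digital library", "supports", "resource discovery"),
    ("concept map", "summarizes", "electronic thesis"),
]

# Hypothetical phrase table, as might be mined from a bilingual corpus.
phrase_table = {
    "digital library": "biblioteca digital",
    "electronic thesis": "tesis electrónica",
    "concept map": "mapa conceptual",
    "stores": "almacena",
    "summarizes": "resume",
}

english_map = build_concept_map(triples)
spanish_map = translate_concept_map(english_map, phrase_table)

for concept, edges in spanish_map.items():
    for link, target in edges:
        print(f"{concept} --{link}--> {target}")
```

In this sketch, "supports" and "resource discovery" are deliberately left out of the phrase table, so they remain in English in the translated map. Gaps of this kind are what mining phrase translations from a technical corpus is meant to fill.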

Ryan Richardson is finishing a Ph.D. in the Department of Computer Science at Virginia Tech. He has been a member of the Digital Library Research Laboratory at VT for seven years. He holds an M.A. in computer science from Bellarmine University in Louisville, Kentucky, and an M.S. in computer science from Virginia Tech. His research is in automatic generation of concept maps, automatic translation of concept maps, and automatic mining of phrase translations from large technical corpora. He has presented papers and posters at the Joint Conference on Digital Libraries (JCDL), the European Conference on Digital Libraries (ECDL), and CMC 2006, in addition to having papers accepted at several other conferences. He is a member of the Association for Computing Machinery (ACM).