Text analysis
Computational thinking and analysis of texts
Image credit: Gerd Altmann at Pixabay
Text analysis project
The Computational Thinking and Learning Initiative co-sponsored a textual analysis working group during the fall 2019 semester. The members of the working group, including four undergraduate Library Buchanan Fellows, learned to analyze textual data at different scales, looking for hidden patterns in their data and finding ways to represent query results succinctly. During the first half of the semester, the students learned the rudiments of XQuery, a query language that excels at finding results in semi-structured data, ranging from literary texts encoded in TEI to bibliographic data encoded in JSON. In the second half of the semester, participants queried large sets of textual data using Apache Spark, a framework for processing distributed data sets, and RumbleDB, an emerging query engine that runs on Spark and implements JSONiq, a query language derived from XQuery. By the end of the semester, members of the working group were able to extract information from big data sets in the humanities, social sciences, and other disciplines with greater ease and confidence.