Portfolio Activity 4: Text Analysis

I find that computational text analysis has great research value and would be of definite use to me as a historian. To begin, it can help stimulate inquiry questions. Computational text analysis is “distant reading” and can therefore be used to analyze a large group of documents at once. Since computers can easily do millions of calculations, you can use an extremely large sample instead of being limited to what you could physically read yourself. In “Mining the Dispatch”, for example, computational text analysis was used for topic modeling: the input was a collection of over 100,000 documents, each of which was categorized by topic. One could apply computational text analysis the same way to generate a list of topics and then develop inquiry questions about a group of documents. However, a potential problem with this approach, and with computational text analysis in general, is that it will not be 100% accurate. “Mining the Dispatch” gives graphs that display the frequency of certain topics as generated by computational text analysis and compares them to the true values. While the results are often very accurate, there are some abnormalities in the fugitive slave ads graph that show inaccuracies that should be taken into account. The reason is that computational text analysis is “distant reading” and not “close reading”. Close reading is when a person combs through a text to truly discern its purpose, tone, setting, etc. Distant reading is different: a computer cannot read like a human because it does not understand what it is reading. Instead, a computer has to view the words individually as tokens.
It can then group those tokens or take them individually to determine how often a word appears in a text, or which word is most likely to appear next to another word. This is how the topic modeling in “Mining the Dispatch” was done. While distant reading has its uses, true understanding of a text comes from close reading, so that is where I would draw the line on the use of computational text analysis. Still, computational text analysis is a great first step in an approach to historical research. For example, I could use it in a large electronic library to find key words and phrases relevant to my research question. This would point me toward a group of documents that might serve as evidence. Then I could do a close reading of those documents to gain a deeper understanding, contextualize them, and corroborate between them. Using computational text analysis in this way speeds up the initial research process so I can spend more time analyzing documents.
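To make the idea of tokens more concrete, here is a minimal Python sketch of the two basic counts described above: how often each word appears, and which word most often follows another. It is only an illustration and not the method used in “Mining the Dispatch”, which did full topic modeling. The sample sentence and the function names (tokenize, word_frequencies, next_word_counts) are my own assumptions, made up for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how many times each token appears."""
    return Counter(tokens)


def next_word_counts(tokens):
    """For each token, count which tokens come immediately after it."""
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


# Hypothetical sample text, invented for illustration; not taken from any real document.
sample = (
    "The reward is offered for the return of the runaway. "
    "The reward will be paid in cash."
)

tokens = tokenize(sample)
freqs = word_frequencies(tokens)
following = next_word_counts(tokens)

# The most frequent words in the sample.
print(freqs.most_common(3))
# The word that most often follows "the" in the sample.
print(following["the"].most_common(1))
```

Notice that the computer never understands what a “reward” is. It only counts strings of letters, which is why a close reading is still needed afterward.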
