What Can Digital Humanities Do With Crowdsourcing?

Crowdsourcing allows historians in the Digital Humanities field to create projects that open up new ways of analyzing material. The authors of The Collective Wisdom Handbook state, “Crowdsourcing can open up new possibilities for research that could not have existed otherwise, fundamentally changing authoritative practices, definitions, and classifications.” (The Collective Wisdom 2) Along with changing the way one analyzes data, there is a component of collaboration. The authors explain, “At its best, crowdsourcing increases public access to cultural heritage collections while inviting the public to engage deeply with these materials.” (The Collective Wisdom 3) The public’s collaboration with these projects is one of the main components of crowdsourcing. These projects come about in many different ways, as do the tasks the public undertakes. One of the best examples of these tasks is transcription.

Transcription in crowdsourcing projects puts the task directly into the public’s hands. The project creators upload documents and ask volunteers to help transcribe them for an online archive. One of the first projects to bring this idea forward was Transcribe Bentham. The project uses documents created by philosopher Jeremy Bentham “with the intention of engaging the public with Bentham’s thought and works, creating a searchable digital repository of the collection, and quickening the pace of transcription and publication by recruiting unpaid online volunteers to assist in transcribing the remaining manuscripts.” (Causer, Tonra, and Wallace, Transcription maximized, 120) The project asks volunteers to transcribe the uploaded documents in a collaborative effort to build an archive of Bentham’s papers.

Another project that uses transcription is By the People. Created by the Library of Congress and designed similarly to Transcribe Bentham, it encourages volunteers to help transcribe uploaded documents to build an archive. The documents vary widely and are organized “into ‘Campaigns’ and presented to volunteers along with transcription conventions, a discussion platform, and explanatory material to help folks learn a bit about the subjects of the documents.” (Hyning and Jones, Data’s Destinations, 9) Learning about the material is one way these projects keep participants engaged in transcribing the documents. With Transcribe Bentham, the creators discovered that volunteers “were motivated by a sense of contributing to the greater good by contributing to the production of the Collected Works and making available Bentham’s writings to others, whereas some even found transcribing fun.” (Causer, Tonra, and Wallace, Transcription maximized, 127) To keep volunteers excited about and engaged in transcribing the material, they need to feel that they play a significant role in the process, and it has to be fun.

How to Read Crowdsourced Knowledge

Wikipedia has carried a contested reputation since its inception in 2001. To this day most educational spaces treat the website as taboo: a site filled with errors that users should avoid at all costs. Alexandria Lockett, in her article “Why Do I Have Authority to Edit the Page? The Politics of User Agency and Participation on Wikipedia” (2020), states, “Wikipedia was clearly shaking up the education system back then, and it continues to be taught as a forbidden space. Throughout my undergraduate studies, my peers and I noticed and discussed that our professors were increasingly issuing threats and warnings about using and citing Wikipedia.” (208) Despite those threats, students and others continue to use the website for information on various topics. Roy Rosenzweig, in his article “Can History Be Open Source? Wikipedia and the Future of the Past” (2006), believes this can be a good thing for educational and historical purposes. He states, “One reason professional historians need to pay attention to Wikipedia is because our students do . . . We should not view this prospect with undue alarm. Wikipedia for the most part gets its facts right . . . And the general panic about students’ use of Internet sources is overblown.” (136) As Rosenzweig implies, Wikipedia can be a great starting point for research by students and historians—one reason is its Talk and History pages.

The Talk page on a Wikipedia article is a section where users can raise concerns, propose edits, or discuss any other topics they feel are essential to the article. This section can give users a sense of how the article has changed over time and what different contributors see as its critical issues. When I looked at the Talk section for the “Digital Humanities” article, I noticed various concerns about its organization, particularly its references and sources.

This attention to the sources tells me the contributors are concerned about where the information comes from and whether it is accurate. This conclusion lines up with what I found on the article’s History page. The History page for a Wikipedia article lists every revision made to that article, showing the user who made it, the size of the edit, and when it took place.
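The same revision data can also be pulled programmatically. Here is a minimal sketch using the public MediaWiki API; the query parameters request exactly the fields the History page displays (user, timestamp, and edit size):

```python
import requests

# Query the MediaWiki API for an article's revision history.
API_URL = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",
    "format": "json",
    "titles": "Digital humanities",
    "prop": "revisions",
    "rvprop": "user|timestamp|size",  # same fields the History page shows
    "rvlimit": 10,                    # most recent 10 revisions
}

response = requests.get(API_URL, params=params)
pages = response.json()["query"]["pages"]

# The result is keyed by page ID; print each revision's metadata.
for page in pages.values():
    for rev in page.get("revisions", []):
        print(rev["timestamp"], rev["user"], rev["size"])
```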

When I looked at the page, I noticed three contributors made significant changes to the article: Elijahmeeks, Gabrielbodard, and SimonMahony. I decided to click on their names to learn more about them, and I found that they were all associated with the Digital Humanities in some capacity. This association lines up with the Talk page and its concerns about sources: people in the Digital Humanities field care about where information comes from and whether it is accurate, which is why the issue comes up repeatedly on the Talk page.

Recently, there have been concerns about using A.I. for crowdsourcing information, especially as a threat to Wikipedia. In his article “Wikipedia’s Moment of Truth” (2023), Jon Gertner states, “On a conference call in March that focused on A.I.’s threats to Wikipedia, as well as the potential benefits, the editors’ hopes contended with anxiety. While some participants seemed confident that generative A.I. tools would soon help expand Wikipedia’s articles and global reach, others worried about whether users would increasingly choose ChatGPT . . . .” (36) There are many reasons why A.I. worries experts who study crowdsourced information. When I was using ChatGPT, I noticed a lack of citations for the information the program provides. This absence highlights how well Wikipedia handles crowdsourced information, since the site emphasizes citing sources. However, there are still possibilities for A.I. When I looked at the main Wikipedia article for Digital Humanities, I noticed that the Tools section was a little light, mentioning only a few tools. When I asked ChatGPT for examples of different tools, it gave me several more than the article lists. I can see a use for ChatGPT and other A.I. programs in crowdsourcing information. Nonetheless, the problem with citations still gives Wikipedia the edge in proving accuracy.

Compare Tools

The digital tools Voyant, Kepler.gl, and Palladio are handy for historians and researchers analyzing digital material. Researchers use them to look for relationships within material, find patterns and trends, and draw meaning and understanding from it. They can do this because of the visualizations these three tools create, which make specific patterns easier to find. While researchers use these tools toward similar goals, the tools themselves differ greatly in execution.

Voyant is a text-mining tool researchers use to draw meaning from a data set. It presents five tools researchers can use together to derive meaning from selected texts. The Cirrus tool is a word cloud showing the words that appear most frequently. The Trends tool, like Cirrus, focuses on frequency: it graphs how often a specific word appears across the document. The Reader tool displays the document itself, where users can select a specific word to locate in the text. The Summary tool gives an overview of how a selected word figures in the document. Finally, the Contexts tool shows each occurrence of the word with its surrounding text. Used together, these tools can reveal how words tell a lot about a period, place, people, and culture.
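Voyant runs entirely in the browser, but the counts behind Cirrus, Trends, and Contexts are easy to approximate. Below is a small illustrative sketch in Python, with an invented sample text standing in for an uploaded document; this is not Voyant's own code, just the same idea in miniature:

```python
from collections import Counter
import re

# Invented sample text standing in for a document loaded into Voyant.
text = """The plantation stood by the river. The river fed the fields,
and the fields fed the plantation."""

# Tokenize: lowercase word tokens, roughly like Voyant's term extraction.
words = re.findall(r"[a-z']+", text.lower())

# Drop common stop words, as Voyant does by default.
stop_words = {"the", "and", "by", "a", "of"}
terms = [w for w in words if w not in stop_words]

# Cirrus-style data: the most frequent terms.
print(Counter(terms).most_common(5))

# Contexts-style data: each occurrence of a term with surrounding words.
target = "river"
for i, w in enumerate(words):
    if w == target:
        print(" ".join(words[max(0, i - 3):i + 4]))
```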

Kepler.gl is a digital mapping tool researchers can use to discover patterns and trends across space. Even a simple point map can help researchers derive meaning from their data by showing the relationships between points. However, that is only part of what Kepler.gl can do with a map and data. Researchers can change the map to look at the data in different ways. Instead of points, one can switch to a cluster map to see and compare quantities in the data. A heat map does something similar, showing the density of the data. Other elements can also be added, including a timeline showing how the data changed over time. All of this can give researchers more information than traditional research alone.

Similar to Kepler.gl, Palladio can also make use of maps in a digital medium. However, the tool’s main component is the network graph, which shows relationships within data. These graphs, overlaid on a map, can show connections across space. Nonetheless, some of the most helpful information comes from looking at the network graphs themselves and limiting which data is present. Limiting the data can reveal connections that go unseen, or are harder to see, in traditional research. For example, when I worked with Palladio, I limited the information to only the topics discussed by male and female interviewees. While both groups shared similar topics, some were limited to a particular group. These topics tell me a lot about the subjects themselves and the culture at the time of their interviews.

While these three tools differ in execution, researchers should treat them as complements rather than alternatives. Used together, they can surface meaning, patterns, and trends in the material a researcher chooses to study. While Palladio can pair maps with its network graphs, its maps convey only a limited amount of information; a researcher should also use Kepler.gl to draw out what Palladio’s maps lack. At the same time, Palladio relies heavily on words, especially in its network graphs, and Voyant makes the perfect complement for delving deeper into the words in the data.

Network Analysis with Palladio

Network graphs are helpful tools that researchers can use to convey the meaning of connections within information. According to Ruth Ahnert, Sebastian E. Ahnert, Catherine Nicole Coleman, and Scott B. Weingart, in their book The Network Turn: Changing Perspectives in the Humanities (2020), “The conventional network graph of node and edge (points connected by lines) makes it possible to convey a tremendous amount of information all at once, in one view. Networks express an internal logic of relationships between entities that is inherently intuitive.” (The Network Turn 57) The connections shown through these graphs can give researchers information about the material they are studying.

Many projects have used network graphs to better understand, draw meaning from, and answer questions about the material they graph. One such project is Viral Texts. In this project, researchers, including Ryan Cordell, examined nineteenth-century American newspapers and their connections. They found that these newspapers recirculated many articles, which made the researchers wonder why. The answers to many of their questions lay in how the different types of recirculated texts related to the culture of the Antebellum period in the United States. Mapping the Republic of Letters is another project that used network graphs, in this case overlaid on a map to show connections between the letters of historical figures. From these connections and the use of a map, researchers discovered the importance of travel and that the letters were a way “of communicating ideas and shaping opinion, and also as a process of intellectual self-definition.” (“Historical Research in a Digital Age” 407-9) Another project that uses network graphs is Linked Jazz. Like the other two projects, Linked Jazz uses these graphs to show connections within the researchers’ material, in this case among jazz musicians. The researchers discovered that “data about concert performances and recording dates gives . . . rich information about not just collaborations between musicians, but also about time and place, musical works, songs, and songwriters, and record labels and releases . . . .” (Hwang, Levay, and Provo, “Contributing to Linked Jazz” 2015) Network graphs can show the connections between material and convey meaning about a time, place, people, or even a genre of music.

When I started with Palladio and network graphs, I discovered much information and meaning from these tools. A simple network graph overlaid on a map shows connections across space; it gave me a sense of how and where these connections happened.

I then started playing around with the network graphs and limiting certain information to see what I could find. One such graph shows the topics that male and female interviewees discussed. While they shared many topics, some were limited to only male or only female interviewees. From that, I gathered that those topics mattered to that gender for a specific reason. For example, only female interviewees discussed elections, which made me think about how they, unlike their male counterparts, did not have the right to vote; that is probably why the topic was important to them.
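The same limiting step can be sketched outside Palladio. Below is a hypothetical reconstruction with the networkx library, using invented interviewee names and topics in place of the real data:

```python
import networkx as nx

# Invented stand-in data: (interviewee, gender, topic) triples.
records = [
    ("Alice", "female", "elections"),
    ("Alice", "female", "family"),
    ("Betty", "female", "elections"),
    ("Carl",  "male",   "work"),
    ("Carl",  "male",   "family"),
    ("David", "male",   "work"),
]

# Build a bipartite graph of interviewees and topics.
G = nx.Graph()
for person, gender, topic in records:
    G.add_node(person, kind="person", gender=gender)
    G.add_node(topic, kind="topic")
    G.add_edge(person, topic)

# Limit the data the way Palladio's facet filter does: gather the
# topics connected to each gender, then compare the two sets.
female = {n for n, d in G.nodes(data=True) if d.get("gender") == "female"}
male = {n for n, d in G.nodes(data=True) if d.get("gender") == "male"}
female_topics = {t for p in female for t in G.neighbors(p)}
male_topics = {t for p in male for t in G.neighbors(p)}

print("Only female:", female_topics - male_topics)  # {'elections'}
print("Shared:", female_topics & male_topics)       # {'family'}
```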

Another graph shows the topics categorized by the interviewees’ jobs. Again the topic of elections stood out to me, since it was limited to people who worked in the house. It led me to think that perhaps they heard more gossip and political talk from their masters while working in the house.

Mapping with Kepler.gl

Mapping in the Digital Humanities is a valuable tool for researchers looking for patterns and trends across space. Todd Presner and David Shepard, in their article “Mapping the Geospatial Turn” (2016), state, “On its most basic level, a map is a kind of visualization that uses levels of abstraction, scale, coordinate systems, perspective, symbology, and other forms of representation to convey a set of relations.” (Presner and Shepard 247) Researchers can use those patterns and trends to gather meaning and understanding about the material they are researching. Researchers use these maps to find patterns and form meanings in different ways, including “historical mapping of ‘time-layers’ to memory maps, linguistic and cultural mapping, conceptual mapping, community-based mapping, and forms of counter-mapping that attempt to de-ontologize cartography and imagine new worlds.” (Presner and Shepard 247)

Many projects have used maps to find trends and show meaning in the material they present. One such project is Photogrammar, created and updated from 2012 to 2016, a website showcasing photographs from the FSA-OWI archive. The project “began as a response to the challenges of navigating the digital and physical archive at the LoC [Library of Congress]” (Arnold 2). It uses what its creators call “generous interfaces” to help gather meaning through different modes of visualization (Arnold 3). Another project that uses maps is Histories of the National Mall. This project is a website that uses Google Maps to place markers around the National Mall in Washington, D.C. The markers identify buildings, statues, monuments, and areas that hold historical significance for the city and country. Sheila Brennan states, “Our key strategy for making the history of the National Mall engaging for tourists was to populate the website with surprising and compelling stories and primary sources that together build a textured historical context for the space and how it has changed over time.” (Brennan) Mapping the Gay Guides is another project that uses maps to find trends and patterns in its material. The project draws on a popular, life-saving book called Bob Damron’s Address Book, placing markers on a map for the bars the book mentions. The project explores topics including race, gender, and sexuality, with each topic yielding trends and meanings from the material and the map (Regan and Gonzaba).

When using maps with Kepler.gl, I was surprised by what I could discover from the material. I had only used maps once before in my projects, and those were simple Google Maps, as in Histories of the National Mall, with which I did little analysis. Working with Kepler changed my perspective on using maps for research. Even a simple point map can show relationships between different points, but a map can do far more. Cluster and heat maps show quantity and density in ways a point map cannot, and a timeline can be added to show how an area and its material changed over time. A map can reveal much more about the material a researcher presents.
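Kepler.gl also ships as a Python/Jupyter widget, so the same exploration can start from a script. Here is a minimal sketch, assuming invented point data and relying on kepler.gl's auto-detection of latitude/longitude column names:

```python
import pandas as pd
from keplergl import KeplerGl

# Invented point data; kepler.gl auto-detects columns named
# "latitude" and "longitude" and renders a point layer.
df = pd.DataFrame({
    "name": ["Site A", "Site B", "Site C"],
    "latitude": [38.889, 38.897, 38.885],
    "longitude": [-77.035, -77.036, -77.050],
    "date": ["1937-05-01", "1937-06-12", "1938-01-09"],  # enables a time filter
})

# Create the map widget and hand it the data. Inside Jupyter the map
# renders inline; point, cluster, and heatmap layers (plus a timeline
# filter on the date column) can then be configured in the side panel.
kepler_map = KeplerGl(height=500)
kepler_map.add_data(data=df, name="sites")

# Outside Jupyter, save the interactive map to a standalone HTML file.
kepler_map.save_to_html(file_name="sites_map.html")
```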

Text Analysis with Voyant

Text mining encompasses a range of tools historians and researchers use to gather meaning from a set of information or data. Lincoln A. Mullen, creator of the America’s Public Bible project, states, “The project uses the techniques of machine learning and text analysis to find the quotations, and it uses data analysis and visualization to make sense of them.” (2) Text mining allows analysis of a data set in order to draw meaning from it. The types of information a researcher can use vary. In the case of the America’s Public Bible and Signs@40 projects, researchers used written sources, newspapers and articles respectively, to gather their information. However, the sources do not have to be written. In the Robots Reading Vogue project at Yale University, researchers used visual sources, in the form of magazine covers, along with written sources to accumulate data. From the data collected, researchers then ask questions, usually looking for trends and patterns and drawing meaning from those. Historians might want to understand the period or events associated with the information, so they look for trends or patterns in the collected data to find answers.

Voyant is an excellent text-mining tool for helping historians and researchers find those trends and patterns. Researchers can load information into the free online tool and investigate the data using the five tools provided. The five tools work together on a corpus, making it easy to find the trends and patterns in the information provided.

When using the tool, I worked with data from the WPA Slave Narratives collection. I wanted to compare different states and look at the similarities and differences between them, so I examined Georgia and Virginia and their most used words through their respective word clouds. In Georgia’s word cloud, I noticed a variety of stereotypically Southern words, including plantation, marster, and slaves. The most used word, however, was old.

In comparison, Virginia’s word cloud contained more dialect words, including ah, yo, tuh, and yer. However, the most used word was slaves.

Looking at the narratives behind the word clouds, Virginia’s focused more on rendering the dialect of the enslaved people, while Georgia’s dealt more with life in the South.
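The comparison I made by eye in Voyant can also be phrased as a small computation. Here is a rough sketch with invented snippets standing in for the Georgia and Virginia narratives (the real collections are far larger):

```python
from collections import Counter
import re

# Invented snippets standing in for the two state collections.
georgia = "Old marster kept the old plantation. The slaves worked the old fields."
virginia = "De slaves, dey worked. Ah tol' yo, de slaves never stopped."

def top_words(text, n=5):
    """Count lowercase word tokens, skipping a few stop words."""
    stop = {"the", "a", "de", "dey"}
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop).most_common(n)

print("Georgia:", top_words(georgia))    # 'old' dominates
print("Virginia:", top_words(virginia))  # 'slaves' dominates
```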

Why Metadata Matters

Metadata is considered “data about data” (Carbajal and Caswell). According to the Jisc article, metadata “is usually structured textual information that describes something about the creation, content, or context of a digital resource—be it a single file, part of a single file, or a collection of many files.” (Jisc) Metadata describes every part of a digital item, and it is essential in the field of digital humanities. If there is a picture of an object but no information attached to it, people cannot use that digital item; worse, the item is not even discoverable (DPLA). For example, if I did not add any metadata to the images of my kitchen item, other people would not be able to search for and use them.

Several metadata categories are essential in the realm of digital humanities. The first is the description of an item. The Carbajal and Caswell article states, “For archivists, preservation and description are key ingredients in making a collection of records ‘archival.’” (Carbajal and Caswell) A digital item should have a thorough description so that a person knows what the item is; without descriptions on my kitchen items, people would not know what the images show. Two other categories are equally essential and go hand in hand: creator and rights. People can find all kinds of digital images, but one must know the rights attached to them. I cannot just find an image on the internet and decide to use it simply because I want to; I need to make sure there is no copyright claim on it. Rights lead directly to the creator element: some rights claims require asking the original creator for permission to use an image, so it is essential to record the creator when creating metadata.
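These categories map onto standard schemas such as Dublin Core. Here is a hypothetical record for one of my kitchen items, sketched as a plain Python dictionary; the field names follow Dublin Core elements and all values are invented:

```python
# A hypothetical Dublin Core-style record for a kitchen item.
record = {
    "title": "Cast-iron skillet",
    "description": "A 10-inch cast-iron skillet with a pour spout, "
                   "photographed from above on a white counter.",
    "creator": "Jane Doe (photographer)",  # who made the image
    "date": "2023-10-04",
    "rights": "CC BY 4.0",                 # how others may reuse it
    "subject": ["kitchenware", "cookware", "cast iron"],
}

# Without the description, creator, and rights fields, the image could
# be neither found in a search nor reused with confidence.
for field, value in record.items():
    print(f"{field}: {value}")
```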

Tropy is an excellent program for helping digital humanities practitioners work with metadata: people can use it to input and describe sources. I found it helpful and easy to use in the kitchen items assignment, where it made it extremely easy to import the pictures and describe them with metadata. Omeka is another program that helps practitioners work with metadata, especially combined with Tropy. Omeka is a web platform where people can create digital exhibits; researchers can export their data from Tropy, import it into Omeka, and then build exhibits with it. I did exactly that with my items: after importing them into Omeka, I could add information, change the layout, and add pictures of the kitchen items.
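To illustrate the hand-off: Tropy can export a project as JSON-LD, which a script or import tooling can then read. Below is a sketch under the assumption of a typical export file with an invented filename; the exact metadata field names depend on the Tropy template used:

```python
import json

# Load a Tropy JSON-LD export (hypothetical filename). Exports group
# items under an "@graph" key; the fields present depend on the
# template the project used.
with open("kitchen-items.json", encoding="utf-8") as f:
    export = json.load(f)

for item in export.get("@graph", []):
    # .get() guards against fields a given template may omit.
    title = item.get("title", "(untitled)")
    rights = item.get("rights", "no rights recorded")
    print(f"{title}: {rights}")
```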

Database Review: EBSCOhost

Overview:

EBSCOhost is a database that provides a wide range of journal articles, e-books, and other sources for research. According to the U.S. Department of the Interior, “EBSCO’s online databases provide access to thousands of journals and reference sources in a wide variety of subjects. EBSCO’s leading online full-text databases offer access to articles from peer-reviewed journals published by many of the world’s most prestigious publishers.” The database covers a vast range of topics, from Art and Architecture to Race Relations Abstracts, giving researchers narrowly focused collections and sources. EBSCO strives to provide “libraries, health care and medical institutions, corporations and government agencies with access to content and resources to serve information and workflow needs of their users and organizations.” (EBSCO 2023)

Facts:

Date Range: Varies by topic (for example, searching “Food Processor” limits the date range to 1976-2023, while searching another topic like the “Industrial Revolution” changes it to 1896-2023)

Publisher: EBSCO Industries, Inc.

Publisher About Page: https://www.ebsco.com/about

Object Type: Academic Journals, Trade Publications, Periodicals, Newspapers, Biographies, Blog Entries, Books, Conference Papers, Country Reports, Databases, Educational Reports, Encyclopedias, Government Documents, Grey Literature, Law, Primary Source Documents, Reports, Reviews, Speeches, Working Papers

Exportable Image: Yes

Facsimile image: No

Full text searchable: Yes

Titles list links: https://www.ebsco.com/title-lists

History/Provenance: EBSCO started out as a small “family-owned” company in the United States in 1944. It has since grown to several offices around the world and become one of the “largest privately held” companies in the country. For more than 70 years, EBSCO has been “the leading provider of research databases, e-journal and e-package subscription management, book collection development and acquisition management, and a major provider of library technology, e-books and clinical decision solutions” for institutions around the world (EBSCO).

Reviews

The collections offered by EBSCO eBooks are substantial, with various purchase options and models that suit any library wanting to maintain a digital collection. EBSCO eBooks come with great tools like ECM and GOBI, which are great for building an e-book collection from scratch or having in the back pocket to purchase those obscure titles requested by patrons. The online interface has good navigation features, but downloading e-books can be frustrating, and it is time-consuming to download more software to view an e-book offline. An improvement would be having all the e-books, not just the DRM-free titles, work with e-book readers already installed on mobile devices and computers. Overall, this is an ideal resource for all types of libraries, not just health sciences and hospital libraries. — Pamela Herring, Journal of Electronic Resources in Medical Libraries

eBooks on EBSCOhost offers a wealth of full-text content from which libraries may choose and a flexible interface that may be tailored to meet the subscribing institution’s preferences. With its diverse content, no platform fees, and a variety of access models, this e-book platform is a boon for a wide variety of institutions and budgets. — Kimberly Mitchell, Journal of Electronic Resources in Medical Libraries

* The only reviews that could be found were about the eBooks on the EBSCOhost site

Access: The database requires a college or institutional login to access the site

Info from Publisher: https://www.ebsco.com/publishers-partnerships

Other Info: EBSCO has an Open Access policy, available on its website. The company provides researchers and students “trustworthy,” open access, peer-reviewed journals for their use.

Citing: EBSCOhost provides instructions for all the different citation styles. It notes that students should “consult their institution’s reference librarian for more clarification” and ask their professors which style they prefer. https://support-ebsco-com.mutex.gmu.edu/help/?int=ehost&lang=en&feature_id=Sty&TOC_ID=Always&SI=0&BU=0&GU=1&PS=0&ver=&dbs=a9h

A Guide to Digitization

Creating digital images can capture specific aspects of an item: its color and some of its size and texture. However, digital images ignore other aspects. They struggle to capture sound and every side of an item, making it hard for a researcher to grasp its true nature. Rather than digital images, video would be a better alternative for a historical researcher. Not only can videos capture the same aspects as digital images, and often better, but they can also record the sound and all the sides of an item. Videos give researchers more material to work with.

Missing information in digital images can lead to misinterpretations by the viewer. The Conway article states, “Representation is an intentional relationship between the maker and the viewer, fraught with the potential for communication problems ranging from misinterpretation and misunderstanding to falsehood and forgery.” (3) These misinterpretations can lead to bad practice among historians, affecting how they understand and use these items. As noted above, historians might choose one digital medium over another, use multiple mediums, or photograph an item from different angles; as Conway states, “Building collections of photographs through digitization is fundamentally a process of representation, far more interesting and complex than merely copying them to another medium.” (3) These ways of avoiding misinterpretation lead to different uses in the field, as Melissa Terras states: “The opportunities to provide and enhance resources ‘for learning, teaching, research, scholarship, documentation, and public accountability’ are immense.” (2)

Conway, Paul. “Building Meaning in Digitized Photographs.” Journal of the Chicago Colloquium on Digital Humanities and Computer Science 1, no. 1 (2009): 1-18.

Terras, Melissa. “Digitisation and Digital Resources in the Humanities.” In Digital Humanities in Practice, edited by Claire Warwick, Melissa Terras, and Julianne Nyhan, 1-22. Facet Publishing, 2012.

The Public Domain Review

The Public Domain Review is a website containing digital material that is in the public domain or under Creative Commons licenses. On the website, users can find images, books, and films from around the world that have entered the public domain. Each image, book, and film comes with a brief description of the material and, sometimes, the artist. The website’s rights page details the labeling system for its public domain material: different labels apply depending on the jurisdiction, on how many years past the creator’s life a work remains protected, and on whether a government is involved. The page also explains the attribution and Share-Alike rules under Creative Commons.