Why Digitizing Sources is Important
Categories: Modern

As human beings and as scientists in the early 21st century, we have a crisis of epistemology and misinformation. Science is a system for distributed, verified trust, and as the rate of publication increases and new discoveries lead to conclusions which threaten more and more wealthy actors, that system has been breaking down. There is a lot of talk about blame, but I don’t find that helpful. Often, what seem to be two opposed factions lean on each other like tired wrestlers, and use the commotion of their fighting to keep their supporters too busy to ask awkward questions about the gap between the policies that their representatives say they support and the policies they enact. Instead of laying blame, I would like to talk about one of the things we are doing to solve this.

In the philological side of ancient world studies, we got started digitizing the sources in the 1990s. More or less every published text in Greek or Latin up to 300 CE, and many in ancient Semitic languages, was available to anyone with a browser and an Internet connection by the 2000s on projects like LacusCurtius, PHI Online, and the Epigraphik-Datenbank Clauss / Slaby. These free-to-read texts are not always the best, and we have been slower to publish art and artifacts the same way. The main database of ancient Greek painted pottery is free-to-search but claims copyright to all photos. Medievalists don’t seem to have been as active in this area (the Internet Medieval Sourcebook just has translations), but on Armour in Texts, I have been doing the same for body armour from the first Sumerian inventories to the 19th century.

Making the sources available, so people can use the classical style of argument, is crucial. In my experience, the stupidest kinds of debates are ones where neither side has access to the sources except through partisan summaries. When people only have access to a cherry-picked version of the evidence, it’s easy for them to start special pleading and otherwise rationalizing a position they have already decided upon. There is no way to decide whether one army’s test of captured armoured vehicles during WW II was biased except by reading it. It is very important to know whether a belief that some weapon was effective or ineffective against a particular target goes back to authoritative wartime sources, or to some book published in 1982.

You can see what happens when neither side in an argument has access to the evidence by reading online discussions. But you can also see it if you watch less number-minded historians deal with topics like casualty figures. If they can’t get access to the sources for these estimates, or don’t take the time to read how they were estimated, they flip through the figures in different authorities and pick one or declare themselves agnostic. They don’t learn the rules of evidence for handling these kinds of figures, although they might make some ad-hoc arguments and subjective judgments about the reliability of a few favourite numbers. There is also a problem known as the empty citation, where the cited source just cites another source and at some point in the chain the cited work does not exist, or it does not say what it is claimed to say, or it asserts without evidence (Harzing 2002: 130-132). Losing track of the data which lies behind a claim is not a mark of wicked intruders defiling the sacred groves of academe, it is a human thing like murder.

Once sources are available, not everyone will choose to look at them. Some people some of the time don’t want sources, they just want to emit argument-shaped noise that harmonizes with the noises their ingroup makes. When natural scientists bash Aristotle, or engineers talk about a “thousand year dark age,” they are not speaking to describe the world, they are speaking to show that they have learned their community’s prejudices. But if one side in an argument can keep citing sources (not authorities), and the other has to flail and posture and assert without evidence, then over years and decades the first side usually wins over the uncommitted. While bad money drives out good, good calm evidence-based arguments skewer blather. And once the argument has been focused on sources, it can move to the next stage of asking how those sources should be used. But you don’t need to be a great philosopher to have a straightforward factual discussion: “source A says … source B says … source C seems to disagree.”

Further Reading: Harzing, Anne-Wil (2002) “Are Our Referencing Errors Undermining Our Scholarship and Credibility? The Case of Expatriate Failure Rates.” Journal of Organizational Behavior, Vol. 23, No. 1 (February 2002), pp. 127-148 [available from author’s website]

The sources on my site are free to read, but they aren’t costless to collect. Help support them with a donation on Patreon or paypal.me or even liberapay

Edit 2023-08-29: someone in the UK has a similar idea (warning: Substack)
