Digital Humanities Conference – Abstract

Introduction

Friedrich Nietzsche, one of the most prominent figures in the history of Western philosophy, left an enduring legacy. His challenging philosophical work had a profound impact on the thinking of twentieth-century political leaders and intellectuals. For decades, assiduous scholars and interpreters (among them Michel Foucault, Georges Bataille, Jacques Derrida and Leo Strauss) have pondered questions about Nietzsche’s extraordinary personality and his grandiose, controversial ideas.

The aim of our project is to convey a holistic view of Nietzsche’s life, which will help us better understand his elusive character and trace the origin and influence of his ideas. To attain this ambitious goal, we compiled reliable biographical information on the philosopher from a wide variety of sources and organized it in a structured relational database. Moreover, we reconciled and extended the data with semantic repositories. After subsequent analysis and assessment, we built a publicly accessible website that incorporates rich visual and interactive presentations of people, events, places and objects related to the life of Friedrich Nietzsche.

The Data Model

A web application’s flexibility and the reliability of its data depend closely on its underlying data model. To ensure the quality of the data, we had to develop a structurally sound model that provides a real-world context, leverages generic structures and follows naming standards. In addition, it had to capture the considerable diversity of biographical information in its entirety, while avoiding inconsistencies and redundancy.

One of the most valuable instruments that we could use to fulfill the criteria was abstraction: “the ability to increase the types of information a design can accommodate using generic concepts” (Hoberman 2007). Fortunately, an abstract semantic model for describing biographical information about people already existed – the BIO RDF Schema (Davis and Galbraith 2010). Thus, we decided to adopt it and build on it.

According to the BIO vocabulary, a person’s life may be depicted as a series of interconnected major events, to which additional details and relevant information can be attached. The model can be considered person-centric rather than neutral. It defines and describes several core classes and properties that can be used to create a relatively complete story of a person’s life and his or her interactions with other individuals, organizations (institutions) and the surrounding environment.

We incorporated the underlying concepts of the BIO classes in the design of our database. For example, events delimit intervals of time (timespans) that can be associated with particular long-term or short-term relationships between individuals and groups of people (or organizations). Distinct types of life events are available, such as the obvious Birth, Education, Marriage and Death. In addition, the vocabulary includes a number of less obvious events such as Baptism, Naturalization, Imprisonment and Inauguration. The included event types did not cover the whole spectrum of events found in biographical material, so we expanded them further during the data extraction and population process.

An “importance” attribute was added to Events. It is an integer between 1 and 10 (10 is the highest and 1 the lowest) that indicates the significance of an individual event. It is determined by the event type (for example, “writing” is considered more important than “reading”) and allows us to filter events more precisely.
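The importance attribute described above can be illustrated with a minimal Python sketch. The class name, the field names and the numeric defaults are assumptions made for illustration; only the 1 to 10 range and the ordering of “writing” above “reading” come from the text, and the project itself uses a relational database rather than Python objects.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical defaults per event type. Only the ordering
# "writing" > "reading" is taken from the text; the numbers are assumed.
DEFAULT_IMPORTANCE = {"birth": 10, "writing": 7, "reading": 4}


@dataclass
class Event:
    title: str
    event_type: str
    importance: Optional[int] = None  # None means "derive from event type"

    def __post_init__(self):
        if self.importance is None:
            # Unknown event types fall back to the lowest importance.
            self.importance = DEFAULT_IMPORTANCE.get(self.event_type, 1)
        if not 1 <= self.importance <= 10:
            raise ValueError("importance must be an integer from 1 to 10")


def filter_by_importance(events, minimum):
    """Return the events whose importance is at least `minimum`."""
    return [e for e in events if e.importance >= minimum]


# Placeholder events, not records from the project database.
events = [
    Event("Example birth event", "birth"),
    Event("Example writing event", "writing"),
    Event("Example reading event", "reading"),
]
major = filter_by_importance(events, 5)
print([e.title for e in major])
```

Raising the threshold passed to `filter_by_importance` leaves only the most significant events, which is the kind of filtering the attribute is meant to support.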

As the semantic model did not provide a complete representation of the relationship segment of a biography, we developed a Relationship classification hierarchy, which unified several types of relationships such as family and marital relationships, and professional collaborations.

Moreover, we introduced Media Item, Citation, Participant and Location entities.

Information about text documents (books, articles, lectures) and music compositions is stored in a single table, Media Item. A media type property is assigned to each item. Collected works and music albums are treated as compound media items (defined by “part of” relationships). Media Items are linked to Events (as objects).
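A minimal Python sketch of the “part of” relationship between media items follows. The field names (`item_id`, `media_type`, `part_of`) and the sample items are hypothetical; the sketch only shows how a single self-referencing structure can represent both simple and compound items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaItem:
    item_id: int
    title: str
    media_type: str                 # e.g. "book", "article", "composition"
    part_of: Optional[int] = None   # id of the containing compound item


def parts_of(items, parent_id):
    """Return all items directly or indirectly contained in `parent_id`.

    Assumes the "part of" links form a tree (no cycles).
    """
    result = []
    for item in items:
        if item.part_of == parent_id:
            result.append(item)
            result.extend(parts_of(items, item.item_id))
    return result


# Placeholder items, not records from the project database.
items = [
    MediaItem(1, "Example collected works", "book"),
    MediaItem(2, "Example volume", "book", part_of=1),
    MediaItem(3, "Example essay", "article", part_of=2),
    MediaItem(4, "Example standalone lecture", "lecture"),
]
print([i.title for i in parts_of(items, 1)])
```

Keeping all media types in one structure means that a compound item and its parts can be traversed with a single query or function, regardless of media type.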

The Citation entity holds excerpts from documents, as well as standalone quotes and references to multimedia objects. Citations can be linked to both Intervals and Events.

The Participant and Location entity hierarchies each participate in a single many-to-many relationship with Event. This helps us assign a thematic role to every noun in a sentence describing an event, and thus preserve more relevant and structured information. Typical thematic roles include agent, patient, theme, recipient, beneficiary, location, origin, direction, instrument and experiencer (Santorini and Kroch 2007).
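The many-to-many relationship with thematic roles can be sketched with an in-memory SQLite database. The table and column names are assumptions made for illustration and are not taken from the project schema; the list of roles is the one given in the text. A link table for Location could be built in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE participant (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Link table: one row per (event, participant, role) combination.
CREATE TABLE event_participant (
    event_id       INTEGER NOT NULL REFERENCES event(id),
    participant_id INTEGER NOT NULL REFERENCES participant(id),
    role           TEXT NOT NULL CHECK (role IN (
        'agent', 'patient', 'theme', 'recipient', 'beneficiary',
        'location', 'origin', 'direction', 'instrument', 'experiencer')),
    PRIMARY KEY (event_id, participant_id, role)
);
""")

# Placeholder data for a sentence such as "Person A sends a letter to Person B".
conn.execute("INSERT INTO event VALUES (1, 'Example: letter sent')")
conn.executemany("INSERT INTO participant VALUES (?, ?)",
                 [(1, "Person A"), (2, "Person B")])
conn.executemany("INSERT INTO event_participant VALUES (?, ?, ?)",
                 [(1, 1, "agent"), (1, 2, "recipient")])

rows = conn.execute("""
    SELECT p.name, ep.role
    FROM event_participant AS ep
    JOIN participant AS p ON p.id = ep.participant_id
    WHERE ep.event_id = 1
    ORDER BY p.id
""").fetchall()
print(rows)
```

Storing the role on the link row rather than on the participant lets the same person act as agent in one event and recipient in another.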

The Data

One of the main objectives of our project was to present relatively rich and precise information about Friedrich Nietzsche, his interactions with the world and the literary and philosophical legacy he left. Achieving a high level of completeness and accuracy requires combining heterogeneous sources of data, which introduces additional conceptual and analytical complexity. It was a challenge that we often faced.

Many of the resources brought valuable information that was incompatible with the database schema in use, and changes to the data model immediately followed. To avoid those recurring modifications, we created a temporary database for storing and manipulating heterogeneous data. Thus, we adapted our data model while refining the records. In addition, conflicts between facts from diverse sources occasionally arose, and further research was necessary to resolve them.

To facilitate the clean-up and transformation of data, we used the temporary database, spreadsheets, scripts, macros and Google Refine (a power tool for working with messy data, converting it from one format into another and extending it with web services).

Aside from the biographical information that we extracted from books, articles and websites on Nietzsche, we also needed spatial information (about countries, cities and their coordinates). The primary geographic data that we obtained was derived from Geobytes’s GeoWorldMap product.

Additional information about individuals, as well as coordinates of populated places and addresses, was downloaded from open semantic databases such as DBpedia and Freebase after manual reconciliation of the records.

Currently, the database contains information about 2617 events (in 1010 of them, Nietzsche is a participant), 1354 citations, 990 media items (books, compositions, websites), 695 individuals, 148 organizations and 293 relationships.

The Application

We chose Microsoft ASP.NET MVC as the back-end technology. It is based on the Microsoft .NET Framework, an object-oriented programming platform, so it was convenient to use object-relational mapping (ORM) software as the data access layer to bring the relational and object-oriented worlds together. For this purpose, we chose Entity Framework v5.0, Microsoft’s official ORM technology, which has excellent tooling support.

The front-end is built with the Bootstrap (supporting HTML5 and CSS3) and jQuery (JavaScript) frameworks. The website is responsive to different screen resolutions and can be accessed via desktop computers, laptops, tablets and smartphones.

Main features: a dynamic timeline of Nietzsche’s life, which can be sorted by date and filtered by various criteria; an interactive map plotting the places that Nietzsche visited; the philosophical and musical works that he authored; works written about Nietzsche; and graphs and lists of his family, friends, correspondents, influencers and influences. Together they reveal parts of the puzzling Friedrich Nietzsche.

References

Books
  • Nietzsche’s Library, Rainer J. Hanshe (2007)
  • Friedrich Nietzsche. A Philosophical Biography, Julian Young, Cambridge University Press (2010)
  • Great Thinkers of the Western World, Annual 1999, HarperCollins Publishers (1999)
  • Nietzsche: Life as Literature, Alexander Nehamas, Harvard University Press (1985)
  • Nietzsche: A Critical Life, Ronald Hayman, Oxford University Press (1980)
  • Introductions to Nietzsche, Robert Pippin, Cambridge University Press (2012)
  • The Syntax of Natural Language: An Online Introduction Using the Trees Program, Beatrice Santorini and Anthony Kroch (2007)

Websites

Friedrich Nietzsche (Stanford Encyclopedia of Philosophy)
http://plato.stanford.edu/entries/nietzsche
Friedrich Nietzsche (Britannica Online Encyclopedia)
http://www.britannica.com/EBchecked/topic/414670
Friedrich Nietzsche (Wikipedia)
http://en.wikipedia.org/wiki/Friedrich_Nietzsche
Friedrich Nietzsche Bibliography (Wikipedia)
http://en.wikipedia.org/wiki/Friedrich_Nietzsche_bibliography
List of works about Friedrich Nietzsche (Wikipedia)
http://en.wikipedia.org/wiki/List_of_works_about_Friedrich_Nietzsche
The Nietzsche Channel
http://www.thenietzschechannel.com
Nietzsche Circle
http://www.nietzschecircle.com
Nietzsche Chronicle
http://www.dartmouth.edu/~fnchron
Nietzsche.ru
http://nietzsche.ru
A Definition of Database Design Standards for Human Rights Agencies
http://shr.aaas.org/DBStandards/contents.html
BIO: A Vocabulary for Biographical Information
http://vocab.org/bio/0.1/.html
Knowledge Representation, John F. Sowa
http://www.jfsowa.com/krbook
Freebase
http://www.freebase.com
DBpedia
http://dbpedia.org

Progress Report 2

Overview

Key Accomplishments of the Last Period

  1. The conceptual data model was updated
  2. Information from multiple sources was gathered
  3. A temporary database for data aggregation was created and populated
  4. Part of the data was cleaned up and organized
  5. Sketches of the website’s user interface were drafted
  6. The development of the website was initiated

Upcoming Tasks for the Next Period

  1. Import the formatted data into the production database
  2. Complete the data analysis
  3. Prepare a working demo

Issues

One of the main objectives of our project is to present relatively rich and precise information about Friedrich Nietzsche, his interactions with the world and the literary and philosophical legacy he has left. Achieving a high level of completeness and accuracy requires combining heterogeneous sources of data, which introduces additional conceptual complexity. This is the challenge that we have been facing during the past reporting period.

According to the project milestones that we had set, all relevant and necessary data should by now have been formatted, imported into the database and analyzed. However, the data modelling and organization processes proved less straightforward than we expected. Many of the resources bring valuable information that is incompatible with the database schema in use, and changes to the data model immediately follow. To avoid these recurring modifications, we created a temporary database for storing and manipulating heterogeneous data. Thus, we adapt our data model while refining the records. In addition, conflicts between facts from diverse sources occasionally arise, and further research is necessary to resolve them. Nevertheless, the aggregation of information is almost complete and we will soon be back on track (in accordance with the project milestones).

Major Task Completion

Planned Complete Date   Actual Complete Date   Milestone
24.03.2013              t.b.a.                 Complete Data Extraction and Database Population
07.04.2013              t.b.a.                 Complete Data Analysis

Details on Accomplishments

Conceptual Model Changes

The updated conceptual data model (version 1.2) for the project is available for download at https://nietzschebiography.files.wordpress.com/2013/04/conceptual_model_v1-2.pdf

Because the model is still changing frequently, we will explain it in the next report, by which time a final version is expected.

Data Aggregation and Manipulation

So far we have compiled chronologies of Nietzsche’s life and lists of literary works on Nietzsche from several books and websites. To facilitate the clean-up and transformation of data, we have used the temporary database, spreadsheets, scripts, macros and Google Refine (a power tool for working with messy data, converting it from one format into another and extending it with web services).

Aside from the biographical information that we are extracting from the sources mentioned in previous posts, we also need geographical information (about countries, populated places and their coordinates). The data that we have obtained is derived from two primary products:

  • MaxMind’s “World Cities” database, which includes 3 174 000 populated places, their population, latitude and longitude. The database is updated about once per year, since the city data does not change very frequently.
  • Geobytes’s GeoWorldMap, which is distributed as a set of three text files. A “Countries” table is used to store a comprehensive list of all the countries in the world; a “Regions” table stores a list of sub-country geographical entities such as states, provinces and territories; and a “Cities” table is a detailed list of cities, related to countries and sub-country regions. We currently do not use the “Regions” data.
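The relationship between the “Countries” and “Cities” tables described above can be sketched in Python as a simple join. The column names, the comma-separated layout and the two sample rows are assumptions made for illustration; the actual file format of the GeoWorldMap product is not described here and may differ.

```python
import csv
import io

# Hypothetical file contents; the real column names and delimiters may differ.
countries_txt = "CountryId,Country\n1,Switzerland\n2,Italy\n"
cities_txt = (
    "CityId,CountryId,City,Latitude,Longitude\n"
    "10,1,Basel,47.56,7.59\n"
    "11,2,Turin,45.07,7.69\n"
)


def load_rows(text):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def cities_with_country(countries, cities):
    """Attach the country name to each city via the shared CountryId key."""
    country_names = {c["CountryId"]: c["Country"] for c in countries}
    return [
        {
            "city": city["City"],
            "country": country_names[city["CountryId"]],
            "lat": float(city["Latitude"]),
            "lon": float(city["Longitude"]),
        }
        for city in cities
    ]


places = cities_with_country(load_rows(countries_txt), load_rows(cities_txt))
for place in places:
    print(place)
```

Because the “Regions” data is not used, the sketch links cities directly to countries and skips the intermediate table.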

Common Website Design Features

  1. The website must be fully compliant with HTML5 and CSS3.
  2. The website should use metadata tags and attributes extensively.
  3. The website design should be simple and rely on typography and vector graphics rather than bitmap images.
  4. The website should provide a mobile version using CSS media queries.
  5. The website should be designed using a mobile-first approach.
  6. The website should be touch-friendly.

Basic Website Structure

Relative URL                 Name
/                            Homepage
/bio                         Life
/bio/{Event.Slug}            Life Event Details
/work                        Work
/work/{MediaItem.Slug}       Work Details
/connections                 Connections
/connection/{Agent.Slug}     Connection Details
/map                         Map
/map/{Location.Slug}         Location Details
/sources                     Sources
/source/{Source.Slug}        Source Details
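The URL structure above can be illustrated with a short Python sketch that derives a slug from a title and resolves a relative URL to a page name. The `slugify` rules and the `resolve` helper are assumptions made for illustration; the website itself is built on ASP.NET MVC, whose routing is configured differently, and the project's actual slug format is not specified here.

```python
import re
import unicodedata


def slugify(title):
    """Turn a title into a lowercase, hyphen-separated ASCII slug."""
    normalized = unicodedata.normalize("NFKD", title)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


SLUG = r"(?P<slug>[a-z0-9-]+)"

# Patterns mirror the relative URLs listed in the table above.
ROUTES = [
    (r"^/$", "Homepage"),
    (r"^/bio$", "Life"),
    (rf"^/bio/{SLUG}$", "Life Event Details"),
    (r"^/work$", "Work"),
    (rf"^/work/{SLUG}$", "Work Details"),
    (r"^/connections$", "Connections"),
    (rf"^/connection/{SLUG}$", "Connection Details"),
    (r"^/map$", "Map"),
    (rf"^/map/{SLUG}$", "Location Details"),
    (r"^/sources$", "Sources"),
    (rf"^/source/{SLUG}$", "Source Details"),
]


def resolve(path):
    """Return (page name, slug or None) for a path, or None if unmatched."""
    for pattern, name in ROUTES:
        match = re.match(pattern, path)
        if match:
            return name, match.groupdict().get("slug")
    return None


work_url = "/work/" + slugify("Also sprach Zarathustra")
print(work_url, resolve(work_url))
```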