Progress Report 3

Overview

Key Accomplishments of the Last Period

  1. The conceptual data model was finalized
  2. The intermediary structured database was finalized
  3. The project website basic UI was created
  4. The project MySQL database was created (only structure — no data)
  5. The project website ORM software was configured to use the project database — an object model was created and mapped to the database

Upcoming Tasks for the Next Period

  1. Complete the website development
  2. Summarize project results
  3. Prepare a poster

Issues

All significant events in Nietzsche’s life have been extracted and imported into the database. Interpersonal relationships still have to be inferred from the collected biographical information before the data analysis can be completed.

Major Task Completion

Planned Complete Date   Actual Complete Date   Milestone
20.02.2013              20.02.2013             Project Start
24.02.2013              01.03.2013             Data Collection - Complete Source Aggregation
10.03.2013              10.03.2013             Data Collection - Complete Initial Draft of Database Model Design
24.03.2013              t.b.a.                 Data Collection - Complete Data Extraction and Database Population
07.04.2013              t.b.a.                 Complete Data Analysis
14.04.2013              22.04.2013             Complete Website Design
15.04.2013              29.04.2013             Website Development Start

Details on Accomplishments

Finalized Conceptual Model

Several changes have been applied to the conceptual model:

  • The Album entity and the inheritance hierarchy of Media Item have been dropped. All media items will be stored in a single table, and a media type property will be assigned to each item. Albums will be treated as compound media items (defined by “part of” relationships).
  • The Note, Citation and Source entities have been merged into a single Citation entity, which will hold excerpts from documents, as well as standalone quotes and references to multimedia objects. Citations are linked to Intervals and Events.
  • The Keyword entity has been dropped. Information about individuals and organizations will be stored in their corresponding tables.
  • Interpersonal relationships have been linked to Intervals.
  • An Importance attribute has been added to Events. It is an integer between 1 and 10 that indicates the significance of an individual event (10 is the highest, 1 the lowest). It will allow us to filter events more precisely.
  • Media Items can be linked to Events.
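
As a rough illustration of the changes listed above, the sketch below models media items as one flat record type with a media type property and an optional “part of” link, and events with a bounded Importance value. All class, field and function names (MediaItem, part_of_id, Event, parts_of, filter_events) are assumptions made for this example; they are not taken from the project database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MediaItem:
    """One row of a single media table (illustrative names only)."""
    id: int
    title: str
    media_type: str                    # e.g. "image", "audio", "album"
    part_of_id: Optional[int] = None   # "part of" link to a compound item


@dataclass
class Event:
    """An event with an Importance between 1 (lowest) and 10 (highest)."""
    name: str
    importance: int

    def __post_init__(self) -> None:
        if not 1 <= self.importance <= 10:
            raise ValueError("importance must be between 1 and 10")


def parts_of(items: List[MediaItem], parent_id: int) -> List[MediaItem]:
    """Return the media items that are part of the given compound item."""
    return [item for item in items if item.part_of_id == parent_id]


def filter_events(events: List[Event], minimum: int) -> List[Event]:
    """Keep only events whose importance is at least `minimum`."""
    return [event for event in events if event.importance >= minimum]


# Placeholder data, not real project content.
media = [
    MediaItem(1, "Family album", "album"),
    MediaItem(2, "Portrait", "image", part_of_id=1),
    MediaItem(3, "Letter scan", "image"),
]
events = [Event("Major event", 10), Event("Minor event", 2)]
```

With this layout an album is simply a media item that other items point to, so no separate Album table is needed.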

In addition, the Agent entity has been renamed to Participant, and it now participates in a single many-to-many relationship with Event. This helps us define different thematic roles for all nouns present in a sentence describing an event, and thus preserve more relevant and structured information (see Thematic Roles in ref. 1). Typical thematic roles include agent, patient, theme, recipient, beneficiary, location, origin, direction, instrument and experiencer.
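
One possible reading of this many-to-many relationship is a link record that carries the thematic role. The sketch below is only an assumption about how such a link could look; the names Participation and participants_with_role are hypothetical, and the sample rows are placeholders rather than project data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ThematicRole(Enum):
    """The typical thematic roles named in the text."""
    AGENT = "agent"
    PATIENT = "patient"
    THEME = "theme"
    RECIPIENT = "recipient"
    BENEFICIARY = "beneficiary"
    LOCATION = "location"
    ORIGIN = "origin"
    DIRECTION = "direction"
    INSTRUMENT = "instrument"
    EXPERIENCER = "experiencer"


@dataclass(frozen=True)
class Participation:
    """One row of a hypothetical Event-Participant link table."""
    event: str
    participant: str
    role: ThematicRole


def participants_with_role(
    links: List[Participation], event: str, role: ThematicRole
) -> List[str]:
    """Return the participants that play `role` in `event`."""
    return [
        link.participant
        for link in links
        if link.event == event and link.role is role
    ]


# Placeholder rows describing a sentence such as "Sender sent Letter to Addressee".
links = [
    Participation("Letter sent", "Sender", ThematicRole.AGENT),
    Participation("Letter sent", "Addressee", ThematicRole.RECIPIENT),
    Participation("Letter sent", "Letter", ThematicRole.THEME),
]
```

Keeping the role on the link rather than on the participant lets the same participant be an agent in one event and a recipient in another.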

The final conceptual data model (version 1.3) for the project is available for download at https://nietzschebiography.files.wordpress.com/2013/05/conceptual_model_v1-3.pdf

Finalized Intermediary Structured Database

The physical data model for the project is available for download at https://nietzschebiography.files.wordpress.com/2013/05/physical_model_v1-0.pdf

Basic Project Web Site UI

The project website user interface is based on Bootstrap, an open-source front-end framework that simplifies meeting current expectations for website UIs. This choice made it much easier to adapt the website to various screen sizes and to comply with current web standards.

The screenshots below show the user interface design:

Nietzsche : A Digital Biography project website design for PC

Nietzsche : A Digital Biography project website design for mobile phones

Project MySQL database

The project relational database model was continuously updated and has now reached a pre-release version that can already be used as a basis for further programming work. The model was therefore exported from the designer as SQL DDL, and the resulting script was run on the MySQL database that is part of our Windows Azure project website.
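
To illustrate what running a DDL script produces (structure only, no data), here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the three tables in it are hypothetical examples, not the generated project schema.

```python
import sqlite3

# Hypothetical DDL fragment; the real script was generated by the modelling tool
# and targets MySQL, not SQLite.
DDL = """
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    importance INTEGER NOT NULL CHECK (importance BETWEEN 1 AND 10)
);
CREATE TABLE participant (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE event_participant (
    event_id INTEGER NOT NULL REFERENCES event(id),
    participant_id INTEGER NOT NULL REFERENCES participant(id),
    role TEXT NOT NULL,
    PRIMARY KEY (event_id, participant_id, role)
);
"""


def create_schema(connection: sqlite3.Connection) -> None:
    """Run the DDL script; this creates tables but inserts no rows."""
    connection.executescript(DDL)


def list_tables(connection: sqlite3.Connection) -> list:
    """Return the names of all tables in the database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


connection = sqlite3.connect(":memory:")
create_schema(connection)
tables = list_tables(connection)
event_count = connection.execute("SELECT COUNT(*) FROM event").fetchone()[0]
```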

The screenshot below shows the tables stored in the project database.

Nietzsche : A Digital Biography MySQL Database

Web Site ORM Configured

Since we chose Microsoft ASP.NET MVC as the backend technology, and it is based on the Microsoft .NET Framework, an object-oriented programming platform, it was convenient for us to use ORM software as the data access layer in order to bring the relational and object-oriented worlds together. For this purpose we chose Entity Framework v5.0, Microsoft’s official ORM technology, mainly because of its very good tooling support.

To make the ORM work, however, it was necessary to create an object equivalent of the relational database model. The screenshots below show the resulting object model.

Nietzsche : A Digital Biography project object model in a conceptual view

Nietzsche : A Digital Biography project object model in a detailed view

References

1. Santorini, Beatrice, and Anthony Kroch. 2007-. The syntax of natural language: An online introduction using the Trees program. http://www.ling.upenn.edu/~beatrice/syntax-textbook

Progress Report 1

Overview

Key Accomplishments of the Last Period

  1. Technologies to be used were chosen
  2. Project management methodology and tools were chosen
  3. Data sources were collected
  4. Database schema’s first draft was created
  5. Digital biography’s hosting was established

Upcoming Tasks for the Next Period

  1. Finalize the database schema
  2. Create and populate the database
  3. Perform data analysis
  4. Design biographical website’s user interface
  5. Configure the website to use the database
  6. Start development of the website

Issues

None at the moment.

Major Task Completion

Planned Complete Date   Actual Complete Date   Milestone
20.02.2013              20.02.2013             Project Start
24.02.2013              01.03.2013             Data Collection - Complete Source Aggregation
10.03.2013              10.03.2013             Data Collection - Complete Database Model Design
24.03.2013              t.b.a.                 Data Collection - Complete Data Extraction and Database Population

Details on Accomplishments

Chosen technologies

Because the digital biography is going to be a website, we needed to choose web application development technologies. At first we had some difficulty finding overlap in our technical knowledge, but in the end we agreed on the following:

DBMS: MySQL
Website Platform: ASP.NET MVC
DAL: Entity Framework (ORM)

Project Management

It is good practice to choose a methodology for any software development done in teams. We chose Scrum, an agile development methodology, together with Team Foundation Service, Microsoft’s free tool that supports software project management and provides a version control system.

Collected Data Sources

The primary sources that we will use are:

  • The Nietzsche Channel
  • Nietzsche Circle
  • Friedrich Nietzsche – Stanford Encyclopedia of Philosophy
  • Friedrich Nietzsche. A Philosophical Biography, Julian Young, Wake Forest University, North Carolina
  • Nietzsche: Life as Literature, Alexander Nehamas, Cambridge, Harvard University Press, 1985
  • Nietzsche: A Critical Life, Ronald Hayman, New York: Oxford University Press, 1980
  • Introductions to Nietzsche, Robert Pippin, Cambridge University Press, 2012
  • Nietzsche’s Library, Rainer J. Hanshe
  • The Neurological Illness of Friedrich Nietzsche, D. Hemelsoet, K. Hemelsoet and D. Devreese, 2008, N° 1 (Vol. 108/1) p.9-16, Acta Neurologica Belgica

Project Database Schema

The core of the conceptual database model for the “Nietzsche: A Digital Biography” project is based on the BIO vocabulary – a semantic model for describing biographical information about people.

According to this vocabulary, a person’s life can be depicted as a series of interconnected major events, to which additional details and relevant information can be attached. The model can be considered person-centric rather than neutral. For instance, the Employment event treats the individual being employed, rather than the employer, as the principal agent in the event.

Biography Vocabulary Core Classes

The BIO vocabulary defines and describes several core classes and properties that can be used to create a relatively complete story of a person’s life and his or her interactions with other individuals, organizations (institutions) and the surrounding environment. We incorporated their underlying concepts into the design of our database. For example, events delimit intervals of time (timespans) that can be associated with particular long-term or short-term relationships between individuals and groups of people (or organisations). Various types of life events are available, such as the obvious Birth, Education, Marriage and Death. In addition, a number of less common events such as Baptism, Naturalisation, Imprisonment and Inauguration have been added. The events included so far do not cover the whole spectrum of events found in biographical material, and they will be expanded further during the data extraction and population process.

As the semantic model does not currently provide a complete representation of the relationship segment of a biography, we developed a Relationship classification hierarchy, which incorporates several types of relationship such as family and marital relationships, and professional collaborations.
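
The sketch below shows one way to picture a relationship hierarchy whose members are tied to intervals delimited by events. It is a simplified assumption, not the project’s actual model: the class names, the `active_on` helper and the placeholder people and dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Interval:
    """A timespan delimited by the dates of a start and an optional end event."""
    start: date
    end: Optional[date] = None  # None means the interval is still open

    def contains(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass
class Relationship:
    """Base class of the hypothetical relationship hierarchy."""
    person_a: str
    person_b: str
    interval: Interval


class FamilyRelationship(Relationship):
    """Family relationships."""


class MaritalRelationship(Relationship):
    """Marital relationships."""


class ProfessionalCollaboration(Relationship):
    """Professional collaborations."""


def active_on(relationships: List[Relationship], day: date) -> List[Relationship]:
    """Return the relationships whose interval contains the given day."""
    return [r for r in relationships if r.interval.contains(day)]


# Placeholder people and dates, not biographical facts.
relationships = [
    FamilyRelationship("Person A", "Person B", Interval(date(1850, 1, 1))),
    ProfessionalCollaboration(
        "Person A", "Person C", Interval(date(1870, 1, 1), date(1875, 12, 31))
    ),
]
```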

The conceptual data model for the project is available for download at https://nietzschebiography.files.wordpress.com/2013/03/conceptual_model.pdf

Project Hosting

Since we chose to use ASP.NET as the website platform, we needed a suitable hosting provider for it. A natural choice was Windows Azure, Microsoft’s cloud-computing platform, which offers free hosting for small websites. An advantage of this hosting is that it integrates with the Team Foundation Service version control system and downloads all changes automatically.

The address of our digital biography website is: http://nietzsche.azurewebsites.net

Project Milestones

Deadline     Milestone
18.02.2013   Project Start
24.02.2013   Data Collection - Complete Source Aggregation
10.03.2013   Data Collection - Complete Database Model Design
24.03.2013   Data Collection - Complete Data Extraction and Database Population
07.04.2013   Complete Data Analysis
14.04.2013   Complete Website Design
15.04.2013   Website Development Start
12.05.2013   Website Development End
26.05.2013   Complete Result Summarization
27.05.2013   Website Launch
31.05.2013   Project End

Our Methodology

At its highest level, the project will follow a traditional, sequential methodology. The project plan is split into 4 major stages that follow one another without iteration, which will allow us to evaluate at any time how well the project goals and deadlines are being met. The following text describes the methodology of each of the major stages.

Phase 1 – Data Collection

A biography incorporates a large body of facts and a complex series of events that make up the life of a person. Biographical facts may be classified on two levels according to a hierarchy of relevance, described in detail in Sergio Soares’s dissertation “Extraction of Biographical Information from Wikipedia Texts”. Among biographical data, Soares distinguishes immutable personal characteristics (e.g. date and place of birth/death, family information), mutable personal characteristics (education, occupation, residence, affiliation), relational personal characteristics (family and marital relationships, professional collaborations), individual events (professional activities, personal events) and others, while irrelevant (non-biographical) details are excluded on a zero level.

Figure: Taxonomy of biographical classes
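
The classes named above can be written down as a small enumeration, which is how a classifier or a database lookup table might refer to them. This is only a sketch: the identifiers, the lookup table and the choice to map unknown fact types to "others" are assumptions made here, not part of Soares’s taxonomy or of the project code.

```python
from enum import Enum
from typing import Dict


class BiographicalClass(Enum):
    """Classes of biographical facts as listed in the text."""
    IMMUTABLE_PERSONAL = "immutable personal characteristics"
    MUTABLE_PERSONAL = "mutable personal characteristics"
    RELATIONAL_PERSONAL = "relational personal characteristics"
    INDIVIDUAL_EVENT = "individual events"
    OTHER = "others"
    NON_BIOGRAPHICAL = "non-biographical"  # the excluded "zero level"


# Example fact types taken from the parenthesised examples in the text.
FACT_TYPE_TO_CLASS: Dict[str, BiographicalClass] = {
    "date of birth": BiographicalClass.IMMUTABLE_PERSONAL,
    "place of birth": BiographicalClass.IMMUTABLE_PERSONAL,
    "family information": BiographicalClass.IMMUTABLE_PERSONAL,
    "education": BiographicalClass.MUTABLE_PERSONAL,
    "occupation": BiographicalClass.MUTABLE_PERSONAL,
    "residence": BiographicalClass.MUTABLE_PERSONAL,
    "affiliation": BiographicalClass.MUTABLE_PERSONAL,
    "marital relationship": BiographicalClass.RELATIONAL_PERSONAL,
    "professional collaboration": BiographicalClass.RELATIONAL_PERSONAL,
    "professional activity": BiographicalClass.INDIVIDUAL_EVENT,
    "personal event": BiographicalClass.INDIVIDUAL_EVENT,
}


def classify(fact_type: str) -> BiographicalClass:
    """Look up a fact type; unknown types fall back to OTHER (an assumption)."""
    return FACT_TYPE_TO_CLASS.get(fact_type, BiographicalClass.OTHER)


def is_biographical(label: BiographicalClass) -> bool:
    """Everything except the zero level counts as biographical."""
    return label is not BiographicalClass.NON_BIOGRAPHICAL
```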

During the Data Collection phase we will collect Nietzsche’s biographical data and store them in a structured database (a relational database management system). The phase may be divided into three distinct subphases: data source aggregation; data model design (incorporating the taxonomy of biographical classes proposed by Soares), so that the data can be stored and accessed logically and efficiently; and database population.

The data source aggregation subphase will consist of identifying various reliable data sources, selecting the relevant data to be stored, verifying them and merging them with the data already present in the database. The types of sources we are going to use are the following (sorted in descending order of priority):

  1. Web content created by acclaimed universities
  2. Web content created by reliable encyclopedias (e.g. Britannica)
  3. Printed biographies written by established authors
  4. Web content created by other reputable institutions
  5. Other sources
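
When sources of different types are merged, the priority order above can be encoded directly so that higher-priority sources are considered first. The following is a minimal sketch; the enumeration names and the `by_priority` helper are assumptions, and the source titles are placeholders.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class SourceType(IntEnum):
    """Source types from the list above; a lower value means higher priority."""
    UNIVERSITY_WEB = 1
    ENCYCLOPEDIA_WEB = 2
    PRINTED_BIOGRAPHY = 3
    INSTITUTION_WEB = 4
    OTHER = 5


@dataclass
class Source:
    title: str
    source_type: SourceType


def by_priority(sources: List[Source]) -> List[Source]:
    """Sort sources by priority; sorted() is stable, so ties keep their order."""
    return sorted(sources, key=lambda source: source.source_type)


# Placeholder titles, not actual project sources.
sources = [
    Source("Source A", SourceType.OTHER),
    Source("Source B", SourceType.PRINTED_BIOGRAPHY),
    Source("Source C", SourceType.UNIVERSITY_WEB),
    Source("Source D", SourceType.PRINTED_BIOGRAPHY),
]
ordered_titles = [source.title for source in by_priority(sources)]
```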

To store, analyze, process and update the data in the RDBMS quickly and correctly, an appropriate data model has to be designed. A data model defines the types and relationships of the data to be stored and is the basis for any further work. Our goal will be to decompose the collected data into pieces of information that are as small as possible, in order to make the most of the advantages of a structured database. Data collection will happen incrementally and will turn up various types of data, so the data model must change over time accordingly.

Furthermore, additional entities and relations that represent the taxonomy of the biographical classes will be created. They will facilitate the classification of biographical facts and provide information about their origin.

The process of data extraction and database population is described as follows:

  1. Assemble a list of online resources identified by unique URLs
  2. Create a local repository of the digitized texts to be mined
  3. Download a recent version of the specified web pages and save the HTML content in the repository by using a website crawler
  4. Add biographical data from digitized books in plain text format to the repository manually
  5. Parse the repository data, scrape names of people and places, and store them in an entity table
  6. Build Wikipedia resource locators from the named entities, fetch and extract infobox information and save it to the database
  7. Strip HTML tags and irrelevant data
  8. Delimit individual tokens in the text, segment the documents into sentences and classify them into the biographical classes discussed above, manually or with the assistance of appropriate data mining software
  9. Import images and multimedia related to Nietzsche
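
Steps 6 to 8 above can be sketched with the Python standard library alone. The code below is a deliberately naive illustration, not the tooling used in the project: the function names are hypothetical, the sentence splitter ignores abbreviations, and the Wikipedia URL pattern is an assumption based on the usual article address format.

```python
import re
from html.parser import HTMLParser
from typing import List
from urllib.parse import quote


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the content of script and style tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def strip_html(html: str) -> str:
    """Step 7: remove tags and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def segment_sentences(text: str) -> List[str]:
    """Step 8: naive split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence: str) -> List[str]:
    """Step 8: delimit tokens as runs of word characters."""
    return re.findall(r"\w+", sentence)


def wikipedia_url(entity_name: str) -> str:
    """Step 6: build an article URL from a named entity (assumed pattern)."""
    return "https://en.wikipedia.org/wiki/" + quote(entity_name.replace(" ", "_"))


sample_html = (
    "<html><head><style>p {}</style></head>"
    "<body><p>First sentence. Second <b>one</b> here!</p></body></html>"
)
sample_text = strip_html(sample_html)
sample_sentences = segment_sentences(sample_text)
```

A real pipeline would need a proper sentence segmenter and a check that the generated Wikipedia address actually exists before fetching infobox data.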

Phase 2 – Data Analysis

Once we have created a large enough structured database of Nietzsche’s biographical data, we will analyse it using text mining tools, compile descriptive statistics and draw conclusions from them. The following domains of Nietzsche’s biography will be explored:

  • Circles of friends and acquaintances
  • Public events and social interactions
  • Journeys and places of residence
  • Literary work
  • Aspirations and external drivers

All statistics will be based on the data stored in the biographical database and the applications that will produce them (and eventually visualize them in graphs, diagrams, maps, etc.) will be reusable on the website that we will create in the following phase. The compiled statistical data will be preserved in database tables or made accessible via database views.
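
As a small example of the kind of descriptive statistic meant here, the sketch below counts events per year and ranks places by how often they occur. The function names are hypothetical and the dates and places are placeholders, not values from the biographical database.

```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Tuple


def events_per_year(event_dates: Iterable[date]) -> Counter:
    """Count how many events fall into each calendar year."""
    return Counter(d.year for d in event_dates)


def most_common_places(places: Iterable[str], top: int) -> List[Tuple[str, int]]:
    """Return the `top` most frequent places with their counts."""
    return Counter(places).most_common(top)


# Placeholder inputs standing in for rows read from the database.
sample_dates = [date(1900, 1, 5), date(1900, 7, 1), date(1901, 3, 2)]
sample_places = ["Place A", "Place B", "Place A"]

yearly = events_per_year(sample_dates)
top_place = most_common_places(sample_places, 1)
```

Results like these could be stored in a summary table or exposed through a database view, as described above.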

Phase 3 – Website Development

In the third phase of our project we will design and develop a dedicated website where a subset of the collected data about Nietzsche and all the derived statistics will be published. The content will be presented cleanly yet attractively and, where possible, graphically.

The website will be highly interactive (through the use of AJAX, partial rendering and dynamic filters). Events plotted on a timeline, the spatial distribution of people and places, and a network of Nietzsche’s friends and acquaintances are examples of applications suited to the interactive presentation of the data.

Moreover, the website will be well interlinked and contextual, so that the reader never has a hard time finding needed or related information.

Phase 4 – Results Summarization

In the last phase of the project we are going to summarize the results in a final document, which will also be published in a section of the project website.

The State of the Art

There are currently many sources that may be used as data providers: paper, electronic and even audiovisual. Hundreds of printed books, encyclopedias and articles that deal exclusively with Nietzsche’s biography (or incorporate it in some of their chapters or sections [1–7]) have been published. The Internet is also full of webpages and websites dedicated to the philosopher and his work. They may be divided into several categories: university articles [8–10], encyclopedic articles [11–14], articles on servers specialized in biographies [15–16] and complementary fan websites [17–18]. Numerous documentaries on Nietzsche’s life have been presented by television stations and online video channels, which may also provide us with important and useful information [19–20].

There are no websites, however, that offer complete biographical information about the philosopher presented in a creative and visually appealing way, as we would like to deliver with our project. The website that comes closest to our ideas in terms of uniformity and completeness of information is “The Nietzsche Channel” [18], a website that contains one of the most extensive biographies (and comprehensive bibliographies) of Nietzsche. The information provided, however, is not well structured, interlinked or interactive and does not offer any visualization. The desired presentation complexity may be closer to that of the website “Biography.com” [15], which offers better data structuring and interconnection, but its information is quite basic and no engaging visualizations or true interactivity have been implemented.

Likewise, we have not found any other websites that display biographical information with the structure and presentation we would like to embody in this project. Nonetheless, we found a Digital Humanities effort by Stanford University, a project called “Mapping the Republic of Letters” [21], that works with and presents comparable data in a way similar to what we have in mind. The authors of this project worked with large data sets of letters sent or received by selected historical figures. They processed these letters and visualized the extracted information in different ways, e.g. interactive maps displaying people’s communication over time, or charts and graphs that clearly summarize key statistics about their communication. Our project’s web presentation should use similar visualization techniques, telling the story of Nietzsche in a new interactive way.

In summary, no projects (or only a few similar ones) presenting Nietzsche’s life in a modern way currently exist. Much information traditionally stored in biographical books and encyclopedias is now available online, but it does not fully leverage the possibilities offered by digital technology. Web technologies enable interactivity, interconnectivity and visualizations that are impossible to apply to static content. Yet they are still used only partially on websites dedicated to philosophers or other influential historical figures, and our goal is to demonstrate the possibilities through this project.

References

[1] C. Cate, Friedrich Nietzsche. Woodstock, NY: Overlook Press, 2005.
[2] J. Young, Friedrich Nietzsche: a philosophical biography. Cambridge [England]; New York: Cambridge University Press, 2010.
[3] R. Safranski and S. L. Frisch, Nietzsche: a philosophical biography. New York: W.W. Norton, 2003.
[4] R. J. Hollingdale, Nietzsche: the man and his philosophy. Cambridge, U.K.; New York: Cambridge University Press, 2001.
[5] P. Strathern, Nietzsche in 90 minutes. Chicago: I.R. Dee, 1996.
[6] L. Chamberlain, Nietzsche in Turin: an intimate biography. New York: Picador USA, 1998.
[7] W. A. Kaufmann, Nietzsche, philosopher, psychologist, antichrist. Princeton, N.J.: Princeton University Press, 1974.
[8] “Friedrich Nietzsche – German Philosopher – Biography”. [Online]. Available: http://www.egs.edu/library/friedrich-nietzsche/biography/. [Accessed: 26-11-2012].
[9] “Friedrich Nietzsche (Stanford Encyclopedia of Philosophy)”. [Online]. Available: http://plato.stanford.edu/entries/nietzsche/. [Accessed: 26-11-2012].
[10] “Nietzsche Biography – OpenLearn – Open University”. [Online]. Available: http://www.open.edu/openlearn/history-the-arts/culture/philosophy/thinkers/nietzsche-biography. [Accessed: 26-11-2012].
[11] “Friedrich Nietzsche – New World Encyclopedia”. [Online]. Available: http://www.newworldencyclopedia.org/entry/Friedrich_Nietzsche. [Accessed: 26-11-2012].
[12] “Friedrich Nietzsche – Wikipedia, the free encyclopedia”. [Online]. Available: http://en.wikipedia.org/wiki/Friedrich_Nietzsche. [Accessed: 26-11-2012].
[13] “Friedrich Nietzsche (German philosopher) – Britannica Online Encyclopedia”. [Online]. Available: http://www.britannica.com/EBchecked/topic/414670/Friedrich-Nietzsche. [Accessed: 26-11-2012].
[14] “Nietzsche, Friedrich [Internet Encyclopedia of Philosophy]”. [Online]. Available: http://www.iep.utm.edu/nietzsch/. [Accessed: 26-11-2012].
[15] “Friedrich Nietzsche Biography – Facts, Birthday, Life Story – Biography.com”. [Online]. Available: http://www.biography.com/people/friedrich-nietzsche-9423452. [Accessed: 26-11-2012].
[16] “Friedrich Nietzsche Biography – Friedrich Nietzsche Childhood, Life & Timeline”. [Online]. Available: http://www.thefamouspeople.com/profiles/friedrich-nietzsche-128.php. [Accessed: 26-11-2012].
[17] “[Nietzsche Circle][The Life of Nietzsche]”. [Online]. Available: http://www.nietzschecircle.com/nietzsche_work.html. [Accessed: 26-11-2012].
[18] “The Nietzsche Channel: Biography.” [Online]. Available: http://www.thenietzschechannel.com/bio/bio.htm. [Accessed: 26-11-2012].
[19] “Films for the Humanities and Sciences – Friedrich Nietzsche: Beyond Good and Evil”. [Online]. Available: http://ffh.films.com/id/86/Friedrich_Nietzsche_Beyond_Good_and_Evil.htm. [Accessed: 26-11-2012].
[20] “Films for the Humanities and Sciences – Nietzsche”. [Online]. Available: http://ffh.films.com/id/9332/Nietzsche.htm. [Accessed: 26-11-2012].
[21] “Mapping the Republic of Letters”. [Online]. Available: http://republicofletters.stanford.edu/. [Accessed: 26-11-2012].

About the Project

The philosophical work of Friedrich Nietzsche had a significant impact on the thinking of twentieth-century leaders and intellectuals. For decades, philosophy scholars and curious individuals have been pondering questions about Nietzsche’s extraordinary personality and grandiose, controversial ideas. What were the reasons behind his thoughts? How did Nietzsche’s personal life influence his work? How did his beliefs evolve and why? Which contemporaries inspired Nietzsche, and who, conversely, fell under his influence?

The aim of our upcoming project is to answer these questions by collecting Nietzsche’s biographical data, organizing them in a structured database using modern digital technologies, analyzing this information and putting it into context, and providing rich visual and interactive presentations of the relations between people, events and ideas. The fruits of our endeavour will be publicly available on a dedicated project website.

The main difference between our project and other websites containing philosopher biographies and bibliographies is that our goal is to create a full-length digital biography, using the possibilities of cutting-edge technologies to help us cope with the challenging task of displaying relevant content that is less cluttered, better organized, more readable and more captivating. To do this, the database model in which the data is stored should be deeply interlinked, and the presentation layer should make use of these relationships as much as possible. The content should be clear, easily searchable, navigable and highly semantic, leveraging technologies like XML, HTML5, CSS3 or microformats. It should also be visually engaging: diagrams, maps and graphs would make it more appealing and demonstrative. Most of these visualizations should be “live”, that is, based on the data sets stored in the database. Finally, the website should contain interactive elements that allow the user to explore the content quickly and effectively. An example could be an interactive map (or a timeline) with dynamically loaded content, which can be accomplished with JavaScript.

Authors

  • Orlin Topalov
  • Vojtěch Vít