Building bridges with the Big Data of the Past

This year's World Digital Preservation Day theme, “Breaking Down Barriers”, points to how digitisation is the road into the future and holds numerous opportunities in various areas. Moreover, “the aim of World Digital Preservation Day is to create greater awareness of digital preservation that will translate into a wider understanding which permeates all aspects of society – business, policymaking, personal good practice”. This is a conviction shared by the Time Machine network, and one that we would like to outline for you on this occasion.

The Time Machine Concept and Agenda

The Time Machine network is a large-scale research initiative aiming to create the Big Data of the Past: a distributed digital information system mapping European social, cultural and geographical evolution across time. In the proposed approach, digitisation is only the first step in a long series of extraction processes, including document segmentation and understanding, alignment of named entities and simulation of hypothetical spatiotemporal 4D reconstructions.

Such computational models with an extended temporal horizon are key resources for developing new critical reflections on the future of our institutions, and insights for historians, social scientists, creative arts professionals, policymakers, and for the general public, with a significant common denominator: contributing to informed decision-making from everyday life to academic, professional and political matters. The vision is, therefore, to enable Europe to turn its long history, as well as its multilingualism and multiculturalism, into a living social and economic resource.

The Time Machine Organisation

To implement this agenda, the Time Machine Organisation (TMO) was founded in 2019 as the organisational and institutional governing framework behind the Time Machine network.

The Time Machine Organisation is the leading international organisation for cooperation in technology, science and cultural heritage and the institutional governing framework that ensures the sustainability and economic independence of the Time Machine project. It is an internationally oriented association under Austrian law and headquartered in Vienna. As such, the association is open to any type of legal entity that deals with science, technology and cultural heritage. 

The Basic Time Machine Scheme

The Time Machine Processing and Digitisation Infrastructure

The Time Machine Processing Infrastructure will be composed of a digital content processor and three simulation engines (Figure 2-1):

  • A 4D Simulator that manages a continuous spatiotemporal simulation of all possible pasts and futures that are compatible with the data
  • A Universal Representation Engine that manages the multidimensional representation space resulting from the integration of extremely diverse types of digital cultural artefacts (text, images, videos, 3D)
  • A Large-Scale Inference Engine that will shape and assess the coherence of 4D simulations based on human-understandable concepts and constraints
Figure 2-1: TM Digital Content Processor and the three simulation engines

The Time Machine Digitisation Infrastructure will be composed of a network of digitisation hubs and will be organised on a European scale. A peer-to-peer platform will be in charge of managing and optimising digitisation strategies at European level and will also be tasked with the development of generic solutions for archiving, directly documenting the digitisation processes, and swiftly putting the digitised documents online.

The core software components

The Data Graph

The Data Graph (incrementally implemented by research in Pillar 1) is the central component of the Time Machine, containing all the information modelled in the Time Machine. The graph is constructed both manually, using editing Apps, and automatically, through the processing of the Digital Content Processor. Apps make it possible to visualise and edit the Data Graph, thus performing Internal Operations (e.g. inclusion of Nodes and Links) and External Operations (e.g. Visualisation).
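The split between Internal and External Operations can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name DataGraph, its method names and the example identifiers are hypothetical and do not describe the actual Time Machine software.

```python
from dataclasses import dataclass, field


@dataclass
class DataGraph:
    """Toy stand-in for the Data Graph: nodes joined by typed links (hypothetical)."""
    nodes: dict = field(default_factory=dict)  # node id -> attributes
    links: list = field(default_factory=list)  # (source id, relation, target id)

    # Internal Operations: these change the graph.
    def add_node(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def add_link(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must already be nodes")
        self.links.append((source, relation, target))

    # External Operations: these only read the graph.
    def describe(self, node_id):
        """Return one readable line per outgoing link of node_id."""
        return [f"{s} --{r}--> {t}" for s, r, t in self.links if s == node_id]


graph = DataGraph()
graph.add_node("person:1", label="Example Person")
graph.add_node("place:1", label="Example Place")
graph.add_link("person:1", "lived_in", "place:1")
print(graph.describe("person:1"))
```

In this sketch an editing App would call add_node and add_link, while a visualisation App would only call read-only methods such as describe.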

The 4D Map

The 4D Map is the second central component of the Time Machine. It plots both ongoing projects and the datasets of these projects. This means that the 4D Map is both the map where activities can be followed and the map aggregating results. The density of the 4D Map is not uniform. In particular, some zones may be modelled only in 3D, 2D or even 1D, as a list of included elements. The 4D Map includes a layer of Municipalities on which Local Time Machines can be anchored. The 4D Map can be navigated using several 4D interfaces.

The Code Library

The Code Library is a library, accessible from several programming languages, that gathers the key Operator functions for processing data in the Time Machine Environment.

The Projects Repository

The Projects Repository monitors all the active projects of the Time Machine. Projects are usually conducted by institutions but can also be launched by individuals. Projects may be new or may document past projects. Projects can mine Sources and ingest their extracted data into the Data Graph. These Projects are associated with a Zone of Coverage that associates them with Local Time Machines, producing content for GeoEntities. Projects may also produce intermediary datasets that can be downloaded even if they are not yet integrated into the Data Graph. Projects can also develop Apps that interact with the 4D Map and the Data Graph. Projects can contribute to the Code Library by working on the GitHub repository of the Time Machine to produce new Operators. These different objectives are not mutually exclusive.
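One way to picture how a Zone of Coverage could associate a Project with Local Time Machines is a simple overlap test. The sketch below assumes that both zones and Local Time Machine areas are axis-aligned bounding boxes; that representation, and all names and coordinates used, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in arbitrary map units (an assumed representation)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        # Two boxes overlap when they intersect on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


def local_time_machines_for(zone, ltm_areas):
    """Return the names of all Local Time Machines whose area the zone touches."""
    return sorted(name for name, area in ltm_areas.items() if zone.overlaps(area))


# Hypothetical Local Time Machine areas and one project zone.
ltm_areas = {
    "ltm-a": BoundingBox(0.0, 0.0, 1.0, 1.0),
    "ltm-b": BoundingBox(5.0, 5.0, 6.0, 6.0),
}
project_zone = BoundingBox(0.5, 0.5, 2.0, 2.0)
print(local_time_machines_for(project_zone, ltm_areas))
```

A real system would more likely use proper geographic geometries; the bounding box keeps the example self-contained.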

The Requests For Comments

Reaching consensus on the technology options to follow in a programme as large as Time Machine is a complex issue. To ensure the open development and evaluation of work, a process inspired by the Request for Comments (RFC) process that was used for the development of the Internet protocols will be adapted to the needs of Time Machine. Time Machine Requests for Comments will be freely accessible publications, each identified with a unique ID, constituting the main process for establishing rules, recommendations and core architectural choices for the Time Machine components.

Basic Principles of the Requests for Comments

  • Accessibility: RFCs are freely accessible, free of charge.
  • Openness: Anybody can write an RFC.
  • Identification: Each RFC, once published, has a unique ID and version number.
  • Incrementalism: Each RFC should be useful in its own right and act as a building block for others. Each RFC must be intended as a contribution to, extension of or revision of the Time Machine Infrastructure.
  • Standardisation: RFCs should aim to make use of standardised terms to improve the clarity of their recommendations.
  • Scope: RFCs are designed as contributions and implementation solutions that solve practical problems. RFCs are not research papers and may not necessarily contain experimental evidence. RFCs cover not only the technical infrastructure but also data standards, legal frameworks, values and principles.
  • Self-defining process: As with the development of the Internet, RFCs are the main process for establishing the Time Machine Infrastructure and Processes, but also the processes and roles for managing RFCs themselves.
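The Identification principle above (a unique ID plus a version number for every published RFC) can be sketched as a small registry. This is an illustration under stated assumptions: the "RFC-0001" identifier format, the class names and the method names are hypothetical and are not taken from the Time Machine RFC process itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RFC:
    rfc_id: str   # unique ID; the "RFC-0001" format is an assumption
    version: int  # starts at 1 and grows with each revision
    title: str


class RFCRegistry:
    """Hands out unique IDs and keeps every published version of each RFC."""

    def __init__(self):
        self._history = {}  # rfc_id -> list of RFC versions, oldest first
        self._next_number = 1

    def publish(self, title):
        """Register a new RFC under a fresh ID, as version 1."""
        rfc = RFC(f"RFC-{self._next_number:04d}", 1, title)
        self._next_number += 1
        self._history[rfc.rfc_id] = [rfc]
        return rfc

    def revise(self, rfc_id, title=None):
        """Publish a new version of an existing RFC, keeping its ID."""
        latest = self._history[rfc_id][-1]
        rfc = RFC(rfc_id, latest.version + 1, title or latest.title)
        self._history[rfc_id].append(rfc)
        return rfc

    def latest(self, rfc_id):
        return self._history[rfc_id][-1]


registry = RFCRegistry()
first = registry.publish("Example RFC about RFCs")
second = registry.publish("Example glossary RFC")
revised = registry.revise(first.rfc_id)
print(first.rfc_id, second.rfc_id, revised.version)
```

Keeping the full version history, rather than overwriting, fits the Incrementalism principle: earlier versions remain available as building blocks and reference points.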

The Local Time Machines

The Time Machine Network is organised as a virtually unlimited number of Local Time Machines (LTMs). Each LTM is anchored in the space of a city or a region and has the ambition to build a dense database of spatiotemporal information, laying the foundation for a 4D model of its physical environment.

The Time Machine Organisation is responsible for the development of the core infrastructure which includes structures like the Data Graph or the 4D Map. The Apps are pieces of software that allow users to experience and edit the information in the Data Graph and the 4D Map. They can be grouped into families of Apps like the Navigators or the Annotators. The way these components fit together is shown in Figure 2-2.

Figure 2-2: General structure of Local Time Machines and their relation to projects, Apps and core components.

Local Time Machines are areas characterised by a high density of operations. No single coordinating body manages all the projects that arise in a specific area. Instead, those involved in the various projects and activities can, through their own organisational structures, converge towards a community whose structure and functioning emerge locally and autonomously. The projects involved can include projects with national and international grants, internally funded institutional projects, projects financed by local administrative institutions, cultural heritage projects run by companies that benefit from the services and tools implemented by the Time Machine Organisation through the Local Time Machines Infrastructure, and also small-scale projects led by individuals.

The project-based horizontal structure has key advantages:

  • Standard processes facilitate easy on-boarding of new projects and members. They ensure openness by design.
  • Standard operations and libraries of standard operators guarantee by design the desired level of compatibility between processes and datasets.
  • Centralised repositories for projects, operations and data sets enable a constantly up-to-date map of activities in progress.

Another development principle of the Local Time Machines is scalability by design. To maximise growth of the Time Machine environment, the right balance must be found between the parts of the infrastructure under the control of the Time Machine Organisation and the pieces of software that are developed independently. Time Machine should be as distributed as possible, but as centralised as necessary.

The Time Machine Organisation (TMO) helps regional and local actors in this process by providing technology, methodology and supporting infrastructure that facilitate the digitisation pipelines, the standardisation of the information gathered and the development of related services. Over time, Local Time Machines pass through different maturity phases. Each maturity phase makes it possible to envision specific exploitation strategies. A series of Local Time Machine Academy events will be organised to present, compare and evaluate ongoing work.

If you are interested in exploring the Time Machine concept and agenda in more detail, please feel free to have a closer look at our website and Manifesto, and discover our Organisational Plan 2020-2021 for specifics.