20110505 linked openeuropeanalswt2011


EuropeanaConnect

Linked (Open) Europeana: Linked Data in the European Digital Library

Prof. Dr. Stefan Gradmann
Drawing on M. Doerr, S. Hennicke, A. Isaac, C. Meghini, G. Schreiber, H. Van de Sompel, and on work from Europeana V1.0 and EuropeanaConnect

Overview

NOT: what is Europeana

NOT: what is Linked Data

After the catalogue: semantic contextualization in Europeana

The Europeana Data Model (EDM): foundations

Mona Lisa and more ...

Status quo

On 'openness'

After the catalogue

Old and new terms
(thanks, Karen Coyle!)

Old: catalogue, holdings, 'record', document, search, library, information

New: aggregation, exploration, navigation, connection, context, knowledge

After the catalogue: objects and semantic context in Europeana

Contextualized object representations (knowledge generation): Europeana connects object organization systems and knowledge organization systems!

First, a few words about the envisioned information architecture of Europeana. This is how the information space of Europeana will be restructured: at the bottom are the objects provided to Europeana. Above them is the new Semantic Data Layer. It contains various kinds of KOSs holding knowledge about people, places, concepts, and so on. These concepts are linked to the objects below and thereby contextualize and enrich them.

The Semantic Data Layer

Library, archive, museum

Bridges islands of information and connects object representations from different memory domains.

The data provided to Europeana will come from many different domains, such as libraries, archives, and museums. Each will provide its own collections and KOSs, which naturally results in islands of information. To make the data interoperable, the concepts of the various KOSs in the Semantic Data Layer will be aligned, that is, connected via cross-vocabulary links.

Technically, this enables applications to navigate a semantic layer of concepts from different sources and to use it to reach objects that were originally described by different but semantically related concepts.
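The navigation described above can be illustrated with a small Python sketch. It is not Europeana's implementation: the concept and object identifiers are hypothetical, and it assumes the cross-vocabulary links are expressed with `skos:exactMatch`, which is only one of the SKOS mapping properties that could be used.

```python
from collections import defaultdict

# Hypothetical alignments between concepts of three provider vocabularies.
# Assumption: the links are skos:exactMatch statements.
EXACT_MATCH = [
    ("libkos:portraits", "museumkos:portrait-painting"),
    ("museumkos:portrait-painting", "archivekos:portraiture"),
]

# Hypothetical objects, each described with a concept from its own domain's KOS.
OBJECT_SUBJECTS = {
    "object:book-17": "libkos:portraits",
    "object:painting-3": "museumkos:portrait-painting",
    "object:file-88": "archivekos:portraiture",
    "object:map-5": "libkos:maps",
}


def aligned_concepts(start, links):
    """Return every concept reachable from `start` via the alignment links."""
    neighbours = defaultdict(set)
    for a, b in links:
        # The alignment is followed in both directions.
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            todo.append(nxt)
    return seen


def objects_about(concept, links=EXACT_MATCH, subjects=OBJECT_SUBJECTS):
    """Find objects described by `concept` or by any concept aligned with it."""
    wanted = aligned_concepts(concept, links)
    return sorted(obj for obj, subj in subjects.items() if subj in wanted)


portrait_objects = objects_about("libkos:portraits")
```

A query that starts from the library concept also reaches the museum and archive objects, because their concepts are connected through the alignment links.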

EDM and Linked Open Data

Context options: DBpedia, VIAF, GND, Geonames, LCSH

Europeana objects

Europeana intends to connect to the Linked Open Data community. The Linked Open Data cloud offers many more knowledge sources, such as DBpedia, Geonames, or the Library of Congress Subject Headings. Europeana wants to use them to further contextualize and enrich the objects in its information space. At the same time, Europeana wants to make its own data available to other communities.

The EDM is crucial for realizing this vision.

[ LOD cloud July 2009 ]

EDM

ESE

Europeana Semantic Elements (ESE)

created ad hoc for the first prototype (2008)

Interoperability: based on the minimalist, 'flat' Dublin Core model

simple and robust, but:

Semantics of the original metadata are lost

No specialization towards finer-grained models is possible

No links, and therefore no connections to external resources

We probably should not have called it 'semantic' :)

The current data model of Europeana is the Europeana Semantic Elements (ESE).

ESE addresses the issue of interoperability between data from the different domains represented in Europeana by reducing the data to a flat, Dublin Core-like representation. This is a simple and robust approach, but it has some drawbacks: the original metadata and the information perspective behind it are no longer visible. At the same time, we cannot specialize to finer-grained models or connect to external resources such as the LOD community.

The EDM addresses exactly these shortcomings. It tries to transcend the different information perspectives represented in Europeana. It acts as a top-level ontology that makes objects from different domains interoperable while still preserving the original data.

The EDM is destined to replace ESE after the 2011 release of Europeana. ESE will then be an application profile of EDM. That means that all ESE data in Europeana will still be compatible with the new system.

EDM

Europeana Data Model (EDM)

will replace ESE with the Danube release of Europeana (May 2011)

ESE then becomes an application profile of EDM (backward compatibility!)

preserves the semantics of the original data without loss of interoperability

enables the use of Europeana as Linked Open Data

enables the use of Linked Open Data in Europeana

enables 'semantic' usage scenarios

EDM: requirements and design principles

Distinction between the real object (book, picture, archival file, media recording) and its digital representation

Distinction between the object and its descriptive metadata

Multiple views of one object must be possible, with potentially contradictory statements

Support for complex composite objects

Standard metadata format with an option for specialization

Standard vocabulary format with an option for specialization

Maximum re-use of existing standards

EDM and other standards

Simple Knowledge Organization System (SKOS)

Models the knowledge organization systems (KOS) in the Semantic Data Layer of Europeana.

Enables links between KOSs.

DCMI Metadata Terms (DCTerms)

Basis for semantically interoperable descriptive object metadata

Provide backward compatibility with ESE

Open Archives Initiative Object Reuse & Exchange (OAI ORE)

Organizes and models aggregations of web resources for the object representation

Provided Object: represents the given (real) object

Digital Representation: a digital view of the object

Proxy: description of the object from a particular perspective

Aggregation: groups all pieces of information

EDM re-uses three ontologies, all of which are defined as RDFS models.

SKOS
SKOS is an ontology for modelling the KOSs (vocabularies) in the Semantic Data Layer of Europeana. In particular, it enables cross-vocabulary matching between concepts.

Dublin Core
Dublin Core is used to describe the core features of cultural objects. ESE uses the old Dublin Core Element Set. EDM uses the newer DCMI Metadata Terms, which are specializations of the 15 old Dublin Core elements. The use of DC Terms ensures backward compatibility with ESE.

OAI ORE
The typical record about an object provided to Europeana will include several pieces of information, e.g. descriptive metadata, views (thumbnails, video files, audio files, text documents, etc.), links to landing pages, etc. OAI ORE allows us to group and organize these pieces: the abstract provided object (Object), the descriptive metadata (Proxy), and any view of the provided object (Digital Representation).
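The backward compatibility mentioned for Dublin Core can be sketched in a few lines of Python. The sketch assumes a record is a simple dictionary from property names to lists of values, and it covers only a handful of DCMI Metadata Terms together with the original element each one refines. The record itself is made up, and this is not the actual ESE or EDM conversion.

```python
# A few DCMI Metadata Terms and the original Dublin Core element each refines.
REFINES = {
    "dcterms:alternative": "dc:title",
    "dcterms:created": "dc:date",
    "dcterms:issued": "dc:date",
    "dcterms:spatial": "dc:coverage",
    "dcterms:temporal": "dc:coverage",
    "dcterms:extent": "dc:format",
    "dcterms:isPartOf": "dc:relation",
}


def flatten_to_elements(record):
    """Reduce a record that uses refined terms to the flat element view."""
    flat = {}
    for prop, values in record.items():
        # Properties without a known refinement pass through unchanged.
        element = REFINES.get(prop, prop)
        flat.setdefault(element, []).extend(values)
    return flat


# Hypothetical record; all values are invented for illustration.
record = {
    "dc:title": ["Example title"],
    "dcterms:alternative": ["Alternative example title"],
    "dcterms:created": ["1868"],
    "dcterms:spatial": ["Paris"],
}
flat = flatten_to_elements(record)
```

Because each refined term specializes one of the old elements, a richer description can always be reduced to the flat view, while the reverse direction would require information that the flat record no longer contains.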

EDM: classes

In the physical world we create, use, and refer to aggregations of things all the time. We collect pictures in a photo album, read journals that are collections of articles, and burn CDs of our favorite songs. In the physical world these aggregations are frequently tangible: we can hold the photo album, the journal, and the CD. But we also aggregate abstract entities, for instance on the Web. OAI-ORE makes it possible to identify such an aggregation.

Mona Lisa: description by the
Direction des Musées de France ...

Mona Lisa as described and depicted by the French Ministry of Culture (Direction des Musées de France)

... and as a metadata aggregation
in EDM

Proxy

Digital representations

Real object

Aggregation

This is the metadata record of the French ministry of culture modeled in EDM.

Each bubble represents a resource. Inside the bubble you see the class of the resource (its type) in italics and, beneath it, the URI that identifies the resource. The arrows are the semantic links (the properties) between the resources. Where there are two properties, the lower one is a sub-property of the other and has a more specific meaning.

First we have the Aggregation node, which groups together all the pieces of information delivered by the Ministry. It aggregates the node representing the physical object Mona Lisa, the digital representations of the Mona Lisa, and the proxy node. The proxy is specific to a given provider and represents the description of the provided object as seen from the perspective of that provider.

This is what every metadata record provided to Europeana will look like in its basic form.
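The basic record shape can be written down as a small set of triples. The following Python sketch makes several assumptions: every `ex:` URI is a hypothetical placeholder, the class names `ens:PhysicalThing` and `ens:WebResource` follow the terms used in these notes, and the proxy is attached with the OAI-ORE properties `ore:proxyFor` and `ore:proxyIn` rather than with `ore:aggregates`. The actual Mona Lisa record may use additional or more specific properties.

```python
# "ex:" marks hypothetical URIs; they are placeholders, not real identifiers.
AGG = "ex:aggregation/mona-lisa"
OBJ = "ex:object/mona-lisa"
PROXY = "ex:proxy/mona-lisa"
VIEWS = ["ex:view/mona-lisa-thumbnail", "ex:view/mona-lisa-full"]

triples = [
    (AGG, "rdf:type", "ore:Aggregation"),
    (OBJ, "rdf:type", "ens:PhysicalThing"),
    (PROXY, "rdf:type", "ore:Proxy"),
    (AGG, "ore:aggregates", OBJ),
    # The proxy describes the object within this one aggregation.
    (PROXY, "ore:proxyFor", OBJ),
    (PROXY, "ore:proxyIn", AGG),
    # Descriptive metadata hangs off the proxy, not off the object itself.
    (PROXY, "dc:title", "Mona Lisa"),
]
triples += [(view, "rdf:type", "ens:WebResource") for view in VIEWS]
triples += [(AGG, "ore:aggregates", view) for view in VIEWS]


def find(subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def has_type(resource, cls):
    """True if the resource is declared to be an instance of cls."""
    return bool(find(resource, "rdf:type", cls))


def summarize(aggregation):
    """Split an aggregation into provided object, views, and proxies."""
    aggregated = [o for _, _, o in find(aggregation, "ore:aggregates")]
    return {
        "provided_object": [r for r in aggregated if has_type(r, "ens:PhysicalThing")],
        "views": [r for r in aggregated if has_type(r, "ens:WebResource")],
        "proxies": [s for s, _, _ in find(predicate="ore:proxyIn", obj=aggregation)],
    }


summary = summarize(AGG)
```

Keeping the title on the proxy instead of on the object node is what later allows several providers to describe the same object without overwriting each other's statements.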

Why manage central nodes for provided objects?

The ORE model says so: an ORE proxy has to be a proxy for some "view-independent" resource.

Users are looking for (real-world) objects (the painting Mona Lisa) and not for the specific view of it held by the Louvre or Joconde (views they normally do not know about anyway). So the approach is: find the object first (PhysicalThing) and then proceed to the specific views on it. This is also the LOD approach.

Semantic enrichment

Time periods, dates

Spatial entities

Persons and
organizations

Concepts

Europeana wants to contextualize and enrich its objects by linking them to resources which contain additional knowledge. This enables richer functions, such as query expansion (e.g., using alternatives for a creator's name), recommendation of objects using semantic relations between them (objects created by connected artists), etc.

This is the same Proxy as on the previous slide, but now all the string values have been converted to typed resources. For example, the subject of the painting Mona Lisa, "femme", is now a resource typed as a concept, with the English and French labels of the concept attached, taken from a KOS in the Semantic Data Layer. In the same KOS we could also find the broader term for this concept. Furthermore, we could semantically align the concept "femme" with the concept "femme" in Wikipedia (LOD cloud) and use all the information available there for this specific subject, including the many translations of the term itself.

The aim is to increase the data value of Europeana's objects.
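A minimal Python sketch of this enrichment step follows. The concept store, its URIs, and the external match are hypothetical, and matching a string against preferred labels is a simplifying assumption; real enrichment would need more careful disambiguation.

```python
# Hypothetical concept store; URIs, labels, and links are illustrative only.
CONCEPTS = {
    "kos:woman": {
        "skos:prefLabel": {"fr": "femme", "en": "woman"},
        "skos:broader": ["kos:person"],
        "skos:exactMatch": ["ext:Woman"],
    },
    "kos:person": {
        "skos:prefLabel": {"fr": "personne", "en": "person"},
        "skos:broader": [],
        "skos:exactMatch": [],
    },
}


def resolve(literal, lang):
    """Return the URI of the concept whose preferred label matches, or None."""
    for uri, concept in CONCEPTS.items():
        if concept["skos:prefLabel"].get(lang, "").lower() == literal.lower():
            return uri
    return None


def enrich(proxy, lang="fr"):
    """Replace dc:subject strings with concept URIs where a match exists."""
    enriched = dict(proxy)
    enriched["dc:subject"] = [
        resolve(value, lang) or value for value in proxy.get("dc:subject", [])
    ]
    return enriched


def expand_query(term, lang):
    """Return all known labels of the matching concept, for query expansion."""
    uri = resolve(term, lang)
    if uri is None:
        return [term]
    return sorted(CONCEPTS[uri]["skos:prefLabel"].values())


enriched = enrich({"dc:title": ["Mona Lisa"], "dc:subject": ["femme", "portrait"]})
```

Strings without a matching concept, such as "portrait" in this made-up store, are left untouched, so enrichment never removes information that the provider supplied.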

Event-based modeling

Preserving and using the original data implies compatibility of descriptions beyond the simple object view!

What we have looked at so far can be understood as object-centric modeling. The second general modeling approach is event-centric: it tries to tell a story about the object's history.

For this purpose EDM provides a simple event-centric core of one class and three properties:

ens:Event: hub for event descriptions
ens:wasPresentAt: holds between any resource and an event it is involved in
ens:happenedAt: holds between an event and a place
ens:occurredAt: holds between an event and the time span during which it occurred

This is to give you an impression of what is possible without going into details.
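As a rough illustration, the event-centric core can be exercised with a few triples in Python. Only the `ens:` class and property names come from the notes above; every `ex:` URI, and the two events themselves, are hypothetical.

```python
# Hypothetical URIs; only the ens: class and property names come from the text.
triples = [
    ("ex:event/creation", "rdf:type", "ens:Event"),
    ("ex:object/painting", "ens:wasPresentAt", "ex:event/creation"),
    ("ex:agent/painter", "ens:wasPresentAt", "ex:event/creation"),
    ("ex:event/creation", "ens:happenedAt", "ex:place/workshop"),
    ("ex:event/creation", "ens:occurredAt", "ex:timespan/t1"),
    ("ex:event/acquisition", "rdf:type", "ens:Event"),
    ("ex:object/painting", "ens:wasPresentAt", "ex:event/acquisition"),
    ("ex:event/acquisition", "ens:happenedAt", "ex:place/museum"),
    ("ex:event/acquisition", "ens:occurredAt", "ex:timespan/t2"),
]


def first_value(subject, predicate):
    """Return the first object of a matching triple, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def story_of(resource):
    """List (event, place, time span) for each event the resource was present at."""
    events = [o for s, p, o in triples if s == resource and p == "ens:wasPresentAt"]
    return [
        (event, first_value(event, "ens:happenedAt"), first_value(event, "ens:occurredAt"))
        for event in events
    ]


def participants(event):
    """Return every resource that was present at the event."""
    return sorted(s for s, p, o in triples if p == "ens:wasPresentAt" and o == event)


painting_story = story_of("ex:object/painting")
```

The event acts as the hub: objects, agents, places, and time spans are never linked to each other directly, only through the event they share.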

Complex objects and relationships

Part-whole relations for complex (hierarchical) objects

Object sequences

Derivation and versioning

This is a (more or less fictional) example of three records about a translation of Edgar Allan Poe's The Narrative of Arthur Gordon Pym of Nantucket into French:
Record from BNF about an edition from 1868
Record from Gallica about an edition from 1868 (which offers a digital version of the book online: this is the WebResource)
Record from BNF about an edition from 2007

A few things I want to point out:
Two records are about the same thing, and both point to the same object of interest, the 1868 edition. The user will look for this edition and not for the specific view that Gallica or BNF has of it. So this node is the point of entry from which a user proceeds to a specific view on the object.
It is now also apparent why Proxies for the descriptive metadata are helpful: this way we can keep the two views on the 1868 edition distinct.
Finally, the link isDerivativeOf is an example of an inter-object link. So, for example, a user who has found the 2007 edition will also be pointed to the digital version of the 1868 edition in Gallica.
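The three points above can be traced in a short Python sketch. All `ex:` URIs are hypothetical, `ex:digitalVersion` is an invented stand-in for whatever link connects the edition to its WebResource, and the `ens:` prefix on isDerivativeOf is an assumption based on the prefix used for the event properties.

```python
# Hypothetical URIs; only the overall shape follows the example above.
EDITION_1868 = "ex:object/pym-1868"
EDITION_2007 = "ex:object/pym-2007"

triples = [
    # Two providers describe the same 1868 edition through separate proxies.
    ("ex:proxy/bnf-1868", "ore:proxyFor", EDITION_1868),
    ("ex:proxy/gallica-1868", "ore:proxyFor", EDITION_1868),
    ("ex:proxy/bnf-2007", "ore:proxyFor", EDITION_2007),
    # Invented property name for the link to the online version (the WebResource).
    (EDITION_1868, "ex:digitalVersion", "ex:webresource/gallica-pym-1868"),
    # Inter-object link between the two editions.
    (EDITION_2007, "ens:isDerivativeOf", EDITION_1868),
]


def proxies_for(obj):
    """Return all provider-specific descriptions of one object."""
    return sorted(s for s, p, o in triples if p == "ore:proxyFor" and o == obj)


def related_digital_versions(obj):
    """Collect digital versions of obj and of the objects it derives from."""
    found, todo, seen = [], [obj], set()
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in triples:
            if s != current:
                continue
            if p == "ex:digitalVersion":
                found.append(o)
            elif p == "ens:isDerivativeOf":
                todo.append(o)
    return found


hint_for_2007 = related_digital_versions(EDITION_2007)
```

Starting from the 2007 edition, the derivation link leads to the 1868 edition and from there to its digital version, while `proxies_for` keeps the BNF and Gallica descriptions of the 1868 edition apart.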

With respect to FRBR, one could now start discussing what and where the work, expression, manifestation, and item are here. Although the development of the EDM was inspired by FRBR, FRBR is not implemented in it yet. That will happen after 2011.

ESE in EDM