Documentation

The mdw operates the mdwRepository for the collection, archiving and publication of media objects at the mdw. Media objects can exist in all machine-readable formats and include, in particular, digitized material, text, image, video and audio files, as well as software and data collections from teaching, research, and the development and advancement of the arts.

Several organizational units of the mdw develop specialized services in mutual coordination to promote the use of this infrastructure.

mdwRepository as a central tool for managing media objects

The mdwRepository manages media objects in the form of data and their metadata, which enable the cataloging of media objects, define their possible uses, place them in context, and guarantee their (re)use. The mdwRepository follows the RDA Metadata IG metadata principles:

  1. "The only difference between metadata and data is mode of use

  2. Metadata is not just for data, it is also for users, software services, computing resources

  3. Metadata is not just for description and discovery; it is also for contextualisation (relevance, quality, restrictions (rights, costs)) and for coupling users, software and computing resources to data (to provide a Virtual Research Environment)

  4. Metadata must be machine-understandable as well as human understandable for autonomicity (formalism)

  5. Management (meta)data is also relevant (research proposal, funding, project information, research outputs, outcomes, impact…)" (RDA Metadata Principles and their Use 20141114 - Word Document)

The following categories of media objects are currently managed in the mdwRepository or are under development or in planning stage:

  • Publications
    • Open Access publications (pub.mdw)

    • Theses (under development)

  • Research Data

  • Collections & Archives

  • Performance and Intellectual Capital documentation (in planning)

  • Research Information System (under development)

  • Open Educational Resources (in planning).

To ensure long-term accessibility, unique and stable identifiers (permalink, optional DOI) are assigned to all data and metadata stored in the mdwRepository. Thus, they become digital resources in the sense of the Digital Resource Management (defined in the mdw Development Plan 2019-2024).

The mdwRepository uses metadata standards such as Dublin Core (which is also an ISO standard), OpenAIRE standards, General International Standard Archival Description (ISAD(G)) or user-defined metadata schemas. The latter should only be used if no suitable metadata standard exists.

In addition to master objects, the mdwRepository also generates and/or manages variants of the data (e.g. preview files or format conversions, e.g. data transformation for long-term archiving). Media objects and their metadata are technically checked, e.g. to ensure data integrity. Ongoing backups ensure data security.

All members of the mdw as well as external partners with written agreements (e.g. through funding contracts, cooperation agreements, etc.) can store and manage media objects in the mdwRepository after agreeing to the terms of use.

The public area of the mdwRepository is used for publishing metadata and data. Publishing is done on freely accessible web pages and simultaneously on technical services (e.g. OAI-PMH endpoint, REST APIs or SPARQL endpoint). This ensures access to the media objects by the respective designated communities (providing human-readable and technical services) and enables integration into external systems (e.g. WorldCat or BASE - Bielefeld Academic Search Engine).

The mdwRepository is a central component of the Digitization Strategy, of the Digital Resource Management Strategy set out in the 2019-2024 and 2022-2024 Development Plans, and of the mdw Research Strategy.
The mdwRepository is listed in the following directories of repositories:

Main functions of the mdwRepository

  • Provide a central collection of media objects at mdw,
  • Provide an administration environment for the media objects (incl. metadata collection, permissions and rights assignment),
  • Secure the underlying data and long-term archiving (incl. data transformation and migration),
  • Provide access to the metadata and/or data for various purposes (e.g., teaching, research, development, and indexing of the arts), following Open Access and FAIR principles where possible,
  • Enable detailed searches and analyses (in the metadata and/or data domain) and thereby
  • Promote the use of data and metadata.

mdwRepository as part of the Digital Resource Management at mdw

Digital Resource Management has been highlighted as a strategically important area in mdw's development plans EP 2019-2024 (Chapter 11, Digital Resource Management) and EP 2022-2024 (Chapter 10, Digitization). Digital resources, as defined by the International Organization for Standardization (ISO), are all types of resources that are transmitted and/or retrievable or accessible by means of an information technology system and are referenced by a unique and stable identifier in a recognized identification system (e.g., ISBN, permalink, URN, DOI). Thus, the notion of digital resource in the ISO definition largely corresponds to the definition of media objects in the terms of use of the mdwRepository. 

One of the core areas of digital resource management at mdw is digital asset management, a term first established in media or image and video management. In 2014, mdw introduced a corresponding solution and built its digital asset management system on a central repository for the storage and reuse of digital objects. In this context, a digital repository is understood as the service in the sense of ISO that provides reliable access to the managed digital resources.

By recording metadata that enables the media objects to be indexed and defines their use, the foundation is laid for media objects to become digital assets as defined by ISO that have a specific value and are used. 

Since 2016, the mdwRepository has evolved into a re3data-registered data repository and has been extended with the open access publication server pub.mdw, a literature repository component. Currently, the mdwRepository also includes Digital Assets, Datasets and Collections & Archives. There are plans to expand the mdwRepository with new components in the Literature Repository (thesis server and mdwPress) as well as with the creation of a Current Research Information System (CRIS). Thus, the mdwRepository is built on three pillars, which can be described as

  • Data Repository,
  • Literature Repository and
  • CRIS Systems

in accordance with internationally used terms.

 

[Figure 1: structure of the mdwRepository. Data Repository: mdwData, Datasets, Archives/Collections, Open Educational Resources. Literature Repository: pub.mdw, ethes.mdw, mdwPress, Open Access Journals. CRIS: mdwFIS, mdwLIS. Shared foundation: entity / classification servers and technical infrastructure.]

Fig. 1: 3 pillars of the mdwRepository: Literature Repository pub.mdw (registered in Directory of Open Access Repositories (OpenDOAR)), Data Repository mdwData (registered in Registry of Research Data Repositories (re3data)) and planned mdwFIS CRIS System (registered in Directory of Research Information Systems (DRIS))

 

From a user perspective, the three-pillar structure means that the core of data storage is the Data Repository and that the Literature Repositories and the FIS/LIS interact with it (e.g., by storing research outputs in the Data Repository, whose metadata is accessed by the FIS). Collections and archives use the Data Repository as a long-term archiving tool and to manage collection management information. The Data Repository enables data archiving (format identification, data validation, metadata extraction, hash check, virus check, etc.) and documents the processing chain and change history, i.e. the digital provenance of the data, according to PREMIS (PREservation Metadata: Implementation Strategies). Users do not primarily interact with a literature repository or a data repository; they interact with the mdwRepository and its services.

Technically, the mdwRepository follows the OpenAIRE guidelines for the respective repository pillar:

For the three repository components to work together seamlessly, they must use the same classifications (vocabulary, taxonomy, thesaurus, ontology) to ensure semantic interoperability and must share a common set of entities (e.g. agents, projects, works).

While the entity registry server (mdwERS) will enable a clear assignment of persons, organizational units, projects, etc. within the mdwRepository and enable the exchange of entity data with other systems, e.g. funding agencies, the classification registry server (mdwCRS) will form the basis for uniform semantics (vocabularies, taxonomy, thesaurus, ontology) within the mdwRepository. 

mdwRepository Services

Media Objects are stored in a shared technical infrastructure (virtual machines, database clusters, filesystem, applications and services).

On the Media Object management side, the mdwRepository generates unique and stable identifiers (Permalink, optionally DOI) for all stored data and metadata, turning them into digital resources. DOIs are provided according to the DOI Policy of the mdw.

The mdwRepository uses metadata standards like Dublin Core (ISO 15836-1:2017), OpenAIRE standards, General International Standard Archival Description (ISAD(G)) or user-defined metadata schemas. The latter should only be used if no metadata standards exist. 
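As an illustration of what a simple Dublin Core description can look like, the following minimal sketch serializes a record in the oai_dc container format. The function name and all field values are placeholders chosen for this example; they do not describe the actual metadata schema of the mdwRepository.

```python
import xml.etree.ElementTree as ET

# Namespaces of the Dublin Core element set and the oai_dc container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def build_dc_record(fields):
    """Serialize a dict of Dublin Core element names to lists of values as oai_dc XML."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; every value below is a placeholder.
record = build_dc_record({
    "title": ["Example recording"],
    "creator": ["Doe, Jane"],
    "type": ["Sound"],
    "identifier": ["https://example.org/permalink/123"],
    "rights": ["All Rights Reserved"],
})
print(record)
```

Every Dublin Core element is optional and repeatable, which is why the sketch maps each element name to a list of values.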

On the data level, the mdwRepository generates and/or manages not only the originals (master) but also variants of the data (e.g. preview files or format conversions, e.g. data transformation for long-term archiving). In each case, technical checks of the media objects and their metadata are performed (e.g. to ensure data integrity). Ongoing backups ensure data security.

All mdw members including researchers, students, staff and doctoral candidates as well as external partners with a corresponding written agreement (e.g. through funding agreements, cooperation agreements, etc.) can store and manage media objects as depositors in the mdwRepository after agreeing to the Terms of Use.

The public area of the mdwRepository is used to publish metadata and data. Publication takes place on freely accessible websites and simultaneously on technical services (e.g. OAI-PMH endpoint, REST-APIs or SPARQL endpoint). This ensures access to the media objects by the respective user group - people and technical services - and enables integration with external systems (e.g. WorldCat or BASE - Bielefeld Academic Search Engine).

The mdwRepository is a key component of the Digitalisation Strategy currently being established (Roadmap), of the Digital Resource Management strategy set out in the Development Plan 2019-2024 (in German), and of the Research Strategy (in English) of the mdw.

The mdwRepository is registered in:

 

mdwRepository Services

The mdwRepository team provides a set of services for using the technical infrastructure. Users are supported and advised by the respective team members depending on the usage scenario, e.g. for digitisation projects, open access publications and data management plans.

 

 

Mission Statement

The mission of the mdwRepository is to support capturing and preserving the university’s intellectual outputs by ensuring and promoting sustainable services of ingest, storage and access to Media Objects.

The main functions of the institutional repository are:

  • acquiring Media Objects of the mdw,

  • providing a management environment for the Media Objects (incl. metadata management, permission / access control),

  • securing the data for long term archiving / digital preservation (incl. data integrity validation, data transformation and migration),

  • providing access to Media Objects (data and/or metadata) for various purposes (incl. support of the research of the university, students, and other scholars, teaching, and advancement, appreciation and teaching of the arts) in accordance with Open Access and FAIR principles,

  • enabling detailed search, discovery and dissemination of Media Objects (on data and/or metadata level), and

  • actively promoting the use of the Media Objects.

 

mdwRepository as part of the Digital Resource Management at the mdw

Digital Resource Management has been highlighted as a strategically important core area in the mdw Development Plan 2019-2024. In the sense of the ISO/IEC 24751-3:2008 definition, digital resources are all types of resources which are transferred and/or retrievable or accessible by means of an information technology system and which are referenced via a unique and stable identifier in a recognized identification system (e.g. ISBN, Permalink, URN, DOI). Thus, the ISO definition is close to the definition of media objects in the mdwRepository Services Terms of Use.

A digital repository in the sense of ISO 24622-1:2015(en) is the service that provides reliable access to the digital resources managed. 

One of the core areas of Digital Resource Management at mdw is the field of Digital Asset Management, a term which first established itself in the field of media and/or image and video management. The mdwRepository manages media objects as a combination of data and metadata, turning media objects into digital assets according to ISO/IEC 19770-1:2017: media objects become digital assets, which have a certain value for depositors and are actually used, only once metadata is captured that enables the indexing of the media objects and defines permissions and usage terms.

The digital asset management system is based on a central repository for the storage and reuse of digital objects. The mdwRepository has its roots in the Digital Asset Management domain (see History).

 

mdwRepository Services for Research Data

Research data (datasets) can be managed during their entire life cycle, which typically consists of four phases: creation, management, publication and storage/archiving. Among other things, this makes it possible to exercise the rights and fulfil the obligations of researchers arising from contracts with third-party funding bodies and other legal sources and, for example, to implement data management plans (DMPs). DMPs have since become a fixed component of external research funding. For example, the Austrian Science Fund FWF requires a DMP for all projects.

Contact: forschungsfoerderung@mdw.ac.at

 

mdwRepository Services for performance documentation and research information system

In the future, the performance documentation (in German: Leistungsinformationssystem - LIS) will be used to record the performance of the scientific/artistic staff. The LIS is intended to enable researchers and artists to adequately present their work at mdw. The data entered should serve, among other things, to prepare the legally required bibliographic record and the Intellectual Capital Record of the mdw. An integrated research information system (in German: Forschungsinformationssystem - FIS) aims to document research outputs, in particular with regard to research projects.


Contact person LIS: Bernhard Kurz (Quality Management), kurz@mdw.ac.at, +43 1 711 55 6012;
Contact person FIS: Vitali Bodnar (Research Support), bodnar@mdw.ac.at, +43 1 711 55 6114

 

mdwRepository Services for Open Access Publications

pub.mdw offers mdw members the opportunity to publish their scientific work electronically, to make it permanently accessible and, where possible, to make it reusable through open licensing. The service thus contributes to the implementation of the open access policy.

Persistent Identifiers (DOI) guarantee that publications can be clearly referenced and form the basis for their sustained dissemination via the Internet.

Contact: Michael Staudinger (University Library), staudinger@mdw.ac.at, +43 1 711 55 8100

 

mdwRepository Services for Theses

In accordance with § 38 of the study law section of the mdw statutes, academic master's or diploma theses and dissertations must be submitted in electronic form in addition to the printed copy. In doing so, the submitters can give their consent to the publication of the electronic version. 

If consent is given, the electronically submitted theses will be stored in the mdwRepository and made available via the ub.mdw search portal.

 

mdwRepository Services for Collections & Archives

The rich cultural heritage of the mdw will be digitally accessible via an archive portal. The archives and collections set up at the mdw will thus be able to present their holdings in a database system that complies with international standards such as the General International Standard Archival Description (ISAD(G)), Records in Contexts (RiC) and Dublin Core (DC), follows cataloguing guidelines such as RNAB (resource cataloguing with authority data in archives and libraries), and is linked to authority files such as the GND or VIAF. The mdwRepository can use the Encoded Archival Description (EAD) maintained by the Technical Subcommittee on Encoded Archival Standards as a transfer format.

As a transdisciplinary service facility, the portal, designed as an open knowledge platform, guarantees access to historical material and, combined with the digitization of the originals, serves to safeguard the cultural heritage of the mdw.

Contact: Severin Matiasovits (Archives of the mdw), matiasovits@mdw.ac.at, +43 (1) 71155 - 6500

 

Access

Employees and students at the mdw have access to nuxeo via LDAP authentication. External users must be registered by the mdwRepository administrators.

Users of the backend systems need to accept the Terms of Use applicable to the repository services. Otherwise, no access to the management interface is granted.

Access levels can be defined in the management backend depending on the requirements of the digital asset. The information on rights and conditions of access is recorded in the administrative metadata (rights metadata and ACL). We differentiate between access to the metadata and access to the content object and/or file: granting access to the metadata does not automatically imply access to the content object and/or file.
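The separation between metadata access and content access can be pictured with a small data model. This is only a sketch under assumptions: the class, the function names and the group names ("public", "staff") are invented for illustration and do not reflect how nuxeo stores its ACLs.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRules:
    """Hypothetical rights metadata with separate reader sets for metadata and content."""
    metadata_readers: set = field(default_factory=set)
    content_readers: set = field(default_factory=set)


def can_read_metadata(rules, group):
    """Check whether a group may see the descriptive metadata."""
    return group in rules.metadata_readers


def can_read_content(rules, group):
    """Check content access on its own; metadata access alone is not sufficient."""
    return group in rules.content_readers


# Example: the metadata is public, the file itself is restricted to staff.
rules = AccessRules(metadata_readers={"public", "staff"}, content_readers={"staff"})
print(can_read_metadata(rules, "public"), can_read_content(rules, "public"))
```

Keeping two independent sets makes it impossible for a metadata grant to leak into content access by accident.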

Once persistently published (with URN or DOI), the digital assets are available to the public.

 

Licenses

Users assign the licenses required for their Digital Assets themselves, based on a pre-defined license list that covers license statements such as "All Rights Reserved" as well as open licenses, e.g. Creative Commons. Together with the Access Rules, the licenses control the publishing and distribution options available for the digital assets.

 

Organizational Infrastructure

The mdwRepository is operated in-house at the mdw - University of Music and Performing Arts Vienna. We mainly use Virtual Machines to host the different services provided by the mdwRepository:

  • nuxeo backend system (with ElasticSearch Cluster for search functionality, Kibana for analysis, Oracle database, and network storage that can be extended dynamically as the data amount grows),
  • Triple Store and Graph Database (Apache Jena Fuseki instances and Blazegraph),
  • OAI-PMH endpoint provided via an Apache Webserver,
  • mdwCMS for managed content on the mdwRepository website,
  • Python Flask server for public touchpoints (restricted public APIs, custom frontends), and
  • Custom frontends (accessing the nuxeo backend system via its REST API).

All services undergo daily backups / snapshots.

The staff of the IT department ensures the technical availability and functionality of the mdwRepository infrastructure.

The Digital Asset Management team provides knowledge on digital preservation in general, expertise in data management planning, metadata management, and implementing custom application functionalities.

 

Data Integrity and Authenticity

The data in the backend system nuxeo is stored in a Visible Content Store (VCS) that uses an SQL database as a backend and stores data in a content-addressable storage based on an MD5 digest.

The data can only be accessed via the nuxeo frontend by registered users who have accepted the mdwRepository services Terms of Use. Apart from members of the IT department, nobody has direct access to the files on the file system, which protects the data against unauthorized alteration and thus ensures data integrity. Checksums are created for all ingested files and validated at regular intervals. In case of a mismatch, the affected files are restored from backups.
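A periodic checksum validation of this kind could look roughly as follows. This is a minimal sketch, not the repository's actual job: the manifest format (a mapping from file path to the digest recorded at ingest) and the function names are assumptions, and MD5 is used only because it is the digest named above.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so that large media files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(manifest):
    """Return the paths whose current digest differs from the recorded one.

    manifest maps a file path to the digest recorded at ingest (assumed format).
    Missing files are reported as mismatches as well.
    """
    damaged = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or file_digest(path) != recorded:
            damaged.append(path)
    return damaged
```

The list returned by `find_mismatches` would then be the set of files to restore from backup.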

Authenticity is preserved by creating an audit record for each media object ingested and managed within the mdwRepository (incl. creation user and datetime information, modification or status or publication information). The audit trail is stored in the database backend.
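Conceptually, such an audit trail is an append-only log of events per media object. The sketch below illustrates the idea in memory; the class names, field names and event labels are hypothetical, and the real audit trail lives in the database backend.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One immutable audit entry (hypothetical field names)."""
    object_id: str
    event: str          # e.g. "created", "modified", "published"
    user: str
    timestamp: datetime


class AuditTrail:
    """Append-only collection of events; existing entries are never edited."""

    def __init__(self):
        self._events = []

    def record(self, object_id, event, user):
        """Append a new entry stamped with the current UTC time."""
        entry = AuditEvent(object_id, event, user, datetime.now(timezone.utc))
        self._events.append(entry)
        return entry

    def history(self, object_id):
        """Return all entries for one media object in the order they were recorded."""
        return [e for e in self._events if e.object_id == object_id]
```

For example, recording "created" and then "published" for the same object yields a two-entry history that documents who did what and when.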

Both metadata and binary files are versioned at the application layer by the repository software (using versions and revisions).

 

Data Discovery and Identification

The backend system nuxeo creates a unique ID for every asset. Thus, each record can be uniquely identified within the system. The files in the Java Content Repository are stored in a checksum-based directory structure.
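The idea of a checksum-based directory structure can be sketched as follows: the storage path of a file is derived from the digest of its content. The two-level fan-out used here (first two and next two hex characters of the digest) is an assumption for illustration and not necessarily the layout used by nuxeo.

```python
import hashlib
from pathlib import Path


def blob_path(root, data):
    """Derive a storage path from the MD5 digest of the content.

    The two-level directory fan-out is an assumed layout.
    """
    digest = hashlib.md5(data).hexdigest()
    return Path(root) / digest[:2] / digest[2:4] / digest


def store_blob(root, data):
    """Write content under its digest; identical content maps to the same file."""
    path = blob_path(root, data)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return path
```

Because the path depends only on the content, storing the same file twice does not create a second copy, and the file name doubles as the checksum to validate against.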

For external publication the mdwRepository provides persistent identifiers: 

  • URNs for all externally available resources and
  • DOIs on request.

Data can be published externally via the OAI-PMH endpoint for third parties to harvest and re-use the data and through an Apache Jena Fuseki Triple Store and a Blazegraph server via our SPARQL endpoints. Additionally, data can be presented on websites that interact with the backend system via a RESTful API.
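To give an impression of how a third party might harvest from an OAI-PMH endpoint, the following sketch builds a ListRecords request and extracts the record identifiers from a response. The base URL and the sample response are made up for this example (no live request is sent); only the `verb`, `metadataPrefix` and `set` parameters and the XML namespace come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Extract the header identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Shortened, made-up response used in place of a live request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai"))
print(record_identifiers(SAMPLE))
```

A real harvester would additionally follow the `resumptionToken` that the protocol uses to page through large result sets.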

All metadata is indexed in an ElasticSearch index together with the full text extracted from documents for easy retrieval. All search strategies offered by ElasticSearch can be used within the mdwRepository. Searches can be bookmarked for easy access later on.

mdwRepository is listed in the Registry of Research Data Repositories, re3data.org.

The OAI-PMH endpoint is registered in the list of OAI-PMH Registered Data Providers, which is used as a source for OAI-PMH endpoints available for harvesting data. Additionally, OAI-PMH based data is available via BASE (Bielefeld Academic Search Engine).

 

Security

For greater security of access, all mdwRepository components are accessed through a firewall by both internal and external users (no direct access to servers except for the IT department).

Access to search servers (ElasticSearch) is restricted by IP-based access restrictions. Thus, users do not have direct access to these servers.

Only a limited number of members of the IT department and external service providers with a valid contract and non-disclosure agreement have access to the servers, databases and storage.


Contact

IT / Digital Asset Management

Stefan Szepe
+43 1 711 55 7329
szepe@mdw.ac.at

 

Research Support

forschungsfoerderung@mdw.ac.at

 

University Library

Michael Staudinger
+43 1 711 55 8100
staudinger@mdw.ac.at

 

Further contact persons

 

Data protection:

Anne-Kristin Fischer
+43 1 71155 – 6046
datenschutz@mdw.ac.at


Intellectual property:

Paul Hofmann
+43 1 71155 – 6220
hofmann-paul@mdw.ac.at


Academic integrity:

Karl-Gerhard Straßl
+43 1 711 55 – 6010
strassl@mdw.ac.at