The entry point for importing collections and data into the DARIAH-DE Repository is the DARIAH-DE Publikator, which allows you to prepare, manage, and finally import your collections into the DARIAH-DE Repository using your favorite internet browser.
What’s new in version 5?¶
With version 5 of the Publikator, the new DARIAH-DE Repository Search is introduced on the domain <https://repository.de.dariah.eu>. The Publikator remains available directly at <https://repository.de.dariah.eu/publikator> or via a direct menu link from the Repository Search. From now on, your data will not only be published but will also be added to the Repository Search automatically, instead of only creating a collection description draft in the DARIAH-DE Collection Registry. The status REGISTERED has been removed from the Publikator; the final status of any collection will always be PUBLISHED.
New repository-only instances of the Generic Search and the Collection Registry have been set up, which describe and find only collections and data in the DARIAH-DE Repository. You can describe your collection and edit your collection description there.
Your data will also still be available in the DARIAH-DE Generic Search.
The term collection requires an explanation in the context of the DARIAH-DE Repository and the DARIAH-DE Research Data Federation Architecture: a collection here means a set of research data, in practice a set of files that belong together in some way.
If your files are already publicly accessible as a collection, already have Digital Object Identifiers, and someone (e.g. a data center) takes care of their safe storage, you can register and describe them as a collection in the DARIAH-DE Collection Registry (as before). If you have a technical interface to your collection, you can also specify it in your collection description. The contents of your collection are then indexed in the Generic Search of DARIAH-DE and can be found there.
However, your research data may also be stored locally on a hard disk, on a CD, or in a location that is not publicly accessible, either as a collection or as a single file. In that case it is not accessible to other researchers, it cannot be searched for and found by other interested parties, and it may be lost to science if not maintained. If you want to make your data available to other scientists and keep your research results safe and citable, you can import them into the DARIAH-DE Repository via the DARIAH-DE Publikator.
After that, your research data
- will be stored safely in the repository,
- will receive a Digital Object Identifier (for the collection itself and for each file).
Your data then
- can be permanently referred to and be cited,
- is publicly accessible,
- is described as a collection in the Collection Registry, and
- is searchable in the Repository Search as well as in the DARIAH-DE Generic Search.
Your research data are then included in the research data life cycle and are thus available for subsequent use.
Log in with the DARIAH-DE Account or with the Federation Account¶
You can reach the DARIAH-DE Publikator in the DARIAH-DE Portal from the side of the
or also directly via this link to the
To use the DARIAH-DE Publikator, please click on the Start with the DARIAH-DE Publikator button (see figure 1) and log in with your DARIAH or Federation account. If you do not have a DARIAH account, you can apply for it HERE.
First Time Login: Confirm the Access to the Storage¶
When you log in to the DARIAH-DE Publikator for the first time, you will be asked whether you want to allow your account to access the DARIAH storage. This dialog is displayed because the DARIAH Federation Architecture uses OAuth for its services. You must allow the access; otherwise you cannot use the DARIAH-DE Publikator. This dialog is only displayed once.
Publishing with the DARIAH-DE Publikator¶
A collection created in the DARIAH-DE Publikator is initially only used to aggregate research data. In this way, you have a superordinate unit that groups your data under a topic and allows you to describe your data as a collection of related objects. The associated data can be assigned to this collection and uploaded for publication. Your files are also described with metadata. Dublin Core Simple is used as the metadata standard in order to follow a generic approach, so that you have a small, well-defined set of metadata fields to describe your data. Only a few fields are mandatory.
After publication, the data of this collection is stored securely in the DARIAH-DE Repository and is publicly accessible. You can use the persistent identifiers (DOIs) to reference your collection and data. Furthermore, your collection will be indexed in the Repository Search.
The DARIAH-DE Publikator will publish a collection description in the Repository Collection Registry that is based on the metadata you entered and can be extended there. The Collection Registry stores only references to the data (or an access method specified for the data), not the data itself. In the Collection Registry, you describe your collection, including technical interfaces, using a much more detailed description scheme (the DARIAH Collection Description Data Model, DCDDM) than is possible with Dublin Core Simple at publication time.
First, the DARIAH-DE Publikator saves the files in the DARIAH-DE OwnStorage, an implementation of the DARIAH Storage API. During the publication process, the DARIAH-DE Publikator delivers the objects of a collection, including metadata, to the DARIAH-publish service, which in turn passes the data to the DARIAH-crud service. The DARIAH-crud service is responsible for basic operations such as CREATE and RETRIEVE on the DARIAH-DE OwnStorage. It now also obtains DOIs, performs some metadata conversions, and finally stores each individual file safely in the repository, along with descriptive, administrative, and technical metadata.
Two Views of the DARIAH-DE Publikator¶
The user interface of the DARIAH-DE Publikator is divided into two views. The first includes an overview of your collections. Here you can create collections and you can see a list of all collections you have created so far. For each collection in this list, the title and the status of the publication process are displayed:
- DRAFT: The collection has just been created or is currently being edited within the Publikator. Collections in draft status are only visible to you as a logged in user or registered user. The content of draft collections can be changed.
- RUNNING: A publishing operation has been started and is currently in progress.
- ERROR: An error occurred during a publication process.
- PUBLISHED: The collection and its data are published in the DARIAH-DE Repository, registered in the Repository’s Collection Registry and indexed by the Repository Search and additionally by the DARIAH-DE Generic Search.
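The four status values can be read as a small state table. The following is a minimal sketch in Python, assuming only the transitions described in this documentation: a draft is submitted for publication, a running publication ends in PUBLISHED or ERROR, and a failed publication can be restarted after corrections. The names Status, ALLOWED, and can_transition are illustrative and are not part of the Publikator.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "DRAFT"
    RUNNING = "RUNNING"
    ERROR = "ERROR"
    PUBLISHED = "PUBLISHED"


# Transition table derived from the prose of this documentation.
# It is an assumption for illustration, not the Publikator's internal model.
ALLOWED = {
    Status.DRAFT: {Status.RUNNING},
    Status.RUNNING: {Status.PUBLISHED, Status.ERROR},
    Status.ERROR: {Status.RUNNING},
    Status.PUBLISHED: set(),  # PUBLISHED is always the final status
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the sketch allows moving from `current` to `target`."""
    return target in ALLOWED[current]
```

In this reading, PUBLISHED has no outgoing transitions, which matches the statement that published data can no longer be edited.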
The overview is also where you publish your collections. If you click on create new collection, or select one of the collections and click edit collection, you will be taken to the second view: Edit Collection. Here you can edit the contents of the collection and its metadata.
Creating a New Collection¶
If you have not created a collection yet, you can create a new one by clicking on the create new collection button. A newly created collection is initially in the status DRAFT. You will now be taken directly to the Edit Collection View.
Tagging your Collection with Metadata¶
Any changes that you make in this view are saved automatically. When you click on the to main view button, all your data and metadata are already stored securely in the Publikator storage, so you can continue working at any time.
First, you should fill out the displayed mandatory metadata fields to describe your collection directly. At the moment, three items are mandatory:
- Title (dc:title)
- Creator (dc:creator)
- Rights management (dc:rights)
The required metadata fields are marked with an asterisk (*) and appear in red as long as they are not filled out. If you are not familiar with the Dublin Core metadata schema, you can click on the (i) to display a description of the metadata field including examples. Dublin Core Simple has 15 metadata fields; you can add the other twelve by clicking the button add optional metadata. All fields are repeatable: you can add them as often as you want by clicking on (+), and delete them again with (-). Each mandatory field must contain at least one value by the time the collection is published.
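The rule for mandatory fields can be illustrated with a short check. This is a minimal sketch, assuming that metadata is held as a mapping from field name to a list of values (fields are repeatable); the function name missing_mandatory and this data layout are hypothetical and do not reflect how the Publikator stores metadata internally.

```python
# The three mandatory Dublin Core fields named in this documentation.
MANDATORY_FIELDS = ("dc:title", "dc:creator", "dc:rights")


def missing_mandatory(metadata: dict) -> list:
    """Return the mandatory fields that have no non-empty value.

    `metadata` maps a field name to a list of string values,
    because every field may be repeated.
    """
    missing = []
    for field in MANDATORY_FIELDS:
        values = metadata.get(field, [])
        # A field counts as filled if at least one value is not blank.
        if not any(value.strip() for value in values):
            missing.append(field)
    return missing
```

A collection or file would be ready for publication, in this sketch, when missing_mandatory returns an empty list.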
Integrating Files (and More Metadata)¶
You can now add your research data as files by clicking on the field Drop files... or by dragging and dropping your files directly into this field. The uploaded files will appear in the collection tree on the left. Two metadata fields are assigned automatically: the file name is used as the title, and the format is taken from the MIME type of the file, which is determined automatically. You are welcome to change or delete this data. The three metadata fields mentioned above are also mandatory for each file.
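The automatic assignment of title and format can be approximated as follows. This is only a sketch: it guesses the MIME type from the file extension using Python's standard mimetypes module, whereas the method the Publikator actually uses to determine the MIME type is not described here. The function name default_file_metadata and the use of dc:format as the field name for the format are assumptions.

```python
import mimetypes
from pathlib import Path


def default_file_metadata(path: str) -> dict:
    """Derive initial metadata for an uploaded file.

    The file name becomes the title; the format is the MIME type
    guessed from the file extension (an assumption of this sketch).
    """
    mime_type, _encoding = mimetypes.guess_type(path)
    metadata = {"dc:title": [Path(path).name]}
    if mime_type is not None:
        # Only set a format when the extension is recognised.
        metadata["dc:format"] = [mime_type]
    return metadata
```

For a file named letter-001.txt this yields the title letter-001.txt and the format text/plain; creator and rights management still have to be supplied by hand.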
However, if you add many files to your collection, you do not have to enter all the metadata for each file individually. For any field, such as creator, author, or licensing, you can select the title of your collection in the tree and then click on arrow-down. The content of the selected field (e.g. rights management) is then copied to all files directly associated with the collection at the current level. If a file already has content in that field, it will not be deleted; a further field will be added instead. Be careful not to inadvertently copy the title of the collection to all files. There is not yet a back or undo function!
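The behaviour of arrow-down, where existing content is kept and the copied value is added as a further field, can be sketched like this. The function name copy_field_down and the dictionary layout (field name mapped to a list of values) are hypothetical; the sketch only mirrors the behaviour described above.

```python
def copy_field_down(collection: dict, files: list, field: str) -> None:
    """Append the collection's values for `field` to each file's metadata.

    Existing values on a file are never overwritten; the collection's
    values are added as additional entries of the same field.
    """
    for file_metadata in files:
        file_metadata.setdefault(field, []).extend(collection.get(field, []))
```

For example, copying dc:rights from a collection licensed CC BY 4.0 to a file that already states CC0 leaves that file with both values, which is why an accidental copy has to be cleaned up by hand.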
In the following screenshot you see the edit collection view with optional metadata fields of the sample collection:
In the second screenshot below you can see the view of the metadata of the attached file. The collection and each file have their own set of metadata, which you can edit independently. If you have selected a file in the tree on the left, you can view the file, remove the file, and update the file. If the file is deleted, it is removed from the OwnStorage, including the metadata. File and metadata are then no longer available in the DARIAH-DE Publikator. Of course the file will remain on your hard disk. If you want to update the file, for example because you have made local changes to it, you can exchange the file by updating it.
Ordering of Files and Collections¶
By default the files are sorted by the order of the upload. You can change this order by using drag and drop on the left tree. You can also move files into subcollections and change the order of subcollections.
Back to the Overview Page¶
You can edit your collection as often as you want; the data and metadata are stored safely in the Publikator until you publish them. Once you have finished editing your collection, you can go back to the overview by clicking on the button to main view, and you can resume work on your collection at any time. You will see a list of your collections in the overview, and if you come directly from the edit mode, the last edited collection will already be opened.
You can now create additional collections or continue working with the already existing ones. Since you are logged in to the DARIAH-DE Portal, the collections of this view are only visible to you as long as they are not published. These collections are in status DRAFT. The field below explains the possibilities for you to proceed with the collection:
Your collection is in the draft stage and is only visible to you. Click edit collection to add files to your collection, and enrich the collection and its content with metadata. Please note that some metadata fields must be completed before you can publish the collection. You also have optional metadata fields that will increase the visibility of your collection after publication.
If you have finished editing your collection and you are happy with all your metadata, you can publish the collection: Your collection and all the files contained in it will get Digital Object Identifiers (DOIs) during the publication process and can thus be permanently and unambiguously referenced.
You can also delete the collection and all contained files including metadata from the Publikator, leaving your source files on your hard drive. Published collections can be deleted from the Publikator, but not from the DARIAH-DE Repository.
Publish Your Collection¶
If you are now satisfied with your collection, which means that you have added all files and metadata (at least the mandatory fields), you can click the publish collection button.
Please be aware that all data and metadata are publicly accessible after the publication process and can no longer be edited or deleted by you!
During the publication process, several modules are run; they are described in the info boxes below (you can skip them if you only want to work with the Publikator). The DARIAH-DE Publikator passes the data and metadata of your collection to the DARIAH-Publish Service, and from there they go to the DARIAH-crud service. Information about the status of the publication process is displayed in the blue box. This information comes directly from the Publish Service, and some of it is quite technical.
The DARIAH-Publish Service...
...is a workflow service that performs various steps within the publication.
Among other things, the metadata is validated, references to objects within the imported collection are converted to Digital Object Identifiers (DOIs), and technical metadata is generated. Finally, after the collection file has been created, all referenced data, including metadata, is passed from the OwnStorage (by reference) to the DARIAH-crud service.
If the publication service finishes successfully, your collection has been published. This means first of all that
- all files were written to the PublicStorage, where they are publicly accessible,
- all files have a DOI and can be found in the DataCite Search,
- the collection and its contents can be queried via the DARIAH OAI-PMH service,
- a collection description has been created for your published collection in the Repository’s Collection Registry, and
- your collection is indexed in the Repository Search as well as in the DARIAH-DE Generic Search.
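Querying via OAI-PMH means sending an HTTP request with a verb and its arguments. The sketch below builds a ListRecords request URL using the standard OAI-PMH arguments verb, metadataPrefix, and set. The endpoint address of the DARIAH OAI-PMH service is not given in this documentation, so the base URL in the usage note is a placeholder, and whether a collection is exposed as an OAI-PMH set is an assumption. The function name oai_list_records_url is illustrative.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(base_url: str,
                         metadata_prefix: str = "oai_dc",
                         set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    `base_url` is the address of the OAI-PMH endpoint (a placeholder
    in this sketch); `set_spec` optionally restricts the request to one set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    # urlencode percent-encodes reserved characters in the values.
    return f"{base_url}?{urlencode(params)}"
```

With the placeholder endpoint https://example.org/oai, the default call produces https://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc.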
The DARIAH-crud Service...
...is the storage service of the DARIAH-DE Repository and provides basic storage operations.
Two instances of the DH-crud service are in operation. One can only be reached internally (e.g. from the DARIAH-publish service). This instance is primarily responsible for the generation and administration of data. Here the metadata and data of all objects
- are stored in DARIAH-DE PublicStorage,
- are entered into the index database for later retrieval by the OAI-PMH service, and
- get a DOI which uniquely identifies and references each object.
The second instance, which allows read-only access to the data, can be accessed externally. It returns data and metadata of the stored objects, as well as a compact index page that gives an overview of the collection and its contents.
If the publication process has been successful, the status of your collection changes from RUNNING to PUBLISHED. The overview looks like this (the published collection is expanded in this screenshot):
The generated Digital Object Identifier of your collection is displayed in the table as the DOI of the collection (doi:10.20375/0000-000B-C8EF-7). The displayed link refers (using a DOI resolver) to the landing page of the object. You have access to all data and metadata.
Congratulations! Your collection is now published and thus publicly accessible and referenced via the displayed persistent identifier (DOI)!
You can display your collection through a landing page of the repository (please click go to landing page). From there, you have direct access to the data and metadata of your collection, and you can view descriptive, technical, and administrative metadata. Furthermore, you have access to all the related objects in your collection. You can also access your collection in the Repository Search by clicking on show in repository. To add metadata to the automatically created collection description in the DHREP Collection Registry, please click on edit in Collection Registry. If you delete your collection, the data and metadata are only deleted from the DARIAH-DE Publikator, not from the DARIAH-DE Repository!
Additionally, your collection will be available in the DARIAH-DE Generic Search and will be publicly searchable there.
Via the landing page you can get a quick overview of your collection and its data. You can see some core metadata of the respective collection or content file and you can download all data and metadata.
You can retrieve various generated and saved metadata for each file as well as for the collection itself. All metadata and files can also be found in the BagIt bag, which stores each of your files together with its metadata in a ZIP file. The collection itself is also stored in the repository as a single file, which refers to its content files via DOI. For each file, the BagIt bag includes the file itself plus descriptive metadata (the Dublin Core Simple metadata you provided, see above), administrative metadata (provided by the DARIAH-crud service), and automatically extracted technical metadata. These BagIt bags are stored in the DARIAH-DE PublicStorage.
You can find more about the DARIAH-crud API in the menu under Help > Repository API Documentation or directly HERE.
Further, more technical links and references, such as Handle metadata or direct links to the DARIAH-DE OAI-PMH service for your collection, can be found on the index page of your objects; see Extended Downloads > Index page of this object.
If errors occur during the publication process, the status of your collection changes to ERROR. You will first see a general error description concerning your collection.
You will get a detailed description of the problem by clicking on show error details. In some cases, you can also jump directly to the problematic places in your collection that you need to correct before restarting the publication process.
In our example, some mandatory metadata are missing. You can click on the edit button (the one with the pencil) to correct this error. After you have made the corrections, you can start the publication process again.
Repository Search and Collection Registry¶
Your collection is now securely and permanently stored in the DARIAH-DE Repository and can be persistently referenced via DOI. With the help of the DOI or a URL including handle resolver and the DOI, everyone can access your collection and its associated data. Your collection is published in the Repository’s Collection Registry with a collection description and indexed in the Repository Search.
If you want to edit your collection description, just click on edit in Collection Registry and you will be taken directly to the collection description of your collection.
Normally you will be logged in to the Collection Registry with your current Publikator account. Depending on your browser configuration, however, you may be asked again for your federation account. In that case, please log in with the same credentials you used for the Publikator.
If you wish to add more information to your collection, please switch on Show hints (Editor options on the left). If you want to keep your data indexed in the Repository Search, you must not delete or modify the access data (OAI-PMH) under Collection Access.
The DARIAH-DE Generic Search <https://search.de.dariah.eu> also indexes your collection.
Digital Object Identifier (DOIs)¶
Your collection and the data it contains are referenced mainly through Digital Object Identifiers. The collection itself as well as each individual content file gets such a DOI, which looks as follows:
You can now use this DOI as a reference to your collection and your research data. As a DOI and identifier, you should use it as follows:
If you want to use or share a URL at the same time, you can simply use every DOI resolver:
Landing and Index Pages, Data, and Metadata¶
You can also reference all metadata and data files directly using the EPIC2 Handle (this is not possible using DOIs):
- Landing page: hdl.handle.net/21.11113/0000-000B-C8EF-7@landing
- Index page: hdl.handle.net/21.11113/0000-000B-C8EF-7@index
- Data: hdl.handle.net/21.11113/0000-000B-C8EF-7@data
- Descriptive metadata: hdl.handle.net/21.11113/0000-000B-C8EF-7@metadata
- Administrative metadata: hdl.handle.net/21.11113/0000-000B-C8EF-7@adm
- Technical metadata: hdl.handle.net/21.11113/0000-000B-C8EF-7@tech
- ZIP file containing data and metadata (BagIt): hdl.handle.net/21.11113/0000-000B-C8EF-7@bag
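The Handle references all follow one pattern: resolver, Handle, and an @ suffix. The following minimal sketch builds such URLs, and a resolver URL for a DOI, from the identifiers shown in this documentation. The https scheme, the use of doi.org as the DOI resolver, and the function names handle_url and doi_url are assumptions of this sketch.

```python
HANDLE_RESOLVER = "https://hdl.handle.net"
DOI_RESOLVER = "https://doi.org"

# Suffixes listed in this documentation.
SUFFIXES = ("landing", "index", "data", "metadata", "adm", "tech", "bag")


def handle_url(handle: str, suffix: str) -> str:
    """Return the resolver URL for one view of an object, e.g. its BagIt ZIP."""
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix}")
    return f"{HANDLE_RESOLVER}/{handle}@{suffix}"


def doi_url(doi: str) -> str:
    """Return a resolver URL for a DOI, accepting an optional 'doi:' prefix."""
    return f"{DOI_RESOLVER}/{doi.removeprefix('doi:')}"
```

For example, handle_url("21.11113/0000-000B-C8EF-7", "bag") gives https://hdl.handle.net/21.11113/0000-000B-C8EF-7@bag, and doi_url("doi:10.20375/0000-000B-C8EF-7") gives https://doi.org/10.20375/0000-000B-C8EF-7.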