Build a new SIDRE application

Create your own SIDRE application with your own metadata schema.

The SIDRE software system can be used to build your own search index with your own metadata schema. The necessary source code is publicly available and can be adapted via configurations. This tutorial shows how to build your own SIDRE application.

Setup repository

Create a new Git repository for your application setup configuration, for example on gitlab.com, github.com or your own GitLab instance. Please use resodate-setup as a template.

This so-called “setup-repo” should contain all configurations that are specific to your application and valid for every instance of it, as well as a Vagrant configuration that allows anyone to start the application locally in a VM.

Your repo should contain:

requirements.yml - a file that contains the dependency on the common sidre-setup Ansible collection
default-config.yml - a file that contains all the default Ansible variable configurations and adjustments for your SIDRE instance
default-config - a directory that contains all the default config files used in default-config.yml
Vagrantfile - a file that defines how to set up the VM; adjust the vm_host variable in the first line to an IP address you want to use locally for your VM, for example 192.168.98.123
playbook.yml - a file that is used by Vagrant as the entry point for Ansible

Metadata Schema

The metadata schema used in the index is fully configurable; the only required field is a unique resource identifier. Ideally, a unique URL to the landing page of the resource is used for this.
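Before loading data, it can help to check that every record actually carries such an identifier. The following is a minimal sketch in Python, not part of SIDRE: the function names and the default field name `id` are assumptions for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def is_http_url(value):
    """Return True if value is a string that looks like an http(s) URL."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_identifiers(records, id_field="id"):
    """Check the identifier field of a list of metadata dicts.

    Returns a tuple (missing, not_urls, duplicates):
      missing    - list positions of records without an identifier
      not_urls   - identifiers that are not http(s) URLs
      duplicates - identifiers that occur more than once
    """
    ids = [record.get(id_field) for record in records]
    missing = [pos for pos, value in enumerate(ids) if not value]
    present = [value for value in ids if value]
    not_urls = sorted({str(v) for v in present if not is_http_url(v)})
    counts = Counter(str(v) for v in present)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return missing, not_urls, duplicates
```

A record set passes the check if all three returned lists are empty.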

JSON Schema

Create a new Git repository for your JSON Schema definition, ideally on gitlab.com. The JSON Schema is used to validate the metadata. The repository https://gitlab.com/oersi/oersi-schema contains an example schema.

You need to provide your schema as a ZIP file to be able to use it with the Ansible scripts. The GitLab CI configuration in the example repo shows how to do this on GitLab.
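Where no GitLab CI job is available, the packaging step can also be sketched locally in Python. This is a minimal sketch and not the method used by the example repo: the function name `zip_schema_dir` is hypothetical, and whether the resulting archive layout matches what the Ansible scripts expect has to be checked against the example repo.

```python
import zipfile
from pathlib import Path


def zip_schema_dir(schema_dir, zip_path):
    """Pack all *.json files below schema_dir into a ZIP archive.

    Hypothetical helper for illustration. Paths inside the archive are
    stored relative to schema_dir, using forward slashes.
    """
    schema_dir = Path(schema_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(schema_dir.rglob("*.json")):
            archive.write(file, file.relative_to(schema_dir).as_posix())
    return Path(zip_path)
```

For example, `zip_schema_dir("schemas", "schema.zip")` would pack a local `schemas` directory (an assumed name) into `schema.zip`.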

In the default-config.yml of your setup-repo, you need to configure…

… the URL to your schema ZIP file via the variable search_index_metadata_schema_artifact_url.
… optionally the file and directory structure via search_index_backend_metadata_schema_location and search_index_backend_metadata_schema_resolution_scope (only if it differs from the default).

Example

search_index_metadata_schema_artifact_url: https://gitlab.com/my-namespace/my-repo/-/jobs/artifacts/main/download?job=deploy
search_index_backend_metadata_schema_location: "http://localhost:{{ search_index_schemas_port }}/schemas/my-path/schema-xxx.json"
search_index_backend_metadata_schema_resolution_scope: "http://localhost:{{ search_index_schemas_port }}/schemas/my-path/"

Backend Schema

Some field configurations must be defined via Ansible variables in the default-config.yml of your setup-repo:

search_index_backend_metadata_field_configuration.baseFields.resourceIdentifier - the field name of the field that contains the unique resource identifier
search_index_backend_metadata_field_configuration.baseFields.metadataSource - (optional) subfields define the usage of the source metadata information (see details in Supported Metadata Schema)

Example

search_index_backend_metadata_field_configuration:
  baseFields:
    resourceIdentifier: id
    metadataSource:
      field: mainEntityOfPage
      isObject: "true"
      useMultipleItems: "true"
      objectIdentifier: id
      queries:
        - name: providerName
          field: provider.name

Elasticsearch Mapping

Create a file elasticsearch-mapping.json.j2 in a backend subdirectory of the default-config directory of your setup-repo. This file should contain the Elasticsearch mapping for the metadata index. Use a Jinja2 template like the one in the example setup-repo to configure the mapping, and keep its index_patterns and template definition.

In the default-config.yml of your setup-repo, you need to configure…

… the path to your mapping file via the variable elasticsearch_metadata_mapping_file
… the public index name via elasticsearch_metadata_index_alias_name - your metadata will be available under this name in the API
… the version of the mapping via elasticsearch_metadata_index_version - this is used to check if the mapping needs to be updated; please increase the version when changing the mapping

Example

elasticsearch_metadata_mapping_file: "{{ default_config_files_dir }}/backend/elasticsearch-mapping.json.j2"
elasticsearch_metadata_index_alias_name: my_metadata_index
elasticsearch_metadata_index_version: 1

Harvesting Metadata

To connect your application-specific metadata sources to the index, you need to implement the jobs that fetch the data, transform it to your metadata schema and load it into the SIDRE backend.

For this, you can implement Python scripts that use sidre-import-scripts-commons as a dependency. This module contains a set of Python helper methods that can be used to load metadata into the SIDRE backend or to fetch data from common APIs like OAI-PMH.

Create a new Git repository for your import scripts. Follow the instructions in sidre-import-scripts-commons to include this module and create your own process for every metadata source that should be connected to your index.
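The fetch, transform and load steps of such a process can be sketched as follows. This is a minimal sketch that deliberately does not use the sidre-import-scripts-commons API: `transform`, `run_import` and the source field names (`url`, `title`, `abstract`, `languages`) are hypothetical, and the target fields follow the example configurations in this tutorial. In a real import script, fetching and loading would be handled by the helper methods of the commons module.

```python
def transform(source_record, provider_name):
    """Map one record of a hypothetical source to the target schema.

    Returns None if the record has no landing-page URL, because the
    index requires a unique resource identifier.
    """
    url = (source_record.get("url") or "").strip()
    if not url:
        return None
    return {
        "id": url,
        "name": source_record.get("title", ""),
        "description": source_record.get("abstract", ""),
        "inLanguage": source_record.get("languages", []),
        # Simplification: the landing-page URL is reused as the
        # identifier of the source metadata entry.
        "mainEntityOfPage": [
            {"id": url, "provider": {"name": provider_name}}
        ],
    }


def run_import(fetch, load, provider_name):
    """Fetch, transform and load; return (loaded, skipped) counts.

    fetch: callable returning an iterable of source records (dicts)
    load:  callable accepting a list of transformed records
    """
    records, skipped = [], 0
    for source_record in fetch():
        record = transform(source_record, provider_name)
        if record is None:
            skipped += 1
        else:
            records.append(record)
    load(records)
    return len(records), skipped
```

Passing `fetch` and `load` as callables keeps the transformation testable without a running backend.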

In the default-config.yml of your setup-repo, you need to configure…

… the URL to your import scripts artifact via the variable search_index_import_scripts_artifact_url
… the active sources that should be imported periodically by default via the variable search_index_import_scripts_enabled_sources_py - this can be adjusted in the instance configuration for specific instances of your app.

Note: it is also possible to create an import that is based on metafacture/metafix instead of Python - see https://gitlab.com/oersi/oersi-etl as an example.

Frontend

The fields that are used in the frontend are fully configurable.

Field Configuration

In the default-config.yml of your setup-repo, you need to configure…

… basic field options via search_index_frontend_field_configuration - see the Frontend Readme for details (General field configuration)
… search and filters via search_index_frontend_search_configuration - see the Frontend Readme for details (Search Configuration)
… the fields that are displayed in the “content” section of the detail page via search_index_frontend_detail_page_configuration - see the Frontend Readme for details (Detail page)
… the fields that are displayed on the result cards via search_index_frontend_result_card_configuration - see the Frontend Readme for details (Result Card)

Example

search_index_frontend_field_configuration:
  baseFields:
    title: name
    resourceLink: id
  options:
    - dataField: inLanguage
      translationNamespace: language

search_index_frontend_search_configuration:
  searchField:
    dataField:
      - name
      - description
  filters:
    - componentId: language
      dataField: inLanguage

search_index_frontend_result_card_configuration:
  content:
    - field: description
    - field: type

search_index_frontend_detail_page_configuration:
  content:
    - field: creator.name
    - field: description
    - field: type
    - field: publisher.name
    - field: inLanguage
    - field: keywords
      type: chips
    - field: license.id
      type: license

CSS

Create a file style-override.css in a frontend subdirectory of the default-config directory of your setup-repo. Implement your custom CSS in this file and configure the path to it in the default-config.yml via the variable search_index_frontend_custom_style_css.

Example

search_index_frontend_custom_style_css: '{{ default_config_files_dir }}/frontend/style-override.css'

Available Languages

Configure the languages that should be available in the frontend via the variable search_index_frontend_available_languages. Create a subdirectory for each language in the frontend directory of the default-config directory of your setup-repo. In each of these directories, create a translation.json file that overrides the default translations for the frontend labels, and a data.json file that provides the language-specific labels for the metadata fields you use. Include these files in the default-config.yml via the variable search_index_frontend_custom_translations.

Example

search_index_frontend_available_languages: ["de", "en"]
search_index_frontend_custom_translations:
  - {path: '{{ default_config_files_dir }}/frontend/en/translation.json', language: 'en'}
  - {path: '{{ default_config_files_dir }}/frontend/de/translation.json', language: 'de'}
  - {path: '{{ default_config_files_dir }}/frontend/en/data.json', language: 'en'}
  - {path: '{{ default_config_files_dir }}/frontend/de/data.json', language: 'de'}
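The directory layout described above can be created with a small helper script. This is a minimal sketch in Python, not part of SIDRE: the function name `scaffold_translations` is hypothetical, and the new files are written as empty JSON objects because the keys they need depend on your frontend labels and metadata fields.

```python
import json
from pathlib import Path

# File names taken from the description above.
FILES = ("translation.json", "data.json")


def scaffold_translations(frontend_dir, languages):
    """Create <frontend_dir>/<lang>/translation.json and data.json.

    Missing files are created with an empty JSON object; existing
    files are left untouched. Returns the list of created paths.
    """
    created = []
    for lang in languages:
        lang_dir = Path(frontend_dir) / lang
        lang_dir.mkdir(parents=True, exist_ok=True)
        for name in FILES:
            path = lang_dir / name
            if not path.exists():
                path.write_text(json.dumps({}) + "\n", encoding="utf-8")
                created.append(path)
    return created
```

For example, `scaffold_translations("default-config/frontend", ["de", "en"])` would prepare the four files referenced in the configuration above.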

Start the application

Your application should now be ready for installation and a first start. Use the Vagrant configuration to start the application locally in a VM. For this, you need to have Vagrant and VirtualBox installed on your machine. In your setup-repo, run the following command to install your application in a Vagrant VM: vagrant up

After the installation, you can access your application in the browser via 192.168.98.123 (or the IP address you used in the Vagrantfile for the vm_host).