Tutorial

The objective of this tutorial is to create a simple Submission Information Package (SIP) that can be transferred to the Digital Preservation Service. The SIP consists of a Metadata Encoding and Transmission Standard (METS) document, a digital signature and one or more files.

The following must be provided to construct a valid minimal SIP:

  • base METS information (METS profile, contract identifier and creator)

  • the files

  • descriptive metadata

The rest of the required information is generated automatically by siptools-ng.

1. Create base METS

The METS object contains various metadata about the SIP and its contents. It is mostly populated by siptools-ng, though you need to provide some basic information first:

from mets_builder import METS, MetsProfile

mets = METS(
    mets_profile=MetsProfile.RESEARCH_DATA,
    contract_id="urn:uuid:abcd1234-abcd-1234-5678-abcd1234abcd",
    creator_name="Sigmund Sipenthusiast",
    creator_type="INDIVIDUAL"
)

Once the SIP object is created, the METS object will be available via the SIP.mets property.
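The contract_id in the example above has the form urn:uuid:<UUID>. As a minimal sketch, assuming your contract identifier follows that same form, the identifier can be sanity-checked with the Python standard library before it is passed to METS. The helper name is_urn_uuid is hypothetical and is not part of siptools-ng or mets_builder.

```python
import uuid


def is_urn_uuid(contract_id: str) -> bool:
    """Return True if contract_id looks like 'urn:uuid:<canonical UUID>'.

    Hypothetical helper: assumes the identifier uses the same form as
    the tutorial example. It does not check that the contract exists.
    """
    prefix = "urn:uuid:"
    if not contract_id.startswith(prefix):
        return False
    candidate = contract_id[len(prefix):]
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID accepts several spellings (braces, no hyphens), so
    # compare against the canonical hyphenated form to be strict.
    return str(parsed) == candidate.lower()


print(is_urn_uuid("urn:uuid:abcd1234-abcd-1234-5678-abcd1234abcd"))
```

A check like this only catches typos in the format; whether the identifier is valid for your organisation is decided by the Digital Preservation Service.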

Note

The METS specification is explained in more detail in the DPRES metadata requirements.

In many cases, siptools-ng uses dpres-mets-builder to automatically generate metadata without requiring user action.

2.1 Import files from directory

The easiest way to create a SIP object is to import a whole directory using the siptools_ng.sip.SIP.from_directory() method:

from siptools_ng.sip import SIP

sip = SIP.from_directory(mets=mets, path="/path/to/local/directory")

Each file in the directory and its subdirectories is added to the SIP, and technical metadata is automatically created.
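Because every file under the directory ends up in the SIP, it can be useful to preview the directory contents before importing it. The following is a minimal sketch using only the standard library. The function list_files_recursively is a hypothetical helper, and it only approximates what from_directory() will pick up: how siptools-ng treats symbolic links or hidden files is not covered here.

```python
from pathlib import Path


def list_files_recursively(directory):
    """Return sorted POSIX-style relative paths of regular files.

    Hypothetical preview helper; it walks the tree itself and does not
    call siptools-ng.
    """
    root = Path(directory)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

For example, print("\n".join(list_files_recursively("/path/to/local/directory"))) prints one relative path per line, which makes stray files easy to spot before the SIP is created.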

2.2 Import files individually

If files must be handled individually, for example when file-specific metadata is added or when technical metadata is generated manually, the File class can be used:

from siptools_ng.file import File

files = [
    File(
        path="/home/sigmund/Videos/source-files/video.mkv",
        digital_object_path="media/video.mkv"
    ),
    File(
        path="/home/sigmund/Documents/pdf-a/document.pdf",
        digital_object_path="documents/document.pdf"
    )
]

Once the necessary metadata has been added to the files, the File objects are added to the SIP object using siptools_ng.sip.SIP.from_files():

sip = SIP.from_files(mets=mets, files=files)

Technical metadata is automatically generated for files that do not already have technical metadata.
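When files are collected from several source locations, as in the File example above, two files can accidentally be given the same digital_object_path. The sketch below checks a list of planned paths for a few likely mistakes. It rests on assumptions that the tutorial does not state: that digital_object_path values should be relative, should not contain ".." components, and should be unique within a SIP. The function name check_digital_object_paths is hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def check_digital_object_paths(paths):
    """Return a list of human-readable problems found in the paths.

    Assumed rules (not taken from the siptools-ng documentation):
    paths are relative, contain no '..' and are unique.
    """
    problems = []
    for path in paths:
        pure = PurePosixPath(path)
        if pure.is_absolute():
            problems.append(f"absolute path: {path}")
        if ".." in pure.parts:
            problems.append(f"parent reference: {path}")
    # Normalize before counting so 'a//b.txt' and 'a/b.txt' collide.
    counts = Counter(str(PurePosixPath(path)) for path in paths)
    for path, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate path: {path}")
    return problems


print(check_digital_object_paths(["media/video.mkv", "documents/document.pdf"]))
```

An empty list means none of the assumed rules were violated; it does not guarantee that the SIP will be accepted.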

Warning

Do not add or modify File instances after you have created the SIP instance.

3. Add descriptive metadata

At least one piece of descriptive XML metadata must be added to the SIP. This metadata can describe a single file or the package as a whole; the only requirement is that at least one piece of descriptive metadata is provided somewhere.

Note

The National Digital Preservation schema catalog contains a variety of different metadata document schemas that are accepted by the Digital Preservation Service.

You can look them up in the DPRES national specifications.

The metadata XML document can be imported using ImportedMetadata, which detects the XML schema automatically.

from mets_builder.metadata import (
    ImportedMetadata, MetadataType, MetadataFormat
)

# Import metadata automatically from an external file...
dmd_md = ImportedMetadata.from_path("/path/to/descriptive_metadata.xml")

# ...or enter metadata schema information manually
dmd_md = ImportedMetadata(
    metadata_type=MetadataType.DESCRIPTIVE,
    metadata_format=MetadataFormat.DC,
    format_version="2008",
    data_path="/path/to/descriptive_metadata.xml"
)
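When deciding between automatic detection and entering the schema information manually, it can help to look at the root element of the XML document first. The sketch below uses only the standard library to confirm that the file is well-formed and to report the namespace and name of its root element. The helper root_namespace_and_tag is hypothetical, and this is not a description of how ImportedMetadata performs its detection.

```python
import xml.etree.ElementTree as ET


def root_namespace_and_tag(xml_path):
    """Return (namespace, local_name) of the document's root element.

    Raises xml.etree.ElementTree.ParseError if the file is not
    well-formed XML. The namespace is None for an unqualified root.
    """
    root = ET.parse(xml_path).getroot()
    # ElementTree spells qualified names as '{namespace}local_name'.
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
        return namespace, local_name
    return None, root.tag
```

Calling root_namespace_and_tag("/path/to/descriptive_metadata.xml") returns a tuple that can be compared against the schemas listed in the national specifications.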

You can add the descriptive metadata to either a file or the SIP:

# Add metadata to SIP
sip.add_metadata([dmd_md])

# Alternatively, add metadata to a File before creating the SIP from it
file = files[0]  # for example, a File object created in step 2.2
file.add_metadata([dmd_md])
sip = SIP.from_files(mets=mets, files=[file])

4. Export SIP

Once you have created a SIP using either method, you can export it using the siptools_ng.sip.SIP.finalize() method.

This will generate a tar archive with a digital signature, a METS document and copies of all the files.

sip.finalize(
    output_filepath="sip.tar",
    sign_key_filepath="rsa-keys.crt"
)

Note

rsa-keys.crt is the signing key used to create a digital signature for the SIP.

See the instructions on the National Digital Preservation Service site (in Finnish) for generating this signing key.

The generated sip.tar file can then be uploaded to the Digital Preservation Service.
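Before uploading, the contents of the archive can be inspected with the standard library tarfile module. This is a minimal sketch: list_archive_files is a hypothetical helper, and the exact member names inside the archive (for the METS document and the signature) are not specified in this tutorial, so the sketch simply lists whatever is there.

```python
import tarfile


def list_archive_files(tar_path):
    """Return the sorted names of regular files in a tar archive."""
    with tarfile.open(tar_path) as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile()
        )
```

For example, list_archive_files("sip.tar") should include the METS document, the signature and a copy of each file that was added to the SIP.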