DLG ARCHive AIP Creation Instructions

DLG AIP creation workflow for AIPs to be ingested by ARCHive system
using AIP Creation Helper v 2.0
Revised 10/24/2018
NOTE: These instructions assume you are working on a Windows PC (tested with Windows 7). You must
have the following software applications installed and available on your system path so that they can be
executed from the command line:
Software                           Version   URL
Java                               8 (1.8)   https://java.com/en/download/
Perl                               5.16      https://www.perl.org/
Python                             3.x       https://www.python.org/
FITS (File Information Tool Set)   1.3x      https://projects.iq.harvard.edu/fits/home
7zip                               17        https://www.7-zip.org/
Bzip2 for Windows                  1.0.2     http://gnuwin32.sourceforge.net/packages/bzip2.htm
Grep for Windows                   2.5.4     http://gnuwin32.sourceforge.net/packages/grep.htm
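
Before starting, it can help to confirm that the software listed above is reachable from the command line. The following is a minimal Python sketch and is not part of the AIP Creation Helper. The executable names are assumptions about how each tool is invoked on your machine, and FITS is left out because it is run from its own installation folder.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["java", "perl", "python", "7z", "bzip2", "grep"]


def missing_tools(names=REQUIRED_TOOLS, which=shutil.which):
    """Return the names that cannot be found on the system path."""
    return [name for name in names if which(name) is None]
```

Calling `missing_tools()` with no arguments returns an empty list when every assumed executable is found on the path.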

1) Identify and stage digital objects to be included in AIP. As a general rule, DLG AIPs are
created at the digital object level, so there will usually be one item level record in DLGadmin per
AIP. AIPs are generally named using DLGadmin record slugs or record IDs; newspaper AIPs,
however, are named using the batch name. Newspaper AIPs are created at the batch level and
include the entire batch, including tif files.
Files should be unzipped. If files have been bagged previously, validate the bag before moving
them into the prescribed AIP subfolder structure. Delete thumbs.db files.
Make sure you are working at a location with enough free space to accommodate duplication of
the files necessary to create the AIP.
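
The two housekeeping checks in this step (deleting thumbs.db files and confirming there is room for a duplicate copy) could be scripted roughly as follows. This is a sketch under assumptions, not part of the AIP Creation Helper: the function names are hypothetical, and "enough space" is taken to mean free space at least equal to the current size of the staged files.

```python
import shutil
from pathlib import Path


def remove_thumbs_db(root):
    """Delete every thumbs.db file (any letter case) under root; return the deleted paths."""
    removed = []
    # Collect the listing first so that deleting does not disturb the directory walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.name.lower() == "thumbs.db":
            path.unlink()
            removed.append(path)
    return removed


def folder_size(root):
    """Total size in bytes of all files under root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def has_room_for_copy(root):
    """True if the drive holding root has free space for a second copy of its files."""
    return shutil.disk_usage(root).free >= folder_size(root)
```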
2) Rename files comprising digital objects and associated metadata to include DLG repository
and collection codes ([dlgrepo]_[dlgcoll]_item).
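
For a large number of files, the renaming can be scripted. The sketch below assumes that the existing file name serves as the "item" portion of the pattern and that only files directly inside one folder need renaming; both function names are hypothetical.

```python
from pathlib import Path


def prefixed_name(repo, coll, filename):
    """Return filename with the repository and collection codes prepended."""
    prefix = f"{repo}_{coll}_"
    # Leave the name alone if it already carries the prefix.
    return filename if filename.startswith(prefix) else prefix + filename


def rename_with_codes(folder, repo, coll):
    """Rename every file directly inside folder; return (old, new) name pairs."""
    changes = []
    for path in sorted(Path(folder).iterdir()):
        new_name = prefixed_name(repo, coll, path.name)
        if path.is_file() and new_name != path.name:
            path.rename(path.with_name(new_name))
            changes.append((path.name, new_name))
    return changes
```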
3) Create the stub filesystem that will be transformed into the AIP. The following processes
rely on a file system that strictly follows this pattern:
[aip id]                     parent folder
  --objects                  subfolder
    --[digital_object_id]    subfolder containing files. In most cases this is the same as
                             the AIP ID.
In most situations, this can be done using one of the following Windows batch (.bat) files. Make
sure that files and folders are named correctly before proceeding with any of the AIP creation
steps:
noadmin_directories_AIP_stub_filesystem.bat   Use for a series of subfolders containing files
                                              for individual digital objects.
noadmin_ind_AIP_stub_filesystem.bat           Use for a folder of files that correspond to a
                                              separate object.
news_AIP_creation_make_stub_filesystem.bat    Use for a series of newspaper batches (folders)
                                              that will each be archived as a separate AIP.
                                              This will result in a set of files in a folder
                                              named “rearranged”. Use this as your working
                                              directory for all subsequent processes.

All subsequent steps will be executed in the directory containing these [aip id] subfolders. In
these instructions, this will be called the working directory.
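
To make the target layout concrete, here is a minimal Python sketch that rearranges one folder of files into the pattern described in this step. It is an illustration only and does not reproduce what the batch files do internally. It assumes the [digital_object_id] folder sits inside "objects" and has the same name as the AIP ID; the function name is hypothetical.

```python
import shutil
from pathlib import Path


def make_stub(working_dir, aip_id):
    """Rearrange working_dir/aip_id/<files> into working_dir/aip_id/objects/aip_id/<files>."""
    aip_dir = Path(working_dir) / aip_id
    # List the contents before creating "objects" so the new folder is not moved into itself.
    items = [p for p in aip_dir.iterdir() if p.name != "objects"]
    target = aip_dir / "objects" / aip_id
    target.mkdir(parents=True, exist_ok=True)
    for item in items:
        shutil.move(str(item), str(target / item.name))
    return target
```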
4) Create an XML file containing AIP level metadata. You can create this file by building a
spreadsheet and extracting the XML from it with XMLBlueprint or Excel. You can begin the
spreadsheet with a TSV export of item level records from DLGadmin and then modify it so that
it contains the values shown in the example below. Save the extracted XML file in your working
directory.
The extracted XML file should have the following form:


fake_aip01
1
http://archive.libs.uga.edu/dlg

This is a test AIP
http://rightsstatements.org/vocab/CNE/1.0/
testrepo
testcoll


fake_aip02
1
http://archive.libs.uga.edu/dlg
This is another test AIP
http://rightsstatements.org/vocab/CNE/1.0/
testrepo
testcoll
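
Because later steps read this file, it is worth confirming that the extracted XML parses before continuing. The following sketch checks well-formedness only; it does not check element names or values, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_path):
    """Return (True, None) if the file parses as XML, else (False, error message)."""
    try:
        ET.parse(xml_path)
    except ET.ParseError as err:
        return False, str(err)
    return True, None
```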


5) Move the AIP_creation_helper.zip file to your working directory and unpack it. Copy the
current version of the AIP_creation_helper.zip file to your working directory, right click the file,
and use 7zip to “unzip here”. If this option is not available on your context menu, you can use
any other unzip program for this step, but make sure that 7zip is correctly installed on your
machine and that the unzipped files are directly in your working folder, not in a subfolder.
Unzipping the AIP_creation_helper_v2.0 will result in the following files being added to your
working directory:
Name                                      Type
make_stub_filesystem                      folder containing batch files to create stub filesystems
move_to_FITS_directory                    folder containing batch file to be copied to FITS directory
1.0_move_fits_xml                         Windows Batch File
2.3-big_combine_xml.bat                   Windows Batch File
3.3-makemaster.bat                        Windows Batch File
4.2-prep_bags.bat                         Windows Batch File
7zprepare_bag.pl                          PL File
bag_and_validate_all_folders_python.cmd   Windows Command Script
bagit.py                                  PY File
close.xml                                 XML File
dlg-fits-to-master-stylesheet.xsl         XSL Stylesheet
fart.exe                                  Application
open.xml                                  XML File
README_SIP_Creation.txt                   Text Document
saxon9he.jar                              Executable Jar File
split_TSV_source.xsl                      XSL Stylesheet
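
A quick way to confirm that the helper was unpacked directly into the working directory, and not into a subfolder, is to look for the listed files there. This sketch assumes the file names use plain hyphens and match the v2.0 listing above; the two folders and the first batch file (listed without an extension) are left out of the check.

```python
from pathlib import Path

# Names taken from the file listing above; version numbers may differ in other releases.
EXPECTED_FILES = [
    "2.3-big_combine_xml.bat", "3.3-makemaster.bat", "4.2-prep_bags.bat",
    "7zprepare_bag.pl", "bag_and_validate_all_folders_python.cmd", "bagit.py",
    "close.xml", "dlg-fits-to-master-stylesheet.xsl", "fart.exe", "open.xml",
    "README_SIP_Creation.txt", "saxon9he.jar", "split_TSV_source.xsl",
]


def missing_helper_files(working_dir, expected=EXPECTED_FILES):
    """Return the expected helper files that are not directly inside working_dir."""
    root = Path(working_dir)
    return [name for name in expected if not (root / name).is_file()]
```

If the list that comes back contains every expected name, the files were most likely unpacked into a subfolder.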

6) Copy the runfits_external.bat file to the home directory of your FITS installation (where fits.bat is
located).
You only have to copy the file one time; you can then re-use it whenever you create AIPs using this
system. Double click the batch file to run it, and enter the path (including drive letter) to your
working directory when prompted.
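
For orientation, a single FITS run on one file looks roughly like the command built below. The -i (input) and -o (output) options belong to the FITS command line tool, but how runfits_external.bat actually calls FITS is not shown in these instructions, so the output location and the .fits.xml naming used here are assumptions, as is the function name.

```python
from pathlib import Path


def fits_command(fits_home, source_file, output_dir):
    """Build the argument list for running FITS on one file."""
    source = Path(source_file)
    # Assumed naming: the original file name followed by .fits.xml.
    output = Path(output_dir) / (source.name + ".fits.xml")
    return [str(Path(fits_home) / "fits.bat"), "-i", str(source), "-o", str(output)]
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine where FITS is installed.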
7) Run the four .bat files to create the AIP.
Double click on each of the .bat files in numerical order to execute them. The second .bat file,
2.3-big_combine_xml.bat, will ask you to input the name of the XML file you extracted from the
spreadsheet. You must enter this exactly as the file is named (case sensitive), including the .xml
extension. Follow any instructions given in the command prompt window. If you notice any errors,
investigate and correct them before moving on to the next step, or the process will not
complete successfully.
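
Because Windows itself ignores letter case in file names, a mistyped name can be hard to spot. The following sketch compares a typed name against the exact names in a folder; both function names are hypothetical.

```python
import os


def exact_name_exists(directory, name):
    """True only if an entry with exactly this name, including letter case, is in directory."""
    return name in os.listdir(directory)


def close_matches(directory, name):
    """Names in directory that differ from name only by letter case."""
    return [n for n in os.listdir(directory) if n.lower() == name.lower() and n != name]
```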
7a) Run the first .bat file, 1.x_move_fits_xml.bat.
This step will create a metadata-working directory in the stub file system and move the fits.xml files to
it.
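
In outline, this step amounts to the following. The sketch is an approximation, not the batch file's actual logic: it assumes one metadata-working folder per AIP folder and that the FITS output files end in .fits.xml and sit somewhere inside that AIP folder.

```python
import shutil
from pathlib import Path


def move_fits_xml(aip_dir):
    """Move every *.fits.xml file under aip_dir into aip_dir/metadata-working."""
    aip_dir = Path(aip_dir)
    working = aip_dir / "metadata-working"
    working.mkdir(exist_ok=True)
    moved = []
    # Collect matches first so that moving files does not disturb the directory walk.
    for path in list(aip_dir.rglob("*.fits.xml")):
        if path.parent != working:
            shutil.move(str(path), str(working / path.name))
            moved.append(path.name)
    return sorted(moved)
```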
7b) Run the second .bat file, 2.x-big_combine_xml.bat.
You will be prompted to enter the name of the AIP level XML file that you created in step 4. Be sure to
include the .xml extension. This step will execute the following processes:
- Split the AIP level XML (extracted from the spreadsheet in step 4) into separate files for each AIP
  and move them to the appropriate metadata-working folder.
- Combine the .fits.xml files for each AIP into one XML file.
- Strip the XML declarations from the combined FITS XML.
- Regularize the file paths in the FITS XML so that they are relative to the “objects” folder of the
  filesystem.
- Combine the AIP level XML fragment with the combined FITS XML.
- Add an XML declaration, opening tag, and closing tag so that the resulting XML is well-formed.
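
The declaration stripping and wrapping described in the list above can be illustrated with a short Python sketch. It is not the batch file's implementation. The root element name "combined" is a hypothetical placeholder; the real opening and closing tags are presumably supplied by open.xml and close.xml, whose contents are not shown in these instructions.

```python
import re

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"<\?xml[^>]*\?>\s*")


def combine_fragments(fragments, root_tag="combined"):
    """Strip XML declarations from each fragment and wrap them all in one root element."""
    body = "\n".join(XML_DECLARATION.sub("", text).strip() for text in fragments)
    return f'<?xml version="1.0" encoding="UTF-8"?>\n<{root_tag}>\n{body}\n</{root_tag}>\n'
```

Several XML documents cannot simply be concatenated, because a well-formed file may have only one declaration and one root element; removing the extra declarations and adding a single wrapper addresses both.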

Result of 2.x-big_combine_xml.bat

7c) Run the third .bat file, 3.x-makemaster.bat, to transform each *_wellformed.xml file to
*_master.xml.
This step will create a “metadata” folder containing the [aip_id]_master.xml file and copy the contents
of the metadata-working folders to a subfolder of the working directory called “trash”.
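
The transformation itself is an XSLT run. A Saxon command of the following shape would perform it, using Saxon's -s (source), -xsl (stylesheet), and -o (output) options. The exact command inside the batch file is not shown in these instructions, so the pairing of saxon9he.jar with dlg-fits-to-master-stylesheet.xsl and the function name are assumptions.

```python
from pathlib import Path


def saxon_command(wellformed_xml, output_dir,
                  stylesheet="dlg-fits-to-master-stylesheet.xsl",
                  saxon_jar="saxon9he.jar"):
    """Build a Saxon command that turns *_wellformed.xml into *_master.xml in output_dir."""
    source = Path(wellformed_xml)
    master_name = source.name.replace("_wellformed.xml", "_master.xml")
    output = Path(output_dir) / master_name
    return ["java", "-jar", saxon_jar, f"-s:{source}", f"-xsl:{stylesheet}", f"-o:{output}"]
```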

Result of 3.x-makemaster.bat

7d) Run the fourth .bat file, 4.x-prep_bags.bat.
After completing this step, your original subfolders will have the suffix _bag appended to their
names, which indicates that the bagging step was successfully completed. There will also be a set of
files ending in a string of numbers and the extension .tar.bz2. These are the completed AIPs that
are ready for upload to ARCHive.
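
To show what the final packaging produces, the sketch below packs a bagged folder into a .tar.bz2 file using Python's tarfile module in place of 7zip and bzip2. It does not perform the bagging itself and does not reproduce the string of numbers that the batch file adds to the file name; the function name is hypothetical.

```python
import tarfile
from pathlib import Path


def tar_bz2(bag_dir):
    """Pack bag_dir into <bag_dir>.tar.bz2 beside it and return the archive path."""
    bag_dir = Path(bag_dir)
    archive = bag_dir.with_name(bag_dir.name + ".tar.bz2")
    with tarfile.open(str(archive), "w:bz2") as tar:
        # arcname stores only the folder name, not the full path, inside the archive.
        tar.add(str(bag_dir), arcname=bag_dir.name)
    return archive
```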

Result of 4.x-prep_bags.bat

8) Prepare an MD5 manifest using a report from ExactFile or another program capable of producing
an MD5 checksum. This manifest file should be a tab-delimited text file containing the checksum
and the name of the final tarred and zipped bag. If you are going to upload multiple AIPs in a
batch, there should be one manifest for all of the files to be uploaded.
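
If ExactFile is not available, Python's hashlib module can produce the same kind of manifest. This sketch assumes one line per AIP with the checksum first and the file name second, separated by a tab; the column order and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Return the MD5 checksum of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(aip_files, manifest_path):
    """Write one 'checksum<TAB>file name' line per AIP and return the lines."""
    lines = [f"{md5_of(p)}\t{Path(p).name}" for p in aip_files]
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Passing every .tar.bz2 file in a batch to `write_manifest` yields the single manifest described in this step.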



Source Exif Data:
Author                          : mwilloug
Create Date                     : 2018:10:24 17:24:47-04:00