DLG ARCHive AIP Creation Instructions
DLG AIP creation workflow for AIPs to be ingested by ARCHive system using AIP Creation Helper v 2.0
Revised 10/24/2018

NOTE: These instructions assume you are working on a Windows PC (tested with Windows 7). You must have the following software applications installed and available on your system path so that they can be executed from the command line:

Software                            Version    URL
JAVA                                8 (1.8)    https://java.com/en/download/
Perl                                5.16       https://www.perl.org/
Python                              3.x        https://www.python.org/
FITS (File Information Tool Set)    1.3x       https://projects.iq.harvard.edu/fits/home
7zip                                17         https://www.7-zip.org/
Bzip2 for Windows                   1.0.2      http://gnuwin32.sourceforge.net/packages/bzip2.htm
Grep for Windows                    2.5.4      http://gnuwin32.sourceforge.net/packages/grep.htm

1) Identify and stage digital objects to be included in AIP.
As a general rule, DLG AIPs will be created at the digital object level, so usually there will be one item level record in dlgadmin per AIP. Generally, AIPs will be named using dlgadmin record slugs or record IDs; however, newspaper AIPs will be named using the batch name. Newspaper AIPs will be created at the batch level and will include the entire batch, including tif files.
Files should be unzipped. If files have been bagged previously, validate the bag before moving them into the prescribed AIP subfolder structure. Delete thumbs.db files. Make sure you are working in a location with enough space to accommodate the duplication of files necessary to create the AIP.

2) Rename files comprising digital objects and associated metadata to include DLG repository and collection codes ([dlgrepo]_[dlgcoll]_item).

3) Create the stub filesystem that will be transformed into the AIP.
The following processes rely on a file system that strictly follows this pattern:

[aip id]                       parent folder
  --objects                    subfolder
    --[digital_object_id]      subfolder containing files. In most cases this is the same as the AIP ID.

In most situations, this can be done using one of the following Windows batch (.bat) files.
Make sure that files and folders are named correctly before proceeding with any of the AIP creation steps:

noadmin_directories_AIP_stub_filesystem.bat
    Use for a series of subfolders containing files for individual digital objects.
noadmin_ind_AIP_stub_filesystem.bat
    Use for a folder of files that correspond to a separate object.
news_AIP_creation_make_stub_filesystem.bat
    Use for a series of newspaper batches (folders) that will each be archived as a separate AIP.

This will result in a set of files in a folder named “rearranged”. Use this as your working directory for all subsequent processes. All subsequent steps will be executed in the directory containing these [aip id] subfolders. In these instructions, this will be called the working directory.

4) Create an XML file containing AIP level metadata.
You can create this file using a spreadsheet and XMLBlueprint to extract the XML from the spreadsheet. Save this file in your working directory. You can begin the spreadsheet with a TSV export of item level records from dlgadmin and modify it so that it follows this example:

[Figure: example spreadsheet]

Then use XMLBlueprint or Excel to extract the XML from the spreadsheet. The extracted XML file should have the following form:

[Figure: example of the extracted XML]

5) Move the AIP_creation_helper.zip file to your working directory and unpack it.
Copy the current version of the AIP_creation_helper.zip file to your working directory. Right click this file and use 7zip to “unzip here”. If this option is not available on your context menu, you can use any other unzip program for this step, but make sure that 7zip is correctly installed on your machine and that the unzipped files are directly in your working folder, not in a subfolder.
Unzipping the AIP_creation_helper_v2.0 will result in the following files being added to your working directory:

Name                                         Type
make_stub_filesystem                         folder containing batch files to create stub filesystems
move_to_FITS_directory                       folder containing batch file to be copied to FITS directory
1.0_move_fits_xml                            Windows Batch File
2.3-big_combine_xml.bat                      Windows Batch File
3.3-makemaster.bat                           Windows Batch File
4.2-prep_bags.bat                            Windows Batch File
7zprepare_bag.pl                             PL File
bag_and_validate_all_folders_python.cmd      Windows Command Script
bagit.py                                     PY File
close.xml                                    XML File
dlg-fits-to-master-stylesheet.xsl            XSL Stylesheet
fart.exe                                     Application
open.xml                                     XML File
README_SIP_Creation.txt                      Text Document
saxon9he.jar                                 Executable Jar File
split_TSV_source.xsl                         XSL Stylesheet

6) Copy the runfits_external.bat file to the home directory of your FITS installation (where fits.bat is located).
You only have to do this one time; you can then re-use it whenever you create AIPs using this system. Double click to run the batch file. Enter the path (including drive letter) to your working directory when prompted.

7) Run the four .bat files to create the AIP.
Double click on each of the .bat files in numerical order to execute them. The second .bat file, 2.3-big_combine_xml.bat, will ask you to input the name of the XML file you extracted from the spreadsheet. You must enter this exactly as the file is named (case sensitive), including the .xml extension. Be sure to follow any instructions given through the command prompt window. If you notice any errors, you must investigate and correct them before moving on to the next step, or the process will not complete successfully.

7a) Run the first .bat file, 1.x_move_fits_xml.bat
This step will create a metadata-working directory in the stub file system and move the fits.xml files to it.

7b) Run the second .bat file, 2.x-combine_xml.bat.
You will be prompted to enter the name of the AIP level XML file that you created in step 4.
Be sure to include the .xml extension. This step will execute the following processes:
- Split the AIP level XML (extracted from the spreadsheet in step 4) into separate files for each AIP and move them to the appropriate metadata-working folder.
- Combine the .fits.xml files for each AIP into one XML file.
- Strip the XML declarations from the combined FITS XML.
- Regularize the file paths in the FITS XML so that they are relative to the “objects” folder of the filesystem.
- Combine the AIP level XML fragment with the combined FITS XML.
- Add an XML declaration, opening tag, and closing tag so that the resulting XML is well-formed.

[Figure: Result of 2.x_combine-xml.bat]

7c) Run the third .bat file, 3.x_makemaster.bat, to transform each *_wellformed.xml file to *_master.xml
This step will create a “metadata” folder containing the [aip_id]_master.xml file and copy the contents of the metadata-working folders to a subfolder of the working directory called “trash”.

[Figure: Result of 3x-makemaster.bat]

7d) Run the fourth .bat file, 4.x-prep_bags.bat.
After completing this step, your original subfolders will have the suffix _bag appended to them. This indicates that the bagging step was successfully completed. There will also be a set of files ending in a string of numbers and the extension .tar.bz2. These are the completed AIPs that are ready for upload to ARCHive.

[Figure: Result of 4x-prep_bags.bat]

8) Prepare MD5 manifest using report from ExactFile or other program capable of producing an MD5 checksum.
This manifest file should be a tab-delimited text file containing the checksum and the name of the final zipped and tarred bag. If you are going to upload multiple AIPs in a batch, there should be one manifest for all of the files to be uploaded.

Example AIP level metadata values for two test AIPs:
fake_aip01    1    http://archive.libs.uga.edu/dlg    This is a test AIP            http://rightsstatements.org/vocab/CNE/1.0/    testrepo    testcoll
fake_aip02    1    http://archive.libs.uga.edu/dlg    This is another test AIP      http://rightsstatements.org/vocab/CNE/1.0/    testrepo    testcoll
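For step 8, if ExactFile is not available, the manifest could also be produced with the Python 3 installation already required by this workflow. The following is only a minimal sketch, not part of the AIP Creation Helper: the function names and the default manifest file name (manifest.txt) are hypothetical, and it assumes each manifest line is the MD5 checksum, a tab, and the .tar.bz2 file name, in that order. Confirm the exact layout expected by ARCHive before uploading.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large AIPs do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(working_dir, manifest_name="manifest.txt"):
    """Write one tab-delimited line (checksum, file name) for every
    *.tar.bz2 file found directly in working_dir.

    manifest_name is a hypothetical default, not a name required
    by the workflow. Returns the path of the manifest file.
    """
    working_dir = Path(working_dir)
    lines = [
        f"{md5_of_file(aip)}\t{aip.name}\n"
        for aip in sorted(working_dir.glob("*.tar.bz2"))
    ]
    manifest_path = working_dir / manifest_name
    manifest_path.write_text("".join(lines), encoding="utf-8")
    return manifest_path
```

Calling write_manifest with the path to your working directory produces a single manifest covering every completed AIP in that directory, which matches the one-manifest-per-batch rule above.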
Source Exif Data:
File Type : PDF
File Type Extension : pdf
MIME Type : application/pdf
PDF Version : 1.5
Linearized : Yes
Author : mwilloug
Create Date : 2018:10:24 17:24:47-04:00
Modify Date : 2018:10:24 17:24:47-04:00
XMP Toolkit : Adobe XMP Core 5.2-c001 63.139439, 2010/09/27-13:37:26
Format : application/pdf
Title : Microsoft Word - Working_BATCH_FILES_DRAFT_dlg_ARCHIVE_AIP_creation_workflow.docx
Creator : mwilloug
Creator Tool : PScript5.dll Version 5.2.2
Producer : Acrobat Distiller 10.1.16 (Windows)
Document ID : uuid:421347c6-e520-4a9d-9633-08eab0313ecf
Instance ID : uuid:26e4a35d-bcc9-4a44-8121-dd1b02abb2d2
Page Count : 6