- § 1236.40 - Scope of this subpart.
- § 1236.41 - Definitions for this subpart.
- § 1236.42 - Records management requirements.
- § 1236.44 - Documenting digitization projects.
- § 1236.46 - Quality management requirements.
- § 1236.48 - File format requirements.
- § 1236.50 - Requirements for digitizing permanent paper and photographic print records.
- § 1236.52 - Requirements for digitizing permanent mixed-media records.
- § 1236.54 - Metadata requirements.
- § 1236.56 - Validating digitized records and disposition authorities.
§ 1236.40 - Scope of this subpart.
(a) This subpart establishes processes and requirements to ensure that agencies:
(1) Identify the records the agency will digitize in each project;
(2) Account for all records covered by the project, regardless of media type;
(3) Implement quality management techniques to verify equipment performance and monitor processes to detect and correct errors;
(4) Produce complete and accurate digitized records that the agency can use for all the same purposes as the source records; and
(5) Validate that the resulting digitized records meet the standards in this subpart.
(b) This subpart covers the standards and procedures agencies must apply when digitizing permanent paper records using reflective digitization techniques. Such records include most paper-based documents, regardless of size, such as modern textual documents, maps, posters, manuscripts, graphic arts prints (for example, lithographs or intaglio), drawings, bound volumes, and photographic prints. This subpart also covers any records that may be incorporated into mixed-media records.
(c) This subpart does not cover standards and procedures agencies must apply when digitizing permanent records using transmissive digitization techniques. Such records include photographic negatives, transparencies, aerial film, roll film, and micrographic and radiographic materials. In addition, this subpart does not cover digitizing records on dynamic media. Such records include motion picture film, video, and audio tapes.
(d) For guidance on digitizing out-of-scope media types or non-paper-based portions of mixed-media records, such as dynamic media, radiographic, negative or positive film, or other special media types, please contact the Records Management Policy and Standards Team by email at [email protected] or by phone at 301-837-1948.
(e) This subpart does not require that optical character recognition (OCR) be performed during digitization. However, these regulations do not prevent agencies from performing OCR to meet their business needs.
(f) This subpart does not address other applicable laws and regulations governing documents and digital files, including, but not limited to, proper handling of classified or controlled unclassified information (CUI) and compliance with 36 CFR part 1194 (section 508). Agencies should work with their legal counsel and other officials to ensure compliance with these and other applicable requirements.
(g) This subpart also does not address other business needs or legal constraints that may make it necessary for an agency to retain source records for a period of time after digitizing. Agencies should work with their legal counsel and other officials to determine whether such retention might be necessary because it relates to rights and interests, appeal rights, benefits, national security, litigation holds, or other similar reasons.
§ 1236.41 - Definitions for this subpart.
In addition to the definitions contained in § 1220.18 of this subchapter and § 1236.2, the following definitions apply to this subpart:
Accuracy is the degree to which the information correctly describes the object or process being measured. It can be thought of in terms of how close a reading or average of readings is to a true or target value. Accuracy is a different measure than precision.
Adobe RGB is a red, green, blue color space developed to display on computer monitors most of the colors that CMYK color printers produce. The Adobe RGB color space is significantly larger than the sRGB color space, particularly in the cyan and green regions.
Aimpoint is a specific value assigned to a given metric to assess performance achievement.
Artifact (defect) is a general term to describe a broad range of undesirable flaws or distortions in digital reproductions produced during image capture or data processing. Some common forms of image artifacts include noise, chromatic aberration, blooming, interpolation, and imperfections created by compression.
Batch is a group of files that are created under the same conditions or are related intellectually or physically. During digitization, batches represent groups of records that are digitized and undergo QC inspection processes together.
Bit depth is the number of bits used to represent each pixel in an image. The term is sometimes used to represent bits per pixel and at other times, the total number of bits used multiplied by the number of total channels. For example, a typical color image using 8 bits per channel is often referred to as a 24-bit color image (8 bits x 3 channels). Color scanners and digital cameras typically produce 24-bit (8 bits x 3 channels) or 36-bit (12 bits x 3 channels) images, and high-end devices can produce 48-bit (16 bits x 3 channels) images. Bit depth is also referred to as “color depth.”
Clipping is the abrupt truncation of a signal when the signal exceeds a system's ability to differentiate signal values above or below a particular level. In the case of images, the result is that there is no differentiation of light tones when the clipping is at the high end of signal amplitude, and no differentiation of dark tones when clipping occurs at the low end of signal amplitude.
CMYK is a subtractive color model used in printing that is based on cyan (C), magenta (M), yellow (Y), and black (K). These are typically referred to as “process colors.” Cyan absorbs the red component of white light, magenta absorbs green, and yellow absorbs blue. In theory, the mix of the three colors will produce black, but black ink is also used to increase the density of black in a print.
Color accuracy is measured by computing the color difference (ΔE2000) between the digital imaging results of the standard target patches and their premeasured color values. By imaging an appropriate target and evaluating through the software, variances from known values can be determined, which is a good indicator of how accurately the system is recording color. Analytical software measures the average deviation of all color patches measured (the mean).
Color channel misregistration is the measurement of color-to-color spatial dislocation of otherwise spatially coincident color features of a digitized object.
Color management is using software, hardware, and procedures to measure and control color in an imaging system, including capture and display devices.
Color space is a specific organization of colors that supports reproducible representations of color in combination with color profiling supported by various devices. A color space can be a helpful conceptual tool for describing or understanding the color capabilities of a particular device or digital file. Examples of color spaces include Adobe RGB 1998, sRGB, ECIRGB_v2, and ProPhoto RGB.
Compression, lossless is a technique for data compression that will allow the decompressed data to be exactly the same as the original data before compression, bit-for-bit. The compression of data is achieved by coding redundant data in a more efficient manner than in the uncompressed format.
Compression, visually lossless is a form or manner of lossy compression where the data that is lost after the file is compressed and decompressed is not detectable to the human eye, so that the compressed data appears identical to the uncompressed data.
Digital Image Conformance Evaluation (DICE) is the measurement and monitoring component of the Federal Agencies Digital Guidelines Initiative (FADGI) Conformance Program. The program consists of measuring ISO-compliant reference targets and using analysis software such as OpenDICE for testing and monitoring digitization programs to ensure they meet FADGI technical parameters. Agencies can access FADGI-compliant tools and resources online at http://www.digitizationguidelines.gov/guidelines/digitize-OpenDice.html.
Digitization project is any action an agency (including an agent acting on the agency's behalf, such as a contractor) takes to digitize permanent records. For example, a digitization project can range from a one-time digitization effort to a multiyear digitization process; can involve digitizing a single document into a digital records management system or digitizing boxes of records from storage facilities; or can include digitizing active records as part of an ongoing business process or digitizing inactive records for better access.
Digitized record is a digital record created by converting paper or other media formats to a digital form that is of sufficient authenticity, reliability, usability, and integrity to serve in place of the source record.
Dynamic range is the ratio between the smallest and largest possible values of a changeable quantity, frequently encountered in imaging or recorded sound. Dynamic range is another way of stating the maximum signal-to-noise ratio.
Federal Agencies Digital Guidelines Initiative (FADGI) is a collaborative effort by Federal agencies to articulate Technical Guidelines that form the basis for many of the technical parameters in this part, which equate to the FADGI three-star level. Agencies can access FADGI online at http://www.digitizationguidelines.gov/guidelines/digitize-technical.html.
Grayscale is an image type lacking any chromatic data, consisting of shades of gray ranging from white to black. It is most commonly seen as having 8 bits per pixel, which allows for 256 shades or levels of intensity.
Image quality is the degree of perceived or objective measurement of a digital image's overall accuracy in faithfully reproducing an original. A digital image created to a high degree of accuracy meets or exceeds objective performance attributes (such as level of detail, tonal and color fidelity, and correct exposure), and has minimal defects (such as noise, compression artifacts, or distortion).
Lightness uniformity measures how evenly a lens records the lighting of neutral reference targets from center to edge and between points within the image.
Modulation transfer function (MTF)/spatial frequency response (SFR) is the modulation ratio between the output image and the ideal image. SFR measures the imaging system's ability to maintain contrast between progressively smaller image details. Using these two functions, a system can make an accurate determination of resolution related to the sampling frequency.
Newton's Rings are interference patterns that appear as a series of concentric, alternating light and dark rings of colored light (when imaged in a color mode). This type of interference is caused when smooth transparent surfaces come into contact with small gaps of air between the surfaces. The light waves reflect from the top and bottom surfaces of the air film formed between the surfaces, causing light rays to constructively or destructively interfere with each other. The areas where there is constructive interference will appear as light bands and the areas where there is destructive interference will appear as dark bands.
Noise is one or more undesirable image artifact(s) in a digitized record that is not part of the source material.
Pixels per inch (ppi) describes the resolution capabilities of an imaging device, such as a scanner, or the resolution of a digital image. PPI is different from dots per inch (dpi).
Posterization is an effect produced by reducing the number of tones (colors) in an image so that there is a noticeable distinction between one tone and another instead of a gradual shift between them.
Precision is the characteristic of measurement that relates to the consistency between multiple measurements, under uniform conditions, of the same item or process. As opposed to accuracy, precision does not indicate how close a measurement is to a true value.
Quantization is a lossy compression technique that involves compressing a range of values to a single quantum value, usually to reduce file size. This may result in flaws in an image, such as posterization, caused by reducing the data available in an image file to represent aspects like colors.
Raster image is a digitally encoded representation of a subject's tonal and brightness information into a bitmap. Data from digital cameras and scanning devices record light characteristics as numerical values into a grid, or raster, of picture elements (pixels).
Reference target is a chart of test patterns and patches with known standard values used to evaluate the performance of an imaging system.
Reflective digitization is a process in which an imaging system captures reflected light off of scanned objects such as bound volumes, loose pages, cartographic materials, illustrations, posters, photographic prints, or newsprint.
Reproduction scale accuracy measures the relationship between the physical size of the original object and the size in pixels per inch (PPI) of that object in the digital image.
Resolution is the level of spatial detail rendered by an imaging system as measured by MTF/SFR.
Sampling frequency measures the imaging spatial resolution and is computed as the physical pixel count or pixels per unit of measurement, such as pixels per inch (PPI). This parameter provides information about the size of the original and the data needed to determine the level of detail recorded in the file. (See also modulation transfer function (MTF)/spatial frequency response (SFR).)
Sharpening artificially enhances details to create the illusion of greater definition. Image quality testing using the SFR quantifies the level of sharpening introduced by imaging systems or applied by users in post-processing actions.
Source record is the record from which a digitized version or digitized record is created. The source record should be the record copy that was used in the course of agency business.
Spatial resolution determines the amount (for example, quantity, PPI, megapixels) of data in a raster image file in terms of the number of picture elements or pixels per unit of measurement, but it does not define or guarantee the quality of the information. Spatial resolution defines how finely or widely spaced the individual pixels are from each other. The actual rendition of fine detail is more dependent on the SFR of the scanner or digital camera.
sRGB is a standard RGB color space created by HP and Microsoft for use on monitors, printers, and the internet. sRGB uses the ITU-R BT.709-5 primaries that are also used in studio monitors and HDTV, and a transfer function (gamma correction) typical of CRTs (cathode ray tube TVs and computer monitors), all of which permits sRGB to be directly displayed on typical monitors. The sRGB gamma is not represented by a single numerical value. The overall gamma is approximately 2.2, consisting of a linear (gamma 1.0) section near black, and a non-linear section elsewhere involving a 2.4 exponent and a gamma changing from 1.0 through about 2.3.
Tolerance is the allowable deviation from a specified value.
Tone response or optoelectronic conversion function (OECF) is a measure of how accurately the digital imaging system converts light levels into digital pixels.
Transmissive digitization is a process in which the system transmits light through a photographic slide or negative.
White balance error is a measurement of the digital file's color neutrality. The definition of “neutral” is not universal: RGB workflows that use digital count values encode neutral as defined by the International Color Consortium (ICC) color space chosen. L*a*b* workflows define neutral as 0 on the a* axis and b* axis, with the lightness recorded from 0-100 on the L* axis.
§ 1236.42 - Records management requirements.
(a) Before starting a digitization project, agencies must establish intellectual control of the records that will be digitized. Intellectual control means having the information necessary to identify and understand the content and context of the records. One traditional records management technique to establish intellectual control is the creation of an inventory. The inventory must identify whether the records are complete, if there are any gaps in coverage or missing records, the presence of any mixed-media records, the disposition schedule under which the records fall, the date range when the records were created, any access or use restrictions that apply to the records, and the records' storage location.
(b) Agencies must identify any relationships between the source records in order to retain these relationships between the digitized versions. For example, are there case files that are associated by case number? Does a folder contain multiple documents that are stapled together? Are there digital components of a mixed-media file stored on removable media (DVD or USB drives)? What is the relationship of the folder to other folders in a box? Any relationships must be captured as part of the digitization process:
(1) Through metadata (See § 1236.54 for metadata requirements);
(2) By organizing the folder structure of a file system;
(3) By using file formats that allow for multi-page files, such as PDF or TIFF; or
(4) Through a combination of these approaches.
(c) In addition, the inventory can be used to identify all the elements of physical control needed for the records to be digitized. Physical control includes understanding the physical characteristics of source records. Physical characteristics determine a project's scope, and the image capture techniques and equipment to be used. For example, the type of paper, the type of printing, or the size of the records can impact what methods and equipment are used to digitize records.
(d) There are additional considerations for managing the source records during the digitization process:
(1) Ensure there are appropriate safeguards for the source records to prevent their loss or damage.
(2) Restrict access to source records while they are being digitized to minimize the risk of unauthorized additions, deletions, or alterations.
(3) Ensure there is a process to identify and document gaps in coverage or missing records.
(e) Agencies must ensure that records are free from unauthorized alteration, destruction, or deletion by complying with the mechanisms and controls specified in §§ 1236.10 and 1236.20:
(1) The agency may generate checksums using the SHA-256 hash algorithm and record them as technical metadata in a recordkeeping system for each image file when digitization is complete and the agency determines that the records are no longer in active use and the metadata are no longer subject to any changes that may result from ongoing business use. Use the checksums to monitor the digitized records for corruption or alteration and capture them as metadata as required in § 1236.54; or
(2) The agency may perform file integrity monitoring or file comparison audits.
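The checksum option in paragraph (e)(1) can be illustrated with a minimal sketch using Python's standard `hashlib` module. The function names, the chunk size, and the idea of holding recorded checksums in a dictionary keyed by relative path are assumptions made for illustration; this subpart does not prescribe a tool or a storage layout.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_altered_files(recorded: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed.

    `recorded` is a hypothetical mapping of relative file path to the
    checksum captured as technical metadata when digitization was complete.
    """
    altered = []
    for relative_path, expected in sorted(recorded.items()):
        target = root / relative_path
        if not target.is_file() or sha256_of_file(target) != expected:
            altered.append(relative_path)
    return altered
```

Running `find_altered_files` on a schedule against the recorded values is one way to monitor for corruption or alteration. An empty list means every recorded file is present and unchanged.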
(f) If there are born-digital records that are part of the record series within the project, follow the instructions for managing mixed-media records in § 1236.52.
§ 1236.44 - Documenting digitization projects.
Agencies must create digital documentation when digitizing permanent source records. The agency must retain this documentation alongside the digitized records until the digitized records have been transferred to NARA and NARA has notified the agency that the accessioning process is complete. The agency must dispose of the documentation in accordance with an appropriate General Records Schedule (GRS) or agency records schedule. The required documentation will help the agency populate the Transfer Request instrument (TR) in NARA's Electronic Records Archives (ERA). The following documents are required:
(a) A defined project plan that identifies:
(1) Record series or file units to be digitized;
(2) Method that will be used to name digitized records;
(3) Estimated date range of the source records;
(4) Missing pages;
(5) Gaps or missing records in the series. Depending on the type of gap or missing records, indicate if there will be charge-out cards for skipped or missing records that will be inter-filed if they are transferred at a later date;
(6) Estimated volume, media types, dimensions, physical characteristics, and condition of the source records;
(7) Equipment and software used to digitize records;
(8) Estimated file storage requirements for the digitized records. The file storage needs may affect project decisions, such as compression and file format;
(9) Any access or use restrictions that apply to the records;
(10) Method used to capture the relationships that exist between source records once they are digitized; and
(11) Any metadata element labels that differ from those specified in § 1236.54.
(b) Any information needed to associate the digitized records to the source records' agency records schedule(s) including the item numbers;
(c) Any related finding aids, indexes, inventories, logs, registers, or metadata schemas the agency uses to manage the records that can serve as sources for the metadata required in § 1236.54.
(d) A quality management (QM) plan that ensures the project meets the quality assurance (QA) objectives and quality control (QC) inspection procedures.
(1) The quality management plan must include the policies, functions, roles, responsibilities, requirements, and objectives for the project.
(2) The quality assurance component of the QM plan must include documentation of:
(i) Image quality performance parameters selected to capture the information present in the source records;
(ii) Equipment and device acceptance testing methods and results;
(iii) Design reviews to evaluate if digitization workflows meet the requirements; and
(iv) Training conducted.
(3) The quality control component of the QM plan must document:
(i) The procedures used to inspect image quality;
(ii) The procedures used to inspect metadata quality;
(iii) The corrective actions taken to mitigate deviations throughout all phases of the project; and
(iv) The procedures used to verify that digitized records conform to the requirements.
§ 1236.46 - Quality management requirements.
(a) Quality assurance (QA) requirements. The agency must meet the image quality performance parameters specified in § 1236.50 by verifying how well the equipment meets the aim points and tolerances of the parameters. The agency cannot rely solely on equipment specifications, such as scanner ppi settings or camera sensor megapixels, to ensure digital image quality.
(1) The agency must use QA processes to:
(i) Quantify scanner or camera performance before selecting the equipment by scanning a reference target and measuring the results with analytical software to determine if the equipment meets the technical parameters;
(ii) Evaluate internal or external vendor imaging systems against image quality performance parameters;
(iii) Monitor equipment performance by quantifying scanner or camera performance during digitization; and
(iv) Verify that resulting digital files meet project specifications.
(b) Quality Control (QC) requirements. The agency must implement QC inspection and monitoring processes to ensure that images meet the digitization image quality parameters in § 1236.50.
(1) The Federal Agencies Digital Guidelines Initiative (FADGI) Digital Image Conformance Evaluation program (DICE) is a QC inspection and monitoring process that uses image targets and analysis software to verify compliance. Applied properly, this methodology will ensure agencies meet the requirements in § 1236.50.
(2) If the agency does not adopt the FADGI Conformance Evaluation program, it must document both the procedures used and how it verified conformance to the quality parameters.
(c) Quality Control (QC) testing and analysis. During the digitization process, the agency must perform QC testing and analysis to identify malfunctioning or improperly configured digitization equipment, improper software application settings, incorrect metadata capture, or human error, and take corrective actions. It must:
(1) Implement an image quality analysis process and use reference targets to verify that digitization devices conform to imaging parameters in this subpart;
(2) Replace reference targets as they fade or accumulate dirt, scratches, and other surface marks that reduce their usability;
(3) Regularly test equipment to ensure scanners and digital cameras/copy systems are performing optimally. It must:
(i) Scan a reference target containing a grayscale, color chart, and accurate dimensional scale at the beginning of each workday;
(ii) Use image quality analysis software to verify that the performance evaluation specifications are being met; and
(iii) Perform additional tests when problems are detected.
(4) Test equipment with the specific software/device driver combination(s), and re-test after any changes to the workflow; and
(5) Ensure that equipment operation, settings, and image processing actions are the same as those used to evaluate the test target. Turn off auto correction settings in the capture equipment such as “auto exposure” that may cause non-conformance of the target evaluation or the resulting image files.
(d) Quality control inspection. (1) The agency must perform QC inspections of the digital records for compliance with the technical parameters and criteria specified in this subpart. The inspection must ensure 100% of the image files:
(i) Can open and be displayed;
(ii) Are encoded with a compression type and in a format specified in § 1236.48; and
(iii) Have the resolution, color mode, bit depth, and color profile specified in § 1236.50.
(2) The agency must perform a visual inspection using a statistically valid technique:
(i) The agency may visually inspect a random sample of a minimum of ten digital records or 10% of each batch of digital records, whichever is larger; or
(ii) The agency may employ a statistically valid sampling plan to verify that the image quality, file quality, metadata quality, and completeness requirements have been met. Agencies that employ their own sampling technique must include documentation of the method used, as specified in § 1236.44(d)(3)(i).
(3) Visual inspection must be conducted using a calibrated graphics workstation and using a monitor set to 100% magnification to check the following image quality characteristics:
(i) Image tone, brightness, contrast, and color accuracy match the specifications in § 1236.50;
(ii) Images are free from clipping (missing detail lost in highlights or shadows);
(iii) Images are free from color channel misregistration or quantization errors;
(iv) Images are free of any image artifacts that compromise the informational content of the record, such as dust, Newton's rings, missing pixels, scan lines, drop-outs, flare, or over-sharpening; and
(v) Images are not improperly cropped, have the expected dimensions and orientation (landscape/horizontal or portrait/vertical), and images are not flipped, inverted, or skewed.
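The fixed sampling option in paragraph (d)(2)(i) of this section (a random sample of at least ten digital records or 10% of the batch, whichever is larger) can be sketched as follows. Two points are assumptions not stated in the text: a fractional 10% is rounded up to the next whole record, and a batch with fewer than ten records is inspected in full. The function names are hypothetical.

```python
import random
from typing import Optional, Sequence


def visual_inspection_sample_size(batch_size: int) -> int:
    """Return the larger of ten records or 10% of the batch.

    Assumptions: 10% is rounded up, and a batch smaller than ten
    records is inspected in its entirety.
    """
    if batch_size < 0:
        raise ValueError("batch_size must not be negative")
    ten_percent = (batch_size + 9) // 10  # integer ceiling of batch_size / 10
    return min(batch_size, max(10, ten_percent))


def draw_inspection_sample(record_ids: Sequence[str],
                           seed: Optional[int] = None) -> list:
    """Randomly select the records to inspect visually from one batch."""
    size = visual_inspection_sample_size(len(record_ids))
    return random.Random(seed).sample(list(record_ids), size)
```

For a batch of 250 records this selects 25 records; for a batch of 50 it selects 10. Passing a `seed` makes the selection repeatable, which may help when documenting the inspection.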
(e) Corrective measures. If the inspection reveals errors, perform the following steps until there is a 100% success rate for the sample set:
(1) If 1% or more of examined records fail to meet any of the criteria in § 1236.50, determine the source and scope of any errors, correct or re-digitize affected records, and reinspect the images by following the requirements in paragraph (d) of this section;
(2) If less than 1% of examined records fail to meet any of the criteria in § 1236.50, determine the source and scope of any errors and correct or re-digitize the affected records.
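The 1% threshold in paragraph (e) can be expressed as a small decision function. This is an illustrative sketch only: the returned labels are hypothetical names for the two outcomes described above, and the comparison uses integer arithmetic so that a failure rate of exactly 1% falls into the "1% or more" case.

```python
def corrective_action(examined: int, failed: int) -> str:
    """Classify an inspection result against the 1% failure threshold.

    Returns one of three hypothetical labels:
      "pass"                   no examined record failed
      "correct_and_reinspect"  1% or more failed (paragraph (e)(1))
      "correct_only"           fewer than 1% failed (paragraph (e)(2))
    """
    if examined <= 0:
        raise ValueError("examined must be a positive number of records")
    if failed < 0 or failed > examined:
        raise ValueError("failed must be between 0 and examined")
    if failed == 0:
        return "pass"
    # failed / examined >= 0.01, written without floating-point division
    if failed * 100 >= examined:
        return "correct_and_reinspect"
    return "correct_only"
```

With 200 examined records, 2 failures (exactly 1%) lead to correction and reinspection, while 1 failure (0.5%) leads to correction of the affected records only.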
(f) Inspection for other quality aspects. The agency must inspect the resulting files to verify that they meet the metadata and records completeness requirements:
(1) Metadata quality. The agency must evaluate the accuracy of metadata. This may be done using automated techniques if appropriate. Otherwise, the QC inspections must be done manually. These inspections must ensure that:
(i) Files are named according to project specifications; and
(ii) Correct administrative, descriptive, and technical metadata are captured in a recordkeeping system and in image files.
(2) Records completeness. The agency must employ automated and visual inspection processes to verify the completeness and accuracy of digitization:
(i) Verify that all records have been accounted for by referring to box lists, folder title lists, or other inventories;
(ii) Compare source records with their digitized versions to verify that 100% of the informational content has been captured;
(iii) Compare source records with their digitized versions to verify the digitized records are in the same order;
(iv) Examine records for related envelopes, notes, or other forms of media to verify that all sources of record information have been digitized;
(v) Verify that any mixed-media records that cannot be digitized are associated with the digitized records using the “Relation” metadata elements in § 1236.54(c); and
(vi) Confirm that missing pages or images have been noted in the project documentation.
§ 1236.48 - File format requirements.
(a) The agency must encode, retain, and transfer digitized records in one of the following file formats, either uncompressed or using one of the specified compression codecs in tables 1 and 2 to this section.
(1) Agencies that combine multiple uncompressed TIFF images into PDF/A files using JPEG 2000 compression must perform the quality inspection step specified in § 1236.46(d) against the resulting PDF/A files.
(2) When using JPEG 2000 visually lossless compression, agencies must determine the amount of compression to apply, not to exceed 20:1, by performing tests and visually evaluating for compression artifacts that obscure or alter the information content.
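The 20:1 ceiling in paragraph (a)(2) can be checked arithmetically. The sketch below assumes the ratio is defined as uncompressed image data size divided by compressed file size, which is the usual convention but is not spelled out in this section. The function names are hypothetical. A ratio within the limit does not replace the required visual evaluation for compression artifacts.

```python
def uncompressed_size_bytes(width_px: int, height_px: int,
                            channels: int, bits_per_channel: int) -> int:
    """Size of the raw pixel data, rounded up to a whole byte."""
    total_bits = width_px * height_px * channels * bits_per_channel
    return (total_bits + 7) // 8


def compression_ratio(uncompressed_bytes: int, compressed_bytes: int) -> float:
    """Ratio of uncompressed to compressed size (assumed definition)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return uncompressed_bytes / compressed_bytes


def within_ratio_limit(uncompressed_bytes: int, compressed_bytes: int,
                       max_ratio: int = 20) -> bool:
    """True when the compression ratio does not exceed max_ratio:1."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return uncompressed_bytes <= max_ratio * compressed_bytes
```

For example, a 2550 x 3300 pixel image with three 8-bit channels holds 25,245,000 bytes of raw pixel data. A 1,500,000-byte JPEG 2000 file of that image is within the limit (about 16.8:1), while a 1,000,000-byte file is not (about 25.2:1).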
(b) The agency must encode, retain, and transfer digitized permanent paper records in one of the following file formats, either uncompressed or with one of the compression codecs specified in table 1 to this paragraph (b).
Table 1 to Paragraph (b)
| Format name and version | Acceptable compression codecs |
|---|---|
| TIFF 6.0 | Uncompressed, Deflate (ZIP). |
| JPEG2000 part 1 (ISO/IEC 15444-1:2019) | JPEG 2000 part 1 core coding system lossless compression. Agencies may use up to 20:1 visually lossless compression. |
| Portable network graphics 1.2 (PNG) | Deflate (ZIP). |
| PDF/A (Select any version of PDF/A that meets project requirements. However, do not use the attachments feature in PDF/A-3 or PDF/A-4.) | Deflate (ZIP), JPEG 2000 part 1 core coding system lossless compression. Agencies may use up to 20:1 visually lossless compression. |
(c) The agency must encode, retain, and transfer digitized photographic print records in one of the following file formats, either uncompressed or with one of the compression codecs specified in table 2 to this paragraph (c).
(1) For a series of predominantly textual records with interspersed photographic prints, use the formats in table 1 to paragraph (b) of this section for paper records. All photographic prints must be digitized according to the standards in § 1236.50.
(2) For a series of predominantly printed photographs, including those with paper records interspersed, use the file formats in table 2 to this paragraph (c) for photographic print records.
(3) However, the agency must not transcode or interpolate (upsample) files anywhere in the workflow.
Table 2 to Paragraph (c)
| Format name and version | Acceptable compression codecs |
|---|---|
| TIFF 6.0 | Uncompressed, Deflate (ZIP). |
| JPEG2000 part 1 (ISO/IEC 15444-1:2019) | JPEG 2000 part 1 core coding system lossless compression. Agencies may use up to 20:1 visually lossless compression. |
| Portable network graphics 1.2 (PNG) | Deflate (ZIP). |
§ 1236.50 - Requirements for digitizing permanent paper and photographic print records.
(a) Overview. This section describes the minimum requirements appropriate for digitizing paper records. Depending on the physical characteristics of the source records, the agency must select the applicable specifications described in either table 1 to paragraph (d) of this section for modern textual paper records or table 2 to paragraph (e) of this section for photographic prints and paper records with fine details. Agencies must implement appropriate equipment, lighting, special handling, or imaging methods to ensure the capture of all information. Agencies may exceed these requirements, if necessary, to capture fine detail or to meet their own business needs.
(b) Image quality parameters. The performance parameters are based on FADGI three-star aim points and tolerance ranges.
(c) Equipment requirements. The equipment used to digitize Federal records must be appropriate for the media type, and capable of achieving documented project objectives without damaging the source records.
(d) Requirements for digitizing modern textual paper records. For these records, produce image files at a minimum of 300 ppi sized to the source document.
(1) Records suitable for the specifications in table 1 to this paragraph (d) for modern textual paper records are modern textual documents with a well-defined printed type (such as typeset, typed, laser-printed), and with moderate to high contrast between the ink of the text and the paper background. Performance metric values in table 1 for modern textual paper records conform to the FADGI “Documents (Unbound): Modern Textual Records” category, and are appropriate when source records do not have visible content with L* values darker than 20. Neutral reference patches on the evaluation test target with L* less than 20 are not used for analysis.
(2) For other paper records such as manuscripts, illustrations, graphics, and documents with poor legibility or diffuse characters (such as carbon copies or Thermofax) that have visible content with L* values darker than 20, agencies must evaluate neutral reference patches on the evaluation test target with L* greater than 20. (These values equate to FADGI three-star for “Documents (Unbound): General Collections”).
(3) The agency must digitize in an acceptable RGB color mode if records contain color or other characteristics that are necessary to interpret the information of the source record, or that would be lost when digitizing using grayscale gamma 2.2.
(4) At a minimum, the agency must digitize the paper records covered by this paragraph to the following parameters:
Table 1 to Paragraph (d)
| Digital file specifications | Attributes |
|---|---|
| Color mode | color or grayscale. |
| Bit depth | 8 or 16. |
| Color space | gray gamma 2.2, AdobeRGB1998, sRGB, ProPhoto RGB, ECIRGBv2. |
| Resolution (Sampling Frequency) (Units are Pixels Per Inch/ppi minus Reproduction Scale Accuracy) | ≥294 ppi (300 ppi − 2%). |

| Measurement parameters | Performance metric values. Difference from aim (applies to 20 ≤ L* ≤ 100). |
|---|---|
| Tone Response (OECF) L* (Units Colorimetric ΔL*) gray patches that meet the measurement parameters | ± 5. |
| White Balance (Units Colorimetric ΔE(a*b*)) gray patches that meet the measurement parameters | ≤4%. |
| Lightness Uniformity (Units Colorimetric: Standard Deviation Divided by Mean L*) | ≤3%. |
| Average Color Accuracy (Units Colorimetric: Mean ΔE 2000, for patches meeting the measurement parameters) | ≤ 3.5. |
| Color Accuracy 90th Percentile (Units Colorimetric: 2.5 times average deviation for patches meeting the measurement parameters) | ≤ 8.75. |
| Color Channel Misregistration (Units Pixels) | |
| SFR 10 (Sampling Efficiency) (Measurement is a Ratio %) | >80%. |
| MTF50 (50% SFR) (Percentage of Half Sampling Frequency) [Lower, Upper] | Percentage of half sampling frequency: [>40%, <75%]. |
| Reproduction Scale Accuracy (Units % Difference from Header PPI) | <±2%. |
| Sharpening (Units Max Modulation) | < 1.1. |
| Noise (Upper Limit) (Units Std Dev of L*) | |
| Noise (Warning Limit) (Units Std Dev of L*) | ≥.25. |
(e) Requirements for digitizing photographic prints and paper records that have fine details. Records that have fine detail, require a high degree of color accuracy, or have other unique characteristics, must be captured using the specifications in table 2 to this paragraph (e) for photographic prints and paper records with fine details. For these records, produce image files (as described in table 2) at a minimum of 400 ppi sized to the source document (these performance values equate to FADGI three-star category “Prints and Photographs”). It may be necessary to apply a higher resolution than the minimum for some records that have fine detail.
(1) These specifications apply to records such as photographic prints, graphic-arts prints (for example, lithographs or intaglio), drawings, embossed seals, and records that have information that cannot be captured by the parameters in table 1 to paragraph (d) of this section for modern textual paper records.
(i) For records in which the smallest significant detail is 1.0 mm or smaller, such as aerial photographs and topographic maps (which require a high degree of enlargement and precision to ensure the dimensional accuracy of the scans), the agency must increase the resolution to capture all the information in the source record.
(ii) For many imaging devices, increasing the ppi settings may not increase the actual resolution level or capture the desired detail. The equipment for digitizing records with fine detail must be capable of meeting the higher quality parameters. It may be necessary to exceed the parameters in table 2 to this paragraph (e) to capture all the information inherent in the records.
(2) The agency must digitize photographic prints, including monochrome and black and white, using a color mode.
(3) The agency must digitize in an acceptable color mode if records contain color or other characteristics that are necessary to interpret the information of the source record, or that would be lost when digitizing using grayscale gamma 2.2.
(4) At a minimum, agencies must digitize all records covered by this paragraph to the following parameters:
Table 2 to Paragraph (e)
| Digital file specifications | Attributes |
|---|---|
| Color mode | color or grayscale. |
| Bit depth | 8 or 16. |
| Color space | Gray gamma 2.2, AdobeRGB1998, ProPhoto RGB, ECIRGBv2. |
| Resolution (Sampling Frequency) (Units are Pixels Per Inch/ppi minus Reproduction Scale Accuracy) | ≥392 ppi (400 ppi − 2%). |

| Measurement parameters | Performance metric values |
|---|---|
| Tone Response (OECF) L* (Units Colorimetric ΔL2000*) for any given gray patch | ± 4. |
| White Balance (Units Colorimetric ΔE(a*b*)) for any given gray patch | ≤4. |
| Lightness Uniformity (Units Colorimetric: Standard Deviation Divided by Mean) | <3%. |
| Average Color Accuracy (Units Colorimetric: Mean ΔE 2000, average deviation of all patches) | <3.5. |
| Color Accuracy 90th Percentile (Units Colorimetric: 2.5 times average deviation of all patches) | <8.75. |
| Color Channel Misregistration (Units Pixels) | <0.5 pixel. |
| SFR10 (Sampling Efficiency) (Measurement is a Ratio %) | 80%. |
| SFR50 (50% SFR) (Units Percentage of Half Sampling Frequency) [Lower, Upper] | Percentage of half sampling frequency: [>40%, <75%]. |
| Reproduction Scale Accuracy (Units % Difference from Header PPI) | <± 2%. |
| Sharpening (Units Max Modulation) | <1.1. |
| Noise (Upper Limit) (Units Std Dev of L*) | <2. |
| Noise (Lower Limit) (Units Std Dev of L*). A warning should be raised if the image doesn't meet this criteria | ≥.25. |
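The resolution floors in the two tables follow from the nominal resolution less the 2% reproduction scale tolerance: 300 ppi yields 294 ppi and 400 ppi yields 392 ppi. The sketch below reproduces that arithmetic and checks an image against it. The function names and the sample pixel counts are hypothetical, and the check covers only the resolution row, not the other performance metrics.

```python
def minimum_effective_ppi(nominal_ppi, tolerance_percent=2):
    """Nominal resolution reduced by the reproduction scale tolerance."""
    return nominal_ppi * (100 - tolerance_percent) / 100


def measured_ppi(pixels, inches):
    """Pixels per inch along one dimension of the source record."""
    return pixels / inches


def meets_resolution(pixels, inches, nominal_ppi, tolerance_percent=2):
    """True when the measured ppi is at or above the floor."""
    floor = minimum_effective_ppi(nominal_ppi, tolerance_percent)
    return measured_ppi(pixels, inches) >= floor


textual_floor = minimum_effective_ppi(300)  # modern textual paper records
print_floor = minimum_effective_ppi(400)    # photographic prints, fine detail
```

For an 8.5 inch wide page, `meets_resolution(2499, 8.5, 300)` passes exactly at the floor, while 2,490 pixels across the same width falls short.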
§ 1236.52 - Requirements for digitizing permanent mixed-media records.
Mixed-media files are records that belong together or relate to a common topic and are stored on more than one media type. Mixed-media files result from the processes agencies use to create, maintain, and use records. For example, a case file may include paper records, online digital records, and digital records on storage media.
(a) For any non-paper media, agencies must analyze the contents to determine whether any files are records.
(1) If the media contains records that are temporary, manage them according to their appropriate GRS or agency-specific records authority.
(2) If the media contains records that are permanent, but not part of the digitized record series, locate their disposition schedule and capture them in a digital information system that complies with the requirements in § 1222.26 of this subchapter and §§ 1236.10 through 1236.14.
(3) If the media contains born-digital components of mixed-media files that are related to the digitized records series, capture the born-digital records in a recordkeeping system in accordance with § 1222.26 of this subchapter and associate the born-digital records with any related records once they are digitized using the “Relation” metadata elements in § 1236.54.
(4) If they are permanent records stored on a media type that is out of scope for this subpart, document this information according to the instructions in § 1236.44. Agencies must maintain the association between records using the “Relation” metadata elements specified in § 1236.54.
(b) Contact the Records Management Policy and Standards Team at [email protected] for guidance on what to do with types of media in a mixed-media file that are outside the scope of this subpart, such as dynamic media, x-rays, negative or positive film, or other special media types.
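Paragraphs (a)(3) and (a)(4) rely on the “Relation” metadata elements to keep the components of a mixed-media file associated. One consistency check an agency might choose to run, sketched below, is that every component listed under “Relation: Has Part” points back through “Relation: Is Part Of”. The regulation does not prescribe this check. The data layout, the function name, and the record IDs are all hypothetical.

```python
def unreciprocated_relations(records):
    """Return (parent_id, part_id) pairs whose part does not point back.

    `records` maps a Record ID to a dict of metadata, in which the two
    Relation labels hold lists of Record IDs (an assumed layout).
    """
    problems = []
    for record_id, metadata in records.items():
        for part_id in metadata.get("Relation: Has Part", []):
            part = records.get(part_id, {})
            if record_id not in part.get("Relation: Is Part Of", []):
                problems.append((record_id, part_id))
    return problems


# Hypothetical case file with one paper component and one born-digital disc.
records = {
    "CASE-0001": {"Relation: Has Part": ["CASE-0001-P01", "CASE-0001-CD1"]},
    "CASE-0001-P01": {"Relation: Is Part Of": ["CASE-0001"]},
    "CASE-0001-CD1": {},  # the back-reference was never recorded
}
problems = unreciprocated_relations(records)
```

Here `problems` flags the disc component, which is listed as a part but carries no link back to the case file.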
§ 1236.54 - Metadata requirements.
(a) General. To ensure that intellectual and physical control of the digital records can be maintained, this regulation specifies metadata elements that must be captured in a recordkeeping system, or embedded in each file, or both captured in a recordkeeping system and embedded in each file. Ensure that the metadata remains accurate and consistent regardless of where it is stored.
(1) If using metadata to capture relationships between source records as required in § 1236.42(b), agencies must use the “Relation” metadata elements in table 1 to paragraph (c)(1) of this section for basic administrative metadata.
(2) If using different metadata labels from the ones required in this section, agencies must document the labels that the agency uses and note this in the Details section of the ERA (Electronic Records Archive) Transfer Request (TR).
(3) Determine the appropriate level to be used as the source of descriptive metadata. Depending on the agency's existing recordkeeping practices and level of intellectual control, use information from the project, record series, file unit, or item level as the source for administrative, technical, and descriptive metadata fields. If the components of a record have not been individually indexed with unique descriptions, apply the series or file unit-level descriptions to all of the image files within that grouping. If the components of the record do not have individual titles, the agency must apply the item Record IDs instead.
(4) Include additional metadata if it is captured. If other metadata elements are provided in addition to the metadata requirements in this subpart, NARA will accept that metadata as part of the transfer process.
(b) Metadata capture requirements. Agencies must:
(1) Capture the metadata specified by paragraphs (c) through (e) of this section at the file or item level as part of the digitization project;
(2) Create file names and record IDs that are unique to each image file;
(3) Embed the metadata specified by paragraph (c) of this section in each image file, capture and maintain it in a recordkeeping system, associate it with the records it describes, and keep it consistent and accurate in both places;
(4) Ensure that scanning equipment embeds the system-generated technical metadata specified by table 4 to paragraph (e)(1) of this section for format technical metadata and table 5 to paragraph (e)(2) of this section for processing technical metadata in each image file, and ensure that image processing does not alter or delete it; and
(5) Transfer metadata to NARA in CSV format.
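Two of the requirements in paragraph (b) lend themselves to a short illustration: file names and record IDs must be unique to each image file, and the metadata must be transferred as CSV. The sketch below checks uniqueness before writing the CSV text. The column selection, function names, and sample values (including the schedule item number) are hypothetical, and the regulation does not define a column order.

```python
import csv
import io
from collections import Counter

FIELDS = [
    "Identifier: File Name",
    "Identifier: Record ID",
    "Identifier: Records Schedule Item #",
    "Title",
]


def duplicate_values(rows, label):
    """Values that appear more than once under the given metadata label."""
    counts = Counter(row[label] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def metadata_to_csv(rows, fields=FIELDS):
    """Return CSV text, refusing rows with repeated file names or record IDs."""
    for label in ("Identifier: File Name", "Identifier: Record ID"):
        dupes = duplicate_values(rows, label)
        if dupes:
            raise ValueError(f"{label} is not unique: {dupes}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows; every value here is invented for illustration.
sample_rows = [
    {"Identifier: File Name": "box01_0001.tif", "Identifier: Record ID": "R-0001",
     "Identifier: Records Schedule Item #": "DAA-0000-2024-0001-0001",
     "Title": "Letter to the Regional Director"},
    {"Identifier: File Name": "box01_0002.tif", "Identifier: Record ID": "R-0002",
     "Identifier: Records Schedule Item #": "DAA-0000-2024-0001-0001",
     "Title": "R-0002"},
]
csv_text = metadata_to_csv(sample_rows)
```

The second row uses its Record ID as the title, following the rule in this section for records without individual titles.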
(c) Administrative metadata. (1) Capture in a recordkeeping system and embed in each image file the following administrative metadata:
Table 1 to Paragraph (c)(1)
| Metadata label | Description | Requirement level |
|---|---|---|
| Identifier: File Name | The complete name of the computer file, including its extension | Mandatory (file names are an inherent attribute of each file so there is no need to embed them as an element of metadata). |
| Identifier: Record ID | The unique identifier assigned by an agency or a records management system. 36 CFR 1236.20(b)(1) requires that agencies assign unique identifiers to each record | Mandatory. |
| Identifier: Records Schedule Item # | The number assigned to the agency records schedule or GRS item to which the record belongs | Mandatory. |
| Relation: Has Part | A related record that is either physically or logically required in order to form a complete record. Mixed-media files that contain records on multiple media types must use this element to identify all components | Mandatory if a record includes multiple parts, such as the component parts of a case file or mixed-media file. |
| Relation: Is Part Of | A related record or file in which the described record is physically or logically included. Use this element to indicate that a record is a component of a mixed-media file | Mandatory if file is a component of a multi-part record. |
(2) Capture in a recordkeeping system and embed in each file any of the following access and use restrictions metadata inherited from the source records:
Table 2 to Paragraph (c)(2)
| Metadata label | Required fields | Description | Requirement level |
|---|---|---|---|
| Access Restrictions | Access Restriction Status | Indicate whether or not there are access restrictions on the record | Mandatory. |
| | Specific Access Restriction | Specific access restrictions on the record, based on national security considerations, donor restrictions, court orders, and other statutory or regulatory provisions, including Privacy Act and Freedom of Information Act (FOIA) exemptions | Mandatory if access restriction exists. |
| Use Restrictions | Use Restriction Status | Indicate whether or not there are use restrictions on the record | Mandatory. |
| | Specific Use Restriction | The type of use restrictions on the record, based on copyright, trademark, service mark, donor, or statutory provisions | Mandatory if use restriction exists. |
| | Rights: Rights Holder | A person or organization owning or managing intellectual property rights relating to the record | Mandatory if there is a rights holder. |
(d) Descriptive metadata. Capture the following descriptive metadata from source records at the lowest level needed to support access and preservation and to maintain contextual information. Depending on the agency's existing recordkeeping practices and level of intellectual control, it may use information from the project level, record series, file unit, or item, as the source for descriptive metadata. If the components of a record have not been individually indexed with unique descriptions, apply the series or file unit-level descriptions to all of the image files within that grouping. If source records share a common material type or dimensions, auto-populate the source type and source dimension metadata. If the components of the record do not have individual titles, the agency must apply the item Record IDs instead. Capture the metadata in a recordkeeping system for each image file:
Table 3 to Paragraph (d)
| Metadata label | Description | Requirement level |
|---|---|---|
| Title | A name given to the source record. If a name does not exist, the mandatory metadata element Identifier: Record ID serves as the title for the record | Mandatory. |
| Description | A narrative description of the content of the record, including abstracts of documents | Mandatory. |
| Creator | The agent (person, agency, other organization, etc.) primarily responsible for creating the source record | Mandatory. |
| Date: Creation Date | The date or date range indicating when the source record met the definition of a Federal record | Mandatory. |
| Source Type | The medium of the source record that was scanned to create a digital still image | Mandatory. |
| Source Dimensions | The dimensions of the source record (including unit of measure) | Mandatory. |
(e) Technical metadata. (1) Ensure that the following values are embedded in each image file and that image processing does not delete or alter them:
Table 4 to Paragraph (e)(1)
| Metadata label | Definition | Requirement level |
|---|---|---|
| Date Time Created | The date or date-and-time the digital image was created | Mandatory. |
| Image Width | The width of the digital image, | Mandatory. |
| Image Height | The height of the digital image, | Mandatory. |
| Color Space | The name of the International Color Consortium (ICC) profile used | Mandatory. |
| Bits Per Sample | Number of bits per component | Mandatory. |
| Samples Per Pixel | The number of components per pixel. Usually, 1 for grayscale images and 3 for RGB images | Mandatory. |
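To show where some of these embedded values live in practice, the sketch below reads width, height, bits per sample, and samples per pixel from the IHDR chunk at the start of a PNG stream, one of the formats permitted in § 1236.48. It uses only the Python standard library. The function names are hypothetical, the helper that builds a header exists only to make the example self-contained, and the other table 4 elements (creation date, color space) are not covered.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Samples per pixel implied by each PNG color type.
SAMPLES_BY_COLOR_TYPE = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def read_png_header(data):
    """Extract basic technical metadata from the first 26 bytes of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("IHDR chunk missing or malformed")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "Image Width": width,
        "Image Height": height,
        "Bits Per Sample": bit_depth,
        "Samples Per Pixel": SAMPLES_BY_COLOR_TYPE[color_type],
    }


def make_png_header(width, height, bit_depth, color_type):
    """Build a signature plus IHDR chunk, for demonstration only."""
    ihdr = struct.pack(">IIBBBBB", width, height, bit_depth, color_type, 0, 0, 0)
    crc = zlib.crc32(b"IHDR" + ihdr)
    return PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + ihdr + struct.pack(">I", crc)


# Hypothetical 8-bit RGB image of a letter-size page at 300 ppi.
info = read_png_header(make_png_header(2550, 3300, 8, 2))
```

Comparing such values before and after image processing is one way to confirm that processing has not altered them.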
(2) Ensure that the following process metadata elements are recorded for each image file:
Table 5 to Paragraph (e)(2)
| Metadata label | Definition | Requirement level |
|---|---|---|
| Scanner Make and Model | The manufacturer and model of the scanner used to create the image | Mandatory if using a scanner. |
| Digital Camera Make and Model | The manufacturer and model of the digital camera used to create the image | Mandatory if using a digital camera. |
| Software Name and Version | The name and version of the software used to capture the image | Mandatory if using scanning software. |
(3) Capture the following technical metadata in a recordkeeping system for each image file, and use them to monitor digital records for corruption or alteration:
Table 6 to Paragraph (e)(3)
| Fixity metadata label | Description | Requirement level |
|---|---|---|
| Message Digest Algorithm | The specific algorithm used to construct the message digest for the digital object or bitstream | Mandatory if using checksums as described in § 1236.42(e)(1). |
| Message Digest (checksum) | The output of Message Digest Algorithm | Mandatory if using checksums as described in § 1236.42(e)(1). |
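A minimal sketch of how the two fixity elements might be produced and later used to detect alteration follows. This section does not name an algorithm, so SHA-256 is an assumption here, and the function names are hypothetical. The code uses Python's standard `hashlib` module and reads the file in chunks so that large image files do not need to fit in memory.

```python
import hashlib
import io


def message_digest(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hex digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fixity_record(stream, algorithm="sha256"):
    """Both fixity metadata elements for one file."""
    return {
        "Message Digest Algorithm": algorithm,
        "Message Digest (checksum)": message_digest(stream, algorithm),
    }


def is_unaltered(stream, record):
    """Recompute the digest and compare it with the stored value."""
    current = message_digest(stream, record["Message Digest Algorithm"])
    return current == record["Message Digest (checksum)"]


# In-memory bytes stand in for an image file opened with open(path, "rb").
record = fixity_record(io.BytesIO(b"abc"))
```

A later call such as `is_unaltered(io.BytesIO(b"abd"), record)` returns `False`, which signals that the content no longer matches the stored checksum.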
§ 1236.56 - Validating digitized records and disposition authorities.
(a) When a digitization project is complete, the agency must validate that the digitized versions meet the standards in this subpart.
(b) Separate staff must conduct the validation, independent from the staff that performed the digitization QC inspections described in § 1236.46.
(c) Agencies must verify that:
(1) All records identified in the project's scope have either been digitized or have been identified in project documentation as missing or incomplete records (and the agency must note this information in the Details section of the ERA TR when transferring the records);
(2) All required metadata are accurate, complete, and correctly labeled;
(3) All image technical attributes specified in § 1236.50 have been met;
(4) All image files are legible and all physical characteristics necessary to understand and use the records have been captured;
(5) Mixed-media files are digitized appropriately for the material type, or if mixed-media components are retained in their original format, they are associated with digitized components through metadata, per the requirements specified in § 1236.54(c); and
(6) Project documentation has been created according to § 1236.44.
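The accounting check in paragraph (c)(1) amounts to a set comparison: every record in the project's scope should appear either among the digitized records or among those documented as missing or incomplete. The sketch below illustrates that comparison only. The function name and the record IDs are hypothetical, and the remaining checks in paragraph (c) require inspection that this code does not attempt.

```python
def unaccounted_records(in_scope, digitized, documented_missing):
    """Record IDs in scope that are neither digitized nor documented as missing."""
    return sorted(set(in_scope) - set(digitized) - set(documented_missing))


# Hypothetical project inventory.
in_scope = ["R-0001", "R-0002", "R-0003", "R-0004"]
digitized = ["R-0001", "R-0002"]
documented_missing = ["R-0004"]

gaps = unaccounted_records(in_scope, digitized, documented_missing)
```

An empty result means every in-scope record is accounted for. In this example `R-0003` remains unresolved.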
(d) Once validated, the digitized records are permanent records.
(e) After validating, the agency must determine whether the agency has any reasons for retaining the source records for a period of time once digitized, in keeping with § 1236.40(g).
(f) Unless source records will be retained for reasons identified in § 1236.40(g), the agency must dispose of the source records in accordance with an agency records schedule or GRS that addresses disposition after digitization.
(g) Agencies cannot use the GRS to dispose of source records if the digitized records do not meet the requirements in this subpart. In such cases, agencies should contact the Records Management Policy and Standards Team at [email protected] to determine what steps they must take.
(h) Agencies must transfer the digitized records to NARA according to the approved disposition authority and include the transfer metadata as described in § 1236.58.
(i) Agencies must retain the project documentation described in § 1236.44 until the National Archives confirms receipt of the records and legal custody of the records has been transferred.
(j) Agencies must transfer the administrative, technical, and descriptive metadata captured during the digitization project as CSV files, as described in § 1236.54(b)(5), with the resulting digitized records.