OHDF Converters

Charles Hu6/10/24About 6 min

A Look Ahead

In this section, we will cover:

What OHDF Converters is
How OHDF Converters can be used to create:

OHDF Converters is a custom data normalization library that hosts and provides OHDF mapper normalization services to other applications in the SAF tool suite. All OHDF mappers depend on and utilize the functionality that OHDF converters provides in order to implement the three core components of an OHDF mapper. As such, it is important to have a general understanding of what some key OHDF Converters utilities are and how they operate in order to create the necessary components for an OHDF mapper and successfully integrate your OHDF mapper into OHDF Converters.

Directory Structure

The following is a simplified depiction of the directory tree for the HDF Converters. Only noteworthy and potentially useful files and directories are included. It is not imperative to memorize the structure, but it is useful to familiarize yourself with it to better understand what exists where in the library for future reference.

hdf-converters
+-- data
|   +-- converters
|   |   +-- csv2json.ts
|   |   +-- xml2json.ts
+-- sample_jsons                            // Sample exports for mapper testing are located here
+-- src                                     // *-to-OHDF mappers are located here
|   +-- converters-from-hdf                 // OHDF-to-* mappers are located here
|   |   +-- reverse-any-base-converter.ts
|   |   +-- reverse-base-converter.ts
|   +-- mappings                            // Non-OHDF mappings are located here (e.g., CVE, CCI, NIST)
|   +-- utils                               // Utility functions are located here
|   |   +-- fingerprinting.ts
|   |   +-- global.ts
|   +-- base-converter.ts
+-- test                                    // Mapper tests are located here
|   +-- mappers
|   |   +-- forward                         // *-to-OHDF tests
|   |   +-- reverse                         // OHDF-to-* tests
|   |   +-- utils.ts
+-- types                                   // Explicit data typing for known export schemas
+-- index.ts
+-- package.json

Fingerprinting

Fingerprinting in OHDF Converters is performed by the fingerprinting feature implemented in the fingerprinting.ts file under /src/utils/. The actual fingerprinting utility itself is provided through the following function:

export function fingerprint(guessOptions: {
  data: string;
  filename: string;
}): INPUT_TYPES;

This function takes an object which consists of the key-value pairs data, which contains a stringified representation of an input file of unknown data format, and filename, which contains the file name of the input file. Once the function receives the file to analyze, it leverages two key items: INPUT_TYPES and fileTypeFingerprints.

INPUT_TYPES is an enumerated type that defines a common set of agreed upon names to identify each data format with. Members of INPUT_TYPES typically appear as so:

export enum INPUT_TYPES {
  ASFF = "asff",
  BURP = "burp",
  CHECKLIST = "checklist",
  CONVEYOR = "conveyor",
  // Truncated for pedagogical purposes
}

where the enum member is the uppercase snake case form of the common name while the initialized value is the stringified lowercase form of the name.

fileTypeFingerprints is a variable defining the specific set of data elements that we are to uniquely associate with a certain data format. Members of fileTypeFingerprints typically appear as so:

const fileTypeFingerprints: Record<INPUT_TYPES, string[]> = {
  [INPUT_TYPES.ASFF]: ["Findings", "AwsAccountId", "ProductArn"],
  [INPUT_TYPES.CONVEYOR]: ["api_error_message", "api_response"],
  [INPUT_TYPES.FORTIFY]: ["FVDL", "FVDL.EngineData.EngineVersion", "FVDL.UUID"],
  // Truncated for pedagogical purposes
};

where the key is the standard name for a data format as defined by INPUT_TYPES, while the value is an array of stringified field names that are cumulatively unqiue to that data format.

Using the fields defined in fileTypeFingerprints, the function fingerprint() will search through the input file and attempt to assign it a data format according to whichever fingerprint array in fileTypeFingerprints has the most matches to keys found in the input file. It will then return the common name of that data format as defined by INPUT_TYPES. If no match is found, it will return an empty string instead (as defined by INPUT_TYPES.NOT_FOUND).

Most OHDF mapper developers will not touch the fingerprint() function itself but will rather add the necessary members to INPUT_TYPES and fileTypeFingerprints in order to allow fingerprint() to correctly identify their data format.

Mapper

Mappers are often self-contained in a single specialized file under /src/, but leverage a number of utilities spread throughout OHDF Converters to actually implement the ability to normalize input files from developer-provided mappings.

Mapper Base Class

The core of OHDF Converters revolves around the base-converter class found in the base-converter.ts file under /src/. base-converter is a key class that all *-to-OHDF mappers extend. It enables the streamlined development of *-to-OHDF mappers by abstracting the actual implementation of the underlying service that performs the data transformations (i.e., the mapper), resulting in the developer only having to write a technical implementation of their developed *-to-OHDF mapping for the base-converter class to consume and use.

Other services provided by base-converter include:

File Format Parsing

base-converter provides a number of functions which can parse a variety of file formats and convert them into a usable Javascript object equivalent.

Currently supported file formats are as follows:

Format	Function
CSV	`parseCsv()`
HTML	`parseHtml()`
XML	`parseXml()`

Generic Types

base-converter accepts type arguments to define the typing of the data that the mapper is expected to ingest. This is defined here:

export class BaseConverter<D = Record<string, unknown>> {
  data: D;
  mappings?: MappedTransform<ExecJSON.Execution, ILookupPath>;
  collapseResults: boolean;

  constructor(data: D, collapseResults = false) {
    this.data = data;
    this.collapseResults = collapseResults;
  }
  // Truncated for pedagogical purposes
}

and can be used as so:

export class CycloneDXSBOMMapper extends BaseConverter<DataStorage>

This is particularly useful for when we know the type of incoming data and want to avoid continuous manual casting of types within the mapping.

Keywords

base-converter provides a series of keywords that can be inserted into the technical implementation of the mapping in order to allow necessary code functions or data manipulations to occur within the mapping definition. These keywords are inherited upon extending base-converter and are as follows:

path: Define JSON object path to go to. Paths are found recursively (e.g., if a top-level field is set to the path result, then all fields below it will only see the result field of the source file). To escape this recursion, you can set the $ symbol at the beginning of the path.

Use:

path: PATH_AS_STRING;

Example:

// Attribute `id` will be set as whatever JSON object attribute `vulnerability.id` is
id: {
  path: "vulnerability.id";
}

Example of recursion:

// Attribute `id` will look at the path `result.id` is instead of just the path `id`
controls: [
  {
    path: "result",
    id: { path: "id" },
  },
];

Example of escaping from recursion:

// Attribute `id` will look at the top-level path `id`, ignoring the recursive path `result`
controls: [
  {
    path: "result",
    id: { path: "$id" },
  },
];

transformer: Oftentimes, source data is not be formatted in a readable/desirable manner. To remedy this, we can make use of the transformer keyword. The transformer accepts a function as a parameter. This can either be an anonymous funciton or a normal, named function. This transformer acts as a callback function that will go to the specified path location and modify the data accordingly. Note that while the transformer keyword can accept an array as an input, it expects and will treat the input as a Javascript object.
- Use:
```
transformer: (PARAMETER: TYPE): OUTPUT_TYPE => {
  CODE_TO_EXECUTE;
};
```
- Example:
```
// Applying a transformer that maps NIST 800-53 tags to CCI tags
// Attribute 'cci' will be set as the returned CCI tag(s) from the ingested 'data' argument
cci: {
  path: 'vulnerabilityClassifications',
  transformer: (data: string) => getCCIsForNISTTags(nistTag(data))
}
```
arrayTransformer: The arrayTransformer works similarly to the transformer in that it takes our in-progress mapping for the array structure, accepts a function that returns an array, and transforms the array accordingly (e.g., filtering out element items, passing in additional data, etc.).
- Use:
```
arrayTransformer: (PARAMETER: TYPE[]): OUTPUT_TYPE[] => {
  CODE_TO_EXECUTE;
};
```
- Example:
```
controls: [
  {
    path: "...", // Some path to the array in the source file
    // The function 'deduplicateId' will run against all items in the current array that the 'arrayTransformer' was called inside
    arrayTransformer: deduplicateId,
  },
];
```
In the above code block, arrayTransformer is used in the controls array of the mapping, which creates a control for every item in the source file's specified path location. This array, however, may contain duplicate items (with the same IDs); thus, the arrayTransformer is being leveraged to deduplicate items with the same ID and collapse them into one element.
pathTransform: Converts the object path structure. This preprocesses the source file before any of the other keywords are run. Only use this as a last resort if the source file needs to be drastically reworked. Note that the intermediate format generated by it is not saved to disk, so it can be difficult to work through.
- Use:
```
pathTransform: (PARAMETER: TYPE): OUTPUT_TYPE => {
  CODE_TO_EXECUTE;
};
```
- Example:
```
// Returns the JSON path if it is an array, otherwise returns an empty array
pathTransform: (value) => (Array.isArray(value) ? value : []),
```
key: Used by base-converter to sort an array of objects by. Currently, this is only used in the controls array to automatically de-duplicate elements with the same key to ensure that all controls are unique. As a result, you typically only pass control IDs to key. When calling the constructor for the mapper, we can set collapseResults=true to make use of this automatic deduplication function.
- Use:
```
key: KEY_AS_STRING;
```
- Example:
```
// `id` is now the key by which this array will be sorted by
key: "id";
```

Utility Files

Several exported utility functions and variables exist within files contained under /src/utils/. Some examples of commonly used ones from global.ts are as follows:

getCCIsForNISTTags(): Converts a set of NIST 800-53 tags into CCI tags.
DEFAULT_STATIC_CODE_ANALYSIS_NIST_TAGS: An array of NIST 800-53s applicable to all automated configuration tests.
filterString(): Return the input string if it is not empty, otherwise return undefined if the string is empty.

You are encouraged to explore these files for further utilities and to add on more to help other developers who may require similar functionality to the one you're implementing.

Testing

Testing in OHDF Converters is facilitated through the Jest testing framework. The tests for *-to-OHDF mappers are contained under /test/mappers/forward/ and depend on a set of sample files and expected OHDF output files contained under /sample_jsons/.

The mapper tests operate by passing the sample files found under /sample_jsons/ to the given mapper to create a set of observed OHDF outputs. These observed outputs will then be compared against the set of expected outputs generated from expected OHDF files also stored under /sample_jsons/. The tests will pass if the observed outputs match the expected outputs and will fail otherwise. The exact technical implementation of this will be discussed later on.

A Look Back

In this section, we covered:

What OHDF Converters is
- OHDF Converters is a custom data normalization library that all OHDF mappers depend upon to implement the three core components of an OHDF mapper.
How OHDF Converters can be used to create:
- The fingerprinting component
  - OHDF Converters provides a fingerprinting service that helps identify the data format of the given input data.
- The mapper component
  - OHDF Converters provides a number of utilities provided by both the base-converter class and associated utility files to help streamline the OHDF mapper development experience.
- The testing component
  - OHDF Converters extends the testing framework provided by Jest to ensure that the tested OHDF mappers generate the correct OHDF output.