Flexible Transform


The Problem

Most cyber defense systems incorporate some form of cyber threat intelligence (CTI) collection and analysis. However, different systems and CTI sharing communities have implemented a variety of representations to transmit these data (e.g., STIX, OpenIOC, custom CSV). This diversity of formats presents a challenge when an organization that uses one format has the opportunity to join sharing communities whose members share data in other formats. Similarly, merging communities that use different CTI formats can seem a nearly insurmountable challenge, because the merger can proceed only as fast as the slowest member of each community adopts a new format.

Although simple translators can be written to convert data from one format to another, challenges to this approach include the following:

  • Effort that grows quadratically with the number of supported formats, because each new format needs a translator to and from every existing format.
  • Potential loss of meaning and context (semantics) between formats.

These obstacles lead to the formation of “islands of sharing” defined not by the communities themselves but by their sharing formats. This pattern leaves smaller organizations, which often cannot participate at all, isolated and defenseless.


Figure 1: The scaling problem with developing pairwise translators for all supported formats (top), and the advantage of using an intermediate semantic representation (bottom).
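
Counting translators makes the scaling in Figure 1 concrete. The sketch below assumes one one-way translator per ordered pair of formats in the pairwise approach, and one bidirectional mapping per format in the intermediate-representation approach. The function names are illustrative only and are not part of FlexT.

```python
def pairwise_translators(n: int) -> int:
    """One-way translators needed when every format converts directly to every other."""
    return n * (n - 1)


def hub_mappings(n: int) -> int:
    """Mappings needed when each format maps once to a shared semantic representation."""
    return n


for n in (3, 5, 10):
    print(f"{n} formats: {pairwise_translators(n)} pairwise translators, "
          f"{hub_mappings(n)} mappings")
```

Under these assumptions, adding an eleventh format to ten existing ones would require 20 new pairwise translators but only one new mapping.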

The Solution

Flexible Transform (FlexT) is a tool that enables dynamic translation between formats. FlexT accomplishes this translation by “digesting” CTI data down to its semantic roots (meaning and context). As Figure 1 shows, making this objective the core of the translation effort simplifies the process. This approach allows the use of new formats with improved scalability and ensures that the original meaning and context of CTI data are preserved.

A “format” in FlexT is broken down into three components:

  • Syntax – A specification of valid document characters and their composition (e.g., CSV, XML, JSON).
  • Schema – A specification of the valid terms, the data they can convey, and restrictions on their use (e.g., STIX, OpenIOC, IODEF).
  • Semantics – A definition of the meaning of terms (e.g., SourceIPAddress is the session originating IPv4 address).
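
The three components above can be pictured as a small data structure. This is a minimal sketch for illustration, not FlexT's internal model: the Format class, its field names, and the example values are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Format:
    """Hypothetical description of a CTI format as syntax + schema + semantics."""
    syntax: str   # how documents are encoded, e.g. "CSV", "XML", "JSON"
    schema: str   # which terms are valid and how they may be used, e.g. "STIX"
    semantics: Dict[str, str] = field(default_factory=dict)  # term -> meaning


# Illustrative instance; the schema name is made up for this example.
key_value = Format(
    syntax="CSV",
    schema="example key/value indicator schema",
    semantics={"SourceIPAddress": "session originating IPv4 address"},
)

print(key_value.semantics["SourceIPAddress"])
```

Separating the three parts this way suggests why two formats that share a syntax (both XML, for example) can still need different schema and semantic definitions.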

Using FlexT, organizations are empowered to participate in sharing communities using any type of CTI, in any format. When coupled with a toolset such as Cyber Fed Model’s (CFM’s) Last Quarter Mile Toolset (LQMToolset), participants can not only share and process CTI but also take automated action based on that intelligence with an array of security endpoint devices.

Features

Each feature enables users to:

  • Multiple interfaces – Use FlexT as a library or from the command-line tool.
  • Accurate translation – Convert the format without losing context or meaning.
  • Easy extensibility – When supporting a new schema, simply define a mapping JSON file and immediately convert to/from any other supported format.

Figure 2: The process FlexT follows to transform an input file into a different format while preserving context and meaning.
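
The idea of translating through a shared semantic representation can be sketched in a few lines. Everything here is hypothetical: the mapping layout, the field names, and the functions are assumptions made for illustration and do not reflect FlexT's actual JSON mapping format or API.

```python
import json

# Hypothetical JSON mappings: schema-specific field name -> shared semantic term.
MAPPING_A = json.loads('{"src_ip": "SourceIPAddress", "seen": "ObservedTime"}')
MAPPING_B = json.loads('{"SourceAddress": "SourceIPAddress", "Timestamp": "ObservedTime"}')


def to_semantic(record, mapping):
    """Rename schema-specific fields to shared semantic terms."""
    return {mapping[name]: value for name, value in record.items() if name in mapping}


def from_semantic(semantic, mapping):
    """Rename shared semantic terms back to one schema's field names."""
    inverse = {term: name for name, term in mapping.items()}
    return {inverse[term]: value for term, value in semantic.items() if term in inverse}


def translate(record, source_mapping, target_mapping):
    """Convert a record between schemas via the intermediate representation."""
    return from_semantic(to_semantic(record, source_mapping), target_mapping)


record_a = {"src_ip": "192.0.2.10", "seen": "2015-06-01T12:00:00Z"}
record_b = translate(record_a, MAPPING_A, MAPPING_B)
print(record_b)
```

In this sketch, supporting a third schema means writing one more mapping rather than new translators for each existing schema. Note that the sketch silently drops fields that have no mapping; a real translator would need to handle such fields explicitly to avoid losing context.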

Currently Supports

  • STIX with multiple profiles
  • All CFM XML schemas
  • CFM 1.3 Legacy Format
  • CFM 2.0 Format
  • Key/value indicator schema

Coming Soon

  • Additional data formats
    • OpenIOC
    • FlexText
  • REST web-based interface

Get Involved!

There are many threat indicator formats to use, and many organizations have “grown their own.” Feel free to test out FlexT and provide feedback, submit code, or develop and share JSON configuration files.

To get involved and help turn CTI into actionable defense, contact us via CFMteam@anl.gov.