User-Defined Metadata (UDM) for Addresses

The IP address enrichment module provides supplemental information for IP addresses, such as hostname, autonomous system, geolocation, reputation and additional user-defined metadata. Values are cached for improved performance and flow record throughput. For more control of when enrichment is applied, IP addresses can be included or excluded from various enrichers by CIDR, IP range or individual IP address.

This page provides detailed information about User-Defined Metadata Enrichment and Scoping Enrichment with Include/Exclude.

User-Defined Metadata Enrichment

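User-defined metadata is supplied in a definitions file keyed by CIDR, IP range or individual IP address. An example of the format of this file is: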

# Specify whether the IP/CIDR/Range is considered to be "internal".
192.0.2.0/24:
  internal: true

# Additional options are name, vlan, tags and metadata.
192.0.2.192/26:
  name: atlanta_guest_wifi
  vlan: 1001
  tags:
    - wifi
    - dhcp
  metadata:
    dhcp.pool.name: atlanta_guest_wifi
    .site.id: atlanta

# Metadata fields beginning with a . will be organized under the object containing the IP address.
192.0.2.194-192.0.2.198:
  metadata:
    .site.bldg.id: hq
    .site.floor.id: 2
    .site.rack.id: 1

# An individual IP address.
192.0.2.194:
  metadata:
    device.type.name: wifi_ap

# Showcasing the "contexts" feature
10.0.0.0/16:
  contexts:
    - exporter
  metadata:
    .is_exporter: true
10.0.0.0/32:
  contexts:
    - endpoint
  metadata:
    .is_exporter: false

Metadata Types

The User-Defined Metadata enricher supports a combination of pre-defined metadata types as well as the ability to provide custom data as key-value pairs. This section describes the various metadata types. The following table provides a summary of these types.

| Attribute | Data Type | Field Populated | Description |
| --- | --- | --- | --- |
| name | string | <object>.ip.subnet.name | The name given to this subnet. |
| contexts | array of strings | N/A | Limits the contexts in which this enrichment rule applies. See below for more details. |
| internal | boolean | <object>.isInternal | Specifies whether or not the IP belongs to a network considered to be "internal". Only applies to endpoint fields such as source and destination. |
| vlan | number (0-4094) | <object>.vlan.tag.id | A VLAN ID. |
| tags | array of strings | <object>.ip.subnet.tags | Tags that describe attributes of the subnet or IP. |
| metadata | sequence of attributes | <object><attribute> or <attribute> | Key-value pairs which will be added at the IP object or record levels. |

internal

internal is a boolean attribute used to specify whether the CIDR, Range or IP address is considered to be internal or external. This differs from whether the IP address is within a private or public IP range. Some private IPs may still be considered external, e.g. when they are used within a DMZ. Similarly, some public IPs may still be considered internal if they are assigned to resources operated by the organization and to which access is generally restricted.

We are planning future features which leverage this internal/external designation, and the derived ingress/egress direction of traffic flow.

contexts

This limits the current enrichment rule to only apply within certain contexts.

| Value | Effect |
| --- | --- |
| exporter | Only apply the enrichment rule when an "exporter" field matches. Exporter fields: system.ip.addr, flow.export.ip.addr. Note: If you have Elasticsearch output enabled with ECS, the exporter field is host.ip. |
| endpoint | Only apply the enrichment rule when an "endpoint" field matches. Endpoint fields: flow.src, flow.dst, flow.client, flow.server. Note: If you have Elasticsearch output enabled with ECS, the endpoint fields are source.ip, destination.ip, client.ip, server.ip. |

For example, with the below rule:

Here is how that rule would apply (or not) in the following situations:

| Exporter Address | Source Address | Result |
| --- | --- | --- |
| 10.0.0.1 | 192.168.10.11 | flow.exporter.is_exporter = true; flow.exporter.vlan.tag.id = 9; system.is_exporter = true; system.vlan.tag.id = 9; owner = department_x |
| 10.99.0.1 | 10.0.0.1 | Nothing. No metadata enrichments applied. |

If a rule defines any context, it will never be applied to other, less common IP fields (such as next_hop).

A single rule can define both the exporter and endpoint contexts.
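The context matching described in this section can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the collector's implementation: the names `field_context` and `rule_applies` are hypothetical, the field `flow.next_hop.ip.addr` is an assumed spelling of the next_hop field, and treating a rule without contexts as applicable to every IP field is an inference from the text rather than a documented guarantee.

```python
from typing import Optional, Sequence

# Field groupings taken from the contexts table (CODEX names).
EXPORTER_FIELDS = {"system.ip.addr", "flow.export.ip.addr"}
ENDPOINT_PREFIXES = ("flow.src", "flow.dst", "flow.client", "flow.server")


def field_context(field: str) -> Optional[str]:
    """Classify an IP field as 'exporter', 'endpoint', or None (anything else)."""
    if field in EXPORTER_FIELDS:
        return "exporter"
    if any(field == p or field.startswith(p + ".") for p in ENDPOINT_PREFIXES):
        return "endpoint"
    return None


def rule_applies(contexts: Sequence[str], field: str) -> bool:
    """Return True if a rule with the given contexts may enrich this field."""
    if not contexts:
        # Assumption: a rule with no contexts is not restricted at all.
        return True
    # Fields outside both groups (e.g. next_hop) never match a scoped rule.
    return field_context(field) in contexts


if __name__ == "__main__":
    for field in ("system.ip.addr", "flow.src.ip.addr", "flow.next_hop.ip.addr"):
        print(field, rule_applies(["exporter"], field))
```

Under these assumptions, a rule scoped to exporter is skipped for a source address even when the address falls inside the rule's CIDR, which matches the second row of the table above.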

name

name is a string attribute that gives a subnet a user-friendly name meaningful to the user or organization.

vlan

vlan allows a VLAN tag to be specified for a CIDR, Range or IP address. This tag will typically be assigned to source/destination and client/server related fields. There should be no conflict with VLAN tags provided in the flow records from network devices. The devices are reporting on the VLAN tags observed on their own interfaces, not the endpoints of the flow. The VLAN tags reported by devices are typically assigned to the in/out related fields.

tags

tags is an array of string values for attributes that further describe the CIDR, Range or IP address.

metadata

metadata is a list of key-value pairs which will be added as fields to the record. These can either be custom fields specific to the needs of the user, or existing fields from the ElastiFlow CODEX schema. When CODEX fields are specified, the configured metadata value will override any values that already exist in the record.

If you have enabled ECS (Elasticsearch/OpenSearch) or CIM (Splunk) support and want to override values from these schemas, you must specify the CODEX equivalent fields in the metadata attribute. Metadata is applied in the decoder portion of the collector, where all data is still in the CODEX schema. Conversion to other schemas is output-specific and thus occurs at a later phase of processing.

Key names can be specified with or without a leading period (.).

  • If specified with a leading ., the field will be placed within the parent object containing the IP address.

  • If specified without a leading ., the field will be placed at the root of the record.

Consider an IP address from flow.src.ip.addr:

  • If the metadata key is defined as .site.name, the value would be assigned to flow.src.site.name.

  • If the metadata key is defined as site.name, the value would be assigned directly to site.name.
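The key placement rule above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `resolve_metadata_field` is hypothetical, and the parent object is derived by stripping a trailing `.ip.addr` from the field that held the IP address, which is consistent with the flow.src.ip.addr example but is not confirmed as the collector's actual logic.

```python
def resolve_metadata_field(ip_field: str, key: str) -> str:
    """Return the record field that a metadata key would populate.

    ip_field: the field that contained the matched IP, e.g. "flow.src.ip.addr".
    key:      the metadata key as written in the definitions file.
    """
    if not key.startswith("."):
        # No leading period: the key is placed at the root of the record.
        return key
    # Leading period: place the key under the object that contains the IP.
    # Assumption: that object is the field path without its ".ip.addr" suffix.
    suffix = ".ip.addr"
    parent = ip_field[: -len(suffix)] if ip_field.endswith(suffix) else ip_field
    return parent + key


if __name__ == "__main__":
    print(resolve_metadata_field("flow.src.ip.addr", ".site.name"))
    print(resolve_metadata_field("flow.src.ip.addr", "site.name"))
```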

Merging Values from Multiple Definitions

Attribute values for an IP address which matches multiple CIDR, Range or IP address entries will be merged into a single result set. Consider the following example:

Here you have:

  • the whole private network 192.168.0.0/16 with some location metadata.

  • a /24 block of that network that is tagged as the campus network, and also the firewall zone to which it belongs.

  • a range of those IP addresses that belong to the guest WiFi and are provided by DHCP.

Given a value for flow.src.ip.addr of 192.168.1.152, which matches all three entries in the above configuration, the resulting enrichment fields added to the record would be:

The last two values above demonstrate one of the use-cases for User-Defined Metadata. The host.name and ip.addr have been overridden to more generic static values, thus anonymizing the individual guest WiFi users. This allows the traffic to still be collected and analyzed, without tracking each guest individually. Network or security operations can investigate suspect traffic which they may want to block, while preserving individual guests' privacy.
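The merging of multiple matching definitions described in this section can be sketched with Python's standard `ipaddress` module. The definitions below are hypothetical stand-ins modeled on the three kinds of entries described (a /16, a /24 and an IP range); they are not the original configuration, and the key `.fw.zone.name` is invented for illustration. The text does not say how conflicting values are resolved, so this sketch assumes tags are combined and, for any other attribute, an entry listed later overrides an earlier one.

```python
import ipaddress

# Hypothetical definitions; keys use the CIDR, range and single-IP forms.
DEFINITIONS = {
    "192.168.0.0/16": {"metadata": {".site.id": "hq"}},
    "192.168.1.0/24": {"tags": ["campus"], "metadata": {".fw.zone.name": "campus"}},
    "192.168.1.128-192.168.1.191": {"name": "guest_wifi", "tags": ["wifi", "dhcp"]},
}


def matches(key: str, ip: str) -> bool:
    """Return True if ip falls within a CIDR, an 'a-b' range, or equals a single IP."""
    addr = ipaddress.ip_address(ip)
    if "-" in key:
        start, end = (ipaddress.ip_address(p.strip()) for p in key.split("-", 1))
        return start <= addr <= end
    # A bare IP address parses as a single-host network.
    return addr in ipaddress.ip_network(key, strict=False)


def merge_definitions(definitions: dict, ip: str) -> dict:
    """Merge the attributes of every entry that matches ip into one result."""
    merged = {"tags": [], "metadata": {}}
    for key, attrs in definitions.items():
        if not matches(key, ip):
            continue
        for tag in attrs.get("tags", []):
            if tag not in merged["tags"]:
                merged["tags"].append(tag)
        # Assumption: on a key conflict, the later entry wins.
        merged["metadata"].update(attrs.get("metadata", {}))
        for scalar in ("name", "vlan", "internal"):
            if scalar in attrs:
                merged[scalar] = attrs[scalar]
    return merged


if __name__ == "__main__":
    print(merge_definitions(DEFINITIONS, "192.168.1.152"))
```

With these hypothetical entries, 192.168.1.152 matches all three definitions, so the result carries the site metadata, the campus tag and zone, and the guest WiFi name and tags together.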

Scoping Enrichment with Include/Exclude

The Hostname/DNS and MaxMind GeoIP enrichment features can be scoped to a subset of IP addresses by specifying Autonomous Systems or CIDRs to be included or excluded. These include/exclude definitions are provided via a YAML file, which can be updated and refreshed without restarting the collector.

An example of include/exclude definitions:

Evaluation of Include/Exclude Definitions

It is important to understand how include/exclude definitions are evaluated to ensure your configuration provides the desired outcome. The following rules apply:

  1. If no specific include values are defined, everything is included.

  2. Exclude values are evaluated within the scope of included values.

Consider the following examples:

While the following examples use only CIDRs, the same logic applies when ASN values are specified.

no include/exclude definitions

If no include/excludes are defined, everything is included.

| IP Address | Included? |
| --- | --- |
| 192.168.0.1 | Yes |
| 10.0.0.1 | Yes |
| 10.111.0.1 | Yes |

only include is defined

Only those IP addresses within a defined AS or CIDR are included. In this example, only IPs within the CIDR 10.0.0.0/8 are included.

| IP Address | Included? |
| --- | --- |
| 192.168.0.1 | No |
| 10.0.0.1 | Yes |
| 10.111.0.1 | Yes |

only exclude is defined

All IP addresses that are not specifically excluded by the defined AS or CIDR are included. In this example, all IPs except those within the CIDR 10.111.0.0/16 are included.

| IP Address | Included? |
| --- | --- |
| 192.168.0.1 | Yes |
| 10.0.0.1 | Yes |
| 10.111.0.1 | No |

both include and exclude are defined

Only those IP addresses within a specified AS or CIDR are included, EXCEPT those within an excluded AS or CIDR.

| IP Address | Included? |
| --- | --- |
| 192.168.0.1 | No |
| 10.0.0.1 | Yes |
| 10.111.0.1 | No |

  • 192.168.0.1 is not included as it is not within an included AS or CIDR.

  • 10.0.0.1 is included as it is within an included AS or CIDR.

  • 10.111.0.1 is not included. While it does fall within the range of an included CIDR, it is also within a CIDR that is specifically excluded.
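The two evaluation rules can be checked against all four cases with a short Python sketch built on the standard `ipaddress` module. The function name `is_included` and its parameters are hypothetical, only CIDRs are handled (not ASNs), and the combined case assumes the same CIDRs as the single-sided examples (include 10.0.0.0/8, exclude 10.111.0.0/16).

```python
import ipaddress
from typing import Iterable


def is_included(ip: str, include: Iterable[str] = (), exclude: Iterable[str] = ()) -> bool:
    """Apply the include/exclude rules to a single IP address."""
    addr = ipaddress.ip_address(ip)
    include_nets = [ipaddress.ip_network(cidr) for cidr in include]
    exclude_nets = [ipaddress.ip_network(cidr) for cidr in exclude]

    # Rule 1: with no include values, everything is included.
    # Otherwise the address must fall inside at least one included CIDR.
    if include_nets and not any(addr in net for net in include_nets):
        return False

    # Rule 2: excludes are evaluated within the scope of what is included.
    return not any(addr in net for net in exclude_nets)


if __name__ == "__main__":
    cases = {
        "no definitions": {},
        "only include": {"include": ["10.0.0.0/8"]},
        "only exclude": {"exclude": ["10.111.0.0/16"]},
        "both": {"include": ["10.0.0.0/8"], "exclude": ["10.111.0.0/16"]},
    }
    for label, kwargs in cases.items():
        for ip in ("192.168.0.1", "10.0.0.1", "10.111.0.1"):
            print(label, ip, is_included(ip, **kwargs))
```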
