User-Defined Metadata
The IP address enrichment module provides supplemental information for IP addresses, such as hostname, autonomous system, geolocation, reputation and additional user-defined metadata. Values are cached for improved performance and flow record throughput. For more control of when enrichment is applied, IP addresses can be included or excluded from various enrichers by CIDR, IP range or individual IP address.
This page provides detailed information about User-Defined Metadata Enrichment and Scoping Enrichment with Include/Exclude. For more details about RiskIQ or Maxmind enrichment, see the documentation pages for those enrichers.
User-Defined Metadata Enrichment
An example of the format of this file is:
# Specify whether the IP/CIDR/Range is considered to be "internal".
192.0.2.0/24:
  internal: true
# Additional options are name, vlan, tags and metadata.
192.0.2.192/26:
  name: atlanta_guest_wifi
  vlan: 1001
  tags:
    - wifi
    - dhcp
  metadata:
    dhcp.pool.name: atlanta_guest_wifi
    .site.id: atlanta
# Metadata fields beginning with a . will be organized under the object containing the IP address.
192.0.2.194-192.0.2.198:
  metadata:
    .site.bldg.id: hq
    .site.floor.id: 2
    .site.rack.id: 1
# An individual IP address.
192.0.2.194:
  metadata:
    device.type.name: wifi_ap
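The keys in the example above take three forms: a CIDR, an IP range and an individual IP address. The following Python sketch shows one way such keys could be interpreted and matched against an address. It is only an illustration: key_bounds, matching_keys and EXAMPLE_KEYS are hypothetical names, and the collector's actual parsing logic is not described on this page.

```python
import ipaddress


def key_bounds(key):
    """Return the (first, last) addresses covered by a definition key.

    Hypothetical helper: accepts a CIDR ("192.0.2.0/24"), a range
    ("192.0.2.194-192.0.2.198") or a single IP address ("192.0.2.194").
    """
    key = key.strip()
    if "/" in key:
        network = ipaddress.ip_network(key, strict=True)
        return network[0], network[-1]
    if "-" in key:
        start_text, end_text = key.split("-", 1)
        start = ipaddress.ip_address(start_text.strip())
        end = ipaddress.ip_address(end_text.strip())
        if start.version != end.version or start > end:
            raise ValueError(f"invalid range: {key}")
        return start, end
    address = ipaddress.ip_address(key)
    return address, address


def matching_keys(ip, keys):
    """Return the keys whose CIDR, range or address covers ip, in input order."""
    address = ipaddress.ip_address(ip)
    matched = []
    for key in keys:
        first, last = key_bounds(key)
        # Only compare addresses of the same IP version.
        if first.version == address.version and first <= address <= last:
            matched.append(key)
    return matched


# The four keys used in the example file above.
EXAMPLE_KEYS = [
    "192.0.2.0/24",
    "192.0.2.192/26",
    "192.0.2.194-192.0.2.198",
    "192.0.2.194",
]
```

With these keys, matching_keys("192.0.2.194", EXAMPLE_KEYS) returns all four entries, while 192.0.2.199 matches only the two CIDRs.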
Metadata Types
The User-Defined Metadata enricher supports a combination of pre-defined metadata types as well as the ability to provide custom data as key-value pairs. This section describes the various metadata types. The following table provides a summary of these types.
Attribute | Data Type | Field Populated | Description |
---|---|---|---|
internal | boolean | <object>.isInternal | Specifies whether or not the IP belongs to a network considered to be "internal". |
name | string | <object>.ip.subnet.name | The name given to this subnet. |
vlan | number (0-4094) | <object>.vlan.tag.id | A VLAN ID |
tags | array of strings | <object>.ip.subnet.tags | Tags that describe attributes of the subnet or IP. |
metadata | sequence of attributes | <object><attribute> or <attribute> | Key-value pairs which will be added at the IP object or record levels. |
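The data types in the table above can be checked before a definitions file is deployed. The sketch below is a minimal, hypothetical pre-flight check written in Python; validate_entry is not part of the product, and this page does not describe how the collector itself reacts to invalid values. It assumes each entry has already been loaded into a Python dict.

```python
def validate_entry(entry):
    """Return a list of problems found in one definition entry (empty if none)."""
    problems = []
    if "internal" in entry and not isinstance(entry["internal"], bool):
        problems.append("internal must be a boolean")
    if "name" in entry and not isinstance(entry["name"], str):
        problems.append("name must be a string")
    if "vlan" in entry:
        vlan = entry["vlan"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(vlan, bool) or not isinstance(vlan, int) or not 0 <= vlan <= 4094:
            problems.append("vlan must be a number from 0 to 4094")
    if "tags" in entry:
        tags = entry["tags"]
        if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
            problems.append("tags must be an array of strings")
    if "metadata" in entry and not isinstance(entry["metadata"], dict):
        problems.append("metadata must be a set of key-value pairs")
    return problems
```

An entry such as {"name": "atlanta_guest_wifi", "vlan": 1001, "tags": ["wifi", "dhcp"]} produces no problems, whereas {"vlan": 5000} is reported as out of range.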
internal
internal is a boolean attribute used to specify whether the CIDR, Range or IP address is considered to be internal or external. This differs from whether the IP address is within a private or public IP range. Some private IPs may still be considered external, e.g. they are used within a DMZ. Similarly, some public IPs may still be considered internal if they are assigned to resources operated by the organization and to which access is generally restricted.
We are planning future features which leverage this internal/external designation, and the derived ingress/egress direction of traffic flow.
name
name is a string attribute that gives a subnet a user-friendly name which is relevant to the user or organization.
Only a single name value is returned for a given IP address. Care should be taken to ensure that there are no conflicting names among overlapping CIDRs, Ranges and IP addresses. If you must assign multiple values, these should be added to the tags attribute.
vlan
vlan allows a VLAN tag to be specified for a CIDR, Range or IP address. This tag will typically be assigned to source/destination and client/server related fields. There should be no conflict with VLAN tags provided in the flow records from network devices. The devices are reporting on the VLAN tags observed on their own interfaces, not the endpoints of the flow. The VLAN tags reported by devices are typically assigned to the in/out related fields.
tags
tags is an array of string values for attributes that further describe the CIDR, Range or IP address.
metadata
metadata is a list of key-value pairs which will be added as fields to the record. These can either be custom fields specific to the needs of the user, or existing fields from the ElastiFlow CODEX schema. When CODEX fields are specified, the configured metadata value will override any values that already exist in the record.
If you have enabled ECS (Elasticsearch/OpenSearch) or CIM (Splunk) support and want to override values from these schemas, you must specify the CODEX equivalent fields in the metadata attribute. Metadata is applied in the decoder portion of the collector, where all data is still in the CODEX schema. Conversion to other schemas is output-specific and thus occurs at a later phase of processing.
Key names can be specified with or without a leading dot (.).
- If specified with a leading dot, the field will be placed within the parent object containing the IP address.
- If specified without a leading dot, the field will be placed at the root of the record.
Consider an IP address from flow.src.ip.addr:
- If the metadata key is defined as .site.name, the value would be assigned to flow.src.site.name.
- If the metadata key is defined as site.name, the value would be assigned directly to site.name.
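The placement rule for metadata keys can be expressed as a small function. The following Python sketch is illustrative only: target_field is a hypothetical name, and it assumes that the field holding the IP address always ends in .ip.addr, so that the parent object of flow.src.ip.addr is flow.src.

```python
IP_SUFFIX = ".ip.addr"  # ASSUMPTION: IP address fields end with this suffix.


def target_field(ip_field, metadata_key):
    """Return the record field that a metadata key would populate.

    Keys with a leading dot are placed under the object containing the
    IP address; all other keys are placed at the root of the record.
    """
    if not metadata_key.startswith("."):
        return metadata_key
    if not ip_field.endswith(IP_SUFFIX):
        raise ValueError(f"unexpected IP field name: {ip_field}")
    parent = ip_field[: -len(IP_SUFFIX)]  # "flow.src.ip.addr" -> "flow.src"
    return parent + metadata_key
```

For example, target_field("flow.src.ip.addr", ".site.name") returns flow.src.site.name, and target_field("flow.src.ip.addr", "site.name") returns site.name.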
Merging Values from Multiple Definitions
Attribute values for an IP address which matches multiple CIDR, Range or IP address entries will be merged into a single result set. Consider the following example:
192.168.0.0/16:
  metadata:
    .geo.loc.coord: 48.167106,11.486918
    .geo.city.name: Munich
    .geo.country.code: DE
    .geo.country.name: Germany
    .geo.tz.name: Europe/Berlin
192.168.1.0/24:
  name: munich_hq
  tags:
    - campus
  metadata:
    sec.zone.name: campus
192.168.1.151-192.168.1.200:
  tags:
    - guest_wifi
    - dhcp
  metadata:
    .host.name: guest_wifi
    .ip.addr: 192.168.1.0
Here you have:
- the whole 192.168.0.0/16 private network, with some location metadata.
- a /24 block of that network that is tagged as the campus network, and is also assigned the firewall zone to which it belongs.
- a range of those IP addresses that belong to the guest WiFi and are provided by DHCP.
Given a value for flow.src.ip.addr of 192.168.1.152, which matches all three entries in the above configuration, the resulting enrichment fields added to the record would be:
flow.src.ip.subnet.name: munich_hq
flow.src.ip.subnet.tags: [campus guest_wifi dhcp]
flow.src.geo.loc.coord: 48.167106,11.486918
flow.src.geo.city.name: Munich
flow.src.geo.country.code: DE
flow.src.geo.country.name: Germany
flow.src.geo.tz.name: Europe/Berlin
sec.zone.name: campus
flow.src.host.name: guest_wifi
flow.src.ip.addr: 192.168.1.0
The last two values above demonstrate one of the use cases for User-Defined Metadata. The host.name and ip.addr values have been overridden with more generic static values, thus anonymizing the individual guest WiFi users. This allows the traffic to still be collected and analyzed, without tracking each guest individually. Network or security operations can investigate suspect traffic which they may want to block, while preserving individual guests' privacy.
Scoping Enrichment with Include/Exclude
The Hostname/DNS, RiskIQ Threat/IP Reputation and Maxmind GeoIP enrichment features can be scoped to a subset of IP addresses by specifying Autonomous Systems or CIDRs to be included or excluded. These include/exclude definitions are provided via a YAML file which can be updated and refreshed without the need to restart the collector.
An example of include/exclude definitions:
include:
  asn:
    - 14168
  cidr:
    - 10.0.0.0/8
    - 192.168.0.0/16
exclude:
  #asn:
  #  -
  cidr:
    - 192.168.100.0/24
Evaluation of Include/Exclude Definitions
It is important to understand how include/exclude definitions are evaluated to ensure your configuration provides the desired outcome. The following rules apply:
- If no specific include values are defined, everything is included.
- Exclude values are evaluated within the scope of included values.
Consider the following examples:
While the following examples use only CIDRs, the same logic applies when ASN values are specified.
no include/exclude definitions
# no path provided or an empty file
If no include/excludes are defined, everything is included.
IP Address | Included? |
---|---|
192.168.0.1 | ✓ |
10.0.0.1 | ✓ |
10.111.0.1 | ✓ |
only include is defined
include:
  cidr:
    - 10.0.0.0/8
Only those IP addresses within a defined AS or CIDR are included. In this example, only IPs within the CIDR 10.0.0.0/8 are included.
IP Address | Included? |
---|---|
192.168.0.1 | ✕ |
10.0.0.1 | ✓ |
10.111.0.1 | ✓ |
only exclude is defined
exclude:
  cidr:
    - 10.111.0.0/16
All IP addresses which are not specifically excluded by the defined AS or CIDR are included. In this example, all IPs except those within the CIDR 10.111.0.0/16 are included.
IP Address | Included? |
---|---|
192.168.0.1 | ✓ |
10.0.0.1 | ✓ |
10.111.0.1 | ✕ |
both include and exclude are defined
include:
  cidr:
    - 10.0.0.0/8
exclude:
  cidr:
    - 10.111.0.0/16
Only those IP addresses within a specified AS or CIDR are included, EXCEPT those within an excluded AS or CIDR.
IP Address | Included? |
---|---|
192.168.0.1 | ✕ |
10.0.0.1 | ✓ |
10.111.0.1 | ✕ |
- 192.168.0.1 is not included as it is not within an included AS or CIDR.
- 10.0.0.1 is included as it is within an included AS or CIDR.
- 10.111.0.1 is not included. While it does fall within the range of an included CIDR, it is also within a CIDR that is specifically excluded.
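The two evaluation rules and the four scenarios above can be reproduced with a short Python sketch. It is a simplified model rather than the collector's code: is_included is a hypothetical function, and it handles CIDRs only, since evaluating ASN values would require AS lookup data that is outside the scope of this example.

```python
import ipaddress


def is_included(ip, include=(), exclude=()):
    """Apply the include/exclude rules to a single IP address (CIDRs only).

    - If no include values are defined, everything is included.
    - Exclude values are evaluated within the scope of included values.
    """
    address = ipaddress.ip_address(ip)

    def in_any(cidrs):
        return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)

    if include and not in_any(include):
        return False
    return not in_any(exclude)
```

With include=["10.0.0.0/8"] and exclude=["10.111.0.0/16"], the function returns False for 192.168.0.1, True for 10.0.0.1 and False for 10.111.0.1, matching the last table above.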