# User-Defined Metadata (UDM) for Addresses

The IP address enrichment module provides supplemental information for IP addresses, such as hostname, autonomous system, geolocation, reputation and additional user-defined metadata. Values are cached for improved performance and flow record throughput. For more control of when enrichment is applied, IP addresses can be included or excluded from various enrichers by CIDR, IP range or individual IP address.

This page provides detailed information about [User-Defined Metadata Enrichment](#user-defined-metadata-enrichment) and [Scoping Enrichment with Include/Exclude](#scoping-enrichment-with-include-exclude).

### User-Defined Metadata Enrichment

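User-defined metadata is provided in a YAML file in which each entry is keyed by a CIDR, IP range or individual IP address. An example of the format of this file is: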

```yaml
# Specify whether the IP/CIDR/Range is considered to be "internal".
192.0.2.0/24:
  internal: true

# Additional options are name, vlan, tags and metadata.
192.0.2.192/26:
  name: atlanta_guest_wifi
  vlan: 1001
  tags:
    - wifi
    - dhcp
  metadata:
    dhcp.pool.name: atlanta_guest_wifi
    .site.id: atlanta

# Metadata fields beginning with a . will be organized under the object containing the IP address.
192.0.2.194-192.0.2.198:
  metadata:
    .site.bldg.id: hq
    .site.floor.id: 2
    .site.rack.id: 1

# An individual IP address.
192.0.2.194:
  metadata:
    device.type.name: wifi_ap

# Showcasing the "contexts" feature
10.0.0.0/16:
  contexts:
    - exporter
  metadata:
    .is_exporter: true
10.0.0.0/32:
  contexts:
    - endpoint
  metadata:
    .is_exporter: false

```
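
The entries in the example above are keyed in three ways: by CIDR, by IP range and by individual IP address. The following is a minimal Python sketch of how an address could be tested against each key form. It is an illustration only, not the collector's implementation; the `key_matches` helper is hypothetical, and treating range endpoints as inclusive is an assumption.

```python
import ipaddress


def key_matches(key: str, ip: str) -> bool:
    """Return True if `ip` falls under a CIDR, range or single-address key."""
    addr = ipaddress.ip_address(ip)
    if "/" in key:
        # CIDR block, e.g. 192.0.2.192/26
        return addr in ipaddress.ip_network(key, strict=False)
    if "-" in key:
        # Range, e.g. 192.0.2.194-192.0.2.198 (endpoints assumed to be inclusive)
        first, last = (ipaddress.ip_address(p.strip()) for p in key.split("-", 1))
        return first <= addr <= last
    # Individual IP address
    return addr == ipaddress.ip_address(key)


keys = [
    "192.0.2.0/24",
    "192.0.2.192/26",
    "192.0.2.194-192.0.2.198",
    "192.0.2.194",
]
# Collect every key from the example that covers 192.0.2.194.
matching = [k for k in keys if key_matches(k, "192.0.2.194")]
print(matching)
```

In this sketch `192.0.2.194` is covered by all four keys, which is why the behavior for overlapping entries, described later on this page, matters.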

#### Metadata Types

The User-Defined Metadata enricher supports a combination of pre-defined metadata types as well as the ability to provide custom data as key-value pairs. This section describes the various metadata types. The following table provides a summary of these types.

<table><thead><tr><th width="108">Attribute</th><th width="160">Data Type</th><th width="226">Field Populated</th><th width="253">Description</th></tr></thead><tbody><tr><td><code>name</code></td><td>string</td><td><code>&#x3C;object>.ip.subnet.name</code></td><td>The name given to this subnet.</td></tr><tr><td><code>contexts</code></td><td>array of strings</td><td>N/A</td><td>Limits the contexts in which this enrichment rule applies. See below for more details.</td></tr><tr><td><code>internal</code></td><td>boolean</td><td><code>&#x3C;object>.isInternal</code></td><td>Specifies whether or not the IP belongs to a network considered to be "internal".<br>(Only applies to endpoint fields like source and destination)</td></tr><tr><td><code>vlan</code></td><td>number (0-4094)</td><td><code>&#x3C;object>.vlan.tag.id</code></td><td>A VLAN ID.</td></tr><tr><td><code>tags</code></td><td>array of strings</td><td><code>&#x3C;object>.ip.subnet.tags</code></td><td>Tags that describe attributes of the subnet or IP.</td></tr><tr><td><code>metadata</code></td><td>sequence of attributes</td><td><code>&#x3C;object>&#x3C;attribute></code> or <code>&#x3C;attribute></code></td><td>Key-value pairs that are added at the IP object level or the record level.</td></tr></tbody></table>

**internal**

`internal` is a boolean attribute used to specify whether the CIDR, Range or IP address is considered to be ***internal*** or ***external***. This differs from whether the IP address is within a private or public IP range. Some private IPs may still be considered ***external***, e.g. when they are used within a DMZ. Similarly, some public IPs may still be considered ***internal*** if they are assigned to resources operated by the organization and to which access is generally restricted.

{% hint style="info" %}
We are planning future features which leverage this internal/external designation, and the derived ingress/egress direction of traffic flow.
{% endhint %}

**contexts**

`contexts` limits an enrichment rule so that it applies only within the listed contexts.

<table><thead><tr><th width="126">Value</th><th>Effect</th></tr></thead><tbody><tr><td><code>exporter</code></td><td><p>Only apply the enrichment rule when an "exporter" field matches.</p><p>Exporter fields: <code>system.ip.addr</code>, <code>flow.export.ip.addr</code>.</p><blockquote><p>Note: If you have Elasticsearch output enabled with ECS, the exporter field is <code>host.ip</code>.</p></blockquote></td></tr><tr><td><code>endpoint</code></td><td><p>Only apply the enrichment rule when an "endpoint" field matches.</p><p>Endpoint fields: <code>flow.src</code>, <code>flow.dst</code>, <code>flow.client</code>, <code>flow.server</code>.</p><blockquote><p>Note: If you have Elasticsearch output enabled with ECS, the endpoint fields are <code>source.ip</code>, <code>destination.ip</code>, <code>client.ip</code>, <code>server.ip</code>.</p></blockquote></td></tr></tbody></table>

For example, consider the following rule:

```yaml
10.0.0.0/24:
  contexts:
    - exporter
  vlan: 9
  metadata:
    .is_exporter: true
    owner: department_x
```

Here is how that rule would apply (or not) in the following situations:

<table><thead><tr><th width="185">Exporter Address</th><th width="164">Source Address</th><th>Result</th></tr></thead><tbody><tr><td>10.0.0.1</td><td>192.168.10.11</td><td><ul><li>flow.exporter.is_exporter = true</li><li>flow.exporter.vlan.tag.id = 9</li><li>system.is_exporter = true</li><li>system.vlan.tag.id = 9</li><li>owner = department_x</li></ul></td></tr><tr><td>10.99.0.1</td><td>10.0.0.1</td><td>Nothing. No metadata enrichments applied.</td></tr></tbody></table>

If a rule defines any context, it is never applied to other, less common IP fields (such as `next_hop`).

A single rule can define both the `exporter` and `endpoint` contexts.
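
The context behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the collector's code: the `field_context` and `rule_applies` helpers are hypothetical, endpoint fields are assumed to be matched by prefix, a rule without `contexts` is assumed to apply to every IP field, and `flow.next_hop.ip.addr` is a hypothetical name standing in for a "less common" IP field.

```python
# Field names taken from the contexts table above.
EXPORTER_FIELDS = ("system.ip.addr", "flow.export.ip.addr")
ENDPOINT_PREFIXES = ("flow.src", "flow.dst", "flow.client", "flow.server")


def field_context(field: str):
    """Classify a field as 'exporter', 'endpoint' or None (any other IP field)."""
    if field in EXPORTER_FIELDS:
        return "exporter"
    # Assumption: endpoint fields are recognized by their object prefix.
    if any(field == p or field.startswith(p + ".") for p in ENDPOINT_PREFIXES):
        return "endpoint"
    return None


def rule_applies(contexts, field: str) -> bool:
    """Decide whether a rule with the given contexts applies to a field."""
    if not contexts:
        # Assumption: without contexts, the rule is not limited to any field type.
        return True
    return field_context(field) in contexts


print(rule_applies(["exporter"], "flow.export.ip.addr"))
print(rule_applies(["exporter"], "flow.src.ip.addr"))
print(rule_applies(["exporter", "endpoint"], "flow.src.ip.addr"))
print(rule_applies(["exporter"], "flow.next_hop.ip.addr"))  # hypothetical field name
```

Under these assumptions, only the second and fourth calls evaluate to `False`: the source address is not an exporter field, and a field outside both contexts never matches a rule that defines a context.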

**name**

`name` is a string attribute that gives a subnet a user-friendly name that is meaningful to the user or organization.

{% hint style="danger" %}
Only a single `name` value is returned for a given IP address. Care should be taken to ensure that there are no conflicting names among overlapping CIDRs, Ranges and IP addresses. If you must assign multiple values, add them to the `tags` attribute instead.
{% endhint %}

**vlan**

`vlan` allows a VLAN tag to be specified for a CIDR, Range or IP address. This tag will typically be assigned to source/destination and client/server related fields. There should be no conflict with VLAN tags provided in the flow records from network devices. The devices are reporting on the VLAN tags observed on their own interfaces, not the endpoints of the flow. The VLAN tags reported by devices are typically assigned to the in/out related fields.

**tags**

`tags` is an array of string values for attributes that further describe the CIDR, Range or IP address.

**metadata**

`metadata` is a list of key-value pairs which will be added as fields to the record. These can either be *custom* fields specific to the needs of the user, or existing fields from the ElastiFlow CODEX schema. When CODEX fields are specified, the configured metadata value will override any values that already exist in the record.

{% hint style="info" %}
If you have enabled ECS (Elasticsearch/OpenSearch) or CIM (Splunk) support and want to override values from these schemas, you must specify the CODEX equivalent fields in the `metadata` attribute. Metadata is applied in the decoder portion of the collector, where all data is still in the CODEX schema. Conversion to other schemas is output-specific and thus occurs at a later phase of processing.
{% endhint %}

Key names can be specified with or without a leading `.`.

* If specified ***with*** a leading `.`, the field will be placed within the parent object containing the IP address.
* If specified ***without*** a leading `.`, the field will be placed at the root of the record.

Consider an IP address from `flow.src.ip.addr`:

* If the metadata key is defined as `.site.name`, the value would be assigned to `flow.src.site.name`.
* If the metadata key is defined as `site.name`, the value would be assigned directly to `site.name`.
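
The placement rule can be expressed as a small Python sketch. The `target_field` helper is hypothetical, and deriving the parent object by stripping the trailing `.ip.addr` from the IP field is an assumption based on the `flow.src.ip.addr` example above.

```python
def target_field(metadata_key: str, ip_field: str) -> str:
    """Return the record field that a metadata key would be written to."""
    if not metadata_key.startswith("."):
        # No leading dot: the key is placed at the root of the record.
        return metadata_key
    # Leading dot: the key is placed under the object containing the IP address.
    suffix = ".ip.addr"
    parent = ip_field[: -len(suffix)] if ip_field.endswith(suffix) else ip_field
    return parent + metadata_key


print(target_field(".site.name", "flow.src.ip.addr"))
print(target_field("site.name", "flow.src.ip.addr"))
```

The first call yields `flow.src.site.name` and the second yields `site.name`, matching the two cases listed above.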

#### Merging Values from Multiple Definitions

Attribute values for an IP address which matches multiple CIDR, Range or IP address entries will be merged into a single result set. Consider the following example:

```yaml
192.168.0.0/16:
  metadata:
    .geo.loc.coord: 48.167106,11.486918
    .geo.city.name: Munich
    .geo.country.code: DE
    .geo.country.name: Germany
    .geo.tz.name: Europe/Berlin

192.168.1.0/24:
  name: munich_hq
  tags:
    - campus
  metadata:
    sec.zone.name: campus

192.168.1.151-192.168.1.200:
  tags:
    - guest_wifi
    - dhcp
  metadata:
    .host.name: guest_wifi
    .ip.addr: 192.168.1.0
```

Here you have:

* the whole `192.168.0.0/16` private network with some location metadata.
* a `/24` block of that network, named `munich_hq`, which is tagged as the campus network and assigned the firewall zone to which it belongs.
* a range of IP addresses within that block that belong to the guest WiFi and are assigned by DHCP.

Given a value for `flow.src.ip.addr` of `192.168.1.152`, which matches all three entries in the above configuration, the resulting enrichment fields added to the record would be:

```yaml
flow.src.ip.subnet.name: munich_hq
flow.src.ip.subnet.tags: [campus, guest_wifi, dhcp]
flow.src.geo.loc.coord: 48.167106,11.486918
flow.src.geo.city.name: Munich
flow.src.geo.country.code: DE
flow.src.geo.country.name: Germany
flow.src.geo.tz.name: Europe/Berlin
sec.zone.name: campus
flow.src.host.name: guest_wifi
flow.src.ip.addr: 192.168.1.0
```

{% hint style="info" %}
The last two values above demonstrate one of the use-cases for User-Defined Metadata. The `host.name` and `ip.addr` have been overridden to more generic static values, thus anonymizing the individual guest WiFi users. This allows the traffic to still be collected and analyzed, without tracking each guest individually. Network or security operations can investigate suspect traffic which they may want to block, while preserving individual guests' privacy.
{% endhint %}
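
The merge shown above can be approximated with the following Python sketch. It is not the collector's implementation: the `key_matches` and `merge` helpers are hypothetical, entries are assumed to be applied in file order, and letting a later entry overwrite an earlier `name` or metadata key is an assumption (the example contains no such conflicts).

```python
import ipaddress

ENTRIES = {
    "192.168.0.0/16": {"metadata": {
        ".geo.loc.coord": "48.167106,11.486918",
        ".geo.city.name": "Munich",
        ".geo.country.code": "DE",
        ".geo.country.name": "Germany",
        ".geo.tz.name": "Europe/Berlin",
    }},
    "192.168.1.0/24": {
        "name": "munich_hq",
        "tags": ["campus"],
        "metadata": {"sec.zone.name": "campus"},
    },
    "192.168.1.151-192.168.1.200": {
        "tags": ["guest_wifi", "dhcp"],
        "metadata": {".host.name": "guest_wifi", ".ip.addr": "192.168.1.0"},
    },
}


def key_matches(key: str, ip: str) -> bool:
    """Match an address against a CIDR, an inclusive range or a single IP."""
    addr = ipaddress.ip_address(ip)
    if "/" in key:
        return addr in ipaddress.ip_network(key, strict=False)
    if "-" in key:
        first, last = (ipaddress.ip_address(p) for p in key.split("-", 1))
        return first <= addr <= last
    return addr == ipaddress.ip_address(key)


def merge(entries: dict, ip: str) -> dict:
    """Combine the attributes of every entry that matches `ip`."""
    result = {"tags": [], "metadata": {}}
    for key, attrs in entries.items():
        if not key_matches(key, ip):
            continue
        if "name" in attrs:
            result["name"] = attrs["name"]  # assumption: last match wins
        result["tags"].extend(attrs.get("tags", []))
        result["metadata"].update(attrs.get("metadata", {}))
    return result


merged = merge(ENTRIES, "192.168.1.152")
print(merged["name"], merged["tags"])
```

For `192.168.1.152` the sketch collects the name, all three tags and all eight metadata keys. An address such as `192.168.2.1`, which matches only the `/16` entry, would receive the location metadata alone.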

### Scoping Enrichment with Include/Exclude

The Hostname/DNS and MaxMind GeoIP enrichment features can be scoped to a subset of IP addresses by specifying Autonomous Systems or CIDRs to be included or excluded. These include/exclude definitions are provided via a YAML file which can be updated and refreshed without the need to restart the collector.

**An example of include/exclude definitions:**

```yaml
include:
  asn:
    - 14168
  cidr:
    - 10.0.0.0/8
    - 192.168.0.0/16
exclude:
  #asn:
  #  -
  cidr:
    - 192.168.100.0/24
```

#### Evaluation of Include/Exclude Definitions

It is important to understand how include/exclude definitions are evaluated to ensure your configuration provides the desired outcome. The following rules apply:

1. If no specific include values are defined, ***everything*** is included.
2. Exclude values are evaluated within the scope of included values.

Consider the following examples:

{% hint style="info" %}
While the following examples use only CIDRs, the same logic applies when ASN values are specified.
{% endhint %}

**no include/exclude definitions**

```yaml
# no path provided or an empty file
```

If no include/excludes are defined, ***everything*** is included.

| IP Address  | Included? |
| ----------- | --------- |
| 192.168.0.1 | **✓**     |
| 10.0.0.1    | **✓**     |
| 10.111.0.1  | **✓**     |

**only include is defined**

```yaml
include:
  cidr:
    - 10.0.0.0/8
```

Only those IP addresses within a defined AS or CIDR are included. In this example, only IPs within the CIDR `10.0.0.0/8` are included.

| IP Address  | Included? |
| ----------- | --------- |
| 192.168.0.1 | **✕**     |
| 10.0.0.1    | **✓**     |
| 10.111.0.1  | **✓**     |

**only exclude is defined**

```yaml
exclude:
  cidr:
    - 10.111.0.0/16
```

All IP addresses that are ***not*** specifically excluded by the defined AS or CIDR are included. In this example, all IPs *except* those within the CIDR `10.111.0.0/16` are included.

| IP Address  | Included? |
| ----------- | --------- |
| 192.168.0.1 | **✓**     |
| 10.0.0.1    | **✓**     |
| 10.111.0.1  | **✕**     |

**both include and exclude are defined**

```yaml
include:
  cidr:
    - 10.0.0.0/8
exclude:
  cidr:
    - 10.111.0.0/16
```

Only those IP addresses within a specified AS or CIDR are included, ***EXCEPT*** those within an excluded AS or CIDR.

| IP Address  | Included? |
| ----------- | --------- |
| 192.168.0.1 | **✕**     |
| 10.0.0.1    | **✓**     |
| 10.111.0.1  | **✕**     |

* `192.168.0.1` is ***not*** included as it is not within an included AS or CIDR.
* `10.0.0.1` is included as it is within an included AS or CIDR.
* `10.111.0.1` is ***not*** included. While it does fall within the range of an included CIDR, it is also within a CIDR that is specifically excluded.
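
The two evaluation rules can be condensed into a short Python sketch. It covers CIDRs only, and the `is_included` helper is hypothetical; it illustrates the documented logic rather than the collector's implementation.

```python
import ipaddress


def is_included(ip: str, include=(), exclude=()) -> bool:
    """Evaluate include/exclude CIDR lists for a single IP address."""
    addr = ipaddress.ip_address(ip)

    def in_any(cidrs) -> bool:
        return any(addr in ipaddress.ip_network(c) for c in cidrs)

    # Rule 1: if no include values are defined, everything is included.
    included = in_any(include) if include else True
    # Rule 2: exclude values are evaluated within the scope of included values.
    return included and not in_any(exclude)


for ip in ("192.168.0.1", "10.0.0.1", "10.111.0.1"):
    print(ip, is_included(ip, include=["10.0.0.0/8"], exclude=["10.111.0.0/16"]))
```

With both lists defined as in the last example, only `10.0.0.1` evaluates to `True`, which agrees with the table above.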

