# Flow Processing

### Processor Configuration Options

#### EF\_PROCESSOR\_POOL\_SIZE <a href="#ef_processor_pool_size" id="ef_processor_pool_size"></a>

Specifies the number of processors to start, that is, how many logical threads of execution run concurrently when processing input. For NetObserv Flow, you need at least one processor for every 2000 records/second. Increasing the number of processors allows the collector to better handle a high volume of high-latency enrichment tasks, such as DNS lookups for IP addresses. Returns diminish for a processor pool size greater than 32.

{% hint style="info" %}
While increasing the number of processors can be beneficial, there are diminishing returns at higher processor counts. This is especially true when the number of processors exceeds the number of available CPU threads (real cores + SMT threads) or vCPUs. If you require more than 64 processors, and are using a Standard or Premium License, it may be more beneficial to use multiple collector instances.
{% endhint %}

The minimum value is `2`. If you set it to `1`, the collector uses `2` instead.

If you set it to `0`, the collector uses the default value described below, which is the same as not setting it at all.

* Default
  * `4 * the number of 'units' supported`
    * for NetObserv Flow, a 'unit' equals `flows per second supported / 4000`
    * for NetObserv SNMP, a 'unit' equals `hosts supported / 40`
    * for NetObserv SNMP Trap, a 'unit' equals `hosts supported / 40`

{% hint style="info" %}
Your license (if you are using one) supports a certain number of flows per second (for NetObserv Flow) and/or hosts per second (for NetObserv SNMP and NetObserv SNMP Trap). However, you can manually configure any instance of NetObserv to use *less* than the maximum supported by the license, which helps support horizontal scaling. In that case, the default value is calculated from the manually configured limit.
{% endhint %}
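
The sizing rules above can be sketched as a small calculation. This is only an illustration, not the collector's implementation: the function names are hypothetical, and rounding up fractional results is an assumption, since the rounding behavior is not stated here.

```python
import math

FLOWS_PER_UNIT = 4000         # NetObserv Flow: unit = flows per second / 4000
PROCESSORS_PER_UNIT = 4       # default = 4 * units
MIN_POOL_SIZE = 2             # documented minimum
RECORDS_PER_PROCESSOR = 2000  # guideline: one processor per 2000 records/second


def default_pool_size(flows_per_second: int) -> int:
    """Default for NetObserv Flow: 4 * (flows per second / 4000)."""
    # Assumption: fractional results are rounded up.
    return math.ceil(PROCESSORS_PER_UNIT * flows_per_second / FLOWS_PER_UNIT)


def effective_pool_size(configured: int, flows_per_second: int) -> int:
    """Apply the documented rules: 0 selects the default, values below 2 become 2."""
    size = default_pool_size(flows_per_second) if configured == 0 else configured
    return max(MIN_POOL_SIZE, size)


def minimum_recommended(records_per_second: int) -> int:
    """Guideline minimum: one processor for every 2000 records/second."""
    return math.ceil(records_per_second / RECORDS_PER_PROCESSOR)
```

For a limit of 16000 flows per second, `effective_pool_size(0, 16000)` gives 16, which is above the guideline minimum of 8 returned by `minimum_recommended(16000)`.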

#### EF\_PROCESSOR\_TRANSLATE\_KEEP\_IDS <a href="#ef_processor_translate_keep_ids" id="ef_processor_translate_keep_ids"></a>

Specifies which identifier values will be included in the final dataset.

* Valid Values
  * `none` - All identifiers are removed from the final dataset.
  * `default` - Most identifiers are removed from the final dataset. However, some identifiers which are required for common use-cases (e.g. raw protocol port values) are included.
  * `all` - All identifiers are included in the final dataset.
* Default
  * `default`

#### EF\_PROCESSOR\_DURATION\_PRECISION <a href="#ef_processor_duration_precision" id="ef_processor_duration_precision"></a>

The desired precision of duration-related values. Values received at a different precision than specified will be converted to the desired precision.

* Valid Values
  * `sec` - seconds
  * `ds` - deciseconds
  * `cs` - centiseconds
  * `ms` - milliseconds
  * `us` - microseconds
  * `ns` - nanoseconds
* Default
  * `ms`

{% hint style="info" %}
For most data sources this should be milliseconds (`ms`).
{% endhint %}

#### EF\_PROCESSOR\_TIMESTAMP\_PRECISION <a href="#ef_processor_timestamp_precision" id="ef_processor_timestamp_precision"></a>

The desired precision of timestamp values. Values received at a different precision than specified will be converted to the desired precision.

* Valid Values
  * `sec` - seconds
  * `ds` - deciseconds
  * `cs` - centiseconds
  * `ms` - milliseconds
  * `us` - microseconds
  * `ns` - nanoseconds
* Default
  * `ms`

{% hint style="info" %}
For most data stores, e.g. Elasticsearch, this should be milliseconds (`ms`).
{% endhint %}
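
The precision values accepted by this setting and by `EF_PROCESSOR_DURATION_PRECISION` differ only by powers of ten, so the conversion can be sketched as below. The function name is hypothetical, and truncating when precision is reduced is an assumption made for the example; the collector's rounding behavior is not described here.

```python
# Ticks per second for each documented precision value.
TICKS_PER_SECOND = {
    "sec": 1,
    "ds": 10,
    "cs": 100,
    "ms": 1_000,
    "us": 1_000_000,
    "ns": 1_000_000_000,
}


def convert_precision(value: int, source: str, target: str) -> int:
    """Convert an integer time value from one precision to another."""
    # Assumption: reducing precision truncates toward zero for non-negative values.
    return value * TICKS_PER_SECOND[target] // TICKS_PER_SECOND[source]
```

For example, a value of 250 received in centiseconds becomes 2500 when the desired precision is `ms`.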

#### EF\_PROCESSOR\_PERCENT\_NORM <a href="#ef_processor_percent_norm" id="ef_processor_percent_norm"></a>

The desired representation of percentages. Values received with a different representation than specified will be converted to the desired representation.

* Valid Values
  * `1` - values will be based on a scale of 0-1.
  * `100` - values will be based on a scale of 0-100.
* Default
  * `100`
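
As a rough illustration of the normalization described above, the following sketch rescales a percentage between the two supported representations. The function name and signature are hypothetical.

```python
def normalize_percent(value: float, source_scale: int, target_scale: int) -> float:
    """Rescale a percentage between the 0-1 and 0-100 representations."""
    if source_scale not in (1, 100) or target_scale not in (1, 100):
        raise ValueError("scale must be 1 or 100")
    return value * target_scale / source_scale
```

With the default of `100`, a value received as `0.5` on a 0-1 scale would be stored as `50.0`.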

#### EF\_PROCESSOR\_KEEP\_CPU\_TICKS <a href="#ef_processor_keep_cpu_ticks" id="ef_processor_keep_cpu_ticks"></a>

For telemetry sources which provide CPU usage as timeticks, utilization percentages will be calculated. If this setting is set to `false`, the timetick values will be removed from the final dataset. If `true`, they will be kept in addition to the utilization values.

* Valid Values
  * `true`, `false`
* Default
  * `false`

#### EF\_PROCESSOR\_DROP\_FIELDS <a href="#ef_processor_drop_fields" id="ef_processor_drop_fields"></a>

This setting accepts a comma-separated list of fields to remove from all records. The fields are dropped after all enrichment and *PRIOR* to the records being sent to the enabled outputs.

{% hint style="info" %}
The conversion from the default CODEX schema to alternate schemas, e.g. Elastic's ECS or Splunk's CIM, happens within the respective outputs. As fields are dropped *PRIOR* to the outputs, CODEX field names must be used to configure this option.
{% endhint %}

* Valid Values
  * any CODEX-schema field names, comma-separated
* Example
  * `flow.export.sysuptime,flow.export.version.ver,flow.start.sysuptime,flow.end.sysuptime,flow.seq_num`
* Default
  * `''`
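
The effect of this setting can be pictured with the sketch below. It assumes, purely for illustration, that a record is a flat mapping keyed by CODEX field names and that whitespace around names is ignored; neither detail is confirmed here, and both function names are hypothetical.

```python
def parse_drop_fields(setting: str) -> set[str]:
    """Split the comma-separated setting into a set of field names."""
    # Assumption: surrounding whitespace and empty entries are ignored.
    return {name.strip() for name in setting.split(",") if name.strip()}


def drop_fields(record: dict, fields: set[str]) -> dict:
    """Return a copy of the record without the listed fields."""
    return {key: value for key, value in record.items() if key not in fields}


record = {"flow.seq_num": 7, "flow.export.sysuptime": 123456, "flow.bytes": 100}
cleaned = drop_fields(record, parse_drop_fields("flow.export.sysuptime,flow.seq_num"))
# cleaned == {"flow.bytes": 100}
```

An empty setting (the default) parses to an empty set, so records pass through unchanged.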

#### EF\_PROCESSOR\_DECODE\_IPFIX\_ENABLE

Set to `true` to enable decoding of IPFIX records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_NETFLOW1\_ENABLE

Set to `true` to enable decoding of NetFlow v1 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_NETFLOW5\_ENABLE

Set to `true` to enable decoding of NetFlow v5 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_NETFLOW6\_ENABLE

Set to `true` to enable decoding of NetFlow v6 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_NETFLOW7\_ENABLE

Set to `true` to enable decoding of NetFlow v7 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_NETFLOW9\_ENABLE

Set to `true` to enable decoding of NetFlow v9 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_SFLOW5\_ENABLE

Set to `true` to enable decoding of sFlow v5 records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_SFLOW\_FLOWS\_ENABLE

Set to `true` to enable decoding of sFlow `flow_sample` and `flow_sample_expanded` records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_SFLOW\_FLOWS\_KEEP\_SAMPLES

When set to `true`, the packet data from an sFlow `sampled_header` record will be stored in `l2.section.sample` as a hex-encoded string.

* Valid Values
  * `true`, `false`
* Default
  * `false`

#### EF\_PROCESSOR\_DECODE\_SFLOW\_COUNTERS\_ENABLE

Set to `true` to enable decoding of sFlow `counters_sample` and `counters_sample_expanded` records.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_DECODE\_MAX\_RECORDS\_PER\_PACKET

Corrupt packets can cause issues with the decoding of records. One safeguard is to limit the number of records that will be decoded from a single packet. When the network between the device and collector has an MTU larger than `1500`, normal packets may exceed the default limit of `64`. This option allows the limit to be raised when necessary.

* Default
  * `64`
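
To see why a larger MTU can push normal packets past the default, the sketch below estimates how many fixed-size records fit in one NetFlow v9 packet. The 40-byte record size is a made-up example (actual sizes depend on the exporter's template), and the calculation assumes IPv4 without options and a single data FlowSet with no padding.

```python
IPV4_HEADER = 20      # bytes, assuming no IP options
UDP_HEADER = 8        # bytes
NETFLOW9_HEADER = 20  # bytes, NetFlow v9 packet header
FLOWSET_HEADER = 4    # bytes, FlowSet ID + length


def max_records_per_packet(mtu: int, record_size: int) -> int:
    """Estimate how many records of record_size bytes fit in one packet."""
    payload = mtu - IPV4_HEADER - UDP_HEADER - NETFLOW9_HEADER - FLOWSET_HEADER
    return max(0, payload // record_size)


standard = max_records_per_packet(1500, 40)  # 36 records, under the default of 64
jumbo = max_records_per_packet(9000, 40)     # 223 records, over the default of 64
```

Under these assumptions, a 9000-byte MTU would require raising the limit to at least 223 for such packets to be fully decoded.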

#### EF\_PROCESSOR\_ENRICH\_ASN\_PREF

If enrichment with autonomous system attributes is enabled, but the autonomous system is already indicated directly in the flow record data, this setting specifies which source is preferred. If the preferred source is not available for a given record, the decoder will fall back to the alternate option.

* Valid Values
  * `lookup` - prefer the autonomous system determined by lookup.
  * `flow` - prefer the autonomous system indicated directly in the flow record data.
* Default
  * `lookup`
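
The preference and fallback behavior described above can be sketched as follows. The function is hypothetical, and representing an unavailable source as `None` is an assumption made for the example.

```python
from typing import Optional


def select_asn(pref: str, lookup_asn: Optional[int], flow_asn: Optional[int]) -> Optional[int]:
    """Pick the preferred ASN source, falling back to the alternate if it is missing."""
    if pref not in ("lookup", "flow"):
        raise ValueError("pref must be 'lookup' or 'flow'")
    if pref == "lookup":
        preferred, alternate = lookup_asn, flow_asn
    else:
        preferred, alternate = flow_asn, lookup_asn
    # Assumption: None means the source had no value for this record.
    return preferred if preferred is not None else alternate
```

With the default of `lookup`, a record whose lookup yields nothing still receives the ASN carried in the flow data, if one is present.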

#### EF\_PROCESSOR\_ENRICH\_JOIN\_ASN

Some features require that related values from separate fields are stored as an array in a single field. Such a "join" of autonomous system related fields is enabled when this setting is `true`.

{% hint style="info" %}
If records are being output to Elasticsearch this setting should be set to `true`.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `true`
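
A "join" of this kind, which also applies to the other `EF_PROCESSOR_ENRICH_JOIN_*` settings, can be pictured with the sketch below. The field names `src.as.label`, `dst.as.label`, and `as.label` are placeholders chosen for illustration, not confirmed CODEX field names, and keeping the original fields alongside the joined one is an assumption.

```python
def join_fields(record: dict, source_fields: list[str], target_field: str) -> dict:
    """Collect the values of several fields into one array-valued field."""
    values = [record[name] for name in source_fields if record.get(name) is not None]
    joined = dict(record)  # assumption: the original fields are kept
    if values:
        joined[target_field] = values
    return joined


record = {"src.as.label": "AS-A", "dst.as.label": "AS-B"}
joined = join_fields(record, ["src.as.label", "dst.as.label"], "as.label")
# joined["as.label"] == ["AS-A", "AS-B"]
```

A single array field like this lets a query match a value regardless of whether it appeared on the source or destination side.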

#### EF\_PROCESSOR\_ENRICH\_JOIN\_GEOIP

Some features require that related values from separate fields are stored as an array in a single field. Such a "join" of GeoIP related fields is enabled when this setting is `true`.

{% hint style="info" %}
If records are being output to Elasticsearch this setting should be set to `true`.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_ENRICH\_JOIN\_NETATTR

Some features require that related values from separate fields are stored as an array in a single field. Such a "join" of network attribute related fields is enabled when this setting is `true`.

{% hint style="info" %}
If records are being output to Elasticsearch this setting should be set to `true`.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_ENRICH\_JOIN\_SUBNETATTR

Some features require that related values from separate fields are stored as an array in a single field. Such a "join" of IP subnetwork attribute related fields is enabled when this setting is `true`.

{% hint style="info" %}
If records are being output to Elasticsearch this setting should be set to `true`.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_ENRICH\_JOIN\_SEC

Some features require that related values from separate fields are stored as an array in a single field. Such a "join" of security attribute related fields is enabled when this setting is `true`.

{% hint style="info" %}
If records are being output to Elasticsearch this setting should be set to `true`.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_EXPAND\_CLISRV

The collector will infer the client/server relationship of two source/destination endpoints. This setting determines whether such inference is enabled or not.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_EXPAND\_CLISRV\_NO\_L4\_PORTS

For flow records related to protocols which include no layer-4 ports, the collector will infer the client/server relationship of the two source/destination endpoints using the order of the IP addresses. This setting determines whether such inference is enabled or not.

* Valid Values
  * `true`, `false`
* Default
  * `true`

#### EF\_PROCESSOR\_IFA\_ENABLE

* Valid Values
  * `true`, `false`
* Default
  * `false`

#### EF\_PROCESSOR\_IFA\_WORKER\_SIZE

Specifies the number of IFA Hop record processors to start.

* Default
  * `4 * the number of license units`

#### EF\_PROCESSOR\_ENRICH\_TOTALS\_IF\_NO\_DELTAS

The vast majority of flow exporters provide byte and packet quantities as *DELTA* values. This refers to the quantity since the last record for the flow was reported. However, some exporters will provide these quantities only as *TOTAL* values, referring to the quantity over the entire lifetime of the flow. Examples of such exporters are Cisco "NetFlow Lite" (e.g. IE4000 series), some Juniper MX-series when sending IPFIX, and Versa Networks.

In cases where the exporter sends *ONLY* totals, it may still be desirable to use these values to populate `flow.bytes` and `flow.packets`, the idea being that "something is better than nothing". When this option is set to `true`, *total* quantities will be used when they are available and *delta* quantities are not.

{% hint style="danger" %}
***Total*** quantities can be problematic for many datastores. A simple sum of ***total*** values across multiple records within a time window will not produce an accurate quantity, as it does with ***delta*** values. As a result, long-lived flows may over-report bytes and packets values if ***total*** values are used.
{% endhint %}

* Valid Values
  * `true`, `false`
* Default
  * `false`
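
The over-reporting risk can be shown with a small worked example. The byte counts are invented for illustration: one long-lived flow is reported in three records, first as deltas and then as the equivalent running totals.

```python
from itertools import accumulate

# Bytes reported per record for one flow, as delta values.
deltas = [100, 150, 50]

# The same flow reported as lifetime totals: each record repeats earlier bytes.
totals = list(accumulate(deltas))  # [100, 250, 300]

sum_of_deltas = sum(deltas)  # 300, the true byte count
sum_of_totals = sum(totals)  # 650, more than double the true count
```

Only the last total (300) matches the true byte count, so a datastore that sums `flow.bytes` across records in a time window will over-report when totals are used in place of deltas.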

