Elasticsearch

Overview#

The Elasticsearch output can be used to send records to Elasticsearch and Elastic Cloud.

EF_FLOW_OUTPUT_ELASTICSEARCH_ENABLE#

Specifies whether the Elasticsearch output is enabled.

  • Valid Values
    • true, false
  • Default
    • false

EF_FLOW_OUTPUT_ELASTICSEARCH_ECS_ENABLE#

Specifies whether the data will be sent using Elastic Common Schema (ECS).

  • Valid Values
    • true, false
  • Default
    • false

EF_FLOW_OUTPUT_ELASTICSEARCH_BATCH_DEADLINE#

The maximum time, in milliseconds, to wait for a batch of records to fill before being sent to the Elasticsearch bulk API.

  • Default
    • 2000

EF_FLOW_OUTPUT_ELASTICSEARCH_BATCH_MAX_BYTES#

The maximum size, in bytes, for a batch of records being sent to the Elasticsearch bulk API.

  • Default
    • 8388608
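
The interaction of the two batch settings above can be sketched as follows. This is a minimal illustration, not the collector's actual implementation: the `Batcher` class, its method names and the injectable `clock` are hypothetical, and it assumes a batch is sent when either the deadline passes or the next record would push it over the byte limit.

```python
import time


class Batcher:
    """Hypothetical helper: collects serialized records into batches."""

    def __init__(self, deadline_ms=2000, max_bytes=8388608, clock=time.monotonic):
        self.deadline_s = deadline_ms / 1000.0
        self.max_bytes = max_bytes
        self.clock = clock
        self.records = []
        self.size = 0
        self.started = None  # time the first record of the current batch arrived

    def add(self, record: bytes):
        """Add a record. Returns the previous batch if this record would not fit, else None."""
        flushed = None
        if self.records and self.size + len(record) > self.max_bytes:
            flushed = self.flush()
        if not self.records:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)
        return flushed

    def due(self):
        """True when the oldest record in the batch has waited at least the deadline."""
        return bool(self.records) and self.clock() - self.started >= self.deadline_s

    def flush(self):
        """Return the current batch and start a new, empty one."""
        batch, self.records, self.size, self.started = self.records, [], 0, None
        return batch
```

A caller would poll `due()` periodically and pass any batch returned by `add()` or `flush()` to the bulk API.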

EF_FLOW_OUTPUT_ELASTICSEARCH_TIMESTAMP_SOURCE#

Determines which timestamp is used to set the @timestamp field. In most cases, end is the best setting. However, for poorly behaving or misconfigured devices, collect may be the better option.

  • Valid Values
    • start - Use the timestamp from flow.start.timestamp. The flow start time indicated in the flow.
    • end - Use the timestamp from flow.end.timestamp. The flow end time (or last reported time).
    • export - Use the timestamp from flow.export.timestamp. The time from the flow record header.
    • collect - Use the timestamp from flow.collect.timestamp. The time that the collector processed the flow record.
  • Default
    • end
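
The mapping from each valid value to its source field can be sketched as a simple lookup. This is an illustration only: `select_timestamp` is a hypothetical helper, and it assumes a record is a flat dictionary keyed by the dotted field names listed above.

```python
# Mapping of setting values to the fields named in the list above.
TIMESTAMP_FIELDS = {
    "start": "flow.start.timestamp",
    "end": "flow.end.timestamp",
    "export": "flow.export.timestamp",
    "collect": "flow.collect.timestamp",
}


def select_timestamp(record, source="end"):
    """Return the value that would be copied into @timestamp for the given source."""
    if source not in TIMESTAMP_FIELDS:
        raise ValueError(f"invalid timestamp source: {source!r}")
    return record[TIMESTAMP_FIELDS[source]]
```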

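EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_PERIOD#

Specifies how often new indices are created, which also determines the time period suffix of the index name.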

  • Valid Values
    • daily - New indices will be created each day. The format of the time period suffix will be -yyyy.MM.dd.
    • weekly - New indices will be created each week. The format of the time period suffix will be -yyyy.'w'ww.
    • monthly - New indices will be created each month. The format of the time period suffix will be -yyyy.MM.
    • ilm - Index Lifecycle Management will be used to handle the creation and deletion of indices.
  • Default
    • daily
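
As a rough illustration of the suffix formats listed above, the sketch below derives a suffix from a date. `period_suffix` is a hypothetical helper. The weekly case assumes the ISO 8601 week number paired with the calendar year, which may differ from the collector's actual week numbering. The ilm value is not handled because no time period suffix is described for it.

```python
from datetime import date


def period_suffix(period, day):
    """Return the time period suffix for an index created on the given day."""
    if period == "daily":
        return day.strftime("-%Y.%m.%d")  # -yyyy.MM.dd
    if period == "weekly":
        # Assumption: ISO 8601 week number, zero-padded to two digits.
        return f"-{day.year}.w{day.isocalendar()[1]:02d}"  # -yyyy.'w'ww
    if period == "monthly":
        return day.strftime("-%Y.%m")  # -yyyy.MM
    raise ValueError(f"no time period suffix defined for: {period!r}")
```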

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_SUFFIX#

It can sometimes be useful to have separate indices for different environments, locations or other organizational units. This setting allows you to specify a suffix that will be added to the index name for such purposes.

  • Default
    • ''

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_ENABLE#

Specifies whether the output should attempt to add the required index template to Elasticsearch.

  • Valid Values
    • true, false
  • Default
    • true

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_OVERWRITE#

If the output is configured to add the index template to Elasticsearch (EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_ENABLE is true), this setting determines whether the index template should be overwritten if it already exists.

  • Valid Values
    • true, false
  • Default
    • false

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_SHARDS#

The number of shards with which the index should be created. As a general rule, additional shards increase ingest performance, assuming there are sufficient data nodes across which the shards can be distributed.

  • Recommended
    • Equal to the number of Elasticsearch data nodes to which data will be indexed.
  • Default
    • 3
note

This setting configures the index template sent to Elasticsearch. It does NOT change any existing indices.

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_REPLICAS#

The number of replicas that should be created for each shard. If using a multi-node cluster and data redundancy is desired, this value must be at least 1.

In general, additional replicas will increase query performance, assuming there are sufficient data nodes across which the replicas can be distributed.

  • Recommended
    • 1 if indexing data to a multi-node cluster.
    • 0 for a single-node.
  • Default
    • 1
note

This setting configures the index template sent to Elasticsearch. It does NOT change any existing indices.

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_REFRESH_INTERVAL#

Specifies the period for the refresh interval. The refresh interval is the time window in which newly ingested documents are added to a segment, prior to the segment being added to the index. Only after the refresh interval has ended and the segment has been added to the index do the documents become searchable.

  • Recommended
    • 5s - If the data needs to become available for queries more quickly. However, shorter refresh intervals will negatively impact ingest performance.
    • 30s - (or longer) If maximizing ingest performance is the highest priority. Longer refresh intervals negatively impact the real-time accessibility of new records.
    • 10s or 15s - This is a reasonable compromise between ingest performance and data accessibility for most network traffic analytics use-cases.
  • Default
    • 10s
note

This setting configures the index template sent to Elasticsearch. It does NOT change any existing indices.

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_CODEC#

This setting determines the level of compression used for stored values.

  • Valid Values
    • default - Stored values are compressed using LZ4.
    • best_compression - Stored values are compressed using DEFLATE. This reduces disk capacity requirements with the trade-off of slightly higher CPU utilization.
  • Default
    • best_compression
note

This setting configures the index template sent to Elasticsearch. It does NOT change any existing indices.
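
The four template settings above (shards, replicas, refresh interval and codec) correspond to standard Elasticsearch index settings. The sketch below shows one way they might be read from the environment and turned into an index settings object. `template_settings` is a hypothetical helper, and the template that the collector actually sends is not shown in this document, so treat the result as an assumption about its shape.

```python
import os

PREFIX = "EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_"


def template_settings(env=os.environ):
    """Build index settings from the environment, using the documented defaults."""

    def get(name, default):
        return env.get(PREFIX + name, default)

    return {
        "index.number_of_shards": int(get("SHARDS", "3")),
        "index.number_of_replicas": int(get("REPLICAS", "1")),
        "index.refresh_interval": get("REFRESH_INTERVAL", "10s"),
        "index.codec": get("CODEC", "best_compression"),
    }
```

Because these values live in the index template, they would only apply to indices created after the template is updated.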

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_ILM_LIFECYCLE#

If data is being stored to an Elasticsearch cluster with Index Lifecycle Management (ILM) features enabled, this setting specifies the name of the ILM Lifecycle that should be applied to the indices.

note

The ILM Lifecycle itself MUST be configured separately in Elasticsearch.

  • Default
    • elastiflow

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_ILM_ROLLOVER_ALIAS#

If data is being stored to an Elasticsearch cluster with Index Lifecycle Management (ILM) features enabled, this setting specifies the name of the ILM Lifecycle Rollover Alias that should be applied to the indices.

note

The ILM Lifecycle itself MUST be configured separately in Elasticsearch.

  • Default
    • ''

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_PIPELINE_DEFAULT#

To process incoming records with an Elasticsearch Ingest Pipeline before they are indexed, use this setting to specify the name of the default pipeline.

  • Default
    • _none

EF_FLOW_OUTPUT_ELASTICSEARCH_INDEX_TEMPLATE_PIPELINE_FINAL#

To process incoming records with an Elasticsearch Ingest Pipeline before they are indexed, use this setting to specify the name of the final pipeline.

  • Default
    • _none

EF_FLOW_OUTPUT_ELASTICSEARCH_ADDRESSES#

This setting specifies the Elasticsearch servers to which the output should connect. It is a comma-separated list of Elasticsearch nodes, including port number.

warning

Do NOT include http:// or https:// in the provided value. TLS communication is enabled/disabled using EF_FLOW_OUTPUT_ELASTICSEARCH_TLS_ENABLE.

  • Default
    • 127.0.0.1:9200
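
To illustrate how the address list and the TLS setting combine, the sketch below expands a comma-separated value into node URLs. `node_urls` is a hypothetical helper and not part of the collector. Rejecting values that contain a scheme reflects the warning above, but how the collector itself reacts to such values is an assumption.

```python
def node_urls(addresses="127.0.0.1:9200", tls_enable=False):
    """Expand a comma-separated host:port list into URLs, choosing the scheme from tls_enable."""
    scheme = "https" if tls_enable else "http"
    urls = []
    for addr in addresses.split(","):
        addr = addr.strip()
        if not addr:
            continue  # ignore empty entries such as a trailing comma
        if "://" in addr:
            raise ValueError(f"address must not include a scheme: {addr!r}")
        urls.append(f"{scheme}://{addr}")
    return urls
```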

EF_FLOW_OUTPUT_ELASTICSEARCH_USERNAME#

The username to use when connecting to Elasticsearch.

  • Default
    • elastic

EF_FLOW_OUTPUT_ELASTICSEARCH_PASSWORD#

The password to use when connecting to Elasticsearch.

  • Default
    • changeme

EF_FLOW_OUTPUT_ELASTICSEARCH_CLOUD_ID#

The URI for the Elastic Cloud endpoint to which the output should connect. If set, this value overrides EF_FLOW_OUTPUT_ELASTICSEARCH_ADDRESSES.

  • Default
    • ''

EF_FLOW_OUTPUT_ELASTICSEARCH_API_KEY#

The base64-encoded token to use for authorization.

Elasticsearch provides Security APIs to create, retrieve and invalidate API keys.

If set, this value overrides EF_FLOW_OUTPUT_ELASTICSEARCH_USERNAME and EF_FLOW_OUTPUT_ELASTICSEARCH_PASSWORD.

  • Default
    • ''
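
The precedence between the API key and the username/password pair can be sketched as below. `authorization_header` is a hypothetical helper. It assumes the API key value is already base64-encoded, as described above, and is sent using the Elasticsearch `ApiKey` authorization scheme. Otherwise HTTP Basic authentication is used.

```python
import base64


def authorization_header(username="elastic", password="changeme", api_key=""):
    """Return the Authorization header value, preferring the API key when it is set."""
    if api_key:
        # The API key overrides the username and password.
        return f"ApiKey {api_key}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"
```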

EF_FLOW_OUTPUT_ELASTICSEARCH_CLIENT_CA_CERT_FILEPATH#

The path to the Certificate Authority (CA) certificate to use for client PKI authentication.

  • Default
    • ''

To use PKI authentication, your Elasticsearch cluster must be configured for it; refer to the Elasticsearch documentation for a guide. The Unified Flow Collector requires a role mapping with the "superuser" role. Alternatively, use a custom role containing the privileges that allow the Unified Flow Collector to read, create, update and delete indices, and to read, create, update and delete data in those indices.

EF_FLOW_OUTPUT_ELASTICSEARCH_CLIENT_CERT_FILEPATH#

The path to the client certificate to use for client PKI authentication.

  • Default
    • ''

EF_FLOW_OUTPUT_ELASTICSEARCH_CLIENT_KEY_FILEPATH#

The path to the client key to use for client PKI authentication.

  • Default
    • ''

EF_FLOW_OUTPUT_ELASTICSEARCH_TLS_ENABLE#

This setting is used to enable/disable TLS connections to Elasticsearch.

  • Valid Values
    • true, false
  • Default
    • false

EF_FLOW_OUTPUT_ELASTICSEARCH_TLS_SKIP_VERIFICATION#

Specifies whether to skip TLS verification of the Elasticsearch server to which the output is attempting to connect. When true, the server's certificate is not verified.

  • Valid Values
    • true, false
  • Default
    • false

EF_FLOW_OUTPUT_ELASTICSEARCH_TLS_CA_CERT_FILEPATH#

The path to the Certificate Authority (CA) certificate to use for verification of the Elasticsearch server to which the output is attempting to connect.

  • Default
    • ''
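
For readers who want to see how the TLS settings above typically map onto a client, here is a minimal sketch using Python's standard `ssl` module. `build_tls_context` is a hypothetical helper, and the collector's own TLS handling is not described in this document. It assumes that an empty CA path means the system's default trust store is used.

```python
import ssl


def build_tls_context(skip_verification=False, ca_cert_filepath=""):
    """Create a client TLS context from the two verification-related settings."""
    # An empty path falls back to the system's default CA certificates (assumption).
    ctx = ssl.create_default_context(cafile=ca_cert_filepath or None)
    if skip_verification:
        # Hostname checking must be disabled before certificate verification.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```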

EF_FLOW_OUTPUT_ELASTICSEARCH_RETRY_ENABLE#

Specifies whether to retry connecting to Elasticsearch after a connection has failed.

  • Valid Values
    • true, false
  • Default
    • true

EF_FLOW_OUTPUT_ELASTICSEARCH_RETRY_ON_TIMEOUT_ENABLE#

Specifies whether to retry bulk indexing requests that have timed out.

  • Valid Values
    • true, false
  • Default
    • true

EF_FLOW_OUTPUT_ELASTICSEARCH_MAX_RETRIES#

Specifies the number of times to retry bulk indexing requests that have timed out.

  • Default
    • 3

EF_FLOW_OUTPUT_ELASTICSEARCH_RETRY_BACKOFF#

If set, this value specifies the number of milliseconds that the output should back off before retrying a failed bulk request.

  • Default
    • 1000
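
The retry settings above can be pictured as a simple loop. This is a sketch under stated assumptions and not the collector's implementation: `send_with_retries` is a hypothetical helper, a timeout is represented by Python's `TimeoutError`, and the backoff is assumed to be a constant delay between attempts.

```python
import time


def send_with_retries(send, max_retries=3, backoff_ms=1000, sleep=time.sleep):
    """Call send(); on timeout, wait backoff_ms and retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return send()
        except TimeoutError:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last timeout
            attempt += 1
            sleep(backoff_ms / 1000.0)
```

With the defaults, a request that keeps timing out is attempted four times in total: the initial attempt plus three retries.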

EF_FLOW_OUTPUT_ELASTICSEARCH_DROP_FIELDS#

This setting allows for a comma-separated list of fields that are to be removed from all records.

note

Fields are dropped after any output-specific fields have been added and after any schema conversion. This means that you should use the field names as you see them in the user interface.

  • Valid Values
    • any field names related to the enabled schema, comma-separated
  • Example
    • flow.export.sysuptime,flow.export.version.ver,flow.start.sysuptime,flow.end.sysuptime,flow.seq_num
  • Default
    • ''
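
The effect of this setting can be illustrated with a short sketch. `drop_fields` is a hypothetical helper, and it assumes a record is a flat dictionary keyed by the dotted field names shown in the example above.

```python
def drop_fields(record, setting=""):
    """Return a copy of the record without the fields named in the comma-separated setting."""
    names = {name.strip() for name in setting.split(",") if name.strip()}
    return {key: value for key, value in record.items() if key not in names}
```

With the default empty value, the record is returned unchanged.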