# Why are there discrepancies between ElastiFlow data & reality?

### Problem

When visualizing ElastiFlow data, the reported traffic (bytes, packets, or throughput) is much higher or lower than expected.

![image](https://user-images.githubusercontent.com/43585378/212578740-42c7aaed-2723-4c9d-a2c8-61f681ce3464.png)

### Reason #1: Sample Rate

A sample rate is the average ratio of packets arriving on an sFlow-enabled port to the number of flow samples taken from those packets. sFlow sampling can affect performance on some network equipment. See <https://blog.sflow.com/2009/06/sampling-rates.html>

#### How sample rates impact ElastiFlow data accuracy

The collector must adjust its byte and packet calculations based on the sampling rate in use. Usually, devices inform the collector of the sampling rate either within the flow record itself or as option data sent periodically by the device. The sample rate setting described in the reference below specifies the size of the cache used to hold sample rate information learned from option data.
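As a rough illustration of this adjustment, the sketch below scales sampled counters by the sampling interval. The function name and the counter values are hypothetical and are not taken from the collector's implementation; the sketch only shows why a collector that assumes 1:1 for a device actually sampling at 1:512 under-reports by a factor of 512.

```python
def scale_sampled_counters(sampled_bytes: int, sampled_packets: int,
                           sample_interval: int) -> tuple[int, int]:
    """Estimate real totals from sampled counters for a 1:N sampling rate."""
    if sample_interval < 1:
        raise ValueError("sample_interval must be >= 1")
    return sampled_bytes * sample_interval, sampled_packets * sample_interval


# Hypothetical sampled counters exported by a device sampling at 1:512.
sampled_bytes, sampled_packets = 150_000, 100

# What the collector reports when it knows the real rate (1:512).
correct = scale_sampled_counters(sampled_bytes, sampled_packets, 512)
# What it reports when it wrongly assumes the traffic is unsampled (1:1).
assumed_1_to_1 = scale_sampled_counters(sampled_bytes, sampled_packets, 1)

print(correct)         # (76800000, 51200)
print(assumed_1_to_1)  # (150000, 100)
```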

Reference: <https://elastiflow.com/docs/config_ref_sampling/#sample-rate>

What is a "good" sampling rate? See <https://blog.sflow.com/2009/05/scalability-and-accuracy-of-packet.html>

![image](https://1.bp.blogspot.com/_N3xuQCvc1v4/SkUhTKcl9KI/AAAAAAAAACU/4o_4MJ9e7sE/s400/samplingrates.PNG)

Detailed Explanation: <https://sflow.org/packetSamplingBasics/index.htm>

#### Confirming a device's sample rate

In Kibana, go to:

![image](https://user-images.githubusercontent.com/43585378/212575740-ec2e911c-49fe-4a9e-b2c9-c419e377ae31.png)

Filter on the exporter in question:

![image](https://user-images.githubusercontent.com/43585378/212756393-72aacc60-a213-40d8-9650-df555818cc94.png)

Then look at a record and focus on the following field to see what NetObserv Flow thinks the current sample rate is:

![image](https://user-images.githubusercontent.com/43585378/212575820-ea629310-e522-4466-85c0-04f1b152933a.png)

In the above example, NetObserv Flow believes the sample rate is 1:1. However, as shown below, the device is configured with a sample rate of 1:512, which explains the discrepancy in packets and bytes:

![image](https://user-images.githubusercontent.com/43585378/212575883-df42ce7d-06a3-49c2-9880-37d8415e8353.png)

This happens because the flow exporter (a Cisco Nexus switch in this example) is not configured to send the sampler table. On some flow exporters you must explicitly specify that a 'sample options template' be sent; Cisco refers to this as a 'sampler-table'. To have the network device send it, add it to the flow configuration on the flow exporter:

![image](https://user-images.githubusercontent.com/43585378/212575995-e0fcb5c9-d0cf-410d-bc92-4c0706669ab7.png)

Now (after the timeout period has elapsed and the flow exporter has sent the sampler table), when we go back into Kibana and check the `flow.meter.packet_select.interval.packets` field in flow records coming from the Nexus switch, we can see that NetObserv Flow knows the correct sample rate of 1:512:

![image](https://user-images.githubusercontent.com/43585378/212575713-f39138e0-eac1-4d12-8bb1-c11f54500aa7.png)
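The same check can be scripted across many exporters. The following is a minimal sketch that assumes flow records have been exported as flat dictionaries; the `flow.export.ip.addr` key, the helper names, and the sample data are assumptions made for illustration, while the sampling field name is the one used in this article. A missing sampling field is treated as 1 purely for the purposes of this check.

```python
from collections import defaultdict

# Field named in this article; holds the sampling interval the collector applied.
SAMPLE_FIELD = "flow.meter.packet_select.interval.packets"
# Assumption: the exporter address is stored under this key in your records.
EXPORTER_FIELD = "flow.export.ip.addr"


def sample_rates_by_exporter(records):
    """Map each exporter to the set of sampling intervals seen in its records."""
    rates = defaultdict(set)
    for record in records:
        exporter = record.get(EXPORTER_FIELD, "unknown")
        # Assumption of this sketch: a missing field counts as 1 (unsampled).
        rates[exporter].add(int(record.get(SAMPLE_FIELD, 1)))
    return dict(rates)


def suspicious_exporters(records, expected):
    """Return exporters whose observed intervals differ from the configured one."""
    observed = sample_rates_by_exporter(records)
    return sorted(
        exporter
        for exporter, interval in expected.items()
        if observed.get(exporter) != {interval}
    )


# Hypothetical flattened records, for example exported from Kibana.
records = [
    {EXPORTER_FIELD: "192.0.2.10", SAMPLE_FIELD: 1},
    {EXPORTER_FIELD: "192.0.2.20", SAMPLE_FIELD: 512},
]
# Sampling intervals configured on the devices themselves (hypothetical).
expected = {"192.0.2.10": 512, "192.0.2.20": 512}

print(suspicious_exporters(records, expected))  # ['192.0.2.10']
```

Any exporter returned here is a candidate for the missing sampler-table problem described above.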

### Reason #2: Your flow exporter is counting the same flow(s) multiple times

When configuring flow exporters (routers, switches, firewalls, etc.), it is typical to enable flow collection on specific interfaces. If this is misconfigured, the same flow can be counted two or more times and the counts added together.

#### Example of a flow exporter counting a single flow multiple times

![image](https://user-images.githubusercontent.com/43585378/212578740-42c7aaed-2723-4c9d-a2c8-61f681ce3464.png)

#### Why this happens (configuration of R1)

As you can see below, R1 is not only configured to collect flows from Ethernet 0/3, Ethernet 0/2, and Vlan200; it is also collecting flows from these interfaces in **both** the **input and output** directions.

In the example speed test above, the actual download speed was \~213 Mbps, but the flow is counted twice: once as it enters interface Ethernet 0/3 **and** again as it leaves Vlan200 toward the destination Win11-2 computer. In both records, the ingress interface is Ethernet 0/3 and the egress interface is Vlan200, so the single flow is counted two times. This produced the measured throughput of 506.2 Mbps shown above when in reality it was only \~213 Mbps.

![image](https://user-images.githubusercontent.com/43585378/212579509-4c8f6055-a097-480c-9870-b4cb7388fc33.png)
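The effect of monitoring both directions can be sketched in a few lines. The record type, addresses, and byte count below are hypothetical, and the sketch makes the simplifying assumption that both monitors report identical byte counts for the flow; it only illustrates how summing the two records inflates the total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    src: str
    dst: str
    ingress_if: str
    egress_if: str
    monitor_direction: str  # direction of the monitor that produced the record
    byte_count: int


def total_bytes(records):
    """Sum bytes the way a dashboard aggregation would."""
    return sum(r.byte_count for r in records)


flow_bytes = 1_000_000  # hypothetical size of the single real flow

# The same flow reported twice: once by the input monitor on Ethernet0/3
# and once by the output monitor on Vlan200. Both records carry the same
# ingress and egress interfaces.
records = [
    FlowRecord("203.0.113.5", "198.51.100.7", "Ethernet0/3", "Vlan200",
               "input", flow_bytes),
    FlowRecord("203.0.113.5", "198.51.100.7", "Ethernet0/3", "Vlan200",
               "output", flow_bytes),
]

# Keeping only input-direction records mirrors monitoring input only.
input_only = [r for r in records if r.monitor_direction == "input"]

print(total_bytes(records))     # 2000000
print(total_bytes(input_only))  # 1000000
```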

To correct this, we remove the output configuration and apply the monitor only in the input direction on all of these interfaces.

![image](https://user-images.githubusercontent.com/43585378/212757255-885d604a-b92d-4e0a-b549-b4801cdefe15.png)

Now when we run another speed test, we see the correct throughput:

![image](https://user-images.githubusercontent.com/43585378/212757889-12d98d5b-ca17-4c40-a260-de4831f3dc74.png)

