# “Why Is the App Slow?” Troubleshooting Guide

**Dashboards: `kibana-8.14.x-flow-codex.ndjson`**

This workflow helps you decide whether slowness comes from:

* **a DoS / DDoS flood** (external or internal)
* **an organic spike** in legitimate utilization (application or network)
* **mis-marked or re-marked DSCP** values starving the traffic of QoS

All filtering uses dropdown **input-list** controls; each choice becomes a blue filter pill that persists as you pivot between dashboards.

***

### 0 Set the Scene

1. Open **Analytics → Dashboard** in Kibana.
2. **Time-picker** ➟ choose a range that includes the slow period (e.g. **“Last 60 minutes”**).

Dashboard rail (left → right): *Overview | Top-N | Core Services | Threats | Flows | Graph | Geo IP | AS Traffic | Exporters | **Traffic Details** | Flow Records*

***

### 1 Top-N → **Top Applications** (Verify the slowdown & scope)

*Menu:* **Top-N ▸ ElastiFlow (flow): Top Applications**

| #     | What to do                                                                                   | Why                                                                                                        |
| ----- | -------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| **1** | **Exporter / Locality / Application – input list** ➟ type & tick the **app/port**, **Apply** | Focus every board on the app                                                                               |
| **2** | **Throughput / Applications (bits/s)**                                                       | • **Sharp rise** = flood or organic surge • **No rise but high latency** = check retransmissions (§ 3)     |
| **3** | **Top Clients** / **Top Servers** tables                                                     | • One client dominating = internal DoS • A very large number of distinct source IPs = external DDoS        |
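
The "one client dominating" versus "many sources" reading in row 3 can also be checked numerically on exported table data. The following is a minimal sketch, not part of the dashboards: the function name `classify_talkers`, the `(client_ip, bytes)` input shape, and both thresholds (`dominance`, `many_sources`) are illustrative assumptions that you would tune to your own baseline.

```python
from collections import Counter


def classify_talkers(flows, dominance=0.5, many_sources=1000):
    """Classify the client distribution of an application's traffic.

    flows: iterable of (client_ip, bytes) tuples (hypothetical export format).
    dominance: assumed byte share at which one client counts as dominating.
    many_sources: assumed count of distinct clients that suggests a distributed source.
    """
    per_client = Counter()
    for client_ip, nbytes in flows:
        per_client[client_ip] += nbytes

    total = sum(per_client.values())
    if total == 0:
        return "no traffic"

    top_ip, top_bytes = per_client.most_common(1)[0]
    if top_bytes / total >= dominance:
        return f"single client dominates ({top_ip})"
    if len(per_client) >= many_sources:
        return "many distinct sources"
    return "no clear pattern"


if __name__ == "__main__":
    sample = [("10.0.0.5", 900), ("10.0.0.6", 100)]
    print(classify_talkers(sample))
```

A result of "single client dominates" corresponds to the internal-DoS reading above, while "many distinct sources" corresponds to the external-DDoS reading; neither is conclusive without the Threats check in § 2.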

***

### 2 Threats → **Threats (TCP / UDP DDoS)** (Is it an attack?)

*Menu:* **Threats ▸ ElastiFlow (flow): Threats (TCP)** (repeat for UDP if needed)

| Panel                      | What it means                                               |
| -------------------------- | ----------------------------------------------------------- |
| **TCP DDoS Events** (bar)  | Spikes = SYN-, ACK- or RST-flood                            |
| **Top Attack IPs / Ports** | • Many IPs = volumetric DDoS • Few internal IPs = rogue job |
| **Attack Type** donut      | Confirms vector (SYN, UDP, ICMP …)                          |

*If charts light up, escalate to SecOps/NOC with vectors & sources. If empty, continue.*

***

### 3 Traffic Details → **Traffic Details (attributes)** (Are sessions retransmitting, stalling, or DSCP-mis-marked?)

*Menu:* **Traffic Details ▸ ElastiFlow (flow): Traffic Details (attributes)** *(blue filter pill from § 1 already applied)*

| Panel / Control                                                      | What to do                                                                     | What it tells you                                                                                                                         |
| -------------------------------------------------------------------- | ------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------- |
| **VLAN / DSCP / TCP Flags – input list**                             | ► Tick the **expected DSCP value** for the app (e.g. `EF`, `AF21`) ➟ **Apply** | Adds a DSCP pill so all charts show only that code-point                                                                                  |
| **DSCP Values (flow records)** donut, **Throughput / DSCP (bits/s)** | Inspect                                                                        | • **Expected DSCP absent / tiny** = **app not marking** correctly • Large slice, then drops to `CS0` down-stream = **network remarking** |
| **TCP Flags** donut                                                  | Inspect                                                                        | • Excess **SYN+RST** ⇒ SYN-flood blocked • Excess **ACK+PSH** with low bytes ⇒ app window shrink                                          |
| **Session Established** metric                                       | Inspect                                                                        | • **False + surge** ⇒ handshake back-pressure (network saturation) • **True + high latency** ⇒ server busy / code bottleneck              |

**Tip:** Clear the DSCP pill (❌) before moving on if you want to restore the full traffic view.
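
To put a number on "expected DSCP absent / tiny", the share of bytes carrying the expected code-point can be computed from exported flow data. This is a sketch under stated assumptions: the function name `dscp_share` and the `(dscp_value, bytes)` input shape are hypothetical, and only the three code-points named in this guide are mapped (their decimal values are the standard ones: `CS0` = 0, `AF21` = 18, `EF` = 46).

```python
# Standard DSCP decimal values for the code-points mentioned in this guide.
DSCP_VALUES = {"CS0": 0, "AF21": 18, "EF": 46}


def dscp_share(flows, expected):
    """Return the fraction of bytes marked with the expected DSCP name.

    flows: iterable of (dscp_value, bytes) tuples (hypothetical export format).
    expected: a key of DSCP_VALUES, e.g. "EF".
    """
    want = DSCP_VALUES[expected]
    total = 0
    matched = 0
    for dscp, nbytes in flows:
        total += nbytes
        if dscp == want:
            matched += nbytes
    # Avoid dividing by zero when the filter matched no traffic.
    return matched / total if total else 0.0


if __name__ == "__main__":
    sample = [(46, 800), (0, 200)]
    print(f"EF share: {dscp_share(sample, 'EF'):.0%}")
```

Running the same calculation on data from the exporter nearest the server and from one further down-stream is one way to tell "app not marking" (low share at both) from "network remarking" (high share near the source, low share later).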

***

### 4 Flow Records → **Flow Records (src/dst)** (Volume & burst analysis)

*Menu:* **Flow Records ▸ ElastiFlow (flow): Flow Records (src/dst)**

| Check                             | Why it matters                                                            |
| --------------------------------- | ------------------------------------------------------------------------- |
| **Flow Records/s (src/dst)** line | • Tall, narrow spikes = short-lived DoS • Wide plateau = legitimate surge |
| **Flow Record Count** metric      | Quantifies burst vs baseline                                              |
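
The spike-versus-plateau distinction can be approximated on a series of flow-record rates. The sketch below assumes evenly spaced samples and a known baseline; the name `classify_burst`, the `factor` multiplier, and the `plateau_fraction` cut-off are illustrative assumptions rather than values taken from the dashboards.

```python
def classify_burst(rates, baseline, factor=3.0, plateau_fraction=0.5):
    """Label a rate series as a narrow spike, a wide plateau, or normal.

    rates: flow records/s samples at a fixed interval (hypothetical export).
    baseline: typical rate for this app outside the slow period.
    factor: assumed multiple of baseline that counts as elevated.
    plateau_fraction: assumed share of the window that the longest
        elevated run must cover to count as a plateau.
    """
    if not rates:
        return "no data"

    threshold = baseline * factor
    longest = 0
    run = 0
    for rate in rates:
        # Track the longest consecutive stretch above the threshold.
        run = run + 1 if rate > threshold else 0
        longest = max(longest, run)

    if longest == 0:
        return "within baseline"
    if longest / len(rates) >= plateau_fraction:
        return "wide plateau"
    return "narrow spike"


if __name__ == "__main__":
    print(classify_burst([100, 100, 900, 100, 100, 100], baseline=100))
    print(classify_burst([100, 800, 850, 900, 870, 100], baseline=100))
```

A "narrow spike" result matches the short-lived DoS reading in the table, and a "wide plateau" matches the legitimate-surge reading; both depend heavily on the chosen time window.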

***

### 5 Flows → **Flows (src/dst)** (Pinpoint offenders & symmetry)

*Menu:* **Flows ▸ ElastiFlow (flow): Flows (src/dst)**

| #     | Action                                                                       | Meaning                                                                                                            |
| ----- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| **1** | (Optional) **Src/Dst – input list** ➟ server IP or suspect client, **Apply** | Focus graph                                                                                                        |
| **2** | **Sankey: Flows (src/dst)**                                                  | • Many thin inbound edges = botnet DDoS • One thick edge = internal flood • Balanced edges + high volume = organic |

***

### 6 Geo IP → **Geo Location (src/dst)** (External vs internal)

*Menu:* **Geo IP ▸ ElastiFlow (flow): Geo Location (src/dst)**

*Map & donuts reveal whether sources are worldwide (DDoS) or a few corporate sites (organic / internal DoS).*

***

### 7 Exporters → **Flow Exporters (traffic)** (Collector / exporter health)

*Menu:* **Exporters ▸ ElastiFlow (flow): Flow Exporters (traffic)**

| Panel                             | Use it for                                                    |
| --------------------------------- | ------------------------------------------------------------- |
| **Exporter Throughput (bits/s)**  | Confirm exporter links aren’t saturating                      |
| **Exporter Packet Drop** counters | High drops = visibility loss; real congestion could be higher |
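
The remark that real congestion could be higher than what the charts show can be illustrated with simple arithmetic. This sketch assumes the drop counters can be read as "packets received" and "packets dropped" and that drops are spread evenly across traffic; both are assumptions, and the function name `adjusted_throughput` is hypothetical.

```python
def adjusted_throughput(observed_bps, received, dropped):
    """Scale observed throughput up by the fraction of telemetry lost.

    observed_bps: throughput shown on the dashboard, in bits/s.
    received: flow packets the collector processed (assumed counter).
    dropped: flow packets the collector dropped (assumed counter).
    Assumes drops are uniform, so the result is only a rough estimate.
    """
    if received <= 0:
        raise ValueError("no telemetry received; cannot estimate")
    total = received + dropped
    return observed_bps * total / received


if __name__ == "__main__":
    estimate = adjusted_throughput(800e6, received=8000, dropped=2000)
    print(f"estimated actual throughput: {estimate / 1e6:.0f} Mbit/s")
```

With 20% of flow packets dropped, a chart showing 800 Mbit/s would correspond to roughly 1 Gbit/s under these assumptions.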

***

### 8 Decision Matrix

| Evidence                                                                      | Likely Root Cause              | Immediate Response                      |
| ----------------------------------------------------------------------------- | ------------------------------ | --------------------------------------- |
| **Threats** dashboard alerts + many source IPs                                | External DDoS                  | Engage ISP / enable scrubbing           |
| Threats empty, one internal IP dominates                                      | Internal DoS / runaway job     | Quarantine host, rate-limit             |
| Throughput & Flow-records plateau, balanced edges, Session Established = True | Organic usage spike            | Scale app / infra                       |
| Session Established = False surge, SYN-only spike                             | SYN-flood                      | Enable SYN-cookies / ACL                |
| **Expected DSCP absent at source** (§ 3)                                      | **App not marking QoS**        | Fix DSCP policy on server / container   |
| **DSCP present at source but reset mid-path** (§ 3 + 4)                       | **Network remarking**          | Audit QoS / policy-map on offending hop |
| **Exporter packet drops high**                                                | Collector / ingress bottleneck | Load-balance exporters, add collectors  |
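
For teams that record their observations in a runbook or script, the matrix above can be written as a small rule table. This is a sketch that mirrors the rows of the matrix one-to-one; the `Evidence` field names and the `diagnose` function are hypothetical, and the inputs are judgments you make from the dashboards, not values the dashboards export.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Observations gathered in §§ 1-7 (field names are illustrative)."""
    threat_alerts: bool = False
    many_source_ips: bool = False
    one_internal_ip_dominates: bool = False
    plateau: bool = False
    balanced_edges: bool = False
    session_established: bool = True
    syn_only_spike: bool = False
    expected_dscp_at_source: bool = True
    dscp_reset_mid_path: bool = False
    exporter_drops_high: bool = False


def diagnose(e):
    """Return every (likely root cause, immediate response) row that matches."""
    findings = []
    if e.threat_alerts and e.many_source_ips:
        findings.append(("External DDoS", "Engage ISP / enable scrubbing"))
    if not e.threat_alerts and e.one_internal_ip_dominates:
        findings.append(("Internal DoS / runaway job", "Quarantine host, rate-limit"))
    if e.plateau and e.balanced_edges and e.session_established:
        findings.append(("Organic usage spike", "Scale app / infra"))
    if not e.session_established and e.syn_only_spike:
        findings.append(("SYN-flood", "Enable SYN-cookies / ACL"))
    if not e.expected_dscp_at_source:
        findings.append(("App not marking QoS", "Fix DSCP policy on server / container"))
    if e.expected_dscp_at_source and e.dscp_reset_mid_path:
        findings.append(("Network remarking", "Audit QoS / policy-map on offending hop"))
    if e.exporter_drops_high:
        findings.append(("Collector / ingress bottleneck", "Load-balance exporters, add collectors"))
    return findings


if __name__ == "__main__":
    for cause, response in diagnose(Evidence(threat_alerts=True, many_source_ips=True)):
        print(f"{cause}: {response}")
```

Returning all matching rows rather than the first one is a deliberate choice here, since causes can overlap (for example, an organic spike on a path that also remarks DSCP).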

***

### 9 90-Second Drill (Quick-look order)

1. **Top Applications** – dropdown filter to app; watch throughput & top talkers.
2. **Threats (TCP/UDP)** – confirm / rule out attacks fast.
3. **Traffic Details (attributes)** – handshake health *and* **DSCP correctness**.
4. **Flow Records (src/dst)** – burst vs plateau.
5. **Flows (src/dst)** & **Geo Location** – offender patterns.
6. **Exporters (traffic)** – ensure telemetry itself isn’t the bottleneck.

With these steps, including DSCP validation, you can tell within minutes whether to page the **network QoS engineer**, the **application owner**, or the **security team**.

