
How a CTO Finds Slow Queries and Fixes Them Before Finance Escalates
NeonEdge is a crypto-native iGaming platform based in Tallinn, Estonia, built around provably fair crash games with multi-chain support across ETH, Tron, Polygon, and SOL. The platform serves roughly fifteen thousand monthly active users and runs approximately $5M per week in gross gaming revenue. It is a lean, technically ambitious shop — two engineers own the entire data infrastructure — and Priya Desai, CTO and Head of Data, is the person both product and finance call when anything moves slower than expected.
Products used: Query Performance Analytics, Infrastructure Monitor, Cost Optimization
20 minutes | full investigation time
3 | problem queries identified from a cold start
12x | average speedup after optimizations were deployed
Challenge
The complaints arrived within the same forty-eight hours, from two different directions. The product team filed a Slack thread saying the player activity reports felt sluggish — "sometimes you click and wait, then give up." Finance was blunter: the weekly GGR reconciliation report had started timing out mid-load and the CFO had resorted to screenshotting a half-rendered page before it crashed. No error codes, no obvious failures — just slowness that had quietly crossed from annoying to broken.
Priya's standard diagnostic toolkit for this kind of problem was not fast. Tracing query latency across a multi-chain data stack meant correlating logs from three separate services, checking cluster resource utilization manually, and reconstructing which upstream transformation was adding time to which downstream report. On a good day with the right logs already surfaced, this took two to three hours. On a bad day — the kind where the slow query only appears under load — it could stretch to a full afternoon.
The structural problem was that NeonEdge's data stack had grown organically. Early engineering decisions — sensible at five thousand MAU — had not been revisited as the platform crossed fifteen thousand. Nobody had made a deliberate choice to skip indexes or write a table scan into a core report; it had just happened, gradually, while the team was focused on shipping features rather than auditing data access patterns.
"We're a two-person data team running infrastructure for fifteen thousand users and five million a week in GGR. I don't have the luxury of spending a full day hunting a slow query. I need the bottleneck on my screen in twenty minutes, or the problem sits until next sprint."
— Priya Desai, CTO, NeonEdge
Solution
Priya opened Gaming Mind AI and described the symptom in plain terms: reports are slow, nobody knows which query is the culprit, and the problem has been escalating for two days. Gaming Mind connected to NeonEdge's infrastructure telemetry and began with the most direct diagnostic question — which queries are consuming the most execution time across the stack right now.
Here's how the investigation unfolded:
Priya: "Reports are slow across the board. Where should I look first?"
| Rank | Query Name | Avg Execution Time | p99 Time | Runs/Day | Caller | % of Total Query Time |
|---|---|---|---|---|---|---|
| 1 | GGR Reconciliation Report | 41.2 sec | 68.4 sec | 4 | Finance | 24% |
| 2 | Player Activity Report | 31.0 sec | 54.1 sec | 8 | Product | 22% |
| 3 | Player Cohort Report | 28.3 sec | 47.6 sec | 6 | Product / CRM | 12% |
| 4 | Affiliate Revenue Attribution | 5.8 sec | 11.2 sec | 12 | Marketing | 4% |
| 5 | Daily Active User Summary | 5.1 sec | 9.7 sec | 24 | Ops | 4% |
| 6 | Chain Settlement Reconciliation | 4.9 sec | 9.1 sec | 6 | Finance | 3% |
| 7 | Churn Risk Score Refresh | 4.4 sec | 8.3 sec | 2 | CRM | 2% |
| 8 | Wallet Balance Snapshot | 3.8 sec | 7.2 sec | 48 | Ops | 2% |
| 9 | Bonus Utilisation Report | 3.2 sec | 6.4 sec | 4 | Product | 1% |
| 10 | New Registrations Funnel | 2.9 sec | 5.8 sec | 24 | Marketing | 1% |
| (all others) | — | < 2.0 sec | < 4.0 sec | — | Various | 25% |
| Total top 3 | — | ~100 sec combined | — | — | — | 58% |
| Total top 10 | — | — | — | — | — | 71% |
⚠️ Gaming Mind flags: The top 3 queries account for 58% of total query execution time. The distribution is strikingly uneven — queries 1–3 average 28–41 seconds each while queries 4–10 average under 6 seconds each. Fixing the top 3 will reduce total query time by an estimated 58% without touching anything else.
Gaming Mind's first response was a ranked latency leaderboard covering the previous seven days of query execution. The top ten slowest queries accounted for 71% of total query execution time, but the distribution was strikingly uneven — the three worst offenders averaged 28 to 41 seconds apiece, while numbers four through ten each averaged under six seconds. Gaming Mind flagged the top three as the only ones worth investigating immediately: fixing them would reduce total query time by an estimated 58% without touching anything else. Priya had her starting point in under ninety seconds.
Priya: "Tell me about the slowest one."
Query: GGR Reconciliation Report
| Stage | Operation | Rows Scanned | Rows Output | Stage Time | Cumulative Time |
|---|---|---|---|---|---|
| 1 | Full table scan — transaction_ledger | 84,200,000 | 84,200,000 | 28.4 sec | 28.4 sec |
| 2 | Filter: this week's records | 84,200,000 | 312,400 | 6.1 sec | 34.5 sec |
| 3 | Join: chain metadata | 312,400 | 312,400 | 2.8 sec | 37.3 sec |
| 4 | Aggregate: GGR by chain + game type | 312,400 | 48 | 1.9 sec | 39.2 sec |
| 5 | Format output | 48 | 48 | 2.0 sec | 41.2 sec |
| Diagnostic | Detail |
|---|---|
| Data date range scanned | Platform launch (18 months) to present |
| Required date range | Current week only |
| Date filter applied before scan | No |
| Access pattern classification | Unfiltered historical scan |
| Severity | 🔴 High — known performance antipattern in append-heavy ledger architectures |
| Root cause | Missing WHERE date predicate before table scan — design decision from 18 months ago never revisited |
⚠️ Gaming Mind flags: The GGR Reconciliation Report — the exact query Finance complained about — is doing a full table scan across 18 months of transaction history to answer a question that only needs the current week. Applying a date filter before the scan is the single fix required. This is the root cause of the CFO's report timeout.
The worst query was the GGR reconciliation report — exactly what Finance had complained about. Gaming Mind broke down its execution plan stage by stage: a full table scan was touching every row in the transaction ledger, including historical records stretching back to platform launch, every single time the report ran. The query had no date filter applied before the scan, which meant it was processing nearly eighteen months of transaction history to answer a question that only needed the current week. Gaming Mind classified this as a high-severity access pattern and labeled it an unfiltered historical scan — a known performance antipattern in append-heavy ledger architectures. The root cause was one design decision from eighteen months ago that nobody had revisited.
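The shape of the fix can be sketched in miniature. The schema and column names below are assumptions for illustration, not NeonEdge's actual tables; the point is that pushing a date predicate into the WHERE clause lets the engine resolve the report with an index range search instead of a full ledger scan.

```python
import sqlite3

# Illustrative sketch of the date-predicate fix. Table and column names
# are assumptions, not NeonEdge's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transaction_ledger (
        tx_id     INTEGER PRIMARY KEY,
        tx_date   TEXT NOT NULL,   -- ISO date, e.g. '2024-06-03'
        chain     TEXT NOT NULL,
        game_type TEXT NOT NULL,
        ggr       REAL NOT NULL
    )
""")
conn.execute("CREATE INDEX idx_ledger_date ON transaction_ledger (tx_date)")

# BEFORE: no date predicate, so every run touches the whole ledger.
before = "SELECT chain, game_type, SUM(ggr) FROM transaction_ledger GROUP BY chain, game_type"

# AFTER: restrict to the current week (Monday onward) before anything else,
# so the planner can use a range search on idx_ledger_date.
after = """
    SELECT chain, game_type, SUM(ggr)
    FROM transaction_ledger
    WHERE tx_date >= date('now', 'weekday 0', '-6 days')
    GROUP BY chain, game_type
"""

plan_before = conn.execute("EXPLAIN QUERY PLAN " + before).fetchall()
plan_after  = conn.execute("EXPLAIN QUERY PLAN " + after).fetchall()
print(plan_before)  # plan contains a full SCAN of transaction_ledger
print(plan_after)   # plan contains a SEARCH via idx_ledger_date
```

The same principle applies regardless of engine: a predicate the planner can see before the scan turns 18 months of I/O into one week's worth.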
Priya: "What's happening with the second one?"
Query: Player Activity Report
| Stage | Operation | Input Rows | Output Rows | Stage Time | Cumulative Time |
|---|---|---|---|---|---|
| 1 | Scan: player_sessions | 2,100,000 | 2,100,000 | 4.2 sec | 4.2 sec |
| 2 | Join: game_events (wide, before filter) | 2,100,000 | 18,700,000 | 12.8 sec | 17.0 sec |
| 3 | Join: wallet_activity | 18,700,000 | 18,700,000 | 7.1 sec | 24.1 sec |
| 4 | Filter: player segment + date range | 18,700,000 | 480,000 | 4.6 sec | 28.7 sec |
| 5 | Aggregate + format | 480,000 | 920 | 2.3 sec | 31.0 sec |
| Diagnostic | Detail |
|---|---|
| Bottleneck stage | Stage 2 — join expands before filter narrows |
| Intermediate result set peak | 18,700,000 rows (~3x necessary size) |
| Cause | Join order places broad join before narrowest filter |
| Fix | Apply player segment + date filter before join with game_events |
| Estimated post-fix intermediate size | ~6,200,000 rows |
| Estimated post-fix execution time | < 5 sec |
| Memory reduction | ~66% |
⚠️ Gaming Mind flags: The Player Activity Report's join is expanding before it filters — producing an intermediate result set nearly 3x larger than necessary. Reversing two steps in the join sequence (applying the narrowest filter first) will reduce intermediate memory consumption by roughly two-thirds and bring execution time from 31 seconds to under 5.
The second problem query fed the player activity report the product team had flagged. Gaming Mind identified a multi-stage join that was expanding before it filtered — processing a wide intermediate result set at peak size before narrowing it down by player segment and date range. The join order was producing intermediate tables nearly three times larger than necessary. Gaming Mind annotated the exact stage where the result set ballooned, and estimated that reversing two steps in the join sequence — applying the narrowest filter first — would reduce intermediate memory consumption by roughly two-thirds and bring execution time from thirty-one seconds to under five.
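The join-order problem is easy to reproduce at toy scale. The data below is synthetic (players 0–99, ten events each, every tenth player in a "vip" segment); it simply shows that filtering before the join returns the same answer with a much smaller intermediate result, which is the essence of the rewrite.

```python
# Synthetic sessions and events, small enough to count by hand.
sessions = [{"player": p, "segment": "vip" if p % 10 == 0 else "std"} for p in range(100)]
events = [{"player": i % 100, "event_id": i} for i in range(1000)]

def join_then_filter():
    # BEFORE: the wide join runs first, so the intermediate set holds every
    # matching (session, event) pair before the segment filter narrows it.
    joined = [(s, e) for s in sessions for e in events if s["player"] == e["player"]]
    peak = len(joined)
    return peak, [row for row in joined if row[0]["segment"] == "vip"]

def filter_then_join():
    # AFTER: apply the narrowest filter first, then join only what survives.
    vip = [s for s in sessions if s["segment"] == "vip"]
    joined = [(s, e) for s in vip for e in events if s["player"] == e["player"]]
    return len(joined), joined

peak_a, result_a = join_then_filter()
peak_b, result_b = filter_then_join()
assert len(result_a) == len(result_b)  # same answer either way
print(peak_a, peak_b)  # 1000 vs 100 intermediate rows at peak
```

At NeonEdge's scale the same reordering shrinks an 18.7M-row intermediate set to an estimated 6.2M rows.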
Priya: "And the third?"
Query: Player Cohort Report
| Column | Individual Index Exists | Part of Composite Index | Cardinality | Time Spent Resolving Intersection |
|---|---|---|---|---|
| chain_identifier | Yes | No | Low (4 values) | — |
| registration_date | Yes | No | High (540 days) | — |
| game_category | Yes | No | Medium (12 values) | — |
| chain_identifier + registration_date + game_category | No | No | — | ~19 sec per execution |
| Diagnostic | Detail |
|---|---|
| Missing index type | Composite index on (chain_identifier, registration_date, game_category) |
| Current resolution method | Manual intersection at query time |
| Estimated execution time with composite index | < 3 sec |
| Queries across platform using this column combination | 14 distinct queries |
| Additional queries accelerated by adding 1 composite index | 13 |
| Severity | 🔴 High — single fix, platform-wide impact |
⚠️ Gaming Mind flags: The three columns that appear in every version of this query — chain_identifier, registration_date, and game_category — are each indexed individually but never as a composite. Adding one composite index will eliminate the manual intersection work and accelerate the cohort report plus 13 other queries across the platform simultaneously.
The third slow query was the player cohort report, used weekly by both product and the CRM team. Gaming Mind surfaced an index coverage gap: three columns that appeared together in every version of this query — chain identifier, registration date, and game category — were each indexed individually but never as a composite. Every execution was resolving the intersection manually at query time, doing work that a single composite index would have eliminated entirely. Gaming Mind noted that this column combination appeared in fourteen distinct queries across the platform, meaning a single index addition would accelerate the cohort report and thirteen other queries simultaneously.
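A minimal sketch of the index fix, using the column names from the diagnostic table above on an assumed schema: with one composite index in place, the query plan becomes a single index search instead of the per-column intersection work described above.

```python
import sqlite3

# Sketch of the composite-index fix; column names follow the diagnostic
# table above, but the schema itself is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE player_cohorts (
        player_id         INTEGER,
        chain_identifier  TEXT,
        registration_date TEXT,
        game_category     TEXT
    )
""")
# One composite index covering the three columns every cohort query filters on.
conn.execute("""
    CREATE INDEX idx_cohort_composite
    ON player_cohorts (chain_identifier, registration_date, game_category)
""")

query = """
    SELECT COUNT(*) FROM player_cohorts
    WHERE chain_identifier = 'ETH'
      AND registration_date >= '2024-01-01'
      AND game_category = 'crash'
"""
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
# The plan shows a SEARCH on idx_cohort_composite rather than a full scan.
print(plan)
```

Column order matters in a composite index: leading with the equality-filtered, low-cardinality column keeps the index usable for the range predicate on registration_date that follows it.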
Priya: "Show me resource consumption across the cluster while these are running."
14-Day Resource Utilization — Peak Scheduled Report Windows
| Date | Report Window | CPU Peak | Memory Peak | I/O Peak | Concurrent Slow Queries | Queue Cascade? |
|---|---|---|---|---|---|---|
| Day 1 | 09:00–09:45 | 74% | 71% | 68% | 1 | No |
| Day 2 | 09:00–09:45 | 78% | 76% | 72% | 1 | No |
| Day 3 | 09:00–09:45 | 81% | 79% | 75% | 2 | No |
| Day 4 | 09:00–09:45 | 76% | 74% | 71% | 1 | No |
| Day 5 | 09:00–09:45 | 82% | 89% | 84% | 2 | Yes |
| Day 6 | 09:00–09:45 | 77% | 75% | 73% | 1 | No |
| Day 7 | 09:00–09:45 | 79% | 77% | 74% | 1 | No |
| Day 8 | 09:00–09:45 | 83% | 89% | 85% | 2 | Yes |
| Day 9 | 09:00–09:45 | 75% | 73% | 70% | 1 | No |
| Day 10 | 09:00–09:45 | 84% | 91% | 87% | 2 | Yes |
| Day 11–14 | 09:00–09:45 | 72–78% | 70–76% | 67–73% | 0–1 | No |
Risk Summary
| Metric | Value |
|---|---|
| Memory ceiling during concurrent slow query runs | 89–91% |
| Cascade events in past 10 days | 3 |
| Cascade trigger | GGR Reconciliation + Player Activity running concurrently |
| Risk classification | 🔴 Cluster stability risk |
⚠️ Gaming Mind flags: Every CPU and I/O utilization spike coincides with scheduled report runs. On 3 occasions in the past 10 days, concurrent execution of the GGR Reconciliation and Player Activity reports pushed memory above 89%, triggering queue delays that cascaded to other workloads — including the CFO's report timeout. This is not just a slow query problem. It is a cluster stability risk.
Gaming Mind overlaid the three slow query windows onto NeonEdge's cluster resource heatmap for the past two weeks. The pattern was unmistakable: every spike in CPU and I/O utilization coincided with scheduled report runs, and the cluster was hitting near-capacity on memory during concurrent execution of the first two queries. On three occasions in the past ten days, concurrent execution of the GGR reconciliation and player activity reports had pushed memory utilization above 89%, triggering query queue delays that cascaded to other workloads — including the Finance timeout the CFO had experienced. This wasn't just a slow query problem. It was a cluster stability risk.
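The cascade condition is mechanical enough to express directly. The sketch below hard-codes the peaks from the 14-day table (days 1–10) and flags the days where both heavy reports overlap while memory sits at or above the 89% ceiling; the threshold is read off the table, not from any Gaming Mind API.

```python
# Daily peaks transcribed from the 14-day utilization table (days 1-10).
days = [
    # (day, memory_peak_pct, concurrent_slow_queries)
    (1, 71, 1), (2, 76, 1), (3, 79, 2), (4, 74, 1), (5, 89, 2),
    (6, 75, 1), (7, 77, 1), (8, 89, 2), (9, 73, 1), (10, 91, 2),
]

# A cascade occurs when both heavy reports overlap AND memory reaches the
# 89% ceiling -- the trigger Gaming Mind identified.
cascades = [day for day, mem, concurrent in days if concurrent >= 2 and mem >= 89]
print(cascades)  # [5, 8, 10] -- the three "Queue Cascade? Yes" rows
```

Day 3 is the instructive near-miss: two slow queries overlapped, but memory peaked at 79% and no cascade fired, which is why the trigger is the conjunction of both conditions rather than either alone.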
Priya: "How do I prioritize these three fixes? Which one do I do first?"
| Fix | Query | Estimated Speedup | Implementation Effort | Deployment Risk | Queries Impacted | Recommended Order |
|---|---|---|---|---|---|---|
| Add composite index (chain_id + reg_date + game_cat) | Player Cohort Report | ~9x | Low (< 1 hr) | Very low | 14 queries | 1st — deploy today |
| Rewrite join order (filter before expand) | Player Activity Report | ~6x | Medium (3–4 hr) | Low (test in staging) | 1 query | 2nd — deploy after staging |
| Add date predicate before table scan | GGR Reconciliation | ~12x | Medium (2–3 hr) | Medium (Finance schedule dependency) | 1 query + caching eligibility | 3rd — next Finance maintenance window |
Scoring Detail
| Fix | Speedup Score | Effort Score | Risk Score | Impact Breadth Score | Total Priority Score |
|---|---|---|---|---|---|
| Composite index | 3/5 | 5/5 | 5/5 | 5/5 | 18/20 |
| Join rewrite | 4/5 | 3/5 | 4/5 | 2/5 | 13/20 |
| Date predicate | 5/5 | 3/5 | 3/5 | 3/5 | 14/20 |
⚠️ Gaming Mind flags: Deploy the composite index first — lowest risk, fastest to implement, broadest positive impact across 14 platform queries. The Player Activity join rewrite should be validated in staging before it ships. The GGR date-predicate fix carries the highest individual speedup (estimated 12x) but must be coordinated with Finance's maintenance window before deployment.
Gaming Mind produced a priority matrix scoring each fix on four dimensions: estimated speedup, implementation complexity, deployment risk, and breadth of downstream impact. The composite index was ranked first — lowest risk, fastest to deploy, and the broadest positive impact across the platform's fourteen affected queries. The Player Activity join rewrite was ranked second: a solid six-fold improvement, but one that required careful testing in staging before deployment. The unfiltered table scan fix was ranked third despite carrying the highest individual speedup, an estimated twelve-fold improvement, because it involved a date-filter change to a report Finance ran on a fixed weekly schedule; the deployment-window constraint, not the raw priority score, pushed it last in the sequence. Gaming Mind recommended deploying the index first, testing the join rewrite in staging, and scheduling the table scan fix for the next Finance maintenance window.
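The scoring itself is a simple sum over the four dimension scores in the Scoring Detail table; the equal weighting and plain sum are assumptions about how the /20 totals are formed, since Gaming Mind's actual formula isn't shown. Note that the raw totals place the date-predicate fix slightly above the join rewrite, so the recommended deploy order also reflects the Finance scheduling constraint.

```python
# Dimension scores transcribed from the Scoring Detail table. Equal
# weighting and a plain sum are assumptions, not a documented formula.
fixes = {
    "composite_index": {"speedup": 3, "effort": 5, "risk": 5, "breadth": 5},
    "join_rewrite":    {"speedup": 4, "effort": 3, "risk": 4, "breadth": 2},
    "date_predicate":  {"speedup": 5, "effort": 3, "risk": 3, "breadth": 3},
}

totals = {name: sum(dims.values()) for name, dims in fixes.items()}
print(totals)  # {'composite_index': 18, 'join_rewrite': 13, 'date_predicate': 14}
```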
Priya: "What's the estimated total speedup if I fix all three?"
Projected Post-Optimization Performance
| Query | Before | After | Speedup |
|---|---|---|---|
| GGR Reconciliation Report | 41.2 sec | 3.4 sec | ~12x |
| Player Activity Report | 31.0 sec | 4.8 sec | ~6x |
| Player Cohort Report | 28.3 sec | 3.1 sec | ~9x |
| Combined avg report generation | 45.0 sec | < 4.0 sec | > 11x |
Cluster Resource Headroom
| Metric | Before Fixes | After Fixes | Change |
|---|---|---|---|
| Memory utilization (concurrent report runs) | 89% (ceiling) | ~34% | -55pp |
| Queue cascade events per 10 days | 3 | 0 (projected) | Eliminated |
| CFO report timeout risk | Active | None | Eliminated |
Secondary Benefit — GGR Reconciliation Caching
| Detail | Value |
|---|---|
| Post-fix caching eligibility | Yes (date predicate enables pre-computation) |
| Estimated monthly compute cost reduction | ~60% |
| Module flagging this benefit | Cost Optimization |
⚠️ Gaming Mind flags: The three fixes combined are projected to reduce average report generation time from 45 seconds to under 4 — a 90%+ reduction. Cluster memory during peak scheduled runs will drop from 89% to ~34%, eliminating the cascade queue risk entirely. The GGR reconciliation fix also unlocks pre-computation caching eligibility, projected to cut that report's monthly compute spend by approximately 60%.
Gaming Mind modeled the combined effect. The three fixes together were projected to reduce average report generation time from forty-five seconds to under four — a reduction of over ninety percent. Cluster memory utilization during peak scheduled runs was expected to drop from the current 89% ceiling to around 34%, eliminating the queue cascade risk entirely. The projection also flagged a secondary benefit: with the unfiltered table scan resolved, the GGR reconciliation report would become eligible for pre-computation caching, which Gaming Mind's Cost Optimization module estimated would reduce compute spend on that report by approximately sixty percent on a monthly basis.
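The per-query speedups are simply before/after ratios from the projection table, which a few lines of arithmetic confirm:

```python
# Before/after execution times from the projection table (seconds).
before_after = {
    "ggr_reconciliation": (41.2, 3.4),
    "player_activity":    (31.0, 4.8),
    "player_cohort":      (28.3, 3.1),
}

speedups = {name: round(before / after, 1) for name, (before, after) in before_after.items()}
print(speedups)  # roughly 12.1x, 6.5x, and 9.1x respectively
```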
"I walked in expecting to spend the afternoon in log files. Gaming Mind had the three queries ranked, profiled, and prioritized in twenty minutes. It even told me the order to fix them in. I just needed to write the code."
— Priya Desai
Results
Investigation completed in 20 minutes from a cold start
Priya had no pre-built dashboards for query performance and no open incident to trace from. Gaming Mind pulled the telemetry, ranked the offenders, and produced a deployment-ordered fix list in a single conversation. Zero log files opened, zero support tickets filed, zero engineers pulled from other work.
Three root causes identified across three different problem types
Each slow query had a structurally distinct cause — an unfiltered historical scan, a suboptimal join order, and a missing composite index. Gaming Mind diagnosed all three and explained each in plain terms Priya could communicate directly to the engineering team without translation. The investigation surfaced problems that had been accumulating for months, not just the symptoms reported over the previous forty-eight hours.
12x average speedup after fixes were deployed
The composite index went live the same afternoon. The join rewrite passed staging tests by end of day and was deployed the following morning. The table scan fix was coordinated with Finance and deployed in the next maintenance window. After all three changes, average report generation time dropped from forty-five seconds to under four seconds — a twelve-fold improvement measured against the platform's own telemetry.
Cluster stability risk eliminated before it became an incident
The memory utilization ceiling — which had been silently pushing 89% during concurrent report runs — dropped to 34% after the fixes. The three cascade events that had occurred over the previous ten days were the early warning sign of a cluster that was approaching failure under load. Gaming Mind surfaced this risk from the resource heatmap analysis; without it, the next incident would likely have been a full report outage during a high-traffic weekend session.
Monthly compute cost projected to fall by 60% for the reconciliation report
The Cost Optimization module flagged the post-fix eligibility of the GGR reconciliation report for pre-computation caching. Priya's team implemented the caching layer two weeks after the initial fixes, and the following month's infrastructure bill for that report workload came in at thirty-eight percent of the prior baseline — slightly better than projected.
"The cluster was three bad Sundays away from a real outage and we didn't know it. The performance investigation found the slow queries, but the resource heatmap found the stability risk. That's the part that actually scared me — and the part I'm most glad we caught before it caught us."
— Priya Desai, CTO, NeonEdge
Want to see how Gaming Mind AI can help your operation?
Get a Demo