474 lines
17 KiB
Markdown
474 lines
17 KiB
Markdown
# MySQL Slow Query Optimization Plan
|
||
|
||
**Source:** `/var/log/mysql/slow.log` — 68,950 total queries, 151,576s total exec time
|
||
**Period:** 2026-04-04 → 2026-04-26 (22 days)
|
||
**Threshold:** 500ms
|
||
**Server:** server3 / database: `skinbase`
|
||
|
||
---
|
||
|
||
## Summary Stats
|
||
|
||
| Metric | Value |
|
||
|---|---|
|
||
| Total queries logged | 68,950 |
|
||
| Unique query fingerprints | 139 |
|
||
| Total execution time | 151,576s (~42h) |
|
||
| Average exec time | 2s |
|
||
| 95th percentile | 3s |
|
||
| Total rows examined | 15.39B |
|
||
| Total bytes sent | 40.79GB |
|
||
|
||
---
|
||
|
||
## Priority 1 — Critical (fix immediately)
|
||
|
||
### P1-A: Correlated subquery counting artworks per tag (Query 8)
|
||
**Total time:** 1,004s · **Calls:** 1,138 · **Rows examined/call:** ~240k
|
||
|
||
**Current query:**
|
||
```sql
|
||
SELECT tags.*,
|
||
(SELECT count(*) FROM artworks
|
||
INNER JOIN artwork_tag ON artworks.id = artwork_tag.artwork_id
|
||
WHERE tags.id = artwork_tag.tag_id AND artworks.deleted_at IS NULL)
|
||
AS artworks_count
|
||
FROM tags
|
||
ORDER BY artworks_count DESC
|
||
LIMIT 10
|
||
```
|
||
|
||
**Problem:** N+1 correlated subquery — one full `artworks JOIN artwork_tag` count per tag row.
|
||
|
||
**Fix options (pick one):**
|
||
1. **Best — cached counter column:** Add `artworks_count INT DEFAULT 0` to `tags`, maintained by an Eloquent observer on `artwork_tag` attach/detach. Query becomes `SELECT * FROM tags ORDER BY artworks_count DESC LIMIT 10` — instant.
|
||
2. **Quick — JOIN + GROUP BY:**
|
||
```sql
|
||
SELECT t.*, COALESCE(cnt.c, 0) AS artworks_count
|
||
FROM tags t
|
||
LEFT JOIN (
|
||
SELECT at.tag_id, COUNT(*) AS c
|
||
FROM artwork_tag at
|
||
JOIN artworks a ON a.id = at.artwork_id AND a.deleted_at IS NULL
|
||
GROUP BY at.tag_id
|
||
) cnt ON cnt.tag_id = t.id
|
||
ORDER BY artworks_count DESC
|
||
LIMIT 10;
|
||
```
|
||
|
||
**Migration needed:** `php artisan make:migration add_artworks_count_to_tags`
|
||
|
||
---
|
||
|
||
### P1-B: Correlated subquery counting artworks per user (Query 33)
|
||
**Total time:** 85s · **Calls:** 131 · **Rows examined/call:** ~61k
|
||
|
||
**Current query:**
|
||
```sql
|
||
SELECT users.*,
|
||
(SELECT count(*) FROM artworks
|
||
WHERE users.id = artworks.user_id AND is_approved=1 AND is_public=1 ...)
|
||
AS artworks_count
|
||
FROM users
|
||
HAVING artworks_count > 0
|
||
ORDER BY artworks_count DESC
|
||
LIMIT 6
|
||
```
|
||
|
||
**Problem:** Same correlated N+1 pattern per user. `HAVING` on a subquery forces a full users scan.
|
||
|
||
**Fix:** Use the existing `user_statistics` table. Add `public_artworks_count INT DEFAULT 0` if not present, maintained by artwork publish/unpublish observer. Then:
|
||
```sql
|
||
SELECT users.*, us.public_artworks_count AS artworks_count
|
||
FROM users
|
||
JOIN user_statistics us ON us.user_id = users.id
|
||
WHERE us.public_artworks_count > 0 AND users.deleted_at IS NULL
|
||
ORDER BY us.public_artworks_count DESC
|
||
LIMIT 6;
|
||
```
|
||
|
||
---
|
||
|
||
### P1-C: Jobs table LIKE scan on JSON payload (Query 38)
|
||
**Total time:** 64s · **Calls:** 58 · **Rows examined/call:** ~55k
|
||
|
||
**Current query:**
|
||
```sql
|
||
SELECT count(*) FROM jobs
|
||
WHERE payload LIKE '%AutoTagArtworkJob%' AND payload LIKE '%69756%'
|
||
```
|
||
|
||
**Problem:** Full scan of the `jobs` table JSON payload column — no index possible on LIKE '%...%'.
|
||
|
||
**Fix:** Replace this deduplication check with a Redis key or a dedicated `job_dedup` table with an indexed `(job_class, subject_id)` column. Example:
|
||
```php
|
||
// Instead of scanning jobs table:
|
||
if (Cache::has("auto-tag-queued:{$artworkId}")) return;
|
||
Cache::put("auto-tag-queued:{$artworkId}", true, now()->addHours(1));
|
||
AutoTagArtworkJob::dispatch($artwork);
|
||
```
|
||
|
||
---
|
||
|
||
### P1-D: `SELECT DISTINCT artwork_id` from snapshots with no index (Query 12)
|
||
**Total time:** 923s · **Calls:** 280 · **Rows examined/call:** ~3.97M (max 8.3M!)
|
||
|
||
**Current query:**
|
||
```sql
|
||
SELECT DISTINCT artwork_id
|
||
FROM artwork_metric_snapshots_hourly
|
||
WHERE bucket_hour BETWEEN '...' AND '...'
|
||
```
|
||
|
||
**Problem:** No covering index on `(bucket_hour, artwork_id)`. Full or large range scan every time.
|
||
|
||
**Fix:** Add a compound index:
|
||
```sql
|
||
ALTER TABLE artwork_metric_snapshots_hourly
|
||
ADD INDEX idx_bucket_artwork (bucket_hour, artwork_id);
|
||
```
|
||
|
||
Migration: `php artisan make:migration add_index_bucket_artwork_to_metric_snapshots`
|
||
|
||
---
|
||
|
||
### P1-E: Full-text LIKE searches on artworks title + description (Queries 50, 53)
|
||
**Total time:** ~62s combined · **Calls:** ~48 · Pattern: `WHERE title LIKE '%keyword%' OR description LIKE '%keyword%' OR ...` (20+ OR conditions)
|
||
|
||
**Problem:** Leading wildcard LIKE cannot use B-tree indexes. Full table scan every time.
|
||
|
||
**Fix:** Use **Meilisearch** (already in use for artwork search). Route AI-tag search queries through `ArtworkSearchService` instead of raw LIKE. For any fallback that must stay in MySQL:
|
||
```sql
|
||
-- Add FULLTEXT index:
|
||
ALTER TABLE artworks ADD FULLTEXT INDEX ft_artwork_text (title, description);
|
||
-- Then use MATCH..AGAINST instead of LIKE:
|
||
WHERE MATCH(title, description) AGAINST ('+moon +lunar' IN BOOLEAN MODE)
|
||
```
|
||
|
||
---
|
||
|
||
## Priority 2 — High Impact
|
||
|
||
### P2-A: Artwork aggregate stats queries — top 2 time consumers (Queries 1 & 2)
|
||
**Total time:** 83,766s + 34,068s = **117,834s** (78% of all slow query time)
|
||
**Calls:** 37,124 + 13,789 = ~51k · **Avg:** 2.2s · **Rows examined/call:** ~215–234k
|
||
|
||
These are the same heavy SELECT pattern loading per-artwork stats from multiple tables:
|
||
```sql
|
||
SELECT a.id, a.user_id, a.published_at, a.is_public, a.is_approved,
|
||
(a.thumb_ext IS NOT NULL AND a.thumb_ext != '') AS has_thumbnail,
|
||
COALESCE(ast.views, 0) AS views_all,
|
||
COALESCE(ast.downloads, 0) AS downloads_all,
|
||
COALESCE(ast.favorites, 0) AS favourites_all,
|
||
COALESCE(cc.cnt, 0) AS comments_count,
|
||
COALESCE(sc.cnt, 0) AS shares_count,
|
||
COALESCE(ast.views_7d, 0) AS views_7d,
|
||
...
|
||
FROM artworks a
|
||
LEFT JOIN artwork_stats ast ON ast.artwork_id = a.id
|
||
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_favourites WHERE created_at >= ...) fav7 ...
|
||
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_comments ...) cc ...
|
||
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_shares ...) sc ...
|
||
WHERE a.is_public = 1 AND a.is_approved = 1 AND a.deleted_at IS NULL
|
||
AND a.published_at <= NOW()
|
||
ORDER BY a.id ASC
|
||
LIMIT 500
|
||
```
|
||
|
||
**Problems:**
|
||
- Inline derived subqueries for `fav7`, `cc`, `sc` run per page — not cached.
|
||
- `ORDER BY a.id` with filters requires index on `(deleted_at, is_public, is_approved, published_at, id)`.
|
||
- 51k calls/22 days = ~2,300 calls/day = every ~37 seconds, all day long. This is a **scheduled job or background process** churning through artworks.
|
||
|
||
**Fix — composite index (immediate):**
|
||
```sql
|
||
ALTER TABLE artworks
|
||
ADD INDEX idx_public_approved_published (deleted_at, is_public, is_approved, published_at, id);
|
||
```
|
||
|
||
**Fix — pre-aggregate counts (medium term):**
|
||
Ensure `artwork_stats` already stores `favorites_7d`, `comments_count`, `shares_count`. If so, move all counts to `artwork_stats` maintenance jobs and remove the inline derived joins. The SELECT becomes a single fast `LEFT JOIN artwork_stats`.
|
||
|
||
**Fix — reduce call frequency:**
|
||
If this is a scheduler-driven scan, batch it into chunks with exponential backoff and persist cursor position so it doesn't re-scan from scratch every run.
|
||
|
||
---
|
||
|
||
### P2-B: User stats calculation — 28 second average (Query 3)
|
||
**Total time:** 8,847s · **Calls:** 320 · **Avg:** 28s · **Max:** 180s!
|
||
|
||
Complex query joining `users`, `user_xp_logs`, `user_followers`, `artwork_likes`, `artworks`, `artwork_metric_snapshots_hourly`.
|
||
|
||
**Fix:** This should **never run on demand**. Route it to:
|
||
1. A scheduled background job that pre-aggregates into `user_statistics` (runs every N minutes).
|
||
2. The controller/service reads from `user_statistics` only — single-row lookup by `user_id`.
|
||
|
||
Also ensure `user_statistics` has `INDEX idx_user_id (user_id)`.
|
||
|
||
---
|
||
|
||
### P2-C: `artwork_metric_snapshots_hourly` heavy join query (Query 4)
|
||
**Total time:** 6,328s · **Calls:** 237 · **Avg:** 27s · **Max:** 52s
|
||
|
||
```sql
|
||
SELECT artworks.*, ...
|
||
FROM artworks
|
||
JOIN artwork_metric_snapshots_hourly amsh ON amsh.artwork_id = artworks.id
|
||
JOIN artwork_likes al ON ...
|
||
JOIN artwork_downloads ad ON ...
|
||
JOIN artwork_comments ac ON ...
|
||
WHERE amsh.bucket_hour BETWEEN ... AND ...
|
||
```
|
||
|
||
**Fix:**
|
||
1. Add index from P1-D: `idx_bucket_artwork (bucket_hour, artwork_id)`.
|
||
2. Pre-aggregate hourly snapshots into daily/weekly summary tables and query those instead.
|
||
3. Reduce time range of `BETWEEN` clause if querying recent data only.
|
||
|
||
---
|
||
|
||
### P2-D: rank_artwork_scores queries — ~10 variants (Queries 26, 27, 29, 37, 39, 41, 42, 43, 46)
|
||
**Total time:** ~750s combined · ~870 combined calls · **Avg:** ~750ms · **Rows examined/call:** ~140–150k
|
||
|
||
Pattern:
|
||
```sql
|
||
SELECT ras.artwork_id, a.user_id, ras.score_trending
|
||
FROM rank_artwork_scores ras
|
||
INNER JOIN artworks a ON a.id = ras.artwork_id AND a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL
|
||
WHERE ras.model_version = 'rank_v2'
|
||
ORDER BY ras.score_trending DESC
|
||
LIMIT 200
|
||
```
|
||
|
||
**Problem:** `WHERE model_version = 'rank_v2' ORDER BY score_X DESC` — no composite index covers both.
|
||
|
||
**Fix — add partial/composite indexes:**
|
||
```sql
|
||
ALTER TABLE rank_artwork_scores
|
||
ADD INDEX idx_mv_trending (model_version, score_trending DESC),
|
||
ADD INDEX idx_mv_new_hot (model_version, score_new_hot DESC),
|
||
ADD INDEX idx_mv_best (model_version, score_best DESC),
|
||
ADD INDEX idx_mv_score (model_version, score_new_hot, score_trending, score_best);
|
||
```
|
||
|
||
**Fix — cache ranking results:**
|
||
These are pre-computed ranking scores. Cache the TOP 200 list per `(model_version, score_column)` for 5–15 minutes in Redis. The ranking job already runs on a schedule — warm the cache at the end of each ranking job run.
|
||
|
||
---
|
||
|
||
### P2-E: `artworks` public count via full scan (Query 30)
|
||
**Total time:** 100s · **Calls:** 127 · **Avg:** 790ms · **Rows examined/call:** 97k
|
||
|
||
```sql
|
||
SELECT count(*) FROM artworks
|
||
WHERE deleted_at IS NULL AND is_approved=1 AND is_public=1
|
||
AND published_at IS NOT NULL AND published_at <= NOW()
|
||
```
|
||
|
||
**Fix — maintain a counter cache:**
|
||
```php
|
||
// In config or cache:
|
||
Cache::remember('artworks.public_count', 300, fn() => Artwork::public()->published()->count());
|
||
```
|
||
|
||
Or store the count in a `site_statistics` / `system_settings` table, updated by the publish observer.
|
||
|
||
---
|
||
|
||
### P2-F: artworks ORDER BY id scanner for publish pipeline (Queries 18, 22, 23, 40, 47)
|
||
**Total time:** ~650s combined · Many calls · Pattern: `SELECT * FROM artworks WHERE ... ORDER BY id ASC` or `ORDER BY trending_score_7d DESC`
|
||
|
||
**Problem:** `SELECT *` loads all columns including large `description` blobs. The publish/ranking pipeline only needs IDs.
|
||
|
||
**Fix:**
|
||
- Use `SELECT id` or `SELECT id, user_id` instead of `SELECT *`.
|
||
- Ensure `trending_score_7d` is indexed if using `ORDER BY trending_score_7d`.
|
||
- Add index from P2-A: `idx_public_approved_published`.
|
||
|
||
---
|
||
|
||
## Priority 3 — Medium Impact
|
||
|
||
### P3-A: Popular tags query with tag_interaction_daily_metrics (Query 6)
|
||
**Total time:** 2,234s · **Calls:** 2,349 · **Avg:** 951ms
|
||
|
||
Joining `artworks → artwork_tag → tags → tag_interaction_daily_metrics`. 246k rows examined/call.
|
||
|
||
**Fix:**
|
||
- Cache the result: popular tags change slowly — cache 5–15 minutes.
|
||
- Add index on `tag_interaction_daily_metrics (tag_id, metric_date)`.
|
||
- Precompute `tag_interaction_daily_metrics` aggregates into a `tag_trending_scores` table.
|
||
|
||
---
|
||
|
||
### P3-B: Browse/gallery category + tag joins (Query 7)
|
||
**Total time:** 1,758s · **Calls:** 2,344 · **Avg:** 750ms
|
||
|
||
`artworks + categories + artwork_category + artwork_tag` — 80M total rows examined.
|
||
|
||
**Fix:**
|
||
- Verify indexes: `artwork_category(category_id, artwork_id)`, `artwork_tag(tag_id, artwork_id)`.
|
||
- Pagination: ensure cursor/keyset pagination is used, not `OFFSET`.
|
||
- Cache browse results per category (already partially done in HomepageService).
|
||
|
||
---
|
||
|
||
### P3-C: artwork_metric_snapshots_hourly backup full scan (Query 15)
|
||
**Total time:** 646s · **Calls:** 93 · **Max:** 127s · **User:** `backuper`
|
||
|
||
`SELECT /*!40001 SQL_NO_CACHE */ * FROM artwork_metric_snapshots_hourly` — mysqldump reading full table.
|
||
|
||
**Fix (ops):**
|
||
- Partition `artwork_metric_snapshots_hourly` by month on `bucket_hour`. Backup only reads the active partition.
|
||
- Or: exclude this table from hot backup and back it up separately during low-traffic window (02:00–04:00 UTC).
|
||
- Archive data older than 90 days to a cold table.
|
||
|
||
---
|
||
|
||
### P3-D: `artworks LEFT JOIN artwork_stats` with OR condition (Query 32)
|
||
**Total time:** 86s · **Calls:** 110
|
||
|
||
```sql
|
||
WHERE (artworks.created_at >= '...' OR (s.ranking_score IS NOT NULL AND s.ranking_score > 0))
|
||
```
|
||
|
||
**Problem:** OR prevents index usage on `created_at`.
|
||
|
||
**Fix:** Rewrite as UNION:
|
||
```sql
|
||
SELECT id FROM artworks WHERE created_at >= '...' AND deleted_at IS NULL AND is_approved=1
|
||
UNION
|
||
SELECT a.id FROM artworks a JOIN artwork_stats s ON s.artwork_id = a.id
|
||
WHERE s.ranking_score > 0 AND a.deleted_at IS NULL AND a.is_approved=1
|
||
```
|
||
|
||
---
|
||
|
||
### P3-E: GROUP BY user_id from artworks (Query 31)
|
||
**Total time:** 92s · **Calls:** 110
|
||
|
||
```sql
|
||
SELECT a.user_id, COALESCE(us.followers_count, 0), COALESCE(us.favorites_received_count, 0)
|
||
FROM artworks a
|
||
LEFT JOIN user_statistics us ON us.user_id = a.user_id
|
||
WHERE a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL
|
||
GROUP BY a.user_id, us.followers_count, us.favorites_received_count
|
||
```
|
||
|
||
**Problem:** Full artworks scan + GROUP BY without a covering index.
|
||
|
||
**Fix:** Add index `(deleted_at, is_public, is_approved, user_id)` on artworks. Or pre-aggregate into `user_statistics` and query that directly without touching `artworks`.
|
||
|
||
---
|
||
|
||
### P3-F: Forum posts moderation scan (Query 62)
|
||
**Total time:** 12s · **Calls:** 15
|
||
|
||
```sql
|
||
SELECT * FROM forum_posts
|
||
WHERE (moderation_checked = 0 OR last_ai_scan_at IS NULL OR updated_at > last_ai_scan_at)
|
||
AND id IS NOT NULL AND deleted_at IS NULL
|
||
ORDER BY id ASC LIMIT 50
|
||
```
|
||
|
||
**Fix:** Add a partial index for unmoderated posts:
|
||
```sql
|
||
ALTER TABLE forum_posts ADD INDEX idx_needs_moderation (moderation_checked, last_ai_scan_at, id)
|
||
WHERE moderation_checked = 0;
|
||
```
|
||
Or maintain a `moderation_queue` table with only pending IDs.
|
||
|
||
---
|
||
|
||
## Recommended Index Migrations
|
||
|
||
Apply these in order (fastest wins first):
|
||
|
||
```bash
|
||
php artisan make:migration add_performance_indexes_batch1
|
||
```
|
||
|
||
```php
|
||
public function up(): void
|
||
{
|
||
// P1-D: snapshot bucket+artwork lookup
|
||
Schema::table('artwork_metric_snapshots_hourly', function (Blueprint $table) {
|
||
$table->index(['bucket_hour', 'artwork_id'], 'idx_bucket_artwork');
|
||
});
|
||
|
||
// P2-A + P2-F: public artwork scans & ORDER BY id
|
||
Schema::table('artworks', function (Blueprint $table) {
|
||
$table->index(
|
||
['deleted_at', 'is_public', 'is_approved', 'published_at', 'id'],
|
||
'idx_public_approved_published_id'
|
||
);
|
||
$table->index(
|
||
['deleted_at', 'is_public', 'is_approved', 'user_id'],
|
||
'idx_public_approved_user'
|
||
);
|
||
});
|
||
|
||
// P2-D: rank_artwork_scores per model_version
|
||
Schema::table('rank_artwork_scores', function (Blueprint $table) {
|
||
$table->index(['model_version', 'score_trending'], 'idx_mv_trending');
|
||
$table->index(['model_version', 'score_new_hot'], 'idx_mv_new_hot');
|
||
$table->index(['model_version', 'score_best'], 'idx_mv_best');
|
||
$table->index(['model_version', 'score_new_hot'], 'idx_mv_score');
|
||
});
|
||
|
||
// P3-A: tag daily metrics
|
||
Schema::table('tag_interaction_daily_metrics', function (Blueprint $table) {
|
||
$table->index(['tag_id', 'metric_date'], 'idx_tag_date');
|
||
});
|
||
|
||
// P1-A: tag artworks_count cached column
|
||
Schema::table('tags', function (Blueprint $table) {
|
||
$table->unsignedInteger('artworks_count')->default(0)->after('slug');
|
||
$table->index('artworks_count', 'idx_artworks_count');
|
||
});
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## Caching Quick Wins (no schema change needed)
|
||
|
||
| Surface | Cache Key | TTL | Notes |
|
||
|---|---|---|---|
|
||
| Popular tags | `tags.popular.10` | 15 min | Currently: correlated subquery per request |
|
||
| Ranking top-200 lists | `rank.{model}.{score}.200` | 10 min | Warm at end of ranking job |
|
||
| Public artwork count | `artworks.public_count` | 5 min | Used in sitemaps, stats |
|
||
| User artworks_count | `user.{id}.artworks_count` | 5 min | Warm on publish/unpublish |
|
||
| Group leaderboard | `leaderboard.group.monthly.5` | 30 min | Already in leaderboards service |
|
||
|
||
---
|
||
|
||
## Implementation Roadmap
|
||
|
||
| Phase | Items | Effort | Expected gain |
|
||
|---|---|---|---|
|
||
| **Week 1** | P1-A correlated tag query, P1-C jobs LIKE, P1-D snapshot index, all P2-D rank indexes | Low–Med | ~15–20% reduction in slow queries |
|
||
| **Week 2** | P2-A artworks composite index, P1-B user artworks_count, caching quick wins table | Med | ~40–50% reduction; kills #1 and #2 slow query families |
|
||
| **Week 3** | P2-B user stats background job, P2-C snapshot pre-aggregation, P1-E fulltext index | High | ~65–75% reduction; kills 28s queries |
|
||
| **Month 2** | P2-F SELECT * → SELECT id, P3-C partition snapshots table, P3-D OR→UNION rewrite | High | Remaining tail |
|
||
|
||
---
|
||
|
||
## Monitoring After Changes
|
||
|
||
```bash
|
||
# Reset slow query log on server
|
||
mysql -e "FLUSH SLOW LOGS;"
|
||
|
||
# Re-run pt-query-digest after 1 week:
|
||
pt-query-digest /var/log/mysql/slow.log > /tmp/slow-report-after.txt
|
||
|
||
# Check query plan for top queries:
|
||
EXPLAIN SELECT ... \G
|
||
```
|
||
|
||
Key metrics to watch:
|
||
- Total slow query count per day (target: -50% in week 2)
|
||
- `Rows_examined` for artwork queries (target: <10k instead of 234k)
|
||
- MySQL CPU usage during ranking job windows
|