# Discovery & Personalization Engine

Covers the trending system, following feed, personalized homepage, similar artworks, unified activity feed, and all input signal collection that powers the ranking formula.

---

## Table of Contents

1. [Architecture Overview](#1-architecture-overview)
2. [Input Signal Collection](#2-input-signal-collection)
3. [Windowed Stats (views & downloads)](#3-windowed-stats-views--downloads)
4. [Trending Engine](#4-trending-engine)
5. [Discover Routes](#5-discover-routes)
6. [Following Feed](#6-following-feed)
7. [Personalized Homepage](#7-personalized-homepage)
8. [Similar Artworks API](#8-similar-artworks-api)
9. [Unified Activity Feed](#9-unified-activity-feed)
10. [Meilisearch Configuration](#10-meilisearch-configuration)
11. [Caching Strategy](#11-caching-strategy)
12. [Scheduled Jobs](#12-scheduled-jobs)
13. [Testing](#13-testing)
14. [Operational Runbook](#14-operational-runbook)

---

## 1. Architecture Overview

```
Browser
  │
  ├─ POST /api/art/{id}/view          → ArtworkViewController
  ├─ POST /api/art/{id}/download      → ArtworkDownloadController
  └─ POST /api/artworks/{id}/favorite / reactions / awards / comments
         │
         ▼
  ArtworkStatsService                 UserStatsService
  artwork_stats (all-time +           user_statistics
    windowed counters)                  └─ artwork_views_received_count
  artwork_downloads (log)                  downloads_received_count
         │
         ▼
  skinbase:reset-windowed-stats (nightly/weekly)
    └─ zeros views_24h / views_7d
    └─ recomputes downloads_24h / downloads_7d from log
         │
         ▼
  skinbase:recalculate-trending (every 30 min)
    └─ bulk UPDATE artworks.trending_score_24h / _7d
    └─ dispatches IndexArtworkJob → Meilisearch
         │
         ▼
  Meilisearch index (artworks)
    └─ sortable: trending_score_7d, trending_score_24h, views, ...
    └─ filterable: author_id, tags, category, orientation, is_public, ...
         │
         ▼
  HomepageService / DiscoverController / SimilarArtworksController
    └─ Redis cache (5 min TTL)
         │
         ▼
  Inertia + React frontend
```

---

## 2. Input Signal Collection

### 2.1 View tracking — `POST /api/art/{id}/view`

**Controller:** `App\Http\Controllers\Api\ArtworkViewController`
**Route name:** `api.art.view`
**Throttle:** 5 requests per 10 minutes per IP

**Deduplication (layered):**

| Layer | Mechanism | Scope |
|---|---|---|
| Client-side | `sessionStorage` key `sb_viewed_{id}` set before the request | Browser tab lifetime |
| Server-side | `$request->session()->put('art_viewed.{id}', true)` | Laravel session lifetime |
| Throttle | `throttle:5,10` route middleware | Per-IP per-artwork |

The React component `ArtworkActions.jsx` fires a `useEffect` on mount that checks `sessionStorage` first, then hits the endpoint. The response includes `counted: true|false` so callers can confirm whether the increment actually happened.

**What gets incremented:**

```
artwork_stats.views                            +1  (all-time)
artwork_stats.views_24h                        +1  (zeroed nightly)
artwork_stats.views_7d                         +1  (zeroed weekly)
user_statistics.artwork_views_received_count   +1  (creator aggregate)
```

The increments go through `ArtworkStatsService::incrementViews()` with `defer: true` (Redis when available, direct DB fallback).

---

### 2.2 Download tracking — `POST /api/art/{id}/download`

**Controller:** `App\Http\Controllers\Api\ArtworkDownloadController`
**Route name:** `api.art.download`
**Throttle:** 10 requests per minute per IP

The endpoint:

1. Inserts a row in `artwork_downloads` (persistent event log with `created_at`)
2. Increments `artwork_stats.downloads`, `downloads_24h`, `downloads_7d`
3. Returns `{"ok": true, "url": ""}` for the native browser download

The download buttons in `ArtworkActions.jsx` call `trackDownload()` on click, which sends a fire-and-forget `fetch()` POST. The actual browser download is triggered by the `href`/`download` attributes and is never blocked by the tracking request.

**What gets incremented:**

```
artwork_downloads                          INSERT  (event log, persisted forever)
artwork_stats.downloads                    +1      (all-time)
artwork_stats.downloads_24h                +1      (recomputed from log nightly)
artwork_stats.downloads_7d                 +1      (recomputed from log weekly)
user_statistics.downloads_received_count   +1      (creator aggregate)
```

The increments go through `ArtworkStatsService::incrementDownloads()` with `defer: true`.

---

### 2.3 Other signals (already existed)

| Signal | Endpoint / Service | Written to |
|---|---|---|
| Favorite toggle | `POST /api/artworks/{id}/favorite` | `user_favorites`, `artwork_stats.favorites` |
| Reaction toggle | `POST /api/artworks/{id}/reactions` | `artwork_reactions` |
| Award | `ArtworkAwardController` | `artwork_award_stats.score_total` |
| Comment | `ArtworkCommentController` | `artwork_comments`, `activity_events` |
| Follow | `FollowService` | `user_followers`, `activity_events` |

---

### 2.4 ArtworkStatsService — Redis deferral

When Redis is available, all increments are pushed to a list key `artwork_stats:deltas` as JSON payloads. A separate job/command (`processPendingFromRedis`) drains the queue and applies bulk `applyDelta()` calls.

If Redis is unavailable, the service falls back transparently to a direct DB increment.

```php
// Deferred (default for view/download controllers)
$svc->incrementViews($artworkId, 1, defer: true);

// Immediate (for callers that need the counter updated right away)
$svc->incrementDownloads($artworkId, 1, defer: false);
```

---

## 3. Windowed Stats (views & downloads)

### 3.1 Why windowed columns?

The trending formula needs _recent_ activity, not all-time totals. `artwork_stats.views` is a monotonically increasing counter, so using it for trending would permanently favour old popular artworks, and new artworks could never compete.
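
To make the problem concrete, here is a small sketch with made-up numbers. The `views_recent` key and the `rankBy()` helper are hypothetical and only stand in for "a count restricted to a recent time window"; they are not part of the codebase.

```php
<?php
declare(strict_types=1);

// Made-up numbers for two artworks: an old hit and a recent upload.
$artworks = [
    ['title' => 'Old classic', 'views' => 250000, 'views_recent' => 40],
    ['title' => 'New upload',  'views' => 900,    'views_recent' => 650],
];

/**
 * Hypothetical helper: returns titles ordered by the given counter,
 * highest first. The input array is copied, so the caller's order is kept.
 */
function rankBy(array $artworks, string $column): array
{
    usort($artworks, fn (array $a, array $b): int => $b[$column] <=> $a[$column]);

    return array_column($artworks, 'title');
}

// Ranking by the all-time counter keeps the old artwork on top,
// even though almost all current activity is on the new upload.
$byAllTime = rankBy($artworks, 'views');

// Ranking by recent activity surfaces the new upload instead.
$byRecent = rankBy($artworks, 'views_recent');

print_r($byAllTime);
print_r($byRecent);
```

With these numbers the all-time ranking puts "Old classic" first, while the recent-activity ranking puts "New upload" first.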

The solution is four cached window columns refreshed on a schedule:

| Column | Meaning | Reset cadence |
|---|---|---|
| `views_24h` | Views since the last nightly reset | Nightly at 03:30 |
| `views_7d` | Views since the last Monday reset | Weekly (Mon) at 03:30 |
| `downloads_24h` | Downloads in last 24 h | Nightly at 03:30 (recomputed from log) |
| `downloads_7d` | Downloads in last 7 days | Weekly (Mon) at 03:30 (recomputed from log) |

### 3.2 How views windowing works

**No per-view event log exists** (storing millions of view rows would be expensive). Instead:

- Every view event increments `views_24h` and `views_7d` alongside `views`.
- The reset command **zeroes** both columns. Artworks re-accumulate from the reset time onward.
- Accuracy is "views since last reset": `views_24h` covers between 0 and 24 hours of activity and `views_7d` between 0 and 7 days, which is close enough for trending.

### 3.3 How downloads windowing works

**`artwork_downloads` is a full event log** with `created_at`. The reset command:

1. Queries `COUNT(*) FROM artwork_downloads WHERE artwork_id = ? AND created_at >= NOW() - {interval}` for each artwork in chunks of 1000.
2. Writes the exact count back to `downloads_24h` / `downloads_7d`.

This overwrites any drift from deferred Redis increments, making download windows always accurate at reset time.

### 3.4 Reset command

```bash
php artisan skinbase:reset-windowed-stats --period=24h
php artisan skinbase:reset-windowed-stats --period=7d
```

The command uses a chunked PHP loop (no `GREATEST()` / `INTERVAL` MySQL syntax), so it works in both production MySQL and the SQLite test DB.

---

## 4. Trending Engine

### 4.1 Formula

```
score = (award_score   × 5.0)
      + (favorites     × 3.0)
      + (reactions     × 2.0)
      + (downloads_Xd  × 1.0)   ← windowed: 24h or 7d
      + (views_Xd      × 2.0)   ← windowed: 24h or 7d
      - (hours_since_published × 0.1)

score = max(score, 0)           ← clamped via GREATEST()
```

Weights are constants in `TrendingService` (`W_AWARD`, `W_FAVORITE`, etc.), so they can be adjusted without a schema change.
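
As a worked example, the formula can be evaluated in plain PHP for a single artwork. This is a minimal sketch, not the production code, which runs as a bulk SQL `UPDATE`. `W_AWARD` and `W_FAVORITE` are the names mentioned above; the other constant names and the `trendingScore()` helper are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Weights copied from the formula. W_AWARD and W_FAVORITE are named in the
// text; the remaining constant names are assumed for this sketch.
const W_AWARD = 5.0;
const W_FAVORITE = 3.0;
const W_REACTION = 2.0;
const W_DOWNLOAD = 1.0;
const W_VIEW = 2.0;
const DECAY_PER_HOUR = 0.1;

/**
 * Hypothetical helper: computes the trending score for one artwork.
 * $windowedDownloads / $windowedViews are the 24h or 7d counters,
 * depending on which score is being calculated.
 */
function trendingScore(
    float $awardScore,
    int $favorites,
    int $reactions,
    int $windowedDownloads,
    int $windowedViews,
    float $hoursSincePublished
): float {
    $score = $awardScore * W_AWARD
        + $favorites * W_FAVORITE
        + $reactions * W_REACTION
        + $windowedDownloads * W_DOWNLOAD
        + $windowedViews * W_VIEW
        - $hoursSincePublished * DECAY_PER_HOUR;

    // Same clamp as GREATEST(score, 0) in the SQL version.
    return max($score, 0.0);
}

// Active artwork, 10 hours old: 10 + 12 + 6 + 10 + 100 - 1 = 137
$active = trendingScore(2.0, 4, 3, 10, 50, 10.0);

// Quiet artwork, 100 hours old: 2 - 10 = -8, clamped to 0
$quiet = trendingScore(0.0, 0, 0, 0, 1, 100.0);

printf("%.1f\n%.1f\n", $active, $quiet);
```

The second call shows why the clamp matters: without it, the age penalty would push quiet artworks to negative scores.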

### 4.2 Output columns

| Artworks column | Meaning |
|---|---|
| `trending_score_24h` | Score using `views_24h` + `downloads_24h`; targets artworks ≤ 7 days old |
| `trending_score_7d` | Score using `views_7d` + `downloads_7d`; targets artworks ≤ 30 days old |
| `last_trending_calculated_at` | Timestamp of last calculation |

### 4.3 Recalculation command

```bash
php artisan skinbase:recalculate-trending --period=24h
php artisan skinbase:recalculate-trending --period=7d
php artisan skinbase:recalculate-trending --period=all
php artisan skinbase:recalculate-trending --period=7d --skip-index   # skip Meilisearch jobs
php artisan skinbase:recalculate-trending --chunk=500                # smaller DB chunks
```

**Implementation:** `App\Services\TrendingService::recalculate()`

1. Chunks artworks published within the look-back window (`chunkById(1000, ...)`).
2. Issues one bulk MySQL `UPDATE ... WHERE id IN (...)` per chunk — no per-artwork queries in the hot path.
3. After each chunk, dispatches `IndexArtworkJob` per artwork to push updated scores to Meilisearch (skippable with `--skip-index`).

> **Note:** The raw SQL uses `GREATEST()` and `TIMESTAMPDIFF(HOUR, ...)`, which are MySQL-specific and not available in SQLite. The command runs against MySQL in production; the 4 related Pest tests are skipped on SQLite with a clear skip message.

### 4.4 Meilisearch sync after calculation

`TrendingService::syncToSearchIndex()` dispatches `IndexArtworkJob` for every artwork in the trending window. The job calls `Artwork::searchable()`, which triggers `toSearchableArray()`, which includes `trending_score_24h` and `trending_score_7d`.

---

## 5. Discover Routes

All routes under `/discover/*` are registered in `routes/web.php` and handled by `App\Http\Controllers\Web\DiscoverController`. All use **Meilisearch sorting** — no SQL `ORDER BY` in the hot path.

| Route | Name | Sort key | Auth |
|---|---|---|---|
| `/discover/trending` | `discover.trending` | `trending_score_7d:desc` | No |
| `/discover/fresh` | `discover.fresh` | `created_at:desc` | No |
| `/discover/top-rated` | `discover.top-rated` | `likes:desc` | No |
| `/discover/most-downloaded` | `discover.most-downloaded` | `downloads:desc` | No |
| `/discover/following` | `discover.following` | `created_at:desc` (DB) | Yes |

---

## 6. Following Feed

**Route:** `GET /discover/following` (auth required)
**Controller:** `DiscoverController::following()`

### Logic

```
1. Get user's following IDs from user_followers
2. If empty → show empty state (see below)
3. If present → Artwork::whereIn('user_id', $followingIds)
                  ->orderByDesc('published_at')
                  ->paginate(24)
                + cached 1 min per user per page
```

### Empty state

When the user follows nobody:

- `fallback_trending` — up to 12 trending artworks (Meilisearch, with DB fallback)
- `fallback_creators` — 8 most-followed verified users (ordered by `user_statistics.followers_count`)
- `empty: true` flag passed to the view
- The `discoverTrending()` call is wrapped in `try/catch` so a Meilisearch outage never breaks the empty state page

---

## 7. Personalized Homepage

**Controller:** `HomeController::index()`
**Service:** `App\Services\HomepageService`

### Guest sections

```php
[
    'hero'     => first featured artwork,
    'trending' => 12 artworks sorted by trending_score_7d,
    'fresh'    => 12 newest artworks,
    'tags'     => 12 most-used tags,
    'creators' => creator spotlight,
    'news'     => latest news posts,
]
```

### Authenticated sections (personalized)

```php
[
    'hero'           => same as guest,
    'from_following' => artworks from followed creators (up to 12, cached 1 min),
    'trending'       => same as guest,
    'by_tags'        => artworks matching user's top 5 tags,
    'by_categories'  => fresh uploads in user's top 3 favourite categories,
    'tags'           => same as guest,
    'creators'       => same as guest,
    'news'           => same as guest,
    'preferences'    => { top_tags, top_categories },
]
```

### UserPreferenceService

`App\Services\UserPreferenceService::build(User $user)` — cached 5 min per user.

Computes preferences from the user's **favourited artworks**:

| Output key | Source |
|---|---|
| `top_tags` (up to 5) | Tags on artworks in `artwork_favourites` |
| `top_categories` (up to 3) | Categories on artworks in `artwork_favourites` |
| `followed_creators` | IDs from `user_followers` |

### getTrending() — Meilisearch-first

```php
Artwork::search('')
    ->options([
        'filter' => 'is_public = true AND is_approved = true',
        'sort'   => ['trending_score_7d:desc', 'trending_score_24h:desc', 'views:desc'],
    ])
    ->paginate($limit, 'page', 1);
```

Falls back to `getTrendingFromDb()` — `orderByDesc('trending_score_7d')` with no correlated subqueries — when Meilisearch is unavailable.

---

## 8. Similar Artworks API

**Route:** `GET /api/art/{id}/similar`
**Controller:** `App\Http\Controllers\Api\SimilarArtworksController`
**Route name:** `api.art.similar`
**Throttle:** 60/min
**Cache:** 5 min per artwork ID
**Max results:** 12

### Similarity algorithm

Meilisearch filters are built in priority order:

```
is_public = true
is_approved = true
id != {source_id}
author_id != {source_author_id}        ← same creator excluded
orientation = "{landscape|portrait}"   ← only for non-square (visual coherence)
(tags = "X" OR tags = "Y" OR ...)      ← tag overlap (primary signal)
  OR (if no tags)
(category = "X" OR ...)                ← category fallback
```

Meilisearch's own ranking then sorts by relevance within those filters. Results are mapped to a slim JSON shape: `{id, title, slug, thumb, url, author_id}`.

---

## 9. Unified Activity Feed

**Route:** `GET /community/activity?type=global|following`
**Controller:** `App\Http\Controllers\Web\CommunityActivityController`

### `activity_events` schema

| Column | Type | Notes |
|---|---|---|
| `id` | bigint PK | |
| `actor_id` | bigint FK users | Who did the action |
| `type` | varchar | `upload` `comment` `favorite` `award` `follow` |
| `target_type` | varchar | `artwork` `user` |
| `target_id` | bigint | ID of the target object |
| `meta` | json nullable | Extra data (e.g. award tier) |
| `created_at` | timestamp | No `updated_at` — immutable events |

### Where events are recorded

| Event type | Recording point |
|---|---|
| `upload` | `UploadController::finish()` on publish |
| `follow` | `FollowService::follow()` |
| `award` | `ArtworkAwardController::store()` |
| `favorite` | `ArtworkInteractionController::favorite()` |
| `comment` | `ArtworkCommentController::store()` |

All via `ActivityEvent::record($actorId, $type, $targetType, $targetId, $meta)`.
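
The shape of a recorded event can be sketched without a database. The `buildActivityEventRow()` function below is a hypothetical stand-in for `ActivityEvent::record()`: it takes the same arguments and returns the row that would be written. The validation step and the `tier` meta key are assumptions, since the real method's internals are not shown here.

```php
<?php
declare(strict_types=1);

// Allowed values taken from the activity_events schema table.
const EVENT_TYPES = ['upload', 'comment', 'favorite', 'award', 'follow'];
const TARGET_TYPES = ['artwork', 'user'];

/**
 * Hypothetical stand-in for ActivityEvent::record(): validates the input
 * and returns the row that would be inserted, instead of touching a DB.
 * $now is injectable so the timestamp is predictable in tests.
 */
function buildActivityEventRow(
    int $actorId,
    string $type,
    string $targetType,
    int $targetId,
    ?array $meta = null,
    ?DateTimeImmutable $now = null
): array {
    if (!in_array($type, EVENT_TYPES, true)) {
        throw new InvalidArgumentException("Unknown event type: {$type}");
    }
    if (!in_array($targetType, TARGET_TYPES, true)) {
        throw new InvalidArgumentException("Unknown target type: {$targetType}");
    }

    return [
        'actor_id'    => $actorId,
        'type'        => $type,
        'target_type' => $targetType,
        'target_id'   => $targetId,
        // meta is a nullable JSON column
        'meta'        => $meta === null ? null : json_encode($meta, JSON_THROW_ON_ERROR),
        // only created_at is set: events are immutable, so no updated_at
        'created_at'  => ($now ?? new DateTimeImmutable())->format('Y-m-d H:i:s'),
    ];
}

// Example: user 7 gives artwork 42 an award.
$row = buildActivityEventRow(7, 'award', 'artwork', 42, ['tier' => 'gold']);
print_r($row);
```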

### Feed filters

- **Global** — all recent events, newest first, paginated 30/page
- **Following** — `WHERE actor_id IN (following_ids)` — only events from users you follow

The controller enriches each event batch with its target objects in a single query per target type (no N+1).

---

## 10. Meilisearch Configuration

Configured in `config/scout.php` under `meilisearch.index-settings`.

Push settings to a running instance:

```bash
php artisan scout:sync-index-settings
```

### Artworks index settings

**Searchable attributes** (ranked in order):

1. `title`
2. `tags`
3. `author_name`
4. `description`

**Filterable attributes:** `tags`, `category`, `content_type`, `orientation`, `resolution`, `author_id`, `is_public`, `is_approved`

**Sortable attributes:** `created_at`, `downloads`, `likes`, `views`, `trending_score_24h`, `trending_score_7d`, `favorites_count`, `awards_received_count`, `downloads_count`

### toSearchableArray() — fields indexed per artwork

```php
[
    'id', 'slug', 'title', 'description',
    'author_id', 'author_name',
    'category', 'content_type', 'tags',
    'resolution', 'orientation',
    'downloads', 'likes', 'views',
    'created_at', 'is_public', 'is_approved',
    'trending_score_24h', 'trending_score_7d',
    'favorites_count', 'awards_received_count', 'downloads_count',
    'awards' => { gold, silver, bronze, score },
]
```

---

## 11. Caching Strategy

| Data | Cache key | TTL | Driver |
|---|---|---|---|
| Homepage trending | `homepage.trending.{limit}` | 5 min | Redis/file |
| Homepage fresh | `homepage.fresh.{limit}` | 5 min | Redis/file |
| Homepage hero | `homepage.hero` | 5 min | Redis/file |
| Homepage tags | `homepage.tags.{limit}` | 5 min | Redis/file |
| User preferences | `user.prefs.{user_id}` | 5 min | Redis/file |
| Following feed | `discover.following.{user_id}.p{page}` | 1 min | Redis/file |
| Similar artworks | `api.similar.{artwork_id}` | 5 min | Redis/file |

**Rules:**

- Personalized data (`from_following`, `by_tags`, `by_categories`) is **not** independently cached — it falls inside `allForUser()` which is called fresh per request.
- Long-running cache busting: the trending command and reset command do not explicitly clear cache — the TTL is short enough that stale data self-expires within one trending cycle.

---

## 12. Scheduled Jobs

All registered in `routes/console.php` via `Schedule::command()`.

| Time | Command | Purpose |
|---|---|---|
| Every 30 min | `skinbase:recalculate-trending --period=24h` | Update `trending_score_24h` |
| Every 30 min | `skinbase:recalculate-trending --period=7d --skip-index` | Update `trending_score_7d` (background) |
| 03:00 daily | `uploads:cleanup` | Remove stale draft uploads |
| 03:10 daily | `analytics:aggregate-similar-artworks` | Offline similarity metrics |
| 03:20 daily | `analytics:aggregate-feed` | Feed evaluation metrics |
| 03:30 daily | `skinbase:reset-windowed-stats --period=24h` | Zero views_24h, recompute downloads_24h |
| Monday 03:30 | `skinbase:reset-windowed-stats --period=7d` | Zero views_7d, recompute downloads_7d |

**Reset runs at 03:30** so it fires after the other maintenance tasks (03:00–03:20). The next trending recalculation (every 30 min, including ~03:30 or ~04:00) picks up the freshly-zeroed windowed stats and writes accurate trending scores.

---

## 13. Testing

All tests live under `tests/Feature/Discovery/`.

| Test file | Coverage |
|---|---|
| `ActivityEventRecordingTest.php` | `ActivityEvent::record()`, all 5 types, actor relation, meta, route smoke tests for the activity feed |
| `FollowingFeedTest.php` | Auth redirect, empty state fallback, pagination, creator exclusion |
| `HomepagePersonalizationTest.php` | Guest vs auth homepage sections, preferences shape, 200 responses |
| `SimilarArtworksApiTest.php` | 404 cases, response shape, result count ≤ 12, creator exclusion |
| `SignalTrackingTest.php` | View endpoint (404s, first count, session dedup), download endpoint (404s, DB row, guest vs auth), route names |
| `TrendingServiceTest.php` | Zero artworks, skip outside window, skip private/unapproved — _recalculate() tests skipped on SQLite (MySQL-only SQL)_ |
| `WindowedStatsTest.php` | `incrementViews/Downloads` update all 3 columns, reset command zeros views, recomputes downloads from log, window boundary correctness |

Run all discovery tests:

```bash
php artisan test tests/Feature/Discovery/
```

Run specific suite:

```bash
php artisan test tests/Feature/Discovery/SignalTrackingTest.php
```

**SQLite vs MySQL note:** Four tests in `TrendingServiceTest` are marked `->skip()` with the message _"Requires MySQL: uses GREATEST() and TIMESTAMPDIFF()"_. Run them against a real MySQL instance in CI or staging to validate the bulk UPDATE formula.

---

## 14. Operational Runbook

### Trending scores are stuck / not updating

```bash
# Check last calculated timestamp (run this statement in a MySQL client)
SELECT id, title, last_trending_calculated_at
FROM artworks
ORDER BY last_trending_calculated_at DESC
LIMIT 5;

# Manually trigger recalculation
php artisan skinbase:recalculate-trending --period=all

# Re-push scores to Meilisearch
php artisan skinbase:recalculate-trending --period=7d
```

### Windowed counters look wrong after a deploy

```bash
# Force a reset and recompute
php artisan skinbase:reset-windowed-stats --period=24h
php artisan skinbase:reset-windowed-stats --period=7d

# Then recalculate trending with fresh numbers
php artisan skinbase:recalculate-trending --period=all
```

### Meilisearch out of sync with DB

```bash
# Re-push all artworks in the trending window
php artisan skinbase:recalculate-trending --period=all

# Or full re-index
php artisan scout:import "App\Models\Artwork"
```

### Push updated index settings (after changing config/scout.php)

```bash
php artisan scout:sync-index-settings
```

### Check what the trending formula is reading

```sql
SELECT
    a.id,
    a.title,
    a.published_at,
    s.views,
    s.views_24h,
    s.views_7d,
    s.downloads,
    s.downloads_24h,
    s.downloads_7d,
    s.favorites,
    a.trending_score_24h,
    a.trending_score_7d,
    a.last_trending_calculated_at
FROM artworks a
LEFT JOIN artwork_stats s ON s.artwork_id = a.id
WHERE a.is_public = 1 AND a.is_approved = 1
ORDER BY a.trending_score_7d DESC
LIMIT 20;
```

### Inspect the artwork_downloads log

```sql
-- Downloads in the last 24 hours per artwork
SELECT artwork_id, COUNT(*) AS dl_24h
FROM artwork_downloads
WHERE created_at >= NOW() - INTERVAL 1 DAY
GROUP BY artwork_id
ORDER BY dl_24h DESC
LIMIT 20;
```