Discovery & Personalization Engine
Covers the trending system, following feed, personalized homepage, similar artworks, unified activity feed, and all input signal collection that powers the ranking formula.
This document also covers the v3 AI discovery layer: vision metadata extraction, vector indexing, AI similar-artwork search, reverse image search, and the hybrid feed section controls.
Table of Contents
- Architecture Overview
- Input Signal Collection
- Windowed Stats (views & downloads)
- Trending Engine
- Discover Routes
- Following Feed
- Personalized Homepage
- Similar Artworks API
- Unified Activity Feed
- Meilisearch Configuration
- Caching Strategy
- Scheduled Jobs
- Testing
- AI Discovery v3
- Operational Runbook
1. Architecture Overview
Browser
│
├─ POST /api/art/{id}/view → ArtworkViewController
├─ POST /api/art/{id}/download → ArtworkDownloadController
└─ POST /api/artworks/{id}/favorite / reactions / awards / comments
│
▼
ArtworkStatsService                      UserStatsService
  artwork_stats (all-time +                user_statistics
    windowed counters)                     └─ artwork_views_received_count
  artwork_downloads (log)                     downloads_received_count
│
▼
skinbase:reset-windowed-stats (nightly/weekly)
└─ zeros views_24h / views_7d
└─ recomputes downloads_24h / downloads_7d from log
│
▼
skinbase:recalculate-trending (every 30 min)
└─ bulk UPDATE artworks.trending_score_24h / _7d
└─ dispatches IndexArtworkJob → Meilisearch
│
▼
Meilisearch index (artworks)
└─ sortable: trending_score_7d, trending_score_24h, views, ...
└─ filterable: author_id, tags, category, orientation, is_public, ...
│
▼
HomepageService / DiscoverController / SimilarArtworksController
└─ Redis cache (5 min TTL)
│
▼
Inertia + React frontend
2. Input Signal Collection
2.1 View tracking — POST /api/art/{id}/view
Controller: App\Http\Controllers\Api\ArtworkViewController
Route name: api.art.view
Throttle: 5 requests per 10 minutes per IP
Deduplication (layered):
| Layer | Mechanism | Scope |
|---|---|---|
| Client-side | sessionStorage key sb_viewed_{id} set before the request | Browser tab lifetime |
| Server-side | $request->session()->put('art_viewed.{id}', true) | Laravel session lifetime |
| Throttle | throttle:5,10 route middleware | Per-IP per-artwork |
The React component ArtworkActions.jsx fires a useEffect on mount that checks sessionStorage first, then hits the endpoint. The response includes counted: true|false so callers can confirm whether the increment actually happened.
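The server-side layer of that deduplication can be sketched in plain PHP. This is a minimal illustration, not the controller's actual code: `shouldCountView()` is a hypothetical helper, and a plain array stands in for the Laravel session.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a view should be counted, using a session-like array.
 * Returns true the first time an artwork is seen in this session, false after.
 * (Hypothetical helper; the real controller works with $request->session().)
 */
function shouldCountView(array &$session, int $artworkId): bool
{
    $key = "art_viewed.{$artworkId}";

    if (!empty($session[$key])) {
        return false; // already counted in this session
    }

    $session[$key] = true; // remember it so repeat requests are ignored
    return true;
}

$session = [];
$first  = shouldCountView($session, 42); // first view in this session
$second = shouldCountView($session, 42); // repeat view, same session
$other  = shouldCountView($session, 7);  // a different artwork

// Mirrors the counted flag returned to the caller.
echo json_encode(['counted' => $first]), PHP_EOL;
```

The return value maps directly onto the counted flag in the response, so the client can tell a deduplicated request from a counted one.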
What gets incremented:
artwork_stats.views +1 (all-time)
artwork_stats.views_24h +1 (zeroed nightly)
artwork_stats.views_7d +1 (zeroed weekly)
user_statistics.artwork_views_received_count +1 (creator aggregate)
Via ArtworkStatsService::incrementViews() with defer: true (Redis when available, direct DB fallback).
2.2 Download tracking — POST /api/art/{id}/download
Controller: App\Http\Controllers\Api\ArtworkDownloadController
Route name: api.art.download
Throttle: 10 requests per minute per IP
The endpoint:
- Inserts a row in artwork_downloads (persistent event log with created_at)
- Increments artwork_stats.downloads, downloads_24h, downloads_7d
- Returns {"ok": true, "url": "<highest-res thumbnail URL>"} for the native browser download
The <a download> buttons in ArtworkActions.jsx call trackDownload() on click — a fire-and-forget fetch() POST. The actual browser download is triggered by the href/download attributes and is never blocked by the tracking request.
What gets incremented:
artwork_downloads INSERT (event log, persisted forever)
artwork_stats.downloads +1 (all-time)
artwork_stats.downloads_24h +1 (recomputed from log nightly)
artwork_stats.downloads_7d +1 (recomputed from log weekly)
user_statistics.downloads_received_count +1 (creator aggregate)
Via ArtworkStatsService::incrementDownloads() with defer: true.
2.3 Other signals (already existed)
| Signal | Endpoint / Service | Written to |
|---|---|---|
| Favorite toggle | POST /api/artworks/{id}/favorite | user_favorites, artwork_stats.favorites |
| Reaction toggle | POST /api/artworks/{id}/reactions | artwork_reactions |
| Award | ArtworkAwardController | artwork_award_stats.score_total |
| Comment | ArtworkCommentController | artwork_comments, activity_events |
| Follow | FollowService | user_followers, activity_events |
2.4 ArtworkStatsService — Redis deferral
When Redis is available all increments are pushed to a list key artwork_stats:deltas as JSON payloads. A separate job/command (processPendingFromRedis) drains the queue and applies bulk applyDelta() calls. If Redis is unavailable the service falls back transparently to a direct DB increment.
// Deferred (default for view/download controllers)
$svc->incrementViews($artworkId, 1, defer: true);
// Immediate (same API with defer: false; for paths that need instant feedback, e.g. a favorites toggle)
$svc->incrementDownloads($artworkId, 1, defer: false);
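To illustrate why draining the queue in bulk is cheaper than applying each increment separately, here is a minimal sketch that collapses queued payloads into one summed delta per artwork before anything touches the database. The payload shape (`artwork_id`, `field`, `value`) and the helper name `aggregateStatDeltas()` are assumptions made for the example; the source does not specify the JSON format.

```php
<?php
declare(strict_types=1);

/**
 * Collapse queued JSON delta payloads into one summed delta per artwork.
 * The payload shape (artwork_id, field, value) is an assumption.
 *
 * @param  string[] $payloads
 * @return array<int, array<string, int>>
 */
function aggregateStatDeltas(array $payloads): array
{
    $totals = [];

    foreach ($payloads as $json) {
        $delta = json_decode($json, true);
        if (!is_array($delta) || !isset($delta['artwork_id'], $delta['field'], $delta['value'])) {
            continue; // skip malformed entries rather than failing the whole batch
        }

        $id    = (int) $delta['artwork_id'];
        $field = (string) $delta['field'];

        $totals[$id][$field] = ($totals[$id][$field] ?? 0) + (int) $delta['value'];
    }

    return $totals;
}

$queued = [
    '{"artwork_id": 1, "field": "views", "value": 1}',
    '{"artwork_id": 1, "field": "views", "value": 1}',
    '{"artwork_id": 2, "field": "downloads", "value": 1}',
    'not json',
];

// Two view events for artwork 1 become a single +2 delta.
$batched = aggregateStatDeltas($queued);
```

Each entry in the result could then be handed to a single applyDelta() call, so a burst of views on one artwork costs one write instead of many.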
3. Windowed Stats (views & downloads)
3.1 Why windowed columns?
The trending formula needs recent activity, not all-time totals. artwork_stats.views is a monotonically increasing counter — using it for trending would permanently favour old popular artworks and new artworks could never compete.
The solution is four cached window columns refreshed on a schedule:
| Column | Meaning | Reset cadence |
|---|---|---|
| views_24h | Views since the last nightly reset | Nightly at 03:30 |
| views_7d | Views since the last Monday reset | Weekly (Mon) at 03:30 |
| downloads_24h | Downloads in last 24 h | Nightly at 03:30 (recomputed from log) |
| downloads_7d | Downloads in last 7 days | Weekly (Mon) at 03:30 (recomputed from log) |
3.2 How views windowing works
No per-view event log exists (storing millions of view rows would be expensive). Instead:
- Every view event increments views_24h and views_7d alongside views.
- The reset command zeroes both columns. Artworks re-accumulate from the reset time onward.
- Accuracy is "views since last reset", which is close enough for trending (error ≤ 1 day).
3.3 How downloads windowing works
artwork_downloads is a full event log with created_at. The reset command:
- Queries COUNT(*) FROM artwork_downloads WHERE artwork_id = ? AND created_at >= NOW() - {interval} for each artwork in chunks of 1000.
- Writes the exact count back to downloads_24h / downloads_7d.
This overwrites any drift from deferred Redis increments, making download windows always accurate at reset time.
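The recompute step can be sketched without any database: compute the cutoff in PHP, then count log rows at or after it. This is an illustration under assumptions; `countDownloadsSince()` is a hypothetical helper and the in-memory array stands in for artwork_downloads rows.

```php
<?php
declare(strict_types=1);

/**
 * Count download-log rows per artwork at or after a cutoff computed in PHP.
 * Each row mimics an artwork_downloads record (artwork_id, created_at).
 *
 * @param  array<int, array{artwork_id:int, created_at:string}> $log
 * @return array<int, int> artwork_id => count inside the window
 */
function countDownloadsSince(array $log, DateTimeImmutable $cutoff): array
{
    $counts = [];
    foreach ($log as $row) {
        if (new DateTimeImmutable($row['created_at']) >= $cutoff) {
            $counts[$row['artwork_id']] = ($counts[$row['artwork_id']] ?? 0) + 1;
        }
    }
    return $counts;
}

$now    = new DateTimeImmutable('2024-01-08 03:30:00');
$cutoff = $now->sub(new DateInterval('PT24H')); // 24h window; P7D would give the weekly one

$log = [
    ['artwork_id' => 1, 'created_at' => '2024-01-08 01:00:00'], // inside the window
    ['artwork_id' => 1, 'created_at' => '2024-01-07 12:00:00'], // inside the window
    ['artwork_id' => 1, 'created_at' => '2024-01-06 23:00:00'], // older than the cutoff
    ['artwork_id' => 2, 'created_at' => '2024-01-07 03:30:00'], // exactly on the cutoff
];

$downloads24h = countDownloadsSince($log, $cutoff);
```

Computing the cutoff in application code rather than in SQL is one way to keep the same query portable across MySQL and SQLite.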
3.4 Reset command
php artisan skinbase:reset-windowed-stats --period=24h
php artisan skinbase:reset-windowed-stats --period=7d
Uses a chunked PHP loop (no GREATEST() / INTERVAL MySQL syntax), so it works in both the production MySQL database and the SQLite test database.
4. Trending Engine
4.1 Formula
score = (award_score × 5.0)
+ (favorites × 3.0)
+ (reactions × 2.0)
+ (downloads_Xd × 1.0) ← windowed: 24h or 7d
+ (views_Xd × 2.0) ← windowed: 24h or 7d
- (hours_since_published × 0.1)
score = max(score, 0) ← clamped via GREATEST()
Weights are constants in TrendingService (W_AWARD, W_FAVORITE, etc.) — adjust without a schema change.
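As a worked example of the formula, the sketch below computes a score in plain PHP. The weights are the ones listed above; the function name, the global constants, and the sample inputs are illustrative and are not taken from TrendingService.

```php
<?php
declare(strict_types=1);

// Weights mirror the formula above; these global constants are illustrative.
const W_AWARD    = 5.0;
const W_FAVORITE = 3.0;
const W_REACTION = 2.0;
const W_DOWNLOAD = 1.0;
const W_VIEW     = 2.0;
const W_DECAY    = 0.1; // penalty per hour since publication

function trendingScore(
    float $awardScore,
    int $favorites,
    int $reactions,
    int $windowedDownloads,
    int $windowedViews,
    float $hoursSincePublished
): float {
    $score = $awardScore * W_AWARD
        + $favorites * W_FAVORITE
        + $reactions * W_REACTION
        + $windowedDownloads * W_DOWNLOAD
        + $windowedViews * W_VIEW
        - $hoursSincePublished * W_DECAY;

    return max($score, 0.0); // same clamp as GREATEST(score, 0)
}

// A 48-hour-old artwork: 2 award points, 10 favorites, 4 reactions,
// 20 windowed downloads, 150 windowed views.
// 10 + 30 + 8 + 20 + 300 - 4.8 = 363.2
$fresh = trendingScore(2.0, 10, 4, 20, 150, 48.0);

// A 30-day-old artwork with one favorite: 3 - 72 is negative, so it clamps to 0.
$stale = trendingScore(0.0, 1, 0, 0, 0, 720.0);
```

The second call shows why the clamp matters: without it, old artworks would accumulate increasingly negative scores instead of settling at zero.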
4.2 Output columns
| Artworks column | Meaning |
|---|---|
| trending_score_24h | Score using views_24h + downloads_24h; targets artworks ≤ 7 days old |
| trending_score_7d | Score using views_7d + downloads_7d; targets artworks ≤ 30 days old |
| last_trending_calculated_at | Timestamp of last calculation |
4.3 Recalculation command
php artisan skinbase:recalculate-trending --period=24h
php artisan skinbase:recalculate-trending --period=7d
php artisan skinbase:recalculate-trending --period=all
php artisan skinbase:recalculate-trending --period=7d --skip-index # skip Meilisearch jobs
php artisan skinbase:recalculate-trending --chunk=500 # smaller DB chunks
Implementation: App\Services\TrendingService::recalculate()
- Chunks artworks published within the look-back window (chunkById(1000, ...)).
- Issues one bulk MySQL UPDATE ... WHERE id IN (...) per chunk, with no per-artwork queries in the hot path.
- After each chunk, dispatches IndexArtworkJob per artwork to push updated scores to Meilisearch (skippable with --skip-index).
Note: The raw SQL uses GREATEST() and TIMESTAMPDIFF(HOUR, ...), which are MySQL functions that SQLite does not provide. The command is tested in production against MySQL; the 4 related Pest tests are skipped on SQLite with a clear skip message.
4.4 Meilisearch sync after calculation
TrendingService::syncToSearchIndex() dispatches IndexArtworkJob for every artwork in the trending window. The job calls Artwork::searchable() which triggers toSearchableArray(), which includes trending_score_24h and trending_score_7d.
5. Discover Routes
All routes under /discover/* are registered in routes/web.php and handled by App\Http\Controllers\Web\DiscoverController. All use Meilisearch sorting — no SQL ORDER BY in the hot path.
| Route | Name | Sort key | Auth |
|---|---|---|---|
| /discover/trending | discover.trending | trending_score_7d:desc | No |
| /discover/fresh | discover.fresh | created_at:desc | No |
| /discover/top-rated | discover.top-rated | likes:desc | No |
| /discover/most-downloaded | discover.most-downloaded | downloads:desc | No |
| /discover/following | discover.following | created_at:desc (DB) | Yes |
6. Following Feed
Route: GET /discover/following (auth required)
Controller: DiscoverController::following()
Logic
1. Get user's following IDs from user_followers
2. If empty → show empty state (see below)
3. If present → Artwork::whereIn('user_id', $followingIds)
->orderByDesc('published_at')
->paginate(24)
+ cached 1 min per user per page
Empty state
When the user follows nobody:
- fallback_trending: up to 12 trending artworks (Meilisearch, with DB fallback)
- fallback_creators: 8 most-followed verified users (ordered by user_statistics.followers_count)
- empty: true flag passed to the view
- The discoverTrending() call is wrapped in try/catch so a Meilisearch outage never breaks the empty state page
7. Personalized Homepage
Controller: HomeController::index()
Service: App\Services\HomepageService
Guest sections
[
'hero' => first featured artwork,
'trending' => 12 artworks sorted by trending_score_7d,
'fresh' => 12 newest artworks,
'tags' => 12 most-used tags,
'creators' => creator spotlight,
'news' => latest news posts,
]
Authenticated sections (personalized)
[
'hero' => same as guest,
'from_following' => artworks from followed creators (up to 12, cached 1 min),
'trending' => same as guest,
'by_tags' => artworks matching user's top 5 tags,
'by_categories' => fresh uploads in user's top 3 favourite categories,
'tags' => same as guest,
'creators' => same as guest,
'news' => same as guest,
'preferences' => { top_tags, top_categories },
]
UserPreferenceService
App\Services\UserPreferenceService::build(User $user) — cached 5 min per user.
Computes preferences from the user's favourited artworks:
| Output key | Source |
|---|---|
| top_tags (up to 5) | Tags on artworks in artwork_favourites |
| top_categories (up to 3) | Categories on artworks in artwork_favourites |
| followed_creators | IDs from user_followers |
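One plausible way to derive top_tags and top_categories is a simple frequency count over the values attached to the user's favourited artworks. The sketch below assumes exactly that; `topByFrequency()` is a hypothetical helper, and the source does not say how the service ranks or breaks ties.

```php
<?php
declare(strict_types=1);

/**
 * Return the $limit most frequent values, most frequent first.
 * Ties keep first-seen order (sorting is stable in PHP 8).
 *
 * @param  string[] $values
 * @return string[]
 */
function topByFrequency(array $values, int $limit): array
{
    $counts = array_count_values($values);
    arsort($counts); // highest count first, keys preserved

    return array_slice(array_keys($counts), 0, $limit);
}

// Tags gathered from every artwork the user has favourited (sample data).
$tagsOnFavourites = ['space', 'dark', 'space', 'minimal', 'dark', 'space', 'neon'];

$topTags = topByFrequency($tagsOnFavourites, 2);
```

With the sample data, space appears three times and dark twice, so those two lead the list; the same helper with a limit of 3 would serve for categories.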
getTrending() — Meilisearch-first
Artwork::search('')
->options([
'filter' => 'is_public = true AND is_approved = true',
'sort' => ['trending_score_7d:desc', 'trending_score_24h:desc', 'views:desc'],
])
->paginate($limit, 'page', 1);
Falls back to getTrendingFromDb() — orderByDesc('trending_score_7d') with no correlated subqueries — when Meilisearch is unavailable.
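The Meilisearch-first pattern, which also protects the following feed's empty state, amounts to a guarded call with a fallback. A minimal sketch, assuming both sources return plain arrays; `firstAvailable()` and the two closures are hypothetical stand-ins for the search query and getTrendingFromDb().

```php
<?php
declare(strict_types=1);

/**
 * Try the search-backed source first; on any failure use the fallback.
 *
 * @param callable(): array $primary
 * @param callable(): array $fallback
 */
function firstAvailable(callable $primary, callable $fallback): array
{
    try {
        return $primary();
    } catch (Throwable $e) {
        // A search outage should degrade the section, not break the page.
        return $fallback();
    }
}

// Stand-ins for the Meilisearch query and the DB fallback.
$searchDown = static function (): array {
    throw new RuntimeException('search backend unreachable');
};
$fromDb = static fn (): array => [['id' => 1, 'trending_score_7d' => 12.5]];

$trending = firstAvailable($searchDown, $fromDb);
```

Catching Throwable is deliberately broad here; a real implementation might narrow it to connection errors and log the exception before falling back.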
8. Similar Artworks API
Route: GET /api/art/{id}/similar
Controller: App\Http\Controllers\Api\SimilarArtworksController
Route name: api.art.similar
Throttle: 60/min
Cache: 5 min per artwork ID
Max results: 12
Similarity algorithm
Meilisearch filters are built in priority order:
is_public = true
is_approved = true
id != {source_id}
author_id != {source_author_id} ← same creator excluded
orientation = "{landscape|portrait}" ← only for non-square (visual coherence)
(tags = "X" OR tags = "Y" OR ...) ← tag overlap (primary signal)
OR (if no tags)
(category = "X" OR ...) ← category fallback
Meilisearch's own ranking then sorts by relevance within those filters. Results are mapped to a slim JSON shape: {id, title, slug, thumb, url, author_id}.
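The filter list above can be assembled into a single Meilisearch filter expression roughly as follows. This is a sketch under assumptions: `buildSimilarFilter()` and its parameters are hypothetical, and values are assumed to contain no double quotes, so no escaping is attempted.

```php
<?php
declare(strict_types=1);

/**
 * Build a Meilisearch filter expression for similar-artwork lookups.
 * Hypothetical helper; assumes values contain no double quotes.
 *
 * @param string[] $tags
 * @param string[] $categories
 */
function buildSimilarFilter(
    int $sourceId,
    int $authorId,
    string $orientation,
    array $tags,
    array $categories
): string {
    $clauses = [
        'is_public = true',
        'is_approved = true',
        "id != {$sourceId}",
        "author_id != {$authorId}", // same creator excluded
    ];

    // Orientation is only constrained for non-square artworks.
    if (in_array($orientation, ['landscape', 'portrait'], true)) {
        $clauses[] = "orientation = \"{$orientation}\"";
    }

    // Tag overlap is the primary signal; categories are the fallback.
    [$field, $values] = $tags !== [] ? ['tags', $tags] : ['category', $categories];

    if ($values !== []) {
        $ors = array_map(
            static fn (string $v): string => "{$field} = \"{$v}\"",
            $values
        );
        $clauses[] = '(' . implode(' OR ', $ors) . ')';
    }

    return implode(' AND ', $clauses);
}

$filter = buildSimilarFilter(10, 3, 'landscape', ['space', 'neon'], ['Wallpapers']);
$squareNoTags = buildSimilarFilter(10, 3, 'square', [], ['Wallpapers']);
```

The resulting string would be passed as the filter option of the search call, in the same way as the is_public / is_approved filter in getTrending().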
9. Unified Activity Feed
Route: GET /community/activity?type=global|following
Controller: App\Http\Controllers\Web\CommunityActivityController
activity_events schema
| Column | Type | Notes |
|---|---|---|
| id | bigint PK | |
| actor_id | bigint FK users | Who did the action |
| type | varchar | upload, comment, favorite, award, follow |
| target_type | varchar | artwork, user |
| target_id | bigint | ID of the target object |
| meta | json nullable | Extra data (e.g. award tier) |
| created_at | timestamp | No updated_at (immutable events) |
Where events are recorded
| Event type | Recording point |
|---|---|
| upload | UploadController::finish() on publish |
| follow | FollowService::follow() |
| award | ArtworkAwardController::store() |
| favorite | ArtworkInteractionController::favorite() |
| comment | ArtworkCommentController::store() |
All via ActivityEvent::record($actorId, $type, $targetType, $targetId, $meta).
Feed filters
- Global: all recent events, newest first, paginated 30/page
- Following: WHERE actor_id IN (following_ids), so only events from users you follow
The controller enriches each event batch with its target objects in a single query per target type (no N+1).
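The batching step behind that enrichment can be sketched as grouping target IDs by target type, so each type needs only one WHERE id IN (...) lookup. The helper name `groupTargetIds()` and the array-based event shape are assumptions for illustration; the controller itself works with models.

```php
<?php
declare(strict_types=1);

/**
 * Group target IDs by target type so each type can be loaded with one
 * WHERE id IN (...) query instead of one query per event.
 *
 * @param  array<int, array{target_type:string, target_id:int}> $events
 * @return array<string, int[]>
 */
function groupTargetIds(array $events): array
{
    $grouped = [];
    foreach ($events as $event) {
        // Using the ID as a key deduplicates repeated targets.
        $grouped[$event['target_type']][$event['target_id']] = true;
    }

    return array_map(static fn (array $ids): array => array_keys($ids), $grouped);
}

$events = [
    ['type' => 'favorite', 'target_type' => 'artwork', 'target_id' => 5],
    ['type' => 'comment',  'target_type' => 'artwork', 'target_id' => 5],
    ['type' => 'follow',   'target_type' => 'user',    'target_id' => 9],
    ['type' => 'upload',   'target_type' => 'artwork', 'target_id' => 8],
];

$idsByType = groupTargetIds($events);
```

For a page of 30 events this yields at most two lookups (artworks and users), regardless of how many events reference the same target.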
10. Meilisearch Configuration
Configured in config/scout.php under meilisearch.index-settings.
Push settings to a running instance:
php artisan scout:sync-index-settings
Artworks index settings
Searchable attributes (ranked in order):
1. title
2. tags
3. author_name
4. description
Filterable attributes:
tags, category, content_type, orientation, resolution, author_id, is_public, is_approved
Sortable attributes:
created_at, downloads, likes, views, trending_score_24h, trending_score_7d, favorites_count, awards_received_count, downloads_count
toSearchableArray() — fields indexed per artwork
[
'id', 'slug', 'title', 'description',
'author_id', 'author_name',
'category', 'content_type', 'tags',
'resolution', 'orientation',
'downloads', 'likes', 'views',
'created_at', 'is_public', 'is_approved',
'trending_score_24h', 'trending_score_7d',
'favorites_count', 'awards_received_count', 'downloads_count',
'awards' => { gold, silver, bronze, score },
]
11. Caching Strategy
| Data | Cache key | TTL | Driver |
|---|---|---|---|
| Homepage trending | homepage.trending.{limit} | 5 min | Redis/file |
| Homepage fresh | homepage.fresh.{limit} | 5 min | Redis/file |
| Homepage hero | homepage.hero | 5 min | Redis/file |
| Homepage tags | homepage.tags.{limit} | 5 min | Redis/file |
| User preferences | user.prefs.{user_id} | 5 min | Redis/file |
| Following feed | discover.following.{user_id}.p{page} | 1 min | Redis/file |
| Similar artworks | api.similar.{artwork_id} | 5 min | Redis/file |
Rules:
- Personalized data (from_following, by_tags, by_categories) is not independently cached; it falls inside allForUser(), which is called fresh per request.
- Long-running cache busting: the trending command and reset command do not explicitly clear cache. The TTL is short enough that stale data self-expires within one trending cycle.
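The "let it expire" approach can be illustrated with a tiny in-memory TTL cache. This is not Laravel's Cache facade: `TtlCache` is a hypothetical stand-in, and the clock is passed in explicitly so the behaviour is deterministic.

```php
<?php
declare(strict_types=1);

/** Minimal in-memory stand-in for a TTL cache (not Laravel's Cache facade). */
final class TtlCache
{
    /** @var array<string, array{expires:int, value:mixed}> */
    private array $items = [];

    public function remember(string $key, int $ttlSeconds, callable $producer, int $now): mixed
    {
        $hit = $this->items[$key] ?? null;
        if ($hit !== null && $hit['expires'] > $now) {
            return $hit['value']; // still fresh
        }

        $value = $producer(); // expired or missing: rebuild
        $this->items[$key] = ['expires' => $now + $ttlSeconds, 'value' => $value];

        return $value;
    }
}

$cache = new TtlCache();
$calls = 0;
$build = function () use (&$calls): string {
    $calls++;
    return "trending v{$calls}";
};

$key = 'homepage.trending.12';
$a = $cache->remember($key, 300, $build, now: 0);   // miss: builds v1
$b = $cache->remember($key, 300, $build, now: 299); // hit: still v1
$c = $cache->remember($key, 300, $build, now: 300); // expired: builds v2
```

With a 5-minute TTL and a 30-minute trending cycle, every cached section is rebuilt several times between recalculations, which is why no explicit invalidation is needed.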
12. Scheduled Jobs
All registered in routes/console.php via Schedule::command().
| Time | Command | Purpose |
|---|---|---|
| Every 30 min | skinbase:recalculate-trending --period=24h | Update trending_score_24h |
| Every 30 min | skinbase:recalculate-trending --period=7d --skip-index | Update trending_score_7d (background) |
| 03:00 daily | uploads:cleanup | Remove stale draft uploads |
| 03:10 daily | analytics:aggregate-similar-artworks | Offline similarity metrics |
| 03:20 daily | analytics:aggregate-feed | Feed evaluation metrics |
| 03:30 daily | skinbase:reset-windowed-stats --period=24h | Zero views_24h, recompute downloads_24h |
| Monday 03:30 | skinbase:reset-windowed-stats --period=7d | Zero views_7d, recompute downloads_7d |
Reset runs at 03:30 so it fires after the other maintenance tasks (03:00–03:20). The next trending recalculation (every 30 min, including ~03:30 or ~04:00) picks up the freshly-zeroed windowed stats and writes accurate trending scores.
13. Testing
All tests live under tests/Feature/Discovery/.
| Test file | Coverage |
|---|---|
| ActivityEventRecordingTest.php | ActivityEvent::record(), all 5 types, actor relation, meta, route smoke tests for the activity feed |
| FollowingFeedTest.php | Auth redirect, empty state fallback, pagination, creator exclusion |
| HomepagePersonalizationTest.php | Guest vs auth homepage sections, preferences shape, 200 responses |
| SimilarArtworksApiTest.php | 404 cases, response shape, result count ≤ 12, creator exclusion |
| SignalTrackingTest.php | View endpoint (404s, first count, session dedup), download endpoint (404s, DB row, guest vs auth), route names |
| TrendingServiceTest.php | Zero artworks, skip outside window, skip private/unapproved; recalculate() tests skipped on SQLite (MySQL-only SQL) |
| WindowedStatsTest.php | incrementViews/Downloads update all 3 columns, reset command zeros views, recomputes downloads from log, window boundary correctness |
Run all discovery tests:
php artisan test tests/Feature/Discovery/
Run specific suite:
php artisan test tests/Feature/Discovery/SignalTrackingTest.php
SQLite vs MySQL note: Four tests in TrendingServiceTest are marked .skip() with the message "Requires MySQL: uses GREATEST() and TIMESTAMPDIFF()". Run them against a real MySQL instance in CI or staging to validate the bulk UPDATE formula.
14. AI Discovery v3
14.1 Overview
The v3 layer augments the existing recommendation engine with:
- CLIP-derived embeddings and tags
- BLIP captions
- YOLO object detections
- vector-gateway similarity search
- hybrid feed reranking and section generation
Primary request paths:
- GET /api/art/{id}/similar-ai
- POST /api/search/image
- POST /api/uploads/{id}/vision-suggest
Primary async jobs:
- AutoTagArtworkJob
- GenerateArtworkEmbeddingJob
- SyncArtworkVectorIndexJob
- BackfillArtworkVectorIndexJob
14.2 Core configuration
Vision gateway:
- VISION_ENABLED
- VISION_GATEWAY_URL
- VISION_GATEWAY_TIMEOUT
- VISION_GATEWAY_CONNECT_TIMEOUT
Vector gateway:
- VISION_VECTOR_GATEWAY_ENABLED
- VISION_VECTOR_GATEWAY_URL
- VISION_VECTOR_GATEWAY_API_KEY
- VISION_VECTOR_GATEWAY_COLLECTION
- VISION_VECTOR_GATEWAY_UPSERT_ENDPOINT
- VISION_VECTOR_GATEWAY_SEARCH_ENDPOINT
Hybrid feed:
- DISCOVERY_V3_ENABLED
- DISCOVERY_V3_CACHE_TTL_MINUTES
- DISCOVERY_V3_VECTOR_SIMILARITY_WEIGHT
- DISCOVERY_V3_VECTOR_BASE_SCORE
- DISCOVERY_V3_MAX_SEED_ARTWORKS
- DISCOVERY_V3_VECTOR_CANDIDATE_POOL
AI section sizing:
- DISCOVERY_V3_SECTION_SIMILAR_STYLE_LIMIT
- DISCOVERY_V3_SECTION_YOU_MAY_ALSO_LIKE_LIMIT
- DISCOVERY_V3_SECTION_VISUALLY_RELATED_LIMIT
14.3 Behavior notes
- Upload publish remains non-blocking for AI processing; derivatives can complete and the AI jobs are queued after the upload is finalized.
- The synchronous vision-suggest endpoint is only for immediate upload-step prefill and does not replace the queued persistence path.
- similar-ai and reverse image search return vector-gateway results only when the gateway is configured; otherwise they fail closed with explicit JSON reasons.
- Discovery sections are now tunable from config rather than fixed in code, which makes production adjustments safe without service edits.
15. Operational Runbook
Trending scores are stuck / not updating
# Check last calculated timestamp
SELECT id, title, last_trending_calculated_at FROM artworks ORDER BY last_trending_calculated_at DESC LIMIT 5;
# Manually trigger recalculation
php artisan skinbase:recalculate-trending --period=all
# Re-push scores to Meilisearch
php artisan skinbase:recalculate-trending --period=7d
Windowed counters look wrong after a deploy
# Force a reset and recompute
php artisan skinbase:reset-windowed-stats --period=24h
php artisan skinbase:reset-windowed-stats --period=7d
# Then recalculate trending with fresh numbers
php artisan skinbase:recalculate-trending --period=all
Meilisearch out of sync with DB
# Re-push all artworks in the trending window
php artisan skinbase:recalculate-trending --period=all
# Or full re-index
php artisan scout:import "App\Models\Artwork"
Push updated index settings (after changing config/scout.php)
php artisan scout:sync-index-settings
Check what the trending formula is reading
SELECT
a.id,
a.title,
a.published_at,
s.views,
s.views_24h,
s.views_7d,
s.downloads,
s.downloads_24h,
s.downloads_7d,
s.favorites,
a.trending_score_24h,
a.trending_score_7d,
a.last_trending_calculated_at
FROM artworks a
LEFT JOIN artwork_stats s ON s.artwork_id = a.id
WHERE a.is_public = 1 AND a.is_approved = 1
ORDER BY a.trending_score_7d DESC
LIMIT 20;
Inspect the artwork_downloads log
-- Downloads in the last 24 hours per artwork
SELECT artwork_id, COUNT(*) as dl_24h
FROM artwork_downloads
WHERE created_at >= NOW() - INTERVAL 1 DAY
GROUP BY artwork_id
ORDER BY dl_24h DESC
LIMIT 20;