Standalone, no-auth public discovery page at
/explorethat reads from the existing FAQ DB and ships its own anonymous analytics.
Goals
Non-goals
FAQ (additive, non-breaking)popularityScore : Number // recomputed every 5 min by background job
guestViewCount : Number // anonymous view count (separate from auth views)
avgReadCompletion : Number // 0..1 — mean scroll depth of guests
avgTimeSpentRatio : Number // 0..1 — actual time / expected reading time
guestViewLast24h : Number // rolling counter, drives "trending"
wordCount : Number // cached word count of (question + answer)
expectedReadMs : Number // 200 wpm × wordCount (cached for fast scoring)
popularityUpdatedAt: Date // last score recompute
Indexes added:
{ status: 1, popularityScore: -1 } — popular list{ status: 1, createdAt: -1 } — recent list (already exists){ status: 1, category: 1, popularityScore: -1 } — category browse rankedGuestEvent (raw event buffer){
faqId : ObjectId,
guestId : String, // random UUID stored in httpOnly cookie
sessionId : String, // per-tab session for view-dedup
type : 'view' | 'read' | 'completion' | 'scroll',
dwellMs : Number, // for read events
scrollPct : Number, // 0..1
faqLength : Number, // word count, snapshotted at event time
createdAt : Date
}
Indexes:
{ faqId: 1, type: 1, createdAt: -1 } — aggregation{ createdAt: 1 } expireAfterSeconds: 7d — auto-prune raw events{ guestId: 1, faqId: 1, type: 1, createdAt: -1 } — dedup lookupsThe FAQ.views field stays as the authenticated-user view counter. The new
guestViewCount is the anonymous equivalent — we do not merge them, so
admin analytics remain meaningful.
Implemented in backend/utils/popularityScore.ts. Pure function over
aggregated metrics, runs every 5 min in a background job, not on read path.
popularity_score = (view_weight * log1p(guestViewCount))
+ (recency_weight * recencyBoost)
+ (engagement_weight * (avgReadCompletion * avgTimeSpentRatio))
+ (trust_weight * trustBoost)
Component details (all normalised to 0..1):
| Component | Formula | Default weight |
|---|---|---|
view_weight |
min(1, log10(1 + guestViewCount) / log10(500)) |
0.40 |
recency_weight |
exp(-ageDays / 30) (half-life ~21 days) |
0.20 |
engagement_weight |
0.5 * avgReadCompletion + 0.5 * avgTimeSpentRatio |
0.30 |
trust_weight |
expert=1.0, high=0.8, medium=0.5, low=0.2 |
0.10 |
Why this is robust to gaming:
guestId + 30-min dedup window prevents the same person from inflating
scores by refreshing.Re-compute trigger: every 5 min via setInterval (reuses the existing
runRetention-style pattern in server.ts).
Re-compute query: single Mongo aggregation pipeline over FAQ collection
using $set with arithmetic expressions on the cached metric fields — O(N)
once per 5 min, no per-request work on the hot path.
/api/public/*All routes public, no auth, no PII. Soft rate limit per IP (200 req/min) on read endpoints, tighter (60 req/min) on tracking endpoints to absorb analytics write amplification.
| Method | Path | Purpose |
|---|---|---|
| GET | /api/public/popular-faqs |
Top N by popularityScore |
| GET | /api/public/recent-faqs |
Newest N by createdAt |
| GET | /api/public/categories |
All categories with counts (and per-category top 3) |
| GET | /api/public/search?q=&category= |
Text search, optional category filter |
| GET | /api/public/faqs/:id |
Single FAQ (anonymous-safe view) |
| POST | /api/public/track-view |
Page open event (idempotent within 30 min/guest) |
| POST | /api/public/track-reading |
Dwell time + scroll depth + completion event |
Tracking payloads — PII-free by construction:
// POST /api/public/track-view
{ "faqId": "6655...", "sessionId": "tab-uuid" }
// POST /api/public/track-reading
{ "faqId": "6655...", "sessionId": "tab-uuid", "dwellMs": 18200,
"scrollPct": 0.83, "faqLength": 240 }
guestId comes from an httpOnly, SameSite=Lax cookie set on first
public-page hit. The server never reads IP for analytics, never stores
User-Agent beyond what’s already in standard request logs.
Response shape (popular):
{
"faqs": [
{ "_id": "...", "question": "...", "answer": "...",
"category": "...", "tags": [...], "createdAt": "...",
"popularityScore": 12.4, "guestViewCount": 217,
"avgReadCompletion": 0.72, "wordCount": 240 }
],
"generatedAt": "2026-06-10T..."
}
Caching: popular and recent endpoints are wrapped in an in-process LRU
(5-min TTL) and an optional Redis cache (already wired via
utils/cache.ts). Categories and search are not cached (filter dimensions
are too varied for hit-rate).
Two client-side hooks, both fire-and-forget with navigator.sendBeacon
on pagehide to survive tab close:
useViewTracker(faqId) — fires on mount, idempotent per
(guestId, faqId, 30-min-bucket). Server checks GuestEvent for a
recent view event with the same key and skips the increment.
useReadingTracker(faqId) — observes scroll depth every 250 ms
(rAF-throttled), tracks total dwell from mount, computes
completionPct = maxScrollY / articleHeight. On pagehide or
visibilitychange→hidden it POSTs one read event with
{ dwellMs, scrollPct, faqLength }.
Both endpoints accept the event but never echo back per-user data and
never require identification. Events buffer to GuestEvent and are
folded into FAQ metrics by the next aggregation tick.
frontend/src/
├── pages/
│ └── ExplorePage.tsx // route: /explore
└── components/
└── explore/
├── ExploreHero.tsx // title + search + tags
├── ExploreSearchBar.tsx // debounced, highlights
├── PopularFaqsCard.tsx // "Most Popular" column
├── RecentFaqsCard.tsx // "Recent FAQs" column
├── CategoriesCard.tsx // "Browse Categories" column
├── FaqListItem.tsx // shared numbered row
├── CategoryAccordion.tsx // collapsible section
├── CategoryFaqList.tsx // list inside accordion
├── ReadingTracker.tsx // mount-time scroll/dwell observer
├── ExploreSkeleton.tsx // loading placeholders
├── ExploreEmpty.tsx // empty states
├── highlightMatch.ts // <mark> for search results
└── usePublicFaqApi.ts // cached fetcher with abort
Matching the existing FAQ Hive aesthetic:
bg-bg), bg-card rounded-2xl with border-border.shadow-subtle once scrolled past hero.lg, stacks to single column on mobile.1 / 2 / 3 style).text-accent) for icons + hover.State / data flow:
usePopularFaqs, useRecentFaqs,
useCategories) each with skeleton + error + empty states.useDebouncedSearch) that cancels
the prior request via AbortController (same pattern as SearchBar).| Concern | Mitigation |
|---|---|
| View count inflation | 30-min dedup per (guestId, faqId) in track-view |
| Read-event spam | 60 req/min/IP rate limit on /track-* |
| Search abuse | 30 req/min/IP rate limit on /search |
| DDoS amplification | Helmet + global IP limiter (already in server.ts) |
| PII collection | No IP, no UA, no fingerprint stored — only UUID cookie |
| XSS via answer body | sanitizeHtml already applied to FAQ.answer (admin write path) |
| Cookie scope | httpOnly, SameSite=Lax, no Secure in dev |
| Cookie expiry | 90 days, sliding |
| Score manipulation | Score uses log + engagement, not raw views |
Read path (popular / recent / categories / search):
popularityScore and indexed createdAt..lean() to skip Mongoose hydration.utils/cache.ts).utils/api.ts — add routes to
CACHE_CONFIGS.Write path (tracking events):
track-view and track-reading write to GuestEvent only.track-view also bumps FAQ.guestViewCount directly via
findOneAndUpdate({ _id }, { $inc: ... }) — single indexed update,
O(1).track-reading does not update the FAQ inline. The aggregation
job reads from GuestEvent and re-derives avgReadCompletion /
avgTimeSpentRatio from the buffer.Background aggregation (every 5 min):
$merge into FAQ collection.popularityScore, popularityUpdatedAt only — does not touch
guestViewCount (incremented inline).Scale targets (designed for):
track-view writes/day → 1M+ GuestEvent
inserts, retained 7 days, ~7M rolling doc count.backend/
├── models/
│ ├── FAQ.ts (extended)
│ └── GuestEvent.ts (new)
├── utils/
│ └── popularityScore.ts (new)
├── controllers/
│ └── publicFaqController.ts (new)
├── routes/
│ └── publicFaq.ts (new)
└── server.ts (mount + scheduler)
frontend/src/
├── pages/
│ └── ExplorePage.tsx (new)
├── components/explore/ (new dir, 11 files)
└── App.tsx (add /explore route)