
  • semi_sentient 2 days ago

    Follow-up posts for context (same series):

    Part 2 – Data Layer (feature store to prevent online/offline skew; vector DB choices and pre- vs post-filtering, with a toy sketch below): https://www.shaped.ai/blog/the-infrastructure-of-modern-rank...

    Part 3 – MLOps Backbone (training pipelines, registry, GitOps deployment, monitoring/drift/A-B): https://www.shaped.ai/blog/the-infrastructure-of-modern-rank...
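
    On the pre- vs post-filtering point from Part 2, here's the tradeoff in miniature. This is a brute-force toy in plain NumPy, not any particular vector DB's API:

        import numpy as np

        rng = np.random.default_rng(0)
        vecs = rng.normal(size=(10_000, 64)).astype(np.float32)  # toy corpus
        in_stock = rng.random(10_000) > 0.5                      # toy metadata filter
        q = rng.normal(size=64).astype(np.float32)

        def top_k(matrix, query, k):
            return np.argsort(-(matrix @ query))[:k]

        # Pre-filtering: shrink the corpus to matching items, then search.
        # Always yields k valid hits, but the index must support filtered search.
        pre_ids = np.flatnonzero(in_stock)
        pre_hits = pre_ids[top_k(vecs[in_stock], q, 10)]

        # Post-filtering: search everything with an over-fetch, then filter.
        # Cheap on the index side, but can come back with fewer than k hits.
        cand = top_k(vecs, q, 100)
        post_hits = cand[in_stock[cand]][:10]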

    Happy to share more detail (autoscaling policies, index swaps, point-in-time joins, GPU batching) if helpful.
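
    Since point-in-time joins come up a lot: the idea is that each training example only joins against feature values as of that example's timestamp, so training never sees data the online system couldn't have had. A toy sketch with pandas' merge_asof (column names and values are made up, not from the post):

        import pandas as pd

        # Hypothetical label events: one row per (user, timestamp) training example.
        labels = pd.DataFrame({
            "user_id": [1, 1, 2],
            "event_ts": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-10"]),
            "clicked": [1, 0, 1],
        })

        # Hypothetical feature snapshots, appended each time features are recomputed.
        features = pd.DataFrame({
            "user_id": [1, 1, 2],
            "feature_ts": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-01-08"]),
            "ctr_7d": [0.12, 0.18, 0.05],
        })

        # Point-in-time join: for each label, take the latest feature row whose
        # timestamp is <= the event timestamp, so no future data leaks into training.
        train = pd.merge_asof(
            labels.sort_values("event_ts"),
            features.sort_values("feature_ts"),
            left_on="event_ts",
            right_on="feature_ts",
            by="user_id",
            direction="backward",
        )
        print(train[["user_id", "event_ts", "ctr_7d", "clicked"]])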

  • semi_sentient 2 days ago

    Modern ranking systems (feeds, search, recommendations) have strict latency budgets, often under 200 ms at p99. This write-up describes how we designed a production system using a decoupled microservice architecture for serving, a feature-store + vector-store data layer, and an automated MLOps pipeline for training → deployment. It's less about modeling and more about the infrastructure that keeps it all running.
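
    To make the latency-budget point concrete, here's the rough shape of a deadline-aware serving path. Toy Python with made-up stage costs and placeholder functions, not our actual services:

        import time

        LATENCY_BUDGET_S = 0.200  # the ~200 ms p99 budget mentioned above

        def retrieve(query):
            # Placeholder for candidate generation (e.g. an ANN lookup).
            return [f"doc{i}" for i in range(500)]

        def heavy_rank(query, candidates):
            # Placeholder for the expensive (GPU) ranking model call.
            return sorted(candidates)

        def serve(query):
            deadline = time.monotonic() + LATENCY_BUDGET_S
            candidates = retrieve(query)
            # If retrieval ate too much of the budget, skip the heavy ranker
            # and return lightly-ordered candidates instead of timing out.
            if deadline - time.monotonic() < 0.050:  # assumed ~50 ms ranker cost
                return candidates[:50]
            return heavy_rank(query, candidates)[:50]

        print(serve("running shoes")[:3])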