Sergei Notevskii

Start Here

How to choose a reading path based on the AI platform problem in front of you.

Foundation
v0.1
Updated May 23, 2026
AI Platform Leads
Staff Engineers
Engineering Managers
roadmap
start
artifacts

Problem

The demo works. Then the real questions start: who owns quality, where cost is visible, why latency drifts, how a model change ships, and how product teams avoid building local AI stacks in every service.

This page helps you choose the first path. Do not read the handbook linearly.

Quick Choice

Thinking About Moving From MaaS To Self-hosted?

Do not start with model choice or GPU choice. Start with the scenario: what data is involved, which SLA matters, what the traffic shape looks like, how quality will be evaluated and who will own operations.

Use this first review path:

  1. Scenario: what exactly is moving?
  2. Data: can the input be de-identified and stay on MaaS?
  3. SLA: is the workload real-time, long-context or batch?
  4. Quality: do evals exist before the migration?
  5. Economics: did you count production, staging, test, debug, on-call and reserve capacity?

If these questions have no answers, self-hosted is not a strategy yet. It is an expensive experiment.
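The review path above can be captured as a small pre-migration gate. This is a minimal sketch, not part of the handbook: the class name, field names and verdict strings are hypothetical, and it assumes that any single unanswered question is enough to block the migration.

```python
from dataclasses import dataclass, fields


@dataclass
class MigrationReview:
    """One flag per question in the review path (field names are illustrative)."""
    scenario_defined: bool   # 1. what exactly is moving?
    data_reviewed: bool      # 2. can the input be de-identified and stay on MaaS?
    sla_classified: bool     # 3. real-time, long-context or batch?
    evals_exist: bool        # 4. do evals exist before the migration?
    full_cost_counted: bool  # 5. production, staging, test, debug, on-call, reserve


def open_questions(review: MigrationReview) -> list[str]:
    """Return the unanswered questions, in review-path order."""
    return [f.name for f in fields(review) if not getattr(review, f.name)]


def verdict(review: MigrationReview) -> str:
    """Assumption: one open question is enough to block the migration."""
    if open_questions(review):
        return "expensive experiment"
    return "ready for a strategy discussion"


if __name__ == "__main__":
    review = MigrationReview(True, True, True, False, False)
    print(open_questions(review))
    print(verdict(review))
```

Keeping the answers as explicit fields makes the missing ones visible in a review document instead of leaving them implicit in a GPU quote.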

Before GPUs

If there are no evals and no cost baseline, start the self-hosted migration with a scenario document, not with GPU selection.

Mental model

A production AI platform is not one layer. It is the connection between product scenarios, gateway, routing, inference, cache, evals, observability, cost, guardrails and ownership.

Start with the current pain. If cost hurts, go to economics and cache. If quality hurts, go to evals. If product teams are fragmented, go to gateway and ownership.
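The same rule can be written as a lookup from pain to starting topics. This is a sketch only: the keys and the `first_path` helper are hypothetical, and the fallback to the platform map for an unrecognized pain is an assumption, not something the handbook prescribes.

```python
# Pain -> topics to open first, taken from the paragraph above.
FIRST_PATH = {
    "cost": ["economics", "cache"],
    "quality": ["evals"],
    "fragmented teams": ["gateway", "ownership"],
}


def first_path(pain: str) -> list[str]:
    """Return the starting topics for a pain; unknown pains fall back to the map (assumption)."""
    return FIRST_PATH.get(pain.strip().lower(), ["platform map"])


if __name__ == "__main__":
    print(first_path("Cost"))
    print(first_path("latency"))
```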

Paths By Role

  • AI Platform Lead: map, Semantic Router, gateway, economics, prefix cache, observability, ownership.
  • Staff Engineer: inference runtime, prefix cache, context budget, tool stability, router evals.
  • CTO / Head of Engineering: maturity model, MaaS vs self-hosted, inference economics, operating model.
  • Product Engineer: start here, gateway, observability checklist, quality gate.

What To Read First

  • If there is no shared map, start with Platform Map.
  • If maturity language is missing, start with Maturity Model.
  • If MaaS vs self-hosted is disputed, start with the MaaS vs self-hosted strategy chapter.
  • If cache does not help, start with Prefix Cache.
  • If model releases are risky, start with AI Quality Gate.

How To Apply It

Use each chapter as a review checklist. After reading, you should know:

  • which platform layer is involved;
  • which owner is missing;
  • which metrics matter;
  • what can break in production;
  • which next document or tool to open.

Example

A team says: "the model became too expensive." Do not start by swapping the model. Check the route, retries, cached tokens, tool schema stability, accepted outcome rate and fallback events. That path usually finds the real cause faster than a model swap does.
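A rough way to run that check is to aggregate request logs per route before touching the model. The sketch below assumes a hypothetical log record: every field name, including `tool_schema_hash` as a proxy for tool schema stability, is illustrative and not the schema of any real gateway.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """Hypothetical per-request log entry; all field names are assumptions."""
    route: str
    prompt_tokens: int
    cached_tokens: int     # prompt tokens served from the prefix cache
    retries: int
    accepted: bool         # outcome accepted by the user or a downstream check
    fallback: bool         # request was served by a fallback model or path
    tool_schema_hash: str  # changes whenever the tool schema changes


def summarize_by_route(records: list[RequestRecord]) -> dict[str, dict[str, float]]:
    """Aggregate the signals from the example, grouped by route."""
    groups: dict[str, list[RequestRecord]] = defaultdict(list)
    for record in records:
        groups[record.route].append(record)

    report: dict[str, dict[str, float]] = {}
    for route, rs in groups.items():
        n = len(rs)
        prompt = sum(r.prompt_tokens for r in rs)
        report[route] = {
            "requests": n,
            "retries_per_request": sum(r.retries for r in rs) / n,
            "cached_token_ratio": sum(r.cached_tokens for r in rs) / prompt if prompt else 0.0,
            "accepted_rate": sum(r.accepted for r in rs) / n,
            "fallback_rate": sum(r.fallback for r in rs) / n,
            # More than one hash suggests schema churn, which can invalidate cached prefixes.
            "schema_versions": len({r.tool_schema_hash for r in rs}),
        }
    return report


if __name__ == "__main__":
    logs = [
        RequestRecord("chat", 1000, 800, 0, True, False, "a"),
        RequestRecord("chat", 1000, 0, 2, False, True, "b"),
    ]
    print(summarize_by_route(logs)["chat"])
```

A route with high retries, a low cached-token ratio or several schema versions is a more likely cost driver than the model price itself.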
