System Overview
How We Build and Search the Vector
One query vector is built from your inning inputs, then compared against canonical game vectors. This keeps matching fast, consistent, and orientation-insensitive.
Processing Pipeline
1. Input: Take team inning values from the query form.
2. Canonicalize: Normalize A/B orientation so swaps map consistently.
3. Vectorize: Expand into the 65-dimensional feature structure.
4. Match: Run vector similarity search and rank by distance.
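The Match step can be illustrated with a brute-force ranking. This is only a sketch: the page does not say which distance metric or index the search uses, so Euclidean distance is an assumption here, and `euclidean`, `rank_by_distance`, and the `stored` mapping are hypothetical names.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_distance(query, stored, k=3):
    """Return the k closest (game_id, distance) pairs, nearest first.

    `stored` is a hypothetical dict mapping a game id to its saved vector.
    """
    scored = [(game_id, euclidean(query, vec)) for game_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

A real deployment would likely delegate this to a vector index instead of scanning every stored game, but the ranking rule (smaller distance means a closer match) is the same.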
Vector Shape
The search vector has exactly 65 dimensions: 30 + 30 + 5.
We support up to 30 innings per team in fixed slots, then append summary signals for totals, scoring timing, and extra innings.
Feature Layout
| Vector Slot(s) | Feature | Why We Use It |
|---|---|---|
| 0-29 | Team A inning runs (30 slots) | Captures the full scoring shape across regulation and extras in fixed positions. |
| 30-59 | Team B inning runs (30 slots) | Mirrors Team A so both sides are represented symmetrically in the vector. |
| Note | Inning-by-inning shape ends here and summary features begin. | Summary features add game-level signals that make similarity search between games more powerful. |
| 60 | Team A total runs | Adds game-level scoring magnitude beyond inning-by-inning pattern. |
| 61 | Team B total runs | Adds game-level scoring magnitude beyond inning-by-inning pattern. |
| 62 | Last scoring inning for Team A | Distinguishes late-comeback profiles from early-scoring profiles. |
| 63 | Last scoring inning for Team B | Distinguishes late-comeback profiles from early-scoring profiles. |
| 64 | Extra innings flag (0 or 1) | Separates regulation-only games from extra-inning games quickly. |
Exact Example (65 values)
The example input below is canonicalized (team order normalized), then expanded into the full 65-value vector.
Input Team A: [0,0,0,0,1,0,0,0,0]
Input Team B: [1,0,0,0,2,0,0,0,0]
Canonical Team A: [0,0,0,0,1,0,0,0,0]
Canonical Team B: [1,0,0,0,2,0,0,0,0]
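The expansion from canonical innings to 65 values might look like the sketch below. Several details are assumptions because the page does not state them: unused inning slots are padded with zeros, the last scoring inning is 1-based with 0 meaning the team never scored, and the extra innings flag is set when either team has more than 9 innings. The names `pad`, `last_scoring_inning`, and `build_vector` are hypothetical.

```python
MAX_INNINGS = 30        # fixed slots per team, per the feature layout
REGULATION_INNINGS = 9  # assumption: more than 9 innings means extras


def pad(innings):
    """Place inning runs in fixed slots, zero-filling unused ones (assumed)."""
    if len(innings) > MAX_INNINGS:
        raise ValueError("at most 30 innings per team are supported")
    return list(innings) + [0] * (MAX_INNINGS - len(innings))


def last_scoring_inning(innings):
    """1-based inning of the last run scored, or 0 if none (assumed encoding)."""
    last = 0
    for number, runs in enumerate(innings, start=1):
        if runs > 0:
            last = number
    return last


def build_vector(team_a, team_b):
    """Build the 65-value vector: 30 + 30 inning slots, then 5 summary values."""
    extras = 1 if max(len(team_a), len(team_b)) > REGULATION_INNINGS else 0
    summary = [
        sum(team_a),                   # slot 60: Team A total runs
        sum(team_b),                   # slot 61: Team B total runs
        last_scoring_inning(team_a),   # slot 62
        last_scoring_inning(team_b),   # slot 63
        extras,                        # slot 64
    ]
    return pad(team_a) + pad(team_b) + summary


vector = build_vector([0, 0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 2, 0, 0, 0, 0])
print(len(vector))   # 65
print(vector[60:])   # [1, 3, 5, 5, 0]
```

Under these assumptions, the canonical example above yields totals of 1 and 3, a last scoring inning of 5 for both teams, and an extra innings flag of 0.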
Canonicalization Example (A/B vs B/A)
Canonicalization places the lesser of the two inning vectors in the away slot and the greater in the home slot. If the two vectors are equal, the order does not matter.
At write time, innings are canonicalized and one vector is saved.
Example 1 input: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}
Example 2 input (swapped): {"awayInnings":[1,0,0,0,2,0,0,0,0],"homeInnings":[0,0,0,0,1,0,0,0,0]}
Canonical output 1: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}
Canonical output 2: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}
Canonical outputs match: Yes
Resulting vectors match: Yes
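The swap rule can be sketched in a few lines. The page does not define how "lesser" is decided, so this sketch assumes element-by-element (lexicographic) comparison, which is how Python compares lists; `canonicalize` is a hypothetical name.

```python
def canonicalize(away_innings, home_innings):
    """Return (away, home) with the lesser inning list first.

    Assumption: "lesser" means lexicographic list order. Equal lists are
    returned unchanged, since either order gives the same result.
    """
    away, home = list(away_innings), list(home_innings)
    if away <= home:
        return away, home
    return home, away


example_1 = canonicalize([0, 0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 2, 0, 0, 0, 0])
example_2 = canonicalize([1, 0, 0, 0, 2, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0, 0, 0])
print(example_1 == example_2)  # True
```

Because both orientations reduce to the same pair, any vector built from the canonical pair is identical for A/B and B/A input.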