Documentation

How we convert linescores into a fixed 65-dimensional search vector.

System Overview

How We Build and Search the Vector

One query vector is built from your inning inputs, then compared against canonical game vectors. This keeps matching fast, consistent, and orientation-insensitive.

Processing Pipeline

  1. Input: Take team inning values from the query form.

  2. Canonicalize: Normalize A/B orientation so swaps map consistently.

  3. Vectorize: Expand into the 65-dimensional feature structure.

  4. Match: Run vector similarity search and rank by distance.
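The Match step can be sketched as a brute-force nearest-neighbor ranking. The distance metric and the storage backend are not stated here, so Euclidean distance, the in-memory dictionary, and the names `rank_by_distance` and `stored` are assumptions made for illustration. The toy vectors are 3-dimensional rather than 65-dimensional to keep the example short.

```python
import math


def rank_by_distance(query, stored, k=3):
    """Return the k closest stored games as (game_id, distance) pairs.

    `stored` maps a hypothetical game id to its vector.
    Euclidean distance is an assumption; the real metric may differ.
    """
    scored = [(math.dist(query, vec), game_id) for game_id, vec in stored.items()]
    scored.sort()  # smallest distance first; ties broken by game id
    return [(game_id, dist) for dist, game_id in scored[:k]]


# Toy data: three stored "games" and one query.
stored = {"g1": [1, 0, 0], "g2": [0, 0, 0], "g3": [3, 0, 0]}
closest = rank_by_distance([1, 0, 0], stored, k=2)
```

A production system would more likely delegate this to a vector index, but the ranking idea is the same: smaller distance means a closer match.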

Vector Shape

The search vector has exactly 65 dimensions: 30 + 30 + 5.

We support up to 30 innings per team in fixed slots, then append summary signals for totals, scoring timing, and extra innings.

Feature Layout

| Vector Slot(s) | Feature | Why We Use It |
|---|---|---|
| 0-29 | Team A inning runs (30 slots) | Captures the full scoring shape across regulation and extras in fixed positions. |
| 30-59 | Team B inning runs (30 slots) | Mirrors Team A so both sides are represented symmetrically in the vector. |
| 60 | Team A total runs | Adds game-level scoring magnitude beyond the inning-by-inning pattern. |
| 61 | Team B total runs | Adds game-level scoring magnitude beyond the inning-by-inning pattern. |
| 62 | Last scoring inning for Team A | Distinguishes late-comeback profiles from early-scoring profiles. |
| 63 | Last scoring inning for Team B | Distinguishes late-comeback profiles from early-scoring profiles. |
| 64 | Extra innings flag (0 or 1) | Separates regulation-only games from extra-inning games quickly. |

Note

The inning-by-inning shape ends at slot 59 and the summary features begin at slot 60. The summary features give similarity search game-level signals in addition to the inning pattern.
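The layout above can be sketched as a small builder function. This is an illustration under stated assumptions, not the actual implementation: the names `pad_innings`, `last_scoring_inning`, and `build_vector` are hypothetical, the last scoring inning is treated as 1-based (which matches the example vector in this document), a scoreless team is assumed to get 0, and the extra innings flag is assumed to be set when either team has more than 9 innings.

```python
MAX_INNINGS = 30   # fixed inning slots per team
VECTOR_SIZE = 65   # 30 + 30 + 5


def pad_innings(runs):
    """Right-pad a team's inning runs with zeros to 30 fixed slots."""
    if len(runs) > MAX_INNINGS:
        raise ValueError("at most 30 innings per team are supported")
    return list(runs) + [0] * (MAX_INNINGS - len(runs))


def last_scoring_inning(runs):
    """1-based number of the last inning with a run; 0 if none (assumption)."""
    last = 0
    for inning, scored in enumerate(runs, start=1):
        if scored > 0:
            last = inning
    return last


def build_vector(team_a, team_b, regulation_innings=9):
    """Assemble slots 0-29, 30-59, and the five summary slots 60-64."""
    # Assumption: a game counts as extra innings if either side exceeds 9.
    extras = 1 if max(len(team_a), len(team_b)) > regulation_innings else 0
    summary = [
        sum(team_a),                   # slot 60
        sum(team_b),                   # slot 61
        last_scoring_inning(team_a),   # slot 62
        last_scoring_inning(team_b),   # slot 63
        extras,                        # slot 64
    ]
    return pad_innings(team_a) + pad_innings(team_b) + summary


vector = build_vector([0, 0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 2, 0, 0, 0, 0])
```

Zero-padding keeps every inning in the same slot regardless of game length, so two games can be compared position by position.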

Exact Example (65 values)

The example input below is canonicalized (team order normalized), then expanded into the full 65-value vector. Here the input is already in canonical order, so canonicalization leaves it unchanged.

Input Team A: [0,0,0,0,1,0,0,0,0]

Input Team B: [1,0,0,0,2,0,0,0,0]

Canonical Team A: [0,0,0,0,1,0,0,0,0]

Canonical Team B: [1,0,0,0,2,0,0,0,0]

[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 5, 5, 0]
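To read a vector like the one above, it can help to slice it back into its parts. The helper below is a hypothetical sketch (`split_vector` and the dictionary keys are invented names) that follows the documented slot layout and rebuilds the example vector from its pieces rather than retyping all 65 values.

```python
def split_vector(vector):
    """Slice a 65-value vector into its documented sections."""
    if len(vector) != 65:
        raise ValueError("expected exactly 65 values")
    return {
        "team_a_innings": vector[0:30],
        "team_b_innings": vector[30:60],
        "team_a_total": vector[60],
        "team_b_total": vector[61],
        "team_a_last_scoring_inning": vector[62],
        "team_b_last_scoring_inning": vector[63],
        "extra_innings": vector[64],
    }


# The example vector: 9 innings plus 21 padding zeros per team, then summary.
example = (
    [0, 0, 0, 0, 1, 0, 0, 0, 0] + [0] * 21
    + [1, 0, 0, 0, 2, 0, 0, 0, 0] + [0] * 21
    + [1, 3, 5, 5, 0]
)
parts = split_vector(example)

# Sanity check: the total slots agree with the inning slots.
totals_consistent = (
    sum(parts["team_a_innings"]) == parts["team_a_total"]
    and sum(parts["team_b_innings"]) == parts["team_b_total"]
)
```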

Canonicalization Example (A/B vs B/A)

Note

Canonicalization places the lesser inning sequence in the away slot and the greater one in the home slot. If the two sequences are identical, the order does not matter.

At write time, innings are canonicalized and one vector is saved.

Example 1 input: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}

Example 2 input (swapped): {"awayInnings":[1,0,0,0,2,0,0,0,0],"homeInnings":[0,0,0,0,1,0,0,0,0]}

Canonical output 1: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}

Canonical output 2: {"awayInnings":[0,0,0,0,1,0,0,0,0],"homeInnings":[1,0,0,0,2,0,0,0,0]}

Canonical outputs match: Yes

Resulting vectors match: Yes
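The swap rule can be sketched in a few lines. How "lesser" is decided is not spelled out in this document, so the sketch assumes an element-by-element (lexicographic) comparison of the two inning lists, which is how Python compares lists. The function name `canonicalize` is hypothetical, and the keys mirror the JSON shown above.

```python
def canonicalize(away_innings, home_innings):
    """Put the lesser inning list in away and the greater in home.

    Assumption: "lesser" means lexicographic order of the run lists.
    """
    away, home = list(away_innings), list(home_innings)
    if home < away:  # Python list comparison is lexicographic
        away, home = home, away
    return {"awayInnings": away, "homeInnings": home}


# The two example inputs, one the swap of the other.
out1 = canonicalize([0, 0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 2, 0, 0, 0, 0])
out2 = canonicalize([1, 0, 0, 0, 2, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0, 0, 0])
outputs_match = out1 == out2
```

Because both orientations produce the same canonical pair, they also produce the same vector, so a query matches a stored game regardless of which team was entered first.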