Real-Time Sentiment Scoring Across 7,000+ Tickers
The Challenge
A quantitative trading operation needed to score every NYSE-listed stock and ETF for directional sentiment — but their Python batch application couldn’t scale past the Nasdaq 100. The pipeline fetched 1-minute OHLCV data from Polygon, calculated a dozen technical indicators, and ran ensemble regression models (AdaBoost, ExtraTrees, MLP) to produce a sentiment signal from −100 (bearish) to +100 (bullish). Even with 8 task threads, fetching data and running prediction for each ticker took long enough that only ~100 tickers could be scored daily — while the universe of interest numbered in the thousands. Adding more containers scaled linearly at best, duplicating the full application stack per instance.
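To make the output scale concrete, here is a minimal sketch of how several regression outputs could be folded into one bounded signal. The source does not say how the ensemble members are combined, so the plain average used here is an assumption, and `sentiment_score` is a hypothetical name for illustration only.

```python
from statistics import fmean
from typing import Sequence


def sentiment_score(predictions: Sequence[float]) -> float:
    """Combine per-model predictions into one signal in [-100, 100].

    Assumption: the ensemble members (e.g. AdaBoost, ExtraTrees, MLP)
    each emit a raw score and are combined by an unweighted mean.
    The real pipeline may weight or stack them differently.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    mean = fmean(predictions)
    # Clamp so an overshooting regressor cannot leave the documented range.
    return max(-100.0, min(100.0, mean))


if __name__ == "__main__":
    # Three hypothetical model outputs for one ticker.
    print(sentiment_score([50.0, 70.0, 90.0]))
```

Clamping after averaging keeps the signal inside the −100 to +100 range even when an individual regressor extrapolates beyond it.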
What We Built
We decomposed the monolith into serverless functions orchestrated by a state machine. Go handles per-ticker data fetching from Polygon and coordinates function invocation; Python remains in the indicator calculation, training, and prediction functions, where the ML libraries live. The re-architecture eliminated the serial bottleneck entirely: each ticker’s fetch-and-score cycle now runs as an independently scalable, stateless function that spins up on demand and tears down between runs. The ensemble models and their specialized dependencies are packaged per function, so inference is self-contained.
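The per-ticker fan-out can be illustrated with a small sketch. In the system described here the coordination is done in Go under a state machine; the version below is only a Python stand-in that uses a thread pool in place of serverless invocations, and `fan_out` and `score_one` are hypothetical names, not part of the actual codebase.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fan_out(
    tickers: Iterable[str],
    score_one: Callable[[str], float],
    max_workers: int = 64,
) -> Dict[str, float]:
    """Score every ticker independently and collect the results.

    Assumption: score_one is a stateless callable standing in for one
    fetch-and-score function invocation. No state is shared between
    tickers, so the calls can run in any order and in parallel.
    """
    ticker_list = list(tickers)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each ticker
        # with its own score.
        scores = pool.map(score_one, ticker_list)
        return dict(zip(ticker_list, scores))


if __name__ == "__main__":
    # Placeholder scorer; a real one would fetch bars and run inference.
    print(fan_out(["AAA", "BBB", "CCC"], lambda ticker: 0.0))
```

Because each call depends only on its own ticker, widening the universe means raising concurrency rather than lengthening a serial loop.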
The Result
The pipeline now scores 7,000+ stocks and ETFs every 5 minutes, a 70× expansion in coverage over the original 100-ticker daily batch, at a cost measured in dollars per day, or a few hundred dollars per month.
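The arithmetic behind these figures can be checked with a short sketch. The ticker counts and the 5-minute interval come from the text above; the 390-minute (6.5-hour) trading session used for the per-day count is an assumption, since the source does not say over which hours the pipeline runs.

```python
def coverage_ratio(new_tickers: int, old_tickers: int) -> float:
    """How many times larger the scored universe is."""
    return new_tickers / old_tickers


def tickers_per_second(tickers: int, interval_minutes: int) -> float:
    """Sustained throughput needed to finish one full cycle per interval."""
    return tickers / (interval_minutes * 60)


def scorings_per_session(
    tickers: int, interval_minutes: int, session_minutes: int = 390
) -> int:
    """Ticker scorings per session.

    Assumption: a 390-minute regular trading session; not stated in the source.
    """
    cycles = session_minutes // interval_minutes
    return cycles * tickers


if __name__ == "__main__":
    print(coverage_ratio(7000, 100))       # 70.0
    print(tickers_per_second(7000, 5))     # roughly 23.3 per second
    print(scorings_per_session(7000, 5))   # 546000 under the session assumption
```

The 70× figure covers breadth only; the shift from one daily batch to a 5-minute cadence multiplies the number of scorings per day on top of that.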