Scoring Methodology
How relevance scores and signal rankings are computed across the platform.
Polymarket Relevance Score v1 · Mar 2026
Two-stage keyword filter that scores prediction markets 0–1 for energy & macro relevance.
Markets scoring 0.0 are discarded entirely and never stored.
Stage 1 — Hard Exclusion
Before any scoring, markets are dropped if they match a hard exclusion pattern. This eliminates noise from categories that are never relevant to an energy fund.
NFL / NBA / NHL / MLB
Oscar / Emmy / Grammy
Bitcoin / ETH / crypto
Celebrity / reality TV
Video games / esports
Meme coins
Sports championships
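As a rough illustration of Stage 1, the check can be sketched as a single case-insensitive regular expression. The pattern list below is a hypothetical subset of the categories above, and `is_excluded` is an illustrative name, not the platform's actual function. Word boundaries are used so that a short token such as "eth" does not fire inside an unrelated word.

```python
import re

# Hypothetical patterns: an illustrative subset of the exclusion categories.
EXCLUSION_PATTERNS = [
    r"\b(nfl|nba|nhl|mlb)\b",
    r"\b(oscars?|emmys?|grammys?)\b",
    r"\b(bitcoin|eth|crypto\w*)\b",
    r"\bmeme ?coins?\b",
    r"\besports\b",
]

# One compiled alternation, matched case-insensitively.
_EXCLUDE_RE = re.compile("|".join(EXCLUSION_PATTERNS), re.IGNORECASE)


def is_excluded(title: str) -> bool:
    """Return True if the market title matches any hard exclusion pattern."""
    return _EXCLUDE_RE.search(title) is not None


if __name__ == "__main__":
    for title in [
        "Will the NBA finals go to game 7?",
        "Will Brent crude close above $90?",
        "Will the Netherlands ban gas boilers?",
    ]:
        print(title, "->", "drop" if is_excluded(title) else "keep")
```

Markets for which `is_excluded` returns True would be dropped before Stage 2 runs.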
Stage 2 — Tiered Keyword Scoring
| Tier | Topic | Score range |
|---|---|---|
| T1 | Core energy & commodities | 0.70 – 1.00 |
| T2 | Energy-adjacent & broader commodities | 0.45 – 0.75 |
| T3 | Macro & geopolitical signals | 0.20 – 0.40 |
Scoring is additive within each tier up to its cap. Only the highest matching tier is used — a T1 match short-circuits T2/T3 evaluation.
Tier 1 — Core Energy Keywords
crude oil
wti
brent
oil price
natural gas
lng
lpg
opec
opec+
petroleum
refinery
refining
gasoline
diesel
pipeline
shale
fracking
permian
offshore drilling
oil output
oil production
oil reserves
oil supply
energy crisis
straits of hormuz
strait of malacca
Tier 2 — Energy-Adjacent Keywords
energy
power grid
electricity
utility
utilities
nuclear
solar
wind power
renewable
clean energy
carbon
emissions
co2
net zero
climate
coal
mining
metals
copper
aluminum
lithium
commodities
gulf of mexico
north sea
caspian
Tier 3 — Macro & Geopolitical Signals
fed
federal reserve
interest rate
inflation
cpi
pce
recession
gdp
us dollar
dxy
treasury
rate cut
rate hike
iran
russia
ukraine
china
middle east
saudi arabia
venezuela
opec meeting
g7
g20
tariff
sanctions
war
conflict
score = 0.0
if T1_matches ≥ 1: score = min(1.00, 0.70 + T1_matches × 0.15)
elif T2_matches ≥ 1: score = min(0.75, 0.45 + T2_matches × 0.10)
elif T3_matches ≥ 1: score = min(0.40, 0.20 + T3_matches × 0.05)
if score == 0.0: discard market (never stored)
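The formula above can be sketched in Python as follows. This is a minimal sketch, not the production code: the keyword lists are abbreviated subsets of the tiers above, matching is assumed to be case-insensitive substring matching (the source does not say how matches are counted), and `count_matches` and `relevance_score` are illustrative names.

```python
# Illustrative subsets only; the full keyword lists are given above.
T1_KEYWORDS = ["crude oil", "wti", "brent", "opec", "natural gas", "lng"]
T2_KEYWORDS = ["energy", "electricity", "nuclear", "copper", "lithium"]
T3_KEYWORDS = ["federal reserve", "inflation", "iran", "sanctions", "tariff"]


def count_matches(text: str, keywords: list[str]) -> int:
    """Count how many keywords occur in the text (assumed: substring match)."""
    lowered = text.lower()
    return sum(1 for kw in keywords if kw in lowered)


def relevance_score(title: str) -> float:
    """Score a market title 0-1; only the highest matching tier is used."""
    t1 = count_matches(title, T1_KEYWORDS)
    if t1 >= 1:
        return min(1.00, 0.70 + t1 * 0.15)
    t2 = count_matches(title, T2_KEYWORDS)
    if t2 >= 1:
        return min(0.75, 0.45 + t2 * 0.10)
    t3 = count_matches(title, T3_KEYWORDS)
    if t3 >= 1:
        return min(0.40, 0.20 + t3 * 0.05)
    return 0.0  # caller discards the market; it is never stored


if __name__ == "__main__":
    for title in [
        "Will OPEC cut output in June?",
        "Will copper and lithium prices rise?",
        "Will inflation fall below 3%?",
        "Will Taylor Swift release an album?",
    ]:
        print(f"{relevance_score(title):.2f}  {title}")
```

Taken literally, the formula gives a single T1 match a score of about 0.85, a single T2 match about 0.55, and a single T3 match about 0.25, so the lower bound of each tier's range acts as a base that is never returned on its own.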
Known limitations: Scoring is pure keyword matching with no semantic understanding. A market like "Will Trump tweet about oil?" scores 0 despite being energy-relevant, because "oil" on its own is not on any keyword list. Future improvement: semantic embedding scoring (OpenAI embeddings) as an optional second pass.
X / Twitter Handle Relevance Score v1 · Mar 2026
Scores each tracked handle 0–1 based on bio keyword matches, follower reach, and verified status.
Used to sort handles on the Twitter/X Intelligence page and prioritise signal weight.
Bio Keyword Scoring
| Tier | Topic | Per match | Cap |
|---|---|---|---|
| T1 | Energy & commodities | +0.35 | 0.50 |
| T2 | Finance & macro | +0.15 | 0.25 |
| T3 | Geopolitical | +0.05 | 0.10 |
Reach & Credibility Bonuses
| Signal | Bonus |
|---|---|
| Followers ≥ 10,000 | +0.05 |
| Followers ≥ 100,000 | +0.10 |
| Followers ≥ 1,000,000 | +0.15 |
| Verified account (✓) | +0.05 |
Follower bonuses are non-cumulative — only the highest tier applies. Total score is capped at 1.0.
bio_score = min(0.50, T1_matches × 0.35)
          + min(0.25, T2_matches × 0.15)
          + min(0.10, T3_matches × 0.05)
follower_bonus = 0.15 if followers ≥ 1M
                 else 0.10 if followers ≥ 100k
                 else 0.05 if followers ≥ 10k
                 else 0.00
verified_bonus = 0.05 if verified else 0.00
score = min(1.0, bio_score + follower_bonus + verified_bonus)
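A minimal Python sketch of the same calculation, taking the per-tier match counts as inputs. The function name `handle_score` and its parameter names are illustrative assumptions; the caps and bonuses are those in the tables above.

```python
def handle_score(t1_matches: int, t2_matches: int, t3_matches: int,
                 followers: int, verified: bool) -> float:
    """Combine bio keyword matches, follower reach, and verified status."""
    # Each tier contributes per-match points up to its own cap.
    bio_score = (min(0.50, t1_matches * 0.35)
                 + min(0.25, t2_matches * 0.15)
                 + min(0.10, t3_matches * 0.05))

    # Follower bonuses are non-cumulative: only the highest tier applies.
    if followers >= 1_000_000:
        follower_bonus = 0.15
    elif followers >= 100_000:
        follower_bonus = 0.10
    elif followers >= 10_000:
        follower_bonus = 0.05
    else:
        follower_bonus = 0.00

    verified_bonus = 0.05 if verified else 0.00
    return min(1.0, bio_score + follower_bonus + verified_bonus)


if __name__ == "__main__":
    # Two T1 matches, one T2 match, 250k followers, verified.
    print(round(handle_score(2, 1, 0, 250_000, True), 2))
```

For that example the components are 0.50 (T1, capped) + 0.15 (T2) + 0.10 (followers) + 0.05 (verified), about 0.80 in total. The uncapped maximum is 0.85 + 0.15 + 0.05 = 1.05, which is why the final cap at 1.0 is needed.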
Tier 1 Bio Keywords (Energy)
oil
crude
energy
gas
lng
opec
petroleum
refin
commodit
brent
wti
ngl
shale
offshore
pipeline
Substring match — e.g. "refin" matches "refining", "refinery".
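To make the substring behaviour concrete, here is a small sketch of how matches might be counted for the Tier 1 list. It assumes case-insensitive matching and that each keyword counts at most once per bio; neither detail is stated in the source, and `count_bio_matches` is a hypothetical helper name.

```python
T1_BIO_KEYWORDS = [
    "oil", "crude", "energy", "gas", "lng", "opec", "petroleum", "refin",
    "commodit", "brent", "wti", "ngl", "shale", "offshore", "pipeline",
]


def count_bio_matches(bio: str, keywords: list[str]) -> int:
    """Count keywords that appear anywhere in the bio as a substring."""
    lowered = bio.lower()
    return sum(1 for kw in keywords if kw in lowered)


if __name__ == "__main__":
    for bio in [
        "Oil & gas analyst covering refining margins",
        "Las Vegas poker player",
    ]:
        print(count_bio_matches(bio, T1_BIO_KEYWORDS), "-", bio)
```

The second bio illustrates a side effect of substring matching: a short stem such as "gas" also matches inside unrelated words like "Vegas", so short keywords can produce false positives.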
Tier 2 Bio Keywords (Finance / Macro)
macro
market
trade
invest
equity
fund
asset
analyst
research
economics
finance
portfolio
hedge
commodities
mining
metals
copper
lithium
Tier 3 Bio Keywords (Geopolitical)
geopolit
sanctions
iran
russia
saudi
opec
climate
renewable
transition
carbon
esg
risk
inflation
central bank
fed
Score Interpretation
Known limitations: Bio scoring uses substring matching against a manually curated keyword list. Handles with sparse or non-English bios may score lower than their actual relevance warrants. Profile data is refreshed on demand (👤 button) or by a weekly scheduled task, so scores are not live.
MSCI Style Factor Definitions EFMGEMTR · June 2022
Definitions for all 17 style factors tracked in the EGC risk platform, as defined in the MSCI Global Equity Factor Trading Model (EFMGEMTR) Empirical Notes, Section 3.4.
Factors are standardised to a cap-weighted mean of zero and an equal-weighted standard deviation of one.
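The standardisation can be sketched as follows: subtract the cap-weighted mean, then divide by the equal-weighted standard deviation. This is a sketch under stated assumptions, not MSCI's implementation; in particular, taking the population standard deviation around the equal-weighted mean is an assumption, and `standardise` is an illustrative name.

```python
import math


def standardise(raw: list[float], caps: list[float]) -> list[float]:
    """Rescale raw exposures to cap-weighted mean 0 and equal-weighted std 1."""
    n = len(raw)
    # Cap-weighted mean: each stock weighted by its market capitalisation.
    cap_mean = sum(x * c for x, c in zip(raw, caps)) / sum(caps)
    # Equal-weighted (population) standard deviation: an assumed convention.
    ew_mean = sum(raw) / n
    ew_std = math.sqrt(sum((x - ew_mean) ** 2 for x in raw) / n)
    return [(x - cap_mean) / ew_std for x in raw]


if __name__ == "__main__":
    raw = [0.5, 1.5, 3.0, -1.0]         # made-up raw descriptor values
    caps = [100.0, 50.0, 25.0, 25.0]    # made-up market caps
    print([round(v, 3) for v in standardise(raw, caps)])
```

Under this convention a well-diversified cap-weighted portfolio has an exposure of roughly zero to every style factor, which makes it a natural neutral reference point.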
The platform tracks 17 style factors drawn from the EFMGEMTR model. These are grouped below by their Level 1 type as defined in Appendix E of the model handbook.
Each exposure is the portfolio's net weighted-average exposure to that factor — positive values indicate a long tilt, negative values a short tilt.
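As a worked illustration, a portfolio's exposure to one factor is the sum of each position's signed weight times that stock's factor exposure. The tickers, weights, and exposures below are made up, and `portfolio_exposure` is a hypothetical helper, not a platform function.

```python
def portfolio_exposure(weights: dict[str, float],
                       exposures: dict[str, float]) -> float:
    """Net weighted-average exposure: sum of signed weight x stock exposure."""
    return sum(w * exposures[ticker] for ticker, w in weights.items())


if __name__ == "__main__":
    # Signed weights: positive for long positions, negative for shorts.
    weights = {"AAA": 0.60, "BBB": -0.40}
    # Standardised Momentum exposures for each stock (illustrative values).
    momentum = {"AAA": 1.2, "BBB": 0.5}
    print(round(portfolio_exposure(weights, momentum), 2))
```

Here the long position contributes 0.60 × 1.2 = 0.72 and the short contributes -0.40 × 0.5 = -0.20, for a net Momentum exposure of about 0.52, a long tilt.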
Volatility
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Beta | Captures market risk that cannot be explained by the Market factor. Computed by a time-series regression of excess stock returns against the cap-weighted estimation universe. Typically the strongest style factor by volatility. | HBETA (Historical Beta) |
| Residual Volatility | Captures volatility in stock returns not explained by the Beta factor. Orthogonalised to the Beta, Liquidity, and Size factors. | HSIGMA · DSTD · CMRA |
Size
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Size | A strong source of equity return covariance. Captures return differences between large-cap and small-cap stocks. Measured by the log of market capitalisation. | LNCAP (Log Mkt Cap) |
Value
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Value | Captures the extent to which a company is mispriced using Book-to-Price and Sales-to-Price ratios. Book-to-Price is the most important descriptor. | BTOP · STOP |
| Earnings Yield | Captures return differences due to various forms of earnings-to-price ratios, including analyst-predicted E/P, historical E/P, cash E/P, and enterprise multiple (EBIT/EV). | ETOPF · ETOP · CETOP · EBITTOEV |
| Dividend Yield | Captures return differences due to companies' historical and analyst-predicted dividend-to-price ratios. | DTOP · DPIBS |
| LT Reversal | Captures return differences due to the stocks' long-term (four years lagged by 13 months) relative performance. Orthogonalised to the Momentum factor. | LTRSTR · LTHALPHA |
Momentum
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Momentum | Captures return differences due to stocks' recent 12-month performance, excluding the most recent 11 days to avoid short-term reversal contamination. Second strongest style factor by volatility after Beta. | RSTR · HALPHA |
| ST Reversal | Captures how stocks under- or over-performed the market in the recent past, as this effect is expected to reverse in the near future. | STREV |
Quality
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Profitability | Combines profitability measures characterising efficiency of a firm's operations and total activities: asset turnover, gross profitability, gross profit margin, return on assets, and return on equity. | ROA · GP · GPM · ATO |
| Leverage | Captures return differences between high- and low-leverage stocks. Descriptors include market leverage, book leverage, and liabilities-to-assets ratio. | MLEV · BLEV · DTOA |
| Earnings Quality | Captures return differences due to companies' cash-earnings-to-earnings ratio and accrual components of earnings. | ABS · ACF · CETOE |
Liquidity
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Liquidity | Captures return differences due to relative trading activity, measured by the fraction of total shares outstanding traded over monthly, quarterly, and annual trailing windows. | STOM · STOQ · STOA · ATVR |
Growth
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Growth | Captures return differences due to analyst-predicted earnings growth and historical sales and earnings growth. Analyst-predicted earnings growth (mid-term) is the most important descriptor. | EGRMF · EGRLF · EGRO · SGRO |
Sentiment
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Short Interest | Captures the extent to which a stock is sold short. Three descriptors: short utilisation rate (shares short ÷ shares available to borrow), borrow rate charged by prime brokers, and days-to-cover (shares on loan ÷ average daily volume). | SHORTUTIL · BORROWRATE · DAYSCOVER |
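Two of the Short Interest descriptors are simple ratios, sketched below with made-up inputs. These are the raw ratios as defined in the table; how the model transforms or combines them is not covered here, and the function names are illustrative.

```python
def short_utilisation(shares_short: float, shares_available: float) -> float:
    """Shares short divided by shares available to borrow."""
    return shares_short / shares_available


def days_to_cover(shares_on_loan: float, avg_daily_volume: float) -> float:
    """Shares on loan divided by average daily trading volume."""
    return shares_on_loan / avg_daily_volume


if __name__ == "__main__":
    # Made-up figures: 2M shares short, 8M lendable, 500k traded per day.
    print(short_utilisation(2_000_000, 8_000_000))
    print(days_to_cover(2_000_000, 500_000))
```

With these inputs a quarter of the lendable supply is on loan, and covering the position would take four days of average volume.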
Machine Learning
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| ML Factor | Captures non-linearities and interactions in the relationship between factor exposures and stock returns using machine learning to identify those relationships. Uses neural networks with 24-, 48-, and 72-month lookback windows. | ML_GEMTR_NN_24M · _48M · _72M |
Crowding
| Factor | Definition (EFMGEMTR §3.4) | Key Descriptors |
|---|---|---|
| Stock Crowding | Measures crowdedness of a stock based on deviations of its current factor exposures from their historical medians. Six descriptors drawn from Value, Earnings Yield, Short Interest, Liquidity, Momentum, and Residual Volatility factors — the first three are most important. | CROWD_VALUE · CROWD_EARNYILD · CROWD_SHORTINT · CROWD_LIQUIDITY · CROWD_MOMENTUM · CROWD_RESVOL |
Source: MSCI Global Equity Factor Trading Model (EFMGEMTR) Empirical Notes, George Bonne et al., June 2022, Section 3.4 — Style Factors. Definitions reproduced for internal reference only.
More methodology docs coming
Planned: semantic search scoring, portfolio beta methodology, risk limit calibration, synthesis prompt design.