Methodology
Scoring & Signal Models
Scoring Methodology
How relevance scores and signal rankings are computed across the platform.
📊
Polymarket Relevance Score v1 · Mar 2026
Two-stage keyword filter that scores prediction markets 0–1 for energy & macro relevance.
Markets scoring 0.0 are discarded entirely and never stored.
Stage 1 — Hard Exclusion

Before any scoring, markets are dropped if they match a hard exclusion pattern. This eliminates noise from categories that are never relevant to an energy fund.

NFL / NBA / NHL / MLB · Oscar / Emmy / Grammy · Bitcoin / ETH / crypto · Celebrity / reality TV · Video games / esports · Meme coins · Sports championships
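The exclusion step could be sketched as one case-insensitive regex check over the market title. The term list below is an illustrative subset inferred from the categories above, and `EXCLUSION_TERMS`, `EXCLUSION_RE`, and `is_hard_excluded` are hypothetical names. Matching on the title only, using word boundaries, and tolerating a trailing "s" are all assumptions, not documented behaviour.

```python
import re

# Illustrative subset of exclusion terms (assumed, not the production list).
EXCLUSION_TERMS = [
    "nfl", "nba", "nhl", "mlb",
    "oscar", "emmy", "grammy",
    "bitcoin", "eth", "crypto",
    "reality tv", "esports", "meme coin",
]

# Word boundaries keep short terms such as "eth" from matching inside
# "methane"; the optional "s" lets "oscar" also match "Oscars".
EXCLUSION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in EXCLUSION_TERMS) + r")s?\b",
    re.IGNORECASE,
)


def is_hard_excluded(title: str) -> bool:
    """Return True if the market title matches any exclusion term."""
    return EXCLUSION_RE.search(title) is not None
```

Markets for which `is_hard_excluded` returns True would be dropped before Stage 2 runs.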
Stage 2 — Tiered Keyword Scoring
Tier | Topic | Score range
T1 | Core energy & commodities | 0.70 – 1.00
T2 | Energy-adjacent & broader commodities | 0.45 – 0.75
T3 | Macro & geopolitical signals | 0.20 – 0.40
Scoring is additive within each tier up to its cap. Only the highest matching tier is used — a T1 match short-circuits T2/T3 evaluation.

Tier 1 — Core Energy Keywords
crude oil wti brent oil price natural gas lng lpg opec opec+ petroleum refinery refining gasoline diesel pipeline shale fracking permian offshore drilling oil output oil production oil reserves oil supply energy crisis straits of hormuz strait of malacca
Tier 2 — Energy-Adjacent Keywords
energy power grid electricity utility utilities nuclear solar wind power renewable clean energy carbon emissions co2 net zero climate coal mining metals copper aluminum lithium commodities gulf of mexico north sea caspian
Tier 3 — Macro & Geopolitical Signals
fed federal reserve interest rate inflation cpi pce recession gdp us dollar dxy treasury rate cut rate hike iran russia ukraine china middle east saudi arabia venezuela opec meeting g7 g20 tariff sanctions war conflict
score = 0.0
if T1_matches ≥ 1:
    score = min(1.00, 0.70 + T1_matches × 0.15)
elif T2_matches ≥ 1:
    score = min(0.75, 0.45 + T2_matches × 0.10)
elif T3_matches ≥ 1:
    score = min(0.40, 0.20 + T3_matches × 0.05)
if score == 0.0:
    discard market (never stored)
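A minimal runnable sketch of the tiered formula, taking the per-tier match counts as inputs. The function name `relevance_score`, the use of `None` to signal a discarded market, and rounding to two decimals are assumptions made for illustration. Under the formula as written, the first match already adds one increment, so the lowest attainable scores are 0.85 (T1), 0.55 (T2), and 0.25 (T3), not the range floors.

```python
from typing import Optional


def relevance_score(t1_matches: int, t2_matches: int, t3_matches: int) -> Optional[float]:
    """Return the 0-1 relevance score, or None when the market is discarded."""
    if t1_matches >= 1:
        # A T1 match short-circuits T2/T3 evaluation.
        score = min(1.00, 0.70 + t1_matches * 0.15)
    elif t2_matches >= 1:
        score = min(0.75, 0.45 + t2_matches * 0.10)
    elif t3_matches >= 1:
        score = min(0.40, 0.20 + t3_matches * 0.05)
    else:
        return None  # score would be 0.0: discard, never stored
    # Rounding hides float noise such as 0.7 + 0.3 == 0.9999999999999999.
    return round(score, 2)


print(relevance_score(1, 0, 0))  # one T1 keyword
print(relevance_score(0, 3, 2))  # T2 wins over T3 and hits its cap
```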
Known limitations: Pure keyword matching — no semantic understanding. A market like "Will Trump tweet about oil?" might score 0 despite being energy-relevant. Future improvement: semantic embedding scoring (OpenAI embeddings) as an optional second pass.
🐦
X / Twitter Handle Relevance Score v1 · Mar 2026
Scores each tracked handle 0–1 based on bio keyword matches, follower reach, and verified status.
Used to sort handles on the Twitter/X Intelligence page and prioritise signal weight.
Bio Keyword Scoring
Tier | Topic | Per match | Cap
T1 | Energy & commodities | +0.35 | 0.50
T2 | Finance & macro | +0.15 | 0.25
T3 | Geopolitical | +0.05 | 0.10
Reach & Credibility Bonuses
Signal | Bonus
Followers ≥ 10,000 | +0.05
Followers ≥ 100,000 | +0.10
Followers ≥ 1,000,000 | +0.15
Verified account (✓) | +0.05
Follower bonuses are non-cumulative — only the highest tier applies. Total score is capped at 1.0.
bio_score = min(0.50, T1_matches × 0.35)
          + min(0.25, T2_matches × 0.15)
          + min(0.10, T3_matches × 0.05)
follower_bonus = 0.15 if followers ≥ 1M
                 else 0.10 if followers ≥ 100k
                 else 0.05 if followers ≥ 10k
                 else 0.00
verified_bonus = 0.05 if verified else 0.00
score = min(1.0, bio_score + follower_bonus + verified_bonus)
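The same formula as a runnable sketch. The function name `handle_score`, its parameter names, and the final rounding to two decimals are assumptions for illustration; the constants come from the tables above.

```python
def handle_score(t1: int, t2: int, t3: int, followers: int, verified: bool) -> float:
    """Combine capped bio keyword scores with reach and credibility bonuses."""
    bio_score = (
        min(0.50, t1 * 0.35)
        + min(0.25, t2 * 0.15)
        + min(0.10, t3 * 0.05)
    )
    # Follower bonuses are non-cumulative: only the highest tier applies.
    if followers >= 1_000_000:
        follower_bonus = 0.15
    elif followers >= 100_000:
        follower_bonus = 0.10
    elif followers >= 10_000:
        follower_bonus = 0.05
    else:
        follower_bonus = 0.0
    verified_bonus = 0.05 if verified else 0.0
    # The uncapped maximum is 1.05, so the 1.0 cap can bind.
    return round(min(1.0, bio_score + follower_bonus + verified_bonus), 2)
```

For example, a verified handle with one T1 match, one T2 match, and 250,000 followers scores 0.35 + 0.15 + 0.10 + 0.05 = 0.65.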

Tier 1 Bio Keywords (Energy)
oil crude energy gas lng opec petroleum refin commodit brent wti ngl shale offshore pipeline

Substring match — e.g. "refin" matches "refining", "refinery".
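A sketch of how substring matching against the Tier 1 list might count matches. It assumes the bio is lower-cased first and that each keyword counts at most once however often it appears; neither detail is stated above. `count_substring_matches` is a hypothetical name.

```python
from typing import Sequence

# Tier 1 bio keywords as listed above.
T1_BIO_KEYWORDS = [
    "oil", "crude", "energy", "gas", "lng", "opec", "petroleum", "refin",
    "commodit", "brent", "wti", "ngl", "shale", "offshore", "pipeline",
]


def count_substring_matches(bio: str, keywords: Sequence[str]) -> int:
    """Count distinct keywords that occur anywhere in the lower-cased bio."""
    text = bio.lower()
    return sum(1 for kw in keywords if kw in text)


# "refin", "crude", and "lng" each match once.
print(count_substring_matches("Refinery analyst covering crude and LNG", T1_BIO_KEYWORDS))

# Plain substring matching also fires inside unrelated words:
# "gas" is found in "vegas".
print(count_substring_matches("Las Vegas foodie", T1_BIO_KEYWORDS))
```

The second call shows the trade-off of substring matching: stems like "refin" catch inflected forms, but short keywords can produce false positives.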

Tier 2 Bio Keywords (Finance / Macro)
macro market trade invest equity fund asset analyst research economics finance portfolio hedge commodities mining metals copper lithium
Tier 3 Bio Keywords (Geopolitical)
geopolit sanctions iran russia saudi opec climate renewable transition carbon esg risk inflation central bank fed

Score Interpretation
Band | Score | Displayed as
High relevance | ≥ 0.60 | ≥ 60%
Medium | 0.30 – 0.59 | 30–59%
Low | < 0.30 | < 30%
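The bands above could be applied with a small helper like the following. The names `relevance_band` and `as_percent` are hypothetical, and the percentage formatting is an assumption about how the page renders scores.

```python
def relevance_band(score: float) -> str:
    """Map a 0-1 handle score to its interpretation band."""
    if score >= 0.60:
        return "High"
    if score >= 0.30:
        return "Medium"
    return "Low"


def as_percent(score: float) -> str:
    """Format a 0-1 score as a whole-number percentage."""
    return f"{score:.0%}"
```

Because every component of the handle score is a multiple of 0.05, no score can fall between 0.59 and 0.60.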
Known limitations: Bio scoring uses substring matching against a manually curated keyword list. Handles with sparse or non-English bios may score lower than their actual relevance warrants. Profile data is refreshed on demand (👤 button) or by a weekly scheduled task, so scores are not live.
🧮
MSCI Style Factor Definitions EFMGEMTR · June 2022
Definitions for all 17 style factors tracked in the EGC risk platform, as defined in the MSCI Global Equity Factor Trading Model (EFMGEMTR) Empirical Notes, Section 3.4. Factors are standardised to a cap-weighted mean of zero and an equal-weighted standard deviation of one.
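The standardisation convention can be illustrated with a short sketch: subtract the cap-weighted mean, then divide by the equal-weighted standard deviation. This assumes a population standard deviation taken around the equal-weighted mean and omits any outlier treatment; MSCI's exact procedure is not reproduced here, and `standardise` is a hypothetical name.

```python
import math
from typing import List, Sequence


def standardise(raw: Sequence[float], caps: Sequence[float]) -> List[float]:
    """Rescale raw exposures to cap-weighted mean 0 and equal-weighted std 1."""
    cap_mean = sum(x * w for x, w in zip(raw, caps)) / sum(caps)
    eq_mean = sum(raw) / len(raw)
    # Population standard deviation with every stock weighted equally.
    eq_std = math.sqrt(sum((x - eq_mean) ** 2 for x in raw) / len(raw))
    return [(x - cap_mean) / eq_std for x in raw]


# Illustrative inputs: three stocks with raw exposures and market caps.
z = standardise([1.0, 2.0, 4.0], [100.0, 50.0, 10.0])
```

Centring on the cap-weighted mean means a cap-weighted portfolio of the whole universe has zero exposure to every style factor.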
The factors are grouped below by their Level 1 type, as defined in Appendix E of the model handbook. Each exposure is the portfolio's net weighted-average exposure to that factor: positive values indicate a long tilt, negative values a short tilt.
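As a rough sketch, a net weighted-average exposure can be computed as the sum of signed position weights times each stock's factor exposure. This assumes weights are signed fractions of NAV (negative for shorts) and are not renormalised; the platform's actual weighting convention is not documented here, and `portfolio_exposure` is a hypothetical name.

```python
from typing import Sequence


def portfolio_exposure(weights: Sequence[float], exposures: Sequence[float]) -> float:
    """Net factor exposure: signed position weights times stock-level exposures."""
    return sum(w * x for w, x in zip(weights, exposures))


# Illustrative book: two longs and one short, with each stock's Beta exposure.
weights = [0.6, 0.4, -0.5]
beta_exposures = [1.2, 0.8, 1.5]
net_beta = portfolio_exposure(weights, beta_exposures)  # 0.72 + 0.32 - 0.75
```

Here the result is positive (about 0.29), which would read as a modest long tilt to Beta.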
Volatility
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Beta | Captures market risk that cannot be explained by the Market factor. Computed by a time-series regression of excess stock returns against the cap-weighted estimation universe. Typically the strongest style factor by volatility. | HBETA (Historical Beta)
Residual Volatility | Captures volatility in stock returns not explained by the Beta factor. Orthogonalised to the Beta, Liquidity, and Size factors. | HSIGMA · DSTD · CMRA
Size
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Size | A strong source of equity return covariance. Captures return differences between large-cap and small-cap stocks. Measured by the log of market capitalisation. | LNCAP (Log Mkt Cap)
Value
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Value | Captures the extent to which a company is mispriced using Book-to-Price and Sales-to-Price ratios. Book-to-Price is the most important descriptor. | BTOP · STOP
Earnings Yield | Captures return differences due to various forms of earnings-to-price ratios, including analyst-predicted E/P, historical E/P, cash E/P, and enterprise multiple (EBIT/EV). | ETOPF · ETOP · CETOP · EBITTOEV
Dividend Yield | Captures return differences due to companies' historical and analyst-predicted dividend-to-price ratios. | DTOP · DPIBS
LT Reversal | Captures return differences due to the stocks' long-term (four years lagged by 13 months) relative performance. Orthogonalised to the Momentum factor. | LTRSTR · LTHALPHA
Momentum
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Momentum | Captures return differences due to stocks' recent 12-month performance, excluding the most recent 11 days to avoid short-term reversal contamination. Second strongest style factor by volatility after Beta. | RSTR · HALPHA
ST Reversal | Captures how stocks under- or over-performed the market in the recent past, as this effect is expected to reverse in the near future. | STREV
Quality
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Profitability | Combines profitability measures characterising efficiency of a firm's operations and total activities: asset turnover, gross profitability, gross profit margin, return on assets, and return on equity. | ROA · GP · GPM · ATO
Leverage | Captures return differences between high- and low-leverage stocks. Descriptors include market leverage, book leverage, and liabilities-to-assets ratio. | MLEV · BLEV · DTOA
Earnings Quality | Captures return differences due to companies' cash-earnings-to-earnings ratio and accrual components of earnings. | ABS · ACF · CETEO
Liquidity
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Liquidity | Captures return differences due to relative trading activity, measured by the fraction of total shares outstanding traded over monthly, quarterly, and annual trailing windows. | STOM · STOQ · STOA · ATVR
Growth
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Growth | Captures return differences due to analyst-predicted earnings growth and historical sales and earnings growth. Analyst-predicted earnings growth (mid-term) is the most important descriptor. | EGRMF · EGRLF · EGRO · SGRO
Sentiment
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Short Interest | Captures the extent to which a stock is sold short. Three descriptors: short utilisation rate (shares short ÷ shares available to borrow), borrow rate charged by prime brokers, and days-to-cover (shares on loan ÷ average daily volume). | SHORTUTIL · BORROWRATE · DAYSCOVER
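The two ratio descriptors in the Short Interest definition reduce to simple divisions, sketched below. The function names and sample figures are illustrative only; how MSCI sources and smooths the inputs is not covered here.

```python
def short_utilisation(shares_short: float, shares_available: float) -> float:
    """Short utilisation rate: shares short / shares available to borrow."""
    return shares_short / shares_available


def days_to_cover(shares_on_loan: float, avg_daily_volume: float) -> float:
    """Days-to-cover: shares on loan / average daily volume."""
    return shares_on_loan / avg_daily_volume


# Illustrative figures: 2m of 8m lendable shares are short,
# and 3m shares are on loan against 1.5m average daily volume.
utilisation = short_utilisation(2_000_000, 8_000_000)
cover_days = days_to_cover(3_000_000, 1_500_000)
```

With these inputs, utilisation is 25% and the short position would take two average trading days to cover.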
Machine Learning
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
ML Factor | Captures non-linearities and interactions in the relationship between factor exposures and stock returns using machine learning to identify those relationships. Uses neural networks with 24-, 48-, and 72-month lookback windows. | ML_GEMTR_NN_24M · _48M · _72M
Crowding
Factor | Definition (EFMGEMTR §3.4) | Key Descriptors
Stock Crowding | Measures crowdedness of a stock based on deviations of its current factor exposures from their historical medians. Six descriptors drawn from Value, Earnings Yield, Short Interest, Liquidity, Momentum, and Residual Volatility factors; the first three are most important. | CROWD_VALUE · CROWD_EARNYILD · CROWD_SHORTINT · CROWD_LIQUIDITY · CROWD_MOMENTUM · CROWD_RESVOL
Source: MSCI Global Equity Factor Trading Model (EFMGEMTR) Empirical Notes, George Bonne et al., June 2022, Section 3.4 — Style Factors. Definitions reproduced for internal reference only.
More methodology docs coming
Planned: semantic search scoring, portfolio beta methodology, risk limit calibration, synthesis prompt design.