
Confidence calibration

Every panel verdict ships with a confidence number. A confidence of 90 should mean "right roughly 90% of the time". This page checks whether that's true. The dashed diagonal is perfect calibration. Points above the diagonal are underconfident (verdicts win more often than stated); points below it are overconfident (verdicts win less often than stated). Bubble size scales with sample count.

Last 60 days of team panel verdicts. The bot is paper-trading. Forward window is 5 days. Win threshold is plus or minus 2%.
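The page does not spell out how a verdict is scored as a win, so the following is only one plausible reading of "plus or minus 2%", sketched under stated assumptions: a BUY wins if the 5-day return is at least +2%, an AVOID wins if it is at most -2%, and a HOLD wins if the move stays inside the band. The function name `is_win` and the constant are hypothetical, not the bot's actual code.

```python
# Hypothetical win rule; the exact rule used by the panel is NOT stated on the page.
WIN_THRESHOLD_PCT = 2.0  # "plus or minus 2%" from the page


def is_win(verdict: str, ret_pct: float, threshold: float = WIN_THRESHOLD_PCT) -> bool:
    """Classify a matured 5-day return (in percent) as a win or a loss.

    Assumed rule (illustrative only):
      BUY   wins when the return is >= +threshold
      AVOID wins when the return is <= -threshold
      HOLD  wins when the return stays strictly inside the band
    """
    if verdict == "BUY":
        return ret_pct >= threshold
    if verdict == "AVOID":
        return ret_pct <= -threshold
    if verdict == "HOLD":
        return abs(ret_pct) < threshold
    raise ValueError(f"unknown verdict: {verdict!r}")
```

Under this reading, a BUY followed by a +1% move counts as a loss even though the direction was right, which is one way a bucket can show a low win rate alongside a small mean return.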

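[Figure: calibration plot. x-axis: stated confidence (50 to 100). y-axis: actual win rate (0% to 100%). One bubble per verdict and confidence bucket: B/60-69, B/70-79, B/80-84, B/85-89, A/70-79, A/80-84, A/85-89, A/90-100. Legend: BUY, HOLD, AVOID.]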

Per bucket

n is the count of verdicts in the band. res is how many had a matured 5-day return at run time. win% is the share of resolved verdicts that ended up correct. meanRet is the average return signed against the verdict thesis (positive = right side of the move). meanConf is the mean stated confidence. score = win% minus meanConf; negative is overconfident, positive is underconfident.
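The column definitions above can be sketched in a few lines. This is a minimal illustration, not the page's actual code: the `Verdict` record and `bucket_stats` function are hypothetical names, and averaging confidence over resolved verdicts only is an assumption (suggested by the fact that a bucket with no resolved verdicts shows no meanConf).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Verdict:
    """Hypothetical record for one panel verdict inside a confidence band."""
    confidence: float                # stated confidence, 0-100
    signed_return: Optional[float]   # 5-day return in %, signed against the thesis; None if not matured
    won: Optional[bool]              # None if not matured


def bucket_stats(verdicts: Sequence[Verdict]) -> dict:
    """Compute n, res, win%, meanRet, meanConf and score for one bucket."""
    n = len(verdicts)
    resolved = [v for v in verdicts if v.won is not None and v.signed_return is not None]
    res = len(resolved)
    if res == 0:
        # Nothing matured yet: the rate columns are undefined.
        return {"n": n, "res": 0, "win_pct": None, "mean_ret": None,
                "mean_conf": None, "score": None}
    win_pct = 100.0 * sum(1 for v in resolved if v.won) / res
    mean_ret = sum(v.signed_return for v in resolved) / res
    # Assumption: confidence is averaged over resolved verdicts only.
    mean_conf = sum(v.confidence for v in resolved) / res
    return {"n": n, "res": res, "win_pct": win_pct, "mean_ret": mean_ret,
            "mean_conf": mean_conf, "score": win_pct - mean_conf}
```

For example, a bucket of 14 resolved verdicts with 2 wins and a mean stated confidence of 74.9 gives win% = 14.3 and score = 14.3 - 74.9 = -60.6, which reads as strongly overconfident.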

verdict  bucket   n   res  win%   meanRet  meanConf  score  flag
BUY      60-69    2   2    0.0%   -5.84%   66.0      -66.0  low n
BUY      70-79    14  14   14.3%  -2.53%   74.9      -60.6  low n
BUY      80-84    4   4    0.0%   -5.80%   82.0      -82.0  low n
BUY      85-89    2   2    50.0%  -2.47%   86.0      -36.0  low n
BUY      90-100   1   0    -      -        -         -      low n
AVOID    70-79    12  11   54.5%  +3.37%   77.5      -23.0  low n
AVOID    80-84    77  67   52.2%  +2.34%   80.4      -28.2  ok
AVOID    85-89    12  12   41.7%  +2.44%   86.1      -44.4  low n
AVOID    90-100   72  72   70.8%  +4.95%   93.2      -22.4  ok

lookback 60d, maturation 5d, no-price 1, immature 11
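The footer counts appear to reconcile with the table: the gap between n and res, summed over all buckets, matches no-price plus immature. A quick check, assuming that is what the two footer counts mean (the variable names are illustrative):

```python
# (verdict, bucket, n, res) copied from the per-bucket table.
rows = [
    ("BUY", "60-69", 2, 2),
    ("BUY", "70-79", 14, 14),
    ("BUY", "80-84", 4, 4),
    ("BUY", "85-89", 2, 2),
    ("BUY", "90-100", 1, 0),
    ("AVOID", "70-79", 12, 11),
    ("AVOID", "80-84", 77, 67),
    ("AVOID", "85-89", 12, 12),
    ("AVOID", "90-100", 72, 72),
]

total_n = sum(n for _, _, n, _ in rows)        # 196 verdicts in the lookback
total_res = sum(res for _, _, _, res in rows)  # 184 with a matured return
unresolved = total_n - total_res               # 12 without a result yet

# Footer values from the page; reading them as the unresolved breakdown is an assumption.
no_price, immature = 1, 11
print(unresolved == no_price + immature)  # True
```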