
Confidence calibration

Every panel verdict ships with a confidence number. A confidence of 90 should mean "right roughly 90% of the time". This page checks whether that's true. The dashed diagonal is perfect calibration. Points above the diagonal are underconfident (the win rate beats the stated confidence); points below are overconfident (the win rate falls short of it). Bubble size scales with sample count.

Last 60 days of team panel verdicts. The bot is paper-trading. Forward window is 5 days. Win threshold is plus or minus 2%.
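As a rough illustration of how a single verdict might be scored against the 5-day window, here is a minimal Python sketch. It assumes that "plus or minus 2%" means a BUY wins when the 5-day return is at least +2% and an AVOID wins when it is at most -2%. That reading, and the names `signed_return`, `is_win` and `WIN_THRESHOLD`, are assumptions for illustration, not the bot's actual code.

```python
from typing import Optional

# Assumed reading of the page's "plus or minus 2%" win threshold.
WIN_THRESHOLD = 0.02


def signed_return(verdict: str, raw_return: float) -> float:
    """Sign the 5-day return so that positive means the verdict was on the right side."""
    if verdict == "BUY":
        return raw_return
    if verdict == "AVOID":
        return -raw_return
    raise ValueError(f"unknown verdict: {verdict!r}")


def is_win(verdict: str, raw_return: Optional[float]) -> Optional[bool]:
    """Return None when no matured return exists, else whether the signed move cleared the threshold."""
    if raw_return is None:
        return None  # immature or no price: the verdict stays unresolved
    return signed_return(verdict, raw_return) >= WIN_THRESHOLD
```

Under this reading, an AVOID followed by a 5% drop counts as a win, while an AVOID followed by a 1% rise does not.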

Per bucket

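n is the count of verdicts in the band. res is how many of those had a matured 5-day return at run time. win% is the share of resolved verdicts (res, not n) that ended up correct. meanRet is the average 5-day return signed against the verdict thesis (positive = right side of the move). meanConf is the mean stated confidence in the band. score = win% minus meanConf, in percentage points; negative is overconfident, positive is underconfident. flag reads low n where the sample is too small to lean on.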

verdict	bucket	n	res	win%	meanRet	meanConf	score	flag
BUY	60-69	2	2	0.0%	-5.84%	66.0	-66.0	low n
BUY	70-79	14	14	14.3%	-2.53%	74.9	-60.6	low n
BUY	80-84	4	4	0.0%	-5.80%	82.0	-82.0	low n
BUY	85-89	2	2	50.0%	-2.47%	86.0	-36.0	low n
BUY	90-100	1	0	-	-	-	-	low n
AVOID	70-79	12	11	54.5%	+3.37%	77.5	-23.0	low n
AVOID	80-84	77	67	52.2%	+2.34%	80.4	-28.2	ok
AVOID	85-89	12	12	41.7%	+2.44%	86.1	-44.4	low n
AVOID	90-100	72	72	70.8%	+4.95%	93.2	-22.4	ok

lookback 60d, maturation 5d, no-price 1, immature 11
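
The score column can be reproduced from the other columns. The sketch below is a minimal Python version, assuming win% is computed over resolved verdicts and rounded to one decimal. The `Bucket` type and the `wins` counts are hypothetical: the page does not publish win counts, so the ones used here are back-calculated from win% and res.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bucket:
    verdict: str
    band: str
    n: int            # verdicts in the band
    res: int          # verdicts with a matured 5-day return
    wins: int         # resolved verdicts that were correct (back-calculated, not published)
    mean_conf: float  # mean stated confidence, 0-100


def win_pct(b: Bucket) -> Optional[float]:
    """Share of resolved verdicts that were correct, in percent; None if nothing resolved."""
    if b.res == 0:
        return None
    return 100.0 * b.wins / b.res


def score(b: Bucket) -> Optional[float]:
    """win% minus mean stated confidence, in percentage points; negative means overconfident."""
    w = win_pct(b)
    if w is None:
        return None
    return round(w - b.mean_conf, 1)


avoid_90 = Bucket("AVOID", "90-100", n=72, res=72, wins=51, mean_conf=93.2)
avoid_80 = Bucket("AVOID", "80-84", n=77, res=67, wins=35, mean_conf=80.4)
print(score(avoid_90), score(avoid_80))
```

With 51 of 72 and 35 of 67 resolved verdicts correct, this gives -22.4 and -28.2, matching the table rows for AVOID 90-100 and AVOID 80-84.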