ICT Backtesting Guide: How to Backtest ICT Properly (2026)

ICT traders who have not backtested their setups share a common experience: they understand every concept, they can identify setups in real time, and they still hesitate to pull the trigger when the moment comes. The hesitation is not a psychology problem. It is a knowledge problem — they do not have statistical confidence in what they are doing because they have never actually measured it.

Backtesting does not guarantee future results. What it does is give you the data to answer the questions that hesitation raises: does my bias work? Does the Silver Bullet produce positive expectancy when I follow my full checklist? Does SMT divergence actually improve my win rate or does it just feel like it does? Without backtesting, these are assumptions. With backtesting, they are answers — and entering trades with answers is a fundamentally different psychological experience than entering with assumptions.

Why ICT Must Be Backtested Manually

Most trading strategies can be automated and therefore backtested algorithmically. An RSI crossover, a moving average strategy, a breakout system — these have precise rules that a script can evaluate mechanically. ICT does not. The discretion in ICT entries is not a flaw; it is intentional. But it means that algorithmic backtesting cannot capture it.

Consider a standard 2022 Model entry. At each step — confirming the daily bias, identifying the sweep, calling the MSS, selecting which FVG is the 1st Presented — a human judgment is required. No algorithm can reliably assess whether the sweep's body closed inside the range with institutional conviction. No algorithm can determine whether the daily bias for that specific day was truly bearish based on the higher timeframe context. No algorithm can identify the kill zone timing with all the nuance the framework requires.

Traders who run algorithmic backtests on ICT-like rules consistently find that the results are meaningless — the algorithm either generates too many false signals (when rules are loosened to capture real setups) or too few signals (when rules are tightened to eliminate false ones). The discretion is the strategy. You cannot remove it without changing what you are testing.

The consequence: every ICT backtest must be manual. You sit down with TradingView Bar Replay, set the date back to the start of your backtesting window, and replay the market forward candle by candle, making the same decisions you would make in real time: does this session have a valid bias? Has the sweep occurred? Is the MSS confirmed? Would I enter here? Then you log the outcome and move to the next session.

Setting Up TradingView Bar Replay

TradingView Bar Replay is the standard tool for ICT manual backtesting. It replays historical price action forward one candle at a time, allowing you to practice entries and exits as if trading in real time. Here is the setup:

What you need: TradingView account (Pro or above recommended — free plan has limited Bar Replay access). Two or three chart panels: 15-minute (primary execution), 5-minute (MSS and entry), 1-hour (kill zone context and daily range). NQ or ES as the primary instrument.

Opening Bar Replay: In TradingView, click the clock icon in the top toolbar (or press Shift+R). A date picker appears. Select a date at least 6 months ago — you want enough historical distance that you do not remember specific price action. Avoid using very recent data initially; familiarity with recent events creates hindsight bias. Ideal starting point: 12 months ago.

The replay workflow: Once Bar Replay starts, you are in the past with no knowledge of what comes next. Navigate to the Sunday before your first week to mark the weekly profile. Identify the monthly EQ. Then advance through the week candle by candle, applying the full ICT analysis process: daily bias, kill zone timing, sweep identification, MSS, FVG entry. When you would take a trade in real life, record the entry. Continue advancing until the trade closes. Log the result. Advance to the next session.

Timeframe discipline: In real trading, you do not switch between charts without purpose. In Bar Replay, replicate this exactly. Work primarily on the 15-minute chart. Switch to the 5-minute only when an entry is forming. Use the 1-hour only for bias confirmation at session boundaries. Do not use the daily chart to "check" the direction mid-session — that would be hindsight bias. If the daily information was available to you at the time (because you would have seen it in your morning prep), use it. If not, do not.

The 7-Step Backtesting Process

Step 1: Set the date and prepare the monthly/weekly context

Start Bar Replay on a Sunday. Before advancing any candles from Monday, do your full Sunday preparation: identify the monthly EQ and bias, mark the prior week's high/low and EQ, identify the weekly draw on liquidity. Record: "Week of [date]: Monthly: BEARISH (price at 21,850, EQ 21,760). Weekly: BEARISH, draw on equal lows 21,180." This takes 3–5 minutes per week and replicates the preparation you would do in real trading.

Step 2: Advance through Monday morning prep

Advance to 7:30 AM ET, ahead of the New York kill zone (if you also trade London, do this prep before that session instead). Do your daily bias check: what does the current price action tell you about today's direction? Mark the pre-market range high and low. Note any CBDR or NWOG if visible. Record your daily bias: "Monday: bearish. Pre-market high 21,524 (BSL). Pre-market low 21,456 (SSL). Plan: look for Judas above 21,524 at 9:30 AM." Advance to the kill zone.

Step 3: Watch the kill zone and identify setup or no-trade

Advance through the London or NY open kill zone candle by candle. At each candle: does the sweep occur? Does the body close inside? Does the MSS fire? Is there a clean 1st Presented FVG? Apply your full pre-trade checklist (all 8 questions from the intraday strategy guide). If all 8 pass: record the hypothetical entry. If not: log "No valid setup: [reason]" and advance to the next session. No-trade logs are as important as trade logs because they tell you whether your standards are being applied correctly.

Step 4: Log the entry before advancing

Before advancing another candle after your entry, record: Date, Session (London/NY), Setup type (2022 Model / Venom / Silver Bullet), Bias score (1–3: weak/moderate/strong), Confluence score (1–6), Entry price, Stop price, T1 target, T2 target, Direction (L/S). Do not advance the replay until this is logged. Recording after seeing the outcome creates hindsight contamination: you will unconsciously adjust what you log based on whether the trade won or lost.

Step 5: Advance through the trade and manage it as you would in real life

Continue advancing candles with your stop and T1/T2 targets in mind. Apply the T1/T2 rule exactly: when T1 is reached, record "50% closed at T1" and note the new stop at break-even. When T2 is reached or the stop is hit, record the final result. Do not deviate from your pre-logged T1/T2 levels mid-trade. "Letting it run" past T2 or "cutting early" are real-trading decisions that must be tested separately; do not mix them into a clean backtest session.

Step 6: Log the result and continue to the next session

Record the outcome: T1 hit (R multiple), T2 hit (R multiple), stopped out (−1R), or break-even (0R). Note any observations: "MSS was unclear, should have waited." "Bias was correct but I missed the FVG entry zone." "SMT was present but I didn't check ES." One sentence per trade is enough. These observations accumulate into the most valuable part of your backtest: the pattern of your specific errors.

Step 7: Analyse results after 50 and 100 trades

After every 50 trades, run the analysis: win rate, average win R, average loss R, expectancy. Look for patterns: which setup type performs best? Which kill zone produces higher win rates? Does your bias accuracy improve as the backtest progresses (suggesting your identification skill is developing)? After 100 trades, draw initial conclusions. After 200, conclusions become actionable for live trading decisions.
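The checkpoint analysis in the final step can be run in a few lines once the log is in a structured form. The sketch below is one possible way to do it in Python, assuming each logged trade is a dict with the hypothetical keys `setup`, `session` and `r` (the realised R multiple). The function names and the five sample rows are illustrative only; a real checkpoint would use 50 or more logged trades.

```python
from collections import defaultdict
from statistics import mean


def summarise(r_results):
    """Return win rate, average win, average loss and expectancy (all in R)."""
    if not r_results:
        raise ValueError("no trades to summarise")
    wins = [r for r in r_results if r > 0]
    losses = [-r for r in r_results if r < 0]  # loss sizes as positive numbers
    return {
        "trades": len(r_results),
        "win_rate": len(wins) / len(r_results),
        "avg_win_r": mean(wins) if wins else 0.0,
        "avg_loss_r": mean(losses) if losses else 0.0,
        # Mean R per trade; break-even (0R) trades count in the denominator.
        "expectancy_r": sum(r_results) / len(r_results),
    }


def summarise_by(trades, key):
    """Group trades by a log field (e.g. "setup" or "session") and summarise each group."""
    groups = defaultdict(list)
    for trade in trades:
        groups[trade[key]].append(trade["r"])
    return {name: summarise(rs) for name, rs in groups.items()}


# Illustrative rows only (assumed structure, not a required format).
sample_log = [
    {"setup": "2022M", "session": "NY", "r": 5.2},
    {"setup": "SB", "session": "NY", "r": -1.0},
    {"setup": "2022M", "session": "London", "r": 1.0},
    {"setup": "Venom", "session": "NY", "r": 5.3},
    {"setup": "SB", "session": "NY", "r": 0.0},
]

overall = summarise([t["r"] for t in sample_log])
by_setup = summarise_by(sample_log, "setup")
by_session = summarise_by(sample_log, "session")
print(overall)
print(by_setup)
```

Break-even trades are treated here as neither wins nor losses, which is a design choice; counting them differently changes the win rate but not the expectancy.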

The Backtest Log — What to Record

The log is the output of the backtest. A minimal log that captures only win/loss is nearly useless — it tells you the outcome but not why. A well-structured log tells you which variables produce positive expectancy and which degrade it. Here is the recommended field set:

#	Date	Session	Setup	Bias	Conf.	Entry	Stop	T1	T2	Result	R	Note
1	Jan 8	NY	2022M	3/3	6/6	21,501 S	21,568	21,438	21,180	T2 Hit	+5.2R	SMT confirmed, clean FVG
2	Jan 9	NY	SB	2/3	4/6	21,448 S	21,490	21,380	21,220	Stopped	−1R	Bias was unclear — countertrend
3	Jan 10	London	2022M	3/3	5/6	21,362 S	21,420	21,302	21,140	T1 only	+1.0R	T2 stalled — news at 10 AM
4	Jan 10	NY	Venom	3/3	6/6	21,419 S	21,464	21,348	21,180	T2 Hit	+5.3R	Body close rule ✓ SMT ✓
5	Jan 11	NY	SB	2/3	3/6	21,398 S	21,432	21,342	21,220	BE	0R	T1 hit, stop moved to BE, reversed

Field definitions:

Setup: 2022M (2022 Model), SB (Silver Bullet), Venom, Unicorn.

Bias: score 1–3 (1 = unclear/countertrend, 2 = probable, 3 = strongly confirmed).

Conf.: confluence score 1–6 from the 6-element confluence stack.

Entry: entry price followed by direction (L = long, S = short).

R: the actual R multiple achieved. +5.2R means the trade returned 5.2× the amount risked (based on full T2 close). T1-only results are calculated on the 50% portion closed.
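Because R is derived from the entry, stop and exit prices, it can be recomputed from the log rather than typed by hand. The following is a minimal sketch, assuming a hypothetical `LogEntry` record whose fields follow the columns above; the class and method names are illustrative, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    date: str
    session: str      # "London" or "NY"
    setup: str        # "2022M", "SB", "Venom", "Unicorn"
    bias: int         # 1-3
    confluence: int   # 1-6
    direction: str    # "L" (long) or "S" (short)
    entry: float
    stop: float
    t1: float
    t2: float

    def r_at(self, exit_price):
        """R multiple if the whole position were closed at exit_price."""
        risk = abs(self.entry - self.stop)
        if risk == 0:
            raise ValueError("entry and stop must differ")
        if self.direction == "S":
            move = self.entry - exit_price
        else:
            move = exit_price - self.entry
        return move / risk


# Row 4 of the sample log above.
row4 = LogEntry("Jan 10", "NY", "Venom", 3, 6, "S", 21419, 21464, 21348, 21180)
print(round(row4.r_at(row4.t2), 2))    # full close at T2
print(round(row4.r_at(row4.stop), 2))  # stopped out
```

For row 4 this gives about 5.3R at T2, in line with the logged result. For row 1 the logged prices imply about 4.8R (321 points of profit on 67 points of risk) against a logged +5.2R, which is the kind of gap a recomputation like this is meant to surface.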

Minimum Sample Size — When Results Are Valid

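One of the most destructive mistakes in ICT backtesting is drawing conclusions from too small a sample. Twenty trades with a 75% win rate feels significant. Statistically, it is weak evidence. A fair coin lands heads 15 or more times in 20 flips about 2% of the time, so across the many 20-trade samples traders look at, a streak like that shows up by chance alone, and a system with only a modest true edge produces it noticeably more often. You need enough trades that variance is averaged out and genuine edge, or the lack of it, becomes visible.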
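How often chance alone produces a given win rate in a small sample can be computed directly from the binomial distribution. The sketch below assumes independent trades with a fixed true win rate, which is a simplification; `prob_at_least` is a hypothetical helper name.

```python
from math import comb


def prob_at_least(wins, trades, true_win_rate):
    """Probability of seeing at least `wins` winners in `trades` independent trades."""
    p = true_win_rate
    return sum(
        comb(trades, k) * p**k * (1 - p) ** (trades - k)
        for k in range(wins, trades + 1)
    )


# 75% or better in 20 trades when the true win rate is a coin flip.
p_20 = prob_at_least(15, 20, 0.5)
# The same 75% hurdle over 100 trades.
p_100 = prob_at_least(75, 100, 0.5)
print(f"20 trades:  {p_20:.4f}")
print(f"100 trades: {p_100:.2e}")
```

The first figure is roughly 0.02; the second is orders of magnitude smaller, which is the statistical reason the larger sample is worth the extra hours.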

Minimum thresholds:

50 trades: First checkpoint. Run preliminary analysis. What is your win rate? What is your average R? Can you see any patterns? Do not make any strategy changes at this point — just note what you see and continue.

100 trades: First actionable sample. Conclusions about overall expectancy are now meaningful. If the system is producing positive expectancy, continue to 200. If it is clearly negative (expectancy below −0.3R per trade consistently), investigate why before continuing.

200 trades: Statistically solid sample. Breakdown by setup type, session, and confluence score is now meaningful. You can now make specific adjustments: "Silver Bullet trades with confluence 3/6 or below are consistently negative — removing them from my live trading plan."

Important: at 2 valid setups per session and 4 sessions per week, 200 trades for ICT intraday trading represents roughly 6 months of historical data; at 1 setup per session and 3 sessions per week it is well over a year. This is not a weekend project. Budget at least 15–25 hours of backtesting time spread over several weeks.
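The conversion from a trade target to months of history is simple arithmetic, and it is sensitive to the assumed setup frequency. A rough sketch, assuming about 4.33 weeks per month (an assumption, not a figure from this guide) and a hypothetical helper name `months_of_history`:

```python
def months_of_history(target_trades, setups_per_session, sessions_per_week,
                      weeks_per_month=4.33):
    """Approximate months of replay data needed to log `target_trades` trades."""
    trades_per_week = setups_per_session * sessions_per_week
    if trades_per_week <= 0:
        raise ValueError("setup frequency must be positive")
    return target_trades / trades_per_week / weeks_per_month


busy = months_of_history(200, setups_per_session=2, sessions_per_week=4)
quiet = months_of_history(200, setups_per_session=1, sessions_per_week=3)
print(f"high-frequency case: {busy:.1f} months")
print(f"low-frequency case:  {quiet:.1f} months")
```

Running both ends of the frequency range before starting gives a more honest picture of how far back the Bar Replay window needs to reach.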

Reading Your Results — Expectancy vs Win Rate

Win rate is the number traders obsess over. Expectancy is the number that matters. With 1R average losses in both cases, a system with a 35% win rate and 4R average wins (+0.75R per trade) is more profitable than a system with a 65% win rate and 0.8R average wins (+0.17R per trade). ICT's low-frequency, high-R:R structure naturally produces lower win rates, and this is correct and expected.

Win rate: 42% (normal for ICT)

Avg win (T1 + T2): 3.4R

Avg loss (structural stop): 1.0R

Expectancy per trade: +0.85R

How to calculate expectancy: Expectancy = (Win rate × Average win R) − (Loss rate × Average loss R). Using the stats above: (0.42 × 3.4) − (0.58 × 1.0) = 1.428 − 0.58 = +0.848R per trade, or about +0.85R.

This means: for every trade taken at 1% risk, you expect to earn about 0.85% of account on average. At 50 trades per month, that is roughly 42% per month, which is unrealistic in practice due to variance, but the mathematical expectancy shows the system is profitable.
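The expectancy formula is easy to check by hand, but a small function makes it quick to compare systems. This is a minimal sketch that assumes, as the formula above does, that every trade is either a win or a loss (loss rate = 1 − win rate); `expectancy_r` is a hypothetical name.

```python
def expectancy_r(win_rate, avg_win_r, avg_loss_r):
    """Expected R per trade: (win rate x avg win) - (loss rate x avg loss)."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return win_rate * avg_win_r - (1.0 - win_rate) * avg_loss_r


sample = expectancy_r(0.42, 3.4, 1.0)      # the stats quoted above
low_win_rate = expectancy_r(0.35, 4.0, 1.0)
high_win_rate = expectancy_r(0.60, 0.8, 1.5)

# Simple (non-compounded) monthly figure at 1% risk and 50 trades.
monthly_pct = sample * 1.0 * 50

print(f"{sample:+.3f}R, {low_win_rate:+.2f}R, {high_win_rate:+.2f}R")
print(f"{monthly_pct:.1f}% per month before variance")
```

The 60% win-rate system comes out negative while the 35% system comes out positive, which is the whole argument for ranking systems by expectancy rather than win rate.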

What to look for in your results:

A positive expectancy across 100+ trades confirms the system has edge. A negative expectancy requires investigation — is the bias identification off? Are you entering outside kill zones? Are you trading the 2nd or 3rd Presented FVG as if it were the 1st? The breakdown by confluence score is particularly revealing: most traders find that 5/6 and 6/6 confluence trades are solidly positive, 3/6 and 4/6 trades are near-zero or negative, and below 3/6 trades are consistently negative. This data alone — available after 100–150 backtested trades — tells you exactly which setups to keep and which to eliminate.

Figure: Expectancy by confluence score, sample backtest data (120 trades). S-tier (6/6) and A-tier (5/6) produce positive expectancy; results degrade or turn negative below 4/6.

Sample backtest results: 120 trades categorised by confluence score. Trades with 1/6 and 2/6 confluence are consistently negative — eliminate. 3/6 is break-even — review whether the standards were correctly applied or whether these are genuinely marginal setups. 4/6 through 6/6 all produce positive expectancy, with S-tier (6/6) producing 1.76R per trade average. The practical trading decision: only take 4/6+ setups, scale size with tier (4/6 = 0.5%, 5/6 = 1%, 6/6 = 2%). This data is available after approximately 100–150 backtested trades.
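The tiered sizing rule above (4/6 = 0.5%, 5/6 = 1%, 6/6 = 2%, everything else skipped) can be applied to a log to see what it would have done to account-level results. The sketch below assumes each trade is a dict with hypothetical keys `confluence` and `r`; the sample rows are illustrative and far too few to draw conclusions from.

```python
from collections import defaultdict

# Percent of account risked per confluence tier, as quoted in the text.
RISK_PCT_BY_CONFLUENCE = {4: 0.5, 5: 1.0, 6: 2.0}


def risk_pct(confluence):
    """Percent of account to risk; 0.0 means the setup is skipped."""
    return RISK_PCT_BY_CONFLUENCE.get(confluence, 0.0)


def expectancy_by_confluence(trades):
    """Mean R per trade for each confluence score present in the log."""
    buckets = defaultdict(list)
    for trade in trades:
        buckets[trade["confluence"]].append(trade["r"])
    return {score: sum(rs) / len(rs) for score, rs in sorted(buckets.items())}


def tiered_return_pct(trades):
    """Simple (non-compounded) account return in percent under the sizing rule."""
    return sum(trade["r"] * risk_pct(trade["confluence"]) for trade in trades)


sample_log = [
    {"confluence": 6, "r": 5.2},
    {"confluence": 4, "r": -1.0},
    {"confluence": 5, "r": 1.0},
    {"confluence": 6, "r": 5.3},
    {"confluence": 3, "r": 0.0},   # below 4/6: skipped under the rule
]

by_score = expectancy_by_confluence(sample_log)
total_pct = tiered_return_pct(sample_log)
print(by_score)
print(f"{total_pct:.1f}% of account")
```

Scaling size with tier concentrates risk in the trades the backtest says are strongest, so it is worth confirming each tier's expectancy on a meaningful sample before adopting the sizing.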

What to Backtest — and What Not To

Not all ICT concepts are worth backtesting in isolation. The goal is not to find the "best" individual concept — it is to find the specific combination of concepts that produces positive expectancy when applied together as a system.

Backtest these first:

The Silver Bullet is the best first backtest target because it has the most constrained rules: a specific 1-hour window (10:00–11:00 AM ET), a specific entry type (FVG within the window), and a specific session context. Its constraints reduce ambiguity. Start here, build 50 Silver Bullet trades, and use the results to calibrate your bias identification and FVG entry skills before expanding to the broader 2022 Model.

The Venom Model is the second-best target because the body close rule is binary — it either passes or it does not — removing one of the main sources of backtest discretion. The opening range is precisely defined. This makes Venom backtests produce cleaner data than more discretionary setups.

The full 2022 Model should be backtested after Silver Bullet and Venom, because it requires the most judgment at each decision point. By the time you test the 2022 Model, your pattern recognition has been calibrated by the more constrained models — you will make fewer identification errors.

Do not backtest these in isolation:

Individual PD arrays (FVG, OB, Breaker Block) without the AMD and bias context produce meaningless results. An FVG without a sweep, without kill zone timing, without daily bias — tested in isolation — will produce negative expectancy because the missing context is precisely what gives the FVG its edge. If you test "enter every FVG that forms," you will get different (and worse) results than "enter the 1st Presented FVG after a BSL sweep in the NY kill zone on a bearish day." Test the complete setup, not the component.

The Forward Test Bridge

Backtesting produces confidence in historical data. Forward testing — applying the same process to real-time data, still without real capital — bridges the gap between historical confidence and live trading. The three-stage progression:

Stage 1 — Backtesting (historical, no capital): 100–200 trades in Bar Replay. Goal: confirm positive expectancy and identify which setup types/confluence levels produce it. Duration: 3–6 weeks.

Stage 2 — Forward testing on demo (real-time, no capital): Apply the identical process to live market data, with demo account positions but the same position sizing and risk management rules as your planned live setup. Goal: confirm that your backtested edge survives in real-time conditions (where you cannot replay, cannot rewind, and emotional pressure is higher). Minimum duration: 30 trading days. Minimum trades: 30. If demo results are consistent with backtest results (within expected variance), proceed to Stage 3.

Stage 3 — Live trading at minimum size: Begin with the smallest viable position size — 1 MNQ contract or minimum lot size. Goal: acclimate to the psychological reality of real capital at risk. The edge is confirmed. The process is confirmed. The only new variable is emotional response to real P&L. Trade minimum size for 30 days before scaling to target size.
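"Within expected variance" in Stage 2 can be made concrete with a rough standard-error check. The sketch below is one possible approach, not a prescribed test: it assumes trades are independent, uses the backtest's spread of R results to estimate how far a demo sample of that size could drift by chance, and treats a gap of more than two standard errors as a warning sign. The threshold of 2 and the name `demo_consistent` are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def demo_consistent(backtest_r, demo_r, z=2.0):
    """Return (is_consistent, gap, standard_error) for demo vs backtest mean R."""
    if len(backtest_r) < 2 or not demo_r:
        raise ValueError("need at least 2 backtest trades and 1 demo trade")
    # Standard error of a demo-sized sample, using the backtest's R dispersion.
    se = stdev(backtest_r) / sqrt(len(demo_r))
    gap = mean(demo_r) - mean(backtest_r)
    return abs(gap) <= z * se, gap, se


# Hypothetical data: a 100-trade backtest and two 30-trade demo samples.
backtest = [3.0, -1.0] * 50
demo_similar = [3.0, -1.0] * 15
demo_all_losses = [-1.0] * 30

ok, gap, se = demo_consistent(backtest, demo_similar)
bad, bad_gap, _ = demo_consistent(backtest, demo_all_losses)
print(ok, round(gap, 2), round(se, 2))
print(bad, round(bad_gap, 2))
```

With only 30 demo trades the tolerance band is wide, so a pass here says the demo results do not contradict the backtest rather than that they confirm it.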

The most common progression error is skipping Stage 2. Traders backtest, see positive results, and immediately open a funded account. The psychological experience of live trading is fundamentally different from historical replay — the uncertainty is real, the losses feel real, and the hesitation returns even when the process is known. Stage 2's demo forward test at real-time speed is the bridge that makes Stage 3 a continuation rather than a new experience.

Figure: Backtesting progression, expectancy stabilises after 100 trades. Running expectancy per trade across 150 backtested positions: high variance early, stabilising at the true edge.

Running expectancy across 150 backtested trades. Early trades (0–50) show high variance, and the running average swings dramatically. A 20-trade sample that looks like +1.5R per trade or −0.5R per trade is unreliable. After 50 trades, the curve begins stabilising. After 100 trades, the true long-run expectancy becomes visible (in this example, approximately +0.8R per trade). Only at 100+ trades can meaningful conclusions be drawn. The shape of this curve, volatile early and stable late, is typical of any backtest regardless of system quality.
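A running-expectancy curve like the one described can be produced from any list of R results. The sketch below computes the cumulative mean after each trade and, purely for illustration, simulates 150 trades with an assumed 42% win rate, 3.4R wins and 1R losses; the simulated numbers are hypothetical and the seed is arbitrary.

```python
import random
from itertools import accumulate


def running_expectancy(r_results):
    """Mean R per trade after each successive trade in the log."""
    return [total / n for n, total in enumerate(accumulate(r_results), start=1)]


# Hypothetical simulated backtest (assumed parameters, not real results).
rng = random.Random(7)
simulated = [3.4 if rng.random() < 0.42 else -1.0 for _ in range(150)]
curve = running_expectancy(simulated)

for checkpoint in (20, 50, 100, 150):
    print(f"after {checkpoint:3d} trades: {curve[checkpoint - 1]:+.2f}R")
```

Changing the seed changes the early part of the curve far more than the late part, which mirrors the volatile-then-stable shape described above.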

Common ICT Backtesting Mistakes

Hindsight bias — knowing the outcome before logging the entry. The most corrosive backtesting error. When you see a session that clearly delivered 200 points lower, you subconsciously identify the entry more easily — the bias looks obvious, the MSS looks clean, the FVG looks textbook. In real trading, the bias is never obvious. Combat this by setting Bar Replay to a date far enough back that you genuinely do not remember the price action, and always logging your entry decision before advancing another candle. If you see yourself logging entries that "look great" on charts you can see the outcome of, restart the session from a date you are less familiar with.

Cherry-picking setups mid-session. Replaying a session and deciding at the end which setups you "would have taken" produces a best-case backtest that does not reflect real trading. In real trading, you do not know which setup will work and which will not. You must log every setup you would have entered given the information available at the time. If you would have entered five times during the session in real trading, log five times in the backtest. If you only log the two that worked, you are measuring your ability to identify winning trades in hindsight, not your actual trading edge.

Testing a single setup type and concluding the system works. Thirty Silver Bullet trades with an 80% win rate do not support a conclusion about the ICT framework. They are a valid preliminary observation about Silver Bullet setups in that 30-trade sample. Expand the sample to 100 Silver Bullet trades before concluding anything. Then expand to 100 Venom trades. Then 100 2022 Model trades. The complete system's edge only becomes visible when all components have been tested with sufficient sample sizes.

Treating backtesting as a one-time exercise. A backtest completed 6 months ago is informative but not current. Markets change. The ICT community's understanding of the framework deepens over time. Your own skill level changes. Retest the same setups every 6–12 months, comparing your new results to your old ones. Consistent improvement in expectancy and reduction in the number of identification errors (logged in your notes) is evidence that your pattern recognition is developing as it should.

Frequently Asked Questions

Can you automate ICT backtesting?

No. ICT entries require discretionary judgment at multiple decision points — identifying whether a sweep's body closed inside the range with conviction, assessing daily bias, calling the MSS on the trading timeframe. These cannot be reliably codified into an algorithm. Any automated backtest of ICT-like rules will either generate too many false signals (loose rules) or miss too many real setups (tight rules). ICT backtesting must be manual, using TradingView Bar Replay or a similar candle-by-candle replay tool.

How many trades do you need to backtest ICT?

Minimum 100 trades before drawing any conclusions. Below 100, results are dominated by variance rather than edge. Aim for 200 trades across at least 6 months of historical data for actionable setup-specific conclusions (e.g. eliminating low-confluence trades from your live trading plan). For a complete ICT intraday strategy with multiple setup types, 200–300 total trades produce statistically meaningful results. The most common backtesting error is drawing conclusions from 20–30 trades.

What is more important — win rate or expectancy?

Expectancy. Expectancy = (Win rate × Average win R) − (Loss rate × Average loss R). A 35% win rate with 4R average wins and 1R average losses produces (0.35 × 4) − (0.65 × 1) = +0.75R expectancy, which is profitable. A 60% win rate with 0.8R average wins and 1.5R average losses produces (0.60 × 0.8) − (0.40 × 1.5) = −0.12R expectancy, which is losing. ICT's low-frequency, high-R:R structure naturally produces win rates of 35–50%, which is normal and expected. Do not target a higher win rate by lowering your R:R standards. Target higher expectancy by trading only high-confluence setups at the right R:R.

What tool should I use to backtest ICT?

TradingView Bar Replay is the standard. It allows candle-by-candle replay of any symbol and timeframe from any historical date. TradingView Pro or above provides full historical access without limitations (the free plan has restricted Bar Replay). Set up three panels: 15-minute (primary execution), 5-minute (MSS and entry), 1-hour (context and daily range). Replay on the 15-minute chart and switch to 5-minute only when an entry is forming — this replicates the real trading workflow exactly.

How long does ICT backtesting take?

Budget 2–4 hours per month of historical data reviewed. For 100 trades covering 6 months of history, plan 15–25 hours total, spread across multiple sessions. Most traders find 90-minute sessions covering 2–3 weeks of history each most sustainable. The first month of historical data takes the longest (the process is new); subsequent months are 30–40% faster as the workflow becomes automatic. Do not rush — the quality of the log is more important than the speed.

ICT backtesting in five rules

1. Manual only. No algorithm can replicate the discretion. Use TradingView Bar Replay on a date far enough back that you do not remember the price action.

2. Log before advancing: entry price, stop, T1, T2, setup type, bias score, confluence score. Never log after seeing the outcome.

3. Minimum 100 trades before conclusions. 200 for setup-specific analysis.

4. Expectancy, not win rate, is the measure. A 35–50% win rate is normal for ICT.

5. Backtest → demo forward test (30 days) → live at minimum size. Do not skip stage 2.