Minimum Detectable Effect

Q: Should I change my MDE after I see early test results?

Changing minimum detectable effect after the test has started can invalidate statistical guarantees because sample size and error rates were based on the initial MDE. If early results suggest your assumptions were incorrect, it is better to stop and redesign the experiment rather than adjusting parameters mid-test. Document what you learned and apply it to the next iteration.

Q: Can I set different MDE values for primary and secondary metrics?

Yes. It is common to define a stricter, smaller MDE for a primary business metric like conversion rate and a larger MDE or exploratory approach for secondary metrics such as time on page or email signup rate. Primary metrics receive the full power of your sample size, while secondary metrics are monitored directionally without the same detection guarantees.

Q: How does MDE differ from the minimum effect of interest?

Minimum effect of interest is usually defined from a business perspective as the smallest effect worth acting on. Minimum detectable effect is a statistical design parameter that determines what size of change your test can reliably detect. In practice, teams often align these two values so that the test is powered to detect effects they actually care about. When MDE equals minimum effect of interest, you avoid wasting resources detecting changes too small to matter.

Q: What should I do if my observed effect is smaller than my planned MDE?

If the observed effect is smaller than the planned minimum detectable effect, the study may not have had enough power to reliably detect such a small change. Treat the result as inconclusive rather than firm evidence of no effect. Future tests may need a lower MDE and larger sample size if that smaller change is still important for business goals. The data you collected can inform an estimate for the next iteration.

May 17, 2026

What Is Minimum Detectable Effect? Meaning & Examples

Minimum detectable effect (MDE) is the smallest true change in a metric that an experiment is designed to pick up as statistically significant with a chosen confidence level and power. Think of it as the resolution of your test: just like a microscope can only show details above a certain size, your experiment can only reliably detect improvements above a certain threshold.

MDE is always tied to your baseline conversion rate. For example, if your current signup rate is 5 percent, the MDE represents the minimum improvement over that baseline you care to detect. A 10 percent relative MDE would mean you want the test to flag a change from 5 percent to 5.5 percent as statistically significant.

Understanding minimum detectable effect requires separating it from the observed effect. MDE is defined before the test as a design parameter. The observed effect is what you actually measure after the test completes. These are fundamentally different concepts that serve different purposes in experiment design.

Key properties of MDE include:

It is a critical input for sample size calculations, not an output of the test
It applies to any metric, including conversion rate, revenue per visitor, or average order value
In some contexts, the same idea is called minimum effect, minimum effect of interest, or minimum effect size
For guardrail metrics where you want to detect decreases, it may be framed as a minimum detectable reduction

Figure: Illustration of A/B testing showing two browser windows labeled A and B with bar charts, traffic split through a funnel, comparing revenue outcomes between variations.

Why minimum detectable effect matters

MDE sits at the center of experiment design because it determines how sensitive your test is to meaningful changes in your key metrics. Setting the right MDE is crucial for balancing the cost of acquiring traffic for an A/B test and achieving a meaningful return on investment.

Here is why MDE deserves your attention during test planning:

It shapes the required sample size. MDE directly impacts the sample size needed for a test. Smaller effects are harder to detect and require more data than larger effects. Because required sample size grows roughly with the inverse square of the effect, a 5 percent relative MDE needs about four times the sample size of a 10 percent MDE.
It controls test duration. Because smaller minimum detectable effects demand a larger sample size, they extend how long you need to run the experiment. This ties up development resources and delays decisions.
It connects statistical tests to business goals. Choosing the right MDE aligns experiments with what actually matters financially. Detecting a 0.1 percent lift sounds precise, but if that improvement does not cover implementation costs, the sensitivity is wasted.
It prevents wasted effort. A poorly chosen MDE can make tests either too long and expensive or too weak to detect changes that matter for decision making. Either outcome burns resources without producing informed decisions.
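The sample size scaling mentioned in the first point can be checked numerically. This is a minimal sketch, assuming a two-sided two-proportion z-test with the normal approximation, 80 percent power, 5 percent significance, and the 5 percent signup baseline used earlier. The function name is illustrative, and real calculators may use slightly different approximations.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors per variant (two-sided two-proportion z-test)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)          # rate the test should detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)    # unpooled variance of the difference
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


n_at_10_percent = sample_size_per_variant(0.05, 0.10)
n_at_5_percent = sample_size_per_variant(0.05, 0.05)
ratio = n_at_5_percent / n_at_10_percent

print(n_at_10_percent, n_at_5_percent, round(ratio, 2))
```

Halving the MDE halves the difference in the denominator, which is squared, so the ratio lands close to four. It is not exactly four because the variance term shifts slightly with the target rate.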

Defining the MDE helps ensure the study distinguishes between noise and meaningful results, balancing sensitivity with practical constraints.

How minimum detectable effect works and how to use it

When designing an A/B test, you typically specify four closely related parameters: baseline conversion rate, available traffic or required sample size, significance level, and statistical power. MDE is the fifth variable tied to these inputs, and understanding how it connects to the other four is what makes the difference between a well-designed experiment and one that wastes time or produces inconclusive results.

The core relationships between MDE and test parameters

Factors influencing MDE include sample size, statistical power, significance level, data variance, and study design. Typical sample size calculations assume 80 percent power and a 5 percent significance level (alpha). With those fixed, the achievable MDE depends on the baseline and the amount of traffic.

The relationships work like this. For a fixed significance level and desired power, a smaller MDE implies a larger required sample size. A larger MDE allows a smaller sample size and shorter test duration. Higher power levels demand more data to detect the same effect size. Lower significance levels (stricter p-value thresholds) also increase required sample size.

Statistical power is defined as the probability of correctly rejecting the null hypothesis when it is false, and it is influenced by sample size, effect size, and significance level. Most teams target 80 percent power to cap false negatives at 20 percent.
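To see how much power and significance level move the required sample, it helps to isolate the one factor they control. This sketch assumes the usual normal-approximation formula, in which sample size is proportional to the squared sum of two z-scores. The helper name `z_multiplier` is hypothetical.

```python
from statistics import NormalDist


def z_multiplier(alpha, power):
    """(z_{1-alpha/2} + z_{power})^2, the factor sample size is proportional to."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) ** 2


reference = z_multiplier(alpha=0.05, power=0.80)
for alpha, power in [(0.05, 0.80), (0.05, 0.90), (0.01, 0.80)]:
    factor = z_multiplier(alpha, power)
    print(f"alpha={alpha:.2f} power={power:.2f} "
          f"multiplier={factor:.2f} ({factor / reference:.2f}x the reference)")
```

Under this approximation, moving from 80 to 90 percent power raises the required sample by about a third, and tightening alpha from 5 to 1 percent raises it by about half, with the MDE held constant.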

Standard sample size calculators take your baseline conversion rate, target minimum detectable effect, desired significance level, and desired power to compute the needed number of users per variant. You enter four values and the calculator returns the fifth.

Here's a concrete example. Suppose you are planning a test on a pricing page with a 3 percent baseline conversion rate and you want to target a 10 percent relative lift as the minimum detectable effect. That means detecting an increase from 3 percent to 3.3 percent. With 80 percent power and 5 percent two-sided significance, a calculator based on the standard normal approximation returns approximately 53,000 visitors per variant. At 1,000 daily visitors split evenly, this test would run about 106 days. That duration alone might prompt you to reconsider whether a 10 percent relative MDE is realistic or whether a larger MDE with a shorter test window makes more practical sense.
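The pricing page calculation can be reproduced in a few lines. This is a sketch under stated assumptions: a two-sided two-proportion z-test with the unpooled normal approximation. Calculators that use a one-sided test, pooled variance, or continuity corrections will return somewhat different figures, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per variant to detect a change from p1 to p2."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z ** 2 * variance / (p2 - p1) ** 2)


baseline = 0.03
target = baseline * 1.10            # 10 percent relative lift, about 3.3 percent
n = visitors_per_variant(baseline, target)

daily_visitors = 1_000              # total traffic, split evenly across two variants
days = 2 * n / daily_visitors

print(f"{n:,} visitors per variant, about {days:.0f} days")
```

With these inputs the formula gives roughly 53,000 visitors per variant and a run time of a little over 100 days.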

How to choose the right MDE for your test

Choosing the appropriate MDE requires balancing statistical rigor with business reality. There is no single correct value, but a structured process helps you find what makes sense for each test.

Start from business goals. Ask what minimum change in conversion rate, average order value, or lead quality would justify implementation cost and risk. If a 2 percent lift does not cover development time, detecting it provides no value. The MDE should represent the smallest effect that would actually change your decision about whether to ship.
Consider typical ranges. High-traffic sites often target 2 to 5 percent relative minimum detectable effects. Low-traffic properties may need 10 to 20 percent or higher to complete tests in a reasonable time frame. These ranges aren't arbitrary. They reflect the practical reality of how much traffic is needed to detect effects of different sizes within windows that don't stall your entire testing roadmap.
Match MDE to the test scope. Use a lower MDE for a major redesign that is expensive to ship and carries significant risk, since you need high confidence that even modest improvements are real before committing. Use a higher MDE for a small copy tweak, where the cost of being wrong is low and a fast answer matters more than precision.
Account for constraints. The maximum acceptable test duration and the available daily traffic together determine how low your MDE can realistically be. Running a six-month test rarely makes sense when you could ship three faster experiments instead. The opportunity cost of a single long-running test often exceeds the value of detecting a marginally smaller effect.
Factor in platform differences. Consider the unique characteristics of user behavior across different platforms, as mobile app users may exhibit different patterns compared to desktop users, which can affect A/B test outcomes. Conversion rates often differ significantly between mobile and desktop, which means the same MDE target may require very different sample sizes depending on which platform you're testing.
Document your decisions. Record the chosen MDE, rationale, and assumptions including baseline conversion rate, business goals, and risk tolerance in each experiment brief. This documentation becomes invaluable when reviewing past experiments and calibrating future MDE choices based on what your team has actually observed.

Finding the right balance between sensitivity and feasibility is the core skill in test planning.

How to calculate MDE and required sample size

Calculating MDE involves understanding relative effect size and using that target in standard tools. Here is the process broken down.

Relative effect size expresses MDE as a percentage lift over baseline. The formula is: (target conversion rate minus baseline conversion rate) divided by baseline conversion rate.

Worked example: a landing page has a 12 percent baseline conversion rate. The team wants to detect an increase to 13.2 percent. The absolute lift is 1.2 percentage points, and the relative minimum detectable effect is 1.2 divided by 12, which equals 10 percent.

To translate this into a required sample size, use a sample size calculator or statistical software. Enter your baseline conversion rate (12 percent), target MDE (10 percent relative), significance level (5 percent), and power (80 percent). The calculator returns the required number of visitors per variant, which for these inputs is roughly 12,000 under the standard normal approximation. A smaller minimum detectable effect requires a larger sample size to achieve the same statistical power, because detecting smaller effects takes more data.

To estimate test duration, divide the required sample size per variant by your daily traffic per variant. If you need 15,000 visitors per variant and each variant receives 500 visitors per day, the test runs 30 days.

Iterating between traffic assumptions and MDE helps teams arrive at a feasible combination of minimum detectable effect, duration, and cost. If the initial calculation shows an 80-day test, you might raise the MDE to complete faster, or you might decide the change is important enough to run longer. This iterative process is normal and healthy. The goal isn't to pick the perfect MDE on the first try but to find the combination of sensitivity, duration, and business value that makes the most sense given your current constraints.
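One way to run this iteration is to turn the calculation around: fix the duration you can afford and ask what MDE that traffic supports. The sketch below assumes the 12 percent baseline and 500 visitors per variant per day from the examples above, a two-sided z-test with the normal approximation, and a baseline below 50 percent (so that required sample size shrinks steadily as the target rate grows). The function names are hypothetical.

```python
from statistics import NormalDist


def required_n(p1, p2, alpha=0.05, power=0.80):
    """Approximate visitors per variant to detect a change from p1 to p2."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2


def achievable_relative_mde(baseline, n_per_variant, alpha=0.05, power=0.80):
    """Smallest relative lift detectable with n_per_variant, found by bisection."""
    low, high = baseline, 1.0           # bracket for the target conversion rate
    for _ in range(60):
        mid = (low + high) / 2
        if required_n(baseline, mid, alpha, power) > n_per_variant:
            low = mid                   # effect too small for this sample size
        else:
            high = mid                  # detectable; try a smaller effect
    return high / baseline - 1


baseline = 0.12
daily_per_variant = 500
for days in (14, 30, 60):
    mde = achievable_relative_mde(baseline, daily_per_variant * days)
    print(f"{days} days -> relative MDE of about {mde:.1%}")
```

Longer windows buy smaller MDEs, but with diminishing returns: doubling the duration shrinks the detectable effect by only about 30 percent.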

Figure: Two overlapping normal distribution curves representing null and alternative hypotheses, with shaded regions showing 80% statistical power, 5% false positive rate, and the MDE decision cutoff.

Examples of minimum detectable effect in real tests

Context determines what MDE is appropriate. Here are three scenarios showing how choices differ based on business model and traffic.

Ecommerce product page test

An online retailer tests a new product page layout. The baseline conversion rate is 4 percent. The team targets a 7.5 percent relative minimum detectable effect, meaning they want to detect an increase from 4 percent to 4.3 percent (0.3 percentage points absolute). At 80 percent power and 5 percent significance, this requires approximately 69,000 visitors per variant. The chosen MDE reflects the revenue impact of even small improvements on a high-volume page.

SaaS free trial signup funnel

A software company tests changes to their trial signup flow. The baseline conversion rate is 10 percent. Because even small lifts compound over subscription lifetime value, the team designs a test around a 3 percent relative minimum detectable effect, targeting a change from 10 percent to 10.3 percent. Detecting an effect this small with 80 percent power takes a large sample: roughly 159,000 visitors per variant at 5 percent significance, which corresponds to about 16,000 conversions per variant at the 10 percent baseline.

Low-traffic publisher

A content publisher with limited daily sessions must accept a larger minimum detectable effect to run tests that conclude in reasonable time. With a 1 percent baseline conversion rate on newsletter signups, they set a 15 percent relative MDE to detect a shift from 1 percent to 1.15 percent. This higher MDE keeps test duration under four weeks but risks overlooking subtle improvements that could still matter over time.

Each example shows how the chosen MDE influences required sample size, test duration, and alignment with business impact.
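The three scenarios can be compared side by side with one function. This is a sketch assuming 80 percent power, 5 percent two-sided significance, and the normal approximation for two proportions. The scenario labels and function name are illustrative, and a specific calculator may report somewhat different numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors per variant (two-sided two-proportion z-test)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)


# (baseline conversion rate, relative MDE) taken from the scenarios above
scenarios = {
    "Ecommerce product page": (0.04, 0.075),
    "SaaS free trial signup": (0.10, 0.03),
    "Low-traffic publisher": (0.01, 0.15),
}

results = {name: sample_size_per_variant(b, m) for name, (b, m) in scenarios.items()}
for name, n in results.items():
    print(f"{name}: about {n:,} visitors per variant")
```

Under these assumptions the scenarios come out at roughly 69,000, 159,000, and 74,000 visitors per variant. Note that the publisher's large relative MDE still needs a substantial sample because the 1 percent baseline makes the absolute difference very small.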

Best practices for working with MDE

These guidelines help teams use MDE effectively across their experimentation program. Getting MDE right is what separates teams that run experiments confidently from ones that either waste time on underpowered tests or over-invest in detecting changes too small to matter.

Set MDE during planning, not mid-test

Define your minimum detectable effect before launching traffic into the experiment. Changing it after the test starts invalidates statistical guarantees because sample size calculations and error rates were based on the initial MDE. Mid-test adjustments are one of the most common ways teams unknowingly compromise their results, often because early data looks promising or disappointing and they want to adjust the goalposts. Resist this urge. If you realize your MDE was poorly calibrated, finish the current test, document the learning, and apply a better MDE to the next experiment rather than retrofitting the one already running.

Align statistical and practical significance

Strive for a balance between statistical significance and practical impact in A/B testing, ensuring that results are not only statistically significant but also meaningful to business goals. A statistically significant 0.01 percent lift rarely justifies action. Before launching any test, ask your team: "If we detected exactly this size of effect, would we actually ship the change?" If the answer is no, your MDE is set too low and you're spending resources to detect changes nobody would act on. The best MDE sits right at the threshold where a detected effect would trigger a real business decision.

Use historical data to refine estimates

Past experiments, baseline conversion rate trends, and implementation costs help calibrate typical MDE ranges for your organization. Teams that have run dozens of experiments develop an intuitive sense of what realistic lift looks like in their product. If your last ten tests averaged a 3 to 5 percent lift on winning variants, setting an MDE of 0.5 percent for your next test is probably overkill unless you have massive traffic to support it. Build a reference table of past experiment results so new team members can quickly understand what "normal" looks like and set MDEs accordingly.

Avoid chasing extremely small MDEs

Setting a lower MDE requires a larger sample size to detect minor changes, which can increase the costs associated with running an A/B test significantly. Unless traffic volume and budget clearly support longer durations, focus on effects that are practically significant. A test designed to detect a 0.1 percent lift might need to run for months on moderate-traffic sites, tying up testing infrastructure and delaying other experiments that could deliver faster, more impactful learnings.

Review MDE choices regularly

As traffic patterns and business goals evolve, revisit your standard MDE ranges. What worked a year ago may no longer fit current conditions. A product that doubled its traffic can now detect smaller effects in the same timeframe, which means tighter MDEs become feasible. Conversely, a product that shifted to a higher-value, lower-volume customer base may need to accept larger MDEs to keep test durations reasonable. Build an annual or quarterly review of your MDE standards into your experimentation governance process.

Prefer randomized designs

Randomized controlled trials typically offer better MDEs compared to non-randomized designs due to better controlled variance. When randomization isn't possible (for example, in geographic or time-based tests), expect wider confidence intervals and plan for larger sample sizes to compensate. Non-randomized designs introduce confounding variables that inflate the variance in your results, which directly increases the minimum effect size you can reliably detect. Whenever you have the choice, default to user-level randomization for the tightest possible MDE given your available traffic.

Key metrics to monitor when using MDE

Tracking the right metrics ensures your MDE planning translates into actionable results.

Baseline conversion rate: This is your crucial input for planning minimum detectable effects and estimating sample sizes. Inaccurate baseline estimates throw off all downstream calculations.
Absolute and relative effect size: After the experiment concludes, compare the observed effect against your planned MDE to assess whether the test was properly powered.
Required sample size per variant and total: These metrics determine how long the test must run to detect the planned minimum detectable effect with your chosen power.
Significance level and statistical power: These design parameters interact with MDE and determine the risk of type I errors (false positives) and type II errors (false negatives).
Secondary metrics: Revenue per visitor, average order value, or churn rate may be part of business goals. MDE can also be defined for these outcomes when they are important enough to power the test for.

If the true impact is below the MDE, the likelihood of producing a significant result decreases, but it is not impossible. This is why monitoring actual power and effect sizes after tests complete helps refine future planning.
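How far power drops when the true effect falls short of the MDE can be estimated directly. The sketch below assumes a hypothetical test with a 3 percent baseline and 53,000 visitors per variant, which under the normal approximation is sized for roughly a 10 percent relative MDE at 80 percent power. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_variant, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variant)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    shift = abs(p2 - p1) / se
    # Probability that the test statistic lands beyond either critical value.
    return NormalDist().cdf(shift - z_alpha) + NormalDist().cdf(-shift - z_alpha)


baseline = 0.03
n_per_variant = 53_000              # hypothetical sample, sized for a 10% relative MDE
for relative_lift in (0.10, 0.075, 0.05, 0.025):
    target = baseline * (1 + relative_lift)
    power = power_two_proportions(baseline, target, n_per_variant)
    print(f"true lift {relative_lift:.1%}: power of about {power:.0%}")
```

A true lift of half the MDE is still detected some of the time, but far less often than the 80 percent the test was designed for, which is why a non-significant result in that range is inconclusive rather than evidence of no effect.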

Minimum detectable effect and related concepts

MDE connects to several experimentation concepts that inform how you interpret results.

Statistical power: Power is the probability that the test reaches statistical significance when the true effect equals the minimum detectable effect. Larger true effects are detected more often. With typical power targets such as 80 percent, you accept a 20 percent chance of missing a real effect of that size. Higher power levels require larger sample sizes for the same MDE.
Effect size: MDE is a chosen threshold representing the smallest improvement worth detecting. Effect size is what the test actually observes. A test is powered to detect effects at or above the MDE, but the observed effect may be larger, smaller, or absent entirely.
Confidence intervals and p-values: After the test completes, MDE helps interpret results. If the observed effect exceeds MDE and the confidence interval excludes zero, you have strong evidence the change is real. P-values alone can mislead without power context.
Hypothesis testing: MDE works within the framework of the null hypothesis (no difference between variants) and the alternative hypothesis (a significant difference exists). Proper sample size ensures you can reliably reject the null when the alternative is true.
Broader experiment design: MDE connects with hypothesis formulation, user segments, segmentation strategies, and prioritization frameworks that rank tests by expected impact and feasibility. Tests targeting larger effects often receive higher priority because they require less traffic and produce faster results.
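The confidence interval check described in the list above can be sketched as follows. It assumes a simple normal-approximation (Wald) interval for the difference between two conversion rates. The visitor and conversion counts are hypothetical, invented purely for illustration, and the function name is not from any particular library.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald interval for the absolute difference in conversion rate (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_b - p_a
    return diff, diff - z * se, diff + z * se


# Hypothetical results: 12.0% vs 13.3% conversion on 10,000 visitors each.
diff, low, high = diff_confidence_interval(conv_a=1_200, n_a=10_000,
                                           conv_b=1_330, n_b=10_000)
planned_absolute_mde = 0.012        # 10 percent relative lift on a 12 percent baseline

print(f"observed lift {diff:.4f}, 95% CI [{low:.4f}, {high:.4f}]")
print("interval excludes zero:", low > 0)
print("observed effect at or above planned MDE:", diff >= planned_absolute_mde)
```

Reading the interval alongside the planned MDE shows both whether the effect is distinguishable from zero and how much uncertainty remains about its size.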

Key takeaways

Minimum detectable effect (MDE) is the smallest improvement over a baseline conversion rate that an A/B test is designed to reliably detect with a given level of confidence and power.
Smaller minimum detectable effects require a larger sample size, more traffic, and longer experiments to reach statistical significance.
The right MDE depends on your business goals, risk tolerance, and traffic or budget constraints rather than a universal rule.
Choosing and calculating MDE before launching a test leads to more efficient, decision-focused experimentation that balances sensitivity with practical constraints.
By defining the MDE upfront, you can ensure that your A/B tests are designed to detect changes that are statistically significant and practically meaningful, which is vital for effective decision making.

FAQs about Minimum Detectable Effect

How do I know if my chosen MDE is too small?

An MDE is likely too small if the required sample size is larger than what your traffic can deliver in a reasonable test duration, such as a few weeks. If your calculation shows you need 200,000 visitors per variant but you only receive 2,000 daily, the test would run over six months. At that point, the cost of acquiring the needed traffic outweighs the potential benefit of detecting that small improvement. Adjust your MDE upward until duration becomes acceptable.
