Weibull Analysis: Practical Guide for Data-Driven Decisions

Your hydraulic pumps are failing at 850 operating hours instead of the expected 1000. Is this random variation you can ignore, or systematic degradation signaling a $2M warranty exposure requiring immediate supplier escalation? Weibull analysis answers this question in 30 minutes with data you already have. While other reliability methods require complete failure data or make restrictive assumptions, Weibull handles partial failures, censored observations, and changing failure rates—delivering actionable maintenance schedules, warranty cost predictions, and design improvement priorities from the same analysis.

The Decision Framework: When Weibull Analysis Answers Your Question

Before diving into mathematics or software, determine whether Weibull analysis is the right tool. This decision framework maps business questions to analytical approaches.

Use Weibull Analysis When You Need To:

Predict warranty costs before they happen. Your product launches in 6 months. Finance needs warranty reserve estimates. Weibull analysis on prototype testing—even with only 20% of units failed—projects failure rates at 1-year, 2-year, and 3-year horizons. This beats waiting for real-world failures before budgeting reserves.

Optimize maintenance schedules with data, not guesses. Current preventive maintenance at 1000 hours wastes money if failures cluster at 2000 hours, or misses problems if they occur at 600 hours. Weibull reveals the actual failure distribution, letting you schedule maintenance when it prevents failures rather than when calendars say so.

Identify whether failures are design problems or manufacturing defects. Early failures (shape parameter < 1) indicate quality control issues—parts that should never have shipped. Late failures (shape parameter > 1) indicate wear-out and potential design limitations. This distinction determines whether you fix production processes or redesign components.

Compare reliability across suppliers, designs, or operating conditions. Which bearing manufacturer delivers better longevity? Does operating temperature affect failure rates? Weibull parameters provide objective comparisons—characteristic life tells you median performance, while shape parameter reveals whether reliability improves or degrades over time.

Handle incomplete data professionally. You started tracking 200 units 6 months ago. Only 25 have failed. Traditional failure analysis discards the 175 survivors, wasting 87.5% of your information. Weibull incorporates censored data (units that haven't failed yet), extracting insights from the full dataset.

Key Insight: The Shape Parameter Drives Decisions

Weibull analysis produces two critical numbers: characteristic life (η) tells you when failures happen, and shape parameter (β) tells you why. β < 1 means fix manufacturing quality. β ≈ 1 means failures are random (hard to prevent). β > 1 means implement preventive maintenance before wear-out. This single parameter determines whether you invest in design changes, production process improvements, or maintenance programs.
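As a trivial sketch, this decision rule can be encoded directly. The ±0.2 tolerance band around β = 1 is an illustrative assumption, not a standard; in practice, base the call on the confidence interval for β.

```python
def beta_recommendation(beta, tol=0.2):
    """Map a Weibull shape parameter to a broad action category.

    The tolerance band around beta = 1 is an illustrative choice; in
    practice use the confidence interval on beta to make this call.
    """
    if beta < 1 - tol:
        return "infant mortality: fix manufacturing quality / add burn-in"
    if beta > 1 + tol:
        return "wear-out: implement preventive maintenance"
    return "roughly constant hazard: focus on rapid repair and spares"

print(beta_recommendation(0.6))   # quality problem
print(beta_recommendation(2.4))   # wear-out
```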

Skip Weibull Analysis When:

Failure rates are truly constant over time. If your failure pattern shows β ≈ 1 (constant hazard), use simpler exponential distribution models. Weibull reduces to exponential in this case, so the added complexity provides no benefit. Electronics without wear-out often fit this pattern.
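A quick numeric check of this reduction: with β = 1, the Weibull CDF 1 − exp(−(t/η)^β) is exactly the exponential CDF with mean η, so the extra parameter buys nothing.

```python
import numpy as np

t = np.linspace(0, 5000, 200)
eta = 1200  # characteristic life; equals the mean life when beta == 1

weibull_cdf = 1 - np.exp(-(t / eta) ** 1.0)  # Weibull with beta = 1
exponential_cdf = 1 - np.exp(-t / eta)       # exponential with mean eta

# identical curves: the Weibull fit adds no information here
assert np.allclose(weibull_cdf, exponential_cdf)
```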

You have multiple distinct failure modes mixing together. A motor that fails from bearing wear, electrical shorts, and corrosion creates a mixed distribution that violates Weibull assumptions. Either separate failures by mode and analyze each independently, or use competing risks methods that explicitly model multiple failure mechanisms.

You need to compare many variables simultaneously. Weibull handles one or two covariates (operating temperature, load level) reasonably well. For analyzing 10+ factors—material grade, supplier, production line, installation date, geographic region, usage intensity—use accelerated failure time models or Cox regression instead. These methods handle multivariate analysis more efficiently.

Your data includes repairable systems with multiple failures per unit. Weibull models time-to-first-failure. If you repair equipment and track multiple failures on the same unit, you need recurrence models or repairable system analysis, not standard Weibull. The independence assumption breaks when units experience repeated failures.

Step 1: Collect the Right Data (Most Failures Happen Here)

Weibull analysis fails most often from poor data collection, not mathematical errors. Before opening analysis software, verify you have the correct data structure and quality.

Required Data Elements

Time-to-event for each unit. This is operating hours, cycles, mileage, or calendar time from start of exposure until failure or last observation. Critical: use the same time scale for all units. Don't mix calendar days for some units and operating hours for others.

Event status for each unit. Binary indicator: 1 for failed units, 0 for censored units (still operating or lost to follow-up). This distinguishes units that actually failed from those where you only know they survived past a certain point.

Failure mode identification (when relevant). If you're analyzing bearing failures specifically, exclude units that failed from other causes. Mixed failure modes create composite distributions that confuse interpretation. Root cause analysis determines failure modes before statistical analysis begins.
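A minimal sketch of this data structure, assuming pandas (the column names are illustrative, not required):

```python
import pandas as pd

# One row per unit; the same time scale (operating hours) for every row
data = pd.DataFrame({
    "unit_id":         ["P-101", "P-102", "P-103", "P-104"],
    "operating_hours": [845,     1200,    620,     1200],
    "failed":          [1,       0,       1,       0],       # 1 = failure, 0 = censored
    "failure_mode":    ["bearing", None,  "seal",  None],    # recorded only for failed units
})

print(data)
```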

Common Data Collection Mistakes and Corrections

Mistake: Using calendar time when operating time matters. Two identical pumps installed the same day fail at different calendar times because one runs 24/7 while the other runs 8 hours daily. Calendar time creates artificial variation. Track actual operating hours or cycles.

Correction: Install hour meters or cycle counters. If historical data lacks this, estimate operating time from production records. Imperfect operating time estimates beat precise but irrelevant calendar times.

Mistake: Discarding censored observations. You only analyze failed units, treating unfailed units as "not useful yet." This throws away information and biases results toward shorter failure times, making reliability look worse than reality.

Correction: Record censored observations with their survival time and event status = 0. A pump running 1200 hours without failure tells you something—it survived at least 1200 hours. Weibull methods use this information properly.

Mistake: Mixing different stress levels without accounting for it. Units operating at different temperatures, loads, or environments have different failure distributions. Combining them assumes all operate under identical conditions.

Correction: Either stratify analysis by operating conditions (analyze high-temp separately from low-temp), or use accelerated life testing methods that explicitly model stress effects. Don't pool data from fundamentally different environments.

Mistake: Including early-life handling failures with wear-out failures. Units damaged during shipping or installation fail for different reasons than units wearing out from use. Mixing these modes obscures both patterns.

Correction: Define a burn-in period or qualification phase. Exclude failures from the first 24-48 hours of operation, or analyze early failures separately as installation/quality issues distinct from operational reliability.

Minimum Sample Size Reality Check

Weibull analysis works with surprisingly small samples if you handle censoring correctly. As rough guidance: fewer than 10 failures supports only preliminary screening, 15-20 failures enable basic reliability estimates, and 30+ failures deliver stable parameter estimates with usefully narrow confidence intervals.

Remember: failures, not total units, drive precision. Testing 500 units with 20 failures provides similar statistical power to testing 100 units with 20 failures. The 480 censored observations add information but the 20 failures dominate parameter estimation.

Step 2: Plot Your Data Before Calculating Anything

Jumping straight to parameter estimation misses the insights visible in properly constructed plots. Weibull probability plots reveal failure patterns, identify outliers, and validate whether Weibull distribution fits your data—all before touching mathematical formulas.

Creating Weibull Probability Plots

A Weibull probability plot uses special axis scaling that transforms the Weibull cumulative distribution into a straight line. Deviations from linearity indicate Weibull distribution doesn't fit your data.

# Python implementation with matplotlib and scipy
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Prepare failure data
failure_times = np.array([245, 389, 567, 623, 778, 834, 891, 945,
                          1023, 1145, 1267, 1389, 1456, 1523, 1678])

# Rank and calculate plotting positions (median rank)
n = len(failure_times)
ranks = np.arange(1, n+1)
median_ranks = (ranks - 0.3) / (n + 0.4)

# Create Weibull probability plot
fig, ax = plt.subplots(figsize=(10, 6))

# Transform to log-log scale for linearity
x = np.log(failure_times)
y = np.log(-np.log(1 - median_ranks))

ax.scatter(x, y, alpha=0.6, s=50)
ax.set_xlabel('ln(Time)', fontsize=12)
ax.set_ylabel('ln(-ln(1-F(t)))', fontsize=12)
ax.set_title('Weibull Probability Plot', fontsize=14)
ax.grid(True, alpha=0.3)

# Fit line to extract parameters
slope, intercept = np.polyfit(x, y, 1)
beta = slope  # Shape parameter
eta = np.exp(-intercept/slope)  # Characteristic life

# Add fitted line
x_line = np.linspace(x.min(), x.max(), 100)
y_line = slope * x_line + intercept
ax.plot(x_line, y_line, 'r--', linewidth=2,
        label=f'β={beta:.2f}, η={eta:.0f}')
ax.legend()

plt.tight_layout()
plt.show()

What to Look For in Probability Plots

Straight line = good fit. Data points falling close to a straight line confirm Weibull distribution fits your failures. This validates subsequent parameter estimates and predictions.

Curvature = distribution mismatch or mixed modes. Systematic upward or downward curvature suggests Weibull doesn't fit. Upward curvature at high times indicates log-normal might fit better. S-shaped curves often mean mixed failure modes—two different mechanisms creating distinct populations.

Outliers = investigate root causes. Points far from the line represent unusual failures. A cluster of very early failures suggests quality defects. Extremely late failures might indicate a different usage pattern or superior manufacturing for those specific units. Don't automatically remove outliers—understand them first.

Slope = shape parameter β. Steep slope (β > 1) means increasing failure rate—wear-out dominates. Gentle slope (β < 1) means decreasing failure rate—infant mortality or quality problems. Slope near 1 means constant failure rate—random events.

The 5-Minute Validation Test

Before proceeding with analysis, spend 5 minutes on this plot check: Do points fall reasonably close to a straight line? If yes, continue. If no, investigate why before calculating parameters. A curved probability plot means your parameter estimates will be biased and predictions unreliable. This upfront investment saves hours of troubleshooting later.
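One rough numeric proxy for this eyeball test, reusing the failure times and median ranks from the plot above, is the correlation coefficient of the transformed points. The ~0.95 cutoff is an informal rule of thumb, not a calibrated critical value:

```python
import numpy as np

failure_times = np.array([245, 389, 567, 623, 778, 834, 891, 945,
                          1023, 1145, 1267, 1389, 1456, 1523, 1678])

n = len(failure_times)
median_ranks = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

# same transformation as the probability plot
x = np.log(failure_times)
y = np.log(-np.log(1 - median_ranks))

r = np.corrcoef(x, y)[0, 1]
print(f"plot correlation: {r:.3f}")
# informal screen: values well below ~0.95 usually mean visible curvature
if r < 0.95:
    print("visible curvature likely - investigate before fitting parameters")
```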

Step 3: Calculate Parameters That Drive Decisions

Weibull analysis produces two parameters that determine your reliability strategy. Understanding what these numbers mean—and how to act on them—matters more than calculation mechanics.

Shape Parameter (β): Your Failure Pattern Fingerprint

The shape parameter reveals whether you're fighting infant mortality, random failures, or wear-out. This distinction determines where you invest improvement resources.

β < 1 (Decreasing failure rate): Infant mortality dominates. Failures occur early in life then decline—the signature of manufacturing defects and quality escapes.

Action: Implement burn-in testing to catch defects before shipping. Improve incoming inspection for critical components. Review assembly procedures for common early-failure units. These failures are preventable through quality improvements, not maintenance programs.

β ≈ 1 (Constant failure rate): Random failures with no time dependence. The probability of failure tomorrow equals the probability next year.

Action: Preventive maintenance provides limited benefit—failures aren't predictable by age. Focus on rapid repair and spare parts availability. Design for replaceability rather than prevention. Time-based maintenance wastes resources; condition-based monitoring adds value.

β > 1 (Increasing failure rate): Wear-out dominates. Failure probability increases with age.

Action: Time-based preventive maintenance is cost-effective. Replace components before failure based on age. Calculate optimal replacement intervals balancing maintenance costs against failure consequences. Higher β values (β > 3) make preventive replacement increasingly economical.

Characteristic Life (η): When Failures Happen

Characteristic life represents the time at which 63.2% of units have failed (for any β value). This provides a scale-free comparison across different distributions.

More useful for business decisions: the median life (time when 50% have failed) and B10 life (time when 10% have failed). These translate directly to warranty exposure and spare parts requirements.

# Calculate key lifetime percentiles from Weibull parameters
def weibull_percentile(p, beta, eta):
    """
    Calculate time when p fraction have failed
    p: failure fraction (0.1 for B10, 0.5 for median)
    beta: shape parameter
    eta: characteristic life
    """
    return eta * (-np.log(1 - p))**(1/beta)

# Example: β=2.5, η=1200 hours
beta = 2.5
eta = 1200

b10 = weibull_percentile(0.10, beta, eta)
b50 = weibull_percentile(0.50, beta, eta)
b90 = weibull_percentile(0.90, beta, eta)

print(f"B10 life: {b10:.0f} hours - 10% failed")
print(f"B50 life (median): {b50:.0f} hours - 50% failed")
print(f"B90 life: {b90:.0f} hours - 90% failed")

# Output:
# B10 life: 488 hours - 10% failed
# B50 life (median): 1036 hours - 50% failed
# B90 life: 1675 hours - 90% failed

Use B10 life for warranty period decisions—you're guaranteeing products survive this long. Use median life for maintenance planning—half your fleet needs attention by this point. Use B90 for spare parts inventory—you'll need replacements for 90% of units by this time.

Confidence Intervals: Acknowledging Uncertainty

Parameter estimates from limited data have uncertainty. Confidence intervals quantify this uncertainty, preventing overconfident decisions based on small samples.

import numpy as np
from scipy.stats import weibull_min

def weibull_mle_with_ci(times, events, alpha=0.05):
    """
    Maximum likelihood estimation with confidence intervals
    times: failure/censoring times
    events: 1 for failure, 0 for censored
    alpha: significance level (0.05 for 95% CI)
    """
    # Fit Weibull using maximum likelihood
    # Note: weibull_min.fit uses only the failed units, so censored
    # observations are dropped here - a proper censored MLE is preferable
    failed_times = times[events == 1]
    shape, loc, scale = weibull_min.fit(failed_times, floc=0)

    # Calculate confidence intervals using profile likelihood
    # (simplified version - production code should use proper likelihood)
    n_boot = 1000
    boot_shapes = []
    boot_scales = []

    for i in range(n_boot):
        # Bootstrap resample
        indices = np.random.choice(len(times), size=len(times), replace=True)
        boot_times = times[indices]
        boot_events = events[indices]
        boot_failed = boot_times[boot_events == 1]

        if len(boot_failed) > 3:  # Minimum for fitting
            s, l, sc = weibull_min.fit(boot_failed, floc=0)
            boot_shapes.append(s)
            boot_scales.append(sc)

    # Calculate percentile confidence intervals
    shape_ci = np.percentile(boot_shapes, [100*alpha/2, 100*(1-alpha/2)])
    scale_ci = np.percentile(boot_scales, [100*alpha/2, 100*(1-alpha/2)])

    return {
        'shape': shape,
        'shape_ci': shape_ci,
        'scale': scale,
        'scale_ci': scale_ci
    }

# Example with confidence intervals
results = weibull_mle_with_ci(times=failure_times,
                               events=np.ones(len(failure_times)))
print(f"Shape β: {results['shape']:.2f} (95% CI: {results['shape_ci'][0]:.2f}-{results['shape_ci'][1]:.2f})")
print(f"Scale η: {results['scale']:.0f} (95% CI: {results['scale_ci'][0]:.0f}-{results['scale_ci'][1]:.0f})")

Report confidence intervals alongside point estimates in presentations. Wide intervals signal "collect more data before making expensive decisions." Narrow intervals justify confident action. This transparency builds trust with stakeholders.

Step 4: Translate Parameters Into Business Decisions

Parameters mean nothing until converted into actions. This section maps Weibull results to specific business decisions across common reliability challenges.

Decision 1: Set Warranty Periods That Balance Cost and Customer Satisfaction

Your warranty period determines both customer confidence and financial exposure. Too short, customers perceive poor quality. Too long, you pay for failures you didn't need to cover.

Decision framework: Set the warranty period between B05 and B10 life, so that only 5-10% of units fail within warranty. This protects customers from early failures while avoiding excessive late-failure costs.

# Calculate warranty cost exposure
def warranty_cost_analysis(beta, eta, warranty_period, population, cost_per_failure):
    """
    Estimate warranty costs based on Weibull parameters
    """
    # Probability of failure within warranty
    p_failure = 1 - np.exp(-(warranty_period/eta)**beta)

    # Expected failures
    expected_failures = population * p_failure

    # Total cost
    total_cost = expected_failures * cost_per_failure

    return {
        'warranty_period': warranty_period,
        'failure_probability': p_failure,
        'expected_failures': expected_failures,
        'total_cost': total_cost,
        'cost_per_unit': total_cost / population
    }

# Example: Setting warranty for motor population
beta = 2.8  # Wear-out pattern
eta = 2400  # Characteristic life in hours

# Evaluate warranty options
for warranty_hrs in [500, 1000, 1500, 2000]:
    result = warranty_cost_analysis(beta, eta, warranty_hrs,
                                   population=10000, cost_per_failure=450)
    print(f"\n{warranty_hrs} hour warranty:")
    print(f"  Failure probability: {result['failure_probability']:.1%}")
    print(f"  Expected failures: {result['expected_failures']:.0f}")
    print(f"  Total cost: ${result['total_cost']:,.0f}")
    print(f"  Cost per unit: ${result['cost_per_unit']:.2f}")

# Output helps choose warranty period balancing coverage and cost

If B10 life is 600 hours but your competitor offers 1000-hour warranty, either accept higher warranty costs as customer acquisition investment, or improve design to extend B10 life to 1200+ hours before matching competitor terms.

Decision 2: Schedule Preventive Maintenance at Optimal Intervals

Preventive maintenance only makes economic sense when β > 1. Calculate the interval that minimizes total costs—balancing planned maintenance expenses against failure consequences.

def optimal_maintenance_interval(beta, eta, maint_cost, failure_cost):
    """
    Find maintenance interval minimizing total cost
    Only valid when beta > 1 (wear-out failures)
    """
    if beta <= 1:
        return "Preventive maintenance not economical - failures are random (β ≤ 1)"

    # Approximate optimal age-replacement interval for Weibull wear-out,
    # valid when planned maintenance is much cheaper than failure (Cp << Cf):
    #   t* = eta * (Cp / (Cf * (beta - 1)))**(1/beta)
    # Higher failure cost or steeper wear-out shortens the interval
    optimal_t = eta * (maint_cost / (failure_cost * (beta - 1)))**(1/beta)

    # Calculate costs at optimal interval
    reliability_at_t = np.exp(-(optimal_t/eta)**beta)
    prob_fail_before_t = 1 - reliability_at_t

    cost_per_cycle = maint_cost + prob_fail_before_t * failure_cost

    return {
        'optimal_interval': optimal_t,
        'reliability_at_interval': reliability_at_t,
        'prob_fail_before_interval': prob_fail_before_t,
        'cost_per_cycle': cost_per_cycle
    }

# Example: Bearing replacement
result = optimal_maintenance_interval(
    beta=3.2,              # Strong wear-out
    eta=3500,              # Characteristic life: 3500 hours
    maint_cost=200,        # Scheduled replacement cost
    failure_cost=3500      # Emergency repair + downtime cost
)

print(f"Optimal maintenance interval: {result['optimal_interval']:.0f} hours")
print(f"Reliability at interval: {result['reliability_at_interval']:.1%}")
print(f"Probability of failing before planned replacement: {result['prob_fail_before_interval']:.1%}")

Higher failure costs relative to maintenance costs justify more frequent prevention. Higher β values (steeper wear-out) justify earlier intervention. When β < 2, age-based replacement rarely beats run-to-failure strategies.

Decision 3: Compare Suppliers Objectively with Statistical Rigor

Supplier A's components have η = 2800 hours, Supplier B's have η = 3200 hours. Is B really better, or is this sampling variation? Confidence intervals and hypothesis tests answer this.

def compare_suppliers(data_a, data_b, alpha=0.05):
    """
    Statistical comparison of two suppliers' reliability
    data_a, data_b: tuples of (times, events) for each supplier
    """
    from scipy.stats import mannwhitneyu

    times_a, events_a = data_a
    times_b, events_b = data_b

    # Fit Weibull to each supplier
    failed_a = times_a[events_a == 1]
    failed_b = times_b[events_b == 1]

    shape_a, _, scale_a = weibull_min.fit(failed_a, floc=0)
    shape_b, _, scale_b = weibull_min.fit(failed_b, floc=0)

    # Calculate median lives
    median_a = scale_a * (np.log(2))**(1/shape_a)
    median_b = scale_b * (np.log(2))**(1/shape_b)

    # Statistical test (Mann-Whitney for robustness; note it compares
    # failed units only - with heavy censoring use a log-rank test instead)
    statistic, p_value = mannwhitneyu(failed_a, failed_b, alternative='two-sided')

    # Practical significance: is difference meaningful?
    percent_improvement = (median_b - median_a) / median_a * 100

    return {
        'supplier_a': {'shape': shape_a, 'scale': scale_a, 'median': median_a},
        'supplier_b': {'shape': shape_b, 'scale': scale_b, 'median': median_b},
        'p_value': p_value,
        'statistically_significant': p_value < alpha,
        'percent_improvement': percent_improvement,
        'recommendation': 'Switch to B' if (p_value < alpha and percent_improvement > 10) else 'Insufficient evidence'
    }

# Example comparison
comparison = compare_suppliers(
    data_a=(supplier_a_times, supplier_a_events),
    data_b=(supplier_b_times, supplier_b_events)
)

print(f"Supplier A median life: {comparison['supplier_a']['median']:.0f} hours")
print(f"Supplier B median life: {comparison['supplier_b']['median']:.0f} hours")
print(f"Improvement: {comparison['percent_improvement']:.1f}%")
print(f"Statistical significance: p={comparison['p_value']:.3f}")
print(f"Recommendation: {comparison['recommendation']}")

Require both statistical significance (p < 0.05) and practical significance (>10% improvement) before changing suppliers. Small sample sizes often show large parameter differences that aren't statistically reliable—confidence intervals prevent premature decisions.

Real-World Example: Hydraulic Pump Reliability Crisis

A manufacturing plant faced escalating maintenance costs and unplanned downtime from hydraulic pump failures. Current practice: run pumps to failure, then emergency replacement at 3x the cost of planned maintenance. The reliability team had 18 months of failure data across 85 pumps.

The Data

Over the 18-month window, 32 of the 85 pumps failed in service; the remaining 53 were still running and entered the analysis as censored observations.

Analysis Process

Step 1: Probability plot revealed clear Weibull fit. Points fell close to a straight line with no systematic curvature. This validated using Weibull distribution for subsequent analysis.

Step 2: Parameter estimation produced β = 2.4, η = 1850 hours. The shape parameter β = 2.4 indicated wear-out failures—failure rate increases with age. Characteristic life η = 1850 hours set the time scale.

Step 3: Calculated critical reliability metrics. From β = 2.4 and η = 1850 hours: B10 life ≈ 720 hours (10% of pumps failed), median life ≈ 1590 hours (half the fleet failed), and roughly 60% of pumps still running at 1400 hours.

Business Decisions Based on Analysis

Decision 1: Implement preventive replacement at 1400 hours. With β = 2.4 (wear-out pattern), preventive maintenance is economical. The optimal interval calculation balanced the cost of a planned replacement against the roughly 3x cost of an emergency repair with downtime.

At this interval, about 60% of pumps are still running when replaced, and most remaining failures shift from unplanned emergencies to scheduled work. The economic calculation showed this approach reduces total costs by 43% compared to run-to-failure.

Decision 2: Escalate early failures to supplier. Four pumps failed before 500 hours—well below the roughly 720-hour B10 life. Root cause analysis identified a manufacturing defect in one production batch. Supplier agreed to replace affected units and improved quality control.

Decision 3: Stock 12 pumps as strategic spares. With 85 pumps on 1400-hour replacement cycles, steady-state inventory analysis showed 12 spare pumps optimize carrying costs against stockout risk. Previous "guess" inventory of 5 spares led to frequent emergency orders at premium pricing.

Results After 12 Months

The program delivered roughly $127,000 in annual savings, with pump replacements shifting from unplanned emergencies to scheduled work and the defective batch resolved with the supplier.

Implementation Success Factor

The analysis took 4 hours of analyst time. Implementation required only integrating replacement intervals into the CMMS (computerized maintenance management system). No complex systems or major process changes. This simplicity enabled rapid deployment and immediate cost savings—demonstrating that Weibull analysis delivers practical value, not just theoretical insights.

Handling Special Cases and Edge Conditions

Real-world reliability data rarely matches textbook examples. These special cases require modified approaches to maintain analytical validity.

Mixed Failure Modes: When One Distribution Isn't Enough

Equipment fails from multiple mechanisms—bearings wear out, seals leak, electronics fail randomly. Combining all failures into one Weibull distribution creates a poor fit and misleading parameters.

Identification: Probability plots show distinct slopes for different time ranges, or S-shaped curvature indicating population mixture. Root cause analysis reveals multiple failure mechanisms.

Solution: Separate failures by mode and analyze each independently. If your 100 failures include 60 bearing failures, 25 seal failures, and 15 electrical failures, create three separate Weibull analyses. This reveals which modes drive reliability and where improvements deliver maximum impact.

# Analyze by failure mode
# Assumes a DataFrame df with 'failure_mode', 'operating_hours', and 'failed'
# columns, plus a fit_weibull(times, events) helper returning (beta, eta) -
# for example, a censored MLE from your reliability library of choice
for mode in ['bearing', 'seal', 'electrical']:
    mode_data = df[df['failure_mode'] == mode]

    # Fit Weibull for this mode
    times = mode_data['operating_hours'].values
    events = mode_data['failed'].values

    # Estimate parameters
    beta, eta = fit_weibull(times, events)

    print(f"\n{mode.capitalize()} failures:")
    print(f"  Shape β: {beta:.2f}")
    print(f"  Scale η: {eta:.0f} hours")
    print(f"  B10 life: {weibull_percentile(0.10, beta, eta):.0f} hours")

Competing risks models provide more sophisticated analysis when you need to predict which failure mode occurs first and how eliminating one mode affects overall reliability. Use specialized software like R's survival package for competing risks analysis.

Heavily Censored Data: Extracting Signal from Limited Failures

You've tested 200 units for 1000 hours each, but only 8 failures occurred. That's 96% censoring—can you trust the analysis?

Yes, with caveats. Maximum likelihood estimation handles censoring properly, extracting information from both failures and survivors. Your 192 censored observations contribute—they tell you at least 192 units survived 1000 hours.
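These mechanics can be sketched by maximizing the censored Weibull log-likelihood directly (a simplified illustration on simulated data; production work should prefer a vetted package such as lifelines): failed units contribute the log-density, survivors contribute only the log survival probability.

```python
import numpy as np
from scipy.optimize import minimize

def fit_weibull_censored(times, events):
    """MLE for (beta, eta) with right-censored data.

    Failures contribute log f(t); censored units contribute log S(t).
    Optimizes over log-parameters to keep beta and eta positive.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events)

    def neg_loglik(params):
        beta, eta = np.exp(params)
        z = (times / eta) ** beta
        # log f(t) = log(beta/eta) + (beta-1)*log(t/eta) - z ; log S(t) = -z
        ll = np.sum(events * (np.log(beta / eta)
                              + (beta - 1) * np.log(times / eta))) - np.sum(z)
        return -ll

    res = minimize(neg_loglik, x0=[0.0, np.log(times.mean())],
                   method="Nelder-Mead")
    beta_hat, eta_hat = np.exp(res.x)
    return beta_hat, eta_hat

# 200 units tested for 1000 hours: simulate, censor heavily, refit
rng = np.random.default_rng(42)
true_beta, true_eta = 1.8, 2500
raw = true_eta * rng.weibull(true_beta, size=200)
events = (raw <= 1000).astype(int)       # only units failing before 1000 h observed
times = np.minimum(raw, 1000)            # survivors censored at 1000 h

beta_hat, eta_hat = fit_weibull_censored(times, events)
print(f"beta ≈ {beta_hat:.2f}, eta ≈ {eta_hat:.0f} (true: 1.8, 2500)")
```

The censored observations pull the estimates toward reality; fitting only the failed units would badly understate η.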

Limitations: Precision decreases with heavy censoring. Report wide confidence intervals reflecting this uncertainty. Avoid extrapolating beyond the test duration—you have no information about behavior past 1000 hours.

Improved approach: Use accelerated life testing. Test some units at higher stress (temperature, load, cycling rate) to generate more failures quickly, then use physics-based models to translate results to normal operating conditions.

Left Truncation: When You Don't Observe Early Failures

You inherit a reliability tracking system that started monitoring 500 pumps already in service for 6 months. Pumps that failed in the first 6 months aren't in your dataset. This left truncation biases estimates.

Solution: Use left-truncated Weibull methods that account for delayed entry. In survival analysis software, specify both entry time and failure/censoring time for each unit.

# Example with left truncation (illustrative data)
import pandas as pd
from lifelines import WeibullFitter

# Data structure: entry_time, exit_time, event
df = pd.DataFrame({
    'entry': [180, 180, 180, 180, 180],   # all entered at 180 days (6 months)
    'exit': [420, 680, 890, 1050, 1200],  # failure or censoring time
    'failed': [1, 1, 0, 1, 0]             # event indicator
})

wf = WeibullFitter()
# lifelines handles delayed entry via the entry argument to fit()
wf.fit(df['exit'], df['failed'], entry=df['entry'])

print(f"Shape β: {wf.rho_:.2f}")
print(f"Scale η: {wf.lambda_:.0f}")

Ignoring left truncation makes reliability look better than reality—you've excluded early failures from your dataset. Always account for truncation when units enter observation after start of exposure.

Interval Censoring: When You Only Know Failures Occurred Between Inspections

Monthly inspections reveal failures, but exact failure time between inspections is unknown. Unit inspected at day 30 (functioning) and day 60 (failed) had failure somewhere in that 30-day window.

Solution: Interval-censored Weibull methods treat failure time as falling within an interval rather than at a point. Software packages like R's icenReg or Python's lifelines handle interval censoring.

If inspection intervals are short relative to characteristic life (inspection interval < η/10), you can approximate by using interval midpoints. For longer intervals, proper interval-censored methods prevent bias.
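The midpoint shortcut can be sketched in a few lines (illustrative inspection data; valid only when the short-interval condition above holds, otherwise use a proper interval-censored fitter):

```python
import numpy as np

# (last inspection functioning, first inspection failed) pairs, in days
intervals = np.array([
    [30, 60],
    [60, 90],
    [0, 30],
    [90, 120],
])

# approximate each unknown failure time by its interval midpoint
approx_failure_times = intervals.mean(axis=1)
print(approx_failure_times)  # midpoints: 45, 75, 15, 105
# feed these into the standard Weibull fit as if they were exact failure times
```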

Try Weibull Analysis on Your Data

Analyze Your Own Data — upload a CSV and run this analysis instantly. No code, no setup.

Get Reliability Insights in Minutes

MCP Analytics provides automated Weibull analysis with interactive probability plots, confidence intervals, and business-focused reporting. Upload your failure data—even with censored observations—and receive warranty cost projections, optimal maintenance intervals, and supplier comparisons in a format executives understand.

No coding required. No statistical expertise assumed. Just your data and clear answers to reliability questions.


From Weibull Parameters to Organizational Impact

Weibull analysis succeeds when it changes decisions, not just produces parameters. The shape parameter β tells you where to invest: quality control for β < 1, preventive maintenance for β > 1, and rapid response for β ≈ 1. The characteristic life η sets the timeline for warranty periods, maintenance schedules, and spare parts requirements.

This step-by-step methodology—collect proper data including censored observations, plot before calculating to validate fit, estimate parameters with confidence intervals, translate results into specific decisions—transforms Weibull from academic exercise to practical reliability tool. The hydraulic pump example demonstrated this path: 85 pumps with 32 failures became 1400-hour maintenance intervals and $127,000 annual savings.

Most organizations have reliability data sitting unused because translating failure times into actions seems too complex. Weibull analysis makes this translation systematic. Probability plots reveal whether your distribution fits. Parameter confidence intervals acknowledge uncertainty appropriately. Optimal interval calculations balance costs mathematically rather than through opinions.

Start with data you already collect—warranty claims, maintenance records, test results. Structure it with time-to-event and failure indicators. Create probability plots to validate Weibull fit. Estimate parameters and confidence intervals. Then ask the business questions: Should we extend warranty from 1 year to 18 months? Should we replace bearings every 1500 hours or 2000 hours? Is Supplier B worth the 8% price premium?

Weibull analysis answers these questions with data instead of intuition. The methodology outlined here provides a repeatable process that works whether you're analyzing medical devices, industrial equipment, consumer products, or software reliability. The mathematics remains the same. The business context determines which decisions to make with the results.

When your next reliability question arises—warranty costs escalating, maintenance budgets under scrutiny, supplier quality disputes—remember that Weibull analysis delivers answers from data you likely already have. Implement the four-step process: collect right data, plot before calculating, estimate with confidence intervals, translate to decisions. This approach consistently produces actionable insights that improve reliability and reduce costs.

Frequently Asked Questions

What does the Weibull shape parameter tell me about my failure pattern?

The shape parameter (β) determines your failure pattern: β < 1 means decreasing failure rate (infant mortality), β = 1 means constant failure rate (random failures), and β > 1 means increasing failure rate (wear-out). A bearing with β = 3.2 shows accelerating wear-out requiring preventive maintenance, while electronics with β = 0.7 indicate early-life defects requiring quality improvements. This single number tells you whether to focus on design improvements, quality control, or preventive maintenance schedules.

How many failures do I need for reliable Weibull analysis?

Minimum 15-20 failures for basic reliability estimates, though 30+ provides more stable parameter estimates with narrower confidence intervals. You can use censored data (unfailed units) to supplement actual failures—these contribute information even though they haven't failed. With only 10 failures from testing 100 units, you still have valuable information from the 90 survivors. The key is proper handling of censored observations using maximum likelihood estimation, not waiting for everything to fail.

When should I use Weibull analysis instead of other reliability methods?

Use Weibull when you need to model time-to-failure with changing hazard rates, predict warranty costs before they occur, optimize maintenance schedules based on wear-out patterns, or analyze both failed and unfailed units together. Choose exponential distribution for constant failure rates (β = 1), log-normal for symmetric failure distributions with unimodal hazards, or Cox regression when you need to compare multiple factors simultaneously without assuming a specific distribution form.

Can Weibull analysis predict individual component failure times?

No. Weibull provides probability distributions for populations, not deterministic predictions for individual units. It tells you that 10% of components will fail by 500 hours and 50% by 1200 hours, but cannot predict when unit #A47392 specifically will fail. Use these population probabilities for maintenance scheduling (plan for median life), spare parts inventory (stock for B90 life), and warranty reserve planning (budget for expected failures) rather than individual component predictions.

How do I handle mixed failure modes in Weibull analysis?

Mixed failure modes (e.g., bearing wear plus electrical shorts) violate the single-distribution assumption and create poor fits. First, identify failure modes through root cause analysis of each failed unit. Then either model each mode separately using competing risks methods (analyzing bearing failures independently from electrical failures), or fit mixture Weibull models with multiple shape parameters. Plotting failures by mode on probability paper reveals mixing—you'll see multiple distinct slopes or S-shaped curvature rather than a single straight line. Separate analysis by mode reveals which mechanisms drive reliability and where improvements provide maximum impact.