Research Methodology

How we gathered and verified information about experimentation platforms

Sources and Methodology

Research Approach

Our comparison of experimentation platforms is based on a comprehensive review of publicly available information, including:

  • Official documentation and websites for commercial platforms
  • Published research papers and technical blogs from Microsoft and Netflix
  • Conference presentations and technical talks
  • Books and academic literature on experimentation systems
  • Industry reports and case studies

Key References

  • Microsoft ExP Platform:

    • Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
    • Gupta, S., Ulanova, L., Bhardwaj, S., Dmitriev, P., Raff, P., & Fabijan, A. (2018). The Anatomy of a Large-Scale Online Experimentation Platform. IEEE International Conference on Software Architecture (ICSA).
    • Bajpai, A., Gupta, S., Nagpal, S., Bhardwaj, S., Dmitriev, P., & Fabijan, A. (2022). Extensible Experimentation Platform: Effective AB Test Analysis at Scale. IEEE International Conference on Software Architecture (ICSA).
  • Netflix XP Platform:

    • Xu, Y., Chen, N., Fernandez, A., Sinno, O., & Bhasin, A. (2015). From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks. KDD '15. (This paper describes LinkedIn's experimentation platform; it is included as comparable context for large-scale A/B testing.)
    • Netflix Technology Blog. (Various dates). Articles on experimentation and personalization.
    • Conference presentations by Netflix engineers at QCon, Strata, and other technical conferences.
  • Commercial Platforms (Statsig, Eppo):

    • Official documentation and feature descriptions from company websites
    • Technical blogs and case studies published by the companies
    • Product demos and webinars
    • Public pricing information and feature comparisons

Verification Process

To ensure accuracy in our comparison, we followed these verification steps:

  • Cross-referencing information across multiple sources when available
  • Prioritizing recent sources (published within the last two to three years)
  • Distinguishing between confirmed features and inferred capabilities based on published information
  • Providing a mechanism for reporting inaccuracies to continuously improve our comparison

Limitations and Considerations

Important context for interpreting our comparison

Information Availability

There are significant differences in the amount and detail of information available for each platform:

  • Microsoft ExP: Extensively documented in research papers and books, but because it is an internal platform, some published details may be outdated or incomplete
  • Netflix XP: Less extensively documented than Microsoft's platform, with information primarily from conference talks and blog posts
  • Commercial Platforms: Documentation focuses on marketing and user-facing features, with less detail on internal architecture and implementation

Evolving Platforms

All experimentation platforms are continuously evolving, which presents challenges for comparison:

  • Features and capabilities may change over time
  • Pricing models and tiers may be updated
  • Internal platforms like Microsoft ExP and Netflix XP may have significant changes that aren't publicly documented
  • Commercial platforms regularly release new features that may not be reflected in our comparison

Context-Specific Considerations

The suitability of an experimentation platform depends heavily on organizational context:

  • Scale of experimentation (number of users, experiments, metrics)
  • Existing technical infrastructure and integration requirements
  • Team size, expertise, and resources available for implementation and maintenance
  • Industry-specific requirements and constraints
  • Budget considerations and ROI expectations