The ability to reproduce the experimental results of others is a hallmark of science. It is part of the checks and balances process: a study is on firmer ground when its results can be replicated by the same team and also reproduced and verified by independent researchers. However, even though reproducibility is a standard by which scientific work is supposed to be judged, it is a standard that is rarely met. Not only are independent researchers often unable to recreate the results of others, oftentimes the same team of researchers is unable to recreate its own work using the same methods and materials that produced the results originally.
Irreproducibility is just another item on a long laundry list of troublesome issues (cells/antibiotics/chemicals/nutrients used, inability to replicate the microenvironment, contamination altering results) plaguing the use of cell culture for “viral” identification. Being able to reproduce the results of other researchers is an essential requirement, but this has become a huge problem in science and has been shown to be prevalent in cell culture studies. This is a crisis which has only become worse with time. A few articles highlight this glaring mess.
The first two articles offer insight into the state of the problem. The inability to reproduce results is a pervasive and ultimately expensive issue that has been known for a while and has not been corrected. There is no single factor but a multitude of them leading to the lack of reproducibility, such as study design, lab conditions, reference materials, data analysis, and other technical/biological variables.
Reproducibility: changing the policies and culture of cell line authentication
“Advances in life science research build upon the reproducibility of previously published data and findings, yet irreproducibility in basic and preclinical biological research is a pervasive, expensive and increasingly well-recognized problem [1,2]. Also called replication, validation, verification or reanalysis [3], in simplest terms, reproducibility means that an experiment should be able to be confirmed in an independent laboratory with results that broadly support the conclusions of the original scientist. Excluding deliberate scientific misconduct [4], irreproducibility typically results from errors or flaws in one or more of the following areas of the research process: reference materials, study design, laboratory protocols, and data analysis and reporting [5,6]. Irreproducible preclinical research contributes to both delays and increased costs in drug discovery.”
Potential Causes of Irreproducibility Revealed
“While much of biology research suffers from a lack of reproducibility, no single factor has emerged as the driver of this problem. In a multi-lab study published this week in Cell Systems, researchers have attempted to reproduce the results of an assay in which cultured cells were treated with cancer drugs. Their lack of success highlights the role that technical variables play in the ability to repeat experiments.
“The key thing [is] raising awareness of this variability and the fact that a lot of it will be really difficult to control,” says Paula Bates, who works in cancer drug development at the University of Louisville and did not participate in the study. “It’s especially important for projects where there is a lot of data being collected and compared.”
“Heiser’s group and three other labs received human mammary epithelial cells, media, drugs, and a detailed protocol from the group of Peter Sorger, who leads the LINCS group at Harvard University. All five teams (Sorger’s included) cultured the cells and treated them with each of eight small-molecule drugs, then measured cell viability to estimate drug potency.
In initial experiments, the results revealed drug potencies that varied as much as 200-fold between groups. The researchers investigated the reasons for the incongruent findings and identified several technical factors. For instance, not everyone used the same method for counting cells, and direct counts using a microscope did not correlate well with measuring ATP levels in lysed cells, which is often used as a proxy for cell number. Groups also used different image-processing algorithms to count live cells, and this seemed to make a difference as well. The researchers also determined that the location in a cell culture plate in which cells were grown could lead to variation in results between labs. These so-called edge effects come from uneven evaporation of culture media and temperature gradients.
Once the teams addressed these issues with a more standardized protocol and randomization of which wells in a plate are treated with drug, the experiments’ replicability—both between groups and within single labs over time—improved. But the results were more consistent within a group than between groups, a factor that in the paper the authors write could be due to possible differences in pipetting technique, variations in equipment, or failure to stick to the protocol due to “a belief—belied by the final analysis—that counting cells is such a simple procedure that different assays can be substituted for each other without consequence.”
But following the protocol exactly is only part of the story, Sorger tells The Scientist. Another important part of understanding reproducibility is that many experiments depend on some biological component that goes unmeasured. “Whether cells grow or die in any given condition or . . . do any other response is actually contextually dependent on where they’re growing, how fast they’re growing, what their prior history is, what the medium is,” among other potential biological factors, Sorger says. More comprehensive measurements of this biological context could help improve the translation of preclinical findings to effective treatments, he adds.”
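The randomization step described in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol the LINCS groups used: the function name `randomized_layout`, the 8 x 12 plate size, the treatment labels, and the option to leave the outer ring of wells unused are all hypothetical choices made for the example.

```python
import random
from string import ascii_uppercase


def randomized_layout(treatments, rows=8, cols=12, skip_edges=True, seed=0):
    """Assign each treatment to a randomly chosen well of a rows x cols plate.

    skip_edges=True leaves the outer ring of wells unused. That is an
    assumption of this sketch, not a step taken from the quoted article.
    """
    row_labels = ascii_uppercase[:rows]
    wells = [
        f"{row_labels[r]}{c + 1}"
        for r in range(rows)
        for c in range(cols)
        if not skip_edges or (0 < r < rows - 1 and 0 < c < cols - 1)
    ]
    if len(treatments) > len(wells):
        raise ValueError("more treatments than usable wells")
    rng = random.Random(seed)  # fixed seed so the layout can be regenerated
    chosen = rng.sample(wells, len(treatments))  # sampling without replacement
    return dict(zip(chosen, treatments))


# Hypothetical design: 8 drugs x 3 replicates plus 6 vehicle-control wells.
treatments = [f"drug_{i}" for i in range(1, 9) for _ in range(3)] + ["vehicle"] * 6
layout = randomized_layout(treatments)
for well in sorted(layout)[:5]:
    print(well, layout[well])
```

Because the well positions are drawn at random, no drug is systematically placed where evaporation or temperature gradients are strongest. Recording the seed lets another lab regenerate the same layout.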
The next two articles delve a bit more specifically into the problems directly related to cell culture such as quality control, contamination, cell line misidentification, lack of standardized guidelines, etc.
Quality control: the dark side of cell culture
“However, the explosion in research utilizing cell culture in this way has not been matched by efforts to ensure quality control of the cells in question. This poses even further risk to the successful translation of research, with a lack of quality control resulting in a lack of reproducibility.
Quality control in cell culture includes accurately identifying cells, as well as ensuring there is no contamination and the cells do not change excessively during the culture period. This is difficult when there are no absolute guidelines on methods to do this, which results in quality control procedures varying between laboratories. Sharing of cell lines between laboratories is also common practice.
“Cell culture sometimes feels like a black art, with everyone having their own preferred method,” commented Derfogail Delcassian, a researcher at MIT (MA, USA).”
In Vitro Research Reproducibility: Keeping Up High Standards
“Concern regarding the reproducibility of observations in life science research has emerged in recent years, particularly in view of unfavorable experiences with preclinical in vivo research. The use of cell-based systems has increasingly replaced in vivo research and the application of in vitro models enjoys an ever-growing popularity. To avoid repeating past mistakes, high standards of reproducibility and reliability must be established and maintained in the field of in vitro biomedical research. Detailed guidance documenting the appropriate handling of cells has been authored, but was received with quite disparate perception by different branches in biomedical research. In that regard, we intend to raise awareness of the reproducibility issue among scientists in all branches of contemporary life science research and their individual responsibility in this matter. We have herein compiled a selection of the most susceptible steps of everyday in vitro cell culture routines that have the potential to influence cell quality and recommend practices to minimize the likelihood of poor cell quality impairing reproducibility with modest investment of time and resources.
A survey published in Nature in 2016 (Baker, 2016) evaluating questionnaires on reproducibility in life science research disclosed not only the difficulties researchers have reproducing experiments from other laboratories, but also from their own. Even more surprising was the fact that awareness of this problem was widespread within the scientific community. The inability to reproduce study results, often inherent in observations from academic laboratories, are usually uncovered not without relevant delay, e.g. when potential therapies that are based on these findings transition from preclinical testing to the far more stringent conditions of clinical trials (Collins and Tabak, 2014). Needless to say, the societal costs associated with this problem are intolerable (Freedman et al., 2015). The controversial matter of insufficient reproducibility was, in fact, communicated openly in oncology and cardiovascular biology (Begley and Ellis, 2012; Errington et al., 2014; Libby, 2015). In toxicology, which may better reflect the background of most readers of this journal, awareness of this problem has emerged only gradually in association with insufficient in vivo reproducibility (Kilkenny et al., 2009; Voelkl et al., 2018). Such disclosures, in concert with studies indicating that in vivo data from rats and mice combined can only predict human clinical toxicology of less than 50% of candidate pharmaceuticals (Olson et al., 2000), promoted a revision of several toxicologists’ opinions towards mechanistic in vitro assays from the traditional reliance on pharmacological and toxicological in vivo animal testing.”
“A major concern raised by researchers in different fields of biomedicine was how a cell culture model, often not even originating from the organ of interest, could provide information about multilayer processes and pathological outcomes in humans.”
Insufficient Reproducibility in Cell Models
“A defined assay performed with a defined in vitro model needs to yield identical results— no matter when or where it is performed. As trivial as this statement may appear, its implementation is quite difficult in reality. The Nature survey of 2016 (Baker, 2016) highlighted the degree of inadequate reproducibility in biomedical research and underlined the widespread awareness of the problem within the scientific community. It is, thus, all the more astonishing that systematic comparisons of experimental models applied in different laboratories are rather rare, particularly in the field of in vitro research.”
“An exemplary illustration of this transparency is a publication by Elliott and colleagues (Elliott et al., 2017) assessing the reproducibility of MTS-tetrazolium reduction assay results as indicators of cell viability in an international inter-laboratory comparison study with five independent laboratories. Strict standard operating procedures (SOP) were employed using a sophisticated 96-well plate design that allowed detection of up to seven parameters of assay performance, including accuracy of multi-channel pipetting, cell handling/cell growth, and instrument performance (i.e. plate reader issues) (Rösslein et al., 2015). A549 cells were purchased from two independent, credible, accepted commercial sources and both, seemingly identical, cell cultures were used in all labs. Even under such strict conditions, EC50 values of the two A549 cultures upon CdSO4 treatment differed by a factor of two in all laboratories. In the course of these investigations, cell line authentication was discovered to be one of the main factors influencing assay results. Short tandem repeat sequencing revealed a partial chromosome deletion in one of the cell cultures. Technical aspects also contributed to result variability. For example, simple cell handling steps, such as PBS washing, were identified to significantly change assay outcomes. This example provides a vivid illustration of the impact of seemingly trivial details and the necessity to draw attention to all aspects of in vitro experimentation.
A recent evocative study of the mammary epithelial cell line MCF10A and growth rate inhibition by anti-cancer drugs systematically addressed inter- and intra-study center variations and identified factors contributing to insufficient reproducibility (Niepel et al., 2019). Although the five research centers applied cells and chemicals of the same stock, astonishing center-to-center variations up to 200-fold were observed in growth inhibition rates. Cell seeding, i.e. slight variations in initial cell numbers, was identified as one key source of these variations (for more details see Recommendation 5) (Cell density and medium change). Overall, the subtle interplay between experimental methods and a vast array of poorly defined sources of biological variability was found to be the main cause of the observed irreproducibility. For example, two distinct methods were used to quantify cell viability: a) microscopic cell counting as a direct measure of viable cell number and b) detection of intracellular ATP levels as a proxy of viable cells. ATP levels do not necessarily directly correlate to the number of viable cells, resulting in identical EC50 values for some drugs, but differing greatly for others. Changes in ATP levels following treatment could be the consequence of cell death, effects on cell proliferation or the alteration of cellular ATP metabolism. Furthermore, linearity between cellular ATP levels and cell viability is not justified for many cell types. In several cases, a reduction of ATP levels by almost 50% is tolerated by cells without significant influence on cell viability (Pöltl et al., 2012). In conclusion, while both assays (direct cell counting and ATP measurements) might be quite robust and reproducible per se, they provide different information from their results, e.g. drugs that alter cellular ATP metabolism, and are thus not interchangeable in these cases. 
As a consequence of the huge number of individual biological factors involved, Niepel and colleagues came to the rather discouraging conclusion that “most examples of irreproducibility are themselves irreproducible” (Niepel et al., 2019).
This spectrum of biological factors further depends on the complexity of the cell model applied. The introduction of 2D co-culture models and 3D cell models was motivated by the ambition to recapitulate the natural in vivo environment of cells in a cell culture dish. In fact, cells in a 3D culture differ morphologically and physiologically from their counterparts in a 2D setup (Baharvand et al., 2006; Edmondson et al., 2014). Introduction of the third dimension in a cell culture model results in additional parameters that could potentially affect reproducibility, including spheroid size and consequently the oxygen and nutrient supplies to cells in different layers within the structure; spatial organization of surface receptors involved in interactions with neighboring cells; activation of signal transduction pathways; and induction of gene expression profiles (Vinci et al., 2012). All of these changes ultimately have the potential to influence cell biology and cellular response towards exogenous stressors.”
“Even with the best of intentions, it must be concluded that the limits of reproducibility in cell culture work is reached when confronted with the question of reference standards, particularly for established and widely distributed cell lines. Simply put, which of the currently available and characterized stocks of common cell lines, like HeLa cells, should be considered as the gold standard? Even if a consensus could be reached for individual cell lines, storage capacity limitations force even large cell banks to passage their cells, which necessarily influences the cells in one way or another over time.”
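To make figures such as “a factor of two” or “up to 200-fold” concrete, the fold difference between laboratories is simply the largest estimate divided by the smallest. The sketch below assumes nothing beyond that arithmetic. The function name `fold_range` and the potency values are hypothetical, chosen only to illustrate the calculation, and are not data from Niepel et al. or Elliott et al.

```python
def fold_range(values):
    """Return the ratio of the largest to the smallest value (all must be > 0)."""
    values = list(values)
    if not values or min(values) <= 0:
        raise ValueError("need at least one value, all strictly positive")
    return max(values) / min(values)


# Hypothetical potency estimates (in µM) for one drug, one per laboratory.
ec50_by_lab = {
    "lab_A": 0.05,
    "lab_B": 0.4,
    "lab_C": 1.0,
    "lab_D": 2.5,
    "lab_E": 10.0,
}
print(f"fold difference across labs: {fold_range(ec50_by_lab.values()):.0f}")
```

A fold range near 1 means the laboratories agree. The larger the ratio, the less a single laboratory's potency estimate says about what another laboratory would measure.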
- Irreproducibility is a pervasive and well-known problem
- To avoid repeating past mistakes, high standards of reproducibility and reliability must be established and maintained in the field of in vitro biomedical research
- No single factor has emerged as the reason for the lack of reproducibility and instead it appears to be due to a multitude of factors
- Variables such as the pipetting technique, the protocols followed, the medium used, and the location of the cell culture in the plate can lead to different and irreproducible results
- Lack of quality control, cell-line misidentification, non-standardized methods, and contamination are other areas critical to the irreproducibility of cell culture experiments
- A survey published in Nature in 2016 evaluating questionnaires on reproducibility in life science research disclosed not only the difficulties researchers have reproducing experiments from other laboratories, but also from their own
- Even more surprising was the fact that awareness of this problem was widespread within the scientific community
- A major concern raised by researchers in different fields of biomedicine was how a cell culture model could provide information about multilayer processes and pathological outcomes in humans
- A defined assay performed with a defined in vitro model needs to yield identical results— no matter when or where it is performed
- Systematic comparisons of experimental models applied in different laboratories are rather rare
- In the course of one investigation, cell line authentication was discovered to be one of the main factors influencing assay results and irreproducibility
- Technical aspects also contributed to result variability
- Simple steps thought to be trivial, such as washing cells with phosphate-buffered saline, can significantly change results
- In a study by Niepel et al., five research centers applied cells and chemicals of the same stock, yet astonishing center-to-center variations of up to 200-fold were observed in growth inhibition rates
- Niepel et al. found that the subtle interplay between experimental methods and a vast array of poorly defined sources of biological variability was the main cause of the observed irreproducibility
- As a consequence of the huge number of individual biological factors involved, Niepel concluded that “Most examples of irreproducibility are themselves irreproducible.”
- This spectrum of biological factors further depends on the complexity of the cell model applied (2D vs 3D)
- Cells in a 3D culture differ morphologically and physiologically from their counterparts in a 2D setup
- Introduction of the third dimension in a cell culture model results in additional parameters that could potentially affect reproducibility such as:
  - Spheroid size and consequently the oxygen and nutrient supplies to cells in different layers within the structure
  - Spatial organization of surface receptors involved in interactions with neighboring cells
  - Activation of signal transduction pathways
  - Induction of gene expression profiles
- All of these changes ultimately have the potential to influence cell biology and cellular response towards exogenous stressors
- It must be concluded that the limits of reproducibility in cell culture work are reached when confronted with the question of reference standards, particularly for established and widely distributed cell lines
- Even if a consensus could be reached for individual cell lines, storage capacity limitations force even large cell banks to passage their cells, which necessarily influences the cells in one way or another over time
As can be seen, lack of reproducibility in cell culture (and in all of science) is a huge problem. There are many factors which contribute to this inability to recreate the same results independently. In order for the results to be considered accurate and valid, all of these factors would need to be taken into consideration, and each would need to work exactly as intended without any unforeseen problems arising. Even if things do seemingly go correctly, reproducibility is hard to come by.
It’s clear that cell cultures are not an accurate representation of reality and cannot be considered as proof of a “virus.” These toxic concoctions are akin to a witch’s brew in that what comes from them is not what was presented at the start but a creation stemming from the interactions of various cells, antibiotics, chemicals, serums, “nutrients,” etc.