t-test & ANOVA (Analysis of Variance)

What are they? The t-test is a method that determines whether two populations are statistically different from each other, whereas ANOVA determines whether three or more populations are statistically different from each other. Both of them look at the difference in means and the spread of the distributions (i.e., variance) across groups; however, the ways that they determine the statistical significance are different.

When are they used? These tests are performed when 1) the samples are independent of each other and 2) the data have (approximately) normal distributions or the sample size is large (e.g., > 30 per group). More samples are better, but the tests can be performed with as few as 3 samples per condition.

How do they work?

t-test Example

We want to determine whether the concentrations of Proteins 1 – 4 in serum are significantly different between healthy and diseased patients. A t-test is performed, which can be visually explained by plotting the protein concentration on the X-axis and the frequency on the Y-axis for the two patient groups on the same graph (Figures 1 – 4).

Proteins 1 & 2 have the same difference in protein concentration means but different group variances. In contrast, Proteins 3 & 4 have similar variances, but Protein 4 has a larger difference in protein concentration means between the patient groups.

A t-test assigns a “t” test statistic value to each biomarker. A good differential biomarker, represented by little to no overlap of the distributions and a large difference in means, would have a high “t” value.

Which is a better biomarker of disease: Protein 1 or Protein 2?

Protein 1

Which is a better biomarker of disease: Protein 3 or Protein 4?

Protein 4
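As a rough illustration of why Protein 1 scores higher than Protein 2, the sketch below computes a pooled two-sample "t" value by hand. The concentrations are simulated, the group sizes and spreads are assumptions made for illustration (they are not taken from the figures), and `pooled_t_statistic` is a hypothetical helper name.

```python
import math
import random
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic, assuming equal group variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    # Pooled variance: a weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error


random.seed(0)  # fixed seed so the simulated values are reproducible

# Assumed scenario: the same difference in means (2 units) for both proteins,
# but a narrow spread for "Protein 1" and a wide spread for "Protein 2".
protein1_healthy = [random.gauss(10, 0.5) for _ in range(30)]
protein1_diseased = [random.gauss(12, 0.5) for _ in range(30)]
protein2_healthy = [random.gauss(10, 3.0) for _ in range(30)]
protein2_diseased = [random.gauss(12, 3.0) for _ in range(30)]

t_protein1 = pooled_t_statistic(protein1_diseased, protein1_healthy)
t_protein2 = pooled_t_statistic(protein2_diseased, protein2_healthy)
print(f"Protein 1: t = {t_protein1:.2f}")
print(f"Protein 2: t = {t_protein2:.2f}")
```

With the same difference in means, the narrower distributions give the larger "t" value, which matches the answer above.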

What type of statistical value do I get? The t-test and ANOVA produce a test statistic value (“t” or “F”, respectively), which is converted into a “p-value.” A p-value is the probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis – that both (or all) populations are the same – were true. In other words, a lower p-value means that the observed difference across populations is less likely to be due to chance alone. By convention, biomarkers are considered significantly different between sample populations when the p-value is ≤ 0.05.
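The conversion from a test statistic to a p-value can be sketched with SciPy's `ttest_ind` and `f_oneway`. The data below are simulated and the group names are placeholders. The example also shows that, for exactly two groups, a one-way ANOVA and an equal-variance, two-sided t-test agree (F equals t squared and the p-values match).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed for reproducibility

# Hypothetical measurements for three groups (simulated values).
group_a = rng.normal(loc=10.0, scale=2.0, size=20)
group_b = rng.normal(loc=12.0, scale=2.0, size=20)
group_c = rng.normal(loc=11.0, scale=2.0, size=20)

# Two groups: Student's t-test (equal variances assumed) gives "t" and p.
t_stat, p_ttest = stats.ttest_ind(group_a, group_b, equal_var=True)

# The same two groups through one-way ANOVA: F equals t squared and the
# p-value matches the two-sided t-test.
f_two, p_anova_two = stats.f_oneway(group_a, group_b)

# Three groups: one-way ANOVA gives a single "F" and p for the whole set.
f_three, p_anova_three = stats.f_oneway(group_a, group_b, group_c)

print(f"t = {t_stat:.3f}, p = {p_ttest:.4f}")
print(f"F (2 groups) = {f_two:.3f}, p = {p_anova_two:.4f}")
print(f"F (3 groups) = {f_three:.3f}, p = {p_anova_three:.4f}")
```

If the two tests give very different p-values on the same two groups, they were probably configured differently (for example, paired versus unpaired, or one-tailed versus two-tailed).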

148 Feedbacks on “t-test & ANOVA (Analysis of Variance)”

  1. Hiii! I hope you can help me out.
    I’m currently deciding whether to use a t-test or one-way ANOVA. My study aims to determine whether horticultural activity decreases levels of stress and anxiety. So I have 1 independent variable and 2 dependent variables. I will be using a pretest and a posttest and that’s it. Is it possible to use a t-test to compare the means before and after within the experimental group and within the control group, and then subject the results from the t-tests to an ANOVA to determine whether there is a significant difference between the control and experimental groups? Thank you!

    1. There are two ways to analyze the data:

      1) Paired Student’s t-test: to calculate the effect of treatment (pre- vs. post-test) within each subject

      2) One-way ANOVA with repeated measurements: to calculate the effect of treatment and time on stress and anxiety. There are repeated measurements at two time points, pre- (baseline) and post-test for each subject. There is one experimental factor (treatment: yes/no). Note that the two dependent variables (stress and anxiety) should be analyzed separately. Learn more here: https://statistics.laerd.com/spss-tutorials/one-way-anova-repeated-measures-using-spss-statistics.php

  2. Good day! I would like to ask which to use, the t-test or ANOVA. We have an estimated 30 respondents (all of them are teachers), and we are looking at whether there are any significant differences between their demographic profile (sex, age, educational attainment, years of experience) and level of knowledge about Tpack (Technological Knowledge, Content Knowledge, Pedagogical Knowledge, and so forth).

    1. T-test and ANOVA are designed to detect difference between groups.

      You may first need to categorize the teachers according to their gender, age group, and educational level, etc. Assuming that the Tpack score is numerical, you could compare the Tpack score across the levels of factors.

      You could also develop a multiple-variable linear regression model, with the Tpack score as a dependent variable and all the factors (categorized or not) as independent variables.

  3. Hello, is it possible for the p-values of a t-test and a one-way ANOVA to be different? I have 2 samples and one independent variable. The p-value using the t-test was 0.001, which should mean reject the null, but when I ran an ANOVA, I got another p-value of 0.11. This has left me confused.

    A brief idea of my samples: one is for a conventional technology and the other is for when it is intensified by adding a chemical. I assume the independent variable is the conventional technology. I used Excel to do this.
    Please advise.

  4. Hello.
    My question is how to implement ANOVA or t-test when proving relation between metal intake through foods and metal excretion through urine. I want to know the significance.
    Thanks.

    1. The relationship between metal intake and excretion can be analyzed with correlation or regression analysis. ANOVA and t-test methods would not be appropriate for your objective because they are intended to detect differences across sample groups.

  5. Hi! My study is an evaluation of the efficacy of security measures against online fraud at selected banks. Should I use a t-test? I am going to use it for my independent hypothesis; is that correct?

    1. Is the online fraud assessed as an incidence rate or as the number of times that fraud occurs? In either case, the data should be analyzed with a Poisson distribution (considering that fraud is rare relative to the transaction amount). If multiple security measures-of-interest (like the region, customer age/gender, distribution of the banks) will be considered, you should use a log-linear model.

  6. If a t-test statistic is found to be not significant, then performing an ANOVA is recommended. What does this mean? Why would it make sense to run ANOVA for data that fit the criteria for t test?

    1. The choice of whether the t-test or ANOVA should be performed depends on the type of dataset. For example, a 2-level univariate dataset should use a t-test. ANOVA should be used when there are 3 or more levels in the dataset, or if there are co-variates. It is not recommended to select a statistical method based on the p-value.

    1. If you are categorizing gender into two independent categories/levels, you can apply a t-test.

  7. Hello,

    I am doing a one-group pretest/posttest program evaluation. All students will participate in a program and we will be using an indicator before and after participation to assess whether the program increases their employment readiness. Would a t-test or ANOVA be a more appropriate tool for analysis?

  8. You’re amazing. Thank you!

    I think understand and am mentally pulling together some stats concepts I hadn’t considered before you responded.

    In the study we are working on, participants are listening to a total of three advertisements. They listen to one ad as assigned (10 listen to ad A first, 10 listen to ad B first, and 10 listen to ad C first) and then fill out a Likert-scale evaluation of it. For example, the first prompt asks if the ad was easy to understand and they rank from 1-5. Then they listen to a second ad and evaluate (those who listened to A first will now listen to B; those who listened to B first will now listen to C; those who listened to C first will now listen to A). Last, they will listen to and evaluate the third ad. (Following that was a qualitative interview, which is the heart of the project. We just wanted to be able to determine, if possible, whether or not the order of the ads influenced responses).

    Following what you said about power, “1-beta”, and .8: if I understand, we need to determine eta-squared for each prompt to estimate the likelihood of the order accounting for the variance between the three treatments (order they listened to the ads, ABC, BCA, CAB).

    After running ANOVA for Likert-scale responses (1-5) for our first prompt (“This ad was easy to understand”) against the treatment (order of ad delivery) we have the Sum of Squares Between Groups = .867 and Total = 27.867. Then if we divide .867 by 27.867 = .031. That suggests the effect size is quite minimal and only accounts for about 3.1% of the variance between the groups. Power would need to be .8 or higher in order to reject the null hypothesis, right? That would then mean we fail to reject the null hypothesis that order of ad delivery did not affect the evaluation scores. In our study that would be a good thing because it would be stronger if the Likert-scale scores did not arise from the order in which they listened to them.

    We would need to run this for each prompt, I think.

    Does that seem correct?

    Thank you again. This was meant to be a qualitative study, but it was suddenly thrown at us to think about if the order they listened to the ad might account for the variance between them. It’s good that the prof pushes us to consider more than we were, but once he got irritated when I asked I started looking online and just kept getting confused with internet research. Reading some of your answers to others gave me hope because you have a knack of cutting through the clutter. <3

    1. Based on your description of the experiment, we realized that the study actually follows a 3×3 crossover design. Under this design, there are 4 sources that contribute to the total variance: content of advertisements (A, B, C), period (1, 2, 3), order (ABC, BCA, CAB), and the subject (random effect). The analysis should be conducted with ANOVA under the general linear model. You may refer to this paper for details: https://analytics.ncsu.edu/sesug/2004/SD04-Yarandi.pdf
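As a quick check of the effect-size arithmetic in the question above, eta-squared is the between-group sum of squares divided by the total sum of squares. This sketch only reuses the two sums of squares quoted by the commenter. It is not the crossover analysis recommended in the reply.

```python
# Sums of squares quoted in the question above.
ss_between = 0.867
ss_total = 27.867

# Eta-squared: the share of the total variance attributed to the factor.
eta_squared = ss_between / ss_total
percent_explained = 100 * eta_squared

print(f"eta-squared = {eta_squared:.3f}")
print(f"variance explained = {percent_explained:.1f}%")
```

An eta-squared of about 0.031 corresponds to roughly 3.1% of the variance, not 0.03%.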

  9. We have a concept test in which we did ABC, BCA and CAB testing to rule out order bias in responses. Total sample size is 30, with 10 in each treatment. Is ANOVA ok to use with such a small sample size? Or do we need to do nonparametric such as Kruskal-Wallis? The goal is simply to ascertain if the scale responses related to the order of the treatment delivery or not. In the ideal world we’d find that the order did not impact the evaluation scale choices. But if it did, we’d like to know it. Is ANOVA appropriate? I see that you’ve helped many others here and I’m hopeful you might have advice. I’ve been researching this for a few days and seem to get the answer that sample size isn’t important as long as it’s more than 3, but then others indicate low sample size is brought up as a problem with using ANOVA.

    1. Sample size is related to the statistical power of a study. Larger sample sizes give us more confidence when we fail to reject the null hypothesis, which states that there is no difference among the groups. We usually fail to reject the null hypothesis when the p-value is more than 0.05. On the other hand, we are less concerned about sample size when the difference is apparent, that is, when the p-value is much less than 0.05 and the delta is big with the current (perhaps limited) sample size. However, we must take care when the p-value is close to the significance level (alpha, usually 0.05). Under this scenario, we must calculate the power (“1-beta,” where beta is the Type II error rate) to ensure that the power is higher than the predefined criterion (0.8, in general) based on the current sample size.

      Regarding your study, ANOVA would be a better choice than a non-parametric test (e.g., Kruskal-Wallis test). This is because ANOVA is able to handle both within-subject (treatment) and between-subject (order) factors, whereas the latter can handle only one factor.

      It is unclear from the provided description how many concept tests are being conducted on each subject. However, if only one test will be conducted after delivery of all three treatments, there will be only one factor in the study (i.e., order of the treatments). Under this scenario, one-way ANOVA would be sufficient. On the other hand, if there are 3 concept tests, one after each treatment, there will be repeated measurements on subjects with two experimental factors (i.e., treatment A, B, C and timepoint 1, 2, 3). We would like to get an idea of the effect of both the treatment and the timepoint on the performance of the concept test. However, the within-subject correlation from repeated measurements must be handled appropriately. ANOVA with repeated measurements (under the General Linear Model framework) was designed for this purpose. Please see https://statistics.laerd.com/spss-tutorials/mixed-anova-using-spss-statistics.php for a detailed explanation.

  10. I’m confused about whether to run a t test or ANOVA
    I have one independent variable, which is parent survival, under three different treatments: 1:1, 2:1, and control. We measured the parent survival on three different days per treatment. There are 28 clones with 10 replicates per treatment. We’re trying to see if there is a statistical difference in parent survival amongst clones.

    1. This is a design with one dependent (outcome) variable (i.e., parent survival) and 2 experimental factors (i.e., treatment and clone) with repeated measurements (i.e., 3 days). In total, there should be 3*28*10 experiments, with repeated measurements on 3 different days. The number of data points will therefore be 3*3*28*10.

      For analysis, we suggest the General Linear Model with repeated measurements to handle both the within-subject (i.e., cell flask containing the cells) correlation, and the between-subject experimental factors, the treatments, and clones. Please see https://statistics.laerd.com/spss-tutorials/mixed-anova-using-spss-statistics.php for a more detailed explanation.

  11. Hello! If I have 5 groups for my independent variable, could I still opt for t-tests by running them between two groups each time? My justification is that, while running an ANOVA yields a p-value >0.05, a t-test between two groups might turn out to have a p-value <0.05. Does an ANOVA p-value >0.05 eliminate the possibility that two of the groups have a statistically significant mean difference?

  12. Good Day! Can you help me, what I will use if I would like to determine the significant differences between the profile of the respondents (age, sex, gender) and learning competencies (cognitive, affective, and psychomotor)?

    1. You can use a general linear model analysis with the learning competencies as dependent variables, and the respondents (factors) as independent variables.

      For univariate analysis on categorical independent variables where there are at least 3 people per category level, you can use a t-test or ANOVA. For example, a categorical independent variable could be “sex.” The category levels for “sex” would include “male,” “female,” and “unknown.”

  13. Hello everyone, can I ask what kind of ANOVA we should use in our study? Our study has 2 dependent variables, which are the 14th-day and 28th-day compressive strength of concrete with a mixture of 0%, 5%, and 10%. Each mixture has 2 samples to average the strength separately for the 14th day and 28th day, giving 6 samples for the 14th day and another 6 samples for the 28th day, for a total of 12 samples across all mixtures. We are confused about what statistical treatment to use if we analyze the two dependent variables separately, and what ANOVA we should perform. I hope somebody will enlighten us. Thank you.

  14. Hello, can you help me, please.
    I have 3 experiments and each has multiple observations made at the same time, however the quantity of observations are not the same. For example 340 and 320. I’ve been looking around and I still don’t get if I could use a non parametric test

    1. Non-parametric tests that are used to compare 2 or more groups (e.g., Kruskal-Wallis test) do not require that the groups have an equal number of samples. If the requirements for ANOVA are not met with your experiment, you can try the Kruskal-Wallis test.

  15. I have 2 groups, 1 control and 1 experiment. A pre and post-test will be conducted and a t-test will show the difference in the performance. What if I switch the treatment after a few weeks?

    1. A paired t-test can be used if the pre- and post-treatment difference is a measurement of the efficacy, and if a cross-over design study with washout period is conducted in the same subjects.

    1. ANOVA is primarily used to detect differences in numerical values across 3 or more groups. For example, ANOVA can be used to help identify biomarker candidates that are highly expressed in one group compared to the other groups. Diagnostics tests, in general, determine whether a patient has a “positive” result based on a pre-set cutoff value.

      For these reasons, ANOVA is not used in diagnostics tests (to our knowledge).

  16. Hi, I have a question. I have 10 same protein samples, each protein has concentration of 2g/L, but 3 proteins have odd concentration (which isn’t 2g/L). what statistical test should I do on each of my 3 proteins?

    1. 1) Z-test. Use a Z-test assuming the protein concentration follows a Gaussian distribution. If the original raw data do not follow a Gaussian distribution, try a transformation like log or square root to make the transformed values follow a Gaussian distribution. Then, calculate the Z-score, which is equal to (value – (mean of 10 samples))/(std of 10 samples). Z-scores that are > 1.96 or < -1.96 should be considered outliers.

      2) Quantile-based boundaries. (25% quantile – 1.5*IQR) is the lower boundary, while (75% quantile + 1.5*IQR) is the upper boundary for determining outliers. IQR = interquartile range.

      A boxplot will help to visualize the data.
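The two outlier rules in the reply above can be sketched with Python's standard library. The concentrations are invented for illustration, and `statistics.quantiles` uses its default ("exclusive") quartile method, which may differ slightly from the quartiles reported by other software.

```python
import statistics

# Hypothetical concentrations (g/L) for 10 samples; the target is 2 g/L.
concentrations = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95, 2.0, 2.6, 1.4, 2.0]

mean = statistics.mean(concentrations)
std = statistics.stdev(concentrations)  # sample standard deviation

# Method 1: flag values whose Z-score lies outside +/-1.96.
z_scores = [(value - mean) / std for value in concentrations]
z_outliers = [v for v, z in zip(concentrations, z_scores) if abs(z) > 1.96]

# Method 2: flag values outside the quantile-based boundaries.
q1, _median, q3 = statistics.quantiles(concentrations, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [v for v in concentrations if v < lower or v > upper]

print("Z-score outliers:", z_outliers)
print("IQR outliers:", iqr_outliers)
```

In this made-up data set, both rules flag the same two samples (2.6 and 1.4 g/L).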

  17. Hello, I would like to ask for an opinion on which test I should use… I have patients from 7 different groups with different drug concentrations in their blood, and I want to know whether those who did respond to a treatment after one year are significantly different from patients who did not respond to the same treatment… (hope I’m clear). Which test do I choose, paired t-test or ANOVA? Thank you!

    1. It depends on the purpose of the investigation.
      1) For determining the difference of response rates using different drug doses: Chi-squared test
      2) For determining the drug concentration levels in responsive/unresponsive patients using the same drug dose (and assuming the drug concentration is a random variable due to, for example, metabolism): t-test

  18. Hello again! The research study is actually about the Covid-19 implications on our lives. So I used stratified sampling wherein I divided my population into subgroups such as their age, education, religion, employment, etc. So in this statistical test, I want to know if there is a difference in the views/perspectives of respondents (about Covid-19) based on their sociodemographic background. I used ANOVA because the t-test is limited to 2 groups only, while ANOVA can be used when you have a categorical independent variable (with two or more categories) and a normally distributed interval dependent variable.
    You said the dependent variable is unclear. I’m confused too. For me, the dependent variable is the “views of the respondents” while the categorical (independent) is the “sociodemographic backgrounds.”
    Am I right?

    Please enlighten me. Thanks you so muchhhh

  19. Hi, I have a question. What can I use to know the difference in the views/perspective of respondents based on their sociodemographic background? T-test or ANOVA? I’m not sure but I think I have to use ANOVA since the independent variable (sociodemographic )has 2 or more categories.

    Can someone help me please? Thank you.

    1. It is unclear what you mean by “dependent variable.” If the measurements are numerical, ANOVA will be appropriate for the preliminary analysis when there are 3 or more socio-demographic categories. However, the t-test may be applicable if gender (i.e., male/female) is the only characteristic considered. For age, linear regression analysis will be worth trying.

  20. Hi,
    I have a data set which consists of the age of babies and two measurements (x1 and x2).
    The ages of the babies are 3, 12, and 24 months. The ANOVA method is used as there are 3 age groups. Can you please help me with the Python code for ANOVA?

  21. Hello! I have 2 diseased groups (experimental vs control). The experimental group received treatment and the control didn’t. My aim is to see how effective is the drug against the virus. I was thinking of doing 2 independent t-tests of the viral load, before and after 24h (where the experimental group received the drug and the control did not). But if I do this I am not really comparing the 2 groups. Do you have any suggestions? Thank you 🙂

    1. Try using the “decrease in viral load relative to baseline 100*(24hr – 0hr)/0hr” as the measurement to be compared.
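A minimal sketch of the suggested measurement follows. The viral loads are invented, and `percent_change_from_baseline` is a hypothetical helper name. The per-subject percent changes of the two groups could then be compared with an independent t-test.

```python
def percent_change_from_baseline(baseline, followup):
    """Return 100 * (followup - baseline) / baseline for each subject."""
    return [100 * (post - pre) / pre for pre, post in zip(baseline, followup)]


# Hypothetical viral loads (arbitrary units) at 0 hr and 24 hr.
treated_0hr = [1000.0, 1200.0, 900.0]
treated_24hr = [400.0, 600.0, 450.0]
control_0hr = [1100.0, 950.0, 1000.0]
control_24hr = [1045.0, 950.0, 1100.0]

treated_change = percent_change_from_baseline(treated_0hr, treated_24hr)
control_change = percent_change_from_baseline(control_0hr, control_24hr)

print(treated_change)  # negative values indicate a decrease from baseline
print(control_change)
```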

  22. I have to conduct both a t-test and an ANOVA test from a given subset of data. For the t-test, I used the variable of whether people exercised (yes/no), comparing average calories in each group (yes/no for whether people exercised). Then, for the ANOVA test I used different forms of exercise compared to the amount of carbs each person ate depending on their exercise form. I understand to use a t-test with 2 groups (the yes and no) and the ANOVA for 3 or more (multiple exercise forms).
    I do not understand how the t-test and ANOVA data would relate to each other to form a hypothesis. I am struggling to understand what I would be comparing between the two tests if I am using different data.
    I also have to determine the mean, median, SE, and covariance of the other categories (fat, protein) I have not used. How would this relate to anything either??
    Am I even doing anything correct??
    Any help would be greatly appreciated! Thank you 🙂

    1. It is correct that ANOVA can be used to test a factor with multiple groups, while t-test is limited to 2 groups only. For the t-test, the question is simple: Is there a difference between two groups?

      Hypothesis
      The null hypothesis “H0” assumes that there is no difference between two groups. Under this hypothesis, both levels will share the same population average and standard deviation. Any difference that is observed would come from random variation during sampling only.

      The alternative hypothesis “H1” states that there is a difference between the two groups.

      The above definitions refer to the t-test. For ANOVA, “H0” states that there is no difference among the groups (all group means are equal), whereas “H1” states that at least one group differs from the others. If the null hypothesis is rejected in favor of “H1,” further analysis is needed to determine which group or groups are different from the others; we do this using pairwise comparisons called “post-hoc” analysis. This strategy is similar to the t-test with appropriate adjustment for multiple comparisons.

      Descriptive statistics
      The descriptive statistics (mean, median, standard deviation, minimum, maximum) provide the data profile. First, these values can be used to characterize the data distribution. The t-test and ANOVA require that the data follow a normal distribution. If the data are not normal, then transformation may be needed before performing a t-test or ANOVA. Second, the descriptive statistics can help calculate the t- and F-statistics without prior knowledge of the raw data.

      For your study, you may want to perform separate ANOVA analyses to demonstrate that the exercise form does not influence the amount of protein and fat in a person’s diet. Otherwise, the change in carbohydrate intake cannot be attributed to exercise alone. If exercise form DOES influence protein/fat intake, univariate analyses like the t-test and ANOVA may not be appropriate. Rather, a linear model with covariates (e.g., protein/fat intake) should be considered.
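The remark above that descriptive statistics are enough to calculate the t-statistic can be sketched with SciPy's `ttest_ind_from_stats`. The calorie values are simulated and the variable names are placeholders. The point is only that the mean, sample standard deviation, and group size reproduce the result obtained from the raw data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # fixed seed for reproducibility

# Hypothetical calorie intakes for two groups (simulated values).
exercisers = rng.normal(loc=2200, scale=300, size=25)
non_exercisers = rng.normal(loc=2000, scale=300, size=25)

# t-test computed from the raw data.
t_raw, p_raw = stats.ttest_ind(exercisers, non_exercisers, equal_var=True)

# The same test computed from descriptive statistics only
# (mean, sample standard deviation with ddof=1, and group size).
t_summary, p_summary = stats.ttest_ind_from_stats(
    mean1=exercisers.mean(), std1=exercisers.std(ddof=1), nobs1=exercisers.size,
    mean2=non_exercisers.mean(), std2=non_exercisers.std(ddof=1), nobs2=non_exercisers.size,
    equal_var=True,
)

print(f"raw data:      t = {t_raw:.3f}, p = {p_raw:.4f}")
print(f"summary stats: t = {t_summary:.3f}, p = {p_summary:.4f}")
```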

  23. I have 1 experiment with 4 condition treated as follows:
    1.cells non-treated (3 samples)
    2.cells treated with drug A (3 samples)
    3.cells treated with drug B (3 samples)
    4.cells treated with drug C (3 samples)
    This one experiment is repeated in 3 independent times with same results trend.

    a) I would like to present the data that is only 1 representative experiment of 3 experiment.
    So that which presented data is correct and why?
    Mean +- S.D. or Mean +- S.E.M. ?

    b) I would like to assess the statistical difference specifically between just some TWO groups (e.g., 1 vs 2, 1 vs 3, 1 vs 4) (but present the 4 conditions in 1 graph).
    Therefore, whether using unpaired 2 tails t test is correct? or ANOVA 1 way is more appropriate, and why ?
    And if ANOVA is correct, so what test followed by should be used (Tukey, Dunnett’s test …) ?

    Thank you so much in advance for your help.

    Reply from Brianne Petritis, October 26, 2020 at 11:50 am:
    a) Both mean +/- SD and mean +/- SEM are employed in published literature and are valid. SEM is smaller than SD (for N > 1), so under some situations, the data spread looks better at first glance. In these cases, it should be clearly stated when SEM is used. Also, please note that SEM = [SD/(square root of N)] where N is the number of subjects in the group.

    b) The post-hoc analysis of ANOVA will be appropriate. During a t-test, only ONE pair of groups is analyzed at a time, which means that the p-value has to be corrected for the multiple possible comparisons. The post-hoc analysis of ANOVA takes this correction into account.

    The Tukey test compares every mean with every other mean. The Dunnett test compares every mean to a control mean.
    ——–
    Mark reply:
    Thank you very much for your answer.
    However, I have one more problem.
    I don’t know why for example:
    1 vs 2 (in 4 conditions I mentioned before) :
    condition 1 : Mean= 1
    condition 2 : mean=0.03
    looks like a significant difference, and when I use a t-test the p-value is less than 0.05,
    but when I use ANOVA with Tukey the p-value is more than 0.05, so it is not significant.
    Could you please give me some insight and solution about that?

    1. The t-test is used when there is one comparison, whereas ANOVA covers multiple groups and its post-hoc analysis applies a multiple-comparison correction. In general, the p-value after correcting for multiple comparisons will be bigger.

      When we conduct only one comparison, we want to determine whether there is a difference between the two groups. With a p-value threshold of 0.05, we accept a risk that we will mistakenly state there is a difference (when there is actually no difference) in 5% of cases. Under this situation, we still have a 95% probability of making a correct judgement (1-0.05).

      If we increase the number of comparisons that we’re making to 2, the probability of making the correct judgement decreases! If each separate comparison is tested at a threshold of 0.05, the probability of making a correct judgement in both is 90.25% (i.e., (1-0.05)*(1-0.05) = 0.9025). We thus need to move the original significance level from 0.05 to a lower value (e.g., 0.025) so that the final risk will be maintained at 5% or less. If we accepted a p-value threshold of 0.025 for the original analyses, then the probability of making a correct judgement would be about 95%. That is, (1-0.025)*(1-0.025) = 0.9506.

      Statisticians have developed various methods to account for the increased risk of false positives in multiple comparison analyses. Some of these methods, like the Tukey test, are more conservative than separate t-tests, so a bigger p-value will be reported for post-hoc analysis.
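The arithmetic in the reply above can be reproduced in a few lines. This sketch assumes the comparisons are independent, and `chance_all_correct` is a hypothetical helper name. Dividing the threshold by the number of comparisons, as in the 0.025 example, is commonly called a Bonferroni correction.

```python
def chance_all_correct(alpha, n_comparisons):
    """Probability of no false positive across independent comparisons."""
    return (1 - alpha) ** n_comparisons


alpha = 0.05
n_comparisons = 2

# Each comparison tested at 0.05 without any correction.
uncorrected = chance_all_correct(alpha, n_comparisons)

# Corrected threshold: divide alpha by the number of comparisons.
corrected_alpha = alpha / n_comparisons
corrected = chance_all_correct(corrected_alpha, n_comparisons)

print(f"{uncorrected:.4f}")  # 0.9025
print(f"{corrected:.4f}")    # 0.9506
```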

  24. Hi Thank you for this but still I want to raise a question for me to fully understand when to use Anova or T-test.
    So for an example, you are the Dean of the Graduate School who formed two groups of MBA students: those who chose the non-thesis program, and those who opted the thesis program. You compared the two groups on a test of graduate school activities readiness to see if there is an effect of the choice of the program on graduate school activities readiness. What statistical test would be appropriate for this study and what needs to happen for you to be able to state that the choice of the program of MBA students has some effect on their graduate school activities readiness?

    1. When there are 2 groups to compare, use the t-test. If there are more than 2 groups, ANOVA is usually more appropriate. ANOVA answers two questions: “Is there a difference across all groups?” (via the F-test) and “Is there a difference between each comparison?” (via post-hoc analysis).

  25. I have 1 experiment with 4 condition treated as follows:
    1.cells non-treated (3 samples)
    2.cells treated with drug A (3 samples)
    3.cells treated with drug B (3 samples)
    4.cells treated with drug C (3 samples)
    This one experiment is repeated in 3 independent times with same results trend.

    a) I would like to present the data that is only 1 representative experiment of 3 experiment.
    So that which presented data is correct and why?
    Mean +- S.D. or Mean +- S.E.M. ?

    b) I would like to assess the statistical difference specifically between just some TWO groups (e.g., 1 vs 2, 1 vs 3, 1 vs 4) (but present the 4 conditions in 1 graph).
    Therefore, whether using unpaired 2 tails t test is correct? or ANOVA 1 way is more appropriate, and why ?
    And if ANOVA is correct, so what test followed by should be used (Tukey, Dunnett’s test …) ?

    Thank you so much in advance for your help.

    1. a) Both mean +/- SD and mean +/- SEM are employed in published literature and are valid. SEM is smaller than SD (for N > 1), so under some situations, the data spread looks better at first glance. In these cases, it should be clearly stated when SEM is used. Also, please note that SEM = [SD/(square root of N)] where N is the number of subjects in the group.

      b) The post-hoc analysis of ANOVA will be appropriate. During a t-test, only ONE pair of groups is analyzed at a time, which means that the p-value has to be corrected for the multiple possible comparisons. The post-hoc analysis of ANOVA takes this correction into account.

      The Tukey test compares every mean with every other mean. The Dunnett test compares every mean to a control mean.
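The SD and SEM relationship from part a) can be sketched for a single condition with N = 3 samples, as in the question. The readings are invented for illustration.

```python
import math
import statistics

# Hypothetical readings for one condition with N = 3 samples.
readings = [0.9, 1.0, 1.1]

sd = statistics.stdev(readings)      # sample standard deviation
sem = sd / math.sqrt(len(readings))  # SEM = SD / sqrt(N)

print(f"mean = {statistics.mean(readings):.2f}")
print(f"SD   = {sd:.3f}")
print(f"SEM  = {sem:.3f}")
```

Because SEM shrinks as N grows while SD does not, error bars drawn with SEM look tighter, which is why the choice should be stated in the figure legend.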

  26. Hi, I am doing a PET/MRI imaging study on Parkinson’s patients (PD) and healthy controls (HC). Each participant is completing 4 different tasks during 4 separate PET/MRI scans, so we can collect both blood flow data (PET) and activation data (MRI) simultaneously for each task (so from one scan you get 2 different data values per person, PET data and MRI data). The 4 tasks are various movement/cognitive tasks. Is there such a thing as a 2 x 4 repeated measures test that can compare the two populations (PD and HC) on each of the 4 tasks for both PET and MRI data?

    Thank you!

    1. There are 2 fixed factors in the experiment: Parkinson’s disease/healthy (2 levels) and tasks (4 levels). There is also 1 random factor (subject), and there are 2 measurement methods (PET and MRI, each reporting many results).

      Which method you use to analyze the data depends on the objective/purpose of the study:
      1) Determine the difference of PET signals across PD/HC and activities: 2-way ANOVA with repeated measures, with activity as the within-subject factor and PD/HC as the between-subject factor.
      2) Determine the difference of MRI signals across PD/HC and activities: 2-way ANOVA with repeated measures, with activity as the within-subject factor and PD/HC as the between-subject factor.
      3) Explore the correlation between PET and MRI signals: Pearson or Spearman correlation analysis, according to the type of data, on data acquired from the same task, paired by subjects.
      4) Develop a preliminary statistical model to recognize PD against HC, if the sample size is sufficient (50+ in each group), using all the data available.

  27. Hi, I’m not sure what to use in my experiment to test the statistical differences.
    Two groups A & B are using separate dashboards whilst driving the same electric car (in Virtual Reality). They are driving the exact same route with the goal of reaching a total of 8km without running out of battery (data has been gathered every second whilst driving). For testing the statistical difference of completion rate, is a t-test appropriate?

    I’m also wondering if it’s possible to test for a statistical difference in driving behavior? For example, if there’s a difference in what speed they used along certain sections of the route. To answer, for example, if there’s a difference between groups A and B in the selection of speed between [0m,3000m]. Then, should the mean for that section be used for each individual, and then a t-test? Or is it possible to incorporate if users varied their speed in some form or way? For example, if participants in group A varied their speed a lot between i.e 40-60 km/h, whilst group B either picked a speed of 40 or 60 km/h and stuck with it, the SD seems to be very similar but the behavior is different.

    A bit of a long question, thankful for any input.

    1. For completion rate (i.e., a proportion), a chi-squared test would be appropriate.

      Testing for differences in driving behavior is more complicated. We can assume that the driving behavior is influenced by the personality of an individual and the device (i.e., dashboard). A better design would be having the same individual drive the same electric car, first with one dashboard and then with the other dashboard. The driving behavior per individual across the different dashboards could then be compared using a paired t-test. Driving behaviors could include acceleration time, frequency of braking, and turning speed.

      However, if the individual only has one opportunity to drive such that they can only test one dashboard, then a t-test would be okay for data analysis as long as the personalities of the drivers are balanced across the two groups (i.e., dashboard A and dashboard B).
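      For the completion-rate comparison, a Pearson chi-squared test on a 2x2 table can be sketched as follows. This is a minimal standard-library illustration, assuming no continuity correction and made-up counts; the function name `chi2_2x2` is hypothetical. With small expected counts, an exact test may be preferable.

```python
import math

def chi2_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = dashboard A, dashboard B;
# columns = completed the route, ran out of battery.
table = [[18, 7], [11, 14]]
chi2, p = chi2_2x2(table)
print(f"chi-squared = {chi2:.3f}, p = {p:.4f}")
```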

  28. Hi, I’m using two different devices to test the same parameter, device 1 and device 2, to test the microbial population (parameter) of samples 1 to 140, and I believe in my case it should be a paired t-test, right? I’m trying to compare the performance of these two devices, the p-value will tell me are their performances the same or not, right?

    1. It depends on the objective of your project. If you are attempting to demonstrate a difference between the two devices, you should use a paired t-test. If you are attempting to demonstrate that the two devices are substitutes for each other, then you should use an intra-class correlation coefficient and a Bland-Altman plot; see also https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5654219.
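      The numbers behind a Bland-Altman plot (the mean difference, or bias, and the 95% limits of agreement) can be sketched as below. This is only an illustration using the conventional bias +/- 1.96 SD limits; the readings are hypothetical and the helper name `bland_altman` is an assumption, not part of any library.

```python
import statistics

def bland_altman(device1, device2):
    """Return (bias, lower limit, upper limit) for paired readings."""
    diffs = [x - y for x, y in zip(device1, device2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Conventional 95% limits of agreement: bias +/- 1.96 * SD of the differences.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings (e.g., log10 counts) for the same samples.
device1 = [5.1, 4.8, 6.2, 5.5, 4.9, 6.0]
device2 = [5.0, 4.9, 6.0, 5.7, 4.7, 6.1]

bias, lower, upper = bland_altman(device1, device2)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

      Whether the limits are narrow enough for the devices to be considered interchangeable is a practical judgment, not something the calculation decides.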

    1. You should use a t-test if the trials are un-correlated. If the same subject was studied at both concentrations (e.g., Trial 1 @ 25%, Trial 1 @ 50%), you should use a paired t-test.

  29. Hello, i am wondering which test i should be using. I have 3 different types of mice called A, B and C. However, I am only interested in knowing the statistical differences between A and B, and A and C. I also want to put all 3 groups in the same column graph (for a visual reason).

    1. Use ANOVA with post-hoc analysis. Report the comparison-of-interest (e.g., A vs B) from the analysis.

  30. The data consist of 5 tasks, for 2 different menus. Basically I recorded the time of completion of the task, and their slips and mistakes.

    1. It looks like the outcome-of-interest for your study is the “failure” to accomplish tasks. You can use a Chi-square test if the failure is time-irrelevant, such that each task is evaluated on whether it is completed or not completed regardless of the time. However, if the failure is time-relevant, such that the time to complete each task is measured, then survival analysis should be chosen. For survival analysis, the task status should be recorded periodically rather than only at one time.

  31. Hello,
    I am conducting an experimental study, survey design to look at whether national affirmation is effective in encouraging a national in-group identity. The control condition will receive an unrelated task. The DV is measured by 4 categories. is it encouraged to use both a t-test and anova or just one? I appreciate all responses

    1. In general, a t-test is an appropriate choice with a 2-arm experiment if all other confounders are controlled. I am unsure what “DV” stands for, but I assume it represents an in-group identity measured by various aspects.

  32. First, I used one way Anova using GraphPad Prism. But a reviewer commented that One Way Anova is not good because it gives multiple p-values.

    1. See my previous response. You should also include a post-hoc pairwise test following the one-way ANOVA.

    1. For single gene expression, a one-way ANOVA plus a post-hoc pairwise test should be okay. ANOVA will provide a p-value that reflects the difference among all the levels/groups, and then the post-hoc pairwise test will give the p-value between each pair of levels- or groups-of-interest. Generally the post-hoc test takes into account the multiple comparisons; in other words, the post-hoc test will adjust the p-value.

  33. Hi. I have obtained the data of a gene expression of gene (say X) in 4 different histopathological types of breast cancer. My total samples are 105. Also I checked the expression of same X gene in 15 normal (disease free) samples. Distribution of samples among different histopathological types is uneven, ranging from 70 samples to two samples. Which analysis should be performed to find if gene X is associated with any of the 4 histopathological types of breast cancer? And which approach should I use to find if any threshold expression level (fold change) of that X gene is associated with any of the histopathological types of breast cancer?

    1. Considering the high number of measurements when a gene microarray is used, the classic ANOVA on a single measurement may not be appropriate, and the multiple comparison issue must be handled/adjusted. There is a popular analysis tool in the R language for microarray gene expression data named “limma” (http://bioconductor.org/packages/release/bioc/html/limma.html), which can be applied to the data set. The data may need some pre-processing or filtering, such as removing low-quality gene spots, normalization, or excluding outlier samples, before being input into the linear model.

    1. The descriptive statistics (mean +/- SD, median, or range) should be selected according to the data distribution. A rule of thumb is that if the ratio of SD (standard deviation) to mean is < 1, then you should select the mean +/- SD. You may also want to consider using ANOVA.

  34. Hi, I am designing an experiment on the impact of social norms and social networks on sustainable behaviour. I have three groups of independent variable (social norm, social influence and control group), and a dependent variable (consumer intention). Is it better to use an independent ANOVA test?

  35. Thank you for your response. So if I have three samples and would like to do multiple comparisons, which test shall I choose when homogeneity of variance is violated? It seems ANOVA with post-hoc doesn’t work in this situation.

    1. I suggest you perform some data transformation (e.g., log, square-root, reciprocal, or arcsine-square-root) and then conduct the homogeneity test again. Go ahead with ANOVA if the homogeneity requirement is fulfilled; otherwise, try some non-parametric test methods.

  36. What about variance assumptions? what is difference between ANOVA and t-test in term of variance assumptions?
    Thanks

    1. Does the “variance assumption” refer to “homogeneity of variance?” The “homogeneity of variance” (HV) is an assumption that all groups being tested have similar variance. It is a prerequisite of both ANOVA and the t-test. HV is derived from the “independent identical distribution” assumption that samples retrieved from the same population must have similar variance; this assumption underlies ANOVA/t-test-based hypothesis testing. If heterogeneity of variance is observed, the t-test should be adjusted to accommodate the information (e.g., Welch’s t-test, which does not assume equal variances).
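      One common adjustment for unequal variances is Welch’s t-test. The sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom with the standard library only; the data are hypothetical and the function name `welch_t` is an assumption. Converting t and df into a p-value requires the t-distribution, which a statistics package provides (for example, SciPy’s `ttest_ind` with `equal_var=False`).

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    va = statistics.variance(sample_a) / na
    vb = statistics.variance(sample_b) / nb
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical groups with visibly different spreads.
group_a = [10.1, 9.8, 10.3, 10.0, 9.9]
group_b = [12.5, 8.0, 14.2, 9.1, 13.7, 7.6]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

      When the two groups have equal sizes and equal variances, the degrees of freedom reduce to the familiar 2(n - 1) of the standard t-test.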

  37. Hello,
    I’m conducting a case study to see the effect of prosthetic alignment change on gait parameters. The case study is for one participant. I have 3 conditions (A,B,C). one of them is baseline with no changes (A). the other two included changes. the trial included 80 means for each condition. So far I have use paired t test to compare the changes to the baseline (A vs B ) and (A vs C). it did show some significant differences. just wanted to make sure that I’m using the right test. what do you think ?
    best wishes ,
    Saleh Ibrahim

    1. Are the 80 means from measurements of different aspects of gait? If the answer is yes, these measurements cannot be considered as samples from the same population and the t-test would be inappropriate.

      If the 80 means were repeated measurements of the same aspect of gait, the t-test would still not be an appropriate choice due to the limited sample size. That is, all the data came from the same individual such that the effective sample size is 1. Statistical analysis is uncommon in this case. Instead, descriptive statistics, and under some scenarios a regression analysis, can be applied to analyze the data.

  38. I have an assignment problem where it asks “Scientists are trying to understand if there is a relationship between the age of patients who contract COVID-19 and the duration of their illness (number of days that they suffer from the symptoms of the illness). What type of statistical analysis among the following (z-test, t-test, one-way ANOVA, two-way ANOVA, regression) is needed to conduct such investigation?”

    What would you pick?

  39. Hello- I am conducting a research on Impact of motivation on employee performance.
    My hypothesis is “There’s no relationship between rewards given (monetary, non-monetary, both and none) and its impact on performance.”
    And another hypothesis is “There’s no relationship between gender (2 options: male and female) and intrinsic factors of motivation (there are 5 factors in it).”
    There are 100 samples.
    It would be helpful to me if you could let me know which test would be applicable in this case.

    Thank you

    1. For the first hypothesis, you should use a two-way ANOVA with and without the interaction term. The two factors are “rewards given” (yes/no) and “monetary reward given” (yes/no). This analysis will answer two questions: 1) whether providing rewards affects performance, and 2) whether the performance is affected differently by giving monetary or non-monetary rewards.

      For the second hypothesis, a t-test should be okay.

  40. Hello – I am proposing a research design – where I am evaluating the influence of police-body-cameras on police productivity and police brutality. I am measuring police productivity through the number of police actions taken (citations and arrests); I am measuring police brutality through the number of “use of force” instances and through the number of citizen complaints of officer misconduct. The two police department sites would run independent studies. Random sampling will be used to select 120 patrol officers from the Duty List at each site; then random sampling will be used to place those 120 into three respective groups: treatment group – given automatic body cameras (N=40), control group A – given traditional body cameras (N=40), and control group B – given no cameras. Data collection would occur for a 9 month period. Participants would not be told of the study, as it may impact their behaviors, knowing they are being closely monitored for specific actions/behaviors.

    Would an ANOVA test be appropriate? And how would I calculate the effect size?

    1. From what it sounds like, this is a “rates comparison” problem. In other words, the outcome is a rate: the proportion of police actions that involve brutality. ANOVA may not be appropriate for such a comparison.

      I suggest trying a Chi-squared test where the effect size would be the difference in brutality rates across the groups. An even better analysis approach would be the McNemar test (or paired Chi-squared test). This would allow you to perform a cross-over design analysis, which would account for a particular police officer wearing or not wearing a camera for a predefined pattern/period.

  41. Good evening,

    Just like to confirm, my experiment consists of two experimental groups: control and intervention group. We are analysing the baseline results and the ‘after ten weeks’ results using EORTC-CLC30 numerical scores.

    Would a Repeated measure one way anova be the appropriate statistical test for such an experiment? Would a paired t-test be appropriate too?

    Thank you

    1. Yes, repeated measures one-way ANOVA would be appropriate. A paired t-test can be applied to compare the ‘baseline’ and ‘after-treatment’ scores within a group. If you perform a t-test between the control and intervention groups, please first derive a single measurement from the ‘baseline’ and ‘after-treatment’ scores (e.g., percentage of change) and then apply the t-test to the two groups.

  42. Hello, Thank you so much for the article. I will appreciate it if you can take a look at something for me.
    I have several mixes of materials that are used in the same kind of test. Let’s say the sample size of each mix is 12 and there are 7 different mixes. This test measures how much energy to fracture the materials. I also know what is the behavior of the materials from the load versus displacement curves.

    To compare the average energy between mixes, should I use T-Test or ANOVA? If the result I get from T-Test is a mix of both significantly different and not significantly different (not in a clear pattern), how should I proceed?

    For the load vs displacement curve, I only want to look at post peak data and I divide that section post peak into 10% increment sections (95%-85%, 85%-75% …) and I’m interested in wanting to know whether these sections have any effect on each other. I know that these sections do not describe the same thing, but can I use either T-Test or ANOVA to do my analysis?

    Thank you so much.

    1. ANOVA with post-hoc on 7 groups of samples will be appropriate for comparing the average energy to break different types of material.

      For the load vs displacement curve, comparing the maximum post-peak displacement (peak to breaking) may be worth trying. Another measurement could be the post-peak rate per increment section. ANOVA can still be applied here for comparing different types of material.

      For your section-by-section data, it is obvious that the data are auto-correlated. In other words, section1 must have happened before section2. Accordingly, ANOVA may not be appropriate for analyzing the ‘section effect’ due to the need for independent data points. Perhaps you can take a look at Time Series analysis, which is common in analyzing economics data.

  43. I have four independent variables that I am using to predict the dependent variable. However, two of the independent variables are interrelated and one of them has some measurement variables. I had used ANOVA to account for the variations in one of the independent variables but am stuck on how to deal with the two interdependent independent variables in a simplified way. My plan is to have a simplified multiple regression equation factoring in all the variables.

  44. Thanks for your reply. I have one more question. Regarding the two-group comparison, you answered that both t-test and ANOVA are fine. Then in the scientific paper, is it OK to show both ANOVA and t-test results in the same figure instead of choosing one? Or do I have to choose one?

    1. It is uncommon to present results from more than one statistical approach in one paper, except for studies that aim to compare the performance of different statistical methods. Generally, the more a hypothesis is tested with different statistical approaches and the more p-values are acquired, the less confidence we can place in the data (or p-values).

      Therefore, please pick one approach and report the results from that approach.

  45. Hi,
    I have three mouse groups — wild type, mouse model 1, mouse model 2 (mouse model 1 and 2 are two different Alzheimer’s disease models). Each group has 4 different individuals.
    If I am only interested in the comparisons of model 1 with wild type and model 2 with wild type, Should I use ANOVA or t-test?
    If I am interested in the all three groups comparisons which test should I use?

    1. For the two-group comparison, the t-test would be a good choice. However, ANOVA with post-hoc analysis will be better. To perform an analysis on all of the groups, ANOVA is necessary.

  46. im conducting a research which is to determine what kind of microbes is in the surface specially doorknob in public restroom, what should i use?

    1. It depends on the objective of the study. If the objective is to determine whether a microbe-of-interest is on the doorknob such that the result is either “yes” or “no,” statistical analysis may not be necessary. If the objective is to determine how the counts of specific microbes on the doorknobs differ in different buildings/locations, I would perform a comparison based on the Poisson distribution.
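      One way such a Poisson-based comparison could look for two locations is the exact conditional test: if both counts come from Poisson distributions with the same rate and equal sampling effort, then, given their total, the first count follows a Binomial(total, 0.5) distribution. The sketch below is an illustration under those assumptions, with hypothetical counts and a hypothetical function name.

```python
import math

def poisson_two_count_exact(count_a, count_b):
    """Two-sided exact conditional test that two Poisson rates are equal.

    Assumes equal sampling effort (e.g., same swabbed area) at both locations.
    """
    total = count_a + count_b
    pmf = [math.comb(total, k) * 0.5 ** total for k in range(total + 1)]
    observed = pmf[count_a]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)

# Hypothetical colony counts from doorknobs in two buildings.
p = poisson_two_count_exact(31, 14)
print(f"p = {p:.4f}")
```

      Comparing more than two locations, or adjusting for unequal sampling effort, would call for a Poisson regression model instead.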

  47. I am comparing 3 schools doing online learning with 13 different factors/observations. I was planning to use a one sample t-test across the different factors to test which ones are significant, which we can deep dive into in follow up research.

    Concurrently, i am planning to use ANOVA to compare the factors/observations which didn’t fare well across the 3 schools to analyze the cause-effect.

    Does this seem like a decent plan?

    Thank you so much for seemingly reverting to every single question here!

    1. The following information is based on the assumption that the objective of the study is to determine the impact of “online” learning based on some performance measurement(s). As a pilot study, it would be fair to explore the “online” effect at the level of a school, but the power of a sample with only 3 schools may not be strong enough to detect subtle effects. In other words, a sample size of 3 is low for statistical analysis, such that the mean and standard deviation (SD) cannot be estimated accurately. You may want to consider using units that are smaller than a school for this analysis, such as classes or students.

      Finally, you mentioned using a one-sample t-test, which implies that there will not be a control. A one-sample t-test is only suitable when reference statistics (mean and SD) for the ENTIRE population are available from historical data. Using a control is advisable.

      Thank you for your kind words! In case you’re interested, we have explained other biostatistical methods as well: https://raybiotech.com/learning-center/common-biostatistical-methods-explained/

  48. Hi!
    I have 2 populations of different types of Alzheimer’s disease and a control group. Can I use t-test to compare each group with every other, taking into account that they were not subjected to any treatment? I just want to know if they’re naturally different to each other (I understood that ANOVA helps if I want to compare the effect of a treatment in more than two experimental groups, but in this case, the groups are not subjected to any treatment, I want to check for their natural differences).

    Thank you very much!

    1. ANOVA is still the appropriate analysis method. You can consider the groups you are testing as different levels of one factor (disease status). For example: control group = level 0; Alzheimer’s disease, type 1 = level 1; and Alzheimer’s disease, type 2 = level 2.

  49. hi, I have 1 independent variable (promotion framing type) with 2 categories (percentage and dollar) and 1 dependent variable (purchase intention). I want to know which promotion framing results in a higher purchase intention.

    My hypothesis is: Dollar off (vs percent off) discount framing has a stronger positive effect on purchase intention for consumers.

    What is the best test to use for this? My sample size is 200 people.

    Thank you

    1. The type of test that should be used depends on 1) whether the two promotions were placed on the same framing type and 2) whether both of the promotions were shown to the same person. If both of these stipulations were followed, you should use a paired t-test. If only one promotion was marketed to a person, then you should use a t-test. Since your hypothesis is directional (dollar-off framing is expected to have the stronger effect), you may use the one-sided probability.

  50. I have four types of pollutants and I analyse the concentration of those pollutants in both wastewater and river. I want to see the significant difference between the concentration in wastewater and river. Which test is suitable?

    1. If the samples from wastewater and receiving water were collected separately without a one-to-one corresponding relationship, the t-test will be sufficient for the two-group setting.

  51. i want to determine the significance difference of hormones concentration in wastewater with the concentration in receiving river. which statistical analysis should i use?

    1. You can use either a paired t-test or ANOVA with repeated measurements. ANOVA with repeated measurements would be appropriate if the samples from the wastewater and receiving river were collected simultaneously at various time points. ANOVA may also answer an additional question – whether the hormone concentrations change over time.

  52. hi i want to determine the differences in CAD currency between March 2020 and April 2020. what method should i use? is it okay for t-test?

    1. While the t-test could provide a p-value, it may not be appropriate for this comparison. In the case you present, the CAD currency data would be dependent on previous data. This would violate a t-test assumption that the data follow an independent distribution (i.e., the collected data are independent of the data collected previously). For your type of data, a time-series analysis may be a better approach; learn more here: https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm
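      A quick way to see this kind of dependence is the lag-1 autocorrelation, which measures how strongly each value tracks the one before it. The sketch below is a minimal standard-library illustration; the exchange-rate values are made up and the function name is hypothetical. Values near 0 are consistent with independent observations, while values far from 0 suggest serial dependence.

```python
import statistics

def lag1_autocorrelation(series):
    """Sample autocorrelation between each value and the previous one."""
    mean = statistics.mean(series)
    numerator = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator

# Hypothetical daily exchange rates (illustrative values only).
rates = [1.390, 1.395, 1.402, 1.410, 1.418, 1.415, 1.409, 1.404, 1.398, 1.401]

r1 = lag1_autocorrelation(rates)
print(f"lag-1 autocorrelation = {r1:.3f}")
```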

  53. I’m comparing two types of menus in interface design, and I want to find out which is better between them. I have just one group consisting of 20 people who did both the menu test. Which would be the best way to analyze this. Just one-way ANOVA? Do I need t-test as well? Or is there a better method. Thanks

  54. I am comparing 2 types of schools and 3 different levels of support for distance learning of those schools. Should I use independent t tests, or ANOVA?

  55. Hi, am comparing the retention level of total phenolic content of spices before and after cook, then compare retention level of 3 different amount of spices used after cook. Should i use ANOVA or a paired t-test?

    1. To compare the retention level of total phenolic content before and after cooking, a paired t-test would be appropriate. To compare the dose-dependent effect of spices on retention level, use ANOVA with repeated measures (https://statistics.laerd.com/spss-tutorials/two-way-repeated-measures-anova-using-spss-statistics.php). This approach enables multiple comparisons within a single analysis. You may also want to create a scatter plot with regression.

  56. Hello sir
    I’m doing research on differences between 8 food samples divided in 2 groups. Is it okay to conduct one way anova and post hoc test ?

    1. For two-groups, the t-test is a good choice. ANOVA will also give similar results, although a post-hoc test is not required for two-group comparisons. If you want to do multiple comparisons between samples, first perform ANOVA on multiple groups and then perform post-hoc analysis between various pairs of groups.

  57. Hi! I am writing an Internal Assessment on the calcium content in six different types of tofu using EDTA. There are 5 trials per tofu, would ANOVA test work for this experiment?

    1. Yes, an ANOVA test would be appropriate. It is a 6-group balanced investigation as there are 6 different types of tofu. Since each “trial” (5 trials per tofu) represents a sample, there will be a total of 30 samples.
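      To make the 6-group, 30-sample layout concrete, here is a minimal sketch of the one-way ANOVA F statistic computed from the between-group and within-group sums of squares. The calcium values are hypothetical, and the function name is an assumption. Turning F into a p-value requires the F-distribution with (5, 24) degrees of freedom, which a statistics package provides.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual trials around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical calcium content (mg per 100 g), 5 trials for each of 6 tofu types.
tofu = [
    [350, 362, 348, 355, 359],
    [410, 402, 415, 408, 399],
    [290, 301, 295, 288, 297],
    [352, 347, 360, 356, 349],
    [430, 441, 425, 436, 433],
    [310, 305, 318, 312, 307],
]

f_stat, df_b, df_w = one_way_anova_f(tofu)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```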

  58. My data is not normally distributed even after using log, sqrt or cuberoot. Do I use non-parametric test?

    1. Yes. A non-parametric test is a safe choice when the normality cannot be achieved after data transformation.

  59. Hello. I’m running an experiment to determine which agar and treatment I should use for my protocol. I’m testing two different agars. Each agar will undergo 4 different conditions. Each condition has 3 levels. Which test should I use to analyse my data?

    1. Whether you use one-way or two-way ANOVA will depend on the objective of your study. If the objective of this study is to find the “best” condition-level combination, then you should use one-way ANOVA. However, if you’d rather evaluate the effect of the conditions and levels, then two-way ANOVA with interaction analysis would be more appropriate.

  60. I am comparing 2 unrelated parameters (expressed in 3 different ways, all in numerical values) between control and disease group. The disease group is further sub-divided based on severity of abnormality. The numbers are small, as its a pilot study.
    Which test should be used to assess differences between groups?
    I have used unpaired t test initially, but want to be sure, if its the right way.

    1. For a pilot study, one-way ANOVA for each disease state (e.g., disease-free, diseased) would be the appropriate choice. Data should be appropriately pre-processed (i.e., transformed), if necessary. This will enable any follow-up studies to focus on the information pertaining to each disease sub-group.

  61. Hello! I am experimenting on the productivity of the students with respect to their time preferences. The first condition is that, in the first 5 days, the student will do his/her homework in the morning (8 AM to 4 PM). The second condition is that, in the next 5 days, the students will do his/her homework in the evening (8 PM to 4 AM). What test can I use to further interpret my results?

    1. Paired t-test. It will be better if the students can be randomized into two branches, one branch switching from morning to evening, and the other switching from evening to morning. Thus the effect of proficiency gain after 5-day monitored practicing will be balanced.

  62. I did an experiment to look at a treatments for an injury. I have 5 groups: sham injury, injury w/o treatment, and then injury + treatments 1, 2, or 3.

    We use sham groups to prove that the injury model did, in fact, cause an injury. In my statistical analysis, should I:
    a) compare sham v. injury w/ treatment with t-test, then injured w/o treatment to injured+treatments with ANOVA?
    b) compare all 5 groups with ANOVA, then post-hoc analysis?

    Having the sham data in the same ANOVA set with the injured and treated animals is adding a second variable. Variance should be equal within groups, but means should be different.
    Thanks!

    1. Both of the strategies would work.

      The first method is a two-stage approach with easily interpretable results because it focuses on the primary objective of the study: the effect of treatment following injury.

      The second method is a one-way ANOVA that has statistical “integrity.” In other words, all of the data can be analyzed at one time without having to perform separate tests or reuse the data. From a statistical point-of-view, this approach is better than the first.

      There is also a third way to analyze the data if you have a “no injury but received treatment” group. You can perform a 2-way ANOVA by creating two grouping variables (injury and treatment) for each animal. This method is relevant if you are interested in the individual effect of injury and treatment, as well as their potential interactions. To learn more about 2-way ANOVA, please refer to http://www.sthda.com/english/wiki/two-way-anova-test-in-r.

  63. I’m doing my bachelors level research in which I have one independent and two dependent variables. I’m seeing the impact of the independent variable on the two dependent variables separately. The two dependent variables have no connection with each other. The participants for the study are 60 and are not grouped.
    Which test would be recommended for it?

  64. Hi, I am doing a statistical analysis on my paper on patients who underwent surgery for vitreous hemorrhage. By diagnosis, it came out as this: subretinal pathology 57, retinovascular diseases 54, trauma 6, retinal detachment 6, and vasoproliferative tumors 3. Can I use ANOVA to compare the pre-operative and post-operative results among the population with more than 3 items? Thanks

  65. I want determine which of the 2 exercises is more effective in decreasing cholesterol levels (baseline -post exercise)

    1. Both a paired t-test and a repeated-measures ANOVA could be used when only 1 factor (baseline versus post-treatment) is being compared. If you want to analyze additional factors or groups, such as different treatments, two-way ANOVA with repeated measures would be better. This is because ANOVA considers the variance across ALL treatment groups whereas a t-test analysis would have to be performed within EACH treatment group. You can learn more about ANOVA with repeated measures here: https://www.statisticssolutions.com/conduct-interpret-repeated-measures-anova/.

      Please note that both the t-test and ANOVA require normality of data distribution, so appropriate data preprocessing (i.e., transformation) is sometimes necessary.

  66. So I have got 5 conditions and 10 trials for each condition. I’m comparing the growth in plants due to a change in light spectrum colors. I’m conducting this experiment with 3 different plants, meaning for every plant I have a set of data of 5 conditions and 10 trials. Is it better to use an ANOVA test? And does it depend on my error bars which test I should use (I’ve been hearing that)?

    1. Yes, you should use repeated measures ANOVA under a generalized linear model (https://www.statisticssolutions.com/conduct-interpret-repeated-measures-anova/). If the normality assumption is not met (https://en.wikipedia.org/wiki/Normality_test), the data should be pre-processed appropriately before inputting the data into the models. For example, you may need to log-transform your data. The type of pre-processing that you should use will be based on the characteristics of the original data. If the data are generated from a high throughput platform like a microarray or next generation sequencing, please refer to the R package limma (https://www.bioconductor.org/packages/release/bioc/html/limma.html).

  67. I have three populations of cells. Population 1 is my control, untreated cells. Population 2 is my sham treated cells, and population three is my treated cells. Can I use student t-test to determine a statistical difference comparing pop 1 to pop 2, and pop 3 to pop1, and pop 2 to pop 3, or must I use a one-way ANOVA to compare all three? And what is the rationale?

    1. Great question!

      1.) The t-test could be used, but is not recommended. The t-test can determine differences between two groups, but is not recommended for multi-group comparisons because the alpha level (i.e., significance level) must be set lower than the standard 0.05.

      An alpha of 0.05 means that, when the null hypothesis is true, there is a 5% chance that a comparison would be (falsely) identified as significant (i.e., rejecting the null hypothesis). On the flip side, there is a 95% chance that such a comparison is correctly left non-significant (i.e., the level of confidence). If the t-test were used for all three pairwise comparisons, the alpha value would need to be adjusted to maintain an overall 95% level of confidence. Using alpha = 0.05 for each comparison, the overall level of confidence would actually be 85.7% (0.95^3 = 0.857), assuming the comparisons are independent. Therefore, to get an overall 95% level of confidence (0.983^3 = 0.95), the alpha level for each comparison should be set at 0.017.

      Unfortunately, this low alpha threshold can result in missing statistically significant markers that would otherwise be identified with analyses where the alpha could be set at 0.05.

      2.) ANOVA should be used, but should also be accompanied by a post-hoc test. ANOVA can determine whether there is a difference between the groups, but cannot determine which group contributes to the difference. For a single variable test, ANOVA can be used first. To account for the multiple comparisons, the ANOVA data should be analyzed with another test (e.g., Duncan, Newman-Keuls) where the alpha can be set at 0.05.

      It is also important to mention the sample size. You mentioned that you had three populations of cells. A minimum of 3 biological replicates should be used to conduct initial statistical comparisons to understand the effect size (signal) and variance (noise). This information will determine the sample size that you’ll need to ensure that the power is no less than 0.80. Notably, more accurate information will be obtained with a larger sample set in the pilot study. Sometimes, the signal and noise are known a priori; in this case, a pilot study may not be needed.
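      The alpha arithmetic from part 1.) can be reproduced in a few lines. This sketch assumes the three pairwise comparisons are independent, which is the assumption behind the 0.95^3 calculation; the Bonferroni value is shown only as a commonly used, slightly more conservative alternative.

```python
# Three pairwise comparisons: pop 1 vs 2, pop 1 vs 3, pop 2 vs 3.
n_comparisons = 3
alpha = 0.05

# Overall confidence if every comparison is run at alpha = 0.05.
overall_confidence = (1 - alpha) ** n_comparisons

# Per-comparison alpha that restores an overall 95% confidence level.
alpha_per_comparison = 1 - (1 - alpha) ** (1 / n_comparisons)

# Bonferroni correction: simply divide alpha by the number of comparisons.
alpha_bonferroni = alpha / n_comparisons

print(f"overall confidence at alpha 0.05: {overall_confidence:.3f}")
print(f"adjusted per-comparison alpha:    {alpha_per_comparison:.3f}")
print(f"Bonferroni per-comparison alpha:  {alpha_bonferroni:.4f}")
```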
