Why Statistical Significance Doesn't Belong In Engagement Survey Software

1.29.19

If you've ever been in a meeting about metrics or results and heard someone ask, "But is it statistically significant?", please read this entire post.

 

If you've ever asked, “But is it statistically significant?”, please read this entire post.

 

If you think you know what ‘statistical significance’ means, please read this entire post.

 

If you’re a manager, HR representative, or leader who relies on statistics to guide strategies or business decisions, which can affect your employees’ jobs and livelihoods, please read this entire post.

 

The Tyranny of Statistical Significance

 

Statistical significance is numerically represented as a p-value (e.g., p = 0.03). You might’ve heard that the p-value is the probability of your results happening by chance; the lower the p-value, the lower the probability that your results happened by chance. In other words, lower p-values are associated with results that are more important. This explanation is straightforward and easy to understand.

 

There’s just one problem: all of that is completely wrong, as is conventional wisdom surrounding statistical significance.

 

The technical, and most accurate, definition is this: the p-value is the probability of getting results at least as extreme as the ones you observed, given that the null hypothesis is true.
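To make that definition concrete, here's a minimal sketch using a hypothetical coin-flip example (the scenario and numbers are my own illustration, built only on the Python standard library). It computes a one-sided p-value directly from the definition: the probability of a result at least as extreme as the one observed, assuming the null hypothesis (a fair coin) is true.

```python
from math import comb

def p_value_at_least(k: int, n: int) -> float:
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, 0.5).
    This is the probability of a result at least as extreme as
    k heads in n flips, *assuming* the null hypothesis (a fair
    coin) is true -- not the chance the result "happened by chance".
    """
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Observed: 60 heads in 100 flips of a supposedly fair coin
p = p_value_at_least(60, 100)
print(f"p = {p:.4f}")  # a small p-value, below the conventional 0.05
```

Note what the function conditions on: the fair-coin null is assumed true throughout, which is exactly why the p-value cannot be read as "the probability the result was due to chance."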

 

Unless you’re well-versed in statistics, that definition probably doesn’t make much sense. And unfortunately, I can’t explain it more simply. It’s not because I don’t want to; it’s because I literally can’t. Even scientists can’t easily explain what a p-value is.

 

And that’s a huge red flag. The p-value is commonly misunderstood and difficult to explain, yet the sciences are absolutely obsessed with it. It’s often used as a threshold to determine whether scientific studies get published or funded, whether certain strategies or products are implemented within businesses, or even whether certain drugs can be used to treat diseases. This means the p-value can make or break careers, businesses, and lives. Thankfully, statistical significance is coming under attack and quickly becoming a more controversial statistic, but I digress.

 


Why We Don't Calculate Statistical Significance

 

Here at Quantum Workplace, we currently don’t report p-values in our software for the sake of simplicity and statistical soundness. Below are five reasons why we avoid p-values; the first two illustrate our emphasis on simplicity, whereas the final three illustrate our emphasis on statistical soundness.

  1. The average employee does not have a statistical background. A large portion of users would not know how to accurately interpret the meaning of a p-value, making it a number that causes confusion and misinterpretation rather than offering clarity and guidance.
  2. Statistical significance is not the same as practical importance. Reporting p-values could give users a false sense of direction or security, with the misunderstanding that p-values under or above a certain threshold are less or more valuable to pursue for positive organizational change.
  3. Organizations are populations. Statistical significance is part of the branch of statistics known as inferential statistics. These statistics focus on being able to infer results about a population (e.g., a country, an industry) from a smaller group of that population because it’s rarely possible to get data from an entire population. However, in the case of census engagement surveys, the organization is the population. Response rates for our engagement surveys tend to be quite high (>80%), so there is little room, and therefore little need, for statistical inference.
  4. Statistical significance is strongly impacted by group size. Results are less likely to be “statistically significant” with smaller group sizes, and more likely to be “statistically significant” with larger group sizes. This relates directly to the second reason listed above; users could be misguided by an artifact of statistical significance (e.g., group size) rather than being guided by practically important differences.
  5. Assumptions for statistical significance testing would be violated to an extreme degree. This reason is more for my fellow stats geeks, but if statistical significance were calculated for every possible comparison within an organization, the resulting p-values would be so unreliable that they become meaningless at best and counterproductive at worst.
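The group-size point is easy to demonstrate. The sketch below (my own illustration, not Quantum Workplace's code) runs a standard pooled two-proportion z-test on the same five-point favorability gap at two group sizes; the gap is identical, but only the large groups clear the conventional p < 0.05 bar.

```python
from math import erf, sqrt

def norm_cdf(z: float) -> float:
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_prop_p(p1: float, p2: float, n1: int, n2: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    return 2.0 * (1.0 - norm_cdf(abs(z)))

# Identical 5-point favorability gap (75% vs. 70%), different group sizes
small = two_prop_p(0.75, 0.70, 30, 30)      # two 30-person teams
large = two_prop_p(0.75, 0.70, 3000, 3000)  # two 3,000-person divisions
print(f"small teams: p = {small:.3f}")   # not "significant"
print(f"large groups: p = {large:.6f}")  # highly "significant"
```

The practical difference between the groups is exactly the same in both cases; only the sample size changed, yet the verdict flips.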

Instead of relying on statistical significance to determine whether a difference or change is important, we recommend focusing on relative differences or changes within your data. For example, say that most survey questions increased in favorability since your previous engagement survey, yet a few decreased. Relatively speaking, those questions that decreased in favorability are more practically important to focus on. And more specifically, those questions that decreased the most should receive highest priority.
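That triage can be sketched in a few lines. The question names and scores below are invented for illustration; the logic is simply "keep what declined, worst first."

```python
# Hypothetical favorability scores: (previous survey, current survey)
favorability = {
    "I see a path for professional growth": (78, 71),
    "My manager gives me useful feedback": (80, 76),
    "I am proud to work here": (85, 88),
    "I have the tools I need to do my job": (74, 79),
}

# Keep only questions that declined, sorted with the biggest drop first
declines = sorted(
    ((q, curr - prev) for q, (prev, curr) in favorability.items() if curr < prev),
    key=lambda item: item[1],
)

for question, change in declines:
    print(f"{change:+d}  {question}")
# The first row is the question that dropped the most -> highest priority
```

No p-value appears anywhere; the ranking alone tells you where to focus.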

 

Likewise, if one department has especially low overall favorability, that department is the most important to focus on. But if all departments have similarly low favorability, a strong organization-wide effort is required.
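One way to encode that rule of thumb in code (the department names, scores, and 10-point spread threshold are all illustrative assumptions, not product behavior):

```python
def focus_recommendation(dept_favorability: dict, spread: int = 10) -> str:
    """If one department lags well behind the rest, name it;
    if scores are clustered together, recommend an organization-wide
    effort. The 10-point spread threshold is an arbitrary illustration.
    """
    lo = min(dept_favorability.values())
    hi = max(dept_favorability.values())
    if hi - lo <= spread:
        return "organization-wide effort"
    return min(dept_favorability, key=dept_favorability.get)

# One department clearly lags -> focus there
print(focus_recommendation({"Sales": 82, "Engineering": 79, "Operations": 61}))
# All departments similarly low -> broad effort
print(focus_recommendation({"Sales": 64, "Engineering": 66, "Operations": 61}))
```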

 

Statistical significance has become a convenient shortcut for a lot of decisions made across the world, and I don’t want to suggest that statistical significance should be completely abandoned. It does have its place, but that place isn’t in engagement survey software.

 

If you're as obsessed with numbers as I am, you can't afford to miss out on our free annual trends report. It's the ultimate resource for gauging engagement.

 


 
