
Wednesday, 1 November 2017

Research Design and Methods: Analysing Variation in Quantitative Data

Presenting Quantitative Data

How data at each level of measurement is presented, and how the "typical response" is measured:

       Nominal – presented using a frequency table, bar chart (space between the bars) or pie chart; typical response measured by the mode (the most common response)
       Ordinal – presented using a frequency table, bar chart (space between the bars) or pie chart; typical response measured by the median (the middle response when values are ranked)
       Interval or continuous – presented using a histogram (no space between the bars); typical response measured by the mean (the average response)

Univariate Analysis

       Bar charts and pie charts (for nominal and ordinal data)
       Histograms (for interval or scale data)
       Frequencies (nominal/ordinal), as in the table below:
         
                 Number       %
Agree                40     50%
Don't Know           30   37.5%
Disagree             10   12.5%
TOTAL                80    100%
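If you want to reproduce a frequency table like this outside SPSS, here is a minimal sketch in Python using pandas (the raw responses are hypothetical, constructed to match the counts above):

```python
import pandas as pd

# Hypothetical raw responses, chosen to reproduce the table above
responses = pd.Series(["Agree"] * 40 + ["Don't Know"] * 30 + ["Disagree"] * 10)

counts = responses.value_counts()                       # Number column
percent = responses.value_counts(normalize=True) * 100  # % column

print(pd.DataFrame({"Number": counts, "%": percent.round(1)}))
```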

Descriptives

How many people does your business normally employ? (Full Time)
N = 71, Min = 0, Max = 1500, Mean = 32.15, Std. Deviation = 180.513 (Valid N = 71)
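The same approach works for descriptives; a short pandas sketch (with made-up values for the employment question, since the raw data are not shown here):

```python
import pandas as pd

# Hypothetical values for the employment question above (any numeric column works)
employees = pd.Series([0, 2, 5, 8, 12, 30, 1500], name="full_time_employees")

# count/min/max/mean/std are the figures SPSS reports as N, Min, Max, Mean, Std. Deviation
print(employees.agg(["count", "min", "max", "mean", "std"]))
```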

Why is measuring dispersion important?

Two groups can have the same mean but very different spreads:

Age       Group A (N)   Group B (N)
30              0            40
35             10            10
40             20             0
45             40             0
50             20             0
55             10            10
60              0            40
Total         100           100
Mean           45            45
SD            5.5            14
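The same point can be checked in code; a quick numpy sketch rebuilds both groups from the frequency table above and confirms that the means match while the standard deviations differ:

```python
import numpy as np

ages = [30, 35, 40, 45, 50, 55, 60]
group_a = np.repeat(ages, [0, 10, 20, 40, 20, 10, 0])  # clustered around the mean
group_b = np.repeat(ages, [40, 10, 0, 0, 0, 10, 40])   # polarised at the extremes

for name, group in [("Group A", group_a), ("Group B", group_b)]:
    # np.std defaults to the population standard deviation (ddof=0)
    print(f"{name}: n={len(group)}, mean={group.mean():.0f}, SD={group.std():.1f}")
# Group A: n=100, mean=45, SD=5.5
# Group B: n=100, mean=45, SD=14.1
```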

VIDEO: why it is important to understand what lies behind the mean
Why bivariate analysis?
       Usually, to move beyond ‘description’ to ‘explanation’ – to move beyond ‘how things are’ to ‘why things are the way they are’
       Explanation can be theoretically informed (testing theory) or inductive (producing theory)
       Measured effects are explained by measured causes. In other words, ‘explanation’ requires the analysis of relationships within the data. Often relationships are explored between ‘independent’ and ‘dependent’ variables.
       Quantitative analysis is more than explanation – it is also about prediction. It involves inferring what is likely within a wider population, given certain conditions.
       Design limitations imply that ‘explanations’ are probabilistic. Hence, ‘attending class increases the likelihood of doing well in assessments’.

Looking at differences by independent variable: cross-tabs
‘Cross-Tabs’ (for nominal and ordinal variables)

       Need to decide on the independent (cause) and dependent (effect) variable.
       You can relate this to a hypothesis your research is designed to test, e.g.
      H1: there is a relationship between gender and liking chocolate
      H0: there is no relationship between gender and liking chocolate.
                       Male         Female
Like Chocolate         25 (50%)     40 (80%)
Don't Like Chocolate   25 (50%)     10 (20%)
TOTAL                  50 (100%)    50 (100%)
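As an illustration, a cross-tab like this can be produced in Python with pandas (the respondent-level data below are hypothetical, arranged to reproduce the counts above):

```python
import pandas as pd

# Hypothetical respondent-level data, arranged to reproduce the table above
df = pd.DataFrame({
    "gender": ["Male"] * 50 + ["Female"] * 50,
    "likes_chocolate": (["Yes"] * 25 + ["No"] * 25      # males: 50/50
                        + ["Yes"] * 40 + ["No"] * 10),  # females: 80/20
})

# normalize="columns" gives percentages within each gender
print(pd.crosstab(df["likes_chocolate"], df["gender"], normalize="columns") * 100)
```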

Looking at differences: comparing means

‘Compare Means’ (for nominal/ordinal vs. scale data)

       Again, need to decide on the independent (cause) and dependent (effect) variable
                    Graduate    Non-Graduate
Mean Salary at 25   £20,000     £15,000
Mean Salary at 35   £35,000     £25,000
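Again as an illustration outside SPSS, a compare-means table can be produced with a pandas group-by (the individual salaries are hypothetical, chosen so the group means match the table above):

```python
import pandas as pd

# Hypothetical respondent-level data (salaries in £ at age 25)
df = pd.DataFrame({
    "education": ["Graduate"] * 2 + ["Non-Graduate"] * 2,
    "salary_at_25": [21000, 19000, 16000, 14000],
})

# Mean of the scale variable within each category of the independent variable
print(df.groupby("education")["salary_at_25"].mean())
```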

Inferential Statistics
       Descriptions of data (e.g. tables) on their own are limited.
       A key principle of quantitative analysis is generalisation: exploring the extent to which your observations/relationships are likely to exist in the population.
       In other words, your data are often based on a sample, which you are using to make estimates about the population.

What do inferential statistics do?
       They use sample data – to make an inference (estimate) about the population from which the sample was drawn
       From a small sample we can make estimates about a large population
       They tell us whether or not we can make inferences – that is, confidently predict that what we have found in the sample also exists in the population.
       Statistically significant correlations and differences in sample data mean you can be confident these results can be applied to the population

Significance Testing?
       “Testing the probability of a relationship between variables occurring by chance alone if there really was no difference in the population from which that sample was drawn is known as significance testing” (Saunders et al).
       Software (such as SPSS) can give you a test statistic value, the degrees of freedom and the probability (p-value)
       If p ≤ 0.05 then you have a statistically significant relationship (i.e. the probability of it occurring by chance alone is very small)
       It’s more difficult to obtain this information from a very small sample – so sometimes these tests help guide you to how much data you need to collect.
       For example, if you sample a coach load of people at random from the general population, the probability of getting only very tall people or only very short people is very small (possible, but very unlikely).

What is statistically significant?
       Usually in business research we would be happy if we could be 95% confident in our estimate. This equates to a 5% significance level, which can be written as: sig = 0.05
       Usually SPSS calculates significance for all statistics – indicated by a sig value.
       Where sig is less than or equal to 0.05, the result is said to be statistically significant: you can be confident that what you have observed is likely to be a significant difference or association.
       Where sig is greater than 0.05, you can only assume that what you have observed could just have occurred due to chance: it is unlikely to exist, causally, in the population and should not be claimed as a significant finding.

Tests of Difference
       In addition to running crosstabs and compare means, it is usual practice in survey and experimental designs to run a test of difference.
       Such tests answer the question whether groups (independent variables) differ according to some measure (dependent variable).  For example, does performance differ by gender?
       Tests of difference include:
      Chi-square
      ANOVA (F-test)
                 
Remember it is usual in business to look for “Sig.” values that are ≤ 0.05 (i.e. it is usual to work with a 95% confidence level)

Looking at differences: cross-tabs & chi-squared tests
       Adding a chi-squared test to your analysis will give you some evidence to decide whether the independent variable (i.e. gender) is likely to have an evidenced effect on the dependent variable (liking chocolate)
       This adds more information to what you can already see/read from the percentages in the cross-tab table above.

       If a chi-squared test doesn't give a sig. value that is less than (or equal to) 0.05, then the pattern of variation you see above is reasonably likely to have occurred just by chance. You will have to reject the hypothesis that gender affects liking chocolate: the evidence from your sample is not sufficient to infer this of the population.
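For readers working outside SPSS, here is a sketch of the same test using scipy on the chocolate cross-tab above (note that scipy applies Yates' continuity correction to 2×2 tables by default):

```python
from scipy.stats import chi2_contingency

# Counts from the chocolate cross-tab above:
# rows = like / don't like, columns = male / female
observed = [[25, 40],
            [25, 10]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.3f}")
# Here p is roughly 0.003, i.e. p <= 0.05, so the gender/chocolate
# relationship in this sample is statistically significant
```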

Looking at differences: comparing means & ANOVA
       Again, adding a statistical test to our analysis could help us see whether the apparent relationship between the independent (cause) and dependent (effect) variables in the graduate/non-graduate comparison above could have happened by chance anyway.
       A request to run an Analysis of Variance (ANOVA) could be added when you run the compare means.
       Again, a low sig. value (≤ 0.05) shows a result likely to be significant
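As a sketch of how this might look outside SPSS, scipy's f_oneway runs a one-way ANOVA on raw data; the salaries below are hypothetical, chosen so the group means match the graduate/non-graduate table earlier:

```python
from scipy.stats import f_oneway

# Hypothetical raw salaries at 25, with group means matching the table earlier
graduates     = [18000, 19500, 20000, 20500, 22000]
non_graduates = [13000, 14500, 15000, 15500, 17000]

f_stat, p = f_oneway(graduates, non_graduates)
print(f"F = {f_stat:.2f}, p = {p:.3f}")  # p <= 0.05 suggests a real group difference
```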

Co-variance as a prerequisite for causality
       In quantitative analysis, explanatory power is usually measured in terms of the extent to which effects co-vary with potential causes.
       Two key modes of analysis:
      Do categories or classes differ in terms of values, attitudes or behaviours? Are these patterns statistically significant? (Tests of difference)
      Do quantities co-vary? That is, does a statistical relationship exist? (Correlation)
      In other words, where two variables co-vary they are said to be statistically related. Statistical relationships can vary in strength, direction and significance. The stronger the statistical relationship (as covariance approaches perfection), the greater the potential explanatory power.

Looking at co-variance: do marks in tests and attendance in class co-vary?
Correlation coefficients as measures of statistical relationships
       The problem with diagrams is that they tend to lead to subjective conclusions and also make comparisons difficult. They also lack precision.
       The solution is a set of statistics known as correlation coefficients. Examples include:
      Cramér's V
      Pearson's Product Moment Correlation Coefficient
      Spearman's Rank
      Kendall's Tau
      The choice depends on the form (nominal, ordinal, interval) and the distribution of the data

[Scatter plot: perfect positive correlation]

Correlation Coefficient
       A single summary measure of the relationship between two variables
       It tells you:
                - how strong or weak the relationship is
                - and, for scale data, the direction of that relationship (i.e. whether it is positive or negative)
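As an illustration, returning to the marks-and-attendance question above, both kinds of coefficient can be computed with scipy (the ten students' figures are invented for the sketch):

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical data: classes attended and test mark for ten students
attendance = [4, 6, 7, 8, 10, 12, 14, 15, 18, 20]
marks      = [40, 45, 50, 48, 55, 60, 58, 65, 70, 78]

r, p = pearsonr(attendance, marks)      # Pearson: for scale (interval) data
rho, p2 = spearmanr(attendance, marks)  # Spearman: for ranked (ordinal) data
print(f"Pearson r = {r:.2f} (p = {p:.3f}); Spearman rho = {rho:.2f} (p = {p2:.3f})")
```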


Strength of correlations

No association (correlation = 0):
           Male     Female
YES        65%      65%
NO         35%      35%
TOTAL      225      175

Moderate association (correlation = 0.5):
           Male     Female
YES        30%      75%
NO         70%      25%
TOTAL      225      175

Perfect association (correlation = 1):
           Male     Female
YES        0%       100%
NO         100%     0%
TOTAL      225      175
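As a sketch (assuming the column percentages convert to counts as shown, and scipy ≥ 1.7 for the association helper), a coefficient such as Cramér's V can be computed for each of these tables:

```python
import numpy as np
from scipy.stats.contingency import association

totals = np.array([225, 175])  # male and female column totals from the tables above

# Turn each table's column percentages into counts
no_assoc = np.array([[0.65, 0.65], [0.35, 0.35]]) * totals
moderate = np.array([[0.30, 0.75], [0.70, 0.25]]) * totals
perfect  = np.array([[0.00, 1.00], [1.00, 0.00]]) * totals

for name, table in [("none", no_assoc), ("moderate", moderate), ("perfect", perfect)]:
    v = association(table, method="cramer")
    print(f"{name}: Cramér's V = {v:.2f}")
# "none" gives 0 and "perfect" gives 1; the middle table comes out around 0.45,
# close to the schematic 0.5 shown above
```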

Testing for significant relationships and differences?
Tests for differences include:
       Chi-square tests – e.g. use with cross tabulations to look at the differences between 2 groups
       Analysis of variance (ANOVA – F tests) – e.g. use with ‘compare means’ to see the extent the means are different.
Tests for Correlation:
       Pearson’s product moment correlation coefficient – e.g. use with continuous (scale) data
       Spearman’s rank correlation coefficient – e.g. use with ordinal data where variables contain data that can be ranked
See Saunders et al. (2003, p. 357) or Collis and Hussey (2014, p. 262) for more details.

Age of respondent and degree of satisfaction with holiday: one-way ANOVA

In this case there are no statistically significant differences between national groups in terms of expenditure.

Summary – Which statistics do you choose?
