Correspondence Analysis in Archaeology
'chi.perm': R function for permutation-based Chi square test of independence (DOI: 10.13140/RG.2.1.3582.1846)
'chi.perm' is an R function that performs the chi-square test of independence on the basis of permuted tables, the number of which is selected by the user. For the rationale of this approach, see for instance the description provided by Beh E.J., Lombardo R. 2014, Correspondence Analysis: Theory, Practice and New Strategies, Chichester, Wiley, pp. 62-64.

The function is quite straightforward:
chi.perm(data, B, resid, filter, thresh, cramer) 

where:

data: the dataframe containing the contingency table;
B: the desired number of permutations (1000 by default);
resid: TRUE or FALSE (default), according to whether or not the user wants to plot the table of Pearson's standardized residuals;
filter: TRUE or FALSE (default), according to whether or not the user wants to filter the Pearson's standardized residuals using the threshold provided by the thresh parameter;
thresh: the threshold used when filter is TRUE; it is set at 1.96 by default, which corresponds to an alpha level of 0.05;
cramer: TRUE or FALSE (default), according to whether or not the user wants to calculate and plot the bootstrap confidence interval for Cramer's V.
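
To make the role of the B parameter concrete, the following is a minimal base-R sketch of a permutation-based chi-square test. It is not the code of chi.perm: the function name perm_chisq_sketch, the toy table, and the use of r2dtable() to generate permuted tables with the observed margins are all assumptions made for illustration.

```r
# Hypothetical helper (NOT the chi.perm source): permutation-based
# chi-square test of independence in base R.
chisq_stat <- function(m) {
  e <- outer(rowSums(m), colSums(m)) / sum(m)  # expected counts
  sum((m - e)^2 / e)
}

perm_chisq_sketch <- function(tab, B = 1000) {
  tab <- as.matrix(tab)
  obs <- chisq_stat(tab)
  # r2dtable() draws B random tables with the same row and column totals
  # as the observed table (assumed stand-in for the permuted tables).
  perm_tabs <- r2dtable(B, rowSums(tab), colSums(tab))
  perm <- vapply(perm_tabs, chisq_stat, numeric(1))
  list(observed = obs,
       perc95   = unname(quantile(perm, 0.95)),
       # share of permuted statistics at least as large as the observed one
       p.value  = (sum(perm >= obs) + 1) / (B + 1),
       permuted = perm)
}

# Made-up 3x3 contingency table, for illustration only
toy <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 20), nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))
set.seed(1)
res <- perm_chisq_sketch(toy, B = 999)
res$observed
res$p.value
```

The returned elements mirror the quantities the function is described as reporting: the observed chi-square, the 95th percentile of the permuted distribution, and the p value.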

Using the greenacre_data (to which reference is made HERE) for illustrative purposes, Figs. 1 to 3 below are obtained with the following command:
chi.perm(greenacre_data, B=1000, resid=TRUE, filter=FALSE, cramer=TRUE) 
Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 1 displays the permuted distribution of the chi-square statistic based on 1000 permuted tables. The selected number of permuted tables, the observed chi-square, the 95th percentile of the permuted distribution, and the associated p value are reported at the bottom of the chart.
Fig. 2 displays the bootstrap distribution of Cramer's V coefficient, based on a number of bootstrap replicates equal to the value of the function's parameter B. The 95% confidence interval for V is also reported.
Fig. 3 displays the Pearson's standardized residuals: a colour scale makes it easy to see which cells have counts smaller (blue) or larger (red) than expected under the hypothesis of independence.
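
As an illustration of the bootstrap interval for Cramer's V described for Fig. 2, the sketch below resamples the table from a multinomial distribution and takes a percentile interval. This is only one plausible way to do it; the resampling scheme, the percentile interval, and the names cramer_v and boot_cramer_sketch are assumptions, not the method used inside chi.perm.

```r
# Hypothetical helpers (NOT the chi.perm source): percentile bootstrap
# confidence interval for Cramer's V in base R.
cramer_v <- function(m) {
  e <- outer(rowSums(m), colSums(m)) / sum(m)
  # skip cells whose expected count is 0 (possible in a bootstrap replicate)
  chi2 <- sum(((m - e)^2 / e)[e > 0])
  sqrt(chi2 / (sum(m) * (min(dim(m)) - 1)))
}

boot_cramer_sketch <- function(tab, B = 1000, conf = 0.95) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  probs <- as.vector(tab) / n            # observed cell proportions
  v <- replicate(B, {
    # redraw n observations over the cells, keep the original table shape
    m <- matrix(rmultinom(1, n, probs), nrow = nrow(tab))
    cramer_v(m)
  })
  alpha <- 1 - conf
  list(V  = cramer_v(tab),
       ci = unname(quantile(v, c(alpha / 2, 1 - alpha / 2))))
}

# Made-up 3x3 contingency table, for illustration only
toy <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 20), nrow = 3, byrow = TRUE)
set.seed(1)
out <- boot_cramer_sketch(toy, B = 500)
out$V
out$ci
```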

Should the user want to display only the residuals larger than a given threshold, it suffices to set the filter parameter to TRUE and to specify the desired threshold by means of the thresh parameter, which is set at 1.96 by default:
chi.perm(greenacre_data, B=1000, resid=TRUE, filter=TRUE, thresh=1.96, cramer=TRUE) 
The output is displayed in Fig. 4 above.
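
The filtering step can be approximated in base R as sketched below. The sketch assumes that the residuals in question are the standardized residuals returned by chisq.test() in its stdres component, and that filtering is applied to their absolute value; neither assumption is confirmed by the description of chi.perm, and the toy table is made up.

```r
# Hypothetical illustration (NOT the chi.perm source): keep only the
# standardized residuals whose absolute value reaches the threshold.
toy <- matrix(c(30, 10, 10,
                10, 20, 20,
                10, 20, 20), nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

# (observed - expected) / sqrt(V), as computed by stats::chisq.test
std_res <- chisq.test(toy)$stdres

thresh <- 1.96
filtered <- std_res
filtered[abs(filtered) < thresh] <- NA   # blank out residuals below the threshold

round(filtered, 2)
```

Cells left as NA are those whose residual does not reach the threshold in absolute value; the remaining cells are the ones that would stand out in a filtered display.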


The function requires the packages 'corrplot', 'lrs', and 'InPosition' to be already loaded in R.
You can get the function via a small donation (about a couple of USD).
Upon making your donation, please do not forget to provide your preferred email contact where you will receive the file.
Have you found this website helpful? Consider leaving a comment on this page.
