Correspondence Analysis in Archaeology
'pareto.contrib': R function for plotting a Pareto chart of row/column category contributions to dimensions
'pareto.contrib' is an R function that plots the contribution of row or column categories to any given CA dimension as a Pareto chart. It depends on the 'qcc' package (which draws the chart) and on the 'FactoMineR' package (which performs the CA); both must be installed in R before running the function.

The chart returned by the function makes it easy to spot which categories contribute most, in relative terms, to a given dimension, and what percentage of that dimension's inertia a group of categories cumulatively explains.
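As a rough illustration of what a Pareto chart summarises, the base-R sketch below sorts a set of contributions in decreasing order and accumulates them. The vector 'cntr' and its values are hypothetical, made up for this illustration only; they are not output of 'pareto.contrib'.

```r
# Hypothetical contributions (in percent) of five categories to one dimension
cntr <- c(A = 5, B = 40, C = 10, D = 30, E = 15)

# Bars of a Pareto chart: contributions sorted from largest to smallest
sorted <- sort(cntr, decreasing = TRUE)

# Line of a Pareto chart: cumulative percentage of the total
cum.pct <- cumsum(sorted) / sum(sorted) * 100

# Side-by-side view of individual and cumulative contributions
print(data.frame(contribution = sorted, cumulative = cum.pct))
```

Reading the table from the top shows how many of the leading categories are needed to account for a given share of the dimension's inertia.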
The function is quite straightforward:
pareto.contrib(data, type, x, which)


where:
data is the input dataset;
type indicates whether the input dataset is a contingency table (enter "table") or an object returned by FactoMineR's CA() function (enter "obj");
x is the dimension of interest, given as a single number (e.g. 1 for the first dimension);
which indicates whether the user is interested in row ("R") or column ("C") categories.


Example:
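Let's load the packages the function relies on, together with the "smoke" dataset that ships with the 'ca' package:
library(FactoMineR)  # provides CA()
library(qcc)         # provides pareto.chart()
library(ca)
data(smoke)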


To get the Pareto chart of the contribution of row categories to the first dimension, we use the following code:
pareto.contrib(smoke, type="table", 1, which="R")

The output is reproduced below:
[Figure: Pareto chart titled "Rows contribution to the Inertia of Dim. 1"]
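For readers who want to see where the plotted percentages come from, the sketch below recomputes the contributions in base R from the singular value decomposition of the standardized residuals. It is an independent illustration, not the code used by FactoMineR; the counts were typed in by hand and are assumed to match the "smoke" table of the 'ca' package, and the object names (r.mass, c.mass, row.contrib, col.contrib) are made up for this example.

```r
# 'smoke' counts typed in by hand (assumed to match the table in the 'ca' package)
smoke.tab <- matrix(c( 4,  2,  3,  2,
                       4,  3,  7,  4,
                      25, 10, 12,  4,
                      18, 24, 33, 13,
                      10,  6,  7,  2),
                    nrow = 5, byrow = TRUE,
                    dimnames = list(c("SM", "JM", "SE", "JE", "SC"),
                                    c("none", "light", "medium", "heavy")))

P      <- smoke.tab / sum(smoke.tab)  # correspondence matrix
r.mass <- rowSums(P)                  # row masses
c.mass <- colSums(P)                  # column masses

# Standardized residuals: (observed - expected) / sqrt(expected), as proportions
S <- diag(1 / sqrt(r.mass)) %*% (P - r.mass %o% c.mass) %*% diag(1 / sqrt(c.mass))
dec <- svd(S)

# Number of non-trivial dimensions: min(rows, columns) - 1
k <- min(dim(smoke.tab)) - 1

# Contribution (in percent) of each category to each dimension:
# the squared entries of the left/right singular vectors
row.contrib <- 100 * dec$u[, 1:k]^2
col.contrib <- 100 * dec$v[, 1:k]^2
dimnames(row.contrib) <- list(rownames(smoke.tab), paste("Dim", 1:k))
dimnames(col.contrib) <- list(colnames(smoke.tab), paste("Dim", 1:k))

print(round(row.contrib, 1))
```

Because each singular vector has unit length, the contributions to any one dimension add up to 100, which is why the cumulative line of the Pareto chart ends at 100%.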
To load the function into R, just copy and paste the function below into the R console, and press return (or you can download the .R file HERE):
pareto.contrib <- function(data, type, x, which) {
  # Get the CA result: compute it from a contingency table, or reuse the
  # object already returned by FactoMineR's CA()
  if (type == "table") {
    # maximum number of CA dimensions: min(rows, columns) - 1
    dimensionality <- min(nrow(data), ncol(data)) - 1
    res.CA <- FactoMineR::CA(data, ncp = dimensionality, graph = FALSE)  # requires FactoMineR
  } else if (type == "obj") {
    res.CA <- data
  } else {
    stop("'type' must be either \"table\" or \"obj\"")
  }
  # Extract the contributions (in percent) of the categories to dimension x
  if (which == "R") {
    cntr <- res.CA$row$contrib[, x]
    names(cntr) <- rownames(res.CA$row$contrib)
    title <- paste("Rows contribution to the Inertia of Dim.", x)
  } else {
    cntr <- res.CA$col$contrib[, x]
    names(cntr) <- rownames(res.CA$col$contrib)
    title <- paste("Columns contribution to the Inertia of Dim.", x)
  }
  # Draw the Pareto chart
  qcc::pareto.chart(cntr, cumperc = seq(0, 100, by = 20), ylab = "Percentage",
                    main = title, cex.axis = 0.8, cex.names = 0.8)  # requires qcc
}
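Once the function has been loaded, it can also be fed an existing CA object rather than a raw table. The sketch below is only an illustration of that second calling mode: it assumes that 'pareto.contrib' is already defined in the session and that the 'ca', 'FactoMineR' and 'qcc' packages are installed, and it does nothing if either assumption fails. The names 'pkgs.ok' and 'res' are made up for this example.

```r
# Check that the three packages this sketch relies on are available
pkgs.ok <- all(sapply(c("ca", "FactoMineR", "qcc"),
                      requireNamespace, quietly = TRUE))

if (pkgs.ok && exists("pareto.contrib")) {
  library(FactoMineR)              # provides CA()
  library(qcc)                     # provides pareto.chart()
  data("smoke", package = "ca")    # load the example table

  # Run the CA once and keep the result
  res <- CA(smoke, graph = FALSE)

  # Column categories' contribution to the second dimension
  pareto.contrib(res, type = "obj", x = 2, which = "C")
}
```

Passing the CA object avoids recomputing the analysis when several charts (different dimensions, rows and columns) are drawn from the same table.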

Final note:
remember that, for the function to work, both the 'qcc' and the 'FactoMineR' packages must be installed in R.