class: title-slide, center, middle

# Computer Aided Archaeology
## 06 - Visualisation II
### Martin Hinz
#### Institut für Archäologische Wissenschaften, Universität Bern

25/10/23

---

## Seriation: idea and basics [1]

.pull-left[
Types come into use, reach a maximum and then disappear

- First seriation: Sir William Flinders-Petrie 1899
- Became very popular during Processual Archaeology
- First major trials with seriation methods in Germany: Goldman 1979, using reciprocal averaging
]

.pull-right[
![](data:image/png;base64,#../images/06_session/battleship_2.jpg)
]

---

## Seriation: idea and basics [2]

.pull-left[
- Represent 'types' per 'object' in a table
- Sort the table so that a sequence (diagonal) appears

Methods:

- Reciprocal Averaging
- Correspondence Analysis

Necessary:

- at least two 'types' per 'object'
- at least two 'objects' per 'type'
]

.pull-right[
![](data:image/png;base64,#../images/06_session/muensingen_ser.jpg)
]

---

## First step

### Get the data out of the database in a useful form

- query
- pivot table

---

## Second step

### Remove non-informative rows

- rows that contain no artefact or only one

### Remove non-informative columns

- columns that contain no artefact or only one

### Repeat until no further removal is necessary

<hr>

## Commands in LibreOffice Calc (and Excel)

- COUNT (ANZAHL)
- COUNTIF (ZÄHLENWENN)

---

## Reciprocal Averaging: idea and basics

Produce a diagonal in such a way that all objects and types are ranked relative to each other

- Calculate rank for rows
- Sort rows according to rank
- Repeat the same for columns
- Repeat both until no further changes occur

Iterative procedure

[Example Dataset](data/ra_example.csv)

---

## Correspondence analysis: idea and basics [1]

### Similar things have similar characteristics...[2]

**Visual explorative/descriptive method**

- Correspondence analysis does not work with significances, therefore it does not 'prove' anything
- Visualization of contingency tables or presence/absence matrices

**Idea**

- Representation of items (*sites*) and
properties (variables, *species*) in a common space (coordinate system)
- Data that are related to each other are represented close to each other
- Similarities are calculated using chi-square methods

**Prerequisites**

A data matrix with at least nominally scaled variables, therefore especially suitable for archaeological questions

---

## Correspondence analysis: idea and basics [1]

### Similar things have similar characteristics...

**General procedure**

- Standardizing the data to a comparable measure
- "Projection" of the data into a multidimensional variable space
- Determining the vectors which stepwise contain most of the information (variability) of the data and are oriented perpendicular to each other
- "Projection" of the data onto these vectors
- Representation of the position of the data on these vectors in a diagram

---

.pull-left[
### multidimensional data space

![](data:image/png;base64,#../images/06_session/multi_space.png)

.caption[.tiny[source: http://www.aapspharmscitech.org]]
]

.pull-right[
### projection of points onto a plane

![](data:image/png;base64,#../images/06_session/multi_projection.png)

.caption[.tiny[source: http://www.cs.mcgill.ca]]
]

---

## Correspondence Analysis: History

### General information

- Developed in the fields of biology and psychology
- Algebraic foundations in the 1940s (Hartley/Guttman)
- First explicit use by Benzécri in linguistic studies in the 1960s
- Further development in various research groups → resulted in different versions and names of the procedure
- 1984: Greenacre's basic monograph on the method

### In archaeology

- Wide application of the procedure for chronological sorting of the Rhineland Linear Pottery
- Continued by the institutes in Cologne and Kiel (Zimmermann, Müller)

---

## Correspondence Analysis: Procedure

### Preparation: contingency table, if necessary

**Presence Absence Matrix**

Records the presence or absence of a characteristic for a unit; this is the most widely used basis in
archaeology

|         | Pot | Cup | Fibula | Sum |
|---------|-----|-----|--------|-----|
| Burial1 | 1   | 1   | 0      | 2   |
| Burial2 | 0   | 1   | 1      | 2   |
| Burial3 | 1   | 1   | 1      | 3   |
| Burial4 | 1   | 0   | 1      | 2   |
| Sum     | 3   | 3   | 3      | 9   |

Prerequisite: at least 2 filled cells per column and at least 2 filled cells per row

---

## Preparation: contingency table, if necessary

### contingency table

Records the count of each characteristic for a unit or a group of units

|             | Pot | Cup | Fibula | Sum |
|-------------|-----|-----|--------|-----|
| Settlements | 20  | 23  | 40     | 83  |
| Hoards      | 23  | 10  | 6      | 39  |
| Burials     | 10  | 56  | 4      | 70  |
| Sum         | 53  | 89  | 50     | 192 |

Also possible: a Burt matrix. If you want, you can ask me for details after the lecture...

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

### Preparation: Standardising to relative frequency

Calculation: Divide each cell by the total sum

.tiny[
.pull-left[
<table>
<thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> pot </th> <th style="text-align:right;"> cup </th> <th style="text-align:right;"> fibula </th> <th style="text-align:right;"> Sum </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> burial1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 2 </td> </tr>
<tr> <td style="text-align:left;"> burial2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> </tr>
<tr> <td style="text-align:left;"> burial3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:left;"> burial4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td
style="text-align:right;"> 2 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 9 </td> </tr>
</tbody>
</table>
]

.pull-right[
<table>
<thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> pot </th> <th style="text-align:right;"> cup </th> <th style="text-align:right;"> fibula </th> <th style="text-align:right;"> Sum </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> burial1 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.22 </td> </tr>
<tr> <td style="text-align:left;"> burial2 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.22 </td> </tr>
<tr> <td style="text-align:left;"> burial3 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.33 </td> </tr>
<tr> <td style="text-align:left;"> burial4 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.22 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 1.00 </td> </tr>
</tbody>
</table>
]
]

The margins of the table are stored for calculating the expected values and for scaling the result later on

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

### Preparation: Calculation of expected values

Calculation: Multiply the row sum by the column sum for each cell

.pull-left[
<table>
<thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> pot </th> <th
style="text-align:right;"> cup </th> <th style="text-align:right;"> fibula </th> <th style="text-align:right;"> Sum </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> burial1 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.22 </td> </tr> <tr> <td style="text-align:left;"> burial2 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.22 </td> </tr> <tr> <td style="text-align:left;"> burial3 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.33 </td> </tr> <tr> <td style="text-align:left;"> burial4 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.22 </td> </tr> <tr> <td style="text-align:left;"> Sum </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 1.00 </td> </tr> </tbody> </table> ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> pot </th> <th style="text-align:right;"> cup </th> <th style="text-align:right;"> fibula </th> <th style="text-align:right;"> Sum </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.22 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.22 </td> </tr> <tr> <td 
style="text-align:left;"> </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.11 </td> <td style="text-align:right;"> 0.33 </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> 0.22 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 1.00 </td> </tr>
</tbody>
</table>
]

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

.pull-left[
### Preparation: Calculation of standardised values

`\(\chi^2=\sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}\)`

`\(z_{ij}=\frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}\)`

<table>
<thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> pot </th> <th style="text-align:right;"> cup </th> <th style="text-align:right;"> fibula </th> <th style="text-align:right;"> Sum </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> burial1 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> -0.27 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> burial2 </td> <td style="text-align:right;"> -0.27 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> burial3 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> burial4 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> -0.27 </td> <td style="text-align:right;"> 0.14 </td> <td
style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>
]

.pull-right[
### Inertia

Measure of the spread of the data relative to the number of cases `\(n\)`, with `\(z_{ij}\)` calculated from the relative frequencies

`\(I = \frac{\chi^2}{n} = \sum_i \sum_j z_{ij}^2\)`

Inertia here: 0.3333333
]

---
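The preparation steps of the preceding slides (relative frequencies, expected values, standardised values, inertia) can be reproduced in a few lines of base R. This is only a minimal sketch, not necessarily the code that produced the tables; the object names `burial`, `P`, `E` and `Z` are chosen here for illustration.

```r
# Presence/absence matrix from the slides (burials x artefact types)
burial <- matrix(
  c(1, 1, 0,
    0, 1, 1,
    1, 1, 1,
    1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("burial", 1:4), c("pot", "cup", "fibula"))
)

# Step 1: relative frequencies (divide each cell by the total sum)
P <- burial / sum(burial)

# Step 2: expected values (row sum times column sum of the relative table)
E <- outer(rowSums(P), colSums(P))

# Step 3: standardised values z_ij = (O_ij - E_ij) / sqrt(E_ij)
Z <- (P - E) / sqrt(E)

# Step 4: inertia as the sum of the squared standardised values
inertia <- sum(Z^2)

# Cross-check: chi-square of the count table divided by the table sum n
# (the small-sample warning of chisq.test() is irrelevant here)
chi2 <- unname(suppressWarnings(chisq.test(burial))$statistic)

print(round(Z, 2))
print(inertia)
print(chi2 / sum(burial))
```

Both `inertia` and `chi2 / sum(burial)` should agree with the value given on the Inertia slide.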
3D plot
---

.pull-left[
### multidimensional data space

![](data:image/png;base64,#../images/06_session/multi_space.png)

.caption[.tiny[source: http://www.aapspharmscitech.org]]
]

.pull-right[
### projection of points onto a plane

![](data:image/png;base64,#../images/06_session/multi_projection.png)

.caption[.tiny[source: http://www.cs.mcgill.ca]]
]

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

### Extraction of dimensions

**SVD**

**S**ingular **v**alue **d**ecomposition, a method for dimensional reduction with minimal loss of information

`\(Z = U \cdot S \cdot V'\)`

.tiny[
Z : Matrix with the standardized data

U : Matrix for the row elements

V : Matrix for the column elements

S : Diagonal matrix with the singular values
]

![Gene Golub’s license plate, photographed by Professor P. M. Kroonenberg of Leiden University.](data:image/png;base64,#../images/06_session/prof_svd.gif)

.caption[.tiny[Gene Golub’s license plate, photographed by Professor P. M. Kroonenberg of Leiden University.]]

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

### Extraction of dimensions

**SVD in R**

.tiny[
```r
# read the standardised values (semicolon-separated, decimal comma)
burial.z <- read.csv2("burial_z.csv", row.names = 1)
# singular value decomposition: d = singular values, u = rows, v = columns
burial.svd <- svd(burial.z)
burial.svd
```

```
## $d
## [1] 4.082483e-01 4.082483e-01 9.634284e-16
##
## $u
##               [,1]       [,2]       [,3]
## [1,]  7.071068e-01 -0.4082483 -0.5773503
## [2,]  9.008927e-17  0.8164966 -0.5773503
## [3,]  0.000000e+00  0.0000000  0.0000000
## [4,] -7.071068e-01 -0.4082483 -0.5773503
##
## $v
##            [,1]       [,2]      [,3]
## [1,]  0.0000000 -0.8164966 0.5773503
## [2,]  0.7071068  0.4082483 0.5773503
## [3,] -0.7071068  0.4082483 0.5773503
```
]

---

## SVD and Inertia

The squared singular values (the eigenvalues) represent the inertia.
.tiny[
The singular values

```r
burial.svd$d
```

```
## [1] 4.082483e-01 4.082483e-01 9.634284e-16
```

The squared singular values (the eigenvalues) are the inertia of the individual dimensions

```r
burial.svd$d^2
```

```
## [1] 1.666667e-01 1.666667e-01 9.281943e-31
```

The sum of the squared singular values is equal to the total inertia.

```r
sum(burial.svd$d^2)
```

```
## [1] 0.3333333
```

If the inertia of the individual dimensions is divided by the total inertia, the proportion of the total inertia represented by each dimension is obtained.

```r
burial.svd$d^2/sum(burial.svd$d^2)
```

```
## [1] 5.000000e-01 5.000000e-01 2.784583e-30
```
]

---

## Correspondence analysis: Procedure (using a presence/absence matrix)

### Normalization of coordinates

Scaling of the coordinates in such a way that

- the dimensions are weighted according to their proportion of the total inertia
- the rows/columns are weighted according to their proportion of the mass

---

## Correspondence analysis: Real World case

### Münsingen Burial Site

.pull-left[
.small[
```r
library(vegan)    # cca(), scores()
library(ggplot2)  # ggplot(), aes(), geom_point()
library(ggrepel)  # geom_text_repel()

muensingen <- read.csv("muensingen_ideal.csv",
                       row.names = 1)
# cca() without constraints performs a correspondence analysis
muensingen.cca <- cca(muensingen)
# coordinates of the types (columns) on the CA axes
muensingen.species <- data.frame(
  scores(muensingen.cca)$species
  )
ggplot(muensingen.species,
       aes(x = CA1,
           y = CA2,
           label = rownames(muensingen.species)
           )
       ) +
  geom_point() +
  geom_text_repel(size = 2)
```
]
]

.pull-right[
<img src="data:image/png;base64,#session_06_visualisation_2_files/figure-html/unnamed-chunk-19-1.png" width="504" />
]

---

## Correspondence analysis: Real World case

### Münsingen Burial Site

.pull-left[
![](data:image/png;base64,#../images/06_session/tosca.png)
]

.pull-right[
![](data:image/png;base64,#../images/06_session/tosca_seriation.png)
]

[http://tosca.archaeological.science](http://tosca.archaeological.science)

---

## Correspondence Analysis: Interpretation

### Guttman effect (horseshoe, parabola)

.pull-left[
In archaeology, this is often regarded as evidence of a temporal orientation.
The Guttman effect occurs when a process affects the data on multiple levels.

Over a longer time span, the largest influencing factor is usually time, but this does not always have to be the case. The result needs to be checked against other information.
]

.pull-right[
![](data:image/png;base64,#session_06_visualisation_2_files/figure-html/unnamed-chunk-20-1.png)
]

---

class: inverse, middle, center

# Any questions?

.footnote[
.right[
.tiny[
You can find the course material (including the presentations) at https://berncodalab.github.io/caa

You can contact me at <a href="mailto:martin.hinz@iaw.unibe.ch">martin.hinz@iaw.unibe.ch</a>
]
]
]
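---

## Reciprocal Averaging: a sketch in R

The iterative procedure from the Reciprocal Averaging slide (rank the rows, sort them, do the same for the columns, repeat until nothing changes) might be sketched as follows. This is a simplified illustration that uses mean positions as ranks; the function name `reciprocal_averaging()` and the toy table are assumptions made for this example and are not part of the course data.

```r
# Hypothetical helper: sort a presence/absence table by reciprocal averaging.
# Rows are objects, columns are types; every row and column needs an entry.
reciprocal_averaging <- function(m, max_iter = 100) {
  m <- as.matrix(m)
  for (iter in seq_len(max_iter)) {
    old_rows <- rownames(m)
    old_cols <- colnames(m)

    # Row rank: mean column position of the types present in each object
    row_rank <- as.vector(m %*% seq_len(ncol(m))) / rowSums(m)
    m <- m[order(row_rank), , drop = FALSE]

    # Column rank: mean row position of the objects containing each type
    col_rank <- as.vector(seq_len(nrow(m)) %*% m) / colSums(m)
    m <- m[, order(col_rank), drop = FALSE]

    # Stop as soon as neither rows nor columns changed their order
    if (identical(rownames(m), old_rows) &&
        identical(colnames(m), old_cols)) break
  }
  m
}

# Toy example (invented data): a scrambled table with a hidden diagonal
toy <- matrix(
  c(1, 0, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("o2", "o3", "o1"), c("C", "A", "D", "B"))
)
sorted <- reciprocal_averaging(toy)
print(sorted)
```

In the sorted table the filled cells line up along the diagonal. Whether the sequence runs from early to late or the other way round cannot be decided by the method itself.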