Beginner Level Table Tutorial

This tutorial provides a fast-track example of building a simple table so as to introduce the process in R. Given R's object-oriented perspective, many analysts may experience initial challenges with it, as it differs significantly from many of the other popular statistical packages.
Still, with use, it is expected that R will make many converts.

The tutorial is structured around the construction of a table object. This can be seen as a multi-stage process with well-defined steps. First, the basic table must be created. Then it must be embellished. Finally, it must be exported to an environment useful to the target audience.

First, the user must create the core table object. This consists of going through the database and filling in all the elements of the two-dimensional matrix. Typically this is the most computationally intensive phase, as it involves reading the entire database. Each cell in the matrix will be the count of records that meet the conditions given for that column and row.

The second stage involves the embellishment of the table. First, there are the secondary calculations that must be done; typically these include row or column totals. These are then added to the table object. Then the table object is embellished with the appropriate text and graphical elements.

The final stage is the export of the table object from R to the environment of choice, which best suits the needs of the target audience.

Build Core Table Object

This basic table simply provides a cross-tabulation of the fifteen observations in the database. Note that the categorical variables were created to allow the compilation of a simple 2 by 2 table.

Load the data into the system

The database used is available with the default R system to allow for easy replication. It is loaded with the data.frame command after the workspace has been cleared with the rm command.

#clear the workspace
rm(list = ls())
#make dataframe out of preloaded dataset
workset <- data.frame(women)

Create categorical variables

A meaningful cross-tabulation cannot be built directly from continuous data, since nearly every value would form its own category. In this example, individuals are categorized first by height and then by weight. The result is two binary variables that can be used to build tables.

#use the height variable in inches 
#classify anyone 60 inches or less as short
workset$heightcat[workset$height <= 60] <- "short"
#the rest are tall
workset$heightcat[workset$height >  60] <- "tall"
#classify anyone 135 pounds or less as thin
workset$weightcat[workset$weight <= 135] <- "thin"
#rest are heavy
workset$weightcat[workset$weight >  135] <- "heavy"
#define as factor variables
workset$weightcat <- factor(workset$weightcat)
workset$heightcat <- factor(workset$heightcat)
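
The same two categories can also be produced in a single step per variable. The following is a minimal sketch, assuming base R's cut function with the same cut points of 60 inches and 135 pounds; the object name workset2 is hypothetical and is used only to avoid overwriting workset.

```r
# sketch: build the same categories with cut() instead of indexed assignment
# workset2 is a hypothetical name chosen for this illustration
workset2 <- data.frame(women)
# intervals are closed on the right by default, so 60 falls in "short"
workset2$heightcat <- cut(workset2$height, breaks = c(-Inf, 60, Inf),
                          labels = c("short", "tall"))
# likewise 135 falls in "thin"
workset2$weightcat <- cut(workset2$weight, breaks = c(-Inf, 135, Inf),
                          labels = c("thin", "heavy"))
# cut() returns factors, so no separate call to factor() is needed
table(workset2$heightcat, workset2$weightcat)
```

Note that cut orders the levels as they are listed in labels, so thin comes before heavy here, whereas factor sorts the levels alphabetically.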

Compile the table

Now that there are two categorical variables, a 2 by 2 table can be built.

#create table object
tab1 <- table(workset$heightcat,workset$weightcat)
#display object in its initial format
tab1
##         heavy thin
##   short     0    3
##   tall      7    5

The result is a simple raw table. It uses the factor levels as the row and column labels. No title is assumed.
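
Each cell is a count of the records that satisfy both the row and the column condition, which can be checked directly. The following is a minimal sketch, assuming the same cut points as above and working from the women dataset so that it runs on its own; the names is_tall, is_heavy and n_tall_heavy are hypothetical.

```r
# sketch: reproduce one cell of the table by counting matching records
is_tall  <- women$height > 60
is_heavy <- women$weight > 135
# TRUE counts as 1, so the sum is the number of tall and heavy records
n_tall_heavy <- sum(is_tall & is_heavy)
n_tall_heavy
# all four cells together must account for every record in the dataset
sum(table(is_tall, is_heavy)) == nrow(women)
```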

Embellish the Basic Table

In the above form, the correctness of the results can be easily validated. However, for reporting purposes, this table would be seen as plain. To improve its appearance, titles and labels are added.

Change Row and Column Titles

These values are changed by using functions to modify the tab1 object produced above.

Change the Dimension Titles

This refers to the label that defines either the column or row dimension.

#assign names to the row and column dimensions
names(dimnames(tab1)) <- c("inches", "pounds")
#display the table
tab1
##        pounds
## inches  heavy thin
##   short     0    3
##   tall      7    5
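
The dimension titles can also be supplied when the table is first built. The following is a minimal sketch, assuming the dnn argument of table; the object names are hypothetical and the categories are rebuilt here so that the block runs on its own.

```r
# sketch: name the dimensions at creation time with dnn
heightcat <- ifelse(women$height <= 60, "short", "tall")
weightcat <- ifelse(women$weight <= 135, "thin", "heavy")
tab_named <- table(heightcat, weightcat, dnn = c("inches", "pounds"))
# the dimension titles are stored as the names of the dimnames list
names(dimnames(tab_named))
```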

Change the individual row and column labels

The above labels were generated from the factor levels. This is a reasonable default, but users often want something more descriptive in the tables.

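#row labels follow the existing order: short, then tall
rownames(tab1) <- c("<60"," 60+")
#column labels follow the existing order: heavy (over 135), then thin
colnames(tab1) <- c(" 135+","<135")
tab1
##       pounds
## inches  135+ <135
##   <60      0    3
##    60+     7    5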
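
Because rownames and colnames assign labels by position, the new labels must be listed in the same order as the existing ones. One way to guard against a mismatch is to look each new label up by its old name. The following is a minimal sketch, assuming hypothetical lookup vectors row_labels and col_labels and a table rebuilt from the women dataset so that the block runs on its own.

```r
# sketch: relabel by name rather than by position
heightcat <- ifelse(women$height <= 60, "short", "tall")
weightcat <- ifelse(women$weight <= 135, "thin", "heavy")
tab_demo <- table(inches = heightcat, pounds = weightcat)
# hypothetical lookup vectors: old label on the left, new label on the right
row_labels <- c(short = "60 or less", tall = "over 60")
col_labels <- c(thin = "135 or less", heavy = "over 135")
# index the lookup by the current labels so the order cannot be mixed up
rownames(tab_demo) <- unname(row_labels[rownames(tab_demo)])
colnames(tab_demo) <- unname(col_labels[colnames(tab_demo)])
tab_demo
```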

Add marginal totals

Another embellishment is to calculate the row and column totals.

tab1 <- addmargins(tab1)
tab1
##       pounds
## inches  135+ <135 Sum
##   <60      0    3   3
##    60+     7    5  12
##   Sum      7    8  15

Note that the label ‘Sum’ is assumed. This can be changed. The row label object is retrieved with the rownames function, its third element is changed from ‘Sum’ to ‘total’, and the modified object is then reassigned to the table object.

#retrieve row label object
thetit <- rownames(tab1)
thetit
## [1] "<60"  " 60+" "Sum"
#modify object
thetit[3] <- "total"
thetit
## [1] "<60"   " 60+"  "total"
#change table object with modified row label object
rownames(tab1) <- thetit
tab1
##        pounds
## inches   135+ <135 Sum
##   <60       0    3   3
##    60+      7    5  12
##   total     7    8  15

Now the bottom row, which holds the column totals, is labeled ‘total’. However, the rightmost column, which holds the row totals, is still labeled ‘Sum’. This can be rectified with a similar procedure using the colnames function.
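
The similar procedure for the remaining label can be sketched as follows. This is a minimal illustration, assuming a table rebuilt from the women dataset so that the block runs on its own; the names tab_demo and thecol are hypothetical.

```r
# sketch: rename the "Sum" column in the same way as the "Sum" row
heightcat <- ifelse(women$height <= 60, "short", "tall")
weightcat <- ifelse(women$weight <= 135, "thin", "heavy")
tab_demo <- addmargins(table(inches = heightcat, pounds = weightcat))
# retrieve the column label object
thecol <- colnames(tab_demo)
# modify the element that holds the default label
thecol[thecol == "Sum"] <- "total"
# reassign the modified labels to the table object
colnames(tab_demo) <- thecol
tab_demo
```

Matching on the label itself, rather than on a fixed position such as 3, keeps the code working if the number of columns changes.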

Export the Table to its Final Form

The tutorial is written in R, and this is how the tables are displayed. However, most analysts do not provide tables to their bosses as R programs and must export them to a format usable by their audience. Thus the usual final stage of producing a report is to export the tables to alternate formats that might be used for publication. There are numerous possible formats. This section focuses on two of the more popular ones.

An important subtlety is that the overall title is added at the export stage. Opinions may differ on where the title belongs, but in this tutorial it is added at this point.

Two examples are provided here. The first exports the table to LaTeX format, which allows for easy incorporation into a PDF document. The second exports to HTML format, which in turn can be loaded directly onto an intranet site or pasted into a Microsoft Word document.

It is important to note that this tutorial assumes that certain libraries have already been installed on the system. If this is not the case, the libraries, and possibly their supporting libraries, must be installed for these examples to work. As a general rule, this is easier to do in RStudio.
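
Whether the libraries are present can be checked before running the export examples. The following is a minimal sketch, assuming the two packages needed are xtable and htmlTable; the names needed, is_installed and not_installed are hypothetical. It only reports what is missing and does not install anything.

```r
# sketch: list which of the packages used below are not yet installed
needed <- c("xtable", "htmlTable")
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!is_installed]
if (length(not_installed) > 0) {
  # install.packages() needs an internet connection, so it is only suggested here
  message("Run: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```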

Export to LaTeX

In this example, the xtable package must be loaded. Note that what is shown here is the LaTeX commands that would generate a table. Also note that the title is entered with xtable, whereas the position of the title is set with print.

#load the xtable package
library(xtable)
tab2 <- xtable(tab1,caption='weight by height')
print(tab2,caption.placement = "top")
## % latex table generated in R 3.0.2 by xtable 1.8-2 package
## % Sun Aug 14 21:05:28 2016
## \begin{table}[ht]
## \centering
## \caption{weight by height} 
## \begin{tabular}{rrrr}
##   \hline
##  &  135+ & $<$135 & Sum \\ 
##   \hline
## $<$60 & 0.00 & 3.00 & 3.00 \\ 
##    60+ & 7.00 & 5.00 & 12.00 \\ 
##   total & 7.00 & 8.00 & 15.00 \\ 
##    \hline
## \end{tabular}
## \end{table}

Export to HTML

The htmlTable package, which provides the HTML export function, must be loaded first.

#load the htmlTable package
library(htmlTable)
tab3 <- htmlTable(tab2,caption='weight by height')
#display the rendered table
tab3
weight by height
        135+  <135  Sum
<60        0     3    3
60+        7     5   12
total      7     8   15

The defaults are plain. Fortunately, a fair number of options are available. This beginner tutorial will not attempt to document all the possibilities. However, the use of the css.cell option here shows how the spacing can be made more pleasing. Future tutorials will explore this in more depth, as there are many features.

tab3 <- htmlTable(tab2,caption='weight by height',
       css.cell = "padding-left: 1.5em; padding-right: 1.2em;")
#display the rendered table
tab3
weight by height
           135+     <135     Sum
<60           0        3       3
60+           7        5      12
total         7        8      15