Publications by John Akwei

Merging Dataframes Exercises

14.04.2016

When combining separate dataframes in the R programming language into a single dataframe, the cbind() function usually requires the match() function to align the rows first. To simulate the join functionality of SQL databases, the merge() function in R merges dataframes using the following protocols: “Inner Join” where the ...
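
As a rough illustration of the functions named in this excerpt, the sketch below merges two small dataframes. The dataframes and their column names are hypothetical and are not taken from the exercises; the left outer join is shown only as one further option of merge().

```r
# Two small illustrative dataframes (hypothetical data, not from the exercises)
employees <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cara"))
salaries  <- data.frame(id = c(2, 3, 4), salary = c(50000, 62000, 58000))

# Inner join: keep only the ids present in both dataframes
inner <- merge(employees, salaries, by = "id")

# Left outer join: keep every row of employees, filling gaps with NA
left <- merge(employees, salaries, by = "id", all.x = TRUE)

# cbind() pairs rows by position, so match() is used to align them first
aligned <- cbind(employees,
                 salary = salaries$salary[match(employees$id, salaries$id)])
```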

Data Exploration with Tables exercises

20.04.2016

The table() function is intended for use during the data exploration phase of data analysis, where it performs categorical tabulation of data. In the R programming language, “categorical” variables are also called “factor” variables. The tabulation of data categories allows for Cross-Validation of data. Thereby, finding possib...
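
A minimal sketch of categorical tabulation with table(), assuming the mtcars dataset bundled with base R as example data (the exercises themselves may use other data):

```r
# cyl and gear are stored as numbers in mtcars but behave like categories
cyl_counts  <- table(mtcars$cyl)               # one-way frequency table
cyl_by_gear <- table(mtcars$cyl, mtcars$gear)  # two-way contingency table

print(cyl_counts)
print(cyl_by_gear)
```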

Complex Tables – Exercises

26.04.2016

The ftable() function combines cross-tabulation with the ability to format, or “flatten”, contingency tables of 3 or more dimensions. The resulting tables contain the combined counts of the categorical variables (also called factor variables in R), which are then arranged as a matrix whose rows and columns correspond to the original data’s rows ...
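
The flattening described here can be sketched with the four-dimensional Titanic table that ships with base R. The choice of dataset and of row and column variables is an assumption made for illustration only.

```r
# Titanic is a 4-dimensional contingency table: Class, Sex, Age, Survived
flat <- ftable(Titanic,
               row.vars = c("Class", "Sex"),
               col.vars = c("Age", "Survived"))

# The result is a two-dimensional layout of the same counts
print(flat)
```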

Cross Tabulation with Xtabs exercises

12.05.2016

The xtabs() function creates contingency tables in frequency-weighted format. Use xtabs() when you want to numerically study the distribution of one categorical variable, or the relationship between two categorical variables. Categorical variables are also called “factor” variables in R. Using a formula interface, xtabs() can create a conting...
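
A short sketch of the formula interface, assuming the built-in mtcars and UCBAdmissions datasets as stand-ins for whatever data the exercises use:

```r
# One categorical variable: number of cars per cylinder class
by_cyl <- xtabs(~ cyl, data = mtcars)

# Two categorical variables: cylinders against gears
by_cyl_gear <- xtabs(~ cyl + gear, data = mtcars)

# Frequency-weighted input: the left-hand side supplies the counts
ucb <- as.data.frame(UCBAdmissions)
admit_by_gender <- xtabs(Freq ~ Gender + Admit, data = ucb)
```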

Accessing Dataframe Objects Exercises

20.05.2016

The attach() function alters the R search path by adding a dataframe to it, so that the dataframe's variables can be accessed by name as if they were global variables. If used carelessly, attach() can introduce semantic errors. To prevent this, detach() is needed to remove the dataframe from the search path again. The transform() function allows for transformation of datafr...
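
The attach(), detach(), and transform() calls can be sketched as follows, assuming the built-in mtcars dataset. The kpl column is a hypothetical name introduced only for this example.

```r
attach(mtcars)          # columns such as mpg are now found via the search path
mean_mpg <- mean(mpg)
detach(mtcars)          # remove mtcars from the search path again

# transform() returns a copy of the dataframe with a new or modified column;
# here miles per US gallon are converted to kilometres per litre
cars_kpl <- transform(mtcars, kpl = mpg * 0.425144)
```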

Scripting Loops In R

01.06.2016

An R programmer can determine the order in which commands are processed by using the control statements repeat{}, while(), for(), break, and next. Answers to the exercises are available here. Exercise 1: The repeat{} loop processes a block of code until the condition specified by the break statement (which is mandatory within the repeat{} loop) is m...
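
The five control statements can be sketched in a few lines. The variable names and loop bounds below are illustrative and are not taken from the exercises.

```r
# repeat{} runs until a break statement is reached
i <- 0
repeat {
  i <- i + 1
  if (i >= 5) break
}

# while() re-checks its condition before every pass
total <- 0
n <- 1
while (n <= 10) {
  total <- total + n
  n <- n + 1
}

# for() iterates over a sequence; next skips to the following iteration
odds <- c()
for (k in 1:10) {
  if (k %% 2 == 0) next
  odds <- c(odds, k)
}
```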

Summary Statistics With Aggregate()

16.06.2016

The aggregate() function splits dataframes and time series data into subsets, then computes summary statistics for each subset. The structure of the aggregate() function is aggregate(x, by, FUN). Answers to the exercises are available here. Exercise 1: Aggregate the “airquality” data by “airquality$Month”, returning means on each of the numeric variables. Also, remo...
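
One possible reading of the aggregate(x, by, FUN) structure is sketched below on the built-in airquality data. The selection of measurement columns and the use of na.rm = TRUE are assumptions, and this is not the published answer to the exercise.

```r
# Columns to summarise; Month is used only as the grouping variable
measures <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# Mean of each measure per month; na.rm = TRUE is passed on to mean()
monthly_means <- aggregate(measures,
                           by = list(Month = airquality$Month),
                           FUN = mean,
                           na.rm = TRUE)
print(monthly_means)
```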

Data Shape Transformation With Reshape()

06.07.2016

reshape() is an R function that accesses “observations” in grouped dataset columns and “records” in dataset rows, in order to programmatically transform the dataset shape into “long” or “wide” format. Required dataframe: data1 <- data.frame(id=c("ID.1", "ID.2", "ID.3"), sample1=c(5.01, 79.40, 80.37), sample2=c(5.12, 81.42, 83.12...
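
Because the required dataframe is cut off in this excerpt, the sketch below uses a separate, hypothetical dataframe (wide, with columns score1 and score2) to show a wide-to-long conversion and the reverse. All names in it are illustrative.

```r
# Hypothetical wide-format data: one row per id, one column per trial
wide <- data.frame(id = c("A", "B"),
                   score1 = c(1.5, 2.5),
                   score2 = c(3.5, 4.5))

# Wide to long: one row per id and trial
long <- reshape(wide, direction = "long",
                varying = c("score1", "score2"), v.names = "score",
                timevar = "trial", times = 1:2, idvar = "id")

# Long back to wide: one row per id again
back <- reshape(long, direction = "wide",
                idvar = "id", timevar = "trial", v.names = "score", sep = "")
```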

As.Date() Exercises

14.07.2016

The as.Date() function creates objects of the class “Date” from character representations of dates. Answers to the exercises are available here. Exercise 1: The format of as.Date(x, ...) accepts character dates in the format “YYYY-MM-DD”. For the first exercise, use the c() function and as.Date() to convert “2010-05-01” and...
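
A brief sketch of as.Date() follows. Note that R is case-sensitive, so the function must be spelled with a capital D. The dates used are illustrative and are not the ones from the exercise.

```r
# Character dates in the default YYYY-MM-DD format
iso_dates <- as.Date(c("2016-07-14", "2016-07-29"))
class(iso_dates)

# Other layouts need an explicit format string
dotted <- as.Date("14.07.2016", format = "%d.%m.%Y")

# Date objects support arithmetic; this is the gap in days
gap <- as.numeric(diff(iso_dates))
```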

Interactive Subsetting Exercises

29.07.2016

The subset() function is intended as a convenient, interactive substitute for subsetting with brackets. subset() extracts subsets of matrices, data frames, or vectors (including lists), according to specified conditions. Answers to the exercises are available here. Exercise 1: Subset the vector “mtcars[,1]” for values greater than “...
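
To compare subset() with bracket subsetting, here is a minimal sketch on the built-in mtcars data. The thresholds (25 and 30) and the selected columns are arbitrary assumptions, since the value used in the exercise is cut off in this excerpt.

```r
# Vector subsetting: values of the first column (mpg) above a threshold
high_mpg <- subset(mtcars[, 1], mtcars[, 1] > 25)

# Dataframe subsetting: filter rows and pick columns in one call
small_cars <- subset(mtcars, cyl == 4 & mpg > 30, select = c(mpg, cyl, hp))

# The equivalent bracket form, for comparison
small_cars_brackets <- mtcars[mtcars$cyl == 4 & mtcars$mpg > 30,
                              c("mpg", "cyl", "hp")]
```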
