Publications by John Akwei
Merging Dataframes Exercises
When combining separate dataframes in the R programming language into a single dataframe, using the cbind() function usually requires use of the match() function. To simulate the database joining functionality of SQL, the merge() function in R merges dataframes with the following protocols: “Inner Join” where the ...
Data Exploration with Tables exercises
The table() function is intended for use during the data exploration phase of data analysis. It performs categorical tabulation of data. In the R programming language, “categorical” variables are also called “factor” variables. The tabulation of data categories allows for cross-validation of data. Thereby, finding possib...
Complex Tables – Exercises
The ftable() function combines cross-tabulation with the ability to format, or “flatten”, contingency tables of 3 or more dimensions. The resulting tables contain the combined counts of the categorical variables (also called factor variables in R), which are then arranged as a matrix whose rows and columns correspond to the original data’s rows ...
Cross Tabulation with Xtabs exercises
The xtabs() function creates contingency tables in frequency-weighted format. Use xtabs() when you want to numerically study the distribution of one categorical variable, or the relationship between two categorical variables. Categorical variables are also called “factor” variables in R. Using a formula interface, xtabs() can create a conting...
Accessing Dataframe Objects Exercises
The attach() function alters the R environment search path by adding a dataframe to it, so that the dataframe's variables can be referenced by name as if they were global variables. If incorrectly scripted, the attach() function might create semantic errors. To prevent this possibility, detach() is needed to remove the dataframe objects from the search path. The transform() function allows for transformation of datafr...
Scripting Loops In R
An R programmer can determine the order in which commands are processed by using the control statements repeat{}, while(), for(), break, and next. Answers to the exercises are available here. Exercise 1 The repeat{} loop processes a block of code until the condition specified by the break statement (which is mandatory within the repeat{} loop) is m...
Summary Statistics With Aggregate()
The aggregate() function splits dataframes and time series data into subsets, then computes summary statistics for each subset. The structure of the aggregate() function is aggregate(x, by, FUN). Answers to the exercises are available here. Exercise 1 Aggregate the “airquality” data by “airquality$Month”, returning means on each of the numeric variables. Also, remo...
Data Shape Transformation With Reshape()
reshape() is an R function that accesses “observations” in grouped dataset columns and “records” in dataset rows, in order to programmatically transform the dataset shape into “long” or “wide” format. Required dataframe: data1 <- data.frame(id=c("ID.1", "ID.2", "ID.3"), sample1=c(5.01, 79.40, 80.37), sample2=c(5.12, 81.42, 83.12...
As.Date() Exercises
The as.Date() function creates objects of the class “Date” from character representations of dates. Answers to the exercises are available here. Exercise 1 The format of as.Date(x, ...) accepts character dates in the format “YYYY-MM-DD”. For the first exercise, use the c() function and as.Date() to convert “2010-05-01” and...
Interactive Subsetting Exercises
The subset() function is intended as a convenient, interactive substitute for subsetting with brackets. subset() extracts subsets of matrices, data frames, or vectors (including lists), according to specified conditions. Answers to the exercises are available here. Exercise 1 Subset the vector “mtcars[,1]” for values greater than “...