Publications by ebenezer
"Data Uniformity: A Statistic Assessment and management of Outlier and Missing Values In R"
By Ebenezer Akpati, July 6, 2023. Introduction: The uniformity of a dataset helps the analyst obtain more accurate results; two major threats to accuracy are outliers and missing values that are not handled well. Thus, pre-processing the data is the crucial step of any analysis and the focal point of any analyst whose int...
36061 sym R (21239 sym/54 pcs) 9 img
Analyzing President Bola Ahmed Tinubu's 29 MAY 2023 Inaugural Address
Citation: http://www.sthda.com/english/wiki/word-cloud-generator-in-r-one-killer-function-to-do-everything-you-need https://programminghistorian.org/en/lessons/basic-text-processing-in-r#fn:4 Introduction: This data set is from https://businessday.ng/news/legal-busin...
6317 sym R (8268 sym/67 pcs) 5 img
Analyzing President Bola Ahmed Tinubu’s 29 MAY 2023 Inaugural Address
Citation: http://www.sthda.com/english/wiki/word-cloud-generator-in-r-one-killer-function-to-do-everything-you-need https://programminghistorian.org/en/lessons/basic-text-processing-in-r#fn:4 Introduction: This text data set is from https://businessday.ng/news/legal-...
3563 sym R (8644 sym/67 pcs) 5 img
Stacking and Sorting Data
Large datasets are often divided into smaller dataframes and stored in separate files. For instance, point-of-sale data may be stored with a separate file for each month. Alternatively, subsets of the data are extracted from a large database in smaller sections. In either of these situations, the rows from the dif...
12606 sym R (394 sym/7 pcs)
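A minimal sketch of the stacking-and-sorting idea described above, assuming the dplyr functions `bind_rows()` and `arrange()` are used for the two steps (the summary is truncated, so this choice is an assumption). The dataframes `jan` and `feb` and their values are hypothetical.

```r
library(dplyr)

# Hypothetical monthly point-of-sale extracts (made-up values).
jan <- data.frame(month = "Jan", item = c("coffee", "tea"), revenue = c(120, 80))
feb <- data.frame(month = "Feb", item = c("coffee", "tea"), revenue = c(95, 110))

# Stack the rows of both dataframes, then sort by revenue, highest first.
sales <- bind_rows(jan, feb) %>%
  arrange(desc(revenue))

print(sales)
```

`bind_rows()` matches columns by name, so the monthly files do not need identical column order.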
Joining Data
Some useful insights can be gained when one dataset is analyzed in the context of another dataset. For instance, if weather is expected to have an influence on sales, then it may be worth combining the weather measurements with the point-of-sale data. Combining two datasets in this way is typically done with a join. A join always results...
12634 sym R (716 sym/7 pcs)
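As a sketch of the weather-and-sales example above, the snippet below uses dplyr's `left_join()` keyed on a shared `date` column. The dataframes, column names, and values are hypothetical, and a left join is only one of several join types the lesson may cover.

```r
library(dplyr)

# Hypothetical daily sales and weather measurements (made-up values).
sales <- data.frame(
  date = as.Date(c("2017-01-01", "2017-01-02", "2017-01-03")),
  revenue = c(200, 150, 310)
)
weather <- data.frame(
  date = as.Date(c("2017-01-01", "2017-01-02")),
  temp_c = c(4.5, -1.0)
)

# Keep every sales row; attach weather where a matching date exists.
sales_weather <- left_join(sales, weather, by = "date")

print(sales_weather)
```

With a left join, sales rows that have no matching weather record are kept and their `temp_c` value is `NA`.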
MORE ON FUNCTIONS: ARGUMENTS, CREATING, PRINTING, SAVING RESULTS, RETURNING RESULTS
MORE ON FUNCTIONS: ARGUMENTS, CREATING, PRINTING, SAVING RESULTS, RETURNING RESULTS Introduction: Whether you're dealing with data types, reading in data, or other analytic tasks, a fundamental principle of data analytic languages, and programming in general, is to avoid repeating code. This concept is often abbreviated as DRY: Don’t rep...
2614 sym 4 img
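To illustrate the idea of avoiding repeated code, here is a small base R sketch of creating a function with arguments, returning a result, and saving it to a variable. The function `rescale01()` is a hypothetical example, not taken from the lesson.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
# `na.rm` is an argument with a default value.
rescale01 <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)
  # The value of the last expression is returned to the caller.
  (x - rng[1]) / (rng[2] - rng[1])
}

# Save the returned result, then print it.
scaled <- rescale01(c(2, 4, 6))
print(scaled)
```

Writing the calculation once as a function means a later fix or change happens in one place rather than in every copy.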
Using dplyr's Mutate, Rename, Relocate, and Distinct Functions
This lesson focuses on four functions that simplify some common data preprocessing tasks: mutate(), rename(), relocate(), and distinct(). There are many other functions in the dplyr package for wrangling data. You should spend some time reviewing them when you want to perform a specific task. Preliminaries: Load the dplyr and magrittr packages...
8001 sym R (750 sym/7 pcs)
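The four functions named above can be sketched in a single pipeline. The dataframe `items` and its columns are hypothetical; only the dplyr function names come from the lesson summary.

```r
library(dplyr)

# Hypothetical item data with one duplicated row (made-up values).
items <- data.frame(
  Item = c("coffee", "tea", "tea"),
  Price = c(2.5, 2.0, 2.0),
  Quantity = c(4, 3, 3)
)

cleaned <- items %>%
  distinct() %>%                          # drop the duplicated row
  mutate(revenue = Price * Quantity) %>%  # add a computed column
  rename(item = Item) %>%                 # new_name = old_name
  relocate(revenue, .after = item)        # move revenue next to item

print(cleaned)
```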
Handling Missing Values
This lesson introduces some ways to deal with missing values. It’s important that missing values are either removed or filled in with imputed values so that algorithms do not throw errors. Preliminaries Load the dplyr and magrittr packages. library(dplyr) library(magrittr) Make sure that this file and the jan17Items.csv file are in the same ...
9759 sym R (10239 sym/15 pcs)
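The two options mentioned above, removing missing values or filling them in, might look like the following sketch. Mean imputation is used here purely as an assumption for illustration; the lesson may use a different imputation method. The dataframe `items` is hypothetical and is not the jan17Items.csv data.

```r
library(dplyr)

# Hypothetical prices with two missing values.
items <- data.frame(
  item = c("coffee", "tea", "scone", "juice"),
  price = c(2.5, NA, 3.0, NA)
)

# Option 1: remove rows where price is missing.
removed <- items %>%
  filter(!is.na(price))

# Option 2: impute missing prices with the mean of the observed prices.
imputed <- items %>%
  mutate(price = ifelse(is.na(price), mean(price, na.rm = TRUE), price))

print(removed)
print(imputed)
```

Removal shrinks the dataset, while imputation keeps every row at the cost of introducing estimated values.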
Pivoting Dataframes Between Wide and Long Shapes
This lesson introduces two functions from the tidyr package for pivoting dataframes between wide and long formats. The tidyr package is part of the tidyverse, and it has functions for reshaping dataframes. The shape of a dataframe refers to the number of rows and columns. Many plotting functions and dashboard applications work best with long d...
8049 sym R (1016 sym/7 pcs)
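The summary does not name the two tidyr functions; the sketch below assumes they are `pivot_longer()` and `pivot_wider()`, the current tidyr functions for this task. The dataframe `wide` and its values are hypothetical.

```r
library(tidyr)

# Hypothetical wide data: one row per store, one column per month.
wide <- data.frame(store = c("A", "B"), jan = c(10, 20), feb = c(15, 25))

# Wide to long: one row per store-month combination.
long <- pivot_longer(wide, cols = c(jan, feb),
                     names_to = "month", values_to = "revenue")

# Long back to wide: one column per month again.
wide_again <- pivot_wider(long, names_from = month, values_from = revenue)

print(long)
print(wide_again)
```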
Data Aggregation and Summary
This lesson introduces two functions from the dplyr package for aggregating data: the group_by() function and the summarise() function. We will also review how to use the lubridate package for converting strings to datetime types, as well as for rounding datetime values to date values. Finally, we will introduce the n_distinct() function for c...
6289 sym R (388 sym/3 pcs)
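A minimal sketch combining the pieces named above: parsing strings into datetimes with lubridate, rounding them down to dates, then aggregating with `group_by()`, `summarise()`, and `n_distinct()`. The transaction data and column names are hypothetical, and the specific lubridate calls (`ymd_hms()`, `floor_date()`, `as_date()`) are assumptions about how the lesson does the conversion.

```r
library(dplyr)
library(lubridate)

# Hypothetical transactions with timestamps stored as strings.
tx <- data.frame(
  time = c("2017-01-01 09:15:00", "2017-01-01 13:40:00", "2017-01-02 10:05:00"),
  item = c("coffee", "coffee", "tea"),
  revenue = c(2.5, 2.5, 2.0)
)

daily <- tx %>%
  mutate(
    time = ymd_hms(time),                             # string -> datetime
    date = as_date(floor_date(time, unit = "day"))    # datetime -> date
  ) %>%
  group_by(date) %>%
  summarise(
    total_revenue = sum(revenue),
    distinct_items = n_distinct(item)   # number of different items per day
  )

print(daily)
```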