Publications by STEM Research

Publish Document

12.04.2021

Introduction Frequency tabulation is a common statistical method of summarizing data into a manageable form without substantial loss of information. This technique is applied to categorical variables, where the aim is simply to see how many of the subjects in the sample fall in a given category (or what proportion or percentage they represent). When ...

5518 sym R (3063 sym/9 pcs) 7 tbl

Publish Document

12.04.2021

Introduction Descriptive statistics refers to the analysis of data that helps describe or summarize data in the form of tables. Descriptive statistics do not, however, allow us to draw conclusions beyond the data we have analyzed or reach conclusions regarding any hypotheses we might have made. They are categorized into the following three c...

8448 sym R (6394 sym/11 pcs) 8 tbl

Document

12.04.2021

Introduction This R guide covers multiple linear regression and the statistics and tests that come with it. It also covers ways to check how accurate a model is and what we can look at to determine whether we have the best model for our data. We will discuss how to run a multiple linear regression in R and what ...

4694 sym R (3463 sym/12 pcs) 3 tbl

Document

12.04.2021

Introduction Some, if not most, of the data sets you will use in R will be available in external formats or on the internet. As such, you, as an R programmer, will need a way of getting them into R before you can begin working on them. At times, this task can become time-intensive and, to some extent, quite frustrating, more so if data is ...

6732 sym R (3622 sym/10 pcs) 3 img 7 tbl

Publish Document

12.04.2021

Introduction In programming, a string is simply a sequence of characters, usually enclosed in either single or double quotation marks, depending on the programming language. Strings are an important data type in almost all programming languages, where they are used to store human-readable information, e.g. sentences, chara...

7883 sym R (14348 sym/58 pcs) 1 img

Document

12.04.2021

Introduction Functions in programming are sets of related instructions bundled together to perform a specific task. They are used when code must be run repeatedly, when a computational task is complex, or when a programmer wants to break a large program into smaller, manageable chunks. Functions may or may not re...

9573 sym R (7422 sym/24 pcs) 8 tbl

Pie chart and doughnut using ggplot2 Library

27.12.2021

Introduction
Content here …
Required package
library(foreign) # for importing the Stata v12 dataset
library(dplyr)
library(tidyverse) # has drop_na() function
library(ggplot2)
library(scales) # percent function
library(kableExtra) # display table formatting
Import data set to use
chs = read.dta("~/chs12.dta") # note that chs12.dta is ...

790 sym R (2662 sym/13 pcs) 8 img 2 tbl

Document

22.12.2021

library(haven)
chs = read_dta("~/chs.dta")
1. T tests
T tests are used to test differences between two means.
1.1. One sample t test
results <- t.test(chs$hemoglobin, alternative="two.sided", mu = 14.3, conf.level = 0.95)
results
##
## One Sample t-test
##
## data: chs$hemoglobin
## t = -2.9343, df = 10336, p-value = 0.003351
## alte...

397 sym R (1666 sym/11 pcs)

Data structures in R

13.12.2021

Introduction To make the most of the R programming language, a programmer needs a strong understanding of the basic data types and data structures, and how to operate on them. A clear understanding of R data structures is crucial, since these are the objects you will manipulate on a day-to-day basis in R. Dealing with object conversion is one of...

12372 sym R (16926 sym/116 pcs) 5 img 4 tbl

Data types in R

13.12.2021

Introduction To make the most of the R programming language, a programmer needs a strong understanding of the basic data types and data structures, and how to operate on them. Each variable in R has an associated data type, and each type requires a different amount of memory. There are specific operations that can be performed on a given data ...

3154 sym R (1602 sym/54 pcs) 1 img