Publications by BayesianN
important functions for string manipulation
Introduction: Most data cleaning processes involve working with structured and unstructured character/string data types. The ability to manipulate string data can be a superpower. Load the necessary packages: library(tidyverse) library(odbc) library(DBI) library(RSQLite). Create a fake dataset in R using tribble(): original_table <- tribble(~co...
Answering complex business questions using SQL
tools evolve CREATE TABLE sales ( "customer_id" VARCHAR(1), "order_date" DATE, "product_id" INTEGER ); INSERT INTO sales ("customer_id", "order_date", "product_id") VALUES ('A', '2021-01-01', '1'), ('A', '2021-01-01', '2'), ('A', '2021-01-07', '2'), ('A', '2021-01-10', '3'), ('A', '2021-01-11', '3'), ('A', '2021-0...
job retention analysis
Library Setup library(tidyverse) library(knitr) library(gtsummary) library(ggpubr) library(RColorBrewer) library(lemon) library(paletteer) library(survival) library(survminer) library(cowplot) library(rms) library(car) library(patchwork) options("encoding" = "UTF-8") knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE...
probability distributions in R
Introduction to Distribution Theory. Probability theory. Bongani Ncube, 2023-11-29. Probability concepts: The probability of an event E is the number of ways E can occur divided by the total number of possible, equally likely outcomes. We live in a world where decisions are made under conditions of uncertainty. It is therefore important to know the chanc...
R and SQL, a bit about both
0.0.1 Load in required packages library(odbc) library(DBI) library(RSQLite) library(tidyverse) options(scipen = 999) 0.0.2 Read in data sales<-readr::read_csv("sales.csv") 0.0.3 What variables do we have names(sales) ## [1] "Order ID" "Product" "Quantity Ordered" "Price Each" ## [5] "Order Date" "Purchase Addr...
INTRODUCTION TO SQL window functions
0.1 Window functions: A window function performs an aggregate-like operation on a set of query rows. However, whereas an aggregate operation groups query rows into a single result row, a window function produces a result for each query row. 0.1.1 Anatomy of a window function: FUNCTION_NAME() OVER() ORDER BY PARTITION BY ROWS/RANGE PRECEDING/F...
more sql queries
library(tidyverse) library(odbc) library(DBI) library(RSQLite) ## read in the dataset df <- readr::read_csv("recipe_site_traffic_2212.csv") ## sample 25 observations and select columns 1, 2, 3 and 6 set.seed(1123) data1 <- df |> sample_n(size = 25) |> select(1, 2, 3, 6) ## sample 100 observations and select subsequent 3 variables(including...
airbnb dashboard in progress
AIRBNB: Data Exploration. Sidebar: In this study I explored the AIRBNB dataset through data visualisation. The other goal was to expand my knowledge of using flexdashboard for dashboard design. Overview; Most listed neighborhoods; neighborhood counts; Measure Distribution; correlations; boxplots; key findings; Statistical correlations: As expe...
SQL joins
0.0.1 Introduction: Greetings, I hope you will enjoy SQL JOINS coming from an R aficionado who has dealt mainly with R joins. These notes are based on how much I have understood, and I spent some time looking for pictures on the internet to aid the presentation. > The presentation assumes you are already familiar with a bit of SQL. 0.0.2 Mut...
movie rating in SQL
0.0.1 Exploratory data analysis: I am still learning to work with date formats in SQL, so I will start my analysis using R again. Since updating a database using the DBI package is still trivial, I have resorted to creating new columns using R instead, and am thus calculating the return on investment as worldwide_gross/production_budget. dat_new<-dat_...