Publications by Banu Boopalan

DATA606 Lab5a

08.03.2020

In this lab, we investigate the ways in which the statistics from a random sample of data can serve as point estimates for population parameters. We’re interested in formulating a sampling distribution of our estimate in order to learn about the properties of the estimate, such as its distribution. The data We consider real estate data from th...

12791 sym R (6641 sym/86 pcs) 10 img

DATA 606 LAB1

02.02.2020

The goal of this lab is to introduce you to R and RStudio, which you’ll be using throughout the course both to learn the statistical concepts discussed in the textbook and to analyze real data and come to informed conclusions. To straighten out which is which: R is the name of the programming language itself and RStudio is a convenient inte...

13445 sym R (17190 sym/60 pcs) 14 img

DATA 607 Final Project Team Banu Boopalan and Sadia Perveen PART2

11.12.2019

PROJECT TEAM: SADIA AND BANU DATA 607 FINAL PROJECT (PART2 DOCUMENT) ANALYSIS: The second part of our analysis was to understand how to scrape the Wikipedia metoo page, store the result in Neo4j, and retrieve the data. The goal was to see whether we could explain graph databases and expand on modeling topics through Neo4j in the future. Fo...

1780 sym R (8409 sym/39 pcs) 5 img 2 tbl

DATA 607 Banu Boopalan Tidyverse Assignment Part2

08.12.2019

Banu Boopalan : Extend Salma’s code (please see last code chunk added) Forcats package forcats provides a suite of useful tools that solve common problems with factors. R uses factors to handle categorical variables, variables that have a fixed and known set of possible values. In the following dataset, we have some categorical variables like ...

1649 sym R (4856 sym/12 pcs) 2 img

DAT607 Assignment 12 migrating mySQL DB to MongoDB

25.11.2019

Migrating to MongoDB Process: I created a free MongoDB cluster on an Atlas account, then tried to migrate my movierating database to the cluster. Because only my IP is whitelisted, only my IP is allowed to access the MongoDB cluster. So, from a reproducibility standpoint, I may have to list another IP address if a connection should be allowed int...

3659 sym R (30611 sym/65 pcs)

Data in Context Presentation DATA 607

02.12.2019

R Markdown Data Science Recording using dlply and brew package. Read UN dataset, create data for reporting #read file MyData <- read.csv(file="C:/Users/Banu/Documents/RScriptfiles/Datascienceincontext/SYB62_T07_Education_BrewPracticedataset.csv", header=TRUE, sep=",",stringsAsFactors = FALSE) str(MyData) ## 'data.frame': 8629 obs. of 7 vari...

403 sym R (7488 sym/14 pcs) 3 img

DATA 607 Banu Boopalan Tidyverse Assignment Part1

08.12.2019

Demonstrates Tidyverse text sentiment analysis: take in a dataset, clean up the tokens, and perform igraph and bing sentiment analysis. For this exercise I used a FiveThirtyEight dataset of the GOP phrases that candidates repeated the most: https://github.com/fivethirtyeight/data/tree/master/repeated-phrases-gop I downloaded the dataset from the ...

470 sym R (6169 sym/22 pcs) 2 img

DATA 607 Final Project Team Banu Boopalan and Sadia Perveen PART1

11.12.2019

PROJECT TEAM INFORMATION: README : In this RMD, please see PART1 for the project. Another RMD will be submitted for PART2. PART1 : Will contain ELA/MATH scores analysis and NYtimes API data analysis PART2 : Will contain Web scrape of Wikipedia page and neo4j data model and implementation. Team Members: Banu Boopalan & Sadia Perveen. For our final...

11159 sym R (74548 sym/145 pcs) 17 img 4 tbl

DATA606 Presentation

05.03.2020

Underage drinking, Part I. (4.17) Data collected by the Substance Abuse and Mental Health Services Administration (SAMHSA) suggests that 69.7% of 18-20 year olds consumed alcoholic beverages in any given year. (a) Suppose a random sample of ten 18-20 year olds is taken. Is the use of the binomial distribution appropriate for calculating the proba...

2849 sym R (1631 sym/34 pcs) 3 img

DATA606 Lab5b

08.03.2020

Sampling from Ames, Iowa If you have access to data on an entire population, say the size of every house in Ames, Iowa, it’s straightforward to answer questions like, “How big is the typical house in Ames?” and “How much variation is there in sizes of houses?”. If you have access to only a sample of the population, as is often the case...

6918 sym R (2138 sym/27 pcs) 4 img