Publications by Group_Project: Venkata Naga Vamsidhar Reddy Karasani (vkara4), Anila Cheekati (vchee3), Venkata Sai Ram Tirunagari (Vtiru5), Pradeep Kumar Naidu (Pnaid2), Simhadri Ramanjaneyulu (rsimh3), Subhalaxmi Rout (srout2)

DATA 607 Assignment3

16.02.2020

1. Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS”. Answer: data <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/c...

1122 sym R (2183 sym/15 pcs)

Data 606 Homework 2 (Summarizing Data)

09.02.2020

Stats scores. (2.33, p. 78) Below are the final exam scores of twenty introductory statistics students: 57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94. Create a box plot of the distribution of these scores. The five number summary provided below may be useful. df <- data.frame("scores" = c(57, 66, 69, 71, 72, 73, ...

6670 sym R (422 sym/7 pcs) 5 img

DATA 607 Assignment1

03.02.2020

Overview: Using the worst-drivers dataset, we need to find out where America’s worst drivers are. The data set lists the states of America and the percentage of accidents attributed to speed, alcohol, and distraction. The data also show drivers’ car insurance and the losses incurred by insurance companies for collisions per insured driver. ...

3094 sym R (9359 sym/12 pcs) 4 img

Publish Document

08.02.2020

This is by design a very open-ended assignment. In general, there’s no need here to ask “Can I…?” questions about your proposed approach. A variety of reasonable approaches are acceptable. You could for example access the SQL data directly from R, or you could create an intermediate .CSV file. I should be able to generate the SQL table(s)...

905 sym R (2263 sym/12 pcs) 1 img 1 tbl

Data 607 Project1

21.02.2020

Project Instruction In this project, you’re given a text file with chess tournament results where the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (that could for example be imported into a SQL database) with the following information for all of the players: Player’s Name, Player’s Stat...

1326 sym R (48413 sym/54 pcs) 1 tbl

Data 607 Project 3

22.03.2020

1. Introduction a. What are the most valuable skills? As future data scientists, we set out in this project to determine which skills are most valued by employers. To answer this question, we decided to look at current job postings and to look for the skills that employers requested most frequently. As a data set with...

15925 sym R (68270 sym/57 pcs) 29 img

DATA 606 Lab 7

22.03.2020

North Carolina births In 2004, the state of North Carolina released a large data set containing information on births recorded in this state. This data set is useful to researchers studying the relation between habits and practices of expectant mothers and the birth of their children. We will work with a random sample of observations from this da...

6545 sym R (6564 sym/38 pcs) 10 img 1 tbl

Data 607 project 3 - Data Science skills set

23.03.2020

1. Introduction a. What are the most valuable skills? As future data scientists, we set out in this project to determine which skills are most valued by employers. To answer this question, we decided to look at current job postings and to look for the skills that employers requested most frequently. As a data set with...

18035 sym R (68273 sym/57 pcs) 29 img

Tidyverse Create assignment

29.03.2020

1. Introduction This is the Titanic dataset, which I chose from Kaggle. The data set has the columns below. 2. Load library #install.packages("tidyverse") #install.packages("ggplot2") library(ggplot2) library(tidyverse) ## ── Attaching packages ────────────────────────────────...

511 sym R (4321 sym/11 pcs) 3 img

DATA 605 Assignment 1

31.08.2020

Problem set 1 You can think of vectors representing many dimensions of related information. For instance, Netflix might store all the ratings a user gives to movies in a vector. This is clearly a vector of very large dimensions (in the millions) and very sparse as the user might have rated only a few movies. Similarly, Amazon might store the item...

2337 sym R (1684 sym/13 pcs)