Publications by Adrianne Kristianto

Sacramento January 2006 Crime Records (Leaflet)

08.05.2020

Introduction: The Sacramento crime file for January 2006 contains 7,584 crime records, as made available by the Sacramento Police Department. This project uses the data downloaded from all forces for January 2006, covering the whole city of Sacramento. Data source: https://support.spatialkey.com/spatialkey-sample-csv-data/ Preparing your data...

1507 sym R (827 sym/7 pcs)

Data Algorithms I - Midterm

09.10.2020

Data Sets: You need to download the dataset birthweight.csv for Exercises 1-4. The birthweight data record live, singleton births to mothers between the ages of 18 and 45 in the United States who were classified as black or white. There are a total of 400 observations in birthweight, and the variables are: Weight: Infant birth weight (grams) Black: Categori...

9219 sym R (3510 sym/30 pcs) 4 img

Data Algorithms I - Homework 2

04.10.2020

Setting Working Directory
setwd("/Users/Kristianto_97/Downloads/MSDA/Fall 2020/DA Algorithms/HW 2")
Packages Used
library(DescTools)
library(MASS)
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:DescTools':
##
##     Recode
Exercise 1: Analysis of Variance The he...

15428 sym R (11784 sym/79 pcs) 8 img

Data Algorithms I - Final

15.12.2020

EXERCISE 1.1 Multiple Linear Regression
lm.birth <- lm(Weight ~ Black + Married + Boy + MomSmoke + Ed + MomAge + MomWtGain + Visit, data = birthweight)
summary(lm.birth)
##
## Call:
## lm(formula = Weight ~ Black + Married + Boy + MomSmoke + Ed +
##     MomAge + MomWtGain + Visit, data = birthweight)
##
## Residuals:
## Min 1Q Me...

5828 sym R (9193 sym/31 pcs) 1 img

DA Visualization and Communication - Homework 1

15.12.2020

In this homework, you will use the diamonds data set from the ggplot2 package. It is automatically loaded when you execute library(ggplot2), so you don’t have to load it separately. Please create a duplicate of diamonds and use that for the homework. This will avoid corrupting the original data set. Make sure that you understand the variables in the data by...

1446 sym R (2811 sym/11 pcs) 9 img

Data Algorithms I – Homework 4 (Logistic Regression)

15.12.2020

Exercise 1A The liver data set is a subset of the ILPD (Indian Liver Patient Dataset) data set. It contains the first 10 variables described on the UCI Machine Learning Repository and a LiverPatient variable (indicating whether or not the individual is a liver patient). People with active liver disease are coded as LiverPatient=1 and people withou...

13099 sym R (16696 sym/57 pcs) 8 img

Data Algorithms I – Homework 3 (Linear Regression)

15.12.2020

Exercise 1A We would like to investigate the relationships between Cholesterol, Weight and/or Blood Pressure. The data set heart.csv contains Weight, Diastolic blood pressure, Systolic blood pressure and Cholesterol for living subjects. The medical director at your company wants to know if Weight alone can predict the Cholesterol outcome. Consi...

12097 sym R (5789 sym/24 pcs) 6 img

Data Algorithms II - Assignment 1 (Statistical Analysis)

27.01.2021

Question 8 This exercise relates to the College data set, which can be found in the file College.csv. It contains a number of variables for 777 different universities and colleges in the US. 8A: Use the read.csv() function to read the data into R. Call the loaded data college. Make sure that you have the directory set to the correct location for...

7549 sym R (12886 sym/58 pcs) 12 img

Data Algorithms II - Assignment 2

19.02.2021

Question 2 Carefully explain the differences between the KNN classifier and KNN regression methods. Although these two are quite similar, the KNN classifier is used to solve classification problems (qualitative response) by assigning the most common group among the K nearest neighbors. KNN regression is used to make a quantitative estimate by averaging the ...

5705 sym R (10593 sym/37 pcs) 4 img
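
The distinction drawn in the Assignment 2 preview above (majority vote for a qualitative response, averaging for a quantitative one) can be illustrated with a minimal base-R sketch. This is not the code from the assignment: the toy data and the helper names `nearest_k`, `knn_classify` and `knn_regress` are hypothetical and exist only for illustration.

```r
# Toy one-dimensional training data (made up for illustration only)
train_x     <- c(1, 2, 3, 6, 7, 8)
train_class <- c("a", "a", "a", "b", "b", "b")   # qualitative response
train_y     <- c(1, 2, 3, 6, 7, 8)               # quantitative response

# Indices of the k training points closest to the query point x0
nearest_k <- function(x0, x, k) {
  order(abs(x - x0))[seq_len(k)]
}

# KNN classifier: return the most common class among the k neighbors
knn_classify <- function(x0, x, labels, k) {
  votes <- table(labels[nearest_k(x0, x, k)])
  names(votes)[which.max(votes)]
}

# KNN regression: return the mean response of the k neighbors
knn_regress <- function(x0, x, y, k) {
  mean(y[nearest_k(x0, x, k)])
}

pred_class <- knn_classify(2.5, train_x, train_class, k = 3)
pred_value <- knn_regress(2.5, train_x, train_y, k = 3)
print(pred_class)
print(pred_value)
```

Both helpers share the same neighbor search; only the final aggregation step differs, which is the point the answer makes.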

Time Series Analysis

17.04.2021

Exercise 1: ACF and PACF from simulated data under an ARMA(p,q) model. For simulated stationary data (n=200) under the different ARMA models listed below, (i) plot the observations, (ii) plot the ACF and PACF, and (iii) describe what you observe from the ACF and PACF. a: Two sets of simulated stationary data under an AR(3) model by specifying coefficients of...

10537 sym R (9300 sym/37 pcs) 29 img
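
The workflow described in the Time Series Analysis preview above (simulate n=200 stationary observations, then inspect the ACF and PACF) might look roughly like the sketch below, using `arima.sim()`, `acf()` and `pacf()` from base R's stats package. The AR(3) coefficients here are an assumption chosen only so that the process is stationary; the preview is truncated before it states the coefficients actually used.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Hypothetical AR(3) coefficients (NOT taken from the exercise).
# The absolute values sum to 0.9 < 1, which is sufficient for stationarity.
ar_coefs <- c(0.5, -0.3, 0.1)

# Simulate n = 200 observations from the AR(3) model
sim_ar3 <- arima.sim(model = list(ar = ar_coefs), n = 200)

# Compute the sample ACF and PACF without drawing them
acf_vals  <- acf(sim_ar3, lag.max = 20, plot = FALSE)
pacf_vals <- pacf(sim_ar3, lag.max = 20, plot = FALSE)

# Partial autocorrelations at lags 1 to 5
print(round(pacf_vals$acf[1:5], 3))
```

Calling `plot(sim_ar3)`, `acf(sim_ar3)` and `pacf(sim_ar3)` draws the three plots the exercise asks for. For an AR(p) process the PACF is expected to cut off after lag p while the ACF tails off, although with only 200 observations the sample estimates can be noisy.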