Publications by Mustafa Arslan

Principal Component Analysis - Air Pollution Data set

26.08.2021

Objective: Perform a principal component analysis on the Air Pollution data set and make an inference. Data: Air Pollution Data Set: This data set can be downloaded through the R HSAUR2 package and has 41 observations of 7 variables. These variables are: SO2: SO2 content of air in micrograms per cubic metre; Temp: average annual temperature in °F; Ma...

5878 sym R (6941 sym/58 pcs) 9 img

Principal Component Analysis - Mali Farm Data set

22.08.2021

Objective: Perform a principal component analysis on the Mali Farm data set and make an inference. Data: Mali Farm Data Set: Data from a survey of 76 farmers in the Sikasso region of Mali (West Africa) on (columns 1 through 9) family size (Family), distance in km to nearest passable road (DistRd), hectares of cotton, maize, sorghum, millet p...

4183 sym R (4468 sym/52 pcs) 10 img

Principal Component Analysis - Flea Beetle Data set

14.08.2021

Objective: Carry out a principal component analysis on the Flea Beetle data set and make an inference. Data: There are two Flea Beetle data sets. The first is called Haltica-Oleracea and the second is called Haltica-Carduourum. Flea Beetle Data Set: A data frame with 39 observations; 19 from Haltica oleracea and 20 from H. carduourum (den...

5047 sym R (3013 sym/39 pcs) 6 img

Exploratory Factor Analysis - Work-Force Data set

31.08.2021

Introduction In some areas we cannot directly measure the concepts of primary interest, such as intelligence or social class; these are called “latent variables”. In such cases, we collect information that can be measured and observed, and assume that these variables might be the best indicators of the concept of primary interest. These var...

6068 sym R (7038 sym/51 pcs) 3 img

Cluster Analysis Part I

02.09.2021

Introduction In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types: Agglomerative: This is a “bottom-up” approach: each observation starts in it...

4182 sym R (4397 sym/30 pcs) 10 img
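
The "bottom-up" (agglomerative) strategy mentioned in the Cluster Analysis Part I teaser can be sketched in a few lines. This is only a minimal illustration, not the code from the post: it is written in Python rather than R, it assumes single linkage on one-dimensional points, and the function name `agglomerative` and the sample values are hypothetical.

```python
def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering on 1-D points.

    Hypothetical helper for illustration; not taken from the original post.
    """
    # Bottom-up start: one singleton cluster per observation.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance,
        # i.e. the smallest gap between any two of their members.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up sample values, chosen so that three groups are visible.
result = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], 3)
```

Stopping at a fixed number of clusters is one possible design choice; recording every merge and its distance instead would give the full hierarchy that a dendrogram displays.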

Cluster Analysis Part III

03.09.2021

Introduction This is Part III of the hierarchical clustering analysis. In this part, I add an outlier to the data set and observe whether it makes any difference. Note that you can find the other parts from the links below: Part I covers question 1; Part II covers question 2; Part III covers question 3; Part IV covers question 4. Data: The d...

3068 sym R (4742 sym/34 pcs) 10 img

Cluster Analysis Part IV

03.09.2021

Introduction This is Part IV of the hierarchical clustering analysis. In this part, I compare each result with K-means. Note that you can find the other parts from the links below: Part I covers question 1; Part II covers question 2; Part III covers question 3; Part IV covers question 4. Data: The data file Ramusbone length records...

3616 sym R (6377 sym/64 pcs) 6 img

Analysis of Repeated Data

08.09.2021

Introduction Linear mixed models allow us to handle repeated measurements, such as those from a clinical trial in which individuals are assigned to different groups and a response variable of interest is recorded on different days. Repeated data generally have fixed effects and random effects. Fixed effects are the factors of interest that we manipu...

2256 sym R (4633 sym/33 pcs) 2 img

Logistic Regression 1

12.09.2021

Introduction Logistic regression, also called a logit model, is used to model a binary outcome variable. In the logit model, the log odds of the outcome are modeled as a linear combination of the predictor variables. Logistic regression is a useful analysis method for classification problems. Logistic regression models provide us with the probabilities, i...

3288 sym R (8262 sym/53 pcs) 1 img
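
The Logistic Regression 1 teaser says the log odds are a linear combination of the predictors. A small sketch shows how that translates into a probability. It is written in Python rather than the R used in the post, and the intercept, coefficients, and predictor values are made-up numbers for illustration, not estimates from any of the data sets listed here.

```python
import math


def log_odds(intercept, coefficients, predictors):
    """Linear predictor: intercept + sum of coefficient * predictor value."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


def probability(logit):
    """Inverse logit (sigmoid): maps log odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical values, chosen only to make the arithmetic easy to follow.
intercept = -1.0
coefficients = [0.5, 0.25]
predictors = [2.0, 4.0]

logit = log_odds(intercept, coefficients, predictors)  # -1.0 + 1.0 + 1.0
p = probability(logit)

# Taking log(p / (1 - p)) recovers the linear predictor, which is the
# sense in which the model is linear on the log-odds scale.
recovered = math.log(p / (1.0 - p))
```

For classification, a common convention is to predict the positive class when `p` exceeds a chosen threshold such as 0.5, which is the same as the log odds being positive.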

Logistic Regression 2

12.09.2021

Introduction Data: The SKP_FashionBig data set contains 1000 observations of 4 variables. The predictor variables are Age, Income, and Months_subbed, and the response variable is Upgrade. Objective: The company wants to understand the effect of age, income, and length of subscription on upgrading to the premium fashion service and make predictions for ...

1706 sym R (12123 sym/66 pcs) 7 img