Publications by Mustafa Arslan
Principal Component Analysis - Air Pollution Data set
Objective: Perform a principal component analysis on the Air Pollution data set and make an inference. Data: Air Pollution Data Set: This data set can be downloaded through the R package HSAUR2 and has 41 observations of 7 variables. These variables are: SO2: SO2 content of air in micrograms per cubic metre; Temp: average annual temperature in °F; Ma...
5878 sym R (6941 sym/58 pcs) 9 img
Principal Component Analysis - Mali Farm Data set
Objective: Perform a principal component analysis on the Mali Farm data set and make an inference. Data: Mali Farm Data Set: Data from a survey of 76 farmers in the Sikasso region of Mali (West Africa) covering, in columns 1 through 9, family size (Family), distance in km to the nearest passable road (DistRd), hectares of cotton, maize, sorghum, millet p...
4183 sym R (4468 sym/52 pcs) 10 img
Principal Component Analysis - Flea Beetle Data set
Objective: Carry out a principal component analysis on the Flea Beetle data set and make an inference. Data: There are two Flea Beetle data sets. The first is called Haltica-Oleracea and the second Haltica-Carduorum. Flea Beetle Data Set: A data frame with 39 observations; 19 from Haltica oleracea and 20 from H. carduorum (den...
5047 sym R (3013 sym/39 pcs) 6 img
Exploratory Factor Analysis - Work-Force Data set
Introduction In some areas, the concepts of primary interest, such as intelligence or social class, cannot be measured directly; these are called “latent variables”. In such cases, we collect information that can be measured and observed, and assume that these variables are good indicators of the concept of primary interest. These var...
6068 sym R (7038 sym/51 pcs) 3 img
Cluster Analysis Part I
Introduction In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types: Agglomerative: This is a “bottom-up” approach: each observation starts in it...
4182 sym R (4397 sym/30 pcs) 10 img
Cluster Analysis Part III
Introduction This is Part III of the hierarchical clustering analysis. In this part, I add an outlier to the data set and observe whether it makes any difference. Note that you can find the other parts at the links below: Part I covers question 1; Part II covers question 2; Part III covers question 3; Part IV covers question 4. Data: The d...
3068 sym R (4742 sym/34 pcs) 10 img
Cluster Analysis Part IV
Introduction This is Part IV of the hierarchical clustering analysis. In this part, I compare each result with K-means. Note that you can find the other parts at the links below: Part I covers question 1; Part II covers question 2; Part III covers question 3; Part IV covers question 4. Data: The data file Ramusbone length records...
3616 sym R (6377 sym/64 pcs) 6 img
Analysis of Repeated Data
Introduction Linear mixed models allow us to handle repeated measurements, such as those from a clinical trial in which individuals are assigned to different groups and a response variable of interest is recorded on different days. Repeated data generally have random effects and fixed effects. Fixed effects are the factors of interest that we manipu...
2256 sym R (4633 sym/33 pcs) 2 img
Logistic Regression 1
Introduction Logistic regression, also called a logit model, is used to model a binary outcome variable. In the logit model, the log odds of the outcome are modeled as a linear combination of the predictor variables. Logistic regression is a useful analysis method for classification problems. Logistic regression models provide us with the probabilities, i...
3288 sym R (8262 sym/53 pcs) 1 img
Logistic Regression 2
Introduction Data: The SKP_FashionBig data set contains 1000 observations of 4 variables. The predictor variables are Age, Income and Months_subbed, and the response variable is Upgrade. Objective: The company wants to understand the effect of age, income and length of subscription on upgrading to the premium fashion service and to make predictions for ...
1706 sym R (12123 sym/66 pcs) 7 img