Publications by finnstats

Social Network Analysis in R

22.04.2021

Social Network Analysis in R: Social Network Analysis (SNA) is the process of exploring social structure using graph theory. It is mainly used for measuring and analyzing the structural properties of a network. It helps to measure social network relationships (Facebook, Twitter likes, comments, follows, etc.), email connectivity, flows b...

2660 sym R (2585 sym/13 pcs) 16 img

Handling missing values in R

23.04.2021

Handling missing values in R: one of the common tasks in data analysis is handling missing values. In R, missing values are usually represented by the symbol NA (not available), although a dataset may use some other value as a missing-value code (e.g., 99). Undefined numeric results (e.g., 0/0) are represented by the symbol NaN (not a number). Handling missing va...

4831 sym R (4763 sym/22 pcs) 6 img 1 tbl

Regression analysis in R-Model Comparison

25.04.2021

Regression analysis in R: the Boston housing data contains a total of 506 observations and 14 variables. In this dataset, medv is the response variable and the remaining variables are the predictors. We want to build a regression model that predicts medv from the other variables. Most of the variables are numeric variables ex...

4037 sym R (8588 sym/23 pcs) 24 img

Timeseries analysis in R

26.04.2021

Timeseries analysis in R: time series is a vast subject in statistics, and here we look at some of its basic functionality with the help of R. The idea is to show how to get started with time series analysis in R. This tutorial goes through areas such as decomposition, forecasting, clustering, and classification. Cluster...

2611 sym R (4293 sym/20 pcs) 22 img

Self Organizing Maps in R- Supervised Vs Unsupervised

27.04.2021

Self-organizing maps (SOMs) are very useful for clustering and data visualization. They are a form of neural network and an elegant way to partition complex data. In this tutorial, we use college admission data for clustering and visualization, and we cover both unsupervised and supervised maps. Self Organizing Maps T...

3461 sym R (3802 sym/16 pcs) 8 img

Logistic Regression R- Tutorial

28.04.2021

Logistic Regression R: in this tutorial we use the student application dataset for logistic regression analysis. Logistic regression is a statistical model that, in its basic form, uses a logistic function to model a binary dependent variable. Here the target (dependent) variable is Admit (0 = No, 1 = Yes) and the remaining vari...

3299 sym R (3212 sym/14 pcs)

KNN Algorithm Machine Learning

29.04.2021

KNN algorithm machine learning: in this tutorial we explain classification and regression problems. Machine learning is a subset of artificial intelligence that gives machines the ability to learn automatically and improve from previous experience without being explicitly programmed. The major part of machine learning is data. Fe...

3807 sym R (5437 sym/24 pcs) 6 img

Customer Segmentation K Means Cluster

01.05.2021

Customer segmentation is the process of separating customers into groups based on common characteristics or patterns so that companies can market their products to each group effectively. In business-to-consumer marketing, most companies segment their customers by age, gender, marital status, location (urban, suburb...

2999 sym R (3249 sym/12 pcs) 4 img

Linear Discriminant Analysis in R

02.05.2021

Linear discriminant analysis was originally developed by R. A. Fisher in 1936 to classify subjects into one of two clearly defined groups. It was later extended to classify subjects into more than two groups. Linear Discriminant Analysis (LDA) is a dimensionality reduction technique. LDA is used for dimensionality reduction, to reduce the number of di...

3896 sym R (2469 sym/17 pcs) 12 img

Random Forest Feature Selection

03.05.2021

Random Forest feature selection: why do we need feature selection? When a dataset has too many features, developing a prediction model such as a neural network takes a lot of time and can reduce the accuracy of the model. One option is the Boruta algorithm, which is based on random forest. How does Boruta work? Suppos...

2990 sym R (5510 sym/20 pcs) 4 img