Publications by Juan Osorio
First R markdown project
1. Code for reading in the dataset and/or processing the data The data is stored in the activity.csv file inside the data folder. This file was imported into R using the read.csv function and stored in the dat variable. After that, the dat data frame was processed, when required, into other data frames that were used to answer a specific que...
2588 sym R (1994 sym/10 pcs) 4 img
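The import step described in this summary could be sketched roughly as follows. This is a minimal illustration, not the author's code: the column names (steps, date, interval), the sample values, and the derived daily data frame are assumptions, and a small stand-in file is written to a temporary folder so the snippet runs on its own.

```r
# Hypothetical stand-in for data/activity.csv; columns and values are assumed.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
csv_path <- file.path(data_dir, "activity.csv")
write.csv(
  data.frame(
    steps    = c(NA, 12, 7),
    date     = c("2012-10-01", "2012-10-02", "2012-10-02"),
    interval = c(0, 0, 5)
  ),
  csv_path,
  row.names = FALSE
)

# Import the file into `dat` with read.csv (note the lowercase function name).
dat <- read.csv(csv_path, stringsAsFactors = FALSE)

# Derive a second data frame for one specific question, here total steps
# per day. The formula interface of aggregate() drops rows with NA steps.
daily <- aggregate(steps ~ date, data = dat, FUN = sum)
```

Keeping dat untouched and building a separate data frame per question mirrors the workflow the summary describes.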
Presentation Most Populous Cities App
24/6/2020 Overview This presentation contains a plot of the most populous cities in the world according to the data from Simple Maps. The population is rescaled to millions and labeled accordingly. The plot is made with the leaflet package. The whole code (including the markdown file) is available on my GitHub. Data Processing From the ful...
1495 sym R (198 sym/1 pcs) 1 tbl
Activity Predictions Using the Weight Lifting Exercises Dataset
Summary Based on a dataset provided by HAR, we will try to predict the activity that was performed using original data from 159 covariates and 1 predictor. The steps taken to address the problem are: 1. Process the data to make it easy to analyze 2. Perform an exploratory data analysis on the data to find relationships between covariates 3. Selecti...
5385 sym R (7813 sym/19 pcs) 3 img 3 tbl
Predictive Text App
July 15, 2020 Predictive Text APP This is my predictive text app. It trains the model with the data from the HC Corpora English corpus. The model was chosen to provide a good balance between accuracy and speed. The three main parts are: Model Algorithm Shiny Web App This project will continue to be developed by the author. So be patient ...
1996 sym
Presentation 500 Most Populous Cities
25/6/2020 Overview This presentation contains a plot of the 500 most populous cities in the world according to the data from Simple Maps. The population is rescaled to millions and colored accordingly. The plot is made with the plotly package. The whole code (including the markdown file) is available on my GitHub. Data Processing Code used...
418 sym R (403 sym/1 pcs)
Regression Models Project
Summary In this document we analyze data from various cars, with a particular interest in the relationship between MPG (miles per gallon) and the type of transmission (Manual or Automatic). Two questions are expected to be addressed: 1. Is an automatic or manual transmission better for MPG? 2. Quantify the MPG difference between...
4041 sym R (6047 sym/19 pcs) 3 img
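The two questions in this summary can be illustrated with a simple linear model. This is a sketch under assumptions: the summary does not name the dataset, so the built-in mtcars data (where am is 0 for automatic and 1 for manual) is used as a stand-in, and the author's actual model may include more covariates.

```r
# Assumption: mtcars stands in for the "data from various cars".
data(mtcars)

# Recode the transmission indicator as a labeled factor.
cars <- transform(
  mtcars,
  am = factor(am, levels = c(0, 1), labels = c("Automatic", "Manual"))
)

# Unadjusted model: the intercept is the mean MPG of automatic cars and the
# amManual coefficient is the manual-minus-automatic difference in mean MPG.
fit <- lm(mpg ~ am, data = cars)
diff_mpg <- coef(fit)[["amManual"]]

# A confidence interval for the difference helps quantify the uncertainty.
ci <- confint(fit)["amManual", ]
```

Because this model ignores variables such as weight and horsepower, diff_mpg is only the raw difference in group means; an adjusted model could give a different answer.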
HC Corpora - Exploratory Analysis
Synopsis This document is the result of an exploratory analysis of the HC Corpora English data. This data will later be used to create a predictive text model. The main objective is to understand the corpus and obtain some basic statistical information and features. The HC Corpora data consists of four corpora, each in a different language (English,...
4743 sym R (112 sym/2 pcs) 1 img