Publications by Juan Osorio

First R markdown project

05.06.2020

1. Code for reading in the dataset and/or processing the data: The data is stored in the activity.csv file inside the data folder. This file was imported into R using the read.csv function and stored in the dat variable. After that, the dat data frame was processed when required into other data frames, which were used to answer a specific que...

2588 sym R (1994 sym/10 pcs) 4 img
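
The import step described in the excerpt above can be sketched as follows. This is a minimal illustration, not the author's code: the sample rows, the column names (steps, date, interval), and the derived data frame daily are assumptions, and a temporary file stands in for data/activity.csv so the snippet runs on its own.

```r
# Hypothetical sample standing in for data/activity.csv.
# Column names and values are assumptions, not taken from the original project.
sample_path <- tempfile(fileext = ".csv")
writeLines(c("steps,date,interval",
             "NA,2012-10-01,0",
             "12,2012-10-02,5",
             "30,2012-10-02,10"), sample_path)

# read.csv (lowercase) is the base R reader; R function names are case sensitive.
dat <- read.csv(sample_path, stringsAsFactors = FALSE)

# Derive a second data frame for one specific question,
# here the total number of steps per day (rows with NA steps are dropped).
daily <- aggregate(steps ~ date, data = dat, FUN = sum)
print(daily)
```

Keeping dat untouched and deriving a separate data frame per question makes each answer reproducible from the same raw import.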

Presentation Most Populous Cities App

24.06.2020

24/6/2020 Overview This presentation contains a plot of the most populous cities in the world according to the data from Simple Maps. The population is rescaled to millions and labeled accordingly. The plot is made with the leaflet package. The whole code (including the markdown file) is available on my GitHub. Data Processing From the ful...

1495 sym R (198 sym/1 pcs) 1 tbl
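
The rescaling and labeling mentioned in the excerpt above might look roughly like this in base R. It is only a sketch: the data frame cities, its columns, and its values are hypothetical placeholders rather than Simple Maps data, and the mapping step with leaflet is left out.

```r
# Hypothetical rows; the column names (city, population) and the values
# are illustrative assumptions, not taken from the Simple Maps data.
cities <- data.frame(
  city = c("A", "B", "C"),
  population = c(1500000, 12000000, 800000),
  stringsAsFactors = FALSE
)

# Rescale the raw population counts to millions.
cities$pop_millions <- cities$population / 1e6

# Order from most to least populous.
cities <- cities[order(-cities$population), ]

# Build a text label per city, suitable for a map popup or hover text.
cities$label <- sprintf("%s: %.1f million", cities$city, cities$pop_millions)
print(cities$label)
```

The resulting label column could then be passed to a plotting package as popup or hover text.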

Activity Predictions Using the Weight Lifting Exercises Dataset

19.06.2020

Summary Based on a dataset provided by HAR, we will try to predict the activity that was performed using original data from 159 covariates and 1 predictor. The steps taken to address the problem are: 1. Process the data to make it easy to analyze 2. Perform an exploratory data analysis on the data to find relationships between covariates 3. Selecti...

5385 sym R (7813 sym/19 pcs) 3 img 3 tbl

Predictive Text App

15.07.2020

July 15, 2020 Predictive Text APP This is my predictive text app. It trains the model with the data from the HC Corpora English corpus. The model was chosen to provide a good balance between accuracy and speed. The three main parts are: Model, Algorithm, Shiny Web App. This project will continue to be developed by the author. So be patient ...

1996 sym

Presentation 500 Most Populous Cities

25.06.2020

25/6/2020 Overview This presentation contains a plot of the 500 most populous cities in the world according to the data from Simple Maps. The population is rescaled to millions and colored accordingly. The plot is made with the plotly package. The whole code (including the markdown file) is available on my GitHub. Data Processing Code used...

418 sym R (403 sym/1 pcs)

Regression Models Project

15.06.2020

Summary In this document we analyze data from various cars. The main interest is to explore the relationship between MPG (miles per gallon) and the type of transmission (Manual or Automatic). Two questions are expected to be addressed: 1. Is an automatic or manual transmission better for MPG? 2. Quantify the MPG difference between...

4041 sym R (6047 sym/19 pcs) 3 img
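
The two questions in the excerpt above could be approached with a simple linear model. The sketch below assumes the built-in mtcars dataset (the excerpt does not name its data source), where am is coded 0 for automatic and 1 for manual; it is an unadjusted comparison and not necessarily the model the author chose.

```r
# Assumption: the built-in mtcars dataset stands in for the cars data.
data(mtcars)

# Label the transmission type; in mtcars, am = 0 is automatic and 1 is manual.
mtcars$transmission <- factor(mtcars$am, levels = c(0, 1),
                              labels = c("Automatic", "Manual"))

# Unadjusted model: mean MPG by transmission type.
fit <- lm(mpg ~ transmission, data = mtcars)

# The slope is the difference in mean MPG, manual minus automatic.
mpg_diff <- unname(coef(fit)["transmissionManual"])
print(mpg_diff)

# A 95% confidence interval quantifies the uncertainty of that difference.
print(confint(fit)["transmissionManual", ])
```

Because this model ignores other variables such as weight or horsepower, the difference it reports may shrink once those are included as covariates.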

HC Corpora - Exploratory Analysis

07.07.2020

Synopsis This document is the result of an exploratory analysis of the HC Corpora English data. This data will later be used to create a predictive text model. The main objective is to understand the corpus and get some basic statistical information and features. The HC Corpora data consists of four corpora, each of a different language (English,...

4743 sym R (112 sym/2 pcs) 1 img