Publications by Ivo Pinheiro

Capstone Slide Deck

09.11.2024

Next Word Prediction Capstone Project. Ivo Pinheiro, 2024-11-09. Background: My goal was to create a simple Shiny web app that lets the user input a sequence of words and outputs a prediction of the 3 most likely next words. The foundation of my text prediction app is an n-gram model that relies on: the frequencies of n-grams, that is, se...

1305 sym 1 img

Milestone Report (Capstone Project - Module 2)

28.09.2024

This is my milestone report for the Capstone Project (Module 2). The final goal of this project is to create a text prediction algorithm and a Shiny app. This report, however, only covers some initial steps: loading the 3 data sets and counting how many lines each one has, sampling 1000 lines from each data set and turning text files into individual ...

1935 sym Python (3923 sym/25 pcs) 3 img

Shiny app Pitch (Data Science Specialisation)

17.09.2024

App Pitch for Coursera Project Reproducible Pitch Presentation. My Shiny app analyses text and outputs a summary analysis. Warning: if you are a fan of The Beatles, there’s an Easter egg. How It Works: The app is very simple to use. The user inputs some text. The app returns the number of characters in the text, the number of words, ...

1341 sym

My Limits

14.06.2024

About a year and a half ago, I started learning about data science in my free time. But what have I actually learned about data? Am I even good at it? How I Started: I started learning online, where there’s no shortage of great and free resources. I read Wickham’s R for Data Science (https://r4ds.hadley.nz). I downloaded RStudio, completed the...

5199 sym R (619 sym/2 pcs)

Capstone Project

19.05.2024

Introduction: This case study was produced as part of the final project for the Google Data Analytics Professional Certificate on Coursera. It follows the data analysis phases taught during the course (ASK, PREPARE, PROCESS, ANALYZE, SHARE, ACT). Should you want to reproduce my results, you can, as this report includes all the steps taken to rea...

5334 sym R (19371 sym/27 pcs) 3 img 2 tbl

My Storm Data Analysis for Coursera

26.03.2024

Synopsis: This analysis focuses on the most harmful events with respect to population health and economic consequences. The variables that measure population health are FATALITIES and INJURIES, whereas economic consequences are accounted for by PROPDMG. My analysis, therefore, was very straightforward. I set out to find out what were the top 10 most dama...

1203 sym R (40919 sym/23 pcs) 1 img