Publications by Siyang Ni
NGram Predictive Modelling
NGram Predictive Modelling Siyang Ni Overview This project is the capstone project for the Data Science Specialization from Johns Hopkins University. In this project, I demonstrate how to build a simple model that predicts the next word from the preceding one or more words. You can access the app at: https://siyangni.shinyapps.io/ngram_...
3290 sym 1 img
Milestone Report
Predicting Next Words Using N-Gram Models
Author Siyang Ni
Data Info
First, let’s get basic info on each of the text files we are working with:
library(pacman)
p_load(wordpredictor)
p_load(tidyverse)
tweets_path <- ("C:/Users/siyan/Downloads/Coursera-SwiftKey/final/en_US/en_US.twitter.txt")
blogs_path <- ("C:/Users/siyan/Downloads/Coursera-Swif...
682 sym Python (16950 sym/31 pcs) 1 img
Predicting Child's Height
May 9, 2024 Developing Data Products Final Project This is the final project for JHU’s Developing Data Products course. The project contains two parts: an interactive web-hosted app where you can use the mother’s and father’s heights and the child’s sex to predict the child’s height, based on R’s built-in Galton Families dataset. This pres...
832 sym 1 img
R Markdown Presentation and Plotly
May 8, 2024 Demonstration of Visualizing PCA In this example, we extract 3 principal components from the 4 variables in the IRIS dataset. Figure 1: 3D PCA of the IRIS Dataset Demonstration Thank you...
205 sym
SODA501: Tutorial on Panel Data
Introduction Panel data can be difficult to deal with in social science research, because many assumptions we impose on cross-sectional data modelling do not apply to panel data modelling. However, panel data analysis is a powerful tool in the social scientist’s toolbox because it allows us to draw causal conclusions about a research question, where t...
13201 sym R (25856 sym/29 pcs) 8 img
Frequency_table_temp
Value   Frequency  Percentage
cimp5   NA         NA
0       1390       12.10
1       1656       14.41
2       1851       16.11
3       1908       16.61
4       1712       14.90
5       1436       12.50
6       1537       13.38
NA      7753       67.48
crisk5  NA         NA
0       5291       48.32
1       3475       31.74
2       1468       13.41
3       441        4.03
4       164        1.50
5       63         0.58
6       34         0.31
7       11         0.10
8       3          0.03
NA      8293       75.74
cimp6   NA         NA
0       1348       13.71
1       1439       14.63
2       1589       16.16
3       1...
4 sym 1 tbl
EDPSY 557: Homework 3 Part II
library(tidyverse) # for lots of things
setwd("D:\\r\\hlm") # Setup working directory
df <- read.csv("D:\\r\\hlm\\hw3.csv") # Load the data
# Check nuts and bolts
dim(df)
## [1] 80 4
head(df,3)
## classroomid time ontaskbehavior group
## 1 1 0 5 0
## 2 1 1 12 0
## 3 1 ...
10418 sym R (7207 sym/19 pcs) 2 img
EDPSY558: Assignment 3
Setting up the Environment
rm(list=ls()) #Remove all existing objects
setwd("D:/OneDrive/class/EDPSY558/hw3") #Set the working directory
library(tidyverse) #Data Pre-processing
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔...
5765 sym R (12694 sym/13 pcs)