Publications by Robert Vidigal, PhD
Data Structures in Python
Data structures are fundamental constructs in Python that are used to store and organize data. Some of the most common basic data structures in Python include lists, dictionaries, and sets. These data structures can be used to solve a wide variety of problems, and they are essential for efficient programming. 3.1: Lists List: a mutable data struct...
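A minimal sketch of the three structures named above. The variable names and values are illustrative assumptions and are not taken from the tutorial itself.

```python
# Illustrative values only; none of these names come from the tutorial.

# List: ordered, mutable, and allows duplicate items.
fruits = ["apple", "banana", "apple"]
fruits.append("cherry")

# Dictionary: maps unique keys to values.
prices = {"apple": 0.5, "banana": 0.25}
prices["cherry"] = 2.0

# Set: unordered collection that keeps only unique items.
unique_fruits = set(fruits)
```

Building the set from the list is a common way to drop duplicates when order does not matter.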
Classes & Methods in Python
Correctly declare a class object. Understand the use of the __init__() method and the self parameter. Recognize the difference between functions and methods, and make method calls. Understand how the level of inheritance affects method calls and variables. # LibraryBook is the name of the class class LibraryBook: pass # pass indicates that the body/suit...
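The learning goals above can be sketched in a few lines. Only the class name LibraryBook appears in the preview; the attributes, the ReferenceBook subclass, and the describe_book function are hypothetical and added purely for illustration.

```python
class LibraryBook:
    def __init__(self, title, author):
        # __init__ runs when an instance is created; self is that instance.
        self.title = title
        self.author = author
        self.checked_out = False

    def describe(self):
        # A method: a function defined in the class and called on an instance.
        return f"{self.title} by {self.author}"

    def check_out(self):
        self.checked_out = True
        return f"{self.title} is checked out"


class ReferenceBook(LibraryBook):
    # Hypothetical subclass: inherits describe() but overrides check_out().
    def check_out(self):
        return f"{self.title} cannot leave the library"


def describe_book(book):
    # A plain function: takes the object as an ordinary argument.
    return book.describe()


novel = LibraryBook("Dune", "Frank Herbert")
atlas = ReferenceBook("World Atlas", "Staff")
```

Calling atlas.check_out() uses the subclass version, while atlas.describe() falls back to the method defined on LibraryBook.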
YouTube Takeout Data
The code first unzips the YouTube takeout data into a directory. Then, it uses the YouTube API to extract the metadata for each video in the directory. The metadata includes the video title, description, length, views, and likes. The code then saves the metadata to a CSV file. """ @author: Robert Vidigal, PhD (CSMaP-NYU) """ # Unzipping YouTube fi...
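A rough sketch of the unzip and save steps described above, using only the Python standard library. The function names, the file layout, and the CSV column names are assumptions; the YouTube API call is deliberately left out, so the rows passed to save_metadata would have to come from whatever API client the original code uses.

```python
import csv
import zipfile
from pathlib import Path


def unzip_takeout(zip_path, out_dir):
    """Extract a takeout archive and return the extracted file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
    # Paths are returned relative to out_dir, with forward slashes.
    return sorted(p.relative_to(out).as_posix() for p in out.rglob("*") if p.is_file())


def save_metadata(rows, csv_path):
    """Write one row per video; column names are assumed, not from the source."""
    fields = ["title", "description", "length", "views", "likes"]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

Keeping extraction and CSV writing in separate functions makes it easy to slot the metadata lookup between them.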
Facebook Posts and Page Likes Pull
This Python code pulls user posts and page likes from raw Facebook data extracted through the Facebook Graph API. The code then saves the data to a CSV file. """ @author: Robert Vidigal, PhD (CSMaP-NYU) """ import pandas as pd import glob from tqdm import tqdm # Reading JSON files and converting to CSV def readFiles(lang): outdir='/Users/rb5...
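The "read JSON files and convert to CSV" step can be sketched as follows. This is not the author's readFiles function: it swaps pandas for the standard library so it runs without extra packages, and the function name, the directory arguments, and the assumption that each file holds a list of flat records are all hypothetical.

```python
import csv
import glob
import json
import os


def json_files_to_csv(in_dir, csv_path):
    """Combine every *.json file in in_dir into one CSV; return the row count.

    Assumes each file contains a JSON list of flat (non-nested) records.
    """
    records = []
    for path in sorted(glob.glob(os.path.join(in_dir, "*.json"))):
        with open(path, encoding="utf-8") as handle:
            records.extend(json.load(handle))

    # Use the union of keys so records with missing fields still fit;
    # DictWriter fills absent fields with an empty string.
    fieldnames = sorted({key for record in records for key in record})
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Nested Graph API responses would need flattening before this step, which is one reason the original code may prefer pandas.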
Twitter Posts, Likes, Friends, and Timeline Pull
This Python code extracts the user timeline, likes, posts, and friends (i.e., accounts the user follows) from raw Twitter data obtained from the Twitter API v2. The code then saves the data to a CSV file. This code works with Twitter data in multiple languages. """ @author: rb5286 """ import json import pandas as pd import glob from tqdm import tqdm def readFiles(lan...
Data Types, Subsets, and Merging Data
In this tutorial, we will only use base R functions to manipulate our data. setwd("~/Dropbox/PIBIC2021/data") # Setting Working Directory list.files("~/Dropbox/PIBIC2021/data") # Listing files in data folder list.files("~/Dropbox/PIBIC2021/codebook") # Listing files in codebook folder lapopBR19<-read.csv("LAPOP2019_work.csv", header=T, sep=",") # C...
Objects & Data Classes in R
In R, data can be of different types. The main ones are: integer = e.g., 2L (the L suffix tells R to store the value as an integer), as.integer(); numeric = e.g., 2, 15394.2, etc. (real or decimal numbers), as.numeric(); logical = e.g., TRUE or FALSE (binary), 1 == 1; character/string = e.g., “a”, “Sofia”, etc., as.character(); factor = e.g. ...
Descriptive Statistics in R Robert Vidigal, PhD
Types of research questions: There are two very different types of statistics, each designed to use data to answer different types of research questions. Descriptive statistics – techniques used to describe and condense data. Inferential statistics – techniques used to draw conclusions from data. Descriptive statistics: Goal is to summariz...
AWS S3 Tutorial
Amazon S3 - Cloud Object Storage Tutorial rm(list=ls()) require(aws.s3) require(aws.signature) A detailed description of how credentials can be specified is provided at the link below: browseURL("https://github.com/cloudyr/aws.signature") The easiest way is to simply set environment variables on the command line prior to starting R or via a .Renvir...
Variable Labels in R
Labeling Variables install.packages("labelled") # If you need to INSTALL the package. require(labelled) Full vignette of the ‘labelled’ package here: browseURL("https://cran.r-project.org/web/packages/labelled/vignettes/intro_labelled.html") # A variable label could be specified for any vector using var_label(). var_label(df$caseid_w1) <- "SM...