Publications by Janis Corona

UFC fighter analytics, ML on hits landed from regex text extraction of descriptive actions

26.03.2020

This script was edited months later to fix a regex error that selected a knee to the body as a knee strike, so the counts are now accurate. Updated 2/11/2020. The first 50 samples of Wolfey (Mazvidal) are used as the testing set to check the prediction accuracy of hits landed, compared with VulfenSarah (Nunez) hits landed. library(ca...

11725 sym R (73372 sym/93 pcs) 17 img
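
The regex correction mentioned in this entry can be illustrated with a small sketch. The action strings, the two category names, and the patterns below are all assumptions made for illustration and are not taken from the original script; the point is only that a negative lookahead keeps "knee to the body" from being counted as a generic knee strike.

```r
# Hypothetical action descriptions; the real script's text is not shown here.
actions <- c(
  "lands a knee to the body",
  "lands a knee",
  "knee to the body then a jab",
  "throws a knee to the head"
)

# Knee aimed at the body ("the" is optional in the wording).
knee_to_body <- grepl("knee to (the )?body", actions, ignore.case = TRUE)

# Any other knee: "knee" NOT followed by " to (the) body".
# The negative lookahead requires perl = TRUE.
knee_other <- grepl("knee(?! to (the )?body)", actions,
                    ignore.case = TRUE, perl = TRUE)

knee_counts <- c(knee_to_body = sum(knee_to_body),
                 knee_other   = sum(knee_other))
print(knee_counts)
```

Without the lookahead, a plain "knee" pattern would match all four strings and inflate the generic knee count.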

Coronavirus liver tumor and blood capillary samples analyzed for CNVs and such

25.03.2020

These samples are the headers added from three Gene Expression Omnibus studies at ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89166, ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89160, and ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE100509. The first two series are part of the same study, which used human liver tumor samples in vitro to compare the effects of...

10570 sym R (31730 sym/79 pcs) 2 img

grabbing the stocks available from yahoo to Analyze and calculating counts decreasing and increasing

25.03.2020

This script shows how to grab thousands of available stocks, then select one stock from the set available, and count the number of days in a set time period, if available, that the stock increases and separately decreases based on the stock value a set number of days (lag value) earlier. It then uses that lag value to predict whether the stock wi...

12299 sym R (131144 sym/190 pcs)
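
The counting step described in this entry, comparing each day's value with the value a set number of days earlier, might look roughly like the sketch below. The price vector, the lag of 2, and the function name `count_lag_moves` are hypothetical; the original script's data source and naming are not shown in the excerpt.

```r
# Hypothetical closing prices; real data would come from a price download.
close_prices <- c(10, 11, 9, 12, 12, 13, 8, 14)
lag_n <- 2  # assumed lag value in days

count_lag_moves <- function(prices, lag_n) {
  stopifnot(is.numeric(prices), lag_n >= 1, length(prices) > lag_n)
  # diff(x, lag = n) returns x[i + n] - x[i] for every valid i,
  # i.e. today's value minus the value lag_n days earlier.
  change <- diff(prices, lag = lag_n)
  c(increased = sum(change > 0),
    decreased = sum(change < 0),
    unchanged = sum(change == 0))
}

counts <- count_lag_moves(close_prices, lag_n)
print(counts)
```

Days with no value `lag_n` days earlier are simply not counted, which is one way to read the "if available" condition in the description.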

Uterine Leiomyoma Beadchip Gene Expressions MySQL and DT package

25.03.2020

This script uses MySQL through the RMariaDB package. The approach to using MySQL within RStudio comes from: https://programminghistorian.org/en/lessons/getting-started-with-mysql-using-r library(RMariaDB) library(tidyverse) library(DT) library(stringr) UL_data <- read.csv('UL_nonUL_beadchip_stats.csv', sep=',',header=TRUE, na.strings=c('',' ','NA')) ...

1033 sym R (13312 sym/15 pcs)

Uterine Leiomyoma Beadchip CNVs and FCs with ML to Predict Gene Targets

24.03.2020

This is to re-examine the UL and non-UL samples from the Gene Expression Omnibus online data repository (GEO) for genotypes in the ULs compared to those samples without tumor tissue in them. The accession ID for the series is GSE95101 and for the platform is GPL13376. Let's look at some of these copy number variants of one gene with seven copy num...

13094 sym R (97693 sym/145 pcs) 5 img

Subset of Stock with ML for Days Increasing or Decreasing as Target

24.03.2020

This script takes the ALL_65_stocks_countGroups_date1_date2_lagN.csv file made in the matrix_automateDateLagStocks2.Rmd script, where date1 is the requested date, date2 is the date the script ran, and lagN is the number of days N used for the lag counts. This script runs known algorithms on that count/lag table for machine learning results using th...

9338 sym R (3077225 sym/283 pcs) 8 img

part2 of Lyme Disease document with GSE145974 data

29.08.2020

This is the fold change by median values instead of by mean values to see if the accuracy in classification improves. Some samples might just need to be removed, as there are about 4-5 in different classes where some genes are greatly overexpressed relative to their neighbors in each class. This is the median fold change version of LymeDisease...

19423 sym R (60718 sym/233 pcs) 1 img
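
To show why a median-based fold change can behave differently from a mean-based one when a few samples are strongly overexpressed, here is a minimal sketch. The expression matrix, the gene and sample names, and the helper `fold_change` are invented for illustration and do not come from the GSE145974 data or the original script.

```r
# Hypothetical expression matrix: rows are genes, columns are samples.
expr <- matrix(
  c(2, 4, 6, 100,   1,  2,  3,  2,
    5, 5, 5,   5,  10, 10, 10, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"),
                  c(paste0("case", 1:4), paste0("ctrl", 1:4)))
)
case_cols <- 1:4
ctrl_cols <- 5:8

# Fold change per gene: summary of cases divided by summary of controls.
fold_change <- function(expr, case_cols, ctrl_cols, stat = median) {
  case_stat <- apply(expr[, case_cols, drop = FALSE], 1, stat)
  ctrl_stat <- apply(expr[, ctrl_cols, drop = FALSE], 1, stat)
  case_stat / ctrl_stat
}

fc_median <- fold_change(expr, case_cols, ctrl_cols, stat = median)
fc_mean   <- fold_change(expr, case_cols, ctrl_cols, stat = mean)
print(rbind(median = fc_median, mean = fc_mean))
```

In this toy data the single outlier sample pulls the mean fold change of geneA far above the median one, while geneB, which has no outlier, gets the same value either way.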

COVID-19 severity graded cases and gene expression analysis from GSE152418 of NCBI's GEO

13.08.2020

The accession number for this data is GSE152418, and it can be downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE152418 with the text metadata information shown immediately below and the RAW data values of gene expression. No need to add the platform information, because the RAW data has the gene name attached to the Raw RNA gene ex...

10271 sym R (100569 sym/245 pcs)

Genecards.org gene grabs for proteins interested in researching

26.07.2020

This script can work in combination with another function I made in a separate post, which returns the fold change on gene expression samples, by returning a list of the top 25 genes for a protein such as ‘androgen’ from genecards.org. That function is named find25genes() and has one character argument of the protein you want the top 25 ...

5516 sym R (8823 sym/28 pcs)

LMT state by state comparison with number of jobs, pay, comparative jobs and demographics and home value per state

14.07.2020

This script pulls together other tables created from related scripts for extracting Zillow two-bedroom home values for May 2020, the number of jobs available as of July 2020, and hourly and annual average pay per job compared to licensed massage therapists (LMTs) in each of the 50 US states, not including the new District of Columbia state added t...

24774 sym R (474017 sym/264 pcs) 20 img