Publications by Alex Lewis, Diana Murray, Jeffrey Chang, and Philip Moos
DREAM-High Breast Cancer Patient Data_test
In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...
12430 sym 5 img
Heatmaps
Heatmaps are a way to colorize, visualize, and organize a data set with the goal of intuiting relationships among observations and features. We will use heatmaps in this course to find patterns in the gene expression data for the 1K breast cancer patients from The Cancer Genome Atlas. Here, we focus on what heatmaps are and how to create t...
5788 sym R (3088 sym/21 pcs) 10 img
DREAM-High Breast Cancer Patient Data
In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...
12430 sym 5 img
Breast Cancer Cell Lines
About this activity: In this module, we will switch gears and work with data from experiments with human cancer cell lines from the Physical Sciences in Oncology (PS-ON) Cell Line Characterization Study. Cancer cell lines are cancer cells that keep dividing and growing over time, under certain conditions in a laboratory. Cancer cell lines are used i...
10426 sym Python (12253 sym/58 pcs) 4 img
Breast Cancer Expression Data
About this activity: We will load and examine R data frame objects that contain data from over 1,000 breast cancer (BRCA) patients from The Cancer Genome Atlas (TCGA). The objects include: clinical measurements on the patients and the patients’ tumors, such as gender, age, and estrogen, progesterone, and HER2 receptor status. We examined this data in ...
9196 sym 9 img
TCGA Heatmaps and Clustering
About this activity: In the earlier module Breast_Cancer_Expression_Data, we examined the mRNA levels for 18,351 genes across 1,082 breast cancer patients. We saw that the average expression of the genes (across patient samples) varied greatly. By taking the log of the expression values, we compressed the data (or reduced the range of variation) so ...
11134 sym R (8834 sym/55 pcs) 12 img
Predictive Modeling for Cancer Prognosis
About this activity: This module will introduce you to loading and manipulating the data from the breast cancer METABRIC dataset, visualizing and working with gene expression measurements, and building predictive models based on the expression of many different genes (and clinical data too). The METABRIC dataset is from another large breast cancer p...
9123 sym Python (9217 sym/61 pcs) 4 img
DREAM-High Breast Cancer Patient Data
In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...
12409 sym 5 img
Breast Cancer Patient Data
In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...
11608 sym 5 img
Module 4: Patterns in Breast Cancer Gene Expression
About this activity: Last time, we examined the mRNA levels for 18,351 genes across 1,082 breast cancer patients. We saw that the average expression of the genes (across patient samples) varied greatly. By taking the log of the expression values, we compressed the data (or reduced the range of variation) so that the genes were more comparable. Thi...
8910 sym R (7592 sym/35 pcs) 8 img