Publications by Alex Lewis, Diana Murray, Jeffrey Chang, and Philip Moos

DREAM-High Breast Cancer Patient Data_test

13.08.2024

In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...

12430 sym 5 img

Heatmaps

09.08.2024

Heatmaps are a way to colorize, visualize, and organize a data set with the goal of intuiting relationships among observations and features. We will use heatmaps in this course to find patterns in the gene expression data for the 1K breast cancer patients from The Cancer Genome Atlas. Here, we focus on what heatmaps are and how to create t...

5788 sym R (3088 sym/21 pcs) 10 img

DREAM-High Breast Cancer Patient Data

09.08.2024

In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...

12430 sym 5 img

Breast Cancer Cell Lines

09.08.2024

About this activity: In this module, we will switch gears and work with data from experiments with human cancer cell lines from the Physical Sciences in Oncology (PS-ON) Cell Line Characterization Study. Cancer cell lines are cancer cells that keep dividing and growing over time, under certain conditions in a laboratory. Cancer cell lines are used i...

10426 sym Python (12253 sym/58 pcs) 4 img

Breast Cancer Expression Data

09.08.2024

About this activity: We will load and examine R data frame objects that contain data from over 1,000 breast cancer (BRCA) patients from The Cancer Genome Atlas (TCGA). The objects include: clinical measurements on the patients and the patients’ tumors, such as gender, age, estrogen, progesterone, and HER2 receptor status. We examined this data in ...

9196 sym 9 img

TCGA Heatmaps and Clustering

09.08.2024

About this activity: In the earlier module Breast_Cancer_Expression_Data, we examined the mRNA levels for 18,351 genes across 1,082 breast cancer patients. We saw that the average expression of the genes (across patient samples) varied greatly. By taking the log of the expression values, we compressed the data (or reduced the range of variation) so ...

11134 sym R (8834 sym/55 pcs) 12 img
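
The log transformation mentioned in the summary above can be illustrated with a minimal sketch. The gene names and expression values here are made up for illustration and are not taken from TCGA, and the use of log2 with a pseudocount of 1 is an assumption (the summary only says "the log"); the module itself uses R, so this Python version is only an analogue.

```python
import math

# Hypothetical expression values (NOT from TCGA) spanning several orders of magnitude.
expression = {
    "GENE_A": [2.0, 8.0, 30.0],
    "GENE_B": [1000.0, 4000.0, 16000.0],
}

def log_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each value.

    The pseudocount (an assumed choice) avoids taking log(0).
    """
    return [math.log2(v + pseudocount) for v in values]

def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

# Pool all values to compare the overall spread before and after the transform.
all_raw = [v for vals in expression.values() for v in vals]
all_log = log_transform(all_raw)

raw_range = value_range(all_raw)
log_range = value_range(all_log)

print(f"raw range: {raw_range:.1f}")
print(f"log2 range: {log_range:.2f}")
```

On these made-up numbers, the raw values differ by thousands while the log-transformed values differ by only about a dozen units, which is the sense in which the transformation makes highly and lowly expressed genes more comparable.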

Predictive Modeling for Cancer Prognosis

09.08.2024

About this activity: This module will introduce you to loading and manipulating the data from the breast cancer METABRIC dataset, visualizing and working with gene expression measurements, and building predictive models based on the expression of many different genes (and clinical data too). The METABRIC dataset is from another large breast cancer p...

9123 sym Python (9217 sym/61 pcs) 4 img

DREAM-High Breast Cancer Patient Data

11.07.2024

In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...

12409 sym 5 img

Breast Cancer Patient Data

28.03.2024

In this activity, we will learn new skills in R with a large real-life dataset! We will load and examine a data frame that contains clinical information from over 1,000 breast cancer patients from The Cancer Genome Atlas (TCGA). TCGA characterized over 20,000 cancer samples spanning 33 cancer types with genomics. Genomics is an interdisciplinary fi...

11608 sym 5 img

Module 4: Patterns in Breast Cancer Gene Expression

20.10.2021

About this activity: Last time, we examined the mRNA levels for 18,351 genes across 1,082 breast cancer patients. We saw that the average expression of the genes (across patient samples) varied greatly. By taking the log of the expression values, we compressed the data (or reduced the range of variation) so that the genes were more comparable. Thi...

8910 sym R (7592 sym/35 pcs) 8 img