Publications by Xuejun Gu
HW9
Q1. Consider the fraction variable a.grad.rate (percentage of freshmen who graduated within a six-year period). Compare the fraction variable across tiers and also compare the froots of the variable across tiers and also compare the flogs of the variable across tiers. Is it necessary to reexpress the data by froots or flogs in this example? Expla...
3358 sym R (3619 sym/24 pcs) 6 img
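The folded reexpressions named in Q1 can be sketched as follows. This is a minimal illustration, assuming the common definitions froot = sqrt(f) - sqrt(1 - f) and flog = log(f) - log(1 - f) (some texts scale these by a constant). The values in `f` are made up and are not taken from the a.grad.rate data; a percentage such as a.grad.rate would first be divided by 100.

```r
# Hypothetical fractions (made-up values, not the college ratings data).
f <- c(0.35, 0.50, 0.62, 0.78, 0.91)

# Folded root: assumed definition sqrt(f) - sqrt(1 - f).
froot <- function(f) sqrt(f) - sqrt(1 - f)

# Folded log: assumed definition log(f) - log(1 - f).
flog <- function(f) log(f) - log(1 - f)

# Side-by-side view of the raw fraction and both reexpressions.
reexpressed <- data.frame(f = f, froot = froot(f), flog = flog(f))
print(round(reexpressed, 3))
```

Both reexpressions are zero at f = 0.5 and stretch the scale near 0 and 1, which is why they matter mainly when fractions sit close to the extremes.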
Document
Find some (x, y) data where you think x and y are strongly related. Make a scatterplot and find the least-squares and resistant fits. Plot the two sets of residuals. Interpret the fit and the residuals. Contrast the two methods of fitting – is it better to fit a resistant line? college.ratings <- read.delim("~/data/college.ratings.txt") data<...
4879 sym R (5310 sym/39 pcs) 17 img
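The contrast between a least-squares fit and a resistant fit can be sketched with base R alone. This is an illustration on made-up data with one planted outlier, not the college ratings analysis; it assumes base R's `line()` (Tukey's resistant line) as the resistant method, which may differ from the function used in the original homework.

```r
# Made-up data: an exact line y = 2 + 3x with one outlier at the end.
x <- 1:10
y <- 2 + 3 * x
y[10] <- 80

# Least-squares fit.
ls_fit <- lm(y ~ x)

# Resistant fit (Tukey's line from the stats package).
rl_fit <- line(x, y)

# Compare coefficients of the two fits.
print(coef(ls_fit))
print(coef(rl_fit))

# Plot both sets of residuals on a common layout.
op <- par(mfrow = c(1, 2))
plot(x, residuals(ls_fit), main = "Least-squares residuals")
abline(h = 0)
plot(x, residuals(rl_fit), main = "Resistant residuals")
abline(h = 0)
par(op)
```

In a setup like this, the least-squares slope is pulled toward the outlier, while the resistant line leaves the outlier as a single large residual.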
Document
The dataset lake in the LearnEDA package (taken from the Minitab dataset collection) contains measurements of lakes in the Vilas and Oneida counties of northern Wisconsin. The variables are AREA = area of lake in acres DEPTH = maximum depth of lake in feet PH = pH (acidity) measurement WSHED = watershed area in square miles HIONS = concentration of...
3220 sym R (6571 sym/39 pcs) 21 img
Document
For each of the two datasets below: Find 5-number summaries, fences, and outside values for each group (CITY). Construct parallel boxplots. Using a spread-vs-level plot, determine the power of a transformation that you believe will stabilize spread. Using the transformation, reanalyze the data by computing new 5-number summaries (and fence and ou...
5985 sym R (14103 sym/91 pcs) 10 img 22 tbl
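The spread-vs-level step can be sketched as below. This is a minimal illustration on made-up groups (not the CITY data), assuming the usual rule that the suggested power is 1 minus the slope of log(fourth-spread) against log(median). The group names and values are hypothetical.

```r
# Made-up groups whose spread grows in proportion to their level.
base <- c(1, 2, 3, 4, 5, 6, 7)
groups <- list(A = base, B = 10 * base, C = 100 * base)

# 5-number summaries for each group.
five <- sapply(groups, fivenum)
print(five)

# Level = log10(median); spread = log10(fourth-spread, upper hinge minus lower hinge).
level <- sapply(groups, function(v) log10(median(v)))
spread <- sapply(groups, function(v) {
  h <- fivenum(v)
  log10(h[4] - h[2])
})

# Slope of spread against level; suggested power is 1 - slope.
fit <- lm(spread ~ level)
power <- 1 - coef(fit)[["level"]]
print(power)

# Parallel boxplots of the raw groups.
boxplot(groups, main = "Raw groups")
```

A power near 0 points to a log reexpression, near 0.5 to a square root, and near 1 to leaving the data as they are.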
HW6
Part A For both data sets: Using R, perform a resistant smooth (3RSSH, twice) to your data. Save the smooth (the fit) and the rough (the residuals). Plot the smooth (using a smooth curve) and describe the general patterns that you see. Don't assume that anything is obvious – pretend that you are explaining this to someone who doesn't have ...
7193 sym R (8084 sym/33 pcs) 12 img
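A 3RSSH, twice smooth can be approximated in base R as sketched below. This is an assumption-laden illustration: it uses `smooth(kind = "3RSS")` for the running-median and splitting steps, a hand-written `hann()` helper (hypothetical name) for the hanning step with end values copied, and made-up data. The original homework may have used different functions or end rules.

```r
# Made-up series (not the homework data).
y <- c(12, 15, 11, 30, 14, 16, 18, 17, 40, 19, 21, 20, 23, 22, 25)

# Hanning: weighted running average (1/4, 1/2, 1/4); end values are copied.
hann <- function(s) {
  n <- length(s)
  out <- s
  if (n > 2) {
    i <- 2:(n - 1)
    out[i] <- 0.25 * s[i - 1] + 0.5 * s[i] + 0.25 * s[i + 1]
  }
  out
}

# One pass of 3RSS followed by hanning.
smooth_3rssh <- function(v) {
  hann(as.numeric(smooth(v, kind = "3RSS", twiceit = FALSE)))
}

# "Twice": smooth the data, smooth the rough, and add the two smooths.
first_pass <- smooth_3rssh(y)
fit <- first_pass + smooth_3rssh(y - first_pass)
rough <- y - fit

# Data as points, smooth as a curve.
plot(y, main = "Data and 3RSSH, twice smooth (sketch)")
lines(fit)
```

By construction the data equal smooth plus rough, so the rough can be plotted on its own to look for structure the smooth missed.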
HW7
Find a two-way table that you are interested in with at least 4 rows and 4 columns. Analyze the table using both additive AND multiplicative fits. Plot the additive fit. Explain your additive and multiplicative fits (what do the common, row effects, and column effects mean). Is an additive or multiplicative fit more suitable for your data? Explai...
4234 sym R (12014 sym/47 pcs) 7 img
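Additive and multiplicative fits of a two-way table can be sketched with base R's `medpolish()`. This is an illustration on a made-up 4 x 4 table with hypothetical row and column names; the multiplicative fit is obtained here by median-polishing the logs, which assumes all entries are positive.

```r
# Made-up 4 x 4 table (hypothetical values and labels).
tab <- matrix(c(12, 15, 20, 26,
                10, 14, 18, 25,
                 8, 11, 16, 21,
                 5,  9, 12, 18),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("row", 1:4), paste0("col", 1:4)))

# Additive fit: data = common + row effect + column effect + residual.
add_fit <- medpolish(tab, trace.iter = FALSE)
print(add_fit$overall)
print(add_fit$row)
print(add_fit$col)

# Multiplicative fit: an additive fit on the log scale.
mult_fit <- medpolish(log(tab), trace.iter = FALSE)
print(exp(mult_fit$overall))  # common value on the original scale
print(exp(mult_fit$row))      # row effects as multipliers
print(exp(mult_fit$col))      # column effects as multipliers

# Check that the additive pieces rebuild the table.
rebuilt <- add_fit$overall + outer(add_fit$row, add_fit$col, "+") + add_fit$residuals
```

Comparing the sizes and patterns of the two sets of residuals is one way to judge which fit suits a table better.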
HW8
1. Exploring Football Scores The dataset football in the LearnEDA package gives the number of points scored by the winning team (team1) and the losing team (team2) for a large number of American football games. Using the bin boundaries -0.5, 6.5, 13.5, 20.5, 27.5, 34.5, 41.5, 48.5, 55.5, 62.5, 69.5, 76.5, have R construct a histogram of the sco...
4374 sym R (8520 sym/30 pcs) 14 img
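A histogram with the stated bin boundaries can be sketched as below. The `scores` vector is made up for illustration and is not the football dataset; with the LearnEDA package loaded, the corresponding column of football would be used instead.

```r
# Made-up scores (hypothetical values, not the football data).
scores <- c(3, 7, 10, 14, 17, 21, 24, 27, 31, 35, 42, 56)

# Bin boundaries -0.5, 6.5, ..., 76.5: equal-width bins of 7 points.
bins <- seq(-0.5, 76.5, by = 7)

# Build the histogram object without plotting, then inspect the counts.
h <- hist(scores, breaks = bins, plot = FALSE)
print(h$counts)

# Draw the histogram with the same boundaries.
plot(h, main = "Scores (sketch)", xlab = "Points")
```

Placing boundaries at half-integers keeps every whole-number score strictly inside a bin, so no score falls on an edge.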