Publications by Larry D'Agostino
IEOR Tools Tutorial: Learning XML with R
I have been using R a lot lately in my work. R (main site) is an open source statistical computing platform. Saying R is only used for statistics does not do it justice. I am finding it to be a really powerful statistical and optimization computing platform. There seems to be no task that cannot be accomplished. Lately I...
Computer languages and Applied Math
There is no question that computer languages have helped push the envelope for applied mathematics. It is hard to imagine where airline scheduling, supply chain management, or inventory control would be without all of the great advances in optimization and statistical computing. I have thought a lot about t...
Data mining competition with R
There is a new data mining competition aimed at predicting preferred data mining tools in R via dataists.com. The concept of the competition is to try to determine which R packages are preferred in the R community via their CRAN package libraries. The developers of this new competition are also in the R community with the NY R Us...
R Links for the Beginner on World Statistics Day
In honor of the first World Statistics Day I thought I would share some of my favorite R links. R is a free software statistical computing environment for performing all sorts of data and mathematical manipulation. Introduction and Tutorials: R Tutorial Series and Introduction Burns Statistics Tutorials, Introductory R Tutorials, Learning ...
R references for handling Big data
The Dallas R User Group had a meeting over the weekend. One of the discussions was about the memory limitations of R. This is a common subject among the R community and R User Groups. There have been a lot of strides recently in allowing R to stretch its memory limitations. I thought I would compile and share some of the best resour...
INFORMS Data Mining Competition leaders used Open Source software
The 2010 INFORMS data mining competition recently finished, and the leaders were presented at the 2010 Annual INFORMS Conference. The goal of the competition was to determine short-term movements in stock prices. You may recall that IEOR Tools competed in this competition with not too glamorous results at the e...
Big Data Logistic Regression with R and ODBC
Recently I’ve been doing a lot of work with predictive models using logistic regression. Logistic regression is great for determining probable outcomes of a binary target variable. R is a great tool for accomplishing this task. Oftentimes I will use the base function glm to develop a model. Yet there are times, du...
Where to find good data sets
O’Reilly Media has been a big advocate of Open Data and believes that is where a lot of computing is headed in the future. I think they are definitely on to something. Yet the future could be now. There are a lot of opportunities to find good data sources immediately. One of my favorite blogs, O’Reilly Radar, has a...
Video of Joy of Stats by Hans Rosling
The Joy of Stats, narrated by Hans Rosling, was just produced by the BBC and shown to their audience. Hans Rosling, via gapminder.org, was kind enough to post the full hour-long video of the documentary about the joys of statistics. The video is posted on YouTube and is available to anyone. http://www.gapminder.org/videos/the-joy-of-stats/ Hans Rosling’s pass...
IBM has a Natural Language Purpose
I wanted to write a blog post about the advancements of Natural Language Processing in light of the performance of IBM’s Watson on the Jeopardy challenge last week. Natural Language Processing is the science of transforming and interpreting human spoken and written language by artificial means. Generally this type of study has b...