Publications by G-Tch
R and Twitter
In this post, we illustrate a simple use of the twitteR, streamR and tm packages, which make text mining possible. In practice, the first two retrieve tweets and support simple and complex counts, while the last one handles the text mining itself; we will come back to it in more detail in a future post...
3785 sym R (9740 sym/21 pcs) 26 img 1 tbl
What did the media say this week? A TwitteR timeline
404 sym
SeminR – at the Museum d’histoire naturelle
R DAY ON 24/05/2013 IN PARIS – MUSEUM NATIONAL D’HISTOIRE NATURELLE. COME AND SHARE YOUR (LACK OF) KNOWLEDGE OF R! On the programme: chemistry, automated reports, Gaussian mixtures, spatial analysis, network analysis, R interface, botanical atlas, databases, text analysis and evolutionary biology. Registration ...
857 sym
Media monitoring 2
A quick monitoring update from our media observatory on Twitter. At Mediapart: / Le Monde / Le Figaro / Le Parisien / Overall view. The code used to produce this post:
571 sym 10 img
A new package: Quandl
Quandl is a new database management tool which seeks to become the place to find datasets. That is, each unique indicator is considered an independent data set. This helps them to seem to have a ginormous quantity of data sets. Source: Blog Econometric Simulation. To load or find the datasets, we have to authenticate using the API, like with Tw...
1104 sym R (512 sym/2 pcs) 6 img
Mining the last French presidential debate
After reading this post (thanks to its author), I thought it would be interesting to replicate it with some specifics of the French language, and to see whether we can get a quick overview of the debate between Sarkozy and Hollande in the second round of the last presidential election. Key words: TextMining, Elections, France, Debate, 2nd Round. We use the pa...
1167 sym 6 img 2 tbl
How does logistic regression work?
While talking with a colleague who is not a statistician, it appeared that logistic regression is not intuitive. Some basic questions came up: – Why not use the linear model? – What is the logistic function? – How can we compute it by hand, step by step, to understand what the glm function is doing? This post aims to answer these questions a...
2308 sym R (1096 sym/4 pcs) 4 img
How to quickly read a large dataset in R?
Here and there, I have read about many techniques for importing a large dataset into R. The read.table or read.csv option does not always work because, as discussed here, R loads data into memory. And sometimes, when we try to load a big dataset, we get this message: Warning messages: 1: Reached total allocation of 8056Mb: see help(memory.size) 2: Reached total allocation o...
2200 sym R (550 sym/3 pcs) 1 tbl
ggmap: an interesting toolbox for spatial analysis
ggmap is a new tool which enables such visualization by combining the spatial information of static maps from Google Maps, OpenStreetMap, Stamen Maps or CloudMade Maps with the layered grammar of graphics implementation of ggplot2. The library is developed by David Kahle and Hadley Wickham, and in the latest R Journal (Volume 5/1, June 2013), there...
3070 sym 8 img
Linear discriminant analysis or logistic regression
Suppose we have the IRIS units of Paris (population > 100k inhabitants) and want to classify them by their sociodemographic characteristics: population, unemployment rate, students, socio-professional category (CSP), etc. Once the IRIS units are classified, we ask whether this typology can be carried over to another large city (Lyon, for example): ...
2606 sym R (2063 sym/1 pcs) 4 img