Basic R queries

While learning R from scratch, I came across Kaggle. It hosts competitions in which users try to build the best predictive model.

This is my first big data experience, so I am a little lost. First, I have to admit that I am following a "how to" tutorial.

I will post all my generic, basic R data manipulations here.



Create a vector of values

unusual_title<-c('Dona', 'Lady', 'the Countess','Capt', 'Col', 'Don', 
                 'Dr', 'Major', 'Rev', 'Sir', 'Jonkheer')

Replace the title with "Unusual Title" when it appears in the vector created above

titanic$title[titanic$title %in% unusual_title]<-'Unusual Title'
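To see what the replacement above does, here is a minimal sketch on a toy data frame. The five titles are made-up values, not the real Kaggle data. It shows that %in% returns one TRUE/FALSE per row, and that only the TRUE rows are overwritten.

```r
# Toy stand-in for the Titanic data (hypothetical values, for illustration only)
titanic <- data.frame(
  title = c("Mr", "Dr", "Miss", "Col", "Mrs"),
  stringsAsFactors = FALSE
)

unusual_title <- c("Dona", "Lady", "the Countess", "Capt", "Col", "Don",
                   "Dr", "Major", "Rev", "Sir", "Jonkheer")

# %in% gives a logical vector: TRUE where the title is found in unusual_title
is_unusual <- titanic$title %in% unusual_title
print(is_unusual)

# Only the rows flagged TRUE are replaced
titanic$title[is_unusual] <- "Unusual Title"
print(table(titanic$title))
```

Storing the logical vector in a variable first (is_unusual, a name I chose for this example) makes it easy to check how many rows will change before overwriting anything.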

sapply applies a function to each element of a vector.
Here, each name in the dataset is split at every comma or period, and only the first piece is kept.
That piece goes into the surname column.

titanic$surname<-sapply(titanic$Name, function(x) strsplit(x,split='[,.]')[[1]][1])
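To understand the [[1]][1] part, it helps to run the split on a single name first. The sketch below assumes names in the "Surname, Title. First names" format, and the two sample names are only illustrative input.

```r
# One name in the assumed "Surname, Title. First names" format
name <- "Braund, Mr. Owen Harris"

# strsplit returns a list with one element per input string;
# [[1]] takes the character vector of pieces for our single name
parts <- strsplit(name, split = "[,.]")[[1]]
print(parts)

# The first piece is the surname; later pieces keep a leading space,
# which trimws() removes
surname <- parts[1]
second_piece <- trimws(parts[2])

# Same idea applied to several names with sapply
names_vec <- c("Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina")
surnames <- sapply(names_vec,
                   function(x) strsplit(x, split = "[,.]")[[1]][1],
                   USE.NAMES = FALSE)
print(surnames)
```

USE.NAMES = FALSE is optional: without it, sapply labels each result with the full original name, which makes the printed output harder to read. Note also that strsplit needs character input, so a factor column would have to go through as.character() first.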

Set fsizeD to "single" when famsize is 1, and to "small" when famsize is between 2 and 4

titanic$fsizeD[titanic$famsize == 1] <- 'single'

titanic$fsizeD[titanic$famsize > 1 & titanic$famsize < 5] <- 'small'
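Here is a small sketch of these two assignments on made-up family sizes. Creating the column as NA first is my own addition, so that it is easy to see which rows did not match either condition.

```r
# Hypothetical family sizes, for illustration only
titanic <- data.frame(famsize = c(1, 2, 4, 5, 7))

# Start with an empty (NA) text column so unmatched rows are easy to spot
titanic$fsizeD <- NA_character_

titanic$fsizeD[titanic$famsize == 1] <- "single"
titanic$fsizeD[titanic$famsize > 1 & titanic$famsize < 5] <- "small"

print(titanic)

# Rows with famsize of 5 or more matched neither condition and are still NA
print(sum(is.na(titanic$fsizeD)))
```

With only these two rules, families of 5 or more keep NA, so they need their own label before the column is used in a model.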
