Consider the following data set of customer usage for three treadmill products. The goal is to run descriptive analytics on this data and build a customer profile for each treadmill product.
Product | Age | Gender | Education | Marital Status | Usage | Fitness | Income | Miles |
TM195 | 18 | Male | 14 | Single | 3 | 4 | 29562 | 112 |
TM195 | 19 | Male | 15 | Single | 2 | 3 | 31836 | 75 |
TM195 | 19 | Female | 14 | Partnered | 4 | 3 | 30699 | 66 |
TM195 | 19 | Male | 12 | Single | 3 | 3 | 32973 | 85 |
TM195 | 20 | Male | 13 | Partnered | 4 | 2 | 35247 | 47 |
TM195 | 20 | Female | 14 | Partnered | 3 | 3 | 32973 | 66 |
TM195 | 21 | Female | 14 | Partnered | 3 | 3 | 35247 | 75 |
TM195 | 21 | Male | 13 | Single | 3 | 3 | 32973 | 85 |
TM195 | 21 | Male | 15 | Single | 5 | 4 | 35247 | 141 |
TM195 | 21 | Female | 15 | Partnered | 2 | 3 | 37521 | 85 |
TM498 | 19 | Male | 14 | Single | 3 | 3 | 31836 | 64 |
TM498 | 20 | Male | 14 | Single | 2 | 3 | 32973 | 53 |
TM498 | 20 | Female | 14 | Partnered | 3 | 3 | 34110 | 106 |
TM498 | 20 | Male | 14 | Single | 3 | 3 | 38658 | 95 |
TM498 | 21 | Female | 14 | Partnered | 5 | 4 | 34110 | 212 |
TM498 | 21 | Male | 16 | Partnered | 2 | 2 | 34110 | 42 |
TM498 | 21 | Male | 12 | Partnered | 2 | 2 | 32973 | 53 |
TM498 | 23 | Male | 14 | Partnered | 3 | 3 | 36384 | 95 |
TM498 | 23 | Male | 14 | Partnered | 3 | 3 | 38658 | 85 |
TM498 | 23 | Female | 16 | Single | 3 | 3 | 45480 | 95 |
TM798 | 31 | Male | 16 | Partnered | 6 | 5 | 89641 | 260 |
TM798 | 33 | Female | 18 | Partnered | 4 | 5 | 95866 | 200 |
TM798 | 34 | Male | 16 | Single | 5 | 5 | 92131 | 150 |
TM798 | 35 | Male | 16 | Partnered | 4 | 5 | 92131 | 360 |
TM798 | 38 | Male | 18 | Partnered | 5 | 5 | 104581 | 150 |
TM798 | 40 | Male | 21 | Single | 6 | 5 | 83416 | 200 |
TM798 | 42 | Male | 18 | Single | 5 | 4 | 89641 | 200 |
TM798 | 45 | Male | 16 | Single | 5 | 5 | 90886 | 160 |
TM798 | 47 | Male | 18 | Partnered | 4 | 5 | 104581 | 120 |
TM798 | 48 | Male | 18 | Partnered | 4 | 5 | 95508 | 180 |
The following code assumes the data set is available as a .csv file that can be read into the R environment.
#Open the File dialog to select the .csv file containing the data-set
myFile <- file.choose()
#Read the .csv content into R and store in myData variable
myData <- read.csv(myFile, header = TRUE)
class(myData)
#Create variables with .csv file column header values
attach(myData)
#Show summary for the entire file
summary(myData)
#Histogram of Age
hist(Age, col = heat.colors(14), main = "Histogram of Age", xlab = "Age")
#BoxPlot of Age
boxplot(Age, horizontal = TRUE, col = "RED", main = "BoxPlot of Age")
#BoxPlot of Age as a factor of Gender
boxplot(Age~Gender, horizontal = TRUE, col = c("RED", "LIGHTBLUE"), main = "BoxPlot of Age by Gender")
#BoxPlot of Age as a factor of Product
boxplot(Age~Product, horizontal = TRUE, col = c("RED", "LIGHTBLUE", "GREEN"), main = "BoxPlot of Age by Product")
#Show summary grouped by Product
by(myData, INDICES = Product, FUN = summary)
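The code above relies on attach(myData), which can silently mask other variables that share a column name. As a minimal sketch of an alternative, the same Age-by-Product view can be computed by referencing columns through the data frame and passing data = to the formula interface. The ageData frame below is an assumption for illustration: it holds only the Product and Age columns copied from the table in this post, so the sketch runs without the .csv file.

```r
# Product and Age copied from the table in this post (10 rows per product).
# ageData, medianAge and boxStats are illustrative names, not part of the original code.
ageData <- data.frame(
  Product = rep(c("TM195", "TM498", "TM798"), each = 10),
  Age = c(18, 19, 19, 19, 20, 20, 21, 21, 21, 21,
          19, 20, 20, 20, 21, 21, 21, 23, 23, 23,
          31, 33, 34, 35, 38, 40, 42, 45, 47, 48)
)

# Median Age per product, with columns referenced explicitly instead of via attach()
medianAge <- tapply(ageData$Age, ageData$Product, median)
print(medianAge)

# The numbers behind the box plot: plot = FALSE returns the statistics without drawing
boxStats <- boxplot(Age ~ Product, data = ageData, plot = FALSE)
print(boxStats$names)   # group labels, one per product
print(boxStats$stats)   # one column per product: lower whisker, lower hinge, median, upper hinge, upper whisker
```

With the real file, the same calls should work on myData in place of ageData, assuming the column names match the table header.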
Some inferences from the summary grouped by Product are as follows:
- TM798 is mostly used by males
- TM798 is used more frequently than the other products
- TM798 customers have higher income and education than customers of the other products
- TM798 customers give the highest Fitness scores, which suggests they are satisfied with the fitness achieved with this product
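These inferences can be checked directly with a cross-tabulation and per-product means. The following is a sketch under stated assumptions rather than the original analysis: the treadmill data frame is built by hand from the rows shown in the table in this post, and the names treadmill, genderByProduct and meansByProduct are illustrative.

```r
# Columns copied from the 30 rows shown in the table in this post,
# so the sketch runs without the .csv file.
treadmill <- data.frame(
  Product = rep(c("TM195", "TM498", "TM798"), each = 10),
  Gender = c("Male", "Male", "Female", "Male", "Male",
             "Female", "Female", "Male", "Male", "Female",
             "Male", "Male", "Female", "Male", "Female",
             "Male", "Male", "Male", "Male", "Female",
             "Male", "Female", "Male", "Male", "Male",
             "Male", "Male", "Male", "Male", "Male"),
  Education = c(14, 15, 14, 12, 13, 14, 14, 13, 15, 15,
                14, 14, 14, 14, 14, 16, 12, 14, 14, 16,
                16, 18, 16, 16, 18, 21, 18, 16, 18, 18),
  Usage = c(3, 2, 4, 3, 4, 3, 3, 3, 5, 2,
            3, 2, 3, 3, 5, 2, 2, 3, 3, 3,
            6, 4, 5, 4, 5, 6, 5, 5, 4, 4),
  Fitness = c(4, 3, 3, 3, 2, 3, 3, 3, 4, 3,
              3, 3, 3, 3, 4, 2, 2, 3, 3, 3,
              5, 5, 5, 5, 5, 5, 4, 5, 5, 5),
  Income = c(29562, 31836, 30699, 32973, 35247, 32973, 35247, 32973, 35247, 37521,
             31836, 32973, 34110, 38658, 34110, 34110, 32973, 36384, 38658, 45480,
             89641, 95866, 92131, 92131, 104581, 83416, 89641, 90886, 104581, 95508)
)

# Number of customers per product and gender
genderByProduct <- table(treadmill$Product, treadmill$Gender)
print(genderByProduct)

# Mean Education, Usage, Fitness and Income per product
meansByProduct <- aggregate(cbind(Education, Usage, Fitness, Income) ~ Product,
                            data = treadmill, FUN = mean)
print(meansByProduct)
```

On these rows, 9 of the 10 TM798 customers are male, and mean Usage is 4.8 for TM798 against 3.2 for TM195 and 2.9 for TM498. Figures from the full .csv file may differ if it contains more rows than the table shows.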
REFERENCES
https://www.greatlearning.in/great-lakes-pgpba/