Descriptive measures of central tendency
In a video game store, data were collected to determine whether the business is profitable enough to keep operating. Several days were selected as a sample for the study. For each selected day, the study recorded the number of visits to the store and the length of each visit. The visit time was then multiplied by the number of visits on each day to obtain the total time people spent at the store. Data were also collected on which game was played most often and on how visitors had learned about the store. This information was then used to determine whether the video game business is a good investment or might lead to losses.
The analysis of the data obtained in the video game store used measures of centrality, namely the mean and the median, and measures of the variability of the distribution, namely the variance and the standard deviation. The mean is the best-known measure of centrality, and it uses every value in the distribution. Its main disadvantage is that, when there are many outliers, the mean might give an inaccurate value of the central tendency. The median, on the other hand, is a good measure of centrality when the data are evenly distributed. However, when the data are highly skewed, as in the data collected here, the median might not describe the distribution well (Dean & Illowsky 2018).
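As a small illustration of the outlier effect on the mean, the following sketch uses Python's standard statistics module. The numbers are made up for the example and are not the store's data.

```python
import statistics

# Hypothetical visit times in hours; these are NOT the study's data.
typical = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 3.0, 4.0, 50.0]  # last value replaced by an outlier

mean_typical = statistics.mean(typical)
mean_outlier = statistics.mean(with_outlier)
median_typical = statistics.median(typical)
median_outlier = statistics.median(with_outlier)

print("mean:", mean_typical, "->", mean_outlier)
print("median:", median_typical, "->", median_outlier)
```

In this made-up sample, replacing one value moves the mean from 3.0 to 12.0, while the middle value stays at 3.0.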
Results and Description
To find the mean, we take the total time for each day, which is the time of a visit multiplied by the number of visits, sum it over all days, and divide by the total number of days. Hence we use the formula $\bar{x} = \frac{\sum fx}{n}$, where $\sum fx$ is the summation of the total time and $n$ is the total number of days.
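The calculation of the mean can be sketched in Python as follows. The records and the function names are hypothetical and only illustrate the arithmetic; they are not the values collected at the store.

```python
# Hypothetical sample: (hours per visit, number of visits) for each sampled day.
# These numbers are illustrative only; they are not the study's data.
days = [(0.0, 0), (0.0, 0), (1.5, 2), (0.0, 0), (2.0, 4), (0.5, 2)]


def total_times(records):
    """Return f*x (visit time multiplied by visit count) for each day."""
    return [time * visits for time, visits in records]


def mean_total_time(records):
    """Mean of the daily totals: sum(f*x) / n, where n is the number of days."""
    totals = total_times(records)
    return sum(totals) / len(totals)


totals = total_times(days)
mean_value = mean_total_time(days)
print(totals)
print(mean_value)
```

For this made-up sample the daily totals sum to 12.0 hours over 6 days, so the mean is 2.0 hours per day.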
To find the median of the variable, we first arrange the values from the lowest to the largest. We then find the value in the middle. Since the number of observations is even, we take the two values in the middle, add them, and divide by two. The two values are 0 and 0, hence $(0 + 0)/2 = 0$. Hence the median is 0.
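The median steps described above can be sketched as a short Python function. The sample is hypothetical and was chosen so that, as in the study, the two middle values are both zero.

```python
def median(values):
    """Sort the values and return the middle one (or the mean of the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even number of observations: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical daily totals in hours for 8 days; NOT the study's data.
sample = [3.0, 0.0, 8.0, 0.0, 0.0, 1.0, 0.0, 0.0]
median_value = median(sample)
print(median_value)
```

Sorted, the sample reads 0, 0, 0, 0, 0, 1, 3, 8, so the fourth and fifth values are both 0 and the median is 0.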
To find the variance, we subtract the mean from each value, $(x - \bar{x})$. We square these differences and find their sum, $\sum (x - \bar{x})^2$. We then divide the sum of the squares by the total number of observations $n$. Hence the formula $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$, which gives the variance.
To find the standard deviation of the distribution, we take the value of the variance and find its square root. Hence $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$, which gives the standard deviation.
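A minimal sketch of the variance and standard deviation steps is given below, assuming the sum of squares is divided by n as described in the text. The data and the function name are hypothetical.

```python
import math
import statistics

# Hypothetical daily totals in hours; NOT the study's data.
daily_totals = [0.0, 0.0, 3.0, 0.0, 8.0, 1.0]


def population_variance(values):
    """Sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


variance = population_variance(daily_totals)
std_dev = math.sqrt(variance)
print(variance, std_dev)

# Cross-check against the standard library's population versions.
print(statistics.pvariance(daily_totals), statistics.pstdev(daily_totals))
```

Here the mean is 2.0 and the squared deviations sum to 50, so the variance is 50/6. Note that statistics.variance and statistics.stdev divide by n - 1 instead of n, so they would give slightly larger values than the formula used in this report.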
For the data described here, the mean is the best value to use for the central tendency of the distribution. The value in the middle of the distribution is zero, so the median is zero, which is not an accurate measure of the centrality of this distribution. The mode of the distribution is also zero: on most days nobody came to play, so the most frequent value is zero. This, too, is not an accurate measure of centrality. Several outliers make the mean slightly inaccurate, but compared with the other measures of centrality, it gives the most accurate description of the centrality of the distribution.
The variance and standard deviation are very high, which means the distribution is highly volatile. High volatility indicates significant variation between the observations; here, the number of hours the video games are played fluctuates significantly from day to day.
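The mode can be found by counting how often each value occurs. The sketch below uses collections.Counter on a hypothetical sample in which, as in the study, most days have a total of zero.

```python
from collections import Counter

# Hypothetical daily totals in hours; NOT the study's data.
daily_totals = [0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 8.0, 1.0]

# Count the frequency of each value and take the most frequent one.
counts = Counter(daily_totals)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```

In this sample, zero occurs on 5 of the 8 days, so the mode is 0.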
In order to draw the box plot, we need the values of the minimum, the first quartile, the median, the third quartile, and the maximum. The minimum is 0, the first quartile is 0, the median is 0, the third quartile is 1.98, and the maximum is 28.45.
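The five values needed for a box plot can be computed as sketched below. The report does not state which quartile convention was used, so the "inclusive" method of statistics.quantiles is an assumption, and the sample and function name are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # Assumption: quartiles by linear interpolation over the full data range.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


# Hypothetical daily totals in hours; NOT the study's data.
sample = [0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 8.0, 1.0]
summary = five_number_summary(sample)
print(summary)
```

Other quartile conventions can give slightly different values for the first and third quartiles on small samples, so results should be compared only when the same convention is used.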
The box plot represents the distribution of the data. From it, we can read the minimum, first quartile, median, third quartile, and maximum. The values plotted above the upper bound of the box plot are outliers that affect the mean as a measure of centrality.
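One common way to decide which points lie beyond the bounds of a box plot is the 1.5 × IQR rule. The report does not say which rule its box plot used, so the rule and the function name below are assumptions; the quartiles and the maximum are the values reported above.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) bounds: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and maximum as reported in the text.
q1, q3, maximum = 0.0, 1.98, 28.45

lower, upper = tukey_fences(q1, q3)
is_max_outlier = maximum > upper
print(lower, upper, is_max_outlier)
```

Under this assumption the interquartile range is 1.98, the upper bound is about 4.95, and the reported maximum of 28.45 lies well above it.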
From the data plotted, the skewness of the distribution is positive. This is because most of the data are located at the lower end of the box plot. In addition, the outliers beyond the upper bound make the upper tail longer, hence the positive skewness. Kurtosis is a measure of how heavy the tails of the distribution are (Komsta & Novomestky 2015). For our data set, we used the spreadsheet formula =KURT to determine the kurtosis, which was 7.55, a very high value. This indicates that the distribution has a long tail and hence many outliers.
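The sketch below shows how skewness and kurtosis of this kind can be computed in Python. It assumes that =KURT returns the bias-corrected sample excess kurtosis (so a normal distribution scores about 0) and uses the matching bias-corrected sample skewness. The function names and the sample are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Bias-corrected sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    third = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * third


def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (assumed to match =KURT)."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least four values")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


# Hypothetical daily totals in hours; NOT the study's data.
sample = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 8.0]
skew_value = sample_skewness(sample)
kurt_value = sample_excess_kurtosis(sample)
print(skew_value, kurt_value)
```

For a sample dominated by zeros with a few large values, the skewness comes out positive, which matches the direction of skew described for the store's data.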
From the analysis of the distribution of the data, it is clear that on most days the game store had no visits, which explains why the median of the distribution is 0. The business is running at a loss, since 27 of the 44 days sampled for the study had no visits. The data are positively skewed, which means that on the majority of days there were few or no visits.
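The link between the number of days without visits and the median of 0 can be checked with a short sketch. The counts 27 and 44 come from the text; the positive totals are placeholders, since only the number of zero days matters for the median.

```python
import statistics

zero_days = 27   # days without visits, as reported in the text
total_days = 44  # days sampled, as reported in the text
share_zero = zero_days / total_days

# Placeholder (hypothetical) totals for the remaining days; their size does not
# change the median as long as more than half of the days are zero.
placeholder_totals = [0.0] * zero_days + [5.0] * (total_days - zero_days)
median_total = statistics.median(placeholder_totals)
print(round(share_zero, 3), median_total)
```

About 61% of the sampled days had no visits, so both middle values of the ordered data (the 22nd and 23rd) are zero whatever the totals on the remaining days were.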
Dean, S., & Illowsky, B. (2018). Descriptive statistics: skewness and the mean, median, and mode. Connexions website.
Komsta, L., & Novomestky, F. (2015). Moments, cumulants, skewness, kurtosis, and related tests. R package version 0.14.