Markdown

In [5]:
options(scipen=999)
library(ggplot2)
data("midwest", package = "ggplot2")
theme_set(theme_bw())
In [6]:
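# Scatter plot of county area vs. total population, coloured by state and sized by density
gg <- ggplot(midwest, aes(x = area, y = poptotal)) +
	geom_point(aes(col = state, size = popdensity)) +
	geom_smooth(method = "loess", se = FALSE) +
	# xlim()/ylim() drop rows outside these ranges, hence the warnings when plotting
	xlim(c(0, 0.1)) + ylim(c(0, 500000)) +
	labs(title = "Area vs Population", y = "Population", x = "Area",
		caption = "Source: midwest")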
In [7]:
plot(gg)
Warning message:
"Removed 15 rows containing non-finite values (stat_smooth)."
Warning message:
"Removed 15 rows containing missing values (geom_point)."
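The two warnings come from `xlim()` and `ylim()`: they set scale limits, so any row outside the range is turned into a missing value and dropped before the points and the smoother are drawn. A minimal sketch of one alternative, assuming the goal is only to zoom the view, is `coord_cartesian()`, which keeps every row. The names `gg_zoom` and `outside` are introduced here for illustration and are not part of the original notebook.

```r
library(ggplot2)
data("midwest", package = "ggplot2")

# Same layers as the plot above, but the limits are applied with
# coord_cartesian(), which zooms the view without discarding rows.
gg_zoom <- ggplot(midwest, aes(x = area, y = poptotal)) +
  geom_point(aes(col = state, size = popdensity)) +
  geom_smooth(method = "loess", se = FALSE) +
  coord_cartesian(xlim = c(0, 0.1), ylim = c(0, 500000)) +
  labs(title = "Area vs Population", y = "Population", x = "Area",
       caption = "Source: midwest")

# Flag the rows that fall outside the limits used in the original plot
outside <- with(midwest, area > 0.1 | poptotal > 500000)
sum(outside)
```

Because the loess fit now uses all counties, including the ones outside the visible window, the smoothed line may look different from the one in the original plot.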
In [8]:
setwd("C:/Users/Larry/Documents/Data Science/Projects/Baseball/Data")
In [9]:
mil <- read.csv("milwaukee_2017_batting.csv")
In [10]:
head(mil)
Rk  Pos  Name                        Age  G    PA   AB   R   H    X2B  ...  OBP    SLG    OPS    OPS.  TB   GDP  HBP  SH  SF  IBB
1   C    Manny Pina\pinama01         30   107  359  330  45  92   21   ...  0.327  0.424  0.751  95    140  8    5    1   3   0
2   1B   Eric Thames*\thameer01      30   138  551  469  83  116  26   ...  0.359  0.518  0.877  126   243  6    7    0   0   5
3   2B   Jonathan Villar#\villajo01  26   122  436  403  49  97   18   ...  0.293  0.372  0.665  73    150  4    0    2   1   1
4   SS   Orlando Arcia\arciaor01     22   153  548  506  56  140  17   ...  0.324  0.407  0.731  90    206  10   1    2   3   9
5   3B   Travis Shaw*\shawtr01       27   144  606  538  84  147  34   ...  0.349  0.513  0.862  122   276  20   4    1   3   6
6   LF   Ryan Braun\braunry02        33   104  425  380  58  102  28   ...  0.336  0.487  0.823  112   185  15   3    0   4   2
In [11]:
colnames(mil)
  1. 'Rk'
  2. 'Pos'
  3. 'Name'
  4. 'Age'
  5. 'G'
  6. 'PA'
  7. 'AB'
  8. 'R'
  9. 'H'
  10. 'X2B'
  11. 'X3B'
  12. 'HR'
  13. 'RBI'
  14. 'SB'
  15. 'CS'
  16. 'BB'
  17. 'SO'
  18. 'BA'
  19. 'OBP'
  20. 'SLG'
  21. 'OPS'
  22. 'OPS.'
  23. 'TB'
  24. 'GDP'
  25. 'HBP'
  26. 'SH'
  27. 'SF'
  28. 'IBB'
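The `X2B`, `X3B`, and `OPS.` names in this list are not typos in the data. By default `read.csv()` runs with `check.names = TRUE`, which passes every header through `make.names()` so that it becomes a syntactically valid R name. A minimal sketch, assuming the raw CSV headers were `2B`, `3B`, and `OPS+` (the file itself is not shown here, so those raw names are a guess):

```r
# Assumed raw headers; the original CSV is not shown, so these are hypothetical
raw_names <- c("2B", "3B", "OPS+")

# make.names() prefixes "X" to names starting with a digit and replaces
# invalid characters such as "+" with "."
clean_names <- make.names(raw_names)
print(clean_names)
```

Calling `read.csv()` with `check.names = FALSE` would keep the raw headers instead, at the cost of needing backticks to refer to them.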
In [13]:
df <- mil[c("R","H","HR","RBI","BB","OBP","SLG","OPS")]
/

Latex

When \(a \ne 0\), there are two solutions to \(ax^2 + bx + c = 0\), and they are $$x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}.$$

$H_0: \mu = 2.5 \quad \text{vs} \quad H_a: \mu > 2.5$

$$ax^2 + bx + c = 0$$

$\sum_{i=0}^n i^2 = \frac{(n^2+n)(2n+1)}{6}$

One may readily verify that if $f$ and $g$ are continuous functions on $D$, then the functions $f+g$, $f-g$ and $f \cdot g$ are continuous. If in addition $g$ is everywhere non-zero, then $f/g$ is continuous.
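
As a quick check of the formulas above, here is a worked example with small numbers chosen purely for illustration. Taking $a = 1$, $b = -3$, $c = 2$ in the quadratic formula gives

$$x = \frac{3 \pm \sqrt{(-3)^2 - 4 \cdot 1 \cdot 2}}{2 \cdot 1} = \frac{3 \pm 1}{2}, \quad \text{so } x = 2 \text{ or } x = 1.$$

Similarly, taking $n = 3$ in the sum of squares gives

$$\sum_{i=0}^{3} i^2 = 0 + 1 + 4 + 9 = 14 = \frac{(3^2+3)(2 \cdot 3+1)}{6} = \frac{12 \cdot 7}{6}.$$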
/

Mission Statement

What is the mission statement of this blog?

I am writing this blog for a couple of reasons.  I have heard that teaching is one of the best ways to learn because you have to learn things twice.  You learn the material for the first time for yourself, then you learn it a second time when you explain it to someone else.

This summer I have spent a lot of time working on data science, Python, machine learning, and statistics.  My process has been haphazard at best and has lacked direction.  I had tons of drive and motivation, but no north star to guide my effort.  As I work to find that direction, I hope to offer some guidance toward a better way to learn data science.

I think we always want to do cool, interesting things at the expense of the fundamentals.  I want to provide a place where people can see and learn the fundamentals of data science.  I also plan on tackling more advanced topics, but first I need to build a strong foundation of knowledge.

What is the goal of this blog?
1) To help one person in their data science journey.
2) To help me learn and improve my data science knowledge.

What is the mission statement of this blog?
To learn and grow as a data scientist, and help others do the same.
 

/