Think Stats: Exploratory Data Analysis
If you know how to program, you have the skills to turn data into knowledge using the tools of probability and statistics. This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python. By working with a single case study throughout this thoroughly revised book, you’ll learn the entire process of exploratory data analysis, from collecting data and generating statistics to identifying patterns and testing hypotheses. You’ll explore distributions, rules of probability, visualization, and many other tools and concepts. New chapters on regression, time series analysis, survival analysis, and analytic methods will enrich your discoveries.
Common terms and phrases
agepreg analytic distribution autocorrelation babies birth weight BRFSS Central Limit Theorem chapter ChiSquared Test Classical Hypothesis Cohort Effects confidence interval Correlation Test DataFrame dataset Difference in Means Distribution Framework estimated parameters Estimation Game example Exercises Expected Remaining Lifetime explanatory variables exponential distribution Figure Glossary groups hazard function HazardFunction Here’s the code Hist histogram hypothesis testing Implementation integer Least Squares Fit Linear Model linear regression logistic regression lognormal Distribution median method mother’s age Moving Averages normal distribution normal probability plot NSFG Variables null hypothesis NumPy pandas Pareto Distribution PDFs Pearson’s correlation percentile rank population predictions pregnancy length provides pvalue Python Random Numbers residuals respondents RMSE sampling distribution sampling error Scatter Plots sequence serial correlation shows the result simulated skewness slope Spearman’s Rank Correlation standard deviation standard error statistically significant StatsModels Survey of Family Survival Curves survival function SurvivalFunction test statistic test_stat thinkstats2 variance Weighted Resampling