Statistics

Created by @miao

What is the purpose of a census in data collection?

A census attempts to collect data from every member of a population. Because no sample is taken, there is no sampling error, although nonresponse and measurement errors can still occur.

How does a simple random sample differ from a stratified random sample?

A simple random sample of size n is chosen so that every group of n individuals in the population has an equal chance of being selected. A stratified random sample first divides the population into subgroups (strata) of similar individuals and then takes a simple random sample from each stratum, so every stratum is represented.
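
The difference between the two designs can be sketched in a few lines of Python. This is only an illustration: the population of day students and boarders, the function names, and the per-stratum sample size are all made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group individuals by stratum, then take an SRS within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

# Hypothetical population: 90 day students and 10 boarders.
population = [("day", i) for i in range(90)] + [("boarder", i) for i in range(10)]

srs = simple_random_sample(population, 10, seed=1)
stratified = stratified_random_sample(population, lambda p: p[0], 5, seed=1)
```

The simple random sample may contain few or no boarders by chance, while the stratified sample always contains exactly five from each group.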

What are the potential biases in voluntary response samples?

Voluntary response samples are prone to bias because the participants select themselves. People with strong opinions are overrepresented, so the sample does not reflect the population as a whole.

Explain the concept of undercoverage in sampling.

Undercoverage occurs when some groups in the population are left out of the sampling process or have a smaller chance of being selected than others. Those groups are underrepresented in the sample, so the results are biased and do not accurately reflect the population.

What is the difference between an observational study and an experiment?

An observational study observes subjects without manipulation, while an experiment involves the researcher actively manipulating variables to determine cause-and-effect relationships.

Define confounding variable and provide an example.

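A confounding variable is a factor outside the study design that is associated with the explanatory variable and also influences the response variable, so its effect cannot be separated from the effect of the explanatory variable. For example, in a study on exercise and weight loss, diet could be a confounding variable if people who exercise more also tend to eat differently.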

What is the significance of random assignment in experiments?

Random assignment gives each participant an equal chance of being placed in any treatment group. This tends to balance the effects of other variables across the groups, which reduces bias and supports causal inference.
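
One common way to carry out random assignment is to shuffle the participants and deal them into groups in turn. The sketch below assumes 20 participants numbered 1 to 20 and two hypothetical group labels; the function name is invented for the example.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them into treatment groups in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: participants 1..20, two groups of equal size.
groups = randomly_assign(range(1, 21), ["treatment", "placebo"], seed=42)
```

Because the order is random, no participant characteristic can systematically decide who ends up in which group.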

How does replication enhance the reliability of an experiment?

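Replication means applying each treatment to enough experimental units that a real treatment effect can be distinguished from chance variation. It can also mean repeating the whole experiment to confirm the results. Both increase the reliability of the findings.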

What is the placebo effect and why is it important in experiments?

The placebo effect occurs when participants experience changes due to their expectations rather than the treatment itself. It is important to control for this effect to accurately assess the treatment's efficacy.

What are the characteristics of a double-blind study?

In a double-blind study, neither the participants nor the researchers who interact with them or measure the response know who is receiving the treatment and who is receiving the placebo. This helps reduce bias in both the administration of treatments and the reporting of results.

What is the empirical rule in statistics?

The empirical rule states that for a normal distribution, approximately 68% of data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
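
The three percentages can be checked against the standard normal distribution using Python's standard library. This is a quick verification sketch, not part of the rule itself; the variable names are arbitrary.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} SD: {p:.4f}")
```

The exact values are about 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.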

How do you determine if a distribution is approximately normal?

A distribution is approximately normal if it is symmetric, bell-shaped, and follows the empirical rule, with most data points clustering around the mean.

What is the role of a control group in an experiment?

A control group serves as a baseline to compare the effects of the treatment, helping to isolate the treatment's impact from other variables.

Explain the concept of response bias and its implications.

Response bias occurs when participants provide inaccurate or untruthful responses, often due to social desirability or misunderstanding questions, leading to skewed data.

What is a two-way table and how can it be used to determine association?

A two-way table displays the relationship between two categorical variables, allowing for the assessment of association by comparing the distribution of one variable across the levels of the other.
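
As a small illustration, the comparison can be done by turning each row of the table into a conditional distribution. The counts below are invented for the example (a hypothetical survey asking freshmen and seniors whether they own a car).

```python
# Hypothetical two-way table: rows are class year, columns are the answer.
counts = {
    "freshman": {"yes": 30, "no": 70},
    "senior": {"yes": 60, "no": 40},
}

def conditional_distribution(row):
    """Convert one row of counts into proportions that sum to 1."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}

conditionals = {group: conditional_distribution(row) for group, row in counts.items()}

for group, dist in conditionals.items():
    print(group, dist)
```

Here 30% of freshmen and 60% of seniors answer yes. Because the conditional distributions differ, the two variables appear to be associated in this made-up data; if the rows had matching proportions, there would be no evidence of association.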

How does standardizing data affect its mean and standard deviation?

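Standardizing converts each value to a z-score by subtracting the mean and dividing by the standard deviation. The standardized data have a mean of 0 and a standard deviation of 1, and the shape of the distribution is unchanged. Because z-scores have no units, values from different datasets can be compared.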
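
A minimal sketch of the calculation, using made-up test scores. It assumes the population standard deviation (`pstdev`); using the sample standard deviation instead is also common and gives slightly different z-scores.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD
    return [(x - mu) / sigma for x in values]

scores = [62, 70, 75, 81, 92]  # hypothetical test scores
z_scores = standardize(scores)
```

The z-scores keep the same order and relative spacing as the original scores, which is why the shape of the distribution does not change.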

What is the significance of the area under a density curve?

The total area under a density curve is always equal to 1, because the curve covers all possible values and so accounts for 100% of the observations. The area under the curve above any interval gives the proportion of observations that fall in that interval.
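
The idea can be checked numerically for one particular density curve. The sketch below uses the standard normal curve as an example and a simple trapezoid rule; the helper name and the integration limits of -8 to 8 (far enough out that the remaining area is negligible) are choices made for the illustration.

```python
from statistics import NormalDist

def area_under(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width

curve = NormalDist(mu=0, sigma=1)

total_area = area_under(curve.pdf, -8, 8)       # essentially the whole curve
area_below_zero = area_under(curve.pdf, -8, 0)  # the left half of the curve
```

The total area comes out as 1 to within rounding, and the area to the left of the mean is 0.5, meaning half of the observations lie below the mean of this symmetric curve.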

How can you identify outliers in a dataset?

Outliers can be identified using the interquartile range (IQR) method, where any data point that lies more than 1.5 times the IQR above the third quartile or below the first quartile is considered an outlier.
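
A short sketch of the 1.5 × IQR rule on a made-up dataset. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of Python's `statistics.quantiles`, so quartiles computed by hand or by a calculator may differ slightly.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return the values lying beyond 1.5 * IQR from the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

data = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical data
outliers = iqr_outliers(data)
print(outliers)
```

For this data Q1 = 5 and Q3 = 9, so the IQR is 4 and the fences are -1 and 15. Only the value 30 falls outside them.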

What is a cumulative relative frequency graph and what does it show?

A cumulative relative frequency graph plots the running total of relative frequencies. Each point shows the proportion of data values at or below a given value, which makes it easy to read off percentiles and see how the data are distributed.
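
The points of such a graph can be computed directly from raw data. This is a minimal sketch with an invented function name and a tiny made-up dataset; it returns the points only and leaves the plotting out.

```python
def cumulative_relative_frequency(values):
    """Return (value, proportion of data at or below value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        if points and points[-1][0] == x:
            points[-1] = (x, i / n)  # repeated value: keep the highest proportion
        else:
            points.append((x, i / n))
    return points

ogive = cumulative_relative_frequency([3, 1, 2, 2, 5])
print(ogive)
```

For this data, 60% of the values are at or below 2, and the final point always reaches a proportion of 1.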

What are the key components of the SOCS method for describing distributions?

The SOCS method includes Shape (the overall shape of the distribution, such as symmetric or skewed), Outliers (any unusual data points), Center (the mean or median), and Spread (the range, interquartile range, or standard deviation).

How can you assess normality in a dataset?

Normality can be assessed using graphical methods like histograms or Q-Q plots, as well as statistical tests such as the Shapiro-Wilk test, which evaluates how closely the data follows a normal distribution.
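
The Q-Q plot idea can be turned into a rough numeric check by measuring how close the plot is to a straight line. The sketch below is not the Shapiro-Wilk test (SciPy provides that as `scipy.stats.shapiro`); it is a simpler stand-in that uses only the standard library, and the plotting positions `(i - 0.5) / n`, the function name, and the simulated datasets are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

def qq_correlation(values):
    """Correlation between sorted data and matching standard normal quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_x = sum(theoretical) / n
    mean_y = sum(ordered) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mean_x) ** 2 for x in theoretical)
    syy = sum((y - mean_y) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
normal_like = [rng.gauss(50, 10) for _ in range(200)]       # simulated normal data
right_skewed = [rng.expovariate(1.0) for _ in range(200)]   # simulated skewed data

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(right_skewed)
```

A correlation very close to 1 means the Q-Q plot is nearly linear, which is consistent with normality. Noticeably lower values, as for the skewed data, suggest a departure from it.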