Labour Economics: An Introduction
- Study of labour markets: wages and wage structure, employment, unemployment, policy
and labour markets, employer labour practices.
- A mix of theory and empirical work:
- theory: a way of organizing thoughts – logical, precise but a simplification.
- theory in labour economics:
applied microeconomics
models of consumer behavior (labour supply)
theory of the firm (labour demand)
supply-demand, monopsony (labour market models)
investment theory (skill creation)
incentive theory (contracts)
bargaining models (unionized markets)
information models (matching, search, incentives)
macroeconomics and labour economics?
business cycles:
unemployment: theories with labour market frictions or
imperfections.
wages, employment and recessions.
long-run macroeconomics:
productivity and its determinants.
- empirical: testing and measurement
policy effects and evaluation
special statistical techniques
special data issues (microdata, panel data)
- Practical importance:
- about 65-70% of income is generated through labour markets (Canada, US).
- labour policy is an active area
- research: a very active field (abundance of data!)
- This course: an introduction and sampler.
- learn some theory, learn some econometrics, see some applications and apply methods to
data.
Background and History of Labour Economics
- See: Introduction to Cahuc and Zylberberg (2004) Labor Economics.
- Rise of markets, industrial revolution:
- modern labour markets and working for pay become prominent.
- explaining these is a key task for labour economics.
- Adam Smith (1776) Wealth of Nations:
- wage determinants: compensating differentials
- labour productivity, specialization and the division of labour.
- labour theory of value.
- Classical economists: 19th century Britain
- “iron law of wages”: explaining poverty.
- Malthus: population and labour supply in the long-run.
- labour productivity and diminishing returns.
- Marxist Economics:
- Classical value theory with ethics: a theory of distribution and exploitation.
- unemployment, technology and wages.
- Neoclassical revolution: late 19th century – start of modern microeconomics
- marginalism, concern with resource allocation.
- income distribution: marginal productivity theory and labour demand.
- marginal utility: roots of the model of labour supply.
- supply-demand model (Alfred Marshall)
- Keynesian economics and labour markets: explaining the Great Depression
- unemployment: rigid wages? Causes?
- J. Hicks: Theory of Wages (1932)
- introduced modern microeconomics to labour economics.
- early model of wage determination with worker and employer coalition bargaining.
- Institutionalists:
- dominate American labour economics 1920s-50s/60s.
- descriptive, atheoretical, empirical.
- explaining internal markets, employer interviews.
- sociology and labour studies: similar to institutionalists.
- remnants: data-driven empirical research; dual labour market theories.
- Age of Becker! 1950s-1960s:
- Human capital theory: Gary Becker, T. Schultz
- Earnings determination: Jacob Mincer
- Discrimination theory: Gary Becker
- Labour supply as a time allocation problem, economics of the family: Becker.
- 1970s-1990s:
- Microeconomic theories of unemployment.
- Contract theory and incentives.
- Search and matching approaches to labour markets (Mortensen, Pissarides, Diamond)
- Influence of the computer:
- empirical analysis of "microdata": possible for the first time.
- rise of empirical labour economics.
- microdata problems and methods.
- Ongoing evolution:
- Methodological debates: data vs. theory
Structuralists (theory-based empirical methods)
- coherent interpretation of results needs a specific theoretical framework.
- problems: structural approaches often only tractable under strong technical
assumptions.
Mainstream: data-driven approaches, specifications appeal to theory but are not derived from specific models.
- Identification:
Natural experiments and matching treatment with control groups.
Instrumental variables methods.
- Current research issues:
- Wage inequality and labour’s share: trends and explanation.
(technology, globalization, institutions (policy, unions)).
- Recessions and labour markets:
- cyclical vs. structural unemployment
- long-term unemployment and its effects.
- Aging populations and labour markets.
- Technology, trade and the future of labour markets.
- Health, education outcomes.
Data, Data Sources and Data Handling: Preliminaries
Importance of Data in Labour Economics:
- Description: identifying what facts need to be explained.
- Observation may motivate theory.
e.g. views on unemployment: static, turnover views and data availability.
Wage differences and discrimination; wage inequality and “the 1%”.
- Testing and measuring microeconomic relationships
- Labour supply and wage determinants, labour costs and employment:
presence and size of the effects?
- Policy development and evaluation:
- Measurement of policy effects.
- Examples:
Welfare reform (mid/late-1990s), EI (1990s) and labour supply.
Baker and Fortin (2000) Ontario pay equity.
Minimum wages, mandatory retirement etc.
Effects of taxes on incomes and employment.
- Findings may spur policy:
e.g. immigrant outcomes literature and selection of immigrants;
wage inequality in the English speaking world --- tax/transfer, education policy.
Types of labour market data.
- Survey Data and Administrative data
- Administrative data: based on government records.
e.g., unemployment insurance files
payroll or income tax data.
- Survey data: most readily available
- questionnaire responses from a sample of the population of interest.
- Increasingly datasets may combine survey and administrative data.
e.g. 2006 Census – answer income questions or allow access to tax records.
- Establishment and Household Survey Data
- Establishment surveys: survey employers or businesses
e.g., Survey of Employment, Payrolls and Hours
- Household surveys e.g. Labour Force Survey, Census, SLID
- survey individuals or households.
- Matched data: links household and establishment data.
e.g. survey workers and their employers; or
link survey data for one party with administrative data for the other.
- Aggregate and Microdata:
- Microdata: unit of observation is an individual
person (household) or business.
e.g., is a particular person employed, unemployed this week?
what is the person’s wage rate?
- a microdata set can have thousands of observations.
- Aggregate: microdata which has been "aggregated"
(summed, averaged, etc.)
e.g., how many people are employed or unemployed this week in Canada?
what is the average wage of men in Canada?
- typically a limited number of observations; it is a summary of microdata
outcomes.
- Time series, cross-section, pooled and longitudinal data:
- Time series data: variables vary over time only
e.g., unemployment rate and minimum wage in Ontario 1976-2014.
(special issues: dynamics, collinearity and autocorrelation, stationarity)
- Cross-section: variables differ between units of observation at the same point in
time.
- Could be aggregate
e.g., provincial unemployment rates and minimum wage rates
in August 2014
- Could be microdata: labour market outcomes of individuals in Thunder
Bay in August 2014.
- Pooled time-series/cross-section data: variables vary between cross sections and
across time.
- Could be aggregate data
e.g., provincial unemployment rates and minimum wages 1976-2014.
- Could be microdata
- several cross-sections of labour market outcomes of individuals.
e.g. labour market outcomes of individuals in Thunder Bay in August
2012, August 2013 and August 2014.
- Longitudinal or Panel Data:
- follows the same individuals (households) or firms across time.
(a special kind of pooled time series/cross-section data)
- Growth in data availability: microdata especially.
- Growth in computing power: ability to handle microdata has increased massively.
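The distinctions above (microdata vs. aggregate, cross-section vs. pooled vs. longitudinal) can be made concrete with a small sketch. The records, field names and the helper functions `aggregate` and `panel_members` are hypothetical, made up for illustration; they do not come from any Statistics Canada file.

```python
from collections import defaultdict

# Hypothetical microdata: one record per person per survey year.
microdata = [
    {"person": 1, "province": "ON", "year": 2013, "employed": True,  "wage": 22.0},
    {"person": 2, "province": "ON", "year": 2013, "employed": False, "wage": None},
    {"person": 3, "province": "SK", "year": 2013, "employed": True,  "wage": 25.0},
    {"person": 1, "province": "ON", "year": 2014, "employed": True,  "wage": 23.0},
    {"person": 4, "province": "ON", "year": 2014, "employed": True,  "wage": 19.0},
    {"person": 3, "province": "SK", "year": 2014, "employed": False, "wage": None},
]

def aggregate(records):
    """Collapse microdata to province-year cells (employment rate, mean wage)."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["province"], r["year"])].append(r)
    out = {}
    for key, rows in sorted(cells.items()):
        wages = [r["wage"] for r in rows if r["wage"] is not None]
        out[key] = {
            "n": len(rows),
            "employment_rate": sum(r["employed"] for r in rows) / len(rows),
            "mean_wage": sum(wages) / len(wages) if wages else None,
        }
    return out

def panel_members(records):
    """People observed in more than one year (the longitudinal part of the data)."""
    years = defaultdict(set)
    for r in records:
        years[r["person"]].add(r["year"])
    return sorted(p for p, ys in years.items() if len(ys) > 1)

aggregates = aggregate(microdata)                  # pooled aggregate data
cross_section_2014 = [r for r in microdata if r["year"] == 2014]
followed = panel_members(microdata)                # persons seen in both years

print(aggregates[("ON", 2014)])
print("followed over time:", followed)
```

The same six records serve as pooled microdata (all rows), a cross-section (the 2014 rows only), aggregate data (the province-year cells) and, for the persons who appear twice, a short panel.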
Sampling Methods and Microdata:
- Dataset could be the population of interest.
e.g. some administrative datasets such as income tax records; short-form Census.
- Most datasets are a sample from the population of interest.
- Why sample? Cheaper!
- Problem: sampling error.
- can only estimate values of interest for the population e.g. means, medians,
variances.
- tradeoff: sample size and accuracy of estimates.
- How is the sample selected?
- Random sample: each observation in the population has an equal probability of being
selected for the sample.
- in constructing estimates from the sample each observation receives equal weight.
e.g. 10% probability of being selected to the sample then each observation in the
sample represents 10 people in the population.
- Non-random sample: probability of being selected differs across observations.
- some types of observations are oversampled and some are undersampled
(compared to the random sample).
- weights now differ.
Observations with 10% probability of selection represent 10 observations in
the population.
Observations with 5% probability of selection represent 20 observations in
the population.
(must take this into account when constructing estimates for the
population from the sample)
- why non-random sampling?
(a) Could be part of the sampling strategy:
- a given sized random sample may give too few observations for reliable
estimates of some subgroups of interest.
- options? Increase the size of the entire random sample (expensive) or just
increase samples of the undersampled subgroups.
How to construct the non-random sample? “stratification” and “clustering”
- stratification: divide the population into subgroups of interest.
- identify clusters of observations in each subgroup.
- sampling: draw a sample of clusters from each subgroup;
sample individuals from the selected clusters.
(b) Could reflect sampling problems:
- non-response, difficulties contacting certain groups.
- results in a non-random sample if non-response/ sampling problems differ
across types of observations.
- Either (a), (b) or both gives a non-random sample.
- Datafile often includes a weighting variable that takes into account the way in
which the sample is selected from the population.
- Weight is often:
Weight = 1/(probability the observation is selected for the sample)
(number of people in the population represented by the observation)
- Sometimes the reported weights have been normalized to sum to 1.
- To generate estimates representative of the population data from individual
observations must be weighted (see Assignment 1).
- Good source on microdata, sampling and weighting:
A. Deaton (1997) The Analysis of Household Surveys Ch. 1 (see especially Sections
1.1 and 1.4)
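A minimal sketch of how the weighting rule above (weight = 1/probability of selection) changes an estimate. The sample, its strata and the selection probabilities are invented for illustration; real surveys supply the weight variable directly, and it usually also reflects non-response adjustments not modelled here.

```python
# Hypothetical stratified sample in which the rural stratum is oversampled.
# Each tuple: (stratum, probability of selection, hourly wage).
sample = [
    ("urban", 0.05, 30.0),
    ("urban", 0.05, 26.0),
    ("rural", 0.10, 20.0),
    ("rural", 0.10, 18.0),
    ("rural", 0.10, 22.0),
    ("rural", 0.10, 20.0),
]

def weighted_mean(values, weights):
    """Weighted average: sum(w * v) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

wages = [wage for _, _, wage in sample]
weights = [1.0 / p for _, p, _ in sample]   # people represented by each observation

unweighted = sum(wages) / len(wages)        # treats the sample as if it were random
weighted = weighted_mean(wages, weights)    # estimate for the population
population_represented = sum(weights)

# Weights normalized to sum to 1 give the same weighted mean.
normalized = [w / sum(weights) for w in weights]
weighted_normalized = weighted_mean(wages, normalized)

print(f"unweighted mean wage: {unweighted:.2f}")
print(f"weighted mean wage:   {weighted:.2f}")
```

Because the low-wage rural stratum is oversampled, the unweighted mean understates the population mean in this example; normalizing the weights changes the implied population count but not the weighted mean.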
Major Canadian Sources of Labour Market Data:
Labour Force Survey (LFS):
- Monthly, household survey.
- Since 1945, major redesigns: 1966, 1976. Additional variables since 1997.
- Big: 53,000 households, data on more than 100,000 individuals each month.
- Stratified sample, not random (see above): must weight microdata.
- What?
- Employment, unemployment, hours, type of work.
- Since 1997: wage, union status. Some other job characteristics.
- By personal characteristic (age, sex, education etc.), region.
- Availability?
- The Daily: Labour Force usually the first Friday of each month.
- Labour Force Historical Review: summary statistics by characteristic back to 1976.
(a handy tool: we may use it occasionally)
- Source data files: available back to 1976.
i.e., microdata (cross-sectional) -- we will use some of these on the assignments.
- See: Guide to the Labour Force Survey (link on website) for details including survey form.
Survey of Labour and Income Dynamics (SLID)
- Longitudinal data (released as cross-sections), household survey.
- Started 1993 (earlier: Survey of Consumer Finances).
- Sub-sample of the Labour Force Survey sample.
- follow each person for 6 years.
- new sub-sample started every 3 years.
- 15,000 households in each wave (since 1998: always 2 overlapping waves).
- Originally intended to study:
- income dynamics, labour market dynamics, family dynamics.
- Has become the main source of annual income data, see
Statistics Canada, Income in Canada, Cat. 75-202 XIE.
Census
- Every 5 years since 1961 (every 10 years pre-1961).
- Subset of households receive a detailed questionnaire (long-form)
- source of income and labour market data: since 2006 often linked to tax data.
- observations in subsets typically have equal weights (as if a random sample).
- Good data on personal characteristics: notably location, education, ethnic background,
immigration status and language.
- Very limited information on jobs: occupation, industry.
- Microdata: Public use microdata files
- large samples
- 2%-3% of the population 15 yrs and over.
- National Household Survey replaced the Census starting in 2011 (microdata late 2014?).
Survey of Employment, Payrolls and Hours
- Monthly establishment survey.
- increasingly relies on administrative (tax) data.
- Average wages, hours, number of employees by detailed industry and geographical location.
- Publication: Employment, Hours and Earnings Cat. 72-002
- Microdata unavailable.
United States Data
- Current Population Survey (CPS):
- US version of the Labour Force Survey.
- much of the data is available free through:
- Bureau of Labor Statistics website (aggregate data: free!)
- US Census Bureau: microdata (free!)
- Panel Study of Income Dynamics (PSID):
- "Michigan Panel"
- a major US longitudinal data set (since 1968): similar to SLID.
- National Longitudinal Survey (NLS):
- other long-standing US longitudinal data set.
Europe
- Common for countries to have a household survey (like our Labour Force Survey).
- Some countries rely on administrative datasets: sample is the population! e.g. Denmark
Using Data: Tips
- Hamermesh (2000): some useful suggestions and cautions.
- Which data set should be used?
- Variables available?
- How are key variables coded/defined in the dataset?
- Sample size: is it adequate?
- What data sets have other researchers used? Why?
- Know the Data:
- Sample coverage: are there any restrictions? Is it representative or restricted to some subset
of the population?
- Check questionnaire:
- Is the question flawed? Is the question asked only of a subset of the sample? Could
question order affect results?
- Be familiar with variable codings and definitions.
- detail reported?
- top-coding or bottom-coding?
- missing value codes: don’t confuse them with real data!
- What sampling methods were used?
- random sample? stratified sample? Are sample weights provided?
- Data may need to be weighted for some uses.
- Is the data concerned with current status or is it recall (retrospective) data?
- This could be important for reliability of the data.
- Preliminary investigation of the data:
- Generate descriptive statistics for key variables:
means, minimum and maximum values, frequencies.
- Look for oddities, unexpected results.
- Have you misunderstood something?
- Have you made programming errors?
- Are key figures in line with other sources? If not why? (review literature)
- Cross-tabulations, means by characteristic, comparisons of descriptive statistics between
datasets.
- Watch out for surprises/oddities!
- Is there any preliminary evidence on the issues you are examining?
- Appropriate methods:
- draw on economic theory: what variables should determine outcomes?
- draw on econometric knowledge.
- draw on examination of the literature (what techniques have others used? What
explanatory variables have they used? Why?)
- Sensitivity:
- How do results change with changes to specification, estimation technique, sample used?
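The cautions above about missing value codes and top-coding can be sketched as a first pass over a variable. The wage values, the missing code 999.99 and the top-code of 60.0 are assumptions made up for this example; the actual codes must be taken from the codebook of the dataset being used.

```python
# Hypothetical extract of hourly wages. MISSING_CODE and TOP_CODE are assumed
# values for illustration, not codes from any real codebook.
MISSING_CODE = 999.99
TOP_CODE = 60.0
raw_wages = [18.5, 22.0, 999.99, 31.25, 60.0, 15.0, 999.99, 60.0, 27.5]

def describe(values):
    """Basic descriptive statistics: count, mean, minimum, maximum."""
    return {
        "n": len(values),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

naive = describe(raw_wages)                 # missing codes treated as real data
valid = [w for w in raw_wages if w != MISSING_CODE]
clean = describe(valid)                     # missing codes dropped first

n_missing = len(raw_wages) - len(valid)
share_top_coded = sum(w == TOP_CODE for w in valid) / len(valid)

print("naive:", naive)
print("clean:", clean)
print(f"missing: {n_missing}, share top-coded: {share_top_coded:.2f}")
```

An implausible maximum in the naive statistics is the kind of oddity worth chasing down: here it signals that a missing value code has been read as a wage. A large share of observations sitting exactly at the maximum of the cleaned data suggests top-coding.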
Methods in Empirical Labour Economics
- Non-standard issues:
- Qualitative data: dummy or other categorical variables are common.
e.g., have a university degree or not; male or female; construction worker or not, etc.
- Dummy dependent variables: probit, logit, multinomial logit methods.
e.g., employed or not
- Censored data: dependent variables with limits
e.g. maximum or minimum earnings
- Sample selection problems: the classical linear regression model assumes random
sampling.
e.g. migrants are unlikely to be a random sample of the source population.
- Natural experiment approaches.
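A small simulation can illustrate the sample selection problem listed above. The setup is an assumption chosen for illustration, not a model from the notes: each person has a wage offer and a reservation wage, and the wage is observed only for those who choose to work.

```python
import random

random.seed(0)

# Hypothetical population: a person works (and the wage is observed) only if
# the wage offer exceeds that person's reservation wage.
population = []
for _ in range(10_000):
    offer = random.gauss(20.0, 5.0)
    reservation = random.gauss(18.0, 5.0)
    population.append((offer, offer > reservation))

all_offers = [offer for offer, _ in population]
observed = [offer for offer, works in population if works]

mean_all = sum(all_offers) / len(all_offers)      # mean over everyone
mean_observed = sum(observed) / len(observed)     # mean over workers only
share_working = len(observed) / len(population)

print(f"mean offer, whole population: {mean_all:.2f}")
print(f"mean wage, workers only:      {mean_observed:.2f}")
print(f"share working:                {share_working:.2f}")
```

The workers are not a random sample of the population: people with high offers are more likely to work, so the mean observed wage overstates the mean offer. The same logic applies to the migrant example above.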
- Empirical Methods and Causality: a big issue in practice!
- See Angrist and Krueger (1999) for a more detailed overview of several techniques we will
encounter later.
- Angrist and Pischke (2009) Mostly Harmless Econometrics. An interesting discussion of
current practice – advocates looking at the problem from an experimental perspective.
(1) Control for confounding variables.
e.g., adding control variables in a regression to isolate the effect of interest.
Estimating: Wi = a + b Ui + ei (ei – error term, a, b coefficients)
- What if Ui depends on an observable variable, Xi, that affects both Wi and Ui?
- estimate of “b” mixes effect of Ui on Wi and the effect of Xi on Wi.
- Estimate instead:
Wi = a + b Ui + cXi + ei
- This method is used in most empirical studies in labour economics.
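A simulated sketch of method (1). Reading Ui as a union-membership dummy and Xi as a skill measure is an assumption made here for illustration (the notes do not define them), as are all the parameter values. The helper `ols` is a small hand-written least-squares solver, not a library routine.

```python
import random

random.seed(42)

def ols(y, *regressors):
    """Least-squares coefficients [intercept, slope_1, ...] via normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)] for r in range(k)]
    b = [sum(X[i][r] * y[i] for i in range(n)) for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * p for x, p in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[r] / A[r][r] for r in range(k)]

TRUE_B, TRUE_C = 2.0, 3.0
W, U, X = [], [], []
for _ in range(5000):
    x = random.gauss(0.0, 1.0)                               # observable confounder Xi
    u = 1.0 if 0.8 * x + random.gauss(0.0, 1.0) > 0 else 0.0  # Ui depends on Xi
    w = 10.0 + TRUE_B * u + TRUE_C * x + random.gauss(0.0, 1.0)
    W.append(w); U.append(u); X.append(x)

b_short = ols(W, U)[1]       # Wi = a + b Ui + ei: b mixes in the effect of Xi
b_long = ols(W, U, X)[1]     # Wi = a + b Ui + c Xi + ei: Xi controlled for

print(f"true b = {TRUE_B}, without Xi: {b_short:.2f}, with Xi: {b_long:.2f}")
```

In this setup the regression that omits Xi attributes part of the effect of Xi to Ui, so its estimate of b is well above the true value; adding Xi as a control brings the estimate back close to it.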
(2) Fixed effect methods:
- Controlling for unobservable factors using multiple observations on the same
unit of observation (e.g. person).
- Say that Wi depends on unobservable factors specific to individual:
Wi = ai + b Ui + cXi + ei (intercept differs by person i)
- If unobservable is correlated with Ui then again b is biased.
- If there are multiple observations on person i over time then ai can be estimated as a
“fixed effect” or it could be differenced out.
- Common approach with panel / longitudinal data.
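A simulated sketch of method (2), assuming two observations per person and, for brevity, no Xi term. The person effect ai is unobserved and correlated with Ui. All parameter values are assumptions for illustration. The fixed-effect estimate is computed by subtracting each person's own mean from the data (the within transformation), which removes ai.

```python
import random

random.seed(1)

def slope(x, y):
    """Bivariate least-squares slope: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

TRUE_B = 2.0
w_pooled, u_pooled = [], []      # raw observations, all person-periods
w_within, u_within = [], []      # deviations from each person's own mean

for person in range(2000):
    a_i = random.gauss(0.0, 1.0)                  # unobserved person effect ai
    # Ui in each of two periods: more likely to be 1 when ai is high.
    u = [1.0 if a_i + random.gauss(0.0, 1.0) > 0 else 0.0 for _ in range(2)]
    w = [a_i + TRUE_B * u_t + random.gauss(0.0, 0.5) for u_t in u]
    w_bar, u_bar = sum(w) / 2, sum(u) / 2
    for w_t, u_t in zip(w, u):
        w_pooled.append(w_t)
        u_pooled.append(u_t)
        w_within.append(w_t - w_bar)              # ai drops out of the deviation
        u_within.append(u_t - u_bar)

b_pooled = slope(u_pooled, w_pooled)   # ignores ai: biased upward in this setup
b_fixed = slope(u_within, w_within)    # within (fixed-effect) estimate

print(f"true b = {TRUE_B}, pooled: {b_pooled:.2f}, fixed effects: {b_fixed:.2f}")
```

Only people whose Ui changes between the two periods contribute to the within estimate, which is why fixed-effect methods need enough "switchers" in the panel.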
(3) Difference-in-differences estimates
- Appeals to an experimental approach.
- Exogenous event occurs.
- Observe change (difference before and after event) in the variable of interest for
group affected by the event.
- Observe how variable of interest changed in some “control group”.
- Calculate the effect of the event as the difference-in-differences, i.e., difference
between the change for the group affected and the control group.
e.g., Alberta bans unions: observe how wages change after this event and compare to
how wages change in say Saskatchewan.
- See minimum wage debate, US welfare reform effects, overtime effects, etc.
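The difference-in-differences calculation in (3) is simple arithmetic on four group means. The wage figures below are invented for the Alberta/Saskatchewan example and carry no empirical content; `diff_in_diff` is a hypothetical helper.

```python
# Hypothetical mean hourly wages (illustrative numbers only, not real data).
mean_wage = {
    ("Alberta", "before"): 24.0,
    ("Alberta", "after"): 24.5,
    ("Saskatchewan", "before"): 22.0,
    ("Saskatchewan", "after"): 23.5,
}

def diff_in_diff(means, treated, control):
    """Return (change for treated, change for control, difference-in-differences)."""
    change_treated = means[(treated, "after")] - means[(treated, "before")]
    change_control = means[(control, "after")] - means[(control, "before")]
    return change_treated, change_control, change_treated - change_control

d_treated, d_control, did = diff_in_diff(mean_wage, "Alberta", "Saskatchewan")

print(f"change in treated group: {d_treated:+.2f}")
print(f"change in control group: {d_control:+.2f}")
print(f"difference-in-differences estimate: {did:+.2f}")
```

With these made-up numbers wages rise in both provinces, but by less in the treated one, so the estimated effect of the event is negative. The estimate is only as good as the assumption that, without the event, wages in the two groups would have changed by the same amount.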
(4) Instrumental variable techniques
- Say Ui is partly determined by unobservables that either affect Wi or are correlated
with ei .
- Then: cov(Ui, ei) ≠ 0 and estimate of b is biased.
- Possible solution: find an instrumental variable (Zi) with
cov(Zi, Ui) ≠ 0 and cov(Zi, ei) = 0 and then use Zi as an instrument for Ui in
the wage equation regression.
- Very popular in recent literature estimating the effect of education on wages.
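A simulated sketch of method (4) with a single instrument, in which case the instrumental variable estimate of b reduces to cov(Zi, Wi) / cov(Zi, Ui). The data-generating process (an unobserved "ability" term that raises both Ui and Wi) and all parameter values are assumptions chosen for illustration.

```python
import random

random.seed(2)

def cov(x, y):
    """Sample covariance (divides by n; the scaling cancels in the ratios below)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

TRUE_B = 2.0
Z, U, W = [], [], []
for _ in range(5000):
    ability = random.gauss(0.0, 1.0)            # unobservable, ends up in ei
    z = random.gauss(0.0, 1.0)                  # instrument: unrelated to ability
    u = z + ability + random.gauss(0.0, 1.0)    # Ui depends on Zi and on ability
    w = 1.0 + TRUE_B * u + 3.0 * ability + random.gauss(0.0, 1.0)
    Z.append(z); U.append(u); W.append(w)

b_ols = cov(U, W) / cov(U, U)    # biased because cov(Ui, ei) is not zero
b_iv = cov(Z, W) / cov(Z, U)     # uses only the variation in Ui driven by Zi

print(f"true b = {TRUE_B}, OLS: {b_ols:.2f}, IV: {b_iv:.2f}")
```

The instrument works in this simulation because it is built to satisfy both conditions: it shifts Ui and is independent of the unobservable. In real applications the second condition cannot be tested directly and has to be argued.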