Graduate Labour Notes

Labour Economics: An Introduction

- Study of labour markets: wages and wage structure, employment, unemployment, policy

and labour markets, employer labour practices.

- A mix of theory and empirical work:

- theory: a way of organizing thoughts – logical, precise but a simplification.

- theory in labour economics:

applied microeconomics

models of consumer behavior (labour supply)

theory of the firm (labour demand)

supply-demand, monopsony (labour market models)

investment theory (skill creation)

incentive theory (contracts)

bargaining models (unionized markets)

information models (matching, search, incentives)

macroeconomics and labour economics?

business cycles:

unemployment: theories with labour market frictions or

imperfections.

wages, employment and recessions.

long-run macroeconomics:

productivity and its determinants.

- empirical: testing and measurement

policy effects and evaluation

special statistical techniques

special data issues (microdata, panel data)

- Practical importance:

- about 65-70% of income is generated through labour markets (Canada, US).

- labour policy is an active area

- research: a very active field (abundance of data!)

- This course: an introduction and sampler.

- learn some theory, learn some econometrics, see some applications and apply methods to

data.

Background and History of Labour Economics

- See: Introduction to Cahuc and Zylberberg (2004) Labor Economics.

- Rise of markets, industrial revolution:

- modern labour markets and working for pay become prominent.

- explaining these is a key task for labour economics.

- Adam Smith (1776) Wealth of Nations:

- wage determinants: compensating differentials

- labour productivity, specialization and the division of labour.

- labour theory of value.

- Classical economists: 19th century Britain

- “iron law of wages”: explaining poverty.

- Malthus: population and labour supply in the long-run.

- labour productivity and diminishing returns.

- Marxist Economics:

- Classical value theory with ethics: a theory of distribution and exploitation.

- unemployment, technology and wages.

- Neoclassical revolution: late 19th century – start of modern microeconomics

- marginalism, concern with resource allocation.

- income distribution: marginal productivity theory and labour demand.

- marginal utility: roots of the model of labour supply.

- supply-demand model (Alfred Marshall)

- Keynesian economics and labour markets: explaining the Great Depression

- unemployment: rigid wages? Causes?

- J. Hicks: Theory of Wages (1932)

- introduced modern microeconomics to labour economics.

- early model of wage determination with worker and employer coalition bargaining.

- Institutionalists:

- dominate American labour economics 1920s-50s/60s.

- descriptive, atheoretical, empirical.

- explaining internal markets, employer interviews.

- sociology and labour studies: similar to institutionalists.

- remnants: data-driven empirical research; dual labour market theories.

- Age of Becker! 1950s-1960s:

- Human capital theory: Gary Becker, T. Schultz

- Earnings determination: Jacob Mincer

- Discrimination theory: Gary Becker

- Labour supply as a time allocation problem, economics of the family: Becker.

- 1970s-1990s:

- Microeconomic theories of unemployment.

- Contract theory and incentives.

- Search and matching approaches to labour markets (Mortensen, Pissarides, Diamond)

- Influence of the computer:

- empirical analysis of "microdata": possible for the first time.

- rise of empirical labour economics.

- microdata problems and methods.

- Ongoing evolution:

- Methodological debates: data vs. theory

Structuralists (theory-based empirical methods)

- coherent interpretation of results needs a specific theoretical framework.

- problems: structural approaches often only tractable under strong technical

assumptions.

Mainstream: data-driven approaches, specifications appeal to theory but are not derived from specific models.

- Identification:

Natural experiments and matching treatment with control groups.

Instrumental variables methods.

- Current research issues:

- Wage inequality and labour’s share: trends and explanation.

(technology, globalization, institutions (policy, unions)).

- Recessions and labour markets:

- cyclical vs. structural unemployment

- long-term unemployment and its effects.

- Aging populations and labour markets.

- Technology, trade and the future of labour markets.

- Health, education outcomes.

Data, Data Sources and Data Handling: Preliminaries

Importance of Data in Labour Economics:

- Description: identifying what facts need to be explained.

- Observation may motivate theory.

e.g. views on unemployment: static, turnover views and data availability.

Wage differences and discrimination; wage inequality and “the 1%”.

- Testing and measuring microeconomic relationships

- Labour supply and wage determinants, labour costs and employment:

presence and size of the effects?

- Policy development and evaluation:

- Measurement of policy effects.

- Examples:

Welfare reform (mid/late-1990s), EI (1990s) and labour supply.

Baker and Fortin (2000) Ontario pay equity.

Minimum wages, mandatory retirement etc.

Effects of taxes on incomes and employment.

- Findings may spur policy:

e.g. immigrant outcomes literature and selection of immigrants;

wage inequality in the English-speaking world: tax/transfer, education policy.

Types of labour market data.

- Survey Data and Administrative data

- Administrative data: based on government records.

e.g., unemployment insurance files

payroll or income tax data.

- Survey data: most readily available

- questionnaire responses from a sample of the population of interest.

- Increasingly datasets may combine survey and administrative data.

e.g. 2006 Census – answer income questions or allow access to tax records.

- Establishment and Household Survey Data

- Establishment surveys: survey employers or businesses

e.g., Survey of Employment, Payrolls and Hours

- Household surveys e.g. Labour Force Survey, Census, SLID

- survey individuals or households.

- Matched data: links household and establishment data.

e.g. survey workers and their employers; or

link survey data for one party with administrative data for the other.

- Aggregate and Microdata:

- Microdata: unit of observation is an individual

person (household) or business.

e.g., is a particular person employed, unemployed this week?

what is the person’s wage rate?

- a microdata set can have thousands of observations.

- Aggregate: microdata which has been "aggregated"

(summed, averaged, etc.)

e.g., how many people are employed or unemployed this week in Canada?

what is the average wage of men in Canada?

- typically a limited number of observations; it is a summary of microdata outcomes.

- Time series, cross-section, pooled and longitudinal data:

- Time series data: variables vary over time only

e.g., unemployment rate and minimum wage in Ontario 1976-2014.

(special issues: dynamics, collinearity and autocorrelation, stationarity)

- Cross-section: variables differ between units of observation at the same point in

time.

- Could be aggregate

e.g., provincial unemployment rates and minimum wage rates

in August 2014

- Could be microdata: labour market outcomes of individuals in Thunder

Bay in August 2014.

- Pooled time-series/cross-section data: variables vary between cross sections and

across time.

- Could be aggregate data

e.g., provincial unemployment rates and minimum wages 1976-2014.

- Could be microdata

- several cross-sections of labour market outcomes of individuals.

e.g. labour market outcomes of individuals in Thunder Bay in August

2012, August 2013 and August 2014.

- Longitudinal or Panel Data:

- follows the same individuals (households) or firms across time.

(a special kind of pooled time series/cross-section data)

- Growth in data availability: microdata especially.

- Growth in computing power: ability to handle microdata has increased massively.
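The link between microdata and aggregate data can be illustrated with a minimal sketch. The records, province codes and status labels below are made up for illustration, and the calculation ignores sample weights. The unemployment rate is taken as unemployed divided by the labour force (employed plus unemployed).

```python
from collections import defaultdict

# Hypothetical person-level records (microdata); all values are made up.
microdata = [
    {"province": "ON", "status": "employed"},
    {"province": "ON", "status": "unemployed"},
    {"province": "ON", "status": "employed"},
    {"province": "ON", "status": "not_in_lf"},
    {"province": "MB", "status": "employed"},
    {"province": "MB", "status": "employed"},
    {"province": "MB", "status": "unemployed"},
    {"province": "MB", "status": "employed"},
    {"province": "MB", "status": "employed"},
]


def unemployment_rates(records):
    """Aggregate microdata into an unemployment rate for each province."""
    unemployed = defaultdict(int)
    labour_force = defaultdict(int)
    for person in records:
        # Only the employed and the unemployed are in the labour force.
        if person["status"] in ("employed", "unemployed"):
            labour_force[person["province"]] += 1
            if person["status"] == "unemployed":
                unemployed[person["province"]] += 1
    return {prov: unemployed[prov] / labour_force[prov] for prov in labour_force}


# One aggregate observation per province: a cross-section of aggregate data.
rates = unemployment_rates(microdata)
for prov, rate in sorted(rates.items()):
    print(prov, round(rate, 3))
```

Nine microdata observations collapse to two aggregate observations. Repeating the calculation for several months would give pooled time-series/cross-section aggregate data.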

Sampling Methods and Microdata:

- Dataset could be the population of interest.

e.g. some administrative datasets such as income tax records; short-form Census.

- Most datasets are a sample from the population of interest.

- Why sample? Cheaper!

- Problem: sampling error.

- can only estimate values of interest for the population e.g. means, medians,

variances.

- tradeoff: sample size and accuracy of estimates.

- How is the sample selected?

- Random sample: each observation in the population has an equal probability of being

selected for the sample.

- in constructing estimates from the sample each observation receives equal weight.

e.g. if every observation has a 10% probability of being selected for the sample, then each

observation in the sample represents 10 people in the population.

- Non-random sample: probability of being selected differs across observations.

- some types of observations are oversampled and some are undersampled

(compared to the random sample).

- weights now differ.

Observations with 10% probability of selection represent 10 observations in

the population.

Observations with 5% probability of selection represent 20 observations in

the population.

(must take this into account when constructing estimates for the

population from the sample)

- why non-random sampling?

(a) Could be part of the sampling strategy:

- a random sample of a given size may give too few observations for reliable

estimates of some subgroups of interest.

- options? Increase the size of the entire random sample (expensive) or just

increase samples of the undersampled subgroups.

How to construct the non-random sample? “stratification” and “clustering”

- stratification: divide the population into subgroups of interest.

- identify clusters of observations in each subgroup.

- sampling: draw a sample of clusters from each subgroup;

then sample individuals from the selected clusters.

(b) Could reflect sampling problems:

- non-response, difficulties contacting certain groups.

- results in a non-random sample if non-response/ sampling problems differ

across types of observations.

- Either (a), (b) or both gives a non-random sample.

- Datafile often includes a weighting variable that takes into account the way in

which the sample is selected from the population.

- Weight is often:

Weight = 1/(probability the observation is selected for the sample)

(number of people in the population represented by the observation)

- Sometimes the reported weights have been normalized to sum to 1.

- To generate estimates representative of the population data from individual

observations must be weighted (see Assignment 1).

- Good source on microdata, sampling and weighting:

A. Deaton (1997) The Analysis of Household Surveys Ch. 1 (see especially Sections

1.1 and 1.4)
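The weighting rule above can be sketched as follows. The wages and selection probabilities are made-up numbers, chosen to match the 10% and 5% example: one group is sampled at 10% and a second, higher-wage group at 5%, so the second group is undersampled.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: 100 people sampled with probability 0.10 (wage 20)
# and 50 people sampled with probability 0.05 (wage 30).
wages = [20.0] * 100 + [30.0] * 50
probs = [0.10] * 100 + [0.05] * 50

# Weight = 1 / (probability the observation is selected for the sample).
weights = [1.0 / p for p in probs]

naive = sum(wages) / len(wages)          # ignores the sample design
weighted = weighted_mean(wages, weights)  # representative of the population
population_size = sum(weights)            # people represented by the sample

# Weights normalized to sum to 1 give the same weighted mean.
normalized = [w / population_size for w in weights]
weighted_normalized = weighted_mean(wages, normalized)

print(naive, weighted, population_size)
```

Each group represents 1,000 people in the population, so the population mean wage is 25. The unweighted mean is pulled toward the oversampled low-wage group, while the weighted mean recovers 25.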

Major Canadian Sources of Labour Market Data:

Labour Force Survey (LFS):

- Monthly, household survey.

- Since 1945, major redesigns: 1966, 1976. Additional variables since 1997.

- Big: 53,000 households, data on more than 100,000 individuals each month.

- Stratified sample, not random (see above): must weight microdata.

- What?

- Employment, unemployment, hours, type of work.

- Since 1997: wage, union status. Some other job characteristics.

- By personal characteristic (age, sex, education etc.), region.

- Availability?

- The Daily: Labour Force usually the first Friday of each month.

- Labour Force Historical Review: summary statistics by characteristic back to 1976.

(a handy tool: we may use it occasionally)

- Source data files: available back to 1976.

i.e., microdata (cross-sectional) -- we will use some of these on the assignments.

- See: Guide to the Labour Force Survey (link on website) for details including survey form.

Survey of Labour and Income Dynamics (SLID)

- Longitudinal data (released as cross-sections), household survey.

- Started 1993 (earlier: Survey of Consumer Finances).

- Sub-sample of the Labour Force Survey sample.

- follow each person for 6 years.

- new sub-sample started every 3 years.

- 15,000 households in each wave (since 1998: always 2 overlapping waves).

- Originally intended to study:

- income dynamics, labour market dynamics, family dynamics.

- Has become the main source of annual income data, see

Statistics Canada, Income in Canada, Cat. 75-202 XIE.

Census

- Every 5 years since 1961 (every 10 years pre-1961).

- Subset of households receive a detailed questionnaire (long-form)

- source of income and labour market data: since 2006 often linked to tax data.

- observations in subsets typically have equal weights (as if a random sample).

- Good data on personal characteristics: notably location, education, ethnic background,

immigration status and language.

- Very limited information on jobs: occupation, industry.

- Microdata: Public use microdata files

- large samples

- 2%-3% of the population 15 yrs and over.

- National Household Survey replaced the Census starting in 2011 (microdata late 2014?).

Survey of Employment, Payroll and Hours

- Monthly establishment survey.

- increasingly relies on administrative (tax) data.

- Average wages, hours, number of employees by detailed industry and geographical location.

- Publication: Employment, Hours and Earnings Cat. 72-002

- Microdata unavailable.

United States Data

- Current Population Survey (CPS):

- US version of the Labour Force Survey.

- much of the data is available free through:

- Bureau of Labor Statistics website (aggregate data: free!)

- US Census Bureau: microdata (free!)

- Panel Study of Income Dynamics (PSID):

- "Michigan Panel"

- a major US longitudinal data set (since 1968): similar to SLID.

- National Longitudinal Survey (NLS):

- other long-standing US longitudinal data set.

Europe

- Common for countries to have a household survey (like our Labour Force Survey).

- Some countries rely on administrative datasets: sample is the population! e.g. Denmark

Using Data: Tips

- Hamermesh (2000): some useful suggestions and cautions.

- Which data set should be used?

- Variables available?

- How are key variables coded/defined in the dataset?

- Sample size: is it adequate?

- What data sets have other researchers used? Why?

- Know the Data:

- Sample coverage: are there any restrictions? Is it representative or restricted to some subset

of the population?

- Check questionnaire:

- Is the question flawed? Is the question asked only of a subset of the sample? Could

question order affect results?

- Be familiar with variable codings and definitions.

- detail reported?

- top-coding or bottom-coding?

- missing value codes: don’t confuse them with real data!

- What sampling methods were used?

- random sample? stratified sample? Are sample weights provided?

- Data may need to be weighted for some uses.

- Is the data concerned with current status or is it recall (retrospective) data?

- This could be important for reliability of the data.

- Preliminary investigation of the data:

- Generate descriptive statistics for key variables:

means, minimum and maximum values, frequencies.

- Look for oddities, unexpected results.

- Have you misunderstood something?

- Have you made programming errors?

- Are key figures in line with other sources? If not why? (review literature)

- Cross-tabulations, means by characteristic, comparisons of descriptive statistics between

datasets.

- Watch out for surprises/oddities!

- Is there any preliminary evidence on the issues you are examining?

- Appropriate methods:

- draw on economic theory: what variables should determine outcomes?

- draw on econometric knowledge.

- draw on examination of the literature (what techniques have others used? What

explanatory variables have they used? Why?)

- Sensitivity:

- How do results change with changes to specification, estimation technique, sample used?
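Some of the checks above (descriptive statistics, missing-value codes, top-coding) can be sketched as follows. The earnings values, the missing-value code 99999 and the top-code 150000 are hypothetical. Real codes must be taken from the dataset's documentation.

```python
import statistics

MISSING_CODE = 99999.0  # hypothetical missing-value code
TOP_CODE = 150000.0     # hypothetical top-code for earnings

# Made-up earnings variable containing missing codes and top-coded values.
earnings = [32000.0, 45000.0, 99999.0, 150000.0,
            150000.0, 28000.0, 99999.0, 61000.0]


def describe(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


# Raw statistics treat the missing-value code as if it were real data.
raw = describe(earnings)

# Drop missing codes before computing statistics.
valid = [v for v in earnings if v != MISSING_CODE]
clean = describe(valid)
n_missing = len(earnings) - len(valid)

# A maximum shared by several observations is a hint of top-coding.
share_top_coded = sum(v == TOP_CODE for v in valid) / len(valid)

print(raw)
print(clean, n_missing, round(share_top_coded, 3))
```

Here the missing-value code falls inside the plausible range of earnings, so it would not stand out in the minimum or maximum. A frequency count of the variable is the more reliable way to spot it.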

Methods in Empirical Labour Economics

- Non-standard issues:

- Qualitative data: dummy or other categorical variables are common.

e.g., have a university degree or not; male or female; construction worker or not, etc.

- Dummy dependent variables: probit, logit, multinomial logit methods.

e.g., employed or not

- Censored data: dependent variables with limits

e.g. maximum or minimum earnings

- Sample selection problems: the classical linear regression assumptions include random

sampling.

e.g. migrants are unlikely to be a random sample of the source population.

- Natural experiment approaches.

- Empirical Methods and Causality: a big issue in practice!

- See Angrist and Krueger (1999) for a more detailed overview of several techniques we will

encounter later.

- Angrist and Pischke (2009) Mostly Harmless Econometrics. An interesting discussion of

current practice – advocate looking at the problem from an experimental perspective.

(1) Control for confounding variables.

e.g., adding control variables in a regression to isolate the effect of interest.

Estimating: Wi = a + b Ui + ei (ei – error term, a, b coefficients)

- What if Ui depends on an observable variable, Xi, that affects both Wi and Ui?

- estimate of “b” mixes effect of Ui on Wi and the effect of Xi on Wi.

- Estimate instead:

Wi = a + b Ui + cXi + ei

- This method is used in most empirical studies in labour economics.
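A small simulation, offered as a sketch only, shows why the control variable matters. The data-generating process is an assumption made for illustration: the true coefficients are b = 2 and c = 3, and Ui depends on Xi. The two-regressor slope uses the standard closed-form OLS formula.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(x, y):
    """Sample covariance (divides by n; the choice cancels in the ratios)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


random.seed(0)
n = 5000

# Assumed data-generating process: X affects both U and W.
X = [random.gauss(0, 1) for _ in range(n)]
U = [0.8 * x + random.gauss(0, 1) for x in X]
W = [1.0 + 2.0 * u + 3.0 * x + random.gauss(0, 1) for u, x in zip(U, X)]

# Regression of W on U only: the estimate of b also picks up the effect of X.
b_short = cov(W, U) / cov(U, U)

# Regression of W on U and X: closed-form OLS coefficient on U.
denominator = cov(U, U) * cov(X, X) - cov(U, X) ** 2
b_long = (cov(W, U) * cov(X, X) - cov(W, X) * cov(U, X)) / denominator

print(round(b_short, 3), round(b_long, 3))
```

With these assumed parameters the estimate without Xi should be well above 2 (roughly 3.5 in large samples), while the estimate with Xi as a control should be close to the true value of 2.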

(2) Fixed effect methods:

- Controlling for unobservable factors using multiple observations on the same

unit of observation (e.g. person).

- Say that Wi depends on unobservable factors specific to individual:

Wi = ai + b Ui + cXi + ei   (intercept ai differs by person i)

- If unobservable is correlated with Ui then again b is biased.

- If there are multiple observations on person i over time then ai can be estimated as a

“fixed effect” or it could be differenced out.

- Common approach with panel / longitudinal data.
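A sketch of the "differenced out" idea on simulated panel data. It subtracts each person's own mean from every variable (the within transformation), which removes ai. The data-generating process is assumed for illustration: true b = 2, Xi is omitted for brevity, and ai is correlated with Ui.

```python
import random
from collections import defaultdict


def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


random.seed(1)
N, T, TRUE_B = 500, 4, 2.0  # persons, periods, assumed true coefficient

# Simulated panel: each row is (person, U, W); a_i is never observed.
panel = []
for i in range(N):
    a_i = random.gauss(0, 1)
    for _ in range(T):
        u = a_i + random.gauss(0, 1)  # U is correlated with a_i
        w = a_i + TRUE_B * u + random.gauss(0, 1)
        panel.append((i, u, w))

# Pooled OLS ignores a_i, so the estimate of b is biased.
b_pooled = slope([u for _, u, _ in panel], [w for _, _, w in panel])

# Within transformation: subtract each person's mean, which removes a_i.
u_sum, w_sum = defaultdict(float), defaultdict(float)
for i, u, w in panel:
    u_sum[i] += u
    w_sum[i] += w
u_within = [u - u_sum[i] / T for i, u, _ in panel]
w_within = [w - w_sum[i] / T for i, _, w in panel]
b_fe = slope(u_within, w_within)

print(round(b_pooled, 3), round(b_fe, 3))
```

Under these assumptions pooled OLS should give roughly 2.5, while the within estimate should be close to 2. The method needs variation in Ui over time for the same person.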

(3) Difference-in-differences estimates

- Appeals to an experimental approach.

- Exogenous event occurs.

- Observe change (difference before and after event) in the variable of interest for

group affected by the event.

- Observe how variable of interest changed in some “control group”.

- Calculate the effect of the event as the difference-in-differences, i.e., difference

between the change for the group affected and the control group.

e.g., Alberta bans unions: observe how wages change after this event and compare to

how wages change in say Saskatchewan.

- See minimum wage debate, US welfare reform effects, overtime effects, etc.
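The arithmetic of the Alberta and Saskatchewan example can be sketched with made-up average wages. None of these numbers come from data.

```python
# Hypothetical average hourly wages before and after the event.
wages = {
    ("Alberta", "before"): 25.0,
    ("Alberta", "after"): 25.5,
    ("Saskatchewan", "before"): 24.0,
    ("Saskatchewan", "after"): 25.5,
}


def diff_in_diff(y, treated, control):
    """Change for the treated group minus change for the control group."""
    change_treated = y[(treated, "after")] - y[(treated, "before")]
    change_control = y[(control, "after")] - y[(control, "before")]
    return change_treated - change_control


did = diff_in_diff(wages, "Alberta", "Saskatchewan")
print(did)
```

Wages rise by 0.50 in the affected group but by 1.50 in the control group, so the estimated effect of the event is -1.00. The estimate is only credible if wages in the two provinces would have followed the same trend in the absence of the event.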

(4) Instrumental variable techniques

- Say Ui is partly determined by unobservables that either affect Wi or are correlated

with ei .

- Then: cov(Ui, ei) ≠ 0 and the estimate of b is biased.

- Possible solution: find an instrumental variable (Zi)

cov(Zi, Ui) ≠ 0 and cov(Zi, ei) = 0, and then use Zi as an instrument for Ui in

the wage equation regression.

- Very popular in recent literature estimating the effect of education on wages.
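A sketch of the instrumental variable idea on simulated data, using the simple one-instrument estimator cov(Zi, Wi) / cov(Zi, Ui). The data-generating process is an assumption made for illustration: the true b is 2, and an unobserved factor A affects both Ui and Wi. Zi affects Ui but is unrelated to A.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(x, y):
    """Sample covariance (divides by n; the choice cancels in the ratios)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


random.seed(2)
n = 5000

# Assumed data-generating process; A is unobserved by the researcher.
Z = [random.gauss(0, 1) for _ in range(n)]  # instrument
A = [random.gauss(0, 1) for _ in range(n)]  # unobserved factor
U = [z + a + random.gauss(0, 1) for z, a in zip(Z, A)]
W = [1.0 + 2.0 * u + 3.0 * a + random.gauss(0, 1) for u, a in zip(U, A)]

# OLS of W on U: biased because U is correlated with the error term (via A).
b_ols = cov(W, U) / cov(U, U)

# IV estimate: uses only the variation in U that is driven by Z.
b_iv = cov(Z, W) / cov(Z, U)

print(round(b_ols, 3), round(b_iv, 3))
```

Under these assumptions OLS should give roughly 3, while the IV estimate should be close to 2, at the cost of a larger sampling variance. In practice cov(Zi, ei) = 0 cannot be tested directly and has to be argued.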