Abstract
Analysis of the relationship between taxes and self-employment should account for the interplay between responses in self-employment and wage employment. To this end, we estimate a two-state multispell duration model which accounts for both observed and unobserved heterogeneity using a large longitudinal administrative data set for Norway for 1993 to 2011. Our findings confirm theoretical predictions and are robust to various changes to definitions and sample selections. A policy experiment simulating a flatter tax schedule in the year 2000 is found to encourage self-employment, delivering a net increase of predicted inflow into self-employment from 2.8 to 5.3.
I. Introduction
Models of choices facing wage earners typically neglect the fact that taxpayers may exit or enter self-employment because of differences in tax schedules. Since the interplay between the occupational choices is typically not considered in models of labor supply, these models are silent on how tax differences across occupational choices affect decisions.1,2 In contrast, models of the choices of the self-employed are dominated by perspectives in which decisions are based on implicit or explicit comparisons to the wage sector. One obvious reason for this asymmetry is the relative sizes of the sectors. For example, the self-employment rate (as a percentage of total employment) in Norway is , whereas the European Union average is approximately (OECD, 2018).
The relationship to the wage sector is not the only factor that complicates the assessment of the effects of taxation on self-employment. From a theoretical perspective, the tax effects are ambiguous. On the one hand, an increase in the tax rate may diminish the self-employment rate as it reduces expected returns. On the other hand, high taxes may encourage self-employment if loss offsetting is allowed because the government provides an implicit insurance by sharing the risk associated with self-employment (Domar & Musgrave, 1944).3
A large majority of empirical studies on the effect of taxes on the level of self-employment activity focuses on the United States. These studies examine the extensive margin in occupational choice models (see Bruce, 2000, 2002; Gentry & Hubbard, 2000, 2004; Schuetze, 2000; Schuetze & Bruce, 2004; Cullen & Gordon, 2007; and Moore, 2004).4 Studies for other countries include Hansson (2012) for Sweden; Fossen (2007, 2009) and Fossen and Steiner (2009) for Germany, and Wen and Gordon (2014) for Canada. Results from these studies are mixed. Results for the United States, for example, do not provide an unambiguous answer about the relationship between tax progressivity and self-employment. However, in other countries, tax progressivity is generally found to discourage self-employment.5
The representation of the tax schedule is important in any analysis of tax effects on self-employment. Some studies include measures of marginal and/or average taxes in a quasi-experimental or reduced-form analysis to investigate the effect of nonlinearities in taxes on entrepreneurship.6 In other studies, authors have used measures of expected net-income differences and/or tax progressivity to capture the tax effects. For example, Gentry and Hubbard (2000, 2004) use the spread in the marginal (or average) tax rates faced by a self-employed individual at various levels of “success,” where success is defined as the observed distribution of the three-year real wage growth for entrants into self-employment.
In two recent studies (Fossen, 2009; Wen & Gordon, 2014), authors derive the tax variables within a structural framework where the decision making is based on the difference in expected utilities. Yet the two papers differ in many aspects and draw different conclusions. The use of different utility functions and assumptions regarding the pretax income distribution of the individual result in different variables that capture the effects of nonlinearities in the tax schedule. They also use different statistical models (logit versus probit).
Fossen (2009) models the transitions between wage and self-employment using data from the German Socio-Economic Panel (GSOEP) over the period 2002 to 2006 and a logit model in which agents are assumed to trade off risks and returns. He uses a constant relative risk-aversion utility and assumes normally distributed pretax income. The two relevant model-generated variables in the transition equation are (i) the difference in net-of-tax incomes in the two occupations and (ii) the variances of the individual's posttax income distributions.
In contrast, Wen and Gordon (2014) use a pooled cross-sectional sample from the Canadian Survey of Labour and Income Dynamics over the years 1999 to 2005 to estimate the probability of self-employment in a probit model.7 They assume risk neutrality and a log-normal distribution for the pretax income. The relevant “tax variables” are (i) the difference in log net-of-tax incomes in the occupations (netincdiff) and (ii) a variable that they call convexity. The variable convexity has an intuitive interpretation as the “increase in tax-liability taken on by the self-employed due to the volatility of their business income, expressed as a proportion of their disposable income.”
Both studies use selectivity-corrected income equations to predict individual pretax incomes and then use a tax-transfer microsimulation model to generate the relevant expected value and variance of after-tax incomes in wage employment and self-employment. The estimated models are subsequently used to simulate the effects of hypothetical tax policy scenarios that reduced progressivity. Fossen finds the “flatter-tax” reforms considered discourage individuals from choosing self-employment;8 Wen and Gordon find a “small” positive effect on the probability of finding someone in self-employment.9
Here we use the two variables netincdiff and convexity used by Wen and Gordon (2014). Although some of the tax effects in both studies are captured via net-income differences, the additional variable convexity in Wen and Gordon (2014) is an individual-specific measure that intuitively captures the interaction between the progressivity of the tax schedule and the volatility of self-employment income relative to wage income.
Our work complements the existing empirical literature in various ways. First, our definitions of wage employment and self-employment are based on reported incomes from tax records and not on survey responses. We use data drawn from various Norwegian population registers over the period 1993 to 2011. The data include rich sociodemographic information together with highly accurate income measures from the annual tax returns. Second, we model the evolution of employment spells using a two-state multispell duration model that controls for observed and unobserved heterogeneity correlated across spells and accounts for left and right censoring in the observed spells. This contrasts with several previous contributions, which mainly focus on self-employment entries or exits using survey data with self-reported employment status and short panels of individuals.
We generally find significant effects of both netincdiff and convexity on the probability of exit from both types of employment spells, conforming to theoretical predictions as discussed in section VA. The increase in convexity is found to increase the probability of exiting self-employment and to decrease the probability of entry into self-employment; that is, convexity has a discouraging effect on self-employment, ceteris paribus. On the other hand, an opposite effect is found for netincdiff: negative (positive) in the self-employment (wage-employment) equation. Additionally, in our base model, we find a larger effect of convexity relative to that of netincdiff, implying that small increases in convexity will require large increases in netincdiff to discourage the self-employed from quitting and to encourage wage earners to enter self-employment.
Given the way the tax variables are constructed, a change in the progressivity of the tax schedule will have an impact on convexity and on netincdiff by changing the expected net income difference between self-employment and wage employment. As a result, the total effect on the rate of self-employment of a decrease in the progressivity of the tax schedule is hard to predict. Hence, to better understand the net effect, we simulate a tax experiment that replaces the personal income tax structure in the year 2000 with a less progressive, revenue-neutral tax schedule, as explained in section VB. The overall estimated effect of this policy change on the share of self-employment is positive. The average exit rate from self-employment is estimated to go down by 0.018 (s.e. 0.184) percentage points, and the exit rate from wage employment is estimated to increase by 0.119 (s.e. 0.010) percentage points (table 3). This change results in a net increase of predicted inflow into self-employment from about 2.8 to 5.3.
The rest of the paper is organized as follows. Section II describes the taxation of self-employment income and wages during our sample period. Section III sets out our econometric model. In section IV we provide details of the data and the sample selected for our analyses. We also present the procedure used for estimating the tax variables. The estimation results are discussed in section V along with the results from our policy simulation and some sensitivity checks. Finally, section VI concludes the paper.
II. Taxation in Norway
Tax reforms undertaken in 1992 introduced a dual-income tax system in Norway. Under this regime, all types of capital income are taxed at a flat rate, but a progressive schedule applies to labor and pension income. Individuals pay income tax on two different tax bases: (i) ordinary income and (ii) personal income.
Income from wages, self-employment, capital, transfers, and pensions is first grouped as ordinary income. After deductions, individuals pay tax at a flat rate (28 percent during most of the sample period) on ordinary income.10 The other tax base, personal income, includes wage income, transfers, pension income, and self-employment income due to active efforts, but not capital income. Individuals pay a surtax and social security contributions levied on personal income.
Taxation is more complicated for the self-employed because income represents the reward to the labor of the individual as well as the returns to the capital invested in the firm. Given the lower tax rate on capital income, the decision about how to declare the income was not left to the discretion of the self-employed; rules were established to split the profits into labor and capital income.12 The dashed line in figure 1 represents the marginal tax rates that apply to self-employment income in the case where no capital is invested in the firm. The main differences from the wage income case are the lack of a basic allowance and the higher social security contribution (10.7 percent in 2005).
III. Econometric Model
Drawing heavily on the framework of Ham et al. (2016), we model employment transitions using a two-state multispell discrete duration model accounting for unobserved individual heterogeneity.14 The two employment states are self-employment and wage employment. The duration variable is measured in terms of the Norwegian financial year, which is the calendar year (January–December). Approximately 70 percent of individuals in our sample have a first spell that is left censored. Rather than dropping these individuals from the analysis sample, we include them and specify a different model of exit rates for them (Ham et al., 2016). We check for sensitivity of our estimates to excluding the left-censored spells, which is equivalent to using an inflow sample.
With regard to the unobserved heterogeneity, we follow the literature and assume that it is distributed independently across individuals and independently of the included covariates. It is fixed within a given type of spell but may be correlated across the two employment states and across the types of spell (fresh versus left-censored). A discrete distribution is assumed for the unobserved heterogeneity.
The exit hazard for a spell of type $j$ at duration $t$ takes the form

$$\lambda_j(t \mid X, \text{taxation}, \theta_j) = F\big(h_j(t) + X\beta_j + \text{taxation}\,\gamma_j + \theta_j\big),$$

where $h_j(t)$ is the duration dependence function, $X$ contains time-fixed and time-varying observed individual characteristics, taxation contains the tax variable(s), and $\theta_j$ is the unobserved heterogeneity. $F$ is specified as the complementary log-log distribution function, $F(z) = 1 - \exp(-\exp(z))$.15 To achieve convergence with stable parameter estimates, we restrict the duration dependence function to a log-linear form and model the unobserved heterogeneity as discrete with two points of support.16 We keep the hazard-specific intercepts, set the second support point to zero as a normalization, and estimate the associated probability, $p_1$.
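To make the structure of the hazard concrete, the following is a minimal sketch (not the authors' estimation code) of a complementary log-log hazard with log-linear duration dependence and a two-point discrete heterogeneity distribution. The function names, the assumption that the covariate index is constant within a spell, and the illustrative inputs are ours.

```python
import math

def cloglog_hazard(index):
    """Discrete-time exit probability implied by a complementary log-log link."""
    return 1.0 - math.exp(-math.exp(index))

def spell_likelihood(duration, exited, base_index, slope_ln_t,
                     support_points, probs):
    """Likelihood contribution of one spell, mixed over discrete heterogeneity.

    duration       : number of years observed in the spell (>= 1)
    exited         : True if the spell ends in a transition, False if right censored
    base_index     : intercept plus covariate and tax terms (assumed constant
                     over the spell here, purely for simplicity)
    slope_ln_t     : coefficient on ln(duration), the log-linear duration dependence
    support_points : locations of the mass points of the unobserved heterogeneity
    probs          : probabilities attached to the mass points (sum to one)
    """
    total = 0.0
    for theta, p in zip(support_points, probs):
        contrib = 1.0
        for t in range(1, duration + 1):
            h = cloglog_hazard(base_index + slope_ln_t * math.log(t) + theta)
            is_exit_year = exited and t == duration
            contrib *= h if is_exit_year else (1.0 - h)
        total += p * contrib
    return total

# Illustrative inputs only: a negative slope on ln(duration) gives a hazard
# that falls with elapsed duration (negative duration dependence).
h_year1 = cloglog_hazard(-1.0 + (-0.5) * math.log(1))
h_year5 = cloglog_hazard(-1.0 + (-0.5) * math.log(5))
lik = spell_likelihood(3, True, -1.0, -0.5, [-0.5, 0.0], [0.8, 0.2])
```

In the paper's setting the likelihood is built jointly over all spells of an individual and over four hazards; the sketch covers a single spell only.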
IV. Data, Sample, and Variable Definitions
A. Data and Sample Selection
The present study benefits from rich longitudinal Norwegian administrative data for the period 1993 to 2011. The main data source is the Income and Wealth Statistics for Persons and Families (Statistics Norway, 2005). The data are drawn from the annual tax returns and the education registers (years of education and fields of studies). The data also contain individual and family sociodemographic characteristics. Our focus is on wage earners and the self-employed who have strong labor market attachment, and so we restrict our analysis to Norwegian citizens aged 25 to 61 and exclude those who have reported any income from agricultural, forestry, or fishing activities.17
We use an income-based definition to identify periods or spells of self-employment and wage employment. In our main analysis, we classify an individual observation as “self-employed” if the major source of income is self-employment income, i.e., if the reported self-employment income (net of expenses) is larger in absolute value than the wage income and is also larger than government transfers (which include disability insurance, unemployment benefits, and other types of pensions).18 Additionally, we restrict our sample to those who have been classified as either being in wage employment or self-employment during the observation period 1993 to 2011.19
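The income-based classification can be sketched as a simple rule. This is only an illustration under stated assumptions: the text gives the rule for self-employment; the symmetric rule for wage employment, the residual "other" category, and the use of the absolute value of self-employment income in both comparisons are our assumptions, and the function name is hypothetical.

```python
def classify_employment(se_income, wage_income, transfers):
    """Classify one person-year by major income source.

    se_income   : self-employment income net of expenses (may be negative)
    wage_income : wage income
    transfers   : government transfers (disability insurance, unemployment
                  benefits, other pensions)
    """
    se_size = abs(se_income)  # losses count by their magnitude
    if se_size > wage_income and se_size > transfers:
        return "SE"
    # Assumed symmetric rule for wage employment (not spelled out in the text).
    if wage_income > se_size and wage_income > transfers:
        return "WE"
    return "other"
```

For example, under this sketch a person-year with a self-employment loss of 300,000 and wages of 100,000 is classified as self-employed, because the comparison uses the absolute value of self-employment income.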
B. Defining and Estimating the Tax Variables
Our analysis is based on the theoretical exposition of an expected utility maximization approach discussed by Wen and Gordon (2014), who in turn base their model on the one developed by Rees and Shah (1986). Assuming risk neutrality, a convex tax schedule, and log-normally distributed pretax income, they show how the probability of self-employment can be written as a function of the tax schedule using two representations of the effects of taxation.20 These are (1) netincdiff, which is the difference in log of expected net incomes in self-employment and wage employment, and (2) convexity, which is a measure of how the expected tax liability changes due to the volatility of their self-employment income relative to the net income in wage employment (see online appendix A.1 for further details).
The construction of the two tax variables requires net-income distributions for each individual. We use a tax simulator to generate these (see online appendix A.2). The simulator considers the yearly rules for taxing self-employment income net of expenses, wages, and other sources of income. Other sources of income are taken to be exogenous; these are added to the predicted self-employment or wage income. The simulator also accounts for the main deductions and allowances, as well as for the system for taxation of the labor and capital parts of net self-employment income; see section II.
where is a tax parameter from the tax function (see footnote 21). For each individual, we first estimate the selectivity-corrected expected pretax income for each occupation in each period.21 We then use the tax simulator to generate the individual specific net incomes in both occupations: and
Next, we define the second individual-specific tax variable representation: convexity. This variable is defined as the difference between the expected tax liability and the tax liability at the expected income, relative to the expected net income.22 Wage employment is generally less risky than self-employment. Hence, following Wen and Gordon (2014), we derive our convexity variable by setting the coefficient of variation for wage income equal to 0, so that convexity is associated with uncertainties in self-employment income only.
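The idea behind convexity can be illustrated numerically. The sketch below is not the paper's tax simulator: the two-bracket schedule, its threshold and rates, the Monte Carlo approach, and the choice of E[y − T(y)] as the denominator are all assumptions made for illustration. It only shows that, with a convex tax function and log-normal pretax income, the expected tax exceeds the tax at the expected income, and that the gap grows with income volatility.

```python
import math
import random

def tax(y, threshold=300_000.0, low_rate=0.28, high_rate=0.45):
    """Hypothetical convex two-bracket schedule (not the Norwegian rules)."""
    return low_rate * y + (high_rate - low_rate) * max(y - threshold, 0.0)

def convexity(mean_income, cv, n_draws=50_000, seed=0):
    """(E[T(y)] - T(E[y])) / E[y - T(y)] for log-normal pretax income y.

    mean_income : expected pretax income
    cv          : coefficient of variation of pretax income (0 = no risk)
    """
    if cv == 0:
        return 0.0  # no volatility, so no extra expected tax
    # Log-normal parameters chosen so that E[y] = mean_income and CV = cv.
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    mu = math.log(mean_income) - 0.5 * sigma ** 2
    rng = random.Random(seed)
    draws = [rng.lognormvariate(mu, sigma) for _ in range(n_draws)]
    mean_y = sum(draws) / n_draws
    expected_tax = sum(tax(y) for y in draws) / n_draws
    expected_net = mean_y - expected_tax
    return (expected_tax - tax(mean_y)) / expected_net

low_risk = convexity(300_000.0, 0.3)
high_risk = convexity(300_000.0, 0.8)
```

Setting `cv` to 0 mirrors the treatment of wage income in the text, so that the measure reflects uncertainty in self-employment income only.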
C. Summary Statistics
Summary statistics for the main estimation sample are provided in table 1. On average, in the weighted sample, the proportion of individuals exiting a period of wage employment and entering a period of self-employment is less than 1 percent (0.006), whereas the average share of exits out of a period of self-employment is about 10.6 percent (0.106). We next turn to our tax variables.
Variable | All | WE Sample | SE Sample |
---|---|---|---|
Individual-specific variables | |||
Females | 0.47 (0.50) | 0.48 (0.50) | 0.27 (0.44) |
Lower secondary school and less | 0.39 (0.49) | 0.35 (0.49) | 0.53 (0.50) |
Upper secondary school | 0.30 (0.46) | 0.31 (0.46) | 0.27 (0.45) |
University | 0.32 (0.47) | 0.34 (0.47) | 0.20 (0.40) |
Time-varying variables | |||
Age at the start of the spell | 35.06 (9.24) | 34.84 (9.20) | 39.80 (8.80) |
Years 1993–1998 | 0.30 (0.49) | 0.30 (0.46) | 0.34 (0.47) |
Years 1999–2002 | 0.22 (0.41) | 0.22 (0.41) | 0.21 (0.41) |
Years 2003–2007 | 0.27 (0.44) | 0.27 (0.44) | 0.27 (0.44) |
Years 2008–2011 | 0.21 (0.41) | 0.21 (0.41) | 0.18 (0.39) |
Eastern Norway | 0.50 (0.50) | 0.49 (0.50) | 0.55 (0.50) |
Southern Norway | 0.05 (0.22) | 0.05 (0.22) | 0.06 (0.24) |
Western Norway | 0.26 (0.44) | 0.26 (0.44) | 0.24 (0.42) |
Central Norway | 0.09 (0.28) | 0.09 (0.29) | 0.07 (0.26) |
Northern Norway | 0.10 (0.30) | 0.10 (0.30) | 0.08 (0.27) |
Local unemployment rate | 2.73 (0.83) | 2.73 (0.83) | 2.78 (0.83) |
convexity | 0.007 (0.008) | 0.007 (0.008) | 0.012 (0.008) |
netincdiff | −0.448 (0.19) | −0.429 (0.17) | −0.825 (0.25) |
Proportion of exits | | 0.006 | 0.106 |
(i) Years covered in the analysis are 1993–2011. (ii) Definitions of wage employment and self-employment and the sample selection criteria used are provided in section IV. (iii) All averages and proportions are based on the weighted sample (see section IV for further details). (iv) The number of unweighted observations is 476,275, of which 362,217 are classified as wage employment and 114,058 as self-employment. (v) The number of unweighted individuals is 34,746.
In addition to the two tax variables, the models also include time-varying and time-invariant control variables. The time-invariant variables are sex, age at the start of the spell, indicator variables for highest education level achieved, and regional dummies to account for local labor market conditions. Calendar time dummies control for macroeffects. The data are an unbalanced panel; see descriptive information in table 1. Self-employed individuals are on average older and less educated than individuals who are paid wages, and a lower proportion of females is found among the self-employed. Self-employment is also highly concentrated in the more densely populated areas of eastern Norway (the Oslo region) and western Norway (the Bergen region).
V. Results
A. Main Results
Our base model estimates are presented in table 2.28 All four hazard functions are estimated simultaneously. All hazards except the left-censored SE hazard show negative duration dependence, ceteris paribus. The insignificant duration dependence estimated for the left-censored SE spells is consistent with the observation that the probability of exiting is almost zero for long spells, and the sample of left-censored spells is more likely to contain long spells.
Variable | Fresh Spells: SE [1] | Fresh Spells: WE [2] | Left-Censored Spells: SE [3] | Left-Censored Spells: WE [4] |
---|---|---|---|---|
netincdiff | −0.429 | 1.685 | −0.725 | 1.753 |
(s.e.) | (0.053) | (0.082) | (0.109) | (0.087) |
convexity × 100 | 0.049 | −0.246 | −0.017 | −0.163 |
(s.e.) | (0.015) | (0.021) | (0.030) | (0.023) |
Male | −0.024 | 0.602 | 0.191 | 0.776 |
(s.e.) | (0.027) | (0.030) | (0.058) | (0.037) |
Age at the start of the spell | −0.012 | 0.030 | −0.034 | −0.046 |
(s.e.) | (0.001) | (0.002) | (0.002) | (0.002) |
High school | −0.006 | 0.115 | −0.131 | −0.008 |
(s.e.) | (0.029) | (0.035) | (0.048) | (0.038) |
University | 0.220 | 0.131 | 0.051 | 0.100 |
(s.e.) | (0.028) | (0.037) | (0.051) | (0.038) |
ln(duration) | −0.520 | −0.490 | −0.016 | −0.234 |
(s.e.) | (0.016) | (0.018) | (0.037) | (0.032) |
Constant | −1.135 | −3.11 | −0.892 | −1.930 |
(s.e.) | (0.092) | (0.103) | (0.192) | (0.118) |
Support points | −0.531 | −3.042 | −1.337 | −1.839 |
(s.e.) | (0.049) | (0.200) | (0.072) | (0.094) |
Probability masses | | | | |
p1 (constants + support points) | 0.805 | | | |
(s.e.) | (0.019) | | | |
p2 (constants only) | 0.195 | | | |
(s.e.) | (0.019) | | | |
Observations (unweighted) | 476,275 | | | |
Individuals (unweighted) | 34,746 | | | |
Maximized log likelihood value | −105,687.67 | | | |
(i) MLE standard errors in parentheses. (ii) The models are estimated using a random sample of individuals as detailed in section IV. (iii) Omitted education category is no-education/high-school drop-out. (iv) The model additionally includes region and time indicators; see table 1. Complete sets of results are available in the online appendix A.4.
We focus our discussions on the interpretation of the estimated effects of the tax variables. The theory predicts a positive (negative) effect of the netincdiff variable on the probability of exit from WE (SE). For example, the higher the proportionate increase in the net-income differential with respect to the net income from WE, the higher the exit rate from WE (Wen & Gordon, 2014; Taylor, 1996; Fossen, 2009). On the other hand, the theoretical prediction of the effect of convexity is negative on exit rate from WE since higher “convexity” would be expected to discourage SE. The estimated effects of the two tax variables conform to these theoretical predictions.
The estimated coefficients on both tax variables are also found to be higher in absolute value for WE exit probabilities (columns 2 and 4). These results suggest that, compared to exits from SE, the probability of an exit from WE is more sensitive to changes in both expected net-income differences and tax progressivity. This is consistent with the fact that the self-employed tend to continue their business activities even if they experience lower earnings growth (Hamilton, 2000).
These estimates also indicate that a one percentage point increase in convexity requires an increase of approximately nine to fourteen percentage points in netincdiff to keep these hazards unchanged. Note that increases in convexity in this calculation are assumed to take place via changes to the volatility of SE income (online appendix A.1 equation [A.4]) because we assume no uncertainty in WE income in the calculation of this variable. Similarly, the increase in netincdiff is assumed to work either via a reduction in the pretax income in WE or via an increase in the expected pretax SE income (not altering the variance of the SE income distribution). To further explore these effects accounting for the relationship between the two tax variables, we simulated a policy experiment. The results are presented below.
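The trade-off between the two tax variables can be checked with simple arithmetic on the point estimates in table 2. The sketch below is our own back-of-the-envelope calculation, not the authors' code: it solves for the change in netincdiff that leaves the linear index of each hazard unchanged when convexity rises by one percentage point (0.01), recalling that the regressor is convexity × 100. For the three hazards with a significant convexity coefficient, the implied offsets lie roughly between 0.09 and 0.15.

```python
# Point estimates from table 2: (netincdiff, convexity x 100) for each hazard.
TABLE2 = {
    "fresh SE": (-0.429, 0.049),
    "fresh WE": (1.685, -0.246),
    "left-censored SE": (-0.725, -0.017),
    "left-censored WE": (1.753, -0.163),
}

def offsetting_netincdiff(netincdiff_coef, convexity_x100_coef, d_convexity=0.01):
    """Change in netincdiff that keeps the hazard index constant when
    convexity rises by d_convexity (0.01 = one percentage point)."""
    return -convexity_x100_coef * (100.0 * d_convexity) / netincdiff_coef

offsets = {name: offsetting_netincdiff(b_net, b_conv)
           for name, (b_net, b_conv) in TABLE2.items()}

for name, value in offsets.items():
    print(f"{name}: {value:+.3f}")
```

The left-censored SE hazard is included for completeness, but its convexity coefficient is not statistically significant, so its offset is not informative.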
B. Results from a Policy Experiment
Our policy experiment is to replace two of the surtaxes applied to personal income with one surtax, to create a flatter tax schedule (solid line in figure 10). The surtax value of on gross income above NOK is chosen to ensure revenue neutrality, given a “no behavioral reaction” assumption. Other features of the taxation are held constant. New values of netincdiff and convexity were generated under the hypothetical scenario using our tax simulator and the transition rates predicted from the estimated models.
The average values of the netincdiff and convexity variables in our weighted sample are and under the new policy regime, compared to the original figures for the year 2000 of and , respectively. As expected, the less progressive tax schedule leads to a decrease of percentage points in convexity. The hypothetical policy also leads to a small increase in the mean netincdiff, so that average ratio of net income in SE to net income in WE changes from to .
The predicted transition probabilities and the corresponding standard errors, under the old and the new tax regimes, are reported in table 3.30 In the benchmark year 2000, the model predicts that around 9.33 percent of self-employed individuals will transit out of SE to WE (case A).31 The reform reduces this figure only marginally, to 9.32 percent (case B). Under the new regime, the predicted transitions from WE to SE are higher, at 0.68 percent compared to 0.56 percent in the base model. Since a very large proportion of individuals are in WE compared to SE, even this small increase in the exit rate out of WE can generate a substantial net inflow into SE. The change in the exit rate induced by the policy reform is not significant for the self-employed.
Case | Tax Scenario | Probability of Exit from SE (%) | Probability of Exit from WE (%) |
---|---|---|---|
A | Base model: year 2000, two surtaxes | 9.334 | 0.562 |
(s.e.) | | (0.227) | (0.011) |
B | Reform scenario: year 2000, one surtax | 9.316 | 0.682 |
(s.e.) | | (0.289) | (0.016) |
Change A − B | | 0.018 | −0.119 |
(s.e.) | | (0.184) | (0.010) |
Sample size in year 2000 | | 6,043 | 130,019 |
C | convexity: unchanged from baseline; netincdiff: reform | 9.622 | 0.571 |
(s.e.) | | (0.234) | (0.011) |
D | netincdiff: unchanged from baseline; convexity: reform | 9.034 | 0.673 |
(s.e.) | | (0.276) | (0.015) |
(i) Actual exit rates in 2000 were 9.813 and 0.595. (ii) Predicted exits are based on the estimated model from table 2. (iii) The percentage exits are calculated with respect to the stocks in each of the occupational categories. (iv) Case A refers to the actual situation as it was in year 2000 with two surtaxes; the values of convexity and netincdiff calculated in this scenario were used in the estimation of the main model. (v) Case B refers to a hypothetical reform scenario that replaces two surtaxes with just one surtax. New values of convexity and netincdiff are recalculated given the new tax rules. (vi) Case C considers values of convexity from the baseline scenario and values of netincdiff from the reform scenario. (vii) Case D considers values of netincdiff from the baseline scenario and values of convexity from the reform scenario. (viii) The above predictions and the associated standard errors were calculated using the delta method in Stata's margins command. Average exit rates as well as the differenced average exit rates were all calculated using all four hazards. (ix) All calculations are based on the weighted sample.
To further explore how the model predicts responses to separate changes in the two tax variables, we look at these effects separately. In case C we hold the convexity variable fixed at a value that is the same as in the base case scenario and let the netincdiff variable change. Conversely, in case D we see a change in the convexity variable only. Table 3 shows that the partial effect of a change in netincdiff is an increase in transitions out of both SE and WE. This result is consistent with the fact that mean netincdiff experiences a decrease in the reform scenario for the self-employed, whereas it increases for wage earners. A possible explanation for this effect is that the reduced progressivity of the tax system would encourage a larger share of wage earners who expect to be successful in self-employment to transit into SE. On the other hand, because a majority of self-employed individuals have been predicted to have a higher posttax income in regular employment, a flatter tax scenario would increase the proportion of them leaving SE for WE. In contrast, the decrease in convexity, common to both WE and SE observations, reduces the transitions from SE and increases the exit from WE. In summary, the hypothetical tax scenario is found to encourage the net inflow into SE. Translating these estimates to numbers, we find that such a policy would have resulted in an increase from 2.76 to 5.34 in the net inflow into SE.32
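The net-inflow figures can be approximately reproduced from table 3. The calculation below is our reading rather than the authors' documented procedure: it assumes that net inflow is measured as entries into SE minus exits from SE, expressed as a percentage of the SE stock, and it uses the year-2000 sample sizes reported in table 3 as the stocks. Under these assumptions the base case comes out at about 2.76 and the reform case close to the reported 5.34; the small remaining gap may reflect rounding of the reported exit rates or sample weighting.

```python
def net_inflow_pct(exit_se_pct, exit_we_pct, stock_se, stock_we):
    """Net inflow into SE as a percentage of the SE stock.

    exit_se_pct : predicted exit rate from SE (percent of the SE stock)
    exit_we_pct : predicted exit rate from WE (percent of the WE stock)
    """
    entries = exit_we_pct / 100.0 * stock_we   # WE -> SE transitions
    exits = exit_se_pct / 100.0 * stock_se     # SE -> WE transitions
    return 100.0 * (entries - exits) / stock_se

# Exit rates and sample sizes taken from table 3 (cases A and B).
STOCK_SE, STOCK_WE = 6_043, 130_019
base = net_inflow_pct(9.334, 0.562, STOCK_SE, STOCK_WE)
reform = net_inflow_pct(9.316, 0.682, STOCK_SE, STOCK_WE)

print(f"base: {base:.2f}, reform: {reform:.2f}")
```

The calculation also makes the mechanism in the text visible: because the WE stock is more than twenty times the SE stock, a change of about 0.12 percentage points in the WE exit rate dominates the negligible change in the SE exit rate.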
Finally, we briefly compare our results to the findings of Wen and Gordon (2014), given that the same variables are used to capture the effects of taxes and uncertainty. Wen and Gordon (2014) also simulated the effect of a flatter tax schedule in the year 2000 using Canadian data. Their policy reform implied decreases in the average values of (i) netincdiff and convexity from to (a decrease of ) and (ii) from to (a reduction of ). The policy reform we considered increased the average values of netincdiff by around , and reduced the average values of convexity by . From the simulated policy reform, Wen and Gordon (2014) estimate an increase in the number of self-employed individuals of ( to ), which is substantially below our estimate of (our experiment implies an increase of the self-employment share in 2001 from to ). One should however note that Wen and Gordon (2014) do not model transitions.
C. Sensitivity Checks
In this subsection we present results of some of our investigations into key assumptions of our empirical approach. We consider the following: (i) redefinition of a self-employment spell; (ii) estimation based only on the inflow sample; (iii) trimming the netincdiff with respect to extreme values; (iv) controlling for local unemployment rates; (v) including a dummy variable for individuals receiving some unemployment insurance during the year; and (vi) allowing for the share of capital in SE income to be nonzero. Table 4 reports the results of these investigations. The estimated effects of the tax variables are qualitatively unchanged. The full set of results is available in the online appendix A.3.
| | Fresh Spells | | Left-Censored Spells | |
|---|---|---|---|---|
| | SE | WE | SE | WE |
| Variables | [1] | [2] | [3] | [4] |
| A. Base case | | | | |
| netincdiff | −0.429 | 1.685 | −0.725 | 1.753 |
| | (0.053) | (0.082) | (0.109) | (0.087) |
| convexity 100 | 0.049 | −0.246 | −0.017 | −0.163 |
| | (0.015) | (0.021) | (0.030) | (0.023) |
| B. Changes to sample definition | | | | |
| netincdiff | −0.493 | 1.734 | −0.615 | 1.768 |
| | (0.016) | (0.026) | (0.034) | (0.028) |
| convexity 100 | 0.011 | −0.277 | 0.016 | −0.187 |
| | (0.005) | (0.007) | (0.009) | (0.007) |
| C. Excluding left-censored spells | | | | |
| netincdiff | −0.405 | 1.920 | | |
| | (0.053) | (0.083) | | |
| convexity 100 | 0.055 | −0.292 | | |
| | (0.015) | (0.022) | | |
| D. Using trimmed netincdiff | | | | |
| netincdiff | −0.333 | 2.281 | −0.871 | 2.998 |
| | (0.068) | (0.108) | (0.138) | (0.134) |
| convexity 100 | 0.065 | −0.222 | −0.061 | −0.237 |
| | (0.017) | (0.025) | (0.032) | (0.026) |
| E. Including regional unemployment rate 1996–2011 | | | | |
| netincdiff | −0.531 | 1.709 | −0.718 | 1.568 |
| | (0.057) | (0.096) | (0.110) | (0.083) |
| convexity 100 | 0.045 | −0.292 | 0.047 | −0.115 |
| | (0.018) | (0.026) | (0.029) | (0.025) |
| F. Including regional dummies 1996–2011 | | | | |
| netincdiff | −0.519 | 1.762 | −0.754 | 1.607 |
| | (0.057) | (0.095) | (0.110) | (0.081) |
| convexity 100 | 0.036 | −0.314 | 0.038 | −0.140 |
| | (0.018) | (0.026) | (0.030) | (0.024) |
| G. Including unemployment benefits dummy | | | | |
| netincdiff | −0.415 | 1.698 | −0.694 | 1.763 |
| | (0.053) | (0.082) | (0.109) | (0.087) |
| convexity 100 | 0.049 | −0.252 | −0.014 | −0.165 |
| | (0.015) | (0.022) | (0.030) | (0.023) |
| H. Using 3.7% capital income invested in SE | | | | |
| netincdiff | −0.434 | 1.712 | −0.719 | 1.761 |
| | (0.052) | (0.083) | (0.108) | (0.086) |
| convexity 100 | 0.058 | −0.264 | −0.008 | −0.166 |
| | (0.016) | (0.024) | (0.031) | (0.025) |
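As a reading aid for table 4, the following sketch computes the ratio of each base case coefficient to its standard error and a two-sided p-value. It assumes that the usual asymptotic normal approximation for maximum likelihood estimates applies; the dictionary layout and the function name `z_and_p` are illustrative and not part of the original analysis.

```python
import math

# (coefficient, standard error) pairs copied from table 4, panel A.
base_case = {
    ("netincdiff", "SE fresh"): (-0.429, 0.053),
    ("netincdiff", "WE fresh"): (1.685, 0.082),
    ("netincdiff", "SE left-censored"): (-0.725, 0.109),
    ("netincdiff", "WE left-censored"): (1.753, 0.087),
    ("convexity", "SE fresh"): (0.049, 0.015),
    ("convexity", "WE fresh"): (-0.246, 0.021),
    ("convexity", "SE left-censored"): (-0.017, 0.030),
    ("convexity", "WE left-censored"): (-0.163, 0.023),
}

def z_and_p(coef, se):
    """Return the z-ratio and the two-sided p-value under a standard normal."""
    z = coef / se
    # For a standard normal, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

results = {key: z_and_p(coef, se) for key, (coef, se) in base_case.items()}

for (variable, hazard), (z, p) in results.items():
    print(f"{variable:10s} {hazard:17s} z = {z:7.2f}  p = {p:.4f}")
```

Under this approximation, the convexity coefficient in the left-censored SE hazard is the only base case estimate that is small relative to its standard error.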
Our first investigation examines the influence of the definition of an SE spell. In our base model we included individuals in the sample if they had at least three years of labor market attachment, that is, if net SE income or WE income is larger in absolute value than the basic amount for at least three of the years in which the individual is observed in the data. We now redefine the sample to require only one year of labor market attachment. The results using this new definition are presented in panel B of table 4. Individuals with less attachment to the labor market would be expected to be more sensitive to changes in the tax variables, and this is what we find when we include these individuals in the estimation sample. The results are qualitatively similar to the results from our base case (panel A). However, the coefficient for convexity in the SE fresh spells hazard decreases substantially. Individuals with less attachment to the labor market and with low predicted SE income might be expected to be less sensitive to the progressivity of the tax system.
The base model was estimated using both the left-censored and fresh spells. We reestimate our model using only the inflow sample. This reduces the total number of unweighted observations to . The definition of an SE spell is the same as the one used in our base model. The results are presented in panel C of table 4. The results are broadly similar to our base model results. As expected, dropping those spells for which we have no information about the length of time they had spent in a particular state prior to the sample start slightly increases the estimates.
The third investigation omits observations with extreme predicted values of the variable netincdiff. As shown in figure 5, the distribution of netincdiff exhibits some lumpiness in the tails. To assess the effect of extreme values of netincdiff, we drop those individuals who have at least one occupation-specific netincdiff above the top 1% or below the bottom 1% cut-off values.33 Since individuals with very high or very low netincdiff would be expected to be less sensitive than the others, we would expect the estimated effects of netincdiff to be larger in absolute value. This is what we see in the results reported in panel D. In the base model (panel A), we found the WE exits to be more sensitive than the SE exits, and now we see that the effect of netincdiff goes up for the WE exits without much change in the SE exits.
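The trimming rule can be sketched as follows. The sketch drops every observation of an individual who has at least one netincdiff outside the 1st to 99th percentile range, so that the remaining individuals keep a continuous series. For simplicity it assumes a single pooled pair of cut-offs rather than occupation-specific ones, and the record layout, function names, and demonstration data are hypothetical.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def trim_individuals(records, lower_q=1.0, upper_q=99.0):
    """Drop all records of any person with at least one extreme netincdiff.

    records is a list of (person_id, year, netincdiff) tuples.
    """
    values = [value for _, _, value in records]
    low = percentile(values, lower_q)
    high = percentile(values, upper_q)
    # Flag the person, not the single observation, to keep series continuous.
    flagged = {pid for pid, _, value in records if value < low or value > high}
    return [record for record in records if record[0] not in flagged]

# Hypothetical panel: 100 persons observed in 3 years each.
records = []
for pid in range(100):
    for year in (2000, 2001, 2002):
        records.append((pid, year, (pid - 50) / 50.0))
# Two artificial outliers, one in each tail.
records[1] = (0, 2001, -10.0)
records[-1] = (99, 2002, 10.0)

trimmed = trim_individuals(records)
kept_ids = {pid for pid, _, _ in trimmed}
print(len(records), len(trimmed), len(kept_ids))
```

Because whole individuals are removed, the share of observations lost can exceed the 2% that trimming single observations in the two tails would imply.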
The next investigation examines the influence of local labor market conditions. In the main specification we use regional dummies to partially control for labor market conditions. Perhaps a better control for local labor market conditions would be the use of local unemployment rates. Unfortunately such information is available only from 1996, so we report two sets of results. In panel E we substitute the regional dummies with regional unemployment rates. In panel F we reestimate our base model using the restricted sample of 1996 to 2011. The results are very similar to each other and qualitatively similar to the baseline results.34
As described in section IV, in our base model we drop individuals who received more in social security benefits than their self-employment income or wages in any year. However, individuals may be unemployed for a short period and receive unemployment insurance that is small enough for them still to be classified as self-employed or as wage earners. Individuals with an interruption to their work might behave differently from individuals moving directly from WE to SE. We therefore include a dummy variable for those individuals who received unemployment insurance during the year. As panel G shows, the results are similar to those from the base model.
In Norway, self-employed individuals have the option of declaring a share of their self-employment income as capital income, which is taxed at a lower rate than labor income, as explained in section 2. The tax variables used in our main model are generated under the assumption that the share of capital income in total income is zero (see online appendix A.2). We believe our assumption is reasonable for the following reasons. First, it is not clear what is an appropriate assumption regarding the proportion of capital income used in the generation of counterfactual SE income distributions for the wage earners, which are also exogenous. Second, during our sample period, the share declared as capital income is either 0 or very small (the median value is 0.037). However, we check for sensitivity by regenerating our tax variables allowing for 3.7% of the predicted SE income to be reported as capital income instead of 0. The results are given in panel H. The effect of convexity is slightly stronger on the SE exit rates, and the rest of the estimated effects remain similar to the base model estimates.
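To illustrate why the assumed capital share matters for the tax variables, the sketch below splits a predicted SE income into a capital part and a labor part and recomputes net income and netincdiff. Everything numerical here is hypothetical: the flat capital rate, the two-bracket labor schedule, and the income levels are invented for illustration and are not the Norwegian tax rules used in the paper. The sign convention for netincdiff (log net SE income minus log net WE income) is also an assumption.

```python
import math

CAPITAL_RATE = 0.28  # hypothetical flat rate on capital income
LABOR_BRACKETS = [   # hypothetical schedule: (lower threshold in NOK, marginal rate)
    (0.0, 0.36),
    (400_000.0, 0.50),
]

def labor_tax(income):
    """Tax on labor income under the hypothetical progressive schedule."""
    tax = 0.0
    for i, (threshold, rate) in enumerate(LABOR_BRACKETS):
        upper = LABOR_BRACKETS[i + 1][0] if i + 1 < len(LABOR_BRACKETS) else math.inf
        if income > threshold:
            tax += (min(income, upper) - threshold) * rate
    return tax

def net_se_income(gross, capital_share):
    """Net SE income when a share of gross income is declared as capital income."""
    capital = gross * capital_share
    labor = gross - capital
    return gross - labor_tax(labor) - CAPITAL_RATE * capital

def netincdiff(net_se, net_we):
    """Assumed convention: log net SE income minus log net WE income."""
    return math.log(net_se) - math.log(net_we)

gross_se = 500_000.0  # hypothetical predicted SE income
net_we = 330_000.0    # hypothetical predicted net wage income

for share in (0.0, 0.037):
    net_se = net_se_income(gross_se, share)
    print(f"share = {share:.3f}: net SE = {net_se:.0f}, "
          f"netincdiff = {netincdiff(net_se, net_we):.4f}")
```

In this sketch, moving a small share of income from the labor base to the lower-taxed capital base raises net SE income slightly, which is the channel through which the capital share assumption could affect the generated tax variables.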
VI. Conclusion
We look at the effect of taxation on self-employment and wage employment durations. Our work complements the existing literature on many dimensions. First, in contrast to many existing studies, our definitions of self-employment and wage employment are based on income reported in Norwegian tax returns. The rest of the variables used come from various other registry data. Norwegian registry data are considered to be exceptional in terms of coverage and reliability (Blundell, Graber & Mogstad, 2015). Second, we look at the evolution of self-employment and wage employment spells over a very long period, from 1993 to 2011. We model these transitions using a two-state multispell duration model allowing for correlated unobserved heterogeneity and controlling for a rich set of sociodemographic characteristics.
We focus on the effects of two tax variables: netincdiff and convexity, obtained from Wen and Gordon (2014). netincdiff is defined as the difference in log net-of-tax income in the two occupations, and convexity is an individual-specific measure that captures the interaction between the progressivity of the tax schedule and the volatility of self-employment income relative to wage income. We use the model to predict the transitions under a simulated tax regime that reduced the progressivity of the tax schedule in the year 2000. We also provide some sensitivity checks with respect to the definition of self-employment, the selection of the estimation sample, and other factors. The estimated effects of our two tax variables of interest are qualitatively unchanged, and the quantitative differences are as expected.
The main finding is that, as predicted by theory, higher expected net earnings in self-employment relative to wage employment reduce the probability of exiting a self-employment spell. Entry into self-employment, or equivalently exit from wage employment, is found to be more sensitive to changes in the two variables than exit from self-employment. In our base model, the estimated effect of changes to netincdiff that are required when convexity changes by a percentage point, to encourage self-employment, is about nine to fourteen times larger in percentage point terms. To shed further light on this, we carried out a policy experiment by implementing a flatter tax schedule in the year 2000 that resulted in reduced tax progressivity. The hypothetical scenario was found to encourage entry into self-employment but not to affect exit from self-employment significantly, with the estimated inflow into self-employment increasing to 5.34 from the base model prediction of 2.76.
Notes
“Occupational choice” here means a choice between wage employment and self-employment.
The role of loss offsetting is less clear in the presence of a progressive tax schedule. If the tax rate is an increasing function of taxable income, the savings made because of the loss offset are usually lower in magnitude than the taxes paid on profits (Gentry & Hubbard, 2000).
A positive correlation between taxes and self-employment may also partly be attributed to the higher tax evasion or avoidance possibilities in self-employment relative to wage employment (see, for instance, Schuetze & Bruce, 2004). Our data do not allow us to address this issue. Recent tax evasion estimates for Norway show that around 14% of business income is not reported (Nygård et al., 2019). This estimate is lower than typical estimates for the United States but close to what is found among the self-employed in Finland (Johansson, 2005) and Denmark (Kleven et al., 2011). Slemrod (2007) estimates that around 57% of U.S. nonfarm business income was not reported. The time and individual unobservable effects included in our model will partially mitigate this problem if the differential evasion possibilities are relatively constant over the time period under consideration. Another issue is the possibility of a tax-induced organizational shift. See Papini (2018) for a recent analysis of this issue. We treat a self-employed individual who decides to incorporate and, thus, decides to earn wages from the company, as a wage earner. We also include region fixed effects to partly control for this issue, as this organizational shift was more common in some regions and time periods (Papini, 2018).
Thus, the focus is on being in self-employment at the time of interview and not on entering self-employment.
The interpretation given in Fossen (2009) is that a flatter tax schedule increases expected returns in self-employment, but at the same time it also increases the risk because the variance of the net income distribution also increases. The second effect is found to dominate the first one, and hence, a flatter tax schedule discourages self-employment.
The “flatter-tax” reform considered is found to increase the probability of finding someone in self-employment by 0.04 percentage points from the base model prediction of 5.76%.
The deductions include a standard personal allowance, a deduction for expenses including interest payments, and a basic allowance, which is a percentage (up to a maximum) of labor or pension income.
The exchange rate in 2005 was 1 USD = 6.45 NOK; 1 EUR = 8.01 NOK.
Capital income is calculated by multiplying the capital invested in the firm with a rate of return annually established by the government. The labor income is then estimated by subtracting the imputed capital income from the reported self-employment income net of expenses.
Note that the thresholds account for wage growth.
Following the early pioneering work by Lancaster (1979), and Nickell (1979), the literature on modeling durations using survival analysis has developed very fast. Lancaster (1990) and Van den Berg (2001) provide a comprehensive discussion of theoretical issues as well as empirical examples that helped to develop this literature. See Carrasco and García-Pérez (2015) for another recent application of a two-state multispell duration model with discretely distributed unobserved heterogeneity.
The distribution function is given by Some other popular distributions used are the standard normal and the logistic cdfs, which are symmetric distributions. The distribution we employ is not a symmetric distribution. A discrete time hazard model derived from an underlying continuous time proportional hazard model can be written in this form. See Narendranathan and Stewart (1993) for an application.
Theoretical results exist for lack of nonparametric identification in hazard models when one or more of the following are present: duration dependence, time-varying variables, time-varying effects, and unobserved heterogeneity. For example, Baker and Melino (2000), using simulations, look at the behavior of the nonparametric maximum likelihood estimator for a discrete duration model with unobserved heterogeneity and unknown duration effect, and find the estimator to be biased when both are nonparametrically specified. Unsurprisingly, empirical researchers have also found the model estimations to be unstable when most of the time effects are modeled in an unrestricted manner and have thus imposed some functional form restrictions to identify the parameters. See Ham and Rea (1987) for a discussion of these issues in the context of an empirical application.
Since immigrants are a group of “selected” individuals, we exclude them.
We also exclude individuals who do not report any wage income or business income that is larger than the “Basic amount” during the observation period for at least three years. The “Basic amount” is the base for calculating many of the Norwegian social insurance scheme's payments and was 78,024 NOK in 2011 (the approximate exchange rate in that year was 1 USD = 5.67 NOK; 1 EUR = 7.79 NOK).
Around 18% of the individuals in the sample experienced at least one “third-state” spell (periods of time that cannot be defined either as wage employment or as self-employment) and are omitted from the analysis.
Wen and Gordon (2014) represent the convex tax function specifying the after-tax income as , where the tax parameters and are such that , and represents the income at which the tax liability is zero. is the elasticity of posttax income with respect to pretax income (also see Musgrave & Thin [1948] and Benabou [2000]).
Online appendix A.3 contains the full set of estimates from the equations that we used to generate the income variables.
The paradox of self-employment being characterized by higher uncertainty and lower earnings than wage employment is a common finding in previous studies (see, for example, Hamilton [2000] and Hurst & Pugsley [2011], or Berglann et al. [2011] for the case of Norway). There are several possible explanations for this puzzle. Among them, (i) the relevance of unobserved nonpecuniary benefits; (ii) unobserved underreporting of income by the self-employed; and (iii) overestimation by the self-employed of their probability of success.
Negative convexity values are possible if the tax function is not convex. Estimated convexity is 0 for about 1.5% of the observations and negative for about 5.5% of the observations.
Another possible explanation for this is the increased uncertainty due to the recession in the early 2000s.
We carried out an analysis of covariance to assess the contribution of various factors to the variation of the two tax variables. We included all the variables (sex, marital status, education, region, children, family head, year dummies, two selection correction terms, and estimated variances) that were used in the predictions of these two tax variables along with the “other” tax variable (convexity or netincdiff). The model R-squared values were and respectively in the netincdiff and convexity equations. The top four largest contributors explained of the model sum of squares (SS) in the netincdiff equation. These were education, selection into SE, and the regional and year dummies. With regard to the convexity variable, the top four largest contributors were the year effects, education, and estimated heteroskedastic functions, which together explained of the model SS. The convexity (netincdiff) variable in the netincdiff (convexity) equation explained less than () of the model variations. The largest contributions to the model SS came from the year effects.
This is the number of individuals exiting during the year divided by the number of individuals in that state at the beginning of the year.
The bootstrapped standard errors to account for the tax variables being “generated regressors” did not change the significance of our variables compared to the usual maximum likelihood standard errors for our base model reported in table 2. Hence, we report only the usual MLE standard errors in this table and subsequent tables.
According to exchange rates for 2000: 1 EUR = 8.11 Norwegian kroner (NOK), and 1 USD = 8.81 NOK.
All predictions including the differences in predicted exit rate, and the associated standard errors, use all four hazards. These are calculated using STATA's margins command.
The observed exit rates in 2000 were 9.813 and 0.595.
The predicted probability of exit from SE in the reform scenario is not statistically significantly different from the base model, and so we use the base model predicted probability. With the reform scenario prediction, the predicted net inflow would rise to 5.36.
To preserve a continuous series of observations, all observations belonging to an individual are dropped if we find at least one netincdiff that is either below the first percentile or above the 99th percentile value for that individual, resulting in a loss of more than 2% of the sample. We lose about 9% of the observations, resulting in 432,409 observations in our unweighted sample. The definition of an SE spell is the same as the one used in our base model.
We made multiple attempts but were unable to find significant unobserved heterogeneity in these models with the reduced number of years. We therefore report results from the model where we set the unobserved heterogeneity component to
REFERENCES
Author notes
This paper is part of the research of Oslo Fiscal Studies supported by the Research Council of Norway. We are grateful to Statistics Norway (SSB) for providing us with access to the confidential administrative data used in the paper. The paper has benefited from comments from many individuals. We are very grateful to Thor O. Thoresen for his detailed comments on multiple drafts of the paper. The paper has benefited from comments received from Frank Fossen, Åsa Hansson, Ben Lockwood, Jean-François Wen, the participants at the Workshop on Self-Employment/Entrepreneurship and Public Policy held at the University of Oslo, September 2016, and the participants at the Skatteforum held in Hadeland, June 2017, the referees, and the editor of this journal. The views expressed are purely those of the authors and may not in any circumstances be regarded as stating an official position of the European Commission.
A supplemental appendix is available online at https://doi.org/10.1162/rest_a_01046.