SILC_ESQRS_A_FI_2011_0000 - Version 3

National Reference Metadata in ESS Standard for Quality Reports Structure (ESQRS)

Compiling agency: Statistics Finland.

Time Dimension: 2011-A0

Data Provider: FI1

Data Flow: SILC_ESQRS_A


For any question on data and metadata, please contact: EUROPEAN STATISTICAL DATA SUPPORT


1. Contact
1.1. Contact organisation: Statistics Finland.
1.2. Contact organisation unit: Population and Social Statistics. (The unit has been valid from 1.1.2013 onwards; earlier Social Statistics)
1.5. Contact mail address: FI-00022 Statistics Finland, Finland


2. Introduction
 

The production of quality reports is part of the implementation of the EU-SILC instrument. In order to assess the quality of data at national level and to make a comparison among countries, the National Statistics Institutes are asked to report detailed information mainly on: the entire statistical process, sampling and non-sampling errors, and potential deviations from standard definition and concepts.

This document follows the ESS standard for quality reports structure (ESQRS), which is the main report structure for reference metadata related to data quality in the European Statistical System. It is a metadata template, based on 13 main concepts, which can be used across several statistical domains with the purpose of a better harmonisation of the quality reporting requirements in the ESS.

For that reason the template of this document differs from the one laid down in Commission Regulation 28/2004.

Finally, the report combines the previous intermediate and final quality reports, so from 2013 onwards it refers to both the cross-sectional and the longitudinal data.

This document, whose first version was updated at the end of 2012, refers only to the Intermediate Quality Report of the EU-SILC 2011 operation.

 


3. Quality management - assessment

-


4. Relevance
4.1. Relevance - User Needs

-

4.2. Relevance - User Satisfaction

-

4.3. Completeness

-

4.3.1. Data completeness - rate

-


5. Accuracy and reliability
5.1. Accuracy - overall
 

In terms of precision requirements, the EU-SILC framework regulation and the Commission Regulation on sampling and tracing rules refer, respectively, to the effective sample size to be achieved and to the representativeness of the sample. The effective sample size combines the sample size and the sampling design effect, which depends on the sampling design, population structure and non-response rate.
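As a rough illustration of how sample size and design effect combine, the sketch below divides an achieved sample size by a design effect. The function name and the design effect of 1.2 are hypothetical and are not taken from this report; only the figure of 9,351 interviews comes from the 2011 survey tables later in this section.

```python
def effective_sample_size(achieved_sample_size: float, design_effect: float) -> float:
    """Return the effective sample size n / deff.

    A design effect above 1 means the design is less precise than simple
    random sampling of the same size, so the effective size shrinks.
    """
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return achieved_sample_size / design_effect


if __name__ == "__main__":
    # 9,351 interviews (2011 survey); deff = 1.2 is an assumed, illustrative value.
    print(effective_sample_size(9351, 1.2))
```

With a design effect of exactly 1 the effective and achieved sample sizes coincide, which is a convenient sanity check for any such calculation.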

 

Considering the Finnish sample design described in section 12 and data representativeness in general, it can be summarised that:

  • The Finnish SILC 2011 is based on a nationally representative probability sample of the population residing in private households (non-institutionalised persons, two-phase stratified sampling design),
  • All private households and all persons aged 16 and over within the household are eligible for the operation (selection of persons, creation of household-dwelling units around persons and definition of households, i.e. housekeeping units, during the interviews),
  • Representative probability samples are achieved both for households, which are the basic units of sampling, data collection and data analysis, and for individual persons in the target population (selection of persons aged 16 and over from the register, creation of household-dwelling units around persons and definition of households, i.e. housekeeping units, during the interviews), and
  • The sampling frame and methods of sample selection ensure that every individual and household in the target population is assigned a known and non-zero probability of selection (for every non-institutionalised person the probability of selection is identified and greater than zero).

 

The precision requirements have been met by an effective sampling design (section 12.1.1), sampling rates and sizes (section 12.1.3) and weighting methods for non-response correction (section 12.5.1).

 
5.2. Sampling error
 

EU-SILC is a complex survey involving different sampling designs in different countries. In order to harmonize sampling errors and make them comparable among countries, Eurostat (with the substantial methodological support of Net-SILC2) has chosen to apply the "linearization" technique coupled with the "ultimate cluster" approach for variance estimation. Linearization is a technique based on the use of linear approximation to reduce non-linear statistics to a linear form, justified by asymptotic properties of the estimator. This technique can encompass a wide variety of indicators, including EU-SILC indicators. The "ultimate cluster" approach is a simplification consisting in calculating the variance taking into account only variation among Primary Sampling Unit (PSU) totals. This method requires first-stage sampling fractions to be small, which is nearly always the case. It allows great flexibility and simplifies the calculation of variances. It can also be generalized to calculate the variance of differences from one year to another.
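The "ultimate cluster" idea can be sketched as follows, assuming the indicator has already been linearized so that each record carries a linearized value z. The record layout (stratum, PSU, weight, z) and the function name are illustrative assumptions, not Eurostat's actual implementation; the formula is the standard with-replacement estimator, summing over strata n_h/(n_h - 1) times the squared deviations of weighted PSU totals from their stratum mean.

```python
from collections import defaultdict


def ultimate_cluster_variance(records):
    """Variance of an estimated total using only variation among PSU totals.

    records: iterable of (stratum, psu, weight, z) tuples, where z is the
    (already linearized) value of the statistic for one unit.
    """
    # Step 1: weighted total of z within each PSU.
    psu_totals = defaultdict(float)
    for stratum, psu, weight, z in records:
        psu_totals[(stratum, psu)] += weight * z

    # Step 2: group the PSU totals by stratum.
    totals_by_stratum = defaultdict(list)
    for (stratum, _psu), total in psu_totals.items():
        totals_by_stratum[stratum].append(total)

    # Step 3: sum the between-PSU variation over strata.
    variance = 0.0
    for totals in totals_by_stratum.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # a single-PSU stratum contributes nothing in this sketch
        mean_h = sum(totals) / n_h
        variance += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in totals)
    return variance
```

For a one-stage design such as the Finnish one, the PSU identifier would simply be the household ID (DB030), so each household forms its own cluster.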

The main hypothesis on which the calculations are based is that the "at risk of poverty" threshold is fixed. According to the characteristics and availability of data for different countries we have used different variables to specify strata and cluster information. In particular, countries have been split into four groups:

1) For BE, BG, CZ, IE, EL, ES, FR, IT, LV, HU, NL, PL, PT, RO, SI, UK and HR, whose sampling design could be assimilated to a two-stage stratified type, we used DB050 (primary strata) for strata specification and DB060 (Primary Sampling Unit) for cluster specification;

2) For DE, EE, CY, LT, LU, AT, SK, FI and CH, whose sampling design could be assimilated to a one-stage stratified type, we used DB050 for strata specification and DB030 (household ID) for cluster specification;

3) For DK, MT, SE, IS and NO, whose sampling design could be assimilated to simple random sampling, we used DB030 for cluster specification and no strata;

 

In case Eurostat methodology is not accepted by your country, please describe the methodology used at national level for computing the estimates.

 

The Finnish EU-SILC uses a two-phase stratified sample design (see section 12.1.1), with simple random sampling without replacement in the 1st phase and stratified simple random sampling with unequal allocation emphasising some groups in the 2nd phase. The standard errors are calculated by the bootstrap method (3,000 replications in 2011). The idea is to estimate the standard error of the second phase separately by carrying out simple random sampling with replacement in every stratum, using the original sample size of the stratum. The calibration is conducted in every replication, and the weights are an outcome of this process. In order to adjust for the effect of the separate panels, the calibrated weights are multiplied by the proportions of accepted rotational group samples from the whole sample. The variance used is simply the variance of the bootstrap estimator. In addition, the non-negligible sampling fraction has been taken into account by multiplying the variance by the finite population correction at the whole sample level, i.e. approximately 0.79. The standard error is the square root of the variance.
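The resampling step described above can be sketched as follows. This is a simplified illustration only: it resamples with replacement within each stratum at the original stratum size and applies a single finite population correction to the variance, but it omits the recalibration of weights in every replication and the rotational group adjustment that the Finnish procedure includes. The function and argument names are hypothetical.

```python
import random
import statistics


def bootstrap_standard_error(strata, estimator, replications=3000, fpc=1.0, seed=0):
    """Stratified with-replacement bootstrap standard error.

    strata: dict mapping a stratum label to a list of sample units.
    estimator: function taking a list of units and returning a number.
    fpc: finite population correction applied to the variance
         (the report quotes approximately 0.79 for 2011).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        resample = []
        for units in strata.values():
            # Draw as many units as the stratum originally had, with replacement.
            resample.extend(rng.choices(units, k=len(units)))
        estimates.append(estimator(resample))
    # Variance of the bootstrap estimates, scaled by the correction factor.
    variance = statistics.pvariance(estimates) * fpc
    return variance ** 0.5


def weighted_mean(units):
    """Example estimator for units given as (weight, value) pairs."""
    return sum(w * y for w, y in units) / sum(w for w, _ in units)
```

A call such as `bootstrap_standard_error(strata, weighted_mean, fpc=0.79)` would then return the corrected standard error for a weighted mean; any other estimator with the same call signature can be substituted.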

 

The variance estimation method involves some sources of uncertainty. The real non-response effect has been omitted. The with-replacement nature of the resampling differs from the original selection, and the use of the finite population correction at the general level does not take into account the non-proportional allocation. For this reason, the standard error estimates are expected to be slightly conservative.

 
5.2.1. Sampling error - indicators
 
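Group | AROPE: Ind. value | AROPE: Stand. error | AROPE: Half CI (95%) | At risk of poverty (60%): Ind. value | At risk of poverty (60%): Stand. error | At risk of poverty (60%): Half CI (95%) | Severe Material Deprivation: Ind. value | Severe Material Deprivation: Stand. error | Severe Material Deprivation: Half CI (95%) | Very low work intensity: Ind. value | Very low work intensity: Stand. error | Very low work intensity: Half CI (95%)
Total | 17.6 | 0.672 | 16.1, 19.1 | 13.6 | 0.710 | 12.1, 15.1 | 3.1 | 0.360 | 2.3, 3.9 | 10.4 | 0.610 | 9.1, 11.7
Male | 17.0 | 0.775 | 15.3, 18.7 | 13.0 | 0.775 | 11.3, 14.8 | 3.1 | 0.431 | 2.2, 4.1 | | |
Female | 18.3 | 0.850 | 16.4, 20.2 | 14.1 | 0.876 | 12.3, 16.1 | 3.1 | 0.419 | 2.2, 4.1 | | |
Age 0-17 | 15.7 | 1.453 | 12.6, 19.0 | 11.6 | 1.321 | 8.9, 14.6 | 3.0 | 0.796 | 1.6, 5.0 | | |
Age 18-64 | 17.7 | 0.686 | 16.2, 19.2 | 12.7 | 0.694 | 11.2, 14.2 | 3.4 | 0.390 | 2.6, 4.3 | | |
Age 65+ | 19.8 | 1.701 | 16.2, 23.7 | 19.0 | 1.730 | 15.4, 22.9 | 1.9 | 0.580 | 0.9, 3.4 | | |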
 
5.3. Non-sampling error
 

Non-sampling errors are basically of 4 types:

  • Coverage errors: errors due to divergences existing between the target population and the sampling frame.
  • Measurement errors: errors that occur at the time of data collection. There are a number of sources for these errors such as the survey instrument, the information system, the interviewer and the mode of collection
  • Processing errors: errors in post-data-collection processes such as data entry, keying, editing and weighting
  • Non-response errors: errors due to an unsuccessful attempt to obtain the desired information from an eligible unit. Two main types of non-response errors are considered:
      – Unit non-response: refers to the absence of information on whole units (households and/or persons) selected into the sample. In the Finnish sample, unit non-response refers to the absence of information on the selected sample persons.
      – Item non-response: refers to the situation where a sample unit has been successfully enumerated, but not all required information has been obtained.
 
5.3.1. Coverage error
 

Coverage errors include over-coverage, under-coverage and misclassification:

  • Over-coverage: relates either to wrongly classified units that are in fact out of scope, or to units that do not exist in practice
  • Under-coverage: refers to units not included in the sampling frame
  • Misclassification: refers to incorrect classification of units that belong to the target population

Coverage errors of the Finnish SILC data are minor due to the exhaustive and up-to-date data source used for the sampling frame (section 12.1).

 
5.3.1.1. Over-coverage - rate
 
 

 | Main problems | Size of error
Cross sectional data | Over-coverage; Under-coverage; Misclassification | -

 
5.3.2. Measurement error
 

Cross sectional data

Source of measurement errors

Building process of questionnaire

Interview training

Quality control

Problems emerging in connection with integrating different data sources

 

The Finnish microdata is collected from administrative data and by computer-assisted telephone interviews (CATI). Administrative data is linked to the sample persons and co-residents by personal identification codes (ID codes). All person registers used for data compilation include ID codes. In the sampling phase, the codes are retrieved from the sampling frame. They are then used in all phases of integrating different data sources. ID codes can be missing only for recently immigrated persons, members who have recently moved into the household or newborn co-residents.

 

To help the interviewer define the household members of the selected sample person, the questionnaires of the first survey year or wave are pre-filled with the ID codes of the persons registered in the same dwelling as the sample person. The questionnaires of the second and later waves are pre-filled using the information collected in the first wave. The interviewer checks each co-resident’s position in the household, deleting those who do not belong to it and adding the names, dates of birth and ID codes of those who have moved into it. The ID codes to be added are either given by the respondent or searched later from the population register, using dates of birth, names and other information. If no ID code is found, the person is deleted from the database. These cases are quite rare. In the 2011 gross sample operation, the ID code was missing for only 11 co-residents.

 

More importantly, reference periods in the administrative data sources may become a problem. In the Finnish SILC microdata, the variables on education attendance (PE010, PE020) are retrieved from registers referring to attendance in educational institutes on September 26th of the year preceding the survey year. Since most of the institutes start their semesters in August or early September and usually continue to May of the survey year, the coverage and reference period of this data source are reasonable.

 

 

Problems due to the need to keep up the national time series

 

According to the principles commonly agreed in the planning phase of the SILC survey, it was important to ensure the coherence between the new instrument and established national statistics. In Finland, the SILC survey was integrated into the national Income Distribution Survey (IDS), compiled yearly since 1976. A major problem in maintaining the national series was and still is in reconciling the variables on labour activity. 

 

In the IDS, the reference period for the labour information is the income reference year. In the SILC, labour information refers mainly to the current situation. Different reference periods in IDS and SILC concern the variables PL031, PL040, PL050, PL111, PL130, PL140 and PL150. The SILC variables PL073 - PL090 are also in contradiction with similar IDS monthly activity variables: overlapping activities are permitted in IDS, but in SILC, one should define one's main activity for each month. Both reference periods cannot be collected; different reference periods would make it very hard for the interviewee and the interviewer to give accurate information, especially in cases where changes occurred during the income reference period (IRP) or currently. To make the fieldwork easier, from the very start of the EU-SILC, the reference periods were integrated. "Current" is operationalized as the December of the IRP.

 

Examples of labour information with different requirements in the IDS and EU-SILC

Concepts / Variables | Requirements: IDS | Requirements: EU-SILC | Solution: Integrated
 | | | Current = December of the IRP
Main job | The longest period of employment during the year or the job with the highest income | Current | If main job is different from current job, information about both jobs is collected
Second job | The second longest period of employment during the year or the job with the second highest income | Current | If second job is different from current second job, information about both jobs is collected
PL020 | --- | Current - 4 weeks | December
PL025 | --- | Current + 2 weeks | December
PL031 | --- | Current | December
PL040 | Status in main job | Current | If main job is different from current job, information about both jobs is collected
PL050 | Occupation in main job | Current | If main job is different from current job, information about both jobs is collected
PL073, PL074, PL075, PL076, PL080, PL085, PL086, PL087, PL088, PL089, PL090 | Number of months for each activity - 12 categories - overlaps allowed | Number of months for each main activity - no overlaps allowed | Number of months and calendar of activities collected for all members 16+
PL111 | NACE in main job | Current | If main job is different from current job, information about both jobs is collected
PL140 | Contract in main job | Current | If main job is different from current job, information about both jobs is collected

 

Source of measurement error: selection of the informant

 

Finland’s EU-SILC uses the selected respondent model. Typically, only one member of the household is interviewed. As a rule, this interviewee should be the selected sample person.

In the EU-SILC, it is important that the subjective questions are answered by the selected respondents. The selected respondent provides all the information for the household questionnaire and all personal questionnaires.

Often a choice has to be made between the selected respondent and another member of his/her household. A significant part of the data collected from other members is about the household’s affairs. The selected respondent (especially the youngest selected respondents who still live with their parents, or very old respondents) may not be aware of the household economy, household debts, child care, housing items, the other household members' activities, or many other items.

 

A proxy respondent is defined as a respondent who is not the selected respondent. A proxy answered for 16 per cent of the 9,351 selected respondents in 2011. In 84 per cent of the households, the selected respondent was interviewed. The proxy respondent rate has been slowly decreasing over the years: from 25 per cent in 2004 to 16 per cent in 2011.

 

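Use of proxy respondents, the 2011 survey

Informant | Information on selected person: N | Information on selected person: % | Information on co-resident: N | Information on co-resident: % | Total: N | Total: %
The person him/herself | 7 849 | 83.9 | 1 501 | 16.4 | 9 350 | 50.4
Proxy | 1 502 | 16.1 | 7 650 | 83.6 | 9 152 | 49.6
Total | 9 351 | 100.0 | 9 151 | 100.0 | 18 502 | 100.0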

 

 

Source of measurement error: General fieldwork problems

 

The main data collection mode is computer-assisted telephone interview (CATI). According to the interviewers' estimate, about half of the interviews are made through mobile phones. In about 6 per cent of those cases, interviews take place outside the respondent’s home. Telephone interviews are afflicted by a sense of rush. In large households, the interview is too long for the telephone. The interviewers are allowed to change the mode to CAPI in cases where the respondent has no phone or has an exceptionally large household. CAPI mode was used in 265 households, that is, 3 per cent of all households in the 2011 survey.

 

Interview duration. According to the Interviewers' Feedback Survey 2011, 31 per cent of the interviewers felt that the duration of the interview was too long. 38 per cent of them assessed it had an effect on the refusal rate and 17 per cent thought that it weakened the quality of responses.

 

Distribution of total duration of interview by rotational group (DB075), the 2011 survey

Rotational group (wave); duration in minutes | 1-25 | 26-35 | 36-60 | 61- | Missing | Total | Mean
Cross-section, total, n | 5 486 | 2 377 | 1 323 | 160 | 5 | 9 351 | 25.5
Cross-section, total, % | 58.7 | 25.4 | 14.2 | 1.7 | 0.1 | 100.0 |

2 (Wave 1), n | 1 544 | 1 158 | 747 | 94 | 2 | 3 545 | 29.4
2 (Wave 1), % | 43.6 | 32.7 | 21.1 | 2.7 | 0.1 | 100.0 |
1,4,3 (Wave 2,3,4), n | 3 942 | 1 219 | 576 | 66 | 3 | 5 806 | 23.2
1,4,3 (Wave 2,3,4), % | 67.9 | 21.0 | 9.9 | 1.1 | 0.1 | 100.0 |

1 (Wave 2), n | 1 942 | 683 | 340 | 37 | 1 | 3 003 | 24.0
1 (Wave 2), % | 64.7 | 22.7 | 11.3 | 1.2 | 0.0 | 100.0 |
4 (Wave 3), n | 980 | 294 | 130 | 18 | 1 | 1 423 | 21.6
4 (Wave 3), % | 68.9 | 20.7 | 9.1 | 1.3 | 0.1 | 100.0 |
3 (Wave 4), n | 1 020 | 242 | 106 | 11 | 1 | 1 380 | 22.9
3 (Wave 4), % | 73.9 | 17.5 | 7.7 | 0.8 | 0.1 | 100.0 |

 

 

Source of measurement error: Variable-specific problems

 

HS130 Lowest monthly income to make ends meet. The difficulty of this question for the respondent is well illustrated by the large number of item non-responses in the cross-section data. Very low and very high figures were also reported.

 

PL060, PL100 Number of hours usually worked per week in main job, Total number of hours in the second, third…jobs. Item non-response rate is rather high, obviously, due to proxy respondents’ inability to report the hours accurately. 

 

PE010–PE040, variables on education are collected from different registers. Variables PE010 and PE020 refer to the autumn semester (September 15th) preceding the operation year. Since the school year and academic year usually last from August-September to May-June, the current situation is sufficiently covered. PE030 (Year when the highest level of education was attained) includes a large number of missing values due to register imperfection, while the coverage of PE040 (highest ISCED level attained) is satisfactory. PE040 is known to be imperfect in the case of migrants.

 

RB031 (Year of immigration) is retrieved from the population register. The data coverage is deficient in cases of immigration that took place before 1990.

 

 

 

 

 

 

General description of the fieldwork tools

 

List of field work tools of EU-SILC 2011 (income reference period 2010)


1 Questionnaires for CATI/CAPI interviews

1 2011 for all waves, Finnish/Swedish

2 Interviewer's instructions

2A Instructions book for all panels, Finnish/Swedish

3 Contact letter

3A Contact letters to the selected persons,  first panel, 3 different letters, Finnish/Swedish
3B Contact letters to the selected persons,  second panel, 3 different letters,  Finnish/Swedish
3C Contact letters to the selected persons,  third panel, 3 different letters, Finnish/Swedish

4 Brochures to present the why and how the survey is executed, Finnish/Swedish

5 Pocket Statistics: a small collection of results from previous waves of the SILC survey, especially prepared for the respondents who wanted to know more on how the information is used, Finnish/Swedish

6 List of questions on housing costs; the interviewer may send the list to the respondent to let him find out about the costs in advance before the interview takes place, Finnish/Swedish.

7 List of questions on child day care payments in the income reference period (which are collected for national purposes)

The BLAISE-programmed questionnaire is divided into blocks of questions: a specific block for each household member aged 16+, child care block, health questions block, housing block, a block on household economics, a block on household composition. The order of the blocks is optional: the interviewer can choose the order. Only the household composition has to be fixed first, after that the interviewer is free to choose the order of the blocks. In case he does not choose blocks himself, the order is automatic.

 

Questionnaire build-up has its starting point in the previous year's questionnaire, feedback from the field interviewers and feedback from the data editing process and users. The leading principle in the questionnaire build-up has been a gradual integration process of the SILC to the IDS, and avoidance of too many changes in the national IDS. Of course, another starting point of the questionnaire update is to check the changes in the SILC doc65. The changes were minimal in the 2011 survey.

 

During the process of BLAISE programming (fall 2010), the questionnaire was table-tested several times by the team responsible for the IDS and EU-SILC. Six persons were involved. The focus was on the parts of the questionnaire undergoing some change. In the end, a group of professional interviewers checked the questionnaire against their experience. Finally, the technical functioning of the questionnaire was tested in the interviewer organisation before it was sent to the field.

 

The testing procedure makes use of the BLAISE-programmed questionnaire. The real field situation is simulated by a test sample of actual households from the preceding year's database. Thus the test questionnaire is pre-filled with the information about the household composition and dates of birth. As in the real field situation, the second and subsequent panels have more information from the previous interview entered into the questionnaires. The testers fill in the questionnaire over and over again, trying all combinations of imagined situations, and likely errors (to disclose signalling), too. Testers are asked to pay attention to

 

- spelling, language, formulations and conceptual correctness of the questions,

- proper functioning of the routings and

- adequacy of logical checks, signals and interviewing instructions on the screen.

 

Changes in the questionnaire

 

An additional remark concerns the change of reference period for child care reported in the 2010 quality report: for children starting school the year preceding the operation, i.e. children aged 7, the frequencies of formal care declined rather clearly. This is a result of changing the reference period from autumn to spring.  When starting school, children often go to formal care before or after school hours. After a few months, many children stop using formal care services.

 

 

Occupational classification was double-coded using ISCO88 and ISCO08. The data was collected using automatic coding and ISCO88 classification, and later changed to ISCO08 using a conversion tool.

 

Building process of questionnaire: selection of the informant 

The contact letter is also sent to his/her parents or guardians, if the selected person is aged less than 18 years. In 2011, 89 per cent of the selected respondents under the age of 18 have been represented by a proxy respondent.

 

Distribution of proxy interviews by their relationship to the selected person in age groups, the 2011 survey

Age of the selected respondent | Informant: Selected respondent | Proxy: Spouse | Proxy: Child | Proxy: Parent | Proxy: Sibling | Proxy: Other | Proxies, total | All interviews
16-17 | 25 | 0 | 0 | 205 | 0 | 0 | 205 | 230
18-24 | 570 | 30 | 0 | 255 | 0 | 0 | 285 | 855
25-44 | 2384 | 259 | 1 | 40 | 1 | 0 | 301 | 2685
45-64 | 3301 | 420 | 5 | 10 | 2 | 0 | 437 | 3738
65+ | 1569 | 228 | 31 | 0 | 8 | 6 | 273 | 1842
Total | 7849 | 937 | 37 | 510 | 11 | 6 | 1501 | 9350

Interviewing more than one household member – both the selected person and a household respondent – is supported, but it rarely happens. Other members are allowed to be consulted during the interview if they are available. This option is often used. In 1372 interviews in 2011 (1597 in 2010), 20 % of all, other persons were consulted during the interview.

 

The questionnaire is built to enable a change of respondent during the interview. The questionnaire is programmed to accommodate the mode of addressing the respondent depending on whether the selected person him/herself or another member of the household is responding (interviewing the selected respondent about himself: Did you…; interviewing through a proxy respondent: Did N.N. …). This helps the interviewer and respondent to keep control of the member-specific data collection.

 

Use of proxy is denied only in the self-reported health questions (PH010-PH030). A special procedure and a reminder function has been built in the questionnaire to help interviewers contact the selected respondent separately for health questions in those cases where a proxy has answered the rest of the questionnaire. Some ad hoc modules also require a separate contact with the selected respondent.

 

Building process of questionnaire: General fieldwork problems

 

To help communication by telephone and shorten the interview duration, a number of pre-fills using the data from the previous wave are provided in the questionnaire. Each year, new ideas on how to relieve the respondent burden are applied to the questionnaire. According to the feedback from the interviewers, it was easier to manage the questionnaire in 2011 than in the previous year. In the longer run, the questionnaire has improved. The percentage of interviewers who felt that the questionnaire functioned technically badly fell from 13 per cent in 2005 to 8 per cent in 2011. The percentage of those who felt that the questionnaire functioned badly as to the substance fell from 18 per cent in 2005 to 6 per cent in 2011.

 

Letters describing the purpose and contents of the survey are sent to the selected respondents. A special brochure describes the data construction, how the results are published and data protection issues. Respondents also get a small booklet with some results from the earlier survey years. Respondents who have refused to take part in the survey receive a special letter trying to persuade them to participate.

 

It is also possible for the interviewer to change the mode of collection from CATI to CAPI. CAPI is allowed as an exception in cases where the respondent cannot or does not want to be interviewed by phone. Especially households with no phones or large households use this opportunity. Since the CAPI questionnaire is identical with the CATI questionnaire, the possible differences in outcomes must be due to interaction differences, or, the fact that the CAPI-interviewed households differ from the average in many respects, e.g. income and family composition. 

 

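Type of interview (n, %), the 2011 survey

Type | 1. wave: n | 1. wave: % | 2. wave: n | 2. wave: % | 3. wave: n | 3. wave: % | 4. wave: n | 4. wave: % | Total: n | Total: %
CATI | 3 372 | 95.1 | 2 942 | 98.0 | 1 402 | 98.5 | 1 370 | 99.3 | 9 086 | 97.2
CAPI | 173 | 4.9 | 61 | 2.0 | 21 | 1.5 | 10 | 0.7 | 265 | 2.8
Total | 3 545 | 100.0 | 3 003 | 100.0 | 1 423 | 100.0 | 1 380 | 100.0 | 9 351 | 100.0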

 

 

Building process of questionnaire:  Variable-specific problems

 

HS130 The wording of the question is essential. The wording was reformulated in the Survey laboratory of Statistics Finland for the 2006 operation but the high level of non-response prevails.

 

PL060, PL100 If the respondent could not give an exact answer, he/she was asked whether the hours exceed 30 hours per week.

 

The missing values were imputed (hot deck) using a modified 2-digit level of the ISCO classification, separately by gender and age, together with information on whether the job was the main or the second job and whether it was a part-time job. Imputation was conducted for the first time in the 2008 operation. In 2009, there were 568 imputed records in PL060 and 100 in PL100.
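A hot deck imputation within classes can be sketched as below. This is an illustration under stated assumptions, not the procedure actually used by Statistics Finland: donors are drawn at random within each imputation class, records in classes without donors are left missing, and the field names in the usage example (`isco2`, `sex`, `hours`) are hypothetical.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, class_keys, target, seed=0):
    """Fill missing `target` values with a random donor value from the same class.

    records: list of dicts; a missing value is represented by None.
    class_keys: names of the fields that define the imputation classes.
    Returns new dicts; imputed records get an extra "imputed": True flag.
    """
    rng = random.Random(seed)

    # Collect observed (donor) values by imputation class.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in class_keys)].append(rec[target])

    result = []
    for rec in records:
        new_rec = dict(rec)  # do not modify the caller's records
        if new_rec[target] is None:
            pool = donors.get(tuple(new_rec[k] for k in class_keys))
            if pool:  # leave the value missing when the class has no donor
                new_rec[target] = rng.choice(pool)
                new_rec["imputed"] = True
        result.append(new_rec)
    return result
```

For example, `hot_deck_impute(records, ["isco2", "sex"], "hours")` would fill missing weekly hours from respondents with the same occupation group and gender. In practice, classes without donors would need to be collapsed into broader ones.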

 

PE010–PE040, Not asked in the interview

 

RB031, Not asked in the interview

 

General description of the interviewer training routines

 

Statistics Finland's interviewer organisation employs about 160 field interviewers on a permanent work contract. They work mostly part-time. They are given basic training on interviewing and questionnaire standards and codes of practices when they start working. They collect most of Statistics Finland's survey data, for the Labour Force Survey, Household Budget Survey, Time Use Survey and Adult Literacy Survey, for example. In other words, they are experienced.

 

The questionnaire changes were introduced to the interviewers in a separate written report and, of course, in the instructions book. The instructions book is rewritten every year and it is also under constant development.

 

Newly recruited interviewers were trained separately. They had two days' training on the SILC. The training programme included a lecture on the planning of the survey, including a description of Eurostat's process, legislation and future uses of the data, and Eurostat guidelines on data protection. Concern over international comparability was underlined. Instructions on the fundamental rules of central data collection were given and discussed, such as the definition of target population, household definition and its implementation in practice, different concepts and classifications of activity, especially labour market activities, child care questions, housing costs and mortgages. A major part of the training time was used on going through the videoed BLAISE questionnaire with the aid of three lecturers. The panel design and the future modules were described. The last part of training consisted of data transferring, data protection and other practicalities.

 

During the whole fieldwork period, an information desk is open for the interviewers. They can ask for support from the IDS-SILC team. The interviewers, who are distributed all over the country, also have organised district meetings with each other to discuss professional matters.

 

Fieldwork management and data reception

 

The interviewers collect the data and transmit them to the central unit. At Statistics Finland, there is a separate organisation, the interviewers' central unit, to control, monitor and supervise the fieldwork. The central unit transmits the fieldwork tools to the field, organises interviewer training at the beginning of the project, follows the fieldwork progress, receives the output from the field, checks that all the sampled units are adequately processed and transmits the data to the IDS-SILC team. It also collects feedback from the interviewers with a standardised questionnaire. All data contents are processed by the IDS-SILC team using either the BLAISE system or SAS. The IDS and SILC data processing is mostly integrated.

 

 

Quality control: selection of the informant

 

Problems arising from the use of proxy respondents concentrate on the subjective questions: which household member answers the questions involving subjective assessments depends on the interviewer. Because proxy respondents are not allowed for the health questions, these questions suffer from item non-response. In 2011, the item non-response percentage was 3.0 (variable PH010_F = -1, weighted by PB060). This percentage reflects the share of selected respondents who could not be reached at all in the survey due to refusals or other reasons, but whose household respondents were contacted.
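The weighted percentage quoted above can be illustrated with a minimal sketch. It assumes the rate is simply the weighted share of records whose flag PH010_F equals -1, with PB060 as the weight; the function name and the toy data are hypothetical and are not taken from the Statistics Finland production code.

```python
def weighted_share(flags, weights, target=-1):
    """Weighted percentage of records whose flag equals `target`."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    hit = sum(w for f, w in zip(flags, weights) if f == target)
    return 100.0 * hit / total


# Hypothetical toy data: PH010_F flags and PB060 weights for four selected respondents.
ph010_f = [1, 1, -1, 1]
pb060 = [100.0, 250.0, 50.0, 100.0]
rate = weighted_share(ph010_f, pb060)  # weighted item non-response, in per cent
```

With real data, the two lists would hold one entry per selected respondent in the personal data file.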

 

The high percentage of proxy interviews guarantees a higher quality of the household information. Most of the proxy respondents are parents or spouses. Proxies are mostly (97 % in 2011) 1st or 2nd persons responsible for the accommodation, which also indicates their competence regarding knowledge of the household affairs.

 

Quality control: General fieldwork problems

 

HB100, PB120 - Household and personal interview duration - In Finland's selected respondent model, the combined duration of the household and personal interviews is recorded in variable HB100. Variable PB120 is empty.

 

In less than 2 per cent of the households, the duration of the interview exceeded the 60-minute maximum stipulated in the framework regulation.

 

Quality control: Variable-specific problems

 

HS130 None

PL060, PL100 Distributions check

PE010–PE040, missing values are checked and imputed

RB031 No action

 

 

 

 

 
5.3.3. Non response error
 

Non-response errors are errors due to an unsuccessful attempt to obtain the desired information from an eligible unit. Two main types of non-response errors are considered:

1) Unit non-response, which refers to the absence of information on whole units (households and/or persons) selected into the sample. According to the Commission Regulation 28/2004:

  • The household non-response rate (NRh) is computed as follows:

NRh=(1-(Ra * Rh)) * 100

Where Ra is the address contact rate defined as:

Ra= Number of addresses successfully contacted/Number of valid addresses selected

and Rh is the proportion of complete household interviews accepted for the database

Rh=Number of household interviews completed and accepted for database/Number of eligible households at contacted addresses

  • The individual non-response rate (NRp) is computed as follows:

NRp=(1-(Rp)) * 100

Where Rp is the proportion of complete personal interviews within the households accepted for the database

Rp= Number of personal interviews completed/Number of eligible individuals in the households whose interviews were completed and accepted for the database

  • The overall individual non-response rate (*NRp) is computed as follows:

*NRp=(1-(Ra * Rh * Rp)) * 100

For those Member States where a sample of persons rather than a sample of households (addresses) was selected, the individual non-response rates will be calculated for ‘the selected respondent’, for all individuals aged 16 years or older and for the non-selected respondent.

2) Item non-response which refers to the situation where a sample unit has been successfully enumerated, but not all the required information has been obtained.
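The unit non-response formulas above can be written out as a short sketch. The function names are illustrative only, and the rates Ra, Rh and Rp are assumed to be passed as proportions between 0 and 1 rather than as percentages.

```python
def household_non_response(ra, rh):
    """NRh = (1 - Ra * Rh) * 100, with Ra and Rh given as proportions."""
    return (1 - ra * rh) * 100


def individual_non_response(rp):
    """NRp = (1 - Rp) * 100, with Rp given as a proportion."""
    return (1 - rp) * 100


def overall_individual_non_response(ra, rh, rp):
    """*NRp = (1 - Ra * Rh * Rp) * 100, with all rates given as proportions."""
    return (1 - ra * rh * rp) * 100


# Example: an address contact rate of 100 % and a household interview rate of 81.9 %.
nrh_example = round(household_non_response(1.0, 0.819), 1)
```

Because the three rates are multiplied, a shortfall at any stage (contact, household interview, personal interview) raises the overall individual non-response rate.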

 
5.3.3.1. Unit non-response - rate
 

Cross sectional data

Indicator*                                        A        B
Address contact rate (Ra)                         100.0    100.0
Complete household interviews (Rh)                81.9     72.0
Complete personal interviews (Rp)                 ..       ..
Household non-response rate (NRh)                 18.1     28.0
Individual non-response rate (NRp)                ..       ..
Overall individual non-response rate (*NRp)       18.1     28.0

* All the formulas are defined in the Commission Regulation 28/2004, Annex II

A = Total sample; B = New sub-sample

..   = Not sample person.

 
5.3.3.2. Item non-response - rate
 

The computation of item non-response is essential to fulfil the precision requirements concerning publication as stated in the Commission Regulation No 1982/2003. Item non-response rate is provided for the main income variables both at household and personal level.

 
5.3.3.2.1. Item non-response rate by indicator
 

 

 Cross-sectional data

 

Columns: (a) % of households having received an amount; (b) % of households with missing values (before imputation); (c) % of households with partial information (before imputation)

Variable   (a)      (b)      (c)      Description
HY010      100.0    7.3      7.3      Total hh gross income
HY020      99.9     6.8      6.8      Total disposable hh income
HY022      96.9     96.9     96.9     Total disposable hh income before social transfers other than old-age and survivors benefits
HY023      90.7     90.7     90.7     Total disposable hh income before all social transfers

Variable   (a)      (b)      (c)      Description
HY030      83.0     83.0     83.0     Imputed rent
HY040      11.1     0.0      0.0      Income from rental of property or land
HY050      31.2     0.0      0.0      Family/ Children related allowances
HY060      6.9      0.0      0.0      Social exclusion payments not elsewhere classified
HY070      14.7     0.0      0.0      Housing allowances
HY080      7.6      0.0      0.0      Regular inter-hh cash transfers received
HY090      79.9     21.2     21.2     Interest, dividends, profit from capital investments in incorporated businesses
HY100      39.2     0.0      0.0      Interest repayments on mortgage
HY110      2.5      0.0      0.0      Income received by people aged under 16

Variable   (a)      (b)      (c)      Description
HY120      58.3     0.0      0.0      Regular taxes on wealth
HY130      17.2     0.0      0.0      Regular inter-hh cash transfers paid
HY140      98.2     0.0      0.0      Tax on income and social contributions

Variable   (a)      (b)      (c)      Description
PY010      63.1     0.0      0.0      Cash or near-cash employee income
PY020      15.7     0.0      0.0      Other non-cash employee income
PY021      2.1      0.0      0.0      Income from private use of company car
PY030      62.6     62.6     62.6     Employers social insurance contributions
PY050      18.1     0.0      0.0      Cash profits or losses from self-employment
PY070      0.0      0.0      0.0      Value of goods produced for own consumption
PY080      5.4      0.0      0.0      Pensions from individual private plans
PY090      13.7     0.0      0.0      Unemployment benefits
PY100      20.8     0.0      0.0      Old-age benefits
PY110      3.0      0.0      0.0      Survivors benefits
PY120      5.6      0.0      0.0      Sickness benefits
PY130      7.6      0.0      0.0      Disability benefits
PY140      9.0      0.0      0.0      Education-related allowances

 

 

 
5.3.4. Processing error
 
Data entry and coding Editing controls
  • Fieldwork management and data reception
  • Missing identification information of sample persons
  • Incomplete interviews
  • Missing information, outliers, illogical and inconsistent information
  • Inconsistency between self-reported months of activity and registered income sources in particular
  • Database construction
  • Processing register data
  • Comparison of aggregates

Fieldwork management and data reception:

 

The Field Unit in the Social Survey Unit of Statistics Finland (unit was valid until 2013) manages, supervises and monitors the interview fieldwork, provides interview tools to the field and organises interviewer training. This unit employs 160 trained interviewers operating in all parts of the country. Four field supervisors administer this network. The interviewers’ average period of service is 11 years. Communication between the Field Unit and the interviewers is arranged through a job monitoring system. The system enables continual follow-up of the fieldwork by each interviewer. The accruing paradata include contacts, completed interviews, non-response, workload and costs. Interviewers collect the data and transmit them to the Field Unit, which checks the quality of the interview data (e.g. that sample units are adequately processed) and transfers them to the SILC team. Occasional checks are made by re-interviewing. Feedback from the central unit to the interviewers is regular. A standardised questionnaire on interviewers’ evaluations of the operation is in regular use.

 

Missing identification information of sample persons:

 

Missing identification information of sample persons has been checked and edited using register information from the Population Information System. The process usually runs from mid-February to the end of June, by which time the interview data have been received.

 

Incomplete interviews:

 

After the fieldwork period, the SILC team reviews the incomplete interviews and decides whether to accept them. Some are rejected. Since the register income data are almost completely available, the acceptance decision is based on whether the main activity and housing information is sufficiently complete. In the 2011 operation, 18 incomplete interviews were excluded from the received sample and treated as non-responses.

 

Missing information, outliers, illogical and inconsistent information: 

 

The BLAISE programming of the CATI and CAPI questionnaires is the primary control on data quality. In addition, interview information and interviewers’ remarks on the questionnaires have been checked and edited against other interview and register data for the unit. Errors in objective information have been corrected; subjective and opinion questions are left unedited. Errors include e.g. missing answers, refusals and don't knows, clear mistakes, outliers, and illogical and inconsistent answers. Special emphasis has been put on the questions on activity months, occupation, NACE, housing costs and childcare. They are checked against other information (including register information). Occupation and NACE are processed through automatic coding. Cases that remain open after automatic coding are processed manually.

 

Inconsistency between self-reported months of main activity and registered income sources in particular:

 

The months of main activity have traditionally been heavily edited to comply with register data, especially with income data. As a result of comparisons between interviewed and registered data, some of the activities reported by respondents are rejected and replaced with corrected ones that are coherent with the information on salaries, entrepreneurial income, pensions, unemployment and other benefits. This editing started with the 2009 SILC operation. In the 2004-2008 operations, months of main activity (PL070, PL072, PL080, PL085, PL087, PL090) were collected and treated as subjective responses given by respondents, as defined in EU-SILC document 065.

 

Database construction:

 

Simultaneously with the checking process, a database is opened and the construction of variables from interviews and registers begins in SAS. Interviewed variables are transferred from the questionnaires to the database. Variables that need to be constructed, i.e. those combining interview and register information and complex questionnaire items, are added to the database after all the checks and edits have been carried out. Imputations are done.

 

Processing register data:

 

Register data, which are ordered from the register authorities through a special procedure, arrive in electronic form at the IT and Statistical Methods units. In 2011, eleven registers were used. The incoming data are checked both technically and for content. Any defects are reported to the administrative authority in charge of the data source. The registers cover all units - population, dwelling units, income receivers, etc. The data are linked to the sample persons and transmitted into the database. The data are compared with available external data, i.e. those of the tax authority, pensions authority and other statistics.

 

Comparison of aggregates:

 

Routines have been developed to compare the results on variable level with external sources such as the Labour Force Survey, National Accounts, wage statistics and statistics on different social transfers and taxation produced by the National Pensions Institute, National Board of Taxes and National Research and Development Centre for Welfare and Health. Standard comparisons are routinely made each year. These comparisons also have an effect on error detection.

 
5.3.4.1. Imputation - rate

-

5.3.4.2. Common units - proportion

-

5.3.5. Model assumption error

-

5.3.6. Data revision

-

5.3.6.1. Data revision - policy

-

5.3.6.2. Data revision - practice

-

5.3.6.3. Data revision - average size

-

5.3.7. Seasonal adjustment

-


6. Timeliness and punctualityTop
6.1. Timeliness

-

6.1.1. Time lag - first result

-

6.1.2. Time lag - final result

-

6.2. Punctuality

-

6.2.1. Punctuality - delivery and publication

-


7. Accessibility and clarityTop
7.1. Dissemination format - News release

-

7.2. Dissemination format - Publications

-

7.3. Dissemination format - online database

-

7.3.1. Data tables - consultations

-

7.4. Dissemination format - microdata access

-

7.5. Documentation on methodology

-

7.5.1. Metadata completeness - rate

-

7.5.2. Metadata - consultations

-

7.6. Quality management - documentation

-

7.7. Dissemination format - other

-


8. ComparabilityTop
8.1. Comparability - geographical

-

8.1.1. Asymmetry for mirror flow statistics - coefficient

-

8.1.2. Reference population
 

Reference population

Private household definition Household membership
Members of the private households permanently resident (the census definition) in Finland on 31 December 2010. For migrants in particular, permanent residency means that the persons have resided or intend to reside in Finland for at least 12 months and do not have a permanent residence abroad. Persons living in institutions, in collective households or in residential homes have been excluded.

The private household includes a person residing alone, or all the persons, related or not, who reside and have their meals together or otherwise use their income together. The definition is equivalent to the obligatory EU-SILC definition based on "shares in household expenses", but the interview uses the wording "use income together".

See the private household definition.

For persons who were temporarily absent from the household’s main dwelling, no specific duration of absence (6 months) was set in the interview, provided that the criteria of household formation and membership (shares in household expenses) were fulfilled. Such persons have close family ties to the household and do not form a household of their own. Therefore, the following persons are also counted as household members:

- Persons conducting military service or conscript service
- Persons residing and working in another locality or abroad if they are involved in the acquisition and use of household income
- Persons residing and studying in another locality if they use income received mostly from their parents
- Persons temporarily in institutions, on holiday or travelling.

 

The following persons form a household of their own:
- Subtenants
- Domestic staff
- Students living on their own if they live mostly on their own income or on a student loan
- Students residing in dormitories, unless they are married or officially cohabiting.

 
8.1.3. Reference Period
 
Period for taxes on income and social insurance contributions: 2010
Income reference periods used: 2010
Reference period for taxes on wealth: 2010
Lag between the income ref period and current variables: 4 months
 
8.1.4. Statistical concepts and definitions
 
Total hh gross income
(HY010)
Total disposable hh income
(HY020)
Total disposable hh income before social transfers other than old-age and survivors' benefits
(HY022)
Total disposable hh income before all social transfers
(HY023)
F F F F

 

Imputed rent
(HY030)
Income from rental of property or land
(HY040)
Family/ Children related allowances
(HY050)
Social exclusion payments not elsewhere classified
(HY060)
Housing allowances
(HY070)
Regular inter-hh cash transfers received
(HY080)
Interest, dividends, profit from capital investments in incorporated businesses
(HY090)
Interest paid on mortgage
(HY100)
Income received by people aged under 16
(HY110)
Regular taxes on wealth
(HY120)
Regular inter-hh transfers paid
(HY130)

F

F

F

F

F

F

F

F

F

F

 

Cash or near-cash employee income
(PY010)
Other non-cash employee income
(PY020)
Income from private use of company car
(PY021)
Employers social insurance contributions
(PY030)
Cash profits or losses from self-employment
(PY050)
Value of goods produced for own consumption
(PY070)
Unemployment benefits
(PY090)
Old-age benefits
(PY100)
Survivors benefits
(PY110)
Sickness benefits
(PY120)
Disability benefits
(PY130)
Education-related allowances
(PY140)
Gross monthly earnings for employees
(PY200)

F

F

F

L

F

NC

F

F

F

F

F

F

NC

 

The source or procedure used for the collection of income variables The form in which income variables at component level have been obtained The method used for obtaining target variables in the required form
 Gross  Gross Register (99.0 % of all gross income and 97.8 % of all paid transfers, weighted income) / Interview (1.0 % of all gross income and 2.2 % of all paid transfers, weighted income) 
 
8.2. Comparability - over time

The income data are comparable over the survey years 2004–2011.

8.2.1. Length of comparable time series

-

8.3. Comparability - domain

-


9. CoherenceTop
9.1. Coherence - cross domain

Cross-sectional data 

Total gross income of private household in the income reference year 2010. Difference of the income amounts (%) from the cross-sectional data to Total Statistics of Income Distribution (TSID)
TSID    
Income components Difference  
  %  
2.1. Gross employee income  (py010g,py021g) -0.6 TSID: Employee income received by persons aged under 16 is included. All items of  gross non-cash employee income are included.
2.2. Self-employment income -0.5 TSID: Employee income received by persons aged under 16 is included.
2.3. Property income (hy040g, hy090g, py080g)  -23.4 TSID: See above. Profits from sales are included; interest income taxed at source is not included.
2.4. Current transfers received 3.3 TSID: Inter-household transfers received are not included.
PY090G. Unemployment benefits 1.2  
PY100G. Old-age benefits 4.2 TSID does not include income received from PY130G for the persons who are on old-age pensions after the standard retirement age. 
     
PY110G. Survivors' benefits -11.4  
PY120G. Sickness benefits 10.1  
PY130G. Disability benefits -6.2 See PY100G
PY140G.Education-related allowances 37.7 TSID does not include all SILC items.
HY050G. Family/children -related allowances 10.8 TSID does not include all SILC items.
HY060G. Social exclusion payments not elsewhere classified -14.6  
HY070G. Housing allowances -5.7  
2.5. Other income received 100 Income (HY110G) is included in other TSID income components.
2.6. Current transfers paid 0.9 TSID: Inter-household transfers paid are not included. Tax paid on profits from sales is included.
Total gross household income including PY080g (excluding imputed rent and mortgage interests, negative values have been changed for 0-values).  -1.1 In addition to the EU-SILC estimation, the difference is mostly due to non-cash employee income other than a company car, profits from sales included in TSID, and household transfers received not included in TSID.
Total disposable household income including PY080g (excluding imputed rent and mortgage interests, negative values have been changed for 0-values).  -1.7 In addition to the EU-SILC estimation, the difference is mostly due to non-cash employee income other than a company car, profits from sales included in TSID, and inter-household transfers not included in TSID.
Components not in the EU-SILC definition, they are included in the more complete TSID total disposable household income definition:
- Gross employee income (py010g, py020g) -0.1 TSID: Employee income received by persons aged under 16 is included.
.. Information is not available; . Information is not logical    
     
Total gross income of private household in the income reference year 2010. Difference of the income amounts (%) from the cross-sectional data to  European System of Integrated Social Protection Statistics (ESSPROS)
ESSPROS    
Income components Difference  
  %  
PY090G. Unemployment benefits 2.1  
PY100G. Old-age benefits 1.3 ESSPROS does not include income received from PY130G for the persons after the standard retirement age.
PY110G. Survivors' benefits -22.6  
PY120G. Sickness benefits -73.9 ESSPROS includes sick pay which has been counted in PY010G employee income.
PY130G. Disability benefits -18.8 See PY100G.
PY140G. Education-related allowances .  
HY050G. Family/children-related allowances 0 ESSPROS includes the income maintenance benefits paid in the event of child birth and the parental leave benefits paid as salary, which are in PY010G employee income.
HY060G. Social exclusion payments not elsewhere classified -20.9 ESSPROS includes wage guarantee, which is in PY010G employee income.
HY070G. Housing allowances 15.3 ESSPROS does not include students' housing supplements. As of 2008, ESSPROS contains pensioners’ housing allowances, whereas earlier they were items of PY100G and PY130G.
Total, excl. education-related allowances -7.7  
Same definitions in accordance with ESSPROS:    
HY070G. Housing allowances -13.3  
PY100G,PY130G -2.8  
Population coverage is wider in ESSPROS than in EU-SILC. .. Information is not available; . Information is not logical    
9.1.1. Coherence - sub annual and annual statistics

-

9.1.2. Coherence - National Accounts

Cross-sectional data

Total gross income of private household in the income reference year 2010. Difference of the income amounts (%) from the cross-sectional data to National Accounts  (NA)

NA    
Income components Difference  
  %  
2.1. Gross employee income (py010g, py021g) -3.0 NA includes non-taxable income items, e.g. an estimate of non-taxable and non-monetary income provided to an employee by an employer.
2.2. Self-employment income -31.7 NA’s concept of operating surplus and mixed income (B4N-B43N) differs from the EU-SILC concept of entrepreneurial income. For example, NA’s mixed income includes rental income from rental activity (other than land) in unincorporated enterprises and value added from self construction. Operating surplus from owner-occupied dwellings as imputed rent has been excluded from the figure shown here.
2.3. Property income (hy040g, hy090g, py080g) -31.1 NA (D4R) includes, for example, the estimated value of premiums and claims from life and pension insurance attributed to policyholders, and the property income of mutual funds (interest and dividends) that has been reinvested on shareholders’ behalf.
2.4. Current transfers received -10.5 NA includes more compensation items from individual personal insurance schemes; NA does not include transfers received from other private households.
     
2.5. Other income received . NA: Income (HY110G) is included in other income components.
2.6. Current transfers paid -17 NA includes optional contributions, e.g. contributions to indemnity insurance, church tax, membership fees of trade unions, other membership fees and employees’ optional contributions to social insurance. It does not include transfers received from other private households. In NA, income tax refers to the time when the taxes were actually paid, whereas in SILC the tax reference period equals the income reference period (i.e. when the income was received).
     
     
     
Total gross household income (excluding imputed rent and mortgage interests, negative values have been changed for 0-values). -8.3
Total disposable household income (excluding imputed rent and mortgage interests, negative values have been changed for 0-values).  -5.1  
Components not in the EU-SILC definition. They are included in -2.5  
the more complete NA total disposable household income definition:    
- Gross employee income  (py010g,py020g)    
- Imputed rent 85.9 NA: Net operating surplus from owner-occupied dwellings. NA counts FISIM and depreciation of owner-occupied dwellings as expenses for the net value.
- Interest payments of housing loans for owner occupiers 6.4  
     
9.2. Coherence - internal

-


10. Cost and BurdenTop

-


11. ConfidentialityTop
11.1. Confidentiality - policy

-

11.2. Confidentiality - data treatment

-


12. Statistical processingTop
12.1. Source data

Description of the sampling frame 

 

The Population Information System maintained by the Population Register Centre of Finland is the source data used for the sampling frame (see below). The Population Information System is a compilation of local registers kept up by population register districts. It covers basic data on all Finnish citizens and aliens permanently resident in Finland. Besides persons living in private households, it also includes persons living in institutions, persons living temporarily abroad and homeless persons. Because these latter groups do not belong to the target population, they are excluded from the sampling frame.

 

Finland uses a unified identification code system across register sources, which means that every person residing in Finland has a unique identification code and each dwelling has a domicile code. A person is registered in the municipality where he/she has a permanent place of residence. The domicile code is the link between a person and his/her permanent dwelling. Persons without an address are registered in municipal registers as homeless persons. A person with a permanent address may also have a registered temporary address.

 

A register data copy of the Population Information System is used as the sampling frame for the selection of the master sample, i.e. the selected sample persons. The data refer to the end of the year preceding the survey year. Persons aged under 16, persons placed in institutions and homeless persons are excluded from the frame. The frame is ordered by the domicile code indicating the location of the person's dwelling.

 

The frame is also used for the construction of the household-dwelling units for the master sample, and for the sample stratification. After various checks and combinations (e.g. excluding collective households), household-dwelling units are compiled by adding to the master sample all the persons who share the same domicile code (occupy the same dwelling) as the selected person (target person). Before the interview fieldwork begins, the information for the second, third and fourth panels of the EU-SILC, and any changes to the first panel that have occurred after the sample selection, are updated on the basis of the register data.
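The compilation of household-dwelling units by domicile code can be pictured with a small sketch. It is an illustration under simplifying assumptions (the frame is a list of person and domicile code pairs, and the checks and exclusions mentioned above have already been applied); the function and variable names are hypothetical.

```python
from collections import defaultdict


def build_dwelling_units(frame, selected_ids):
    """Return {selected person: all persons sharing that person's domicile code}.

    frame is an iterable of (person_id, domicile_code) pairs.
    """
    by_domicile = defaultdict(list)
    domicile_of = {}
    for person_id, domicile_code in frame:
        by_domicile[domicile_code].append(person_id)
        domicile_of[person_id] = domicile_code
    return {pid: sorted(by_domicile[domicile_of[pid]]) for pid in selected_ids}


# Hypothetical frame: p1 and p2 share dwelling d1, p3 lives alone in d2.
toy_frame = [("p1", "d1"), ("p2", "d1"), ("p3", "d2")]
units = build_dwelling_units(toy_frame, ["p1", "p3"])
```

The resulting unit is a register-based dwelling unit, not yet a household; the household composition itself is confirmed in the interview.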

 

The master sample of household-dwelling units is used for several sampling purposes, among them the Finnish EU-SILC survey and the Income Distribution Statistics (IDS).

 

Information about the frame: reference period, updating actions, quality review actions

 

In general, the Population Information System of the Population Register Centre of Finland is exhaustive and up-to-date as regards persons. Information on population changes (births, deaths, migration, immigration and emigration, marriages, divorces, adoptions and changes of name) is updated regularly. The Population Register Centre updates the official population figures for all municipalities in Finland between the 5th and 8th day of every month.

 

The system is maintained through notifications of changes made by the population district authorities. The inhabitants themselves are responsible only for notifying changes of residence. Those who move or immigrate are expected to report the new residence address to the local register office within one week of the move, specifying all the family or household members involved in the move. Those emigrating should supply a notice of the change of address in the country of entry. According to an agreement between the Nordic countries, the population register authorities of the country of entry inform those of the country of exit. In the years when municipal elections are arranged (every 4th year), the population is corrected by around 1,000 persons, when voting notifications are returned for emigrants whose emigration had gone unrecorded.

 

A reliability survey on the Population Information System is conducted yearly by means of a sample interview (CATI) survey of approximately 10,000 persons.  From the EU-SILC point of view, reliability of its address information has special relevance. In the quality surveys, the final proportion of the correct addresses in the total sample has always been high, 98-99 per cent.

 

The EU-SILC collects the variables PB130, PB140, PB150, PB190, PB210, PB220A and PB220B directly from the Population Information System. None of this information, however, has been checked in the reliability survey.

 

The Population Information System has no under-coverage in any population group. Asylum seekers and refugees are not included in the resident population until their residence permit has been processed. A small over-coverage exists because the sample has to be drawn before the reference time point of the sample households (31 Dec.). The master sample data are subsequently updated with final register information available after its selection (including tax information, which is linked to the master sample in order to create the strata, for example). Various checks are conducted. At this point, those who have died, moved permanently abroad or been placed in an institution between the sample selection time point and the end of the year are excluded from the master sample. This processing corrects the frame imperfection (not describing the reference time point) in the sample.

 

A household-dwelling unit may consist of several households, or the dwelling occupants may not all share in household expenses. Since the housekeeping unit is not a register concept, the household composition is checked in the interview (referring to 31 Dec.). Household members' presence or absence, whether temporary or permanent, is checked in the interview as well. Persons who have recently changed their place of residence and/or household, new-borns, and persons who have recently moved to institutions or died are the usual sources of incorrect register-based pre-entries in the EU-SILC questionnaire.

12.1.1. Sampling design and procedure
 

Type of sampling design

Two-phase stratified sampling design.

Stratification and sub stratification criteria

Strata (DB050) contain socio-economic groups grouped by income class (13 strata in the new sample). The register information used for strata construction covers all members of the selected sample person's household-dwelling unit and is the information available at the time of sample selection (updated later with the final population register data). Strata are created only for those who are not in the over-coverage. The stratification normally takes the highest-earning person as the person who determines the stratum; the exception is the entrepreneur category, where the categorising entrepreneur need not be the highest earner. The income class division allocates more of the sample to high earners.

 

The strata of the new sample/rotational group (DB075=2)

       
Socio-economic categorisation of the household-dwelling unit   Income class   Stratum (DB050)
Wage-earners                                                   Lowest         1
                                                               2nd lowest     2
                                                               3rd lowest     3
                                                               Highest        4
Entrepreneurs                                                  Lower          5
                                                               Higher         6
Farmers                                                        Lower          7
                                                               Higher         8
Pensioners                                                     Lower          9
                                                               Higher         10
Others                                                         Lower          11
                                                               Higher         12
No tax information                                             -              13
       

 

Strata of the older rotational groups in the data refer to equivalent groups of the new sample as follows: 2. replication (DB075=1), DB050-12; 3. replication (DB075=4), DB050-24; 4. replication (DB075=3), DB050-36.  
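A minimal sketch of the stratum correspondence follows. It assumes that the notation "DB050-12" means that 12 is subtracted from the recorded DB050 value to obtain the equivalent stratum of the new sample (and likewise 24 and 36); this reading, the function name and the dictionary are assumptions made for illustration.

```python
# Offset to subtract from DB050, keyed by rotational group DB075 (assumed reading).
OFFSET_BY_DB075 = {
    2: 0,   # new sample
    1: 12,  # 2nd replication
    4: 24,  # 3rd replication
    3: 36,  # 4th replication
}


def equivalent_stratum(db050, db075):
    """Map a recorded DB050 value to the equivalent new-sample stratum."""
    if db075 not in OFFSET_BY_DB075:
        raise ValueError(f"unknown rotational group DB075={db075}")
    return db050 - OFFSET_BY_DB075[db075]
```

For example, under this reading a household in the 2nd replication with DB050 = 14 would correspond to stratum 2 (wage-earners, 2nd lowest income class).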

Sample selection schemes

The new sample was selected with a two-phase stratified sampling design. In the first phase, a master sample of 50 000 persons was selected by systematic sampling from the population register data. In the second phase, a sample of 5 000 persons was selected from the stratified master sample by simple random sampling without replacement within each stratum, using non-proportional allocation. The selection scheme of the new sample is the same as the one used in the first year for the old rotational groups.
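
The two phases described above can be illustrated with a minimal Python sketch. It is not the production procedure of Statistics Finland: the toy register, the stratum rule `stratum_of` and the `allocation` figures are hypothetical, and only the general technique (systematic sampling followed by stratified simple random sampling without replacement) follows the text.

```python
import random

def systematic_sample(frame, n, rng):
    """Phase 1: systematic sample of n units from an ordered frame."""
    step = len(frame) / n                  # sampling interval (may be fractional)
    start = rng.random() * step            # random start within the first interval
    return [frame[int(start + i * step)] for i in range(n)]

def stratified_srswor(master, stratum_of, allocation, rng):
    """Phase 2: simple random sampling without replacement within each stratum."""
    by_stratum = {}
    for unit in master:
        by_stratum.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for stratum, n_h in allocation.items():
        # random.sample draws without replacement
        sample.extend(rng.sample(by_stratum[stratum], n_h))
    return sample

def stratum_of(person):
    """Hypothetical stratum rule; real strata come from register data."""
    return person % 3 + 1

rng = random.Random(2011)
frame = list(range(100_000))               # toy person register, not real data
master = systematic_sample(frame, 5_000, rng)
allocation = {1: 100, 2: 150, 3: 250}      # hypothetical non-proportional allocation
second_phase = stratified_srswor(master, stratum_of, allocation, rng)
```

Because the allocation is non-proportional, the second-phase sampling fraction differs between strata, which is why stratum-specific inclusion probabilities are needed when design weights are calculated.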

 

The old rotational groups (3 groups) consisted of the households that responded in the previous survey year (DB135=1, including the initial sample person); these formed the gross sample to be interviewed. Households in which the initial sample person was no longer in scope (had moved to a collective household or institution in the country, had moved abroad or had died) were omitted from the final net sample of the survey as over-coverage. The final household composition was determined during the interview. In the survey year, all old rotational groups were supplemented with an extra sample of persons aged 16 (selected from the stratified master sample by simple random sampling without replacement within each stratum, using non-proportional allocation; see section 12.1.3).

 

The sample does not contain substitutions.

Sample distribution over time

The reference population is defined as the population registered as permanently resident in Finland on 31 December 2010. Household composition is also dated to 31 December 2010. The income reference period is constant for all households and persons: the calendar year 2010.

 

In the SILC 2011 operation, the fieldwork period stretched over five months: it started at the beginning of January 2011 and ended in May 2011. The cross-sectional sample of the EU-SILC consists of four rotational groups. The fieldwork for the old rotational groups started at the beginning of January and was mostly completed by the end of March; only a few households were interviewed after March. The new sample households were interviewed from February to May.

 

                      New sample,     2nd wave,       3rd wave,       4th wave,       Total
                      1st wave,       DB075=1         DB075=4         DB075=3
                      DB075=2
Month of interview        N      %        N      %        N      %        N      %        N      %
December, 2010            .      .        .      .        .      .        .      .        .      .
January, 2011             .      .    1 861   62.0      287   20.2      222   16.1    2 370   25.3
February                602   17.0    1 068   35.6      766   53.8      683   49.5    3 119   33.4
March                 1 813   51.1       58    1.9      358   25.2      466   33.8    2 695   28.8
April                 1 072   30.2       13    0.4       12    0.8        8    0.6    1 105   11.8
May, 2011                58    1.6        3    0.1        .      .        1    0.1       62    0.7
Total                 3 545  100.0    3 003  100.0    1 423  100.0    1 380  100.0    9 351  100.0

 

 

 
12.1.2. Sampling unit

The sampling unit is a person. Sample persons refer to persons selected in the first phase of the two-phase stratified sampling (selected sample persons). 

12.1.3. Sampling rate and sampling size
 

Concerning the SILC instrument, three different sample size definitions can be applied:

- the actual sample size which is the number of sampling units selected in the sample

- the achieved sample size which is the number of observed sampling units (household or individual) with an accepted interview

- the effective sample size which is defined as the achieved sample size divided by the design effect with regard to the at-risk-of-poverty rate indicator

 

Cross-sectional data

 

The Finnish data contain a larger sample than the minimum required in EU-SILC. Because Finland uses a sample of persons, the minimum effective sample size is 5 063. Taking the design effect, deff2 = 1.25, into account, the minimum sample to be achieved is 6 329 for the cross-sectional data. The minimum sample to be selected is 8 328; this figure allows for non-response (the overall response rate R is approximately 0.76 in the Finnish survey).

In the survey year 2011, the actual sample size was 11 418 in the cross-sectional component. After allowing for non-response (25 % for the first panel and 8.5 % for the following panels), the expected number of respondents was 9 783. The realised number of accepted respondents was 9 351, so the minimum sample to be achieved (6 329) was reached.
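
The relation between the three sample size definitions can be sketched as follows. The sketch assumes that the minimum achieved size is the minimum effective size multiplied by the design effect, that the minimum selected size is the achieved size divided by the response rate, and that both are rounded up; the rounding rule and the function names are assumptions, not taken from the regulation.

```python
import math

def minimum_sample_sizes(effective_min, design_effect, response_rate):
    """Return (minimum achieved size, minimum selected size)."""
    # Inflate the effective size by the design effect (rounding up is an assumption).
    achieved_min = math.ceil(effective_min * design_effect)
    # Inflate further to allow for non-response.
    selected_min = math.ceil(achieved_min / response_rate)
    return achieved_min, selected_min

def effective_sample_size(achieved, design_effect):
    """Effective size = achieved size divided by the design effect."""
    return achieved / design_effect

# Figures quoted in this section: effective minimum 5 063, design effect 1.25, R = 0.76.
achieved_min, selected_min = minimum_sample_sizes(5063, 1.25, 0.76)

# Effective size implied by the 9 351 accepted respondents of the 2011 survey.
effective_2011 = effective_sample_size(9351, 1.25)
```

Comparing `effective_2011` with the minimum effective sample size is an equivalent way of checking that the requirement was met.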

 

Achieved sample size according to rotational groups (DB075) in the 2011 survey year

                 
Columns: (A) accepted households; (B) number of persons aged 16 or older who are members of the accepted households (DB135=1) and for whom the interview was completed (RB250=11 to 13); (C) number of selected respondents who are members of the accepted households (DB135=1) and who completed a personal interview (RB250=11 to 13).

                                           (A)       (B)       (C)
Cross-sectional, total                   9 351    18 502     9 351
New sample (DB075=2)                     3 545     7 035     3 545
Old rotational groups (DB075=1,4,3)      5 806    11 467     5 806
  DB075=1, wave 2                        3 003     5 904     3 003
  DB075=4, wave 3                        1 423     2 828     1 423
  DB075=3, wave 4                        1 380     2 735     1 380
                         
                         
Information about the new sample (DB075=2) by stratum in the 2011 survey year

Socio-economic      Income           Stratum      Master   2nd phase   2nd phase  2nd phase,
categorisation of   class                         sample      sample excl. over-    accepted
the household-                               (1st phase)                coverage respondents
dwelling unit
Wage-earners        Lowest                 1      10 213         820         813         547
                    2nd lowest             2       8 909         650         646         470
                    3rd lowest             3       7 749         567         562         411
                    Highest                4       3 626         500         497         380
Entrepreneurs       Lower                  5       1 895         400         394         271
                    Higher                 6         951         300         297         205
Farmers             Lower                  7         859         199         198         163
                    Higher                 8         687         184         183         154
Pensioners          Lower                  9       6 504         500         469         353
                    Higher                10       4 991         400         387         304
Others              Lower                 11       2 175         299         298         184
                    Higher                12         369         133         132          90
No tax information  -                     13         205          48          46          13
All                                               49 133       5 000       4 922       3 545
                         
                         
Information about the old rotational groups by initial stratum in the 2011 survey year

Stratum refers to the stratum of the initial sampling (1st wave).

Socio-economic      Income       Stratum  2nd phase sample       2nd phase sample       Accepted
categorisation of   class                 incl. over-coverage    excl. over-coverage    respondents
the household-
dwelling unit
DB075                                         1      4      3        1      4      3        1      4      3
Wave                                          2      3      4        2      3      4        2      3      4
Wage-earners        Lowest             1    538    247    230      535    247    230      448    219    213
                    2nd lowest         2    459    210    214      457    207    214      402    191    199
                    3rd lowest         3    422    194    210      421    192    209      368    174    196
                    Highest            4    357    156    157      357    156    157      319    147    146
Entrepreneurs       Lower              5    276    135    112      276    135    112      235    123    105
                    Higher             6    220     80     88      219     80     87      196     74     81
Farmers             Lower              7    156     73     62      150     72     61      127     71     59
                    Higher             8    158     70     69      158     69     69      141     63     66
Pensioners          Lower              9    330    145    119      321    137    114      268    128    106
                    Higher            10    310    135    122      304    133    119      273    125    109
Others              Lower             11    165     76     66      162     75     66      134     65     60
                    Higher            12     87     44     37       85     44     37       77     39     36
No tax information  -                 13     20      4      5       20      4      5       15      4      4
All                                       3 498  1 569  1 491    3 465  1 551  1 480    3 003  1 423  1 380
                         

 

 

 
12.2. Frequency of data collection

Annually.

12.3. Data collection
 

Mode of data collection

A description of the mode of data collection used in your country. Please mention if you use mixed mode of data collection.

Mode                    % of total
1-PAPI                         0.0
2-CAPI                         2.8
3-CATI                        97.2
4-Self administrated           0.0

The mean interview duration

The mean interview duration per household is calculated as the sum of the duration of all household interviews plus the sum of the duration of all personal interviews, divided by the number of household questionnaires completed. Only households accepted for the database have to be considered. 

Average interview duration = 26.5 minutes.
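
The definition of the mean interview duration can be written as a short Python sketch. The durations below are toy values, not survey data, and the function name is hypothetical.

```python
def mean_interview_duration(household_minutes, personal_minutes):
    """Mean duration per household, as defined above.

    household_minutes: one duration per completed household questionnaire
    personal_minutes:  durations of all personal interviews in those households
    """
    n_households = len(household_minutes)
    if n_households == 0:
        raise ValueError("no accepted households")
    # Sum of household and personal interview time, divided by the
    # number of completed household questionnaires.
    return (sum(household_minutes) + sum(personal_minutes)) / n_households

# Toy durations in minutes (hypothetical).
household_minutes = [10.0, 12.0, 8.0]
personal_minutes = [15.0, 14.0, 16.0, 20.0]
mean_minutes = mean_interview_duration(household_minutes, personal_minutes)
```

Note that the denominator is the number of households, not the number of interviews, so households with several personal interviews raise the mean.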

 
12.4. Data validation

-

12.5. Data compilation

-

12.5.1. Weighting procedure
 

Design factor

Non-response adjustments

Adjustment to external data

Final cross sectional weights

Deft2=1.25

Non-response adjustments

 

The household design weights (see below) were multiplied by $n_{\text{sample},h} / n_{\text{respondents},h}$ in every stratum $h$.
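
As an illustration, the stratum-level adjustment factors for the new sample (DB075=2) can be computed from the counts given in section 12.1.3. The sketch assumes that the sample count in the numerator is the second-phase sample excluding over-coverage; the function names are hypothetical.

```python
# Counts per stratum for the new sample (DB075=2), from section 12.1.3.
# ASSUMPTION: n_sample is the second-phase sample excluding over-coverage.
N_SAMPLE = {1: 813, 2: 646, 3: 562, 4: 497, 5: 394, 6: 297, 7: 198,
            8: 183, 9: 469, 10: 387, 11: 298, 12: 132, 13: 46}
N_RESPONDENTS = {1: 547, 2: 470, 3: 411, 4: 380, 5: 271, 6: 205, 7: 163,
                 8: 154, 9: 353, 10: 304, 11: 184, 12: 90, 13: 13}

def nonresponse_factor(stratum):
    """Ratio n_sample,h / n_respondents,h for stratum h."""
    return N_SAMPLE[stratum] / N_RESPONDENTS[stratum]

def adjust_weight(design_weight, stratum):
    """Design weight inflated so that respondents also represent non-respondents."""
    return design_weight * nonresponse_factor(stratum)

factors = {h: nonresponse_factor(h) for h in N_SAMPLE}
```

With this adjustment the respondents of a stratum carry the same total weight as the whole sample of that stratum carried before non-response.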

 

Calculation of design weights

 

The population figures for the person selection were calculated separately from the master samples of SY 2011 and of 2010, 2009 and 2008 (each of size 50,000). Here $\pi_{\alpha,\text{person }k}$ is the inclusion probability of the selected person $k$ in the master sample. The inclusion probabilities of the dwelling units created around the selected persons in the master sample were $\pi_{\alpha k} = \pi_{\alpha,\text{person }k}\, n_{16+,\text{dwelling of }k}$. The inclusion probabilities of the two-phase sampling (the combined effect of selecting the master sample and the SILC sample) were calculated at the second phase on the basis of the stratification of the master sample (13 strata) and the allocation used. For each wave we separately calculated the inclusion probabilities $\pi^{*}_{k} = \pi_{\alpha k}\, \pi_{k|s_\alpha}$, where

$\pi_{\alpha k} = \pi_{\alpha,\text{person }k}\, n_{16+,\text{HH of }k} = n_{s_\alpha}\, n_{16+,\text{HH of }k} \,/\, \ldots$

and $\pi_{k|s_\alpha} = n_h / N_{h,s_\alpha}$ is the conditional inclusion probability at the second phase, which takes the stratification of the master sample into account.

The Finnish SILC D file contains the design weight variable DB080 (the inverse of the inclusion probability). The original design weights were calculated separately for each of the four waves and multiplied by 0.25 in order to obtain coherent information about the households.

PB070 (personal design weight for selected respondents) is an estimate of the inverse of the inclusion probability of the person ($\text{DB080} \times n_{16+,\text{HH}}$). This weight was not needed in the weighting procedure of the SILC/IDS. These weights were likewise calculated separately for each rotational group, and the calculation covered the whole sample (excluding over-coverage). However, PB070 is defined only for the accepted households (P file), not for the whole sample (including non-response). A non-response correction therefore has to be included in the weight in order to obtain correct figures. We applied the simple adjustment $n_{\text{sample}} / n_{\text{respondents}}$ in every stratum. In addition, to remove the effect of the separate waves, we multiplied the weights by 0.25. The sum of the weights is $N_{16+}$.
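
The chain of inclusion probabilities can be sketched in Python as below. The form of the first-phase person probability (master sample size divided by the population aged 16+) is an assumption, because the corresponding expression is incomplete in this report; the population figure and the household size of 2 are likewise used only for illustration. The stratum counts are those of stratum 1 of the new sample.

```python
def two_phase_inclusion_probability(n_master, n_population_16plus,
                                    n16_in_dwelling, n_h, n_h_master):
    """Two-phase inclusion probability of a dwelling unit (pi*_k)."""
    # ASSUMED form of the first-phase person probability (not given in full in the text).
    pi_person = n_master / n_population_16plus
    # Dwelling unit: person probability times the number of 16+ members.
    pi_dwelling = pi_person * n16_in_dwelling
    # Conditional second-phase probability n_h / N_{h,s_alpha}.
    pi_conditional = n_h / n_h_master
    return pi_dwelling * pi_conditional

def household_design_weight(pi_star, wave_share=0.25):
    """Inverse inclusion probability, scaled by 0.25 for the four waves."""
    return wave_share / pi_star

pi_star = two_phase_inclusion_probability(
    n_master=50_000,
    n_population_16plus=4_348_213,  # ASSUMPTION: 16+ population quoted in this section
    n16_in_dwelling=2,              # hypothetical household with two 16+ members
    n_h=820, n_h_master=10_213,     # stratum 1 of the new sample
)
db080_like = household_design_weight(pi_star)
pb070_like = db080_like * 2         # personal weight: household weight times n16+
```

The names `db080_like` and `pb070_like` signal that these are illustrative analogues of DB080 and PB070, not the published weights.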

The non-response-adjusted weights were adjusted to the number of households of the survey year in each rotational group and then used as input weights in the calibration (raking method), which was carried out with the CALMAR macro for the accepted households. The calibration was carried out separately for each rotational group. It can be regarded as integrative, i.e. both the household and the person level were included in the process. The percentage marginal distributions and the statistics used in the calibration are the following:

 

1) Households: province; type of municipality; HH size; sums of 15 different income variables. The first three distributions of the households were obtained from the master sample, using weights for which a primary calibration (population register: 16+ persons and persons under 16 by region; gender*age class) was conducted. The income information comes from different registers.

 

2) Persons: gender and age classes (0-4, 5-9, … , 80-84, 85+)

 

 Description of the calibration variables:

  • Region (NUTS 3 level), capital region separated
  • Size of dwelling unit
  • Degree of urbanisation
  • Men 0-4,5-9,10-14,...,80-84
  • Women 0-4,5-9,10-14,...,80-84
  • Income 1: Cash or near cash employee income
  • Income 2: Income1 > 0
  • Income 3: Pensions
  • Income 4: Unemployment benefits 1
  • Income 5: Unemployment benefits 2
  • Income 6: Income 4 > 0
  • Income 7: Income from self-employment
  • Income 8: Capital income 1
  • Income 9: Income from agriculture
  • Income 10: Income from property and forestry
  • Income 11: Other capital income
  • Income 12: Income from forestry 2
  • Income 13: Capital gains
  • Income 14: Pensions > 0
  • Mortgage interests
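
The raking idea can be illustrated with a minimal Python sketch of iterative proportional fitting on categorical margins. It is not the CALMAR macro and does not handle numeric calibration variables such as income sums; the units, input weights and margin totals are hypothetical.

```python
def rake(units, weights, margins, max_iter=100, tol=1e-10):
    """Iterative proportional fitting.

    units:   list of dicts mapping variable name -> category
    weights: input weights, one per unit
    margins: variable name -> {category: target weighted total}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted totals per category of this variable.
            totals = {category: 0.0 for category in targets}
            for unit, w_i in zip(units, w):
                totals[unit[var]] += w_i
            # Scale every weight so that the totals hit the targets.
            for i, unit in enumerate(units):
                factor = targets[unit[var]] / totals[unit[var]]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:   # all margins reproduced within tolerance
            break
    return w

units = [
    {"region": "A", "size": "1"},
    {"region": "A", "size": "2+"},
    {"region": "B", "size": "1"},
    {"region": "B", "size": "2+"},
]
margins = {"region": {"A": 60.0, "B": 40.0}, "size": {"1": 50.0, "2+": 50.0}}
calibrated = rake(units, [1.0, 2.0, 3.0, 4.0], margins)
```

After convergence the calibrated weights reproduce every margin at the same time, which is the property the calibrated SILC weights have with respect to the register totals.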

 

The calibration data were obtained from the register-based dwelling data and the income registers used for the Total income distribution data of Statistics Finland, which cover the whole household-dwelling population. In the 2011 process, 2,551,000 was used as the fixed number of households. The calibration yields weights that reproduce these margins exactly when the variables are summed over the data set of accepted observations. DB090 is this calibrated weight multiplied by the share of the rotational group (rg) in all accepted sample households ($n_{\text{respondents},rg} / n_{\text{respondents}}$), which adjusts for the effect of the separate calculations.
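
The scaling by rotational group share can be checked with a small Python sketch that uses the accepted household counts from section 12.1.3. Equal calibrated weights within each group are a toy assumption made only to keep the example short; the function names are hypothetical.

```python
# Accepted households by rotational group (DB075), from section 12.1.3.
ACCEPTED = {2: 3545, 1: 3003, 4: 1423, 3: 1380}
N_HOUSEHOLDS = 2_551_000   # fixed number of households used in the 2011 calibration

def rotational_group_share(db075):
    """n_respondents,rg / n_respondents."""
    return ACCEPTED[db075] / sum(ACCEPTED.values())

def db090_like(calibrated_weight, db075):
    """Calibrated weight scaled by the share of its rotational group."""
    return calibrated_weight * rotational_group_share(db075)

# Each group was calibrated separately, so its weights sum to N_HOUSEHOLDS.
# TOY ASSUMPTION: all households in a group have the same calibrated weight.
combined_total = sum(
    db090_like(N_HOUSEHOLDS / n, rg) * n for rg, n in ACCEPTED.items()
)
```

Because the four shares sum to one, the combined weights again sum to the fixed number of households instead of four times that number.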

 

When DB090 is attached to the R file (“All persons currently living in households or temporarily absent”), the weights (in this context RB050) sum to the exact number of non-institutionalised persons at the end of 2011, i.e. 5,294,659. Furthermore, when DB090 is linked to the P file (“All eligible persons for whom the information could be completed”), the weights (in this context PB040) give the population of persons aged at least 16 years, i.e. 4,348,213. Linking DB090 to the sample person in the R or P file gives the defined number of households (2,551,000). The values of the variables are equal: DB090 = RB050 = PB040.

 

Finally, the personal cross-sectional weight for the selected respondent, PB060, is DB090 multiplied by $n_{16+,\text{HH}}$. The number of persons aged 16+ is fixed in this phase as well.

 

An additional weight for children aged 0 to 12, RL070 (Children cross-sectional weight for child care), is calculated by multiplying RB050 by the ratio “number of non-institutionalised children in age class X from the register” / “number of children in age class X estimated with RB050”, where X = 0 to 12.
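
The ratio adjustment behind RL070 can be sketched as follows. The child records and register counts are toy values, and the function name is hypothetical.

```python
def child_care_weights(children, register_counts):
    """Ratio-adjust RB050 so that each age class matches the register count.

    children:        list of (age, rb050) pairs for children aged 0 to 12
    register_counts: age -> number of non-institutionalised children in the register
    """
    # Number of children per age class as estimated with RB050.
    estimated = {}
    for age, rb050 in children:
        estimated[age] = estimated.get(age, 0.0) + rb050
    # Multiply each weight by register count / estimated count of its age class.
    return [rb050 * register_counts[age] / estimated[age] for age, rb050 in children]

# Toy data (hypothetical, not survey or register figures).
children = [(0, 500.0), (0, 700.0), (5, 900.0), (12, 400.0), (12, 450.0)]
register_counts = {0: 1300.0, 5: 850.0, 12: 900.0}
rl070 = child_care_weights(children, register_counts)
```

After the adjustment the weights of each age class sum to the register count of that class.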

 
12.5.2. Estimation and imputation
 
Imputation procedure used: Deductive imputation, statistical imputation.

Imputed rent: The stratification method.

Company car: Register source, no imputation.
 
12.6. Adjustment

-


13. Comment

National questionnaire is available in Circa BC at: https://circabc.europa.eu/

Please select EU-SILC section and then select the folder called "06 National Questionnaire" in the library list.

 

Annex. Technical report on 2011 module on intergenerational transmission of disadvantages.


Annexes
Technical report on 2011 module on intergenerational transmission of disadvantages (Annex.rtf)