CZECH STATISTICAL OFFICE | Na padesátém 81, 100 82 Prague 10 | czso.cz 1/X
Ing. Jaroslav Kraus, Ph.D.
Mgr. Štěpán Moravec
DISAGGREGATION METHODS FOR GEOREFERENCING INHABITANTS WITH UNKNOWN PLACE OF RESIDENCE:
THE CASE STUDY OF POPULATION CENSUS 2011 IN THE CZECH REPUBLIC
24th October, EFGS 2013 Conference, Sofia
STARTING SITUATION
Total usually resident population: 10 436 560
Georeferenced inhabited building points stored in the Register of Census Districts and Buildings managed by CZSO: 1 790 122
Georeferenced population with exact place of usual residence (x,y coordinates): 10 343 479
High coverage of georeferenced data (above 99 %): about 93 thousand inhabitants are not linked to the exact place of their usual residence (0.9 % of the total census population)
10 436 560 – 10 343 479 = 93 081
However, the census data of these inhabitants are linked to the level of statistical districts
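The figures on this slide can be checked with a few lines of Python. This is only an arithmetic sketch; the variable names are illustrative and do not come from the presentation.

```python
# Figures quoted on the "Starting situation" slide.
total_population = 10_436_560
georeferenced = 10_343_479

# Inhabitants without an exact place of usual residence.
not_georeferenced = total_population - georeferenced

# Shares of the total census population.
share_missing = not_georeferenced / total_population
coverage = georeferenced / total_population

print(not_georeferenced)         # 93081
print(f"{share_missing:.1%}")    # 0.9%
print(f"{coverage:.1%}")         # 99.1%
```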
DESCRIPTION OF THE PROBLEM
Cause: missing, incomplete or incorrect address data
Structure of the people with unknown place of residence:
– homeless people
– people living in emergency buildings or shelters
– people living in buildings without final approval
Possible solution for distributing these people into buildings with x,y coordinates or into grids:
– Application of a disaggregation method
– Testing of 3 disaggregation methods in ArcGIS software
CASE STUDY
Case study: the small town of Abertamy in the northern part of the Czech Republic
Total census population: 1 213
Number of not georeferenced inhabitants: 46
Total number of statistical districts: 6
Number of affected statistical districts: 6
Number of inhabited buildings: 214
1. Layer of statistical districts with number of not georeferenced inhabitants
2. Layer of population grids with number of georeferenced inhabitants
METHOD 1: CREATING NEW RANDOM BUILDING POINTS
Creates a specified number of random point features. Random points can be generated in an extent window, inside polygon features, on point features, or along line features.
Parameters:
– Number of Points
– Minimum Allowed Distance
– Others
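The behaviour described for the random-point tool can be approximated outside ArcGIS. The following is a minimal sketch in plain Python, assuming a single polygon given as a vertex list, rejection sampling inside its bounding box, and a ray-casting containment test. The function names, the rectangular district outline and the seed are hypothetical, and this is not the ArcGIS implementation.

```python
import math
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points_in_polygon(polygon, number_of_points, min_distance=0.0,
                             rng=None, max_tries=100_000):
    """Draw points in the bounding box and keep those inside the polygon
    that are at least min_distance away from every point kept so far."""
    rng = rng or random.Random()
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    tries = 0
    while len(points) < number_of_points and tries < max_tries:
        tries += 1
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if not point_in_polygon(x, y, polygon):
            continue
        if any(math.dist((x, y), p) < min_distance for p in points):
            continue
        points.append((x, y))
    return points


# Hypothetical rectangular district outline, coordinates in metres.
district = [(0, 0), (1000, 0), (1000, 600), (0, 600)]
points = random_points_in_polygon(district, 47, min_distance=20,
                                  rng=random.Random(1))
print(len(points))
```

If the minimum distance is too large for the polygon, the loop stops after `max_tries` draws and returns fewer points than requested, so the caller should check the length of the result.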
METHOD 1: RECALCULATION OF POPULATION BY RANDOM BUILDING POINTS (1)
Recalculation of usually resident persons by rIDOBs (random building points), Abertamy municipality
Number of rIDOBs: 47
Number of persons by rIDOB: random integer from <1;4>
    dim max, min
    max = 4
    min = 1
    x = Int((max - min + 1) * Rnd + min)
    __esri_field_calculator_splitter__
    Počet osob (number of persons) = x
Random number from <0;1) by IDOB (e.g. for ordering):
    dim max, min
    max = 1
    min = 0
    x = (max - min) * Rnd + min
    __esri_field_calculator_splitter__
    Nahodne poradi (RandomOrdering) = x
Source: Using field calculator: Create Random Values, Iowa State University
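For readers without ArcGIS, the two field-calculator expressions can be mirrored roughly in Python. This is a sketch under assumptions: the record layout, the helper names and the fixed seed are invented for illustration, and only the value ranges (an integer from 1 to 4, and a number between 0 and 1 for ordering) follow the slide.

```python
import random

rng = random.Random(2011)  # fixed seed only so that the sketch is repeatable


def random_persons(minimum=1, maximum=4):
    # Same idea as Int((max - min + 1) * Rnd + min): an integer in <min; max>.
    return int((maximum - minimum + 1) * rng.random() + minimum)


def random_order_key():
    # Uniform number in <0; 1), usable as a key for random ordering.
    return rng.random()


# One record per random building point (47 in the Abertamy example).
ridobs = [
    {"ridob": i, "persons": random_persons(), "order": random_order_key()}
    for i in range(1, 48)
]

# Sorting by the random key puts the points into a random order.
ridobs.sort(key=lambda record: record["order"])
print(ridobs[0])
```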
METHOD 1: METHODOLOGY
1. Creating new random building points
2. Defining the population (from a random interval) for the new random building points
3. Recalculation to the limit number of inhabitants (e.g. defined by information from the statistical district)
Source: ArcGIS 10 Help
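The slide does not say how the limit in step 3 of Method 1 is enforced. One possible reading, sketched here purely as an assumption, is to visit the random building points in random order and keep their randomly drawn persons until the number known for the statistical district is used up. The function name `cap_to_limit` and the sample data are hypothetical.

```python
import random


def cap_to_limit(persons_by_point, limit, rng=None):
    """Visit points in random order and keep persons until the limit is reached.

    Returns the capped allocation and the number of persons left undistributed
    (greater than zero only if the points cannot absorb the whole limit).
    """
    rng = rng or random.Random()
    order = list(persons_by_point)
    rng.shuffle(order)
    remaining = limit
    capped = {}
    for point_id in order:
        kept = min(persons_by_point[point_id], remaining)
        if kept > 0:
            capped[point_id] = kept
        remaining -= kept
    return capped, remaining


rng = random.Random(7)
# Step 2: 47 random building points, each with 1 to 4 persons.
persons = {f"r{i}": rng.randint(1, 4) for i in range(1, 48)}
# Step 3: the 46 not georeferenced inhabitants act as the limit.
capped, undistributed = cap_to_limit(persons, limit=46, rng=rng)
print(sum(capped.values()), undistributed)  # 46 0
```

Points visited after the limit is reached receive nobody and drop out of the result, which is one way the number of populated random points can end up below the number generated.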
METHOD 2: CREATING POPULATION CENTERS OF GRAVITY (1)
ArcGIS 10: Median Center method (or Mean Center, Central Feature)
MEDIAN CENTER (SPATIAL STATISTICS)
Identifies the location that minimizes the overall Euclidean distance to the features in a dataset
Both the Mean Center and the Median Center are measures of central tendency
For line and polygon features, feature centroids are used in distance computations
The Case Field is used to group features for separate median center computations (e.g. by statistical districts)
Source: ArcGIS 10 Help
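To make the two measures concrete, the sketch below computes a population-weighted mean center and an approximate median center per statistical district. It is an illustration under assumptions, not the ArcGIS code: the median center is approximated with a plain Weiszfeld iteration, the district code plays the role of the Case Field, and the coordinates and person counts are invented.

```python
import math
from collections import defaultdict


def mean_center(points):
    """points: list of (x, y, weight). Weighted mean of the coordinates."""
    total = sum(w for _, _, w in points)
    return (sum(x * w for x, _, w in points) / total,
            sum(y * w for _, y, w in points) / total)


def median_center(points, iterations=200, tol=1e-9):
    """Weiszfeld iteration toward the point that minimises the weighted
    sum of Euclidean distances to all input points."""
    cx, cy = mean_center(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for x, y, w in points:
            d = math.hypot(x - cx, y - cy)
            if d < tol:
                continue  # skip a coincident point to avoid division by zero
            num_x += w * x / d
            num_y += w * y / d
            denom += w / d
        if denom == 0:
            break
        nx, ny = num_x / denom, num_y / denom
        moved = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if moved < tol:
            break
    return cx, cy


def centers_by_case(buildings):
    """buildings: iterable of (district, x, y, persons), grouped by district."""
    groups = defaultdict(list)
    for district, x, y, persons in buildings:
        groups[district].append((x, y, persons))
    return {d: {"mean": mean_center(p), "median": median_center(p)}
            for d, p in groups.items()}


# Hypothetical building points: (district, x, y, georeferenced persons).
buildings = [
    ("SD1", 0, 0, 1), ("SD1", 2, 0, 1), ("SD1", 0, 2, 1), ("SD1", 2, 2, 1),
    ("SD2", 0, 0, 2), ("SD2", 4, 0, 2), ("SD2", 2, 3, 3),
]
for district, centers in centers_by_case(buildings).items():
    print(district, centers["mean"], centers["median"])
```

The mean center is pulled toward outlying buildings, while the median center is less sensitive to them, which may matter in districts with a few remote inhabited buildings.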
METHOD 2: METHODOLOGY (1)
1. Calculating the central value (Mean Center, Median Center)
→ Layer of spatially weighted population centers of gravity
METHOD 2: METHODOLOGY (2)
2. Spatial join linking persons with unknown place of residence to the weighted center of gravity
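The spatial join in step 2 of Method 2 can be pictured as a point-in-polygon lookup: each center of gravity receives the number of not georeferenced persons of the statistical district that contains it. The sketch below assumes simple polygons given as vertex lists; the district outlines, identifiers and counts are hypothetical, and the function names are not ArcGIS names.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(centers, districts):
    """centers: {center_id: (x, y)}
    districts: {district_id: (polygon, not_georeferenced_persons)}
    Attaches the district's count to the center that falls inside it."""
    joined = {}
    for center_id, (x, y) in centers.items():
        for district_id, (polygon, persons) in districts.items():
            if point_in_polygon(x, y, polygon):
                joined[center_id] = {"district": district_id,
                                     "x": x, "y": y, "persons": persons}
                break
    return joined


# Two hypothetical adjacent districts with 5 and 3 not georeferenced persons.
districts = {
    "SD1": ([(0, 0), (10, 0), (10, 10), (0, 10)], 5),
    "SD2": ([(10, 0), (20, 0), (20, 10), (10, 10)], 3),
}
centers = {"c1": (4, 6), "c2": (15, 5)}
joined = spatial_join(centers, districts)
print(joined["c1"]["persons"], joined["c2"]["persons"])  # 5 3
```

A center of gravity can fall outside a concave district polygon, in which case it stays unmatched here; joining on the district code used as the Case Field would avoid that.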
METHOD 3: CALCULATION OF POPULATION WEIGHTS OF GRIDS
Aim:
– To distribute the not georeferenced population only into grids, not into particular buildings (x,y coordinates)
– To respect the known spatial distribution of population (based on the georeferenced population only)
Methodology:
1. To calculate a population weight of each inhabited grid segment within the affected statistical district
Population weight of grid segment i = population number of grid segment i / total population of statistical district j, i.e. $w_i = P_i / P_j$
1. Layer of population grids with number of georeferenced inhabitants
2. Layer of population grids with relative population weight
METHOD 3: METHODOLOGY
2. To calculate the population number distributed to each inhabited grid segment within the affected statistical district:
Population weight of grid segment i * total number of not georeferenced persons within statistical district j
3. To round the population number distributed to each inhabited grid segment to an integer value
4. To add the number of distributed not georeferenced persons to the initial number of georeferenced inhabitants of each grid segment
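The four steps of Method 3 translate almost directly into code. The sketch below is a minimal illustration with invented data. It assumes that the denominator of the weight is the georeferenced population summed over the inhabited grid segments of the district, so that the weights add up to 1, and it uses Python's built-in `round`, which rounds exact halves to the nearest even integer. The function and key names are hypothetical.

```python
from collections import defaultdict


def distribute_to_grids(grid_population, unlocated_by_district):
    """grid_population: {(district, grid_id): georeferenced persons}
    unlocated_by_district: {district: not georeferenced persons}"""
    # Only inhabited grid segments take part in the distribution.
    inhabited = {k: p for k, p in grid_population.items() if p > 0}

    district_totals = defaultdict(int)
    for (district, _grid), persons in inhabited.items():
        district_totals[district] += persons

    result = {}
    for (district, grid), persons in inhabited.items():
        weight = persons / district_totals[district]             # step 1
        share = weight * unlocated_by_district.get(district, 0)  # step 2
        added = round(share)                                     # step 3
        result[(district, grid)] = {
            "weight": weight,
            "added": added,
            "total": persons + added,                            # step 4
        }
    return result


# Hypothetical district "A": three grid segments, 7 persons to distribute.
grids = {("A", "g1"): 60, ("A", "g2"): 30, ("A", "g3"): 10}
out = distribute_to_grids(grids, {"A": 7})
for key, row in out.items():
    print(key, row["added"], row["total"])
```

In this example the rounded values 4, 2 and 1 happen to add up to 7; in general the rounded values need not add up to the number of persons to be distributed.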
2. Layer of population grids with relative population weight
3. Layer of population grids with number of additionally distributed persons
METHOD 3: METHODOLOGICAL ISSUES
Different types of irregularities and deviations:
– Problem with rounding (increase or decrease of the distributed population number)
– Problem with statistical districts without inhabited buildings
– Problem with grids with the same population weight
→ Definition of additional assumptions and consequent manual corrections required
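The rounding and equal-weight problems of Method 3 can be reproduced with three grid segments of equal weight and two persons to distribute: plain rounding hands out three persons. One common remedy, shown here only as a possible automation and not as the authors' procedure, is largest-remainder rounding. The tie-breaking rule (alphabetical grid identifier) is an arbitrary assumption of exactly the kind the slide says has to be defined.

```python
import math


def largest_remainder(weights, persons_to_distribute):
    """weights: {grid_id: weight}, summing to 1 within one district.
    Returns integer allocations that add up to persons_to_distribute."""
    shares = {g: w * persons_to_distribute for g, w in weights.items()}
    allocation = {g: math.floor(s) for g, s in shares.items()}
    leftover = persons_to_distribute - sum(allocation.values())
    # Largest fractional part first; ties broken by grid id (an assumption).
    ranked = sorted(shares, key=lambda g: (-(shares[g] - allocation[g]), g))
    for g in ranked[:leftover]:
        allocation[g] += 1
    return allocation


weights = {"g1": 1 / 3, "g2": 1 / 3, "g3": 1 / 3}

naive = {g: round(w * 2) for g, w in weights.items()}
print(sum(naive.values()))   # 3, one person too many

fixed = largest_remainder(weights, 2)
print(fixed)                 # {'g1': 1, 'g2': 1, 'g3': 0}
```

Which of the equally weighted grid segments receives the extra persons is decided only by the tie-breaking rule, so a different rule (for example a random draw) gives a different but equally defensible result.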
CONCLUSION
Pluses and minuses of methods 1 and 2:
– inhabitants are distributed to the level of buildings
– distribution follows the spatial distribution of inhabited buildings
Pluses and minuses of method 3:
– distribution follows the spatial distribution of population
– inhabitants are distributed to the level of grids
All the methods mentioned are used for the recalculation of people with unknown exact place of residence
A relatively large amount of manual work is involved → automating some of the processes is important
Finally, recalculation at the level of single (personal) records is the aim of the whole process
Thank you for your attention.