Geog 258: Maps and GIS


February 24, 2006 (Fri)

Geographic Information System (GIS): Representation


Learning objectives

 

1.    Understand how geographic phenomena are measured

2.    Understand how attributes are represented in GIS

3.    Understand how space is represented in GIS

4.    Understand how attribute and space are represented together in GIS

 


Measurement of geographic phenomena

 

Compare an information system (IS) that has no spatial component with one that has a spatial component

How would the databases look different? In other words, how differently would the information be stored?

GIS deals with georeferenced data and attributes attached to spatial entities

 

Geographic information has three main components

Space, time, and attribute

Unlike a conventional information system (IS), a GIS database considers not only attributes but also space and time

Spatial component is central to GIS

Many questions about GIS data models revolve around how space is combined with attributes

 

Can you measure all of the three?

Example: population density per census tract in King County

In the map, what is measured among {space, time, attribute}?

Time is fixed (e.g. source data: 2000 census)

Space is controlled (e.g. census tract)

The map does not measure where people are located. It measures the population count within each census tract divided by the tract's area. The census tract (space) serves as the control within which the attribute is measured

Attribute is measured
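
The population density example can be sketched in a few lines of Python. The tract IDs, population counts, and areas below are invented for illustration and are not census figures. The point is only that the spatial units and the census year are given in advance, and the attribute is the thing that gets computed.

```python
# Hypothetical tracts: the IDs, counts, and areas are invented for illustration.
tracts = [
    {"tract_id": "A", "population": 4200, "area_sq_km": 2.0},
    {"tract_id": "B", "population": 3000, "area_sq_km": 7.5},
    {"tract_id": "C", "population": 5100, "area_sq_km": 1.7},
]

def population_density(tract):
    """Attribute measured within a controlled spatial unit (people per sq km)."""
    return tract["population"] / tract["area_sq_km"]

# Time is fixed (one census year) and space is controlled (the tract list),
# so the only thing that is measured is the attribute.
density_by_tract = {t["tract_id"]: population_density(t) for t in tracts}

for tract_id, density in density_by_tract.items():
    print(f"tract {tract_id}: {density:.1f} people per sq km")
```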

 

Can you measure all of the three?

Example: the location of cities in the U.S.

In the map, what is measured among {space, time, attribute}?

Time is fixed

Space is measured

The location of cities is measured

Attribute is controlled

The attribute (i.e. city) is held constant: only features classified as cities are located

 

Can you measure all of the three?

elevation

land parcel

changes in city limits over the last 50 years

sea level at the station

 

Measurement of geographic phenomena: summary

 

  The three main components of geographic information are space, time, and attribute

  Not all of them can be measured at once; some must be sacrificed (held fixed or controlled) so that one component can be measured fully

  One is measured, one is controlled, and the other is fixed

  Control of other components of a phenomenon permits the measurement of one component

 

The case in which attribute plays role of control

1) The precise location of a geographic phenomenon (thing, event, entity, and so on) can be measured (e.g. land parcel, city location)

2) A line of equal value (isoline) is derived by controlling the attribute and measuring space (e.g. contour line)

3) The exhaustive boundary of a geographic phenomenon is measured (e.g. categorical map)

 

The case in which space plays role of control

1) The phenomenon varies continuously, so it is infeasible to measure its value at every location. Space is controlled instead, and the attribute is measured in each spatial unit (e.g. DEM)

2) The attribute is measured in a predefined spatial unit such as a census unit (e.g. choropleth map)
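
The two cases can be contrasted in a small sketch. The surface function, grid spacing, and contour value are assumptions chosen to keep the arithmetic easy to follow; a real elevation surface is not given by a formula.

```python
def elevation(x, y):
    """Hypothetical continuous surface: a plane rising to the east and north."""
    return 100.0 + 2.0 * x + 1.0 * y

# Space as control: fix the sample locations (a 3 x 3 grid with 10 m spacing)
# and measure the attribute at each one, as a DEM does.
cell_size = 10.0
dem = [[elevation(col * cell_size, row * cell_size) for col in range(3)]
       for row in range(3)]

# Attribute as control: fix the value (a 120 m contour) and measure where it
# falls. Along the transect y = 0 the surface is 100 + 2x, so x = (120 - 100) / 2.
contour_value = 120.0
x_on_contour = (contour_value - 100.0) / 2.0

print(dem)
print(x_on_contour)
```

In the first half the locations are decided before any measurement is taken. In the second half the value is decided first and the location is the result.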

 


Representation of attribute

 

As a main component of geographic information, (non-spatial) “attribute” has been stored and managed in different ways, evolving from file system to database system

 

1. File system

 

Also called flat file (e.g. ASCII file)

Still a common file format for data distribution

Can be converted to tabular format (using MS Access or a statistics package)

Comes with data dictionary
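
As a rough illustration of converting a flat file to tabular form, the sketch below reads comma-delimited text with Python's standard `csv` module and uses a data dictionary to give each field a type. The field names, values, and dictionary layout are all hypothetical; actual distribution files define their own.

```python
import csv
import io

# Hypothetical flat file contents; a real file would be read from disk.
flat_file = io.StringIO(
    "TRACT,POP2000,AREA\n"
    "000100,4200,2.0\n"
    "000200,3000,7.5\n"
)

# Hypothetical data dictionary: field name -> (description, type converter).
data_dictionary = {
    "TRACT": ("census tract identifier", str),
    "POP2000": ("total population, 2000 census", int),
    "AREA": ("tract area in square kilometers", float),
}

# Convert each text row into typed values using the data dictionary.
table = []
for row in csv.DictReader(flat_file):
    table.append({name: convert(row[name])
                  for name, (_description, convert) in data_dictionary.items()})

print(table)
```

Without the data dictionary, every value in the file is just text; the dictionary is what says that `TRACT` is a code to keep as a string and `POP2000` is a count.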

 

2. Database system

 

Unlike file system, a collection of “related” data

Unified storage increases efficiency in data management

 

2-1. Relational database

 

·       Entities can be viewed as tables

·       Rows and columns of a table represent entities and attributes, respectively

·       Relationships between entities are built through common attributes

 

 

You can link attribute data to spatial data through a common identifier

 

 

·       Primary key: unique identifier

·       Foreign key: key that allows for linking to other table

·       SQL: structured query language for relational database, reflects standardization efforts

·       Normalization: rules for reducing redundancy in table

·       Many commercial GIS systems are built upon relational database
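
The relational terms above can be tried out with Python's built-in `sqlite3` module. The table names, column names, and values are made up for this sketch. The county name is stored once in its own table rather than repeated on every tract row, which is the kind of redundancy that normalization removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE county (
        county_id INTEGER PRIMARY KEY,      -- primary key: unique identifier
        name      TEXT NOT NULL
    );
    CREATE TABLE tract (
        tract_id   TEXT PRIMARY KEY,
        county_id  INTEGER REFERENCES county(county_id),  -- foreign key
        population INTEGER
    );
    INSERT INTO county VALUES (1, 'King'), (2, 'Pierce');
    INSERT INTO tract VALUES ('A', 1, 4200), ('B', 1, 3000), ('C', 2, 5100);
""")

# SQL query: link the two tables through the common attribute county_id.
rows = conn.execute("""
    SELECT county.name, SUM(tract.population)
    FROM tract JOIN county ON tract.county_id = county.county_id
    GROUP BY county.name
    ORDER BY county.name
""").fetchall()
print(rows)
conn.close()
```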

 

2-2. Object-oriented database

 

·       The world can be seen as a collection of autonomous objects

·       Data and procedure are not separated

·       Embodies object-oriented concepts such as inheritance, polymorphism, and encapsulation

·       Handles abstract data type (ADT)

·       The object-oriented idea has driven innovation in software design, database management, data modeling, and so on (e.g. UML: a notation language for communicating system design with object-oriented concepts)

·       Future GIS system?
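
The object-oriented concepts listed above can be illustrated with ordinary Python classes. This is a language-level sketch only, with hypothetical class names; it does not show how an object-oriented database stores objects.

```python
import math

class SpatialObject:
    """Base class: data (name) and procedures (area, describe) live together."""
    def __init__(self, name):
        self._name = name              # encapsulated state

    def area(self):
        raise NotImplementedError

    def describe(self):
        return f"{self._name}: area {self.area():.1f}"

class RectangularParcel(SpatialObject):   # inheritance
    def __init__(self, name, width, height):
        super().__init__(name)
        self._width, self._height = width, height

    def area(self):                    # polymorphism: own version of area()
        return self._width * self._height

class CircularLake(SpatialObject):
    def __init__(self, name, radius):
        super().__init__(name)
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2

objects = [RectangularParcel("parcel 12", 30.0, 40.0), CircularLake("lake", 10.0)]
descriptions = [obj.describe() for obj in objects]
print(descriptions)
```

The loop calls `describe()` without knowing which kind of object it holds; each object supplies its own `area()`.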

 


Representation of spatial entities

 

There are two common data models for representing geographic phenomena

 

1. Raster model

 

Attributes are stored in controlled spatial unit such as grid cell

Good for representing continuous fields (e.g. elevation, temperature, soil type)
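
A raster can be pictured as a two-dimensional array plus enough georeferencing to tie cells to locations. In the sketch below the elevation values, the upper-left corner coordinates, and the 30 m cell size are all assumed for illustration.

```python
# Hypothetical 3 x 4 elevation grid (meters); row 0 is the northernmost row.
grid = [
    [120, 125, 130, 128],
    [115, 118, 124, 126],
    [110, 112, 119, 121],
]
x_min, y_max = 500000.0, 4200090.0   # upper-left corner (assumed map units: m)
cell_size = 30.0

def cell_value(x, y):
    """Return the attribute stored in the cell containing point (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point falls outside the grid")
    return grid[row][col]

print(cell_value(500045.0, 4200050.0))
```

Every location inside a cell returns the same value, which is what it means for space to be the controlled component.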

 

 

2. Vector model

 

Precise location is stored while attribute is controlled

Good for representing discrete objects (e.g. building, land parcel, lake)

Vector models can be classified by (1) whether topology exists and (2) what kind of spatial primitives are used (cartographic analogy: point, line, area)

 

 

What is topology?

Simply put, spatial relationships

 

2-1. Vector model by topology

·       Spaghetti vector model: data without topology → mainly for display

·       Topological vector model: data in which topology is built → allows for complex operations (e.g. network analysis, accurate spatial measurement)
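
The difference between the two models can be shown with two adjacent square polygons. The arc-node layout below is a simplified sketch with invented IDs, not the storage format of any particular GIS package.

```python
# Spaghetti: each polygon is just a coordinate list; the shared edge between
# A and B is stored twice and nothing records that the two are neighbors.
spaghetti = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

# Topological (arc-node) sketch: each arc is stored once, with its end nodes
# and the polygons on its left and right. None stands for the outside.
arcs = [
    {"arc": 1, "from": "n1", "to": "n2", "left": "A", "right": "B"},
    {"arc": 2, "from": "n2", "to": "n1", "left": "A", "right": None},
    {"arc": 3, "from": "n1", "to": "n2", "left": "B", "right": None},
]

def neighbors(polygon):
    """Polygons sharing an arc with `polygon`, read directly from the topology."""
    found = set()
    for arc in arcs:
        sides = {arc["left"], arc["right"]}
        if polygon in sides:
            found |= sides - {polygon, None}
    return found

print(neighbors("A"))
```

With the spaghetti data, answering "which polygons touch A?" would require comparing coordinates; with the topological data it is a table lookup.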

 

 

2-2. Vector model by spatial primitives

·       Point:  0-dimensional

·       Line: 1-dimensional

·       Area: 2-dimensional

How are they stored in files (such as ASCII file)?

They are scale-dependent
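
One way to picture how the three primitives might sit in a text file: a point is a single coordinate pair, a line is an ordered list of pairs, and an area is a list of pairs that closes on its starting vertex. The record layout below is made up for this sketch and is not a real exchange format.

```python
# A made-up ASCII layout (not a real exchange format): one feature per line,
# written as TYPE, ID, then x y coordinate pairs.
ascii_data = """\
POINT 1 3.0 4.0
LINE 2 0.0 0.0 5.0 5.0 10.0 5.0
AREA 3 0.0 0.0 4.0 0.0 4.0 3.0 0.0 3.0 0.0 0.0
"""

def parse_feature(line):
    """Split one text record into its primitive type, ID, and vertex list."""
    kind, feature_id, *numbers = line.split()
    values = [float(n) for n in numbers]
    vertices = list(zip(values[0::2], values[1::2]))
    return {"kind": kind, "id": int(feature_id), "vertices": vertices}

features = [parse_feature(line) for line in ascii_data.splitlines() if line]

for f in features:
    print(f["kind"], f["id"], len(f["vertices"]), "vertices")
```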

 

Q. Identify data model

 

Topographic surface of New York State

 

Fatal motor vehicle accidents & Road network & Lake Erie in Buffalo, NY

 

Orthophoto image and road network in SUNY-Buffalo Campus

 

 

 

 

Comparison between vector and raster

 

 

                          Vector data                                   Raster data
Basis of representation   Coordinate                                    Cell
Good at representing      Discrete entities                             Continuous entities
                          e.g. school, lake, event location, road       e.g. temperature, elevation, toxic level
Example data or product   TIGER/Line, USGS DLG                          Satellite imagery, aerial photo, USGS DRG, DOQ, DEM
Spatial data format       Shapefile, Arc/Info coverage, AutoCAD DXF     TIFF, JPEG, MrSID, BMP, BIL
Attributes                Multiple attributes stored in a linked table  Usually a single attribute (z-value)
Note                      Topology                                      Resolution

 

Translating data between vector and raster model

Scanned maps are vectorized when you have to create vector data from paper maps

Vector data are rasterized when you export vector maps into image files

Elevation (a continuous surface) can also be represented in the vector model (e.g. contour lines)
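
The rasterization step mentioned above can be sketched as follows: a cell is set to 1 when its center falls inside the vector polygon. This uses the standard ray-casting point-in-polygon test; the triangle, grid size, and cell size are assumptions for the example, and production tools offer other rules for cells that are only partly covered.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(polygon, n_rows, n_cols, cell_size):
    """Mark a cell 1 if its center lies inside the polygon, else 0.
    Row 0 is at the top; the grid's lower-left corner is at (0, 0)."""
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size
        grid.append([int(point_in_polygon((col + 0.5) * cell_size, y, polygon))
                     for col in range(n_cols)])
    return grid

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]    # hypothetical vector area
raster = rasterize(triangle, n_rows=4, n_cols=4, cell_size=1.0)
for row in raster:
    print(row)
```

The straight hypotenuse of the triangle becomes a staircase of cells, which is the loss of precise location that comes with moving from vector to raster.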

 

Advantage/disadvantage of vector/raster model

 

·       Raster: many data sources are already stored in this format (e.g. satellite image, orthophoto); equivalent to a multidimensional array; requires large storage space → compression techniques (e.g. MrSID); precise location is not measured, so some measurements are less accurate

·       Vector: precise location is measured (although some approximation exists, depending on the tolerance given to vertices: GBF/DIME vs. TIGER); storage space is saved; some geographic phenomena are not well represented in this data model (e.g. surfaces)

 


Representation of space and attribute: GIS Architecture

 

1. Hybrid system

 

Spatial data are stored in files, separate from the attributes, which are stored in tables

e.g. Arc/Info coverage: separation between Arc and Info (this is called the georelational model)
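
The hybrid arrangement can be sketched as two separate stores joined through a common feature ID. The dictionaries below merely stand in for a geometry file and an attribute table; the IDs, coordinates, and attribute values are invented, and this is not how Arc/Info itself lays out its files.

```python
# Spatial side: geometry keyed by feature ID (in a hybrid system, a file).
# All IDs, coordinates, and attribute values here are invented.
geometry_file = {
    101: [(0, 0), (2, 0), (2, 2), (0, 2)],
    102: [(2, 0), (5, 0), (5, 2), (2, 2)],
}

# Attribute side: rows of a table in a separate DBMS, carrying the same ID.
attribute_table = [
    {"feature_id": 101, "land_use": "residential", "owner": "Smith"},
    {"feature_id": 102, "land_use": "commercial", "owner": "Lee"},
]

def join_features(geometry, attributes):
    """Attach each attribute row to its geometry through the common ID."""
    joined = []
    for row in attributes:
        shape = geometry.get(row["feature_id"])
        if shape is not None:          # skip rows with no matching geometry
            joined.append({**row, "vertices": shape})
    return joined

features = join_features(geometry_file, attribute_table)
print(features[0]["land_use"], features[0]["vertices"])
```

Because the two sides are managed separately, keeping the IDs consistent between them is the system's responsibility, which is one motivation for the integrated approach.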

 

 

2. Integrated system

 

Spatial database management system approach

Vector and raster data are stored as relational tables

 

 

3. Object-oriented system

 

GIS data is modeled using OO concepts for natural representation of spatial entities

e.g. Geodatabase supports rules and relationships (unlike previous data models)

Lack of consensus on how it can be implemented

Not good for representing fields