Geog 258: Maps and GIS

March 7, 2006

Final exam review

 


1. Grid Coordinate Systems

 

Why coordinate systems? With a coordinate system, any location can be identified precisely. A coordinate can be thought of as a location identifier. If you report the location of your house using a coordinate system (such as latitude and longitude) in an emergency, a pilot can find and rescue you without confusion.

 

Why grid coordinate systems? How would you manage map sheets if you were in charge of national or state maps? It would be convenient to have a grid-like index so that different areas can be retrieved or archived in a systematic way (e.g. USGS quadrangles). A grid also makes measurements easy (e.g. the distance between two points). Two grid coordinate systems are covered: UTM and SPC. Both are used for large-scale mapping.

 

The name tells you a lot. UTM is Universal Transverse Mercator. (1) Universal: covers the world. (2) Mercator: uses the Mercator map projection. (3) Transverse: uses the transverse aspect. (Do you know Mercator and the transverse aspect?) SPC is the State Plane Coordinate system. (1) State: covers the U.S. state by state. (2) Plane: a planar (2-dimensional) coordinate system.

 

How are zones partitioned? UTM is divided into zones that are 6 degrees of longitude wide. SPC is divided into zones whose boundaries follow administrative units. Unlike map projections used for small-scale mapping, where the globe is projected onto one (big) developable surface (e.g. a Mercator map), these grid coordinate systems use many (small) developable surfaces, one defined for each zone. In this way, distortion within each zone is kept small.
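The 6-degree partition can be made concrete with a little arithmetic. The sketch below assumes the standard numbering in which zone 1 begins at 180°W and zone numbers increase eastward up to 60; it ignores the special-case zones around Norway and Svalbard. The function names are made up for illustration.

```python
import math

def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) for a longitude in degrees east (-180 to 180)."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    # Zone 1 spans 180W to 174W; every zone is 6 degrees wide.
    zone = int(math.floor((longitude_deg + 180.0) / 6.0)) + 1
    # Longitude +180 would compute as zone 61, so clamp it to zone 60.
    return min(zone, 60)

def central_meridian(zone):
    """Longitude (degrees east) of the central meridian of a UTM zone."""
    return -183.0 + 6.0 * zone

seattle_zone = utm_zone(-122.3)  # Seattle is at roughly 122.3 degrees west
print(seattle_zone, central_meridian(seattle_zone))
```

Each zone's transverse Mercator projection is centered on its own central meridian, which is why distortion stays small inside the zone.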

 

Why do SPC zones follow administrative boundaries? Maps have commonly been used for municipal purposes (such as land parcel records), where land units often follow administrative boundaries. There was demand for maps organized by administrative unit, and likewise for coordinate systems defined for a group of related administrative units.

 

Which map projection should be adopted for each zone in SPC? It depends on the shape of the zone. If the zone has a longer east-west extent, a conic projection (Lambert conformal conic) is the better choice because the lines of tangency lie along parallels, which reduces the overall distortion. If it has a longer north-south extent, transverse Mercator is used because its line of tangency lies along a meridian.

 

Why do grid coordinate systems use false easting and false northing? Both are common in UTM and SPC. Imagine a coordinate system for each zone whose center point is (0, 0). Coordinates to the west or south of this origin (for example, toward the bottom left) would then be negative. The origin (0, 0) is therefore shifted by an appropriate amount so that coordinate values in the zone are never negative.
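To see the shift in numbers, here is a minimal sketch using the UTM convention: a false easting of 500,000 m at the central meridian, and a false northing of 10,000,000 m applied in the southern hemisphere. The input offsets are hypothetical values standing in for already-projected coordinates; the function name is made up for illustration.

```python
UTM_FALSE_EASTING_M = 500_000.0            # added to x so points west of the central meridian stay positive
UTM_FALSE_NORTHING_SOUTH_M = 10_000_000.0  # added to y only in the southern hemisphere

def to_grid_coordinates(x_from_central_meridian_m, y_from_equator_m):
    """Shift raw projected offsets (meters) into UTM-style easting and northing."""
    easting = x_from_central_meridian_m + UTM_FALSE_EASTING_M
    if y_from_equator_m < 0:
        northing = y_from_equator_m + UTM_FALSE_NORTHING_SOUTH_M
    else:
        northing = y_from_equator_m
    return easting, northing

# A hypothetical point 120 km west of the central meridian and 2,000 km south of the equator.
print(to_grid_coordinates(-120_000.0, -2_000_000.0))
```

Without the shift, this point would have two negative coordinates; with it, both values are positive.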

 

Comparison between UTM and SPC

 

 

                   UTM                     SPC
Boundary           Latitude & longitude    Administrative boundary
Projection type    Transverse Mercator     Lambert conformal conic; Transverse Mercator
Geographic scope   International           U.S. only
Measurement unit   Meter                   Feet (for SPC27); Meter (for SPC83)

 


2. Thematic Maps

 

What is a thematic map? It shows the spatial distribution of a particular theme (e.g. a world climate map). It differs from a reference map, which is mainly designed to show the locations of features (e.g. an atlas). A thematic map is called a single-purpose map, while a reference map is called a multiple-purpose map. A thematic map is like a topical essay, whereas a reference map is like an encyclopedia.

 

Symbolization of thematic maps: Symbolization is mainly determined by the measurement level of the theme mapped. If the theme is measured on a nominal scale (e.g. land use type), distinguishing types of symbols should be used (e.g. shape, hue, pattern arrangement, pattern orientation). If the theme is measured on a numeric scale (e.g. land value), ordering types of symbols should be used (e.g. size, value, pattern texture). A thematic map showing a theme on a nominal scale is called a qualitative thematic map. A thematic map showing a theme on a numeric scale is called a quantitative thematic map.

 

Generalization of thematic maps: All thematic maps adopt generalization schemes. Maps can’t show all details; they attempt to portray relevant themes in a concise manner. For instance, the geometric dimensionality of a feature on the map doesn’t necessarily correspond to its actual dimensionality (e.g. a city is drawn as a point on a small-scale map even though it is better seen as a polygon). The location of symbols may be approximate rather than exact (e.g. a tourist map). In qualitative thematic maps, homogeneity within a category is assumed (e.g. an area shown as one land use type may actually be mixed).

 

Kinds of quaLitative thematic maps: The simplest form of thematic map is a point symbol map that shows the (approximate) location of features (e.g. a tourist map). Depending on scale, a feature can be mapped as a polygon instead of a point (e.g. a campus map). A categorical map is an area-class map in which the nominal value within each spatial unit is assumed to be homogeneous and the boundaries are drawn exhaustively (e.g. vegetation type map, soil type map, world biome map).

 

Kinds of quaNtitative thematic maps: The most commonly used quantitative thematic maps are the proportional symbol map, dot map, choropleth map, and isoline map. A proportional (or graduated) symbol map uses the size of a symbol to represent varying magnitude at the fixed location of a feature (e.g. total revenue at the location of a shopping mall). A dot map uses the number of dots to represent the magnitude of values collected for a given unit. A certain number of dots is sprinkled within the unit, and the location of the dots can be adjusted to the spatial arrangement of related features (e.g. no population dots in a lake). The choropleth map is popular mainly because much data is compiled by pre-defined units such as census units. Examples include median home value per census tract, breast cancer rate per county, and world population density per country. A choropleth map uses color value (lightness) to represent varying magnitude. An isoline map is good for portraying a continuous geographic phenomenon whose values vary smoothly (e.g. the temperature map used in a TV weather forecast).
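For the proportional symbol map, "size" has to be turned into an actual radius. A minimal sketch follows, assuming the common convention that the area of a circle (not its radius) is proportional to the value, so the radius grows with the square root of the value. The function name and the sample numbers are made up for illustration.

```python
import math

def symbol_radius(value, max_value, max_radius):
    """Radius of a circle whose AREA is proportional to value.

    max_radius is the radius drawn for max_value (any drawing unit, e.g. points).
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)

# A mall with one quarter of the largest revenue gets half the radius,
# which means one quarter of the circle area.
print(symbol_radius(25.0, 100.0, 20.0))
```

Scaling the radius directly instead would make large values look far bigger than they are, because readers tend to judge circles by area.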

 

Other quantitative thematic maps, used much less but worth mentioning, include the cartogram and the dasymetric map. Most maps do not attempt to distort the geometry of map features (generalization, such as showing a road as a line or a city as a point, is a different matter), but a cartogram dares to distort it. The size of each area is modified in proportion to the value mapped; see the U.K. population map in the lecture note, where the southern part becomes fatter. A cartogram can free us from a mental image fixated on land size.

 

Most area-feature maps are made directly from a predefined collection unit (especially for socioeconomic rather than physical phenomena). The spatial unit is not necessarily homogeneous for the phenomenon mapped. Put differently, the appropriate geographic unit varies with the phenomenon at hand: are you mapping soil, population, precipitation, or debt? A dasymetric map attempts to map the phenomenon within its most natural boundaries. It can free us from the tyranny of boundaries that may be irrelevant to the phenomenon of interest.

 

Measurement behind thematic maps: Even when maps look the same (e.g. same dimensionality and symbolization), the values mapped are not necessarily based on the same kind of measurement (i.e. how values are collected and compiled). For instance, consider the choropleth map, dasymetric map, and isoline map. The same phenomenon, such as population density, can be mapped differently. If a map shows population divided by area for each census tract, it is a choropleth map. If a map shows population density adjusted to the spatial arrangement of features that affect population density (e.g. lake, road, mountain), it is a dasymetric map. If a map shows population density interpolated from the centroids of census tracts (or, ideally, smaller units), it is an isoline map. Which map do you think is the most accurate representation of the phenomenon? Which map is hardest to make?
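The difference between the choropleth and dasymetric measurements shows up even in a one-tract example. The sketch below assumes the simplest ("binary") dasymetric adjustment, in which uninhabited area such as a lake is removed before dividing; real dasymetric mapping can use more elaborate ancillary data. The tract figures are invented.

```python
def choropleth_density(population, tract_area_km2):
    """Population divided by the whole area of the collection unit."""
    return population / tract_area_km2

def dasymetric_density(population, tract_area_km2, uninhabited_area_km2):
    """Population divided by the habitable part of the unit only (binary method)."""
    habitable_area = tract_area_km2 - uninhabited_area_km2
    if habitable_area <= 0:
        raise ValueError("no habitable area left in this unit")
    return population / habitable_area

# Hypothetical tract: 3,000 people, 10 km2 in total, of which 4 km2 is a lake.
print(choropleth_density(3000, 10.0))       # density spread over the whole tract
print(dasymetric_density(3000, 10.0, 4.0))  # density on the land where people can live
```

Same tract, same census count, two different mapped values: that is what "measurement behind the map" means.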

 

Quantitative thematic maps and phenomenon space: The four quantitative thematic maps (proportional symbol map, dot map, choropleth map, and isoline map) can be arranged along two dimensions. One dimension is whether the phenomenon mapped is continuous or discrete. The other is how the mapped values change: abruptly or smoothly. If the value of the phenomenon can be measured at any location (however fine the resolution), it is considered continuous (e.g. temperature, air pressure, median age per census tract, population density per county). If there are points in space where the value doesn’t exist, it is considered discrete (e.g. sales revenue of each Starbucks coffee store in Seattle, the number of jobs in the U.S.). Even for a continuous phenomenon, the value can change smoothly or abruptly. For instance, tax per county is considered continuous, but its value changes abruptly at county boundaries. See the phenomenon space matrix in the lecture note.

 

How would you map the number of jobs in the U.S.?

If the value is compiled directly from the location of jobs or from relatively small units of compilation, use a dot map.

If the value is compiled from some pre-defined collection unit, use a choropleth map.

If the value is compiled from metropolitan areas, use a proportional symbol map.

 


3. Global Positioning System (GPS)

 

What do you need for making your GPS receiver work?

  • Space segment (satellites): Satellites send signals that allow the location of your GPS receiver to be determined.
  • Control segment (control stations): Receives signals from the satellites and processes/corrects the information encoded in the signals.
  • User segment (GPS receiver): The GPS receiver decodes the satellite signals, including the corrections supplied by the control stations, and gives you the location.

 

How is the location of a GPS receiver determined?

1)      Know where the satellite is: it’s encoded in the signals from the satellites (and kept up to date by the control stations).

2)      Know the distance to the satellite: it’s speed multiplied by the travel time of the signal. The speed is known because it’s the speed of a radio wave (the speed of light). The travel time is the delay between the departure time measured by the atomic clock in the satellite and the arrival time measured by the clock built into the GPS receiver; it has to be corrected somewhat because the two clocks differ in accuracy.

3)      Determine the position from the distances to three or more satellites (commonly called triangulation, although strictly it is trilateration because distances, not angles, are used).
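Steps 2 and 3 can be sketched in a few lines. This is a deliberately simplified, flat 2-D version with three transmitters at known positions and perfectly synchronized clocks; a real receiver works in 3-D and needs at least four satellites so that it can also solve for its own clock error. All function names and coordinates are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # radio waves travel at the speed of light

def range_from_travel_time(travel_time_s):
    """Step 2: distance = speed x travel time."""
    return SPEED_OF_LIGHT_M_PER_S * travel_time_s

def trilaterate_2d(p1, r1, p2, r2, p3, r3):
    """Step 3 (2-D sketch): find (x, y) from three known points and three distances.

    Subtracting the circle equations pairwise leaves two linear equations in x and y.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    c = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    d, e = 2 * (x3 - x2), 2 * (y3 - y2)
    f = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a * e - b * d
    if det == 0:
        raise ValueError("the three known points lie on one line")
    return (c * e - b * f) / det, (a * f - c * d) / det

# A signal that took 0.07 s has travelled about 21,000 km.
print(range_from_travel_time(0.07))

# Hypothetical transmitters and a receiver that is really at (3, 4).
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [math.hypot(3.0 - sx, 4.0 - sy) for sx, sy in stations]
print(trilaterate_2d(stations[0], ranges[0], stations[1], ranges[1], stations[2], ranges[2]))
```

Note how sensitive step 2 is to timing: an error of one millionth of a second in travel time shifts the range by roughly 300 m.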

 

How does Differential GPS (DGPS) improve positional accuracy?

The location determined in the way described above is compared to the actual location at a fixed receiving (base) station, whose actual location is already known from a survey. The difference between the GPS-derived location and the actual location of the station (called a range error) captures the effect of errors shared by nearby receivers, so it can be used to correct the location of a roving GPS receiver.
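The correction idea can be sketched as simple subtraction. This is a simplification that works directly on positions and assumes the base station and the rover see the same error at the same moment; operational DGPS usually corrects the measured range to each satellite rather than the final position. Coordinates are hypothetical planar values in meters.

```python
def dgps_correct(base_known, base_measured, rover_measured):
    """Remove the error observed at the base station from the rover's position."""
    error_x = base_measured[0] - base_known[0]
    error_y = base_measured[1] - base_known[1]
    return rover_measured[0] - error_x, rover_measured[1] - error_y

base_known = (1000.0, 2000.0)     # surveyed location of the base station
base_measured = (1003.0, 1998.0)  # where GPS says the base station is
rover_measured = (1503.0, 2498.0) # where GPS says the rover is

print(dgps_correct(base_known, base_measured, rover_measured))
```

The farther the rover is from the base station, the less the two receivers share the same errors, and the less the correction helps.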

 

Where is GPS used?

Navigation (e.g. collision avoidance between airplanes), surveying (determining location), car navigation systems, enhanced 911, location-based services (knowing where my friends are or where the nearest pizza stores are with a GPS-enabled PDA or cellular phone), vehicle tracking for just-in-time delivery (e.g. UPS), mapping (GPS has revolutionized map-making practices, which were traditionally based on surveying equipment or, more recently, remote sensing), and much more.

 


4. Geographic Information System (GIS)

 

GIS is an information system designed to work with geographically referenced data. Like other information systems, it is broad enough to encompass hardware, software, databases, and people. GIS provides a wide array of functions for the acquisition, manipulation, management, analysis, and display of geospatial data. The following table gives a partial picture of GIS in two dimensions, with components in the columns and functions in the rows.

 

 

              Hardware                    Software                                     Data
Acquisition   GPS receivers               Locating                                     Location file
              Remote sensing instrument   Image processing                             Image
              Scanner                     Scanning                                     Raster data
              Digitizer                   Digitizing                                   Vector data
Manipulation  Workstation                 Georeferencing                               Raster data
                                          Transformation of coordinate system          Vector data
                                          File conversion
Management    Client/Server               Spatial database management system (SDBMS)   Spatial database
                                                                                       Table data
Analysis      Desktop computer            Spatial overlay                              Layer
                                          Routing                                      Topological vector
                                          Terrain analysis                             DEM
Display       Mobile computing device     Car Navigation System                        Map
                                          Web-mapping

 

For example, suppose you start your own business to test a new business model: providing real estate services in the greater Seattle metropolitan area using GIS. Imagine what you will need for this service. Acquisition: You would have to acquire relevant spatial data with GPS receivers or remote sensing. Manipulation: You would have to transform the data into one of the grid coordinate systems (such as State Plane) and link the spatial information to non-spatial attributes (such as transaction data). Management: The data needs to be updated regularly, with proper user access controls. Analysis: You can use spatial overlay to test your hypotheses about the relationship between land value and other variables (view?) and to predict future trends. Display: You can advertise listings of apartments in selected areas with web mapping.

 


5. GIS: Representations

 

Measurement of geographic phenomenon, raster, and vector

Not all three components of geographic information {space, time, attribute} are measured at once. Rather, two components are sacrificed (one is fixed and one is controlled) so that the third can be measured. For example, to measure a geographic phenomenon like elevation, it is necessary to control space (because it is infeasible to measure elevation at every location, and it is hard to conceive of a spatial form such as geometry from elevation), while elevation is measured within each spatial unit (such as a grid cell or a tessellation unit). A continuous field is usually measured in this manner: space controlled, attribute measured, and time fixed. The raster data model can be seen as the data model that allows for measurement of continuous fields in general. On the other hand, a discrete object is usually measured in such a way that space (e.g. a park boundary) is measured while the attribute (e.g. park) is controlled and time (some point in time) is fixed. The vector data model can be thought of as the way to encode this kind of spatial information in the computer. Even though the distinction between the vector and raster data models follows from the nature of the phenomenon (discrete or continuous), this linkage can be relaxed depending on the intended use of the data (e.g. a tornado as an air mass (raster) and as a path (vector)). In general, modeling geographic reality in a computer-compatible format requires an understanding of the nature of the phenomenon, the geographic scale at which it is observed, and the potential use of the data, but it is largely guided by existing human concepts (e.g. geometry: point, line, polygon) or available tools (e.g. remote sensing).

 

 

How is attribute encoded in the computer?

It begins with storing attributes in a flat file and evolves into a database system. For example, if you start your own real estate business, transaction data (mortgages, customer information) can be stored in a file (such as a text file). Later, you will have to retrieve information from the files, and you will find it necessary to adopt a better way to manage an increasingly large data set. The bigger the data and the more users there are, the more you benefit from a database system, because it stores attributes in a consistent way. Can you imagine amazon.com functioning properly without a database system?

 

Two database models are covered in this class. Viewed from the relational database model, the world is composed of entities and relationships. Viewed from the object-oriented database model, the world is composed of autonomous objects. Relational databases (which store attributes in tables) are very common in GIS, while object-oriented databases (which store attributes in objects with embedded rules and relationships) are relatively new in commercial GIS.

 

City database in a relational database: the city is encoded as a table in which attributes are stored in columns and each entity instance (an individual city) is stored as a record (row). The table is linked to other related tables through common attributes.
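As a small illustration of the relational view, the sketch below builds a city table and a state table in an in-memory SQLite database and links them through the common attribute `state_code`. The table and column names are chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each table holds one kind of entity; columns are attributes, rows are instances.
conn.execute("CREATE TABLE state (state_code TEXT PRIMARY KEY, state_name TEXT)")
conn.execute(
    "CREATE TABLE city ("
    "city_id INTEGER PRIMARY KEY, name TEXT, "
    "state_code TEXT REFERENCES state(state_code))"
)
conn.executemany(
    "INSERT INTO state VALUES (?, ?)",
    [("WA", "Washington"), ("OR", "Oregon")],
)
conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [(1, "Seattle", "WA"), (2, "Portland", "OR"), (3, "Spokane", "WA")],
)

# The two tables are linked through the common attribute state_code.
rows = conn.execute(
    "SELECT city.name, state.state_name "
    "FROM city JOIN state ON city.state_code = state.state_code "
    "ORDER BY city.name"
).fetchall()
print(rows)
conn.close()
```

Notice that the state name is stored once, in the state table, rather than being repeated in every city record.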

City database in an object-oriented database: the city is encoded as an object whose generic properties are inherited from its superclass (e.g. spatial object). An object has not only attributes but also methods (functions) defined for it. Relationships with other objects are built on a given class hierarchy or on rules embedded in the object.
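The object-oriented view can be sketched with two classes. This uses ordinary Python classes to show inheritance and methods; it is not an object-oriented database product, and the class names, attributes, and coordinates are hypothetical.

```python
import math

class SpatialObject:
    """Superclass: anything that has a name and a location."""

    def __init__(self, name, x, y):
        self.name = name
        self.x = x
        self.y = y

    def distance_to(self, other):
        """Method shared by every spatial object (planar distance)."""
        return math.hypot(self.x - other.x, self.y - other.y)

class City(SpatialObject):
    """City inherits location and distance_to, and adds its own attribute and rule."""

    def __init__(self, name, x, y, population):
        super().__init__(name, x, y)
        if population < 0:
            # An embedded rule: the object refuses invalid attribute values.
            raise ValueError("population cannot be negative")
        self.population = population

# Hypothetical cities on a planar grid (units are arbitrary).
city_a = City("A", 0.0, 0.0, 1000)
city_b = City("B", 3.0, 4.0, 500)
print(city_a.distance_to(city_b))
```

Here the data (attributes) and the behavior (methods and rules) travel together in one object, which is the main contrast with a plain table.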

 

How are space and attribute combined?

Traditionally, spatial entities are stored in files (e.g. a list of point locations in an ASCII file) because locational information is not easy to store in a predictable field type such as a number or a character string. A hybrid system is a GIS architecture in which spatial entities are stored in files, the attributes attached to them are stored in tables, and the two are linked through common identifiers. An integrated system is a GIS architecture in which all information (whether space or attribute) is stored in one table instead of separating space from attribute. This is the spatial DBMS approach to GIS. An object-oriented system represents geographic phenomena using the object-oriented concepts explained above.
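The hybrid architecture boils down to a join on a common identifier. In the sketch below, an in-memory text stream stands in for the ASCII geometry file and a dictionary stands in for the attribute table in the DBMS; the file layout, field names, and values are all invented for illustration.

```python
import io

# Stand-in for an ASCII geometry file: one "id,x,y" line per point feature.
geometry_file = io.StringIO("101,1200.0,3400.0\n102,1250.5,3390.2\n")

# Stand-in for the attribute table, keyed by the same identifier.
attribute_table = {
    101: {"name": "Parcel A", "land_use": "residential"},
    102: {"name": "Parcel B", "land_use": "commercial"},
}

def load_geometry(handle):
    """Read point locations from the geometry file into {id: (x, y)}."""
    points = {}
    for line in handle:
        feature_id, x, y = line.strip().split(",")
        points[int(feature_id)] = (float(x), float(y))
    return points

def join_features(points, attributes):
    """Link space and attribute through the common identifier."""
    return {
        feature_id: {"location": points[feature_id], **attributes[feature_id]}
        for feature_id in points
        if feature_id in attributes
    }

features = join_features(load_geometry(geometry_file), attribute_table)
print(features[101])
```

In an integrated system the location would instead sit in a column of the same table as the name and land use.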

 


6. GIS: Analytics

 

Here we discuss the kinds of operations used in GIS. It is convenient to think of an operation as a function that transforms input values into output values. For example, the attribute operation of addition (say 1 + 1) adds two input values to yield the output value 2. Spatial operations differ from attribute operations in that their operands are spatial entities, which have complex dimensionality (e.g. the number 1 has no dimensionality, but a road segment such as 45th Street has location, length, and direction). It is not surprising that spatial operations are not easily implemented in a commercial DBMS.

 

Operations performed on spatial entities are classified into field operations and object operations. A (discrete) object has geometric dimensionality, but a (continuous) field does not. Field operations are therefore based on proximity only: local, focal, and zonal operators compute the output value for a pixel from, respectively, the value at the input pixel itself, the values in the neighborhood of the input pixel, and the values in the predefined zone to which the input pixel belongs. Object operations can measure the geometric properties of a single spatial entity (e.g. the length of a road, the area of a land parcel) or the spatial relationships between spatial entities (e.g. the distance between the airport and my house, or the shopping mall being adjacent to my house).
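The three field operators are easiest to tell apart on a tiny raster. The sketch below uses plain nested lists; the grid values, the zone labels, and the choice of a mean over a 3 x 3 window are assumptions made for illustration. For brevity the zonal operator returns one value per zone instead of writing that value back into every pixel of the zone.

```python
elevation = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zones = [
    ["a", "a", "b"],
    ["a", "b", "b"],
    ["c", "c", "c"],
]

def local_op(grid, func):
    """Local: each output pixel depends only on the same input pixel."""
    return [[func(value) for value in row] for row in grid]

def focal_mean(grid):
    """Focal: each output pixel is the mean of its 3 x 3 neighborhood (clipped at edges)."""
    n_rows, n_cols = len(grid), len(grid[0])
    output = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        output.append(out_row)
    return output

def zonal_mean(grid, zone_grid):
    """Zonal: one mean per zone, computed from all pixels that belong to that zone."""
    totals = {}
    for row, zone_row in zip(grid, zone_grid):
        for value, zone in zip(row, zone_row):
            running_sum, count = totals.get(zone, (0, 0))
            totals[zone] = (running_sum + value, count + 1)
    return {zone: s / n for zone, (s, n) in totals.items()}

print(local_op(elevation, lambda v: v * 2))
print(focal_mean(elevation))
print(zonal_mean(elevation, zones))
```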

 

Spatial overlay is a powerful GIS tool. An easy way to picture how spatial overlay works is to imagine plastic sheets superimposed on one another using their common spatial framework (e.g. a coordinate system). Because the layers share locations, it is possible to determine relationships between them (e.g. between breast cancer rate and a toxic environment). Many geographic questions can be answered through these procedures, which let you manipulate geographic themes, unlike reality, where you can only guess. Overlay is widely used in site selection scenarios such as the one illustrated below, where each criterion can be thought of as a cookie cutter.
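The cookie-cutter logic can also be sketched in code. The following is a minimal raster version, assuming every criterion has already been turned into a true/false layer on the same grid; the three criteria and their cell values are invented for illustration.

```python
# Each layer answers one yes/no question for every cell of the same grid.
near_road = [
    [True, True, False],
    [True, False, False],
    [True, True, True],
]
gentle_slope = [
    [True, False, False],
    [True, True, False],
    [False, True, True],
]
outside_floodplain = [
    [True, True, True],
    [True, True, True],
    [False, False, True],
]

def overlay_and(*layers):
    """A cell is suitable only if it passes every criterion (logical AND overlay)."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    return [
        [all(layer[r][c] for layer in layers) for c in range(n_cols)]
        for r in range(n_rows)
    ]

suitable = overlay_and(near_road, gentle_slope, outside_floodplain)
for row in suitable:
    print(row)
```

Each added layer can only cut cells away, never add them back, which is why the list of candidate sites shrinks with every criterion.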

 

 


7. Spatial Data Quality

 

One important question to ask yourself when using maps and GIS is how reliable the data is. Spatially enabled decisions critical to our future (sustainability, homeland security) are made on the basis of analysis, and that analysis is performed on spatial data. What if such important decisions are made on inaccurate data? Who should take responsibility? Turn on a handheld GPS receiver. The first message you will probably see is “data provided in this receiver is for reference only, and should not be used for navigational purposes”. So if you end up drowning in a lake that doesn’t exist in the GPS receiver, who would you complain to? There must be some way to assess spatial data quality to avoid (or warn about) such things.

 

Data is said to be accurate if it is sufficiently close to the true value. Your multiple-choice answer is accurate if you pick the right answer, provided that answer is proven to be true. Positional accuracy is higher when the measured position is nearer the true position (e.g. there is little difference between the location reported by a roving GPS receiver and the true location determined by better surveying equipment, as at a benchmark).
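One common way to put a number on positional accuracy is the root-mean-square error (RMSE) between measured positions and reference ("true") positions. The sketch below assumes planar coordinates in the same units for both lists; the sample points are invented.

```python
import math

def positional_rmse(measured, reference):
    """Root-mean-square of the distances between measured and reference points."""
    if not measured or len(measured) != len(reference):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

gps_points = [(3.0, 4.0), (6.0, 8.0)]        # hypothetical GPS readings
benchmark_points = [(0.0, 0.0), (3.0, 4.0)]  # hypothetical surveyed positions
print(positional_rmse(gps_points, benchmark_points))
```

The smaller the RMSE, the higher the positional accuracy.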

 

Data is said to be consistent if it does not contradict internal rules. The world has so many rules that you can’t possibly list them all. For example, we know that Julie, as a spatial object, can exist at only one location at a given time, not at two locations at the same time. It would not be consistent if county information were stored in the State column of your table. A lake is supposed to be a closed geometry, not an unclosed string of points in the computer.
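Rules like these can be written down as checks. The two below are a sketch of the lake and State-column examples; the rule that a ring needs at least four points with the first repeated at the end is a common convention assumed here, and the list of valid state codes is a hypothetical subset.

```python
VALID_STATE_CODES = {"WA", "OR", "ID"}  # hypothetical domain of the State column

def is_closed_ring(coords):
    """A polygon boundary is closed if it ends where it starts (needs >= 4 points)."""
    return len(coords) >= 4 and coords[0] == coords[-1]

def state_column_is_consistent(records):
    """Every value in the State column must come from the valid domain."""
    return all(record["state"] in VALID_STATE_CODES for record in records)

lake = [(0, 0), (4, 0), (4, 3), (0, 0)]
broken_lake = [(0, 0), (4, 0), (4, 3)]
table = [{"city": "Seattle", "state": "WA"}, {"city": "Bellevue", "state": "King"}]

print(is_closed_ring(lake), is_closed_ring(broken_lake))
print(state_column_is_consistent(table))  # "King" is a county, not a state
```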

 

Data is said to be complete if nothing is missing. If my grading Excel file is missing one of the students, the data is incomplete. If a UW campus map doesn’t contain Drumheller Fountain, you will say that the map is incomplete.
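Checking completeness amounts to comparing what you have against what you should have. A minimal sketch of the gradebook example follows; the student names and scores are made up.

```python
def missing_items(expected, recorded):
    """Return the expected items that do not appear in the recorded data."""
    return sorted(set(expected) - set(recorded))

class_roster = ["Ana", "Ben", "Chloe"]   # everyone who should be graded
gradebook = {"Ana": 91, "Chloe": 78}     # what the grading file contains

print(missing_items(class_roster, gradebook))
```

The catch is that you need an independent list of what should be there (the roster); the data set alone cannot tell you what it is missing.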

 

Accuracy, (logical) consistency, and completeness are generally agreed to be components of spatial data quality. When you have to determine how good a data set is, look at it in these three respects. Doing so will help you make a systematic assessment of the data.

 


8. Maps and GIS go online

 

Internet GIS (or, more broadly, distributed GIS) can be categorized by where the three elements of an application reside. Applications have presentation, logic, and data. The presentation element can be thought of as the window into the program (the user interface, i.e. where you click), and the logic is the functionality (displaying maps, performing spatial overlay, i.e. what you do). A standalone GIS keeps all three elements on one computer. A distributed GIS spreads the three elements across separate computers through networking. For example, American FactFinder, which shows census maps, accesses the data on the server (data), processes the data on the server (logic), and displays the results in the web browser on your computer (presentation).

 

Categories of web mapping: If your computer (as a sort of terminal) does little work (only displaying the map in a web browser), it is said to be a thin client. If your computer carries a heavy load (not only displaying maps but also performing operations), it is said to be a thick client. A more evolved form of distributed GIS uses distributed component technology designed to work with any platform and data (i.e. interoperability).

 

Distributing data online: Data can be distributed on the internet at different levels, ranging from object level to collection level. Object level: you can download data directly if you know where it is (e.g. National Atlas of the U.S.). Collection level: you can be directed to a point of contact, or to the data itself if you’re lucky, through a query on a portal website (e.g. Geospatial One-Stop). A critical element of distributing data online is metadata, that is, data about data. Metadata describes the content of the data. Most web data services are built on metadata linked to data: the service does not read information from the data directly but through the metadata. Getting the right data is critical to the success of your project. Visit the USGS website to examine the available public data (private data are often derived from these).
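To picture how a portal answers a query through metadata rather than through the data itself, here is a toy catalog. The records, field names, and search function are all hypothetical and far simpler than a real metadata standard; the point is only that the search touches the descriptions and never opens the data sets.

```python
# Hypothetical metadata records: each describes a data set without containing it.
catalog = [
    {"title": "County boundaries", "keywords": {"boundary", "county"},
     "data_model": "vector", "access": "download"},
    {"title": "Elevation model", "keywords": {"elevation", "terrain"},
     "data_model": "raster", "access": "download"},
    {"title": "Parcel records", "keywords": {"parcel", "land value"},
     "data_model": "vector", "access": "contact the data custodian"},
]

def search_catalog(records, keyword):
    """Return (title, access) for every record whose metadata lists the keyword."""
    keyword = keyword.lower()
    return [
        (record["title"], record["access"])
        for record in records
        if keyword in record["keywords"]
    ]

print(search_catalog(catalog, "Elevation"))
print(search_catalog(catalog, "parcel"))
```

The second result shows the collection-level case: the query leads you to a point of contact rather than to the data itself.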

 


Final exam will take place at MGH 241 on Tuesday (3/14) 8:30-9:20 AM