Lecture 7: Understanding raster data and cell-based modeling
(Modified from Chapters 4, 5, & 7 in “Using ArcGIS Spatial Analyst”)
1. Understanding a raster dataset
2. Coordinate space and the raster dataset
3. Discrete and continuous data
4. The resolution of a raster dataset
5. Raster encoding
6. Representing features in a raster dataset
7. Assigning attributes to a raster dataset
8. Using feature data directly in Spatial Analyst
9. Deriving raster datasets from existing maps
Lecture Objectives:
· About the structure of raster datasets
· The importance of coordinate space and raster datasets
· The difference between discrete and continuous types of raster datasets
· About the resolution or cell size when creating a raster dataset
· How raster datasets are encoded and how points, lines, and polygons are represented as cells
· Other issues you need to be aware of, such as when adding other attributes to raster datasets and creating raster datasets from existing maps
1. Understanding
a raster dataset
·
Raster
data is generally divided into two categories: thematic data and image data.
·
The
values in thematic raster data represent some measured quantity or
classification of a particular phenomenon, such as
elevation, pollution concentration, or population.
·
For example,
in a landcover map the value 5 may represent forest,
and the value 7 may represent water.
·
The
values of cells in an image represent reflected or emitted light or energy such
as that of a satellite image or a scanned photograph.
·
The
analysis tools of Spatial Analyst are primarily intended for use on thematic
raster data.
·
ERDAS
Imagine will be used in the next section of the course to analyze satellite
imagery.
·
All
Spatial Analyst functions process the first band of any raster dataset. This
section will provide an overview of raster data and
how it is created.

The composition of a
raster dataset
·
Raster
datasets describe the location and characteristics of an area.
·
Raster
datasets describe a single theme (landuse, soils,
roads, elevation, etc.) and provide complete coverage of an area through cells.

The cell
· Cells, or
pixels, make up a raster dataset
· All cells in a
raster must be the same size
· Cells can be any
size you desire

Rows and
columns
· Cells are arranged
in rows and columns, where the rows run parallel to the x-axis of a Cartesian plane and
the columns run parallel to the y-axis
·
Each
cell has a unique row and column address

Values
·
Each
cell has a specific value assigned to it
·
Values
can represent magnitude, distance, or relationship of the cell on a continuous
surface (e.g., elevation, soil, aspect)
·
Values
can also represent categorical data such as soil type, soil texture, or landuse class
·
Both integer
and floating-point values are supported in Spatial Analyst.
o
Integer
values are best used to represent categorical data, and floating-point
values to represent continuous surfaces (elevation, slope, flow
accumulation).
o
Floating-point
data generally takes up more disk space than integer data and is more difficult to handle.
o
An 8-bit
integer can store 256 distinct values, so it often makes sense to rescale
floating-point values into integer values.
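The rescaling mentioned above can be sketched as a simple linear stretch. This is a minimal illustration in plain Python, not the Spatial Analyst implementation; the function name `rescale_to_uint8` and the sample elevations are assumptions made for the example.

```python
def rescale_to_uint8(values):
    """Linearly stretch floating-point values onto the integers 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]  # constant input: nothing to stretch
    scale = 255.0 / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]

# Hypothetical elevation samples in meters (not from the lecture data).
elevations = [102.4, 250.0, 397.6, 176.2]
codes = rescale_to_uint8(elevations)
print(codes)
```

Each integer step then stands for (max - min) / 255 of the original units, so the saving in storage is paid for with a loss of precision.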

Zones
· Any two or more cells with the same value
belong to the same zone.
· A zone can consist of cells that are
connected, disconnected, or both.
· Zones, therefore, are NOT analogous to
polygons, which are only contiguous closed areas.
· Zones whose cells are connected usually
represent single features of an area, such as a building, a lake, a road, or a
power line.
· Assemblages of entities, such as forest
stands in a state, soil types in a county, or the single-family houses in a
town, are features of an area that will most likely be represented by zones
made up of many disconnected groups of connected cells.
· Every cell in a raster belongs to a zone.
· Some raster datasets contain only a few
zones, while others contain many.

Regions
·
Each
group of connected cells in a zone is considered a region.
·
A zone
that consists of a single group of connected cells has only one region.
·
Zones
can be composed of as many regions as are necessary to represent a feature; the
number of cells that make up a region has no practical limits.
·
Spatial
Analyst provides the tools needed to turn regions into individual zones.
·
In
the raster dataset above, Zone 2 consists of two regions, Zone 4 of three
regions, and Zone 5 of only one region.
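The distinction between zones and regions can be sketched with a small flood fill. The grid below is a made-up example (it is not the figure referred to above), and the sketch assumes that cells are connected only through shared edges (4-connectivity); counting diagonal neighbors as connected would be an equally valid choice.

```python
from collections import deque

def count_regions(grid):
    """Return {zone value: number of regions}, using 4-connected cells."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = {}
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            zone = grid[r][c]
            regions[zone] = regions.get(zone, 0) + 1  # a new region starts here
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # flood-fill the rest of this region
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == zone):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions

# Hypothetical raster with three zones (values 1, 2, and 3).
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [2, 3, 1, 1],
    [2, 2, 1, 1],
]
print(count_regions(grid))
```

In this grid, zones 1 and 2 each consist of two regions, while zone 3 is a single region.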
NoData
·
If a
cell is assigned the NoData value, then either no
information or insufficient information about the particular characteristics of
the location the cell represents is available.
·
The NoData value, sometimes also referred to as the null value,
is treated differently from any other value by all operators and functions.

·
Cells
with NoData values are processed in one of two ways:
1.
Assigning
NoData to the output cell location if the NoData value exists:
o
for
the location on any of the inputs in an operator or local function,
o
in
its neighborhood in a focal function,
o
or in its
zone in a zonal function.
2.
Ignoring
the NoData cell and completing the calculations with
all valid values.
·
The
second option, to ignore the NoData cell, is not
possible when using operators between two datasets or with local functions.
·
When
a NoData cell is within the neighborhood of a cell in
a focal function or a zone of a zonal function, by default, the sum, median,
variety, majority, or minority of all cells with known values can be calculated
and assigned to the output raster dataset (this default can be
overridden).
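The two ways of treating NoData can be sketched as follows. This is an illustration in plain Python, with `None` standing in for NoData; the function names and the 3 x 3 neighborhood are assumptions for the example rather than Spatial Analyst calls.

```python
NODATA = None  # stand-in for the NoData value in this sketch

def local_add(a, b):
    """Cell-by-cell sum; NoData on either input gives NoData on the output."""
    return [[NODATA if x is NODATA or y is NODATA else x + y
             for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def focal_sum(grid, r, c, ignore_nodata=True):
    """Sum of the 3 x 3 neighborhood centered on cell (r, c)."""
    window = [grid[i][j]
              for i in range(max(r - 1, 0), min(r + 2, len(grid)))
              for j in range(max(c - 1, 0), min(c + 2, len(grid[0])))]
    if not ignore_nodata and any(v is NODATA for v in window):
        return NODATA  # option 1: any NoData in the neighborhood wins
    valid = [v for v in window if v is not NODATA]
    return sum(valid) if valid else NODATA  # option 2: use the known values

a = [[1, NODATA], [3, 4]]
b = [[10, 20], [30, 40]]
print(local_add(a, b))
print(focal_sum(a, 0, 0), focal_sum(a, 0, 0, ignore_nodata=False))
```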
The
associated table
·
Integer
(categorical) raster datasets usually have an attribute table associated with them (VAT = value
attribute table).
·
The
first item in the table is Value, which stores the value assigned to
each zone of a raster.
·
A second
item, Count, stores the total number of cells in the dataset that belong to
each zone. Both Value and Count are mandatory items.
· An essentially limitless number of
optional items can be incorporated into the table to represent the other
attributes of the zone.
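A value attribute table can be sketched as a count of cells per value, with optional items added afterward. The raster below reuses the landcover codes from the earlier example (5 = forest, 7 = water); the function name `build_vat` and the `Landcover` item are assumptions for illustration.

```python
from collections import Counter

def build_vat(grid, nodata=None):
    """Return VAT rows (Value, Count) for an integer raster, skipping NoData."""
    counts = Counter(v for row in grid for v in row if v != nodata)
    return [{"Value": value, "Count": count}
            for value, count in sorted(counts.items())]

landcover = [
    [5, 5, 7],
    [5, 7, 7],
    [5, 5, None],  # None stands in for a NoData cell
]
vat = build_vat(landcover)

# Add an optional item describing each zone.
names = {5: "forest", 7: "water"}
for row in vat:
    row["Landcover"] = names[row["Value"]]
print(vat)
```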

·
Converting
a raster dataset from a nonreal-world coordinate
system (image space) to a real-world coordinate system is called georeferencing.
·
For a
raster dataset, the orientation of the cells is determined by the x- and y-axes
of the coordinate system.
·
Cell boundaries
are parallel to the x- and y-axis, and the cells are square in map coordinates.
·
Cells
are always referenced by an (x,y)
location in map coordinate space and never by specifying a row and column
location.
·
The x,y Cartesian coordinate system
associated with a raster dataset that is in a real-world coordinate space is
defined with respect to a map projection.
·
Map
projections transform the three-dimensional surface of the earth to allow the
raster to be displayed and stored as a two-dimensional map.
·
The
process of rectifying a raster dataset to map coordinates or converting a
raster dataset from one projection to another is referred to as geometric
transformation.
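For a georeferenced raster whose cell boundaries are parallel to the axes, the link between a map (x, y) location and the cell that contains it can be sketched as below. The sketch assumes a north-up raster described by its upper-left corner (`x_min`, `y_max`) and a square cell size; these parameter names and the sample coordinates are assumptions for illustration.

```python
import math

def xy_to_rowcol(x, y, x_min, y_max, cell_size):
    """Row and column of the cell containing map location (x, y)."""
    col = int(math.floor((x - x_min) / cell_size))
    row = int(math.floor((y_max - y) / cell_size))  # rows count downward from the top
    return row, col

def cell_center(row, col, x_min, y_max, cell_size):
    """Map coordinates of the center of cell (row, col)."""
    return (x_min + (col + 0.5) * cell_size,
            y_max - (row + 0.5) * cell_size)

# Hypothetical raster: upper-left corner at (500000, 4200000), 30 m cells.
print(xy_to_rowcol(500045.0, 4199980.0, 500000.0, 4200000.0, 30.0))
print(cell_center(0, 1, 500000.0, 4200000.0, 30.0))
```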
2. Coordinate
space and the raster dataset

Georeferencing a raster dataset
·
To georeference a raster dataset from image space to a
real-world coordinate system, you need to know the location of recognizable
features in both coordinate spaces.
Polynomial
transformation
·
A
polynomial transformation computed using the specified control points is
applied so the input locations approximate the specified output locations using
a least-squares fit.
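The idea of a least-squares polynomial fit can be sketched for the simplest case, a first-order polynomial of the form x' = a0 + a1*x + a2*y (with a second fit of the same form for y'). This is a plain-Python illustration using the normal equations, not the algorithm used by any particular GIS package; the control points are invented, and the sketch assumes they are not all on one line.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit_first_order(src, dst):
    """Least-squares (a0, a1, a2) such that dst is close to a0 + a1*x + a2*y."""
    basis = [(1.0, x, y) for x, y in src]
    ata = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    atb = [sum(b[i] * d for b, d in zip(basis, dst)) for i in range(3)]
    return solve(ata, atb)

# Hypothetical control points: image (column, row) and matching map x and y.
image_pts = [(0, 0), (100, 0), (0, 100), (100, 100)]
map_x = [500000.0, 503000.0, 500000.0, 503000.0]
map_y = [4200000.0, 4200000.0, 4197000.0, 4197000.0]
ax = fit_first_order(image_pts, map_x)
ay = fit_first_order(image_pts, map_y)
print(ax, ay)
```

With these control points the fit recovers a 30-unit cell size: map x grows by about 30 per column and map y shrinks by about 30 per row.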
Projecting
raster datasets
·
The
cells of a raster dataset will always be square and of equal area with respect
to the Cartesian coordinate system (map coordinate space) associated with the
raster dataset.
·
The
shape and area a cell represents on the surface of the earth will never be
constant across a raster dataset.
·
Since
the area represented (on the face of the earth) by the cells will vary across
the raster dataset, the output cell size and the number of rows and columns may
change when projected. Converting from one projection to another can also
change the shape and area a cell represents on the surface of the earth.
·
Each
projection treats the relationship between a three-dimensional world and a
two-dimensional one differently.
·
You
should be aware of the properties and assumptions for each projection before
selecting one. When displaying and performing analysis with raster datasets,
they should be in the same coordinate space and in the same projection.
·
If
two raster datasets are in different coordinate systems, the values of the
coordinates are on different scales.
·
Errors
will occur when comparing such datasets because they will represent different
locations.
Geometric
transformation
·
When
you rectify a raster dataset, project it, convert the raster dataset from one
projection to another, or change the cell size, you are performing a geometric
transformation.
·
Geometric
transformation is the process of changing the geometry of a raster dataset from
one coordinate space to another.
·
Because
the cell centers of the output cells do not match the cell centers of the input
raster, a resampling technique must be used to derive
a value for each cell center on the output raster.
·
Resampling is the process of determining new values for cells in an output
raster that result from applying a geometric transformation to an input raster
dataset.
·
The three
techniques for determining output values are nearest neighbor assignment,
bilinear interpolation, and cubic convolution.
Nearest
neighbor assignment
·
Nearest
neighbor assignment determines the location of the closest cell center on the
input raster and assigns the value of that cell to the cell on the output
raster.
·
Always
used for categorical data (nominal or ordinal) – why?
Bilinear
interpolation
·
Bilinear
interpolation uses the value of the four nearest input cell centers to determine
the value on the output raster.
·
The
new value for the output cell is a weighted average of these four values,
adjusted to account for their distance from the center of the output cell in
the input raster.
·
This
interpolation method results in a smoother-looking surface than can be obtained
using nearest neighbor.
·
Preferred
for continuous data (elevation, slope, salinity, etc.)
Cubic
convolution
· The weighted average is calculated from
the 16 nearest input cell centers and their values.
· Cubic convolution will have a tendency to
sharpen the data more than bilinear interpolation since more cells are involved
in the calculation of the output value.
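The difference between nearest neighbor assignment and bilinear interpolation can be sketched in plain Python. In this illustration, input cell centers sit at integer (row, column) positions and the output cell center is given as a fractional position in that same index space; the function names and the tiny landcover grid are assumptions for the example, and the input needs at least two rows and two columns.

```python
import math

def nearest_neighbor(grid, r, c):
    """Value of the input cell whose center is closest to position (r, c)."""
    rows, cols = len(grid), len(grid[0])
    i = min(max(int(math.floor(r + 0.5)), 0), rows - 1)
    j = min(max(int(math.floor(c + 0.5)), 0), cols - 1)
    return grid[i][j]

def bilinear(grid, r, c):
    """Distance-weighted average of the four surrounding input cell centers."""
    rows, cols = len(grid), len(grid[0])
    r = min(max(r, 0.0), rows - 1.0)
    c = min(max(c, 0.0), cols - 1.0)
    r0 = min(int(math.floor(r)), rows - 2)
    c0 = min(int(math.floor(c)), cols - 2)
    fr, fc = r - r0, c - c0  # fractional offsets inside the 2 x 2 block
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr

# Categorical raster: 5 = forest, 7 = water.
landcover = [[5, 5], [7, 7]]
print(nearest_neighbor(landcover, 0.25, 0.5))
print(bilinear(landcover, 0.25, 0.5))
```

Nearest neighbor returns one of the existing codes, whereas bilinear interpolation returns a value between 5 and 7 that is not a landcover class at all, which suggests why nearest neighbor is the method used for categorical data.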
3. Discrete and
continuous data
·
It is
important to understand the type of data you are modeling, whether it is
continuous or discrete, when making decisions based on the resulting values.
·
Discrete
data, which is sometimes called categorical or discontinuous data, mainly
represents objects in both the feature and raster data storage systems.

·
A
discrete object has known and definable boundaries (e.g.,
a lake, a building).
·
A
continuous surface represents phenomena where each location on the surface is a
measure of the concentration level or its relationship from a fixed point in
space or from an emitting source.
·
Continuous
data is also referred to as field, nondiscrete, or
surface data.
·
One
type of continuous surface is derived from those characteristics that define a
surface, where each location is measured from a fixed registration point.
o
elevation
o
aspect

·
Another
type of continuous surface includes phenomena that progressively vary as they
move across a surface from a source.
o
fluid and
air movement.
o
characterized by the type or manner in which the phenomenon moves.
o
diffusion
or any other locomotion where the phenomenon moves from areas with high
concentration to those with less concentration, until the concentration level
evens out.
o
salt
concentration moving through either the ground or water, contamination level
moving away from a hazardous spill or a nuclear reactor, and heat from a forest
fire. In this type of continuous surface, there has to be a source.
o
concentration is always greater near the source, and diminishes as a function
of distance and the medium the substance is moving through.

·
The
means of locomotion affect the surface concentration.
·
Other
locomotion surfaces include dispersal of animal populations and the spreading
of a disease.
·
When
representing and modeling many features, the boundaries are not clearly continuous
or discrete.
·
A
continuum is created in representing geographic features, with the extremes
being pure discrete and pure continuous features.
·
Most
features fall somewhere between the extremes. Illustrations of features that
fall along the continuum are soil types, edges of forests, boundaries of
wetlands, and geographic markets influenced by a television advertising
campaign.
·
The
determining factor for where a feature falls on the continuous-to-discrete
spectrum is the ease in defining the feature’s boundaries.
·
No
matter where on the continuum the feature falls, the grid-cell storage can
represent it to a greater or lesser accuracy.
·
The
validity and accuracy of boundaries of the input data must be understood.
4. The
resolution of a raster dataset
·
The
size chosen for a raster cell of a study area depends on the data resolution
required for the most detailed analysis.
·
The
cell must be small enough to capture the required detail, but large enough so that
computer storage and analysis can be performed efficiently.
·
The
more homogeneous an area is for critical variables such as topography and landuse, the larger the cell size can be without affecting
accuracy.
·
Before
specifying the cell size, the following factors should be considered:
·
The
resolution of the input data
·
The
size of the resultant database and disk capacity
·
The
desired response time
·
The
application and analysis that is to be performed
·
A
cell size finer than the input resolution will not produce more accurate data
than the input data.
·
It is
generally accepted that the resultant raster dataset should have the same
resolution as the input data or a coarser one.
·
Spatial
Analyst allows for raster datasets of different resolutions to be stored and
analyzed together in the same database.


5. Raster
encoding
·
The
process of creating a raster dataset is like draping a fishnet containing
square cells over the study area.
·
A
code is assigned to each cell according to the feature that is at the center of
the cell.
·
The
code or value of a cell is a numeric value that corresponds to an attribute
type.
·
Numeric
values speed processing and allow for data compression.
·
Each
cell represents a specified portion of the world.
·
The main
consideration is that the size be appropriate for the analysis.
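The fishnet idea, in which each cell takes the code of the feature found at its center, can be sketched as follows. The map function `landcover_at` and its rectangular lake are hypothetical, and the raster is assumed to have its lower-left corner at the origin.

```python
def encode_by_center(n_rows, n_cols, cell_size, code_at):
    """Build a raster by sampling code_at(x, y) at every cell center."""
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size  # row 0 is the top row
        grid.append([code_at((col + 0.5) * cell_size, y) for col in range(n_cols)])
    return grid

def landcover_at(x, y):
    # Hypothetical map: a small rectangular lake (7) surrounded by forest (5).
    return 7 if 12 <= x <= 28 and 12 <= y <= 18 else 5

fine = encode_by_center(4, 4, 10.0, landcover_at)    # 10-unit cells
coarse = encode_by_center(2, 2, 20.0, landcover_at)  # 20-unit cells, same extent
for line in fine:
    print(line)
print(coarse)
```

With 10-unit cells the lake is captured by two cells; with 20-unit cells no cell center falls inside it and the lake disappears, which is one way to see why the cell size must suit the analysis.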
6. Representing
features in a raster dataset
When converting
points, polylines, and polygons to a raster, you
should be aware of how the raster dataset will represent the features.
Linear data

·
Linear
data is all of those features that, at a certain resolution, appear only as a polyline such as a road, a stream, or a power line.
·
A
line by definition does not have area.
·
In Spatial
Analyst, a polyline can be represented only by a
series of connected cells.
·
As
with a point, the accuracy of the representation will vary according to the
scale of the data and the resolution of the raster dataset.
Polygon data

·
Polygonal
or areal data is best represented by a series of connected cells that best
portrays its shape.
·
Trying
to represent the smooth boundaries of a polygon with a series of square cells
does present some problems, the most infamous of which is called the jaggies, an effect that resembles stair steps.
Point data

·
A
point feature is any object at a given resolution that can be identified
as being without area. Although a well, a telephone pole, or the location of an
endangered plant are all features that can be rendered as points at some
resolutions, at other resolutions they do in fact have area.
·
Point
features are represented by the smallest unit of a raster, a cell. It is
important to remember that a cell has area as a property.
·
The smaller
the cell size, the smaller the area and thus the closer the representation
of the point feature.
·
Points
with area have an accuracy of plus or minus half the cell size.
·
This
is the trade-off that must be made when working with a cell-based system. Having
all data types (points, polylines, and polygons) in
the same format and being able to use them interchangeably in the same language
are more important to many users than a loss of accuracy.
·
The
accuracy of the above representation is dependent on:
o
the
scale of the data
o
the size
of the cell.
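The half-cell accuracy of a rasterized point can be sketched by snapping a point to the center of the cell that contains it. The well location and the grid origin at (0, 0) are assumptions for the example.

```python
import math

def snap_to_cell_center(x, y, cell_size):
    """Center of the cell (on a grid anchored at the origin) that contains (x, y)."""
    cx = (math.floor(x / cell_size) + 0.5) * cell_size
    cy = (math.floor(y / cell_size) + 0.5) * cell_size
    return cx, cy

well = (1234.0, 5678.0)  # hypothetical well location in map units
for cell_size in (30.0, 10.0):
    cx, cy = snap_to_cell_center(well[0], well[1], cell_size)
    # The offset along each axis can never exceed half the cell size.
    print(cell_size, (cx, cy), abs(cx - well[0]), abs(cy - well[1]))
```

Shrinking the cell from 30 to 10 units reduces the largest possible offset along each axis from 15 to 5 units.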
7. Assigning attributes to a raster dataset

·
The
value associated with a cell is an identifier that defines to which class,
group, category, or member the cell belongs.
·
There
is usually a one-to-many relationship between the cell values (or codes) and
the number of cells that are assigned the code.
·
The
field you use in the conversion process will affect the analysis that you can
perform on the dataset.
·
If
you have a polygon feature dataset that contains landuse
type and owner for each parcel in a town, you can use either attribute.
·
If
you use landuse type you will be able to ask
questions such as “Where are all the agricultural
areas that are available for building?”
·
However,
you cannot add (join) the owner attribute to the raster dataset because this is
a many-to-one relationship.
·
You
can also relate the landuse type to the relational
table because each parcel will have one landuse type.
·
This
logic breaks down when one person owns multiple parcels with different landuse types.
· In this case, you may have to use
parcel-ID or some other unique feature when converting.
· Usually with continuous data, each cell
has a unique value and will have no attribute table, so there will be no
attributes to relate.
· In this case, the many-to-one issue will
not be applicable.
·
When
in doubt, the most detailed breakdown should be used.
· Information can be grouped much more easily
than it can be split.
8. Using feature
data directly in Spatial Analyst
·
Several
of the Spatial Analyst dialog boxes allow you to enter a point, polyline, or polygon feature directly into the function.
·
There
are two ways that features are handled in Spatial Analyst.
·
It
either processes the feature data directly, or it converts it to a raster and
then processes it.
·
The
output resolution can be a specific cell size or the maximum or minimum cell
size of the other raster datasets input to the function.
1.
The
default is the cell size of the coarsest raster dataset input to the function.
·
You
will know if the input can be either a feature or a raster dataset because when
you open the browser to enter the input, the browser will say Raster datasets and feature classes in the Show of
type input field, and both feature and raster data will be displayed in the
browser.
·
If
only rasters are allowed as input, then the browser
will say Raster Datasets in the Show of
type input field, and only raster datasets will be displayed.
· Some browsers will allow the input of both
feature and raster data.

9. Deriving
raster datasets from existing maps
When creating
raster datasets from existing maps (data entry), several factors must be
considered to allow for full utilization of the input data.
Selecting
maps
When selecting maps
for the creation of the database, you must be aware of:
·
The
age and date of the map
·
The
cartographic accuracy
·
The
resolution and detail
·
The
compatibility of the map with other input maps
Potential
errors
Even with
current maps that are accurate, at the same resolution, with the desired amount
of detail, and compatible, errors can still occur. Some of the most common
errors include:
·
Drafting
errors
·
Different
cartographic projections used to draft the original data
·
Different
photographic projections used to draft the original data
·
Physical
changes in the materials used for the maps (shrinking or swelling)
Understanding cell-based modeling (Chapter 5)
1. Understanding analysis in Spatial Analyst
2. The operators and functions of Spatial Analyst
3. NoData and how it affects analysis
4. Values and what they represent
5. The analysis environment
6. The cell size and analysis
7. Handling projections during analysis
1.
Understanding analysis in Spatial Analyst
·
The
easiest way to understand cell-based modeling is from the perspective of an
individual cell (the worm’s-eye approach) as opposed to from the entire raster
(the bird’s-eye approach).
·
To do
so, think of yourself as a cell in a raster dataset.
·
You
represent a location, and you have a value.
·
All
Spatial Analyst operators and functions will ask you to change your value
(or keep it the same) based on a set of rules.
·
For
you to calculate an output value for your location using any Spatial Analyst
operation or function, there are three things you need to know:
1.
your
value.
2.
the
manipulation of the operator or function.
3.
which
other cell locations and their values need to be included in your calculations.
2. The operators
and functions of Spatial Analyst
The functions
associated with raster-cell cartographic modeling can be divided into five
types:
A.
Those
that work on single cells (local functions)
B.
Those
that work on cells within a neighborhood (focal functions)
C.
Those
that work on cells within zones (zonal functions)
D.
Those
that work on all cells within the raster (global functions)
E.
Those
that, when combined in a series, perform a specific application (application
functions)
A.
Local functions
·
Local,
or per-cell, functions compute an output raster dataset where the output value
at each location is a function of the value associated with that location on
one or more raster datasets.
·
A
per-cell (local) function can be applied to a single raster dataset or to
multiple raster datasets.
·
For a
single dataset, examples of per-cell functions are the trigonometric functions
(for example, sin), or the exponential and logarithmic functions (for example,
exponential or log).
·
Examples
of local functions that work on multiple raster datasets are functions that
return the minimum, maximum, majority, or minority value for all the values of
the input raster datasets at each cell location.
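A local function can be sketched as a helper that applies an ordinary function cell by cell to one or more aligned rasters. The helper name `local` and the sample rasters are assumptions for illustration; the rasters are assumed to have identical dimensions.

```python
import math

def local(func, *rasters):
    """Apply func cell by cell to one or more aligned rasters."""
    return [[func(*cells) for cells in zip(*rows)] for rows in zip(*rasters)]

a = [[1, 8], [3, 4]]
b = [[5, 2], [3, 9]]

print(local(max, a, b))                 # per-cell maximum of two rasters
print(local(math.sqrt, [[4.0, 9.0]]))   # single-raster local function
```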

B. Focal
functions
·
Focal,
or neighborhood, functions produce an output raster dataset in which the output
value at each location is a function of the input value at a location and the
values of the cells in a specified neighborhood around that location.
·
A
neighborhood configuration determines which cells surrounding the processing
cell should be used in the calculation of each output value.
·
Neighborhood
functions can return the mean, standard deviation, sum, or range of values
within the immediate or extended neighborhood.
·
Generally
used for ratio (true zero) data, but can be used with categorical data as well
(majority).
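A focal mean over a 3 x 3 rectangular neighborhood can be sketched as below. How cells at the edge of the raster are handled is a design choice; this sketch simply clips the neighborhood to the cells that exist, which is an assumption and not necessarily what Spatial Analyst does.

```python
def focal_mean(grid):
    """Mean of the 3 x 3 neighborhood around every cell (clipped at the edges)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(r - 1, 0), min(r + 2, rows))
                      for j in range(max(c - 1, 0), min(c + 2, cols))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

surface = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
smoothed = focal_mean(surface)
print(smoothed)
```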

C. Zonal
functions
·
Zonal
functions compute an output raster dataset where the output value for each
location depends on the value of the cell at the location and the association
that location has within a cartographic zone.
·
Zonal
functions are similar to focal functions except that the definition of which
cells to include in the processing (the neighborhood) in a zonal function is
defined by the configuration of the zones or features in the input zone
dataset, not by a specified neighborhood shape.
·
Each
zone can be unique.
·
Operations
that can be completed on these cells return the mean, sum, minimum, maximum, or
range of values from the first dataset that fall within a specified zone of the
second.
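A zonal mean can be sketched with two aligned rasters: one defining the zones and one holding the values to summarize. The function name and sample rasters are assumptions for the example.

```python
def zonal_mean(zones, values):
    """Assign to every cell the mean of the values that fall in its zone."""
    totals, counts = {}, {}
    for zone_row, value_row in zip(zones, values):
        for z, v in zip(zone_row, value_row):
            totals[z] = totals.get(z, 0.0) + v
            counts[z] = counts.get(z, 0) + 1
    means = {z: totals[z] / counts[z] for z in totals}
    return [[means[z] for z in zone_row] for zone_row in zones]

zones = [[1, 1, 2], [2, 1, 2]]
values = [[10, 20, 1], [3, 30, 5]]
print(zonal_mean(zones, values))
```

Note that the cells of zone 2 are not all adjacent to one another; as with any zone, membership depends on the value and not on connectivity.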

D. Global
functions
·
Global,
or per-raster, functions compute an output raster dataset in which the output
value at each cell location is potentially a function of all the cells in the
input raster datasets.
·
There
are two groups of global functions: Euclidean distance and weighted distance.
o
Euclidean
distance global functions assign to each cell in the output raster dataset its
distance from the closest source cell (a source may be the location from which
to start a new road).
o
The
direction of the closest source cell can also be assigned as the value of each
cell location in an additional output raster dataset.
o
By
applying a global function to a weighted (cost) surface, you can determine the
cost of moving from a destination cell (the location where you wish to end the
road) to the nearest source cell.
o
To
take this one step further, the shortest path over a cost surface can be
calculated over a non-networked surface from a source cell to a destination
cell. In all the global calculations, knowledge of the entire surface is
necessary to return the solution.
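The Euclidean distance function can be sketched with a brute-force search: for every cell, measure the straight-line distance to every source cell and keep the smallest. This is only an illustration of the idea (production tools use far more efficient algorithms); the function name is an assumption, and the sketch assumes at least one source cell exists.

```python
import math

def euclidean_distance(sources, cell_size=1.0):
    """Distance from each cell center to the closest source cell (truthy value)."""
    src = [(r, c) for r, row in enumerate(sources)
           for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - sr, c - sc) for sr, sc in src) * cell_size
             for c in range(len(row))]
            for r, row in enumerate(sources)]

sources = [[1, 0, 0],
           [0, 0, 0]]
distance = euclidean_distance(sources)
for line in distance:
    print(line)
```

Every output value depends on the position of sources anywhere in the raster, which is what makes this a global rather than a focal function.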

E. Application
functions
·
There
is a wide range of cell-based modeling functions that were developed for
specific applications.
·
Some
of the application functions are more general in scope, such as surface
analysis, while other application functions are more narrowly defined, such as
the hydrologic analysis functions.
·
The
categorization of the application functions is an aid to group and understand
the wide variety of Spatial Analyst operators and functions.
Density
The Density
function distributes a measured quantity of an input point layer throughout a
landscape to produce a continuous surface.
Surface
generation (Spatial statistics)
·
The
surface functions use the surface representation of a raster dataset to represent
height, concentration, or magnitude (for example, elevation, pollution, or
noise).
·
Surface
generation functions, called surface interpolators, create a continuous surface
from sampled point values.
·
Surface
generation functions make predictions for all locations in a raster dataset
whether a measurement has been taken at the location or not.
o
Inverse
Distance Weighted (IDW) -
the influence of a measured point on a predicted value decreases as its distance from the prediction location increases.
o
Polynomial
trend surface is
conceptually similar to taking a piece of paper and trying to pass it through
measured points that are raised to the height of their values. That paper is
fitted so that overall it fits best to all the points.
o
Spline is
conceptually like taking a rubber membrane and, once the measured points are
raised to the height of their values, trying to fit it through the points the
best you can. The criterion imposed on fitting this membrane is that it must
pass through the measured points.
o
Kriging is a
statistical method that quantifies the correlation of the measured points
through variography. When making a prediction for an
unknown location, kriging weights the nearby measured
points by their configuration around the prediction location and uses the
fitted model from variography to determine a value.
o
Geostatistical Analyst provides additional tools for more advanced surface
generation.
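Of the interpolators listed above, inverse distance weighting is the simplest to sketch. The version below uses every sample point and a power of 2, both of which are assumptions for the example (real tools usually also let you limit the search to nearby points).

```python
def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) samples."""
    numerator = denominator = 0.0
    for px, py, value in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        if d2 == 0.0:
            return value  # the prediction location coincides with a sample
        weight = 1.0 / d2 ** (power / 2.0)  # 1 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical sample points: (x, y, measured value).
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
print(idw(samples, 5.0, 0.0))  # midway between the two samples
print(idw(samples, 2.0, 0.0))  # closer to the first sample
```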
Surface
analysis - the premise behind the
surface analysis functions is that additional information can be derived by
producing new data and identifying patterns in existing surfaces.
·
Slope
identifies the slope, or
maximum rate of change, from each cell to its neighbors. An output slope raster
dataset can be calculated as either a percentage of slope
(for example, 10 percent slope) or a degree of slope (for example, 45-degree
slope).
·
Aspect
identifies the steepest downslope direction from each cell to its neighbors. The
value of the output raster dataset represents the compass direction of the
aspect: 0 is true north, a 90-degree aspect is to the
east, and so forth.
·
Hillshade is used to
determine the hypothetical illumination of a surface for either analysis or
graphical display. For analysis, hillshade can be
used to determine the length of time and intensity of the sun in a given
location. For graphical display, hillshade can
greatly enhance the relief of a surface.
·
Viewshed identifies
either how many of the observation points specified on the input observation
raster dataset can be seen from each cell or which cell locations can be seen
from each observation point.
·
Curvature
measures the curvature of the
surface at each cell. It calculates the second derivative of the input-surface
raster dataset - the slope of the slope.
o
The
result of the curvature function can be used to describe the physical
characteristics of a surface, such as the erosion and runoff processes within a
landscape.
o
The
slope identifies the overall rate of downward movement, and aspect defines the
direction of flow.
o
The
profile curvature is the shape of the surface in the direction of the slope.
o
The planform curvature defines the shape of the surface
perpendicular to the direction of the slope.
·
Contour
produces an output polyline dataset.
o
The
value of each line represents all contiguous locations with the same height,
magnitude, or concentration of whatever the values on the input dataset
represent.
o
The
function does not connect cell centers; it interpolates a line that represents
locations with the same magnitude.
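The two ways of expressing slope mentioned above, percent and degrees, are related through the rise-over-run ratio. The sketch below estimates that ratio for the center of a 3 x 3 elevation window using simple central differences; this is an assumption made for brevity and is not necessarily the method Spatial Analyst uses.

```python
import math

def slope_at_center(window, cell_size):
    """Return (percent slope, degree slope) at the center of a 3 x 3 window."""
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)  # east minus west
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * cell_size)  # north minus south
    rise_over_run = math.hypot(dz_dx, dz_dy)
    return 100.0 * rise_over_run, math.degrees(math.atan(rise_over_run))

# Hypothetical surface rising 10 units per 10-unit cell toward the east.
window = [[0.0, 10.0, 20.0],
          [0.0, 10.0, 20.0],
          [0.0, 10.0, 20.0]]
percent, degrees = slope_at_center(window, 10.0)
print(percent, degrees)
```

A rise equal to the run gives a 100 percent slope, which corresponds to 45 degrees; percent slope grows without bound as the surface approaches vertical, while degree slope approaches 90.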
Hydrologic
analysis
·
The
shape of a surface determines how water will flow across it.
·
The
hydrologic modeling functions provide methods for describing the hydrologic
characteristics of a surface.
·
Using
an elevation raster dataset as input, it is possible to model where water will
flow, create watersheds and stream networks, and derive other hydrologic
characteristics.
·
The
hydrologic modeling functions are available through the RasterHydrologyOp or through Map Algebra
via the Raster Calculator.

Watersheds for each section of a stream network
Geometric
transformation
·
The
geometric transformation functions either change the location of each cell in
the raster dataset or alter the geometric distribution of the cells within a
dataset to correct a distortion.
·
The mosaicking functions (another geometric transformation)
combine multiple raster datasets representing adjacent areas into a single
raster dataset.
Generalization
·
Sometimes
a raster dataset contains data that is erroneous or irrelevant to the analysis
at hand or is more detailed than you need.
·
For
instance, if a raster dataset was derived from the classification of a
satellite image, it may contain many small and isolated areas that are
misclassified.
·
The
generalization functions assist with identifying such areas and automating the
assignment of more reliable values to the cells that make up the areas.
·
The
generalization functions are available through the RasterGeneralizeOp or through Map Algebra
via the Raster Calculator.
·
These
tools provide capabilities for aggregation, edge smoothing, intelligent noise
removal, and more.
·
Nibble - removes single, misclassified cells in
the classified image.
·
BoundaryClean and MajorityFilter - smooth the boundaries between different zones.
·
Expand - expands specified zones.
·
Shrink - shrinks specified zones.
· Thin - thins linear features in
a raster.

The base classification from a satellite image

Effect of Nibble applied to the base classification

Effects of MajorityFilter applied
to the output from Nibble
Resolution
altering
·
The
resolution altering functions change the resolution of an existing raster
dataset.
·
If
you have one raster dataset at a finer resolution than the rest of the raster
datasets, you may wish to resample the finer resolution dataset to the same
resolution as the coarser ones so that all the raster datasets have the same
resolution.
·
This
speeds up processing and reduces the data size.
·
The
two principal ways to determine values when changing the resolution of a raster
dataset are interpolation and aggregation.
·
One
group of resampling interpolation functions uses
either the nearest-neighbor, bilinear, or cubic method on the values of the
input raster dataset.
·
A
second group, the aggregation functions,
uses a specified statistical aggregation method within a neighborhood to derive
values.
·
Unlike
the cell size setting in the analysis environment, the
resolution altering functions are applied only to the resultant dataset.
·
The
aggregation functions group a series of cells to the same value.
·
To
perform an aggregation, the block functions are implemented.
·
With
a block function, Spatial Analyst calculates a specified statistic within nonoverlapping neighborhoods.
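
The block-statistic idea described above can be sketched as follows. This is a hedged illustration in plain Python: the function name `block_aggregate` and the sample elevation values are invented for the example, and partial blocks at the raster edge are simply dropped.

```python
from statistics import mean


def block_aggregate(grid, factor, statistic=mean):
    """Aggregate nonoverlapping factor x factor blocks into single cells."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            # Collect every cell that falls inside this block.
            block = [grid[rr][cc]
                     for rr in range(r, r + factor)
                     for cc in range(c, c + factor)]
            out_row.append(statistic(block))
        out.append(out_row)
    return out


elevation = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]
coarse_mean = block_aggregate(elevation, 2)       # mean of each 2 x 2 block
coarse_max = block_aggregate(elevation, 2, max)   # maximum of each 2 x 2 block
```

Passing a different statistic (here `max`) changes how each block is summarized without changing the block layout.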

The effect on the raster of resampling
to a coarser resolution

3. NoData and how it affects analysis
·
Every
cell location in a raster has a value assigned to it.
·
When
inadequate information is available for a cell location, the location can be assigned
NoData. NoData and 0 are
not the same.
·
The
fact that a location can have NoData instead of a
valid value has ramifications in operators and functions.
·
NoData
means that not enough information is known about a cell location to assign it a
value. There are two ways that a location with NoData
can be treated in the computation of an expression:
·
Return
NoData for the location no matter what.
·
Ignore
the NoData and compute with the available values.
·
If NoData exists in any of the input raster datasets in the
Spatial Analyst expression, the output values will be affected.
·
The
behavior of NoData is addressed for each operator and
function in the online command references.
·
It is
important to understand how NoData is handled in a
particular function before making a decision.
·
You
may need to know if a location with NoData on the
output ever had a value, or if it received NoData
from the operator or function.
·
Sometimes,
when locations receive values, it may be important to know if the output value
really is the actual minimum or maximum value, or if it is the minimum or
maximum value of the existing known values.
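
The two ways of treating NoData can be compared with a small sketch. It assumes `None` as a stand-in for the NoData marker and uses a hypothetical helper named `combine`; it is not how Spatial Analyst stores or processes NoData internally.

```python
NODATA = None  # hypothetical stand-in for the NoData marker


def combine(rasters, func, ignore_nodata):
    """Apply func cell by cell across rasters with a chosen NoData policy."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [raster[r][c] for raster in rasters]
            known = [v for v in values if v is not NODATA]
            if not known or (len(known) < len(values) and not ignore_nodata):
                # Policy 1: return NoData no matter what.
                out_row.append(NODATA)
            else:
                # Policy 2: compute with the available values.
                out_row.append(func(known))
        out.append(out_row)
    return out


a = [[1, NODATA], [0, 4]]
b = [[5, 2], [NODATA, NODATA]]
strict = combine([a, b], min, ignore_nodata=False)
lenient = combine([a, b], min, ignore_nodata=True)
```

Note that in the lenient result the value 0 is kept as a valid minimum, and the value 4 is only the minimum of the known values at that location.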
4. Values and
what they represent
Measurement
values can be broken into four types: ratio, interval, ordinal, and nominal.
· Ratio - the values from the ratio measurement system are derived
relative to a fixed zero point on a linear scale.
o Mathematical operations can be used on
these values with predictable and meaningful results.
o Examples of ratio measurements are age, distance,
weight, and volume.

·
Interval
-
time of day, years on a
calendar, the Fahrenheit temperature scale, and pH value are all examples of
interval measurements.
o
These
are values on a linear calibrated scale, but they are not relative to a true zero
point in time or space.
o
Because
there is no true zero point, relative comparisons can be made between the
measurements, but ratio and proportion determinations are not as useful.

·
Ordinal
-
values determine position.
o
These
measurements show place, such as first, second, and third, but they do not
establish magnitude or relative proportions.
o
How
much better, worse, prettier, healthier, or stronger something is cannot be
demonstrated from ordinal numbers.

·
Nominal - values associated with this measurement
system are used to identify one instance from another.

o
They
may also establish the group, class, member, or category with which the object
is associated.
o
These
values are qualities, not quantities, with no relation to a fixed point or a
linear scale.
o
Coding
schemes for landuse, soil types, or any other
attribute qualify as a nominal measurement.
o
Other
nominal values are social security numbers, ZIP Codes, and telephone numbers.
o
Spatial
Analyst does not distinguish between the four different types of measurements
when asked to process or manipulate the values.
o
Most
mathematical operations work well on ratio values, but when interval, ordinal,
or nominal values are multiplied, divided, or evaluated for the square root,
the results are typically meaningless.
o
On
the other hand, subtraction, addition, and Boolean determinations can be very
meaningful when used on interval and ordinal values.
o
Attribute
handling within and between raster datasets is most effective and efficient
when using nominal measurements.
5. The analysis
environment
Spatial Analyst
allows you to process a subset of cells and to specify the resolution in which
to process them.
·
analysis extent

·
mask

· cell size
6. The cell size
and analysis
·
Cells
in different raster datasets do not need to be stored in the same resolution.
·
But
when processing between multiple datasets, the cell resolution, as is the case
with the registration, needs to be the same.
·
When
multiple raster datasets are input into any Spatial Analyst
function and their resolutions are different, one or more of the input datasets
will be automatically resampled, using the nearest
neighbor assignment, to the resolution of the coarsest input.
·
The nearest
neighbor assignment resampling technique is used
since it is applicable to both discrete and continuous value types, while
bilinear and cubic are only applicable to continuous data.
·
A resampling technique is necessary because rarely do the
centers of the input cells align with the transformed cell centers of the
desired resolution.
·
The
default resampling to the coarsest resolution of the
input rasters can be changed in the Cell Size tab of
the Options dialog box to a specific cell size or to the minimum of the input
raster datasets.
·
Caution
must be taken when specifying a finer cell size than the coarsest input because
the resolution of the output cannot be more accurate than the coarsest of the
inputs.
·
Specifying
a cell size of 50 meters when the input raster datasets are 100 meters creates
an output raster with a cell size of 50 meters, but the accuracy is still 100 meters.
·
When
performing analysis, make sure you are asking appropriate questions of the cell
size.
·
That
is, you will not study mouse movement when the cell size is five kilometers,
and you will not want to use five kilometer cells when studying the effects of
global warming over the earth.
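
The nearest neighbor assignment can be illustrated with a short sketch. It assumes square cells and that both rasters share the same upper-left corner; the function name `resample_nearest` and the sample landuse codes are invented for the example.

```python
def resample_nearest(grid, in_cell, out_cell):
    """Resample a raster using nearest neighbor assignment.

    Each output cell takes the value of the input cell that contains
    the output cell's center.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = int(rows * in_cell / out_cell)
    out_cols = int(cols * in_cell / out_cell)
    out = []
    for r in range(out_rows):
        y = (r + 0.5) * out_cell  # center's distance from the top edge
        src_r = min(int(y / in_cell), rows - 1)
        out_row = []
        for c in range(out_cols):
            x = (c + 0.5) * out_cell  # center's distance from the left edge
            src_c = min(int(x / in_cell), cols - 1)
            out_row.append(grid[src_r][src_c])
        out.append(out_row)
    return out


landuse_100m = [[5, 7],
                [7, 7]]
landuse_50m = resample_nearest(landuse_100m, in_cell=100, out_cell=50)
```

The 50 meter output has four times as many cells, but every group of four cells repeats a single 100 meter value, so no detail has been gained.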
7. Handling
projections during analysis
·
Raster
datasets must be registered with one another before completing any analysis or
processing between them.
·
Each
location on the ground must be represented by the same x,y cell address on the different input datasets.
·
This
means that the input raster datasets have to be in the same coordinate space or
coordinate system (in the same projection).
·
The
coordinate space of the output will be dependent on the coordinate space of the
input datasets.
·
If
two or more input raster datasets in an expression are not in the same
coordinate space, Spatial Analyst will automatically put them in the same space
using the following rules.
The default
behavior:
·
If
only one raster dataset is input, the output will be in the same coordinate
space as the input (a very common situation).
·
If
multiple raster and feature datasets are in the same coordinate space, the
output will also be in that same coordinate space.
·
If
more than one raster dataset is input, the output will be in the same
coordinate space as the first input.
·
If
feature and raster data with different coordinate spaces are input to the same
function, the feature dataset will be projected to the coordinate space of the
raster; the output will be in the coordinate space of the raster.
·
If
only feature data is input, the output will be in the same coordinate space as the
first input.
Overriding
the default
· On the General tab of
the Options dialog box, you can set the coordinate
space of all output raster datasets to be the same as that specified for the
data frame.
·
Automatically
transforming a raster or feature dataset into the common coordinate system in
the cases identified above is referred to as projecting on the fly.
·
To
maintain the speed of on-the-fly projection, a low-order polynomial
transformation is applied to the dataset.
·
The
on-the-fly projection transformation is less accurate than if you project the
dataset using the geometric transformation.
Performing
spatial analysis
(From
chapter 7: you will be working through
the examples in Lab 4 described in this chapter)
1. Mapping distance
2. Performing surface analysis
3. Calculating cell statistics
4. Calculating neighborhood statistics
5. Calculating zonal statistics
6. Reclassifying your data
7. Using the raster calculator
8. Mapping density
9. Converting your data
1. Mapping
distance
What are
distance mapping functions?
·
The
distance mapping functions are global functions. They compute an output
raster dataset where the output value at each location is potentially a
function of all the cells in the input raster datasets. There are several distance
mapping tools for measuring both straight line (Euclidean) distance and
distance measured in terms of other factors, such as the cost to travel over
the landscape. The outputs from the Straight Line Distance functions are normally used directly, while the outputs from the Cost Weighted Distance functions are most commonly used to
compute shortest (or least-cost) paths.
Straight
Line Distance functions
·
The Straight Line Distance function measures the straight line distance from each cell to the
closest source (the source identifies the objects of interest, such as wells,
roads, or a school). The distance is measured from cell center to cell center.
o
Example
of usage: What is the distance to the closest town?
Cost
Weighted Distance functions
·
The Cost Weighted Distance function modifies the Straight Line Distance by some other factor, which is a cost to travel through any given
cell. For example, it may be shorter to climb over the mountain to the
destination, but it is faster to walk around it.
·
The Cost Weighted Allocation function identifies the nearest source cell based on accumulated
travel cost.
·
The Cost Weighted Direction function provides a road map, identifying the route to take from
any cell, along the least-cost path, back to the nearest source.
·
The Distance and Direction raster datasets are normally created to
serve as inputs to the pathfinding function, the shortest
(or least-cost) path.
Why is it
useful to map distance?
·
By
mapping distance, you can find out information such as the distance to the
nearest hospital from certain areas for an emergency helicopter, or find all
fire hydrants within 500 meters of a burning building.
·
Alternatively,
you could find the shortest (or least-cost) path from one location to another,
based on some cost factor.
Straight line
distance
What are the
Straight Line Distance functions?
The Straight
Line Distance functions describe each cell’s relationship to a source or a set
of sources. There are three potential outputs from this function.
Primary output:
Straight
Line Distance gives the
distance from each cell in the raster to the closest source.
Example of
usage: What is the distance to the closest town?
Optional
outputs:
Straight
Line Allocation identifies
the cells that are to be allocated to which source based on closest proximity.
Example of
usage: Which town am I closest to?
Straight
Line Direction gives the
direction from each cell to the closest source.
Example of
usage: What is the direction to the closest town?
The source
The source
identifies the location of the objects of interest, such as wells, shopping
malls, roads, forest stands, and so on. If the source is a raster, it must
contain only the values of the source cells - all other cells must be NoData. If the source is a feature, it will internally be
transformed into a grid when you run the function.
The straight
line distance raster
The straight
line distance raster contains the measured distance from every cell to the
nearest source. The distances are measured in projection units, such as feet or
meters, and are computed from cell center to cell center.
The Straight
Line Distance function is used frequently as a standalone function for applications
such as finding the nearest hospital for an emergency helicopter.
Alternatively, this function can be used when creating a suitability map, when
you need to include data representing the distance from a certain object.
In the example
below, the distance to each town is found. This sort of information could be
extremely useful for planning a hiking trip. You may want to stay within a
certain distance of a town in case of emergencies, or know how much further you
have to travel to pick up supplies.

The straight line distance to the nearest town from every
location.
·
The Straight Line Allocation function assigns each cell the value of the source to which it is
closest. The nearest source is determined by the Straight Line Distance.
o
Example
of usage: Which town am I closest to?

·
The Straight Line Direction function computes the direction to the nearest source, measured in
degrees.
o
Example
of usage: What is the direction to the closest town?
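
The distance and allocation outputs described above can be sketched with a brute-force calculation. This is an illustration only: the function name `straight_line`, the dictionary of source cells, and the 100 meter cell size are assumptions, and a production tool would use a far more efficient algorithm.

```python
import math


def straight_line(sources, rows, cols, cell_size=1.0):
    """Return (distance, allocation) rasters for sources {(row, col): id}."""
    distance, allocation = [], []
    for r in range(rows):
        dist_row, alloc_row = [], []
        for c in range(cols):
            # Test every source and keep the closest, center to center.
            best = min(sources, key=lambda s: math.hypot(s[0] - r, s[1] - c))
            dist_row.append(math.hypot(best[0] - r, best[1] - c) * cell_size)
            alloc_row.append(sources[best])
        distance.append(dist_row)
        allocation.append(alloc_row)
    return distance, allocation


# Two towns (ids 1 and 2) on a 3 x 4 raster with 100 meter cells.
towns = {(0, 0): 1, (2, 3): 2}
distance, allocation = straight_line(towns, rows=3, cols=4, cell_size=100.0)
```

Here `distance[0][3]` answers "What is the distance to the closest town?" and `allocation[0][3]` answers "Which town am I closest to?" for the upper-right cell.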

Allocation
What is the Allocation
function?
The Allocation
function allows you to identify which cells belong to which source based on
closest proximity (in a straight line). An output raster is produced that
records the identity of the closest source cell for each cell. Each cell in an
allocation raster receives the value of the source cell to which it will be
allocated. Note that the Allocation function can also be performed via the
Straight Line Distance function or the Cost Weighted Distance function.
Performing the
Allocation function via the Straight Line Distance function allows you to find
the cells that are to be allocated to which source based on closest proximity,
in a straight line.

Performing the
Allocation function via the Cost Weighted Distance function takes the cost of
traveling over the land into consideration rather than the straight line
distance.

Why use the
Allocation function?
Use the
Allocation function to perform analyses such as:
The example
below identifies the areas of land supported by a recreation site. You can
easily identify the areas that may be in need of more recreation sites (mainly
areas in the northeast half of the raster).

Cost weighted
distance
What is cost
weighted distance mapping?
Cost weighted
distance mapping finds the least accumulative cost from each cell to the
nearest, cheapest source. Cost can be money, time, or preference.
The functions
that perform cost weighted distance mapping are similar to the Straight Line
Distance functions, but instead of calculating the actual distance from one
point to another, they compute the accumulative cost of traveling from each
cell to the nearest source, based on the cell’s distance from each source and
the cost to travel through it (for example, it is easier to walk through a
meadow than a swamp).
Why use the
Cost Weighted Distance function?
·
Cost
weighted distance modeling is useful whenever movement is based on geographic
factors, such as animal migration studies or consumer travel behavior.
·
Cost
weighted distance may also be used to minimize construction costs for routing
new roads, transmission lines, or pipelines.
·
The
straight line distance between two points is not necessarily the best. In the
graphic to the left, the shortest path over the mountain takes three hours.
·
The
longer path around only takes two hours. If time were a cost, then the route
with the longer distance should be taken.
·
However,
the aim may be to climb over the mountain. Applying cost weighted distance
enables you to specify preferences in your input data.
·
It may,
for example, take longer to travel over the mountain due to steep slopes, so
steep slopes should be given a higher cost when finding a suitable path from A
to B.

Example:
Finding the least-cost route for a road
·
In
the following example, the Cost Weighted Distance functions are used to find
the least-cost path for a new road.
·
The
Cost Weighted Distance function is the prerequisite to the Shortest Path
function, which is discussed in the next section.
·
The
Shortest Path function determines the actual route for the road.
·
To
calculate the least accumulative cost from each cell to the nearest source, the
Cost Weighted Distance function needs a source and a cost raster.
The source
The
source, as you can see in the graphic below (in red), is the starting point for
the proposed road.

The cost
raster
·
The
cost raster identifies the cost of traveling through every cell.
·
To
create this raster, you need to identify the cost of constructing a road
through each cell.
·
Although
the cost raster is a single dataset, it is often used to represent several
criteria.
·
In
the following example, landuse and slope influence
the construction costs.
·
These
datasets are in different measurement systems (landuse
type and percent slope), so they cannot be compared relative to one another and
must be reclassified to a common scale.
Creating a
cost raster:
Reclassifying
your datasets to a common scale
·
In
this example, slope and landuse have been
reclassified on a scale of 1 to 10.
·
The attributes
of each dataset should be examined, in turn, to decide on their contribution to
the cost of building a road.
·
For
example, it is more costly to traverse steep slopes, so steeper slopes will be
assigned higher costs when reclassifying this dataset.
·
The
graphics below display the results.

Reclassified Landuse

Reclassified
Slope
Weighting
datasets according to percent influence
·
The
next step in producing the cost raster is to combine the reclassified
datasets.
·
The simplest
approach is to just add them together.
·
However,
you may know that some factors are more important than others.
·
For
instance, avoiding steep slopes may be twice as important as the landuse type, so you might give the slope dataset
an influence of 66 percent and the landuse dataset an
influence of 34 percent (totaling 100 percent).
·
The
following diagram shows the conceptual process:
Slope * 0.66 = weighted slope

Landuse * 0.34 = weighted landuse
Combining
the datasets
The final cost raster
is the result of adding the weighted datasets together.

Weighted slope + weighted landuse = final cost raster

Taking this
example, the following diagram shows the final cost raster, the result of
reclassifying the datasets of slope and landuse,
weighting each by 0.66 and 0.34, respectively, then
combining the weighted datasets.
The cells
shaded dark blue are the most suitable cells through which to route the road,
as they are the least costly.
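
The weighting and combining steps can be sketched as a cell-by-cell weighted sum. The function name `weighted_overlay` and the small reclassified rasters are invented for illustration; only the weights 0.66 and 0.34 come from the example above.

```python
def weighted_overlay(rasters_and_weights):
    """Sum rasters cell by cell after multiplying each by its weight."""
    first = rasters_and_weights[0][0]
    rows, cols = len(first), len(first[0])
    return [
        [sum(raster[r][c] * weight for raster, weight in rasters_and_weights)
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical inputs, already reclassified to a common scale of 1 to 10.
slope_reclass = [[1, 5], [10, 3]]
landuse_reclass = [[2, 2], [4, 9]]
cost = weighted_overlay([(slope_reclass, 0.66), (landuse_reclass, 0.34)])
```

The lower-left cell has the steepest slope class (10) and therefore ends up with the highest cost, even though its landuse class is moderate.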

The Cost
Weighted Distance function
·
Using
the cost raster and the source, the Cost Weighted Distance function produces an
output raster in which each cell is assigned a value that is the least
accumulative cost of getting back to the source.
·
Using
our example, the function takes the cost raster and calculates a value for each
cell in the output cost-weighted distance raster that is the accumulated least
cost of getting from that cell to the nearest source.
·
Every
cell in the cost-weighted distance raster is assigned a value that represents
the sum of the minimum travel costs that would be incurred by traveling back
along the least-cost path to its nearest source.
·
In
the example below, the accumulated least costly way of getting from the cell
colored dark red to the school is 10.5.

·
Two additional
outputs, direction and allocation rasters, can be
created from the Cost Weighted Distance function.
·
These are
explained in the sections that follow. Both the
cost-weighted distance and direction rasters are
required if you want to go on to calculate the least-cost (shortest) path
between source locations and destination locations.
Direction
·
The
cost-weighted distance raster tells you the least accumulated cost of getting
from each cell to the nearest source, but it doesn’t tell you which way to go
to get there.
·
The
direction raster provides a road map, identifying the route to take from any
cell, along the least-cost path, back to the nearest source.

Cost Weighted

Direction

Direction Coding
·
The algorithm
for computing the direction raster assigns a code to each cell that identifies
which one of its neighboring cells is on the least-cost path back to the
nearest source.
·
In
the direction coding diagram above, 0 represents every cell in the cost-weighted
distance raster.
·
Each
cell is assigned a value representing the direction of the nearest, cheapest
cell on the route of the least costly path to the nearest source.
·
For
example, in the graphic above, the cheapest way to get from the cell with a value
of 10.5 is to go diagonally, through the cell with a value of 5.7, to the
source, the school site.
·
The
direction algorithm assigns a value of 4 to the cell with a value of 10.5, and
4 to the cell with a value of 5.7, because this is the direction of the
least-cost path back to the source from each of these cells.
·
This
process is done for all cells in the cost-weighted distance raster to produce
the direction raster, which tells you the direction to travel from every cell
in the cost-weighted distance raster back to the source.
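
One way to compute both the accumulated cost and the direction raster is Dijkstra's shortest-path algorithm over the eight-neighbor grid. The sketch below is an illustration under stated assumptions, not the Spatial Analyst implementation: the step cost (average of the two cell costs, multiplied by the square root of 2 for diagonal moves) and the code layout (1 = east, numbered clockwise to 8 = northeast, 0 = source) are assumptions, as are the names `CODES` and `cost_distance`.

```python
import heapq
import math

# Assumed direction codes by (row offset, column offset); 0 marks a source.
#   6 7 8
#   5 0 1
#   4 3 2
CODES = {(0, 1): 1, (1, 1): 2, (1, 0): 3, (1, -1): 4,
         (0, -1): 5, (-1, -1): 6, (-1, 0): 7, (-1, 1): 8}


def cost_distance(cost, sources):
    """Return (accumulated cost, direction) rasters for a cost raster."""
    rows, cols = len(cost), len(cost[0])
    accum = [[math.inf] * cols for _ in range(rows)]
    direction = [[None] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        accum[r][c] = 0.0
        direction[r][c] = 0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        dist, r, c = heapq.heappop(heap)
        if dist > accum[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in CODES:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            # Assumed step cost: mean of both cells, longer for diagonals.
            step = (cost[r][c] + cost[nr][nc]) / 2 * math.hypot(dr, dc)
            if dist + step < accum[nr][nc]:
                accum[nr][nc] = dist + step
                # The neighbor points back toward (r, c): the opposite offset.
                direction[nr][nc] = CODES[(-dr, -dc)]
                heapq.heappush(heap, (dist + step, nr, nc))
    return accum, direction


# One expensive cell in the middle; the source is the lower-left cell.
cost = [[1, 1, 1],
        [1, 9, 1],
        [1, 1, 1]]
accum, direction = cost_distance(cost, sources=[(2, 0)])
```

In this example the cheapest route to the upper-right cell goes around the expensive center cell rather than diagonally through it.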

Cost Weighted
Distance

Direction
Both the
cost-weighted distance and direction rasters are
required if you want to go on to calculate the least-cost (shortest) path
between source locations and destination locations.
Allocation
The
cost-allocation raster identifies the nearest source from each cell in the
cost-weighted distance raster. It is conceptually similar to the Straight Line
Distance Allocation function, where each cell is assigned to its nearest source
cell. However, nearness is expressed in terms of accumulated
travel cost.

All cells are
allocated to the school source.
What is the
Shortest Path function?
·
The
Shortest Path function determines the path from a destination point to a
source.
·
Once
you have performed the Cost Weighted Distance function, creating distance and
direction rasters, you can then compute the
least-cost (or shortest) path from a chosen destination to your source point,
which in our original example was the starting point for the new road.
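
Given a direction raster, finding the path amounts to following the codes from the destination cell until a source is reached. The sketch below assumes the code layout 1 = east, numbered clockwise to 8 = northeast, with 0 marking a source; the function name `trace_path` and the sample raster are invented for illustration.

```python
# Assumed direction codes (0 marks a source):
#   6 7 8
#   5 0 1
#   4 3 2
OFFSETS = {1: (0, 1), 2: (1, 1), 3: (1, 0), 4: (1, -1),
           5: (0, -1), 6: (-1, -1), 7: (-1, 0), 8: (-1, 1)}


def trace_path(direction, start):
    """Follow direction codes from a destination cell back to a source."""
    path = [start]
    r, c = start
    while direction[r][c] != 0:
        dr, dc = OFFSETS[direction[r][c]]
        r, c = r + dr, c + dc
        if (r, c) in path:
            raise ValueError("direction raster contains a loop")
        path.append((r, c))
    return path


# Hypothetical direction raster whose source is the lower-left cell.
direction = [
    [3, 4, 4],
    [3, 4, 5],
    [0, 5, 5],
]
route = trace_path(direction, start=(0, 2))
```

Starting from the upper-right cell, the route moves diagonally twice (code 4) and ends at the source.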
Why find the
shortest path?
·
The
shortest path travels from the destination to the source and is guaranteed to
be the cheapest route (relative to the cost units defined by the original cost
raster).
·
Use
it to find the best route for a new road in terms of construction costs, or to
identify the path to take from several suburban locations (sources) to the
closest shopping mall (destinations).

·
You
can see two potential paths for the new road in the diagram above (in purple and
red) to illustrate an important point.
·
The
purple line represents the path created using a cost raster where each input
raster (landuse and slope) had the same influence.
·
The
red line represents the path created using a cost raster where the slope input
raster had a weight (or influence) of 66 percent.
·
By
giving the slope input raster a higher weight, more attention was given to
avoiding steeper slopes in the red path.
·
It is
important to spend time considering how to weight the rasters
that make up the cost raster.
·
How
you weight your rasters depends on your application
and the results you wish to achieve.
2. Performing
surface analysis
·
You
can gain additional information by producing a new dataset that identifies a
specific pattern within an original dataset.
·
Patterns
that were not readily apparent in the original surface can be derived, such as
contours, angle of slope, steepest downslope
direction (aspect), shaded relief (hillshade), and viewshed.
·
Contours
can be useful for finding
areas of the same value.
·
You
may be interested in obtaining elevation values for specific locations and
examining the overall gradation of the land.
·
You
may, for instance, want to know the variations in the slope of the
landscape because you want to find the areas most at risk of landslide based on
the angle of slopes in an area (steeper slopes being those most at risk).

Input elevation raster; output contours

Steeper
angle of slope
Output Slope
You may be a
farmer interested in locating a field on an area with a southerly aspect.

Output Aspect
You can create a hillshade for both analytical
and graphical purposes.
Graphically, a hillshade can provide an attractive and realistic backdrop
showing how other layers are distributed in relation to the terrain relief.

Output hillshade
From an
analytical point of view, you can, for instance, analyze how the landscape is
illuminated at various times of the day by lowering and raising the sun angle
used in the analysis.

Azimuth 45° Azimuth 315°
Calculating
viewshed is useful when you want to
know how visible objects will be. For instance, you might want to find the
location with the most expansive view in an area because you want to know the
best location for a lookout. Display a hillshade
transparently underneath the result from the Viewshed
function.

Input elevation raster Output viewshed Combined
Contour
What are
contours?
·
Contours
are polylines
that connect points of equal value (such as elevation, temperature,
precipitation, pollution, or atmospheric pressure).
·
The
distribution of the polylines shows how values change
across a surface.
·
Where
there is little change in a value, the polylines are
spaced farther apart.
·
Where
the values rise or fall rapidly, the polylines are
closer together.
Why create
contours?
·
By
following the polyline of a particular contour, you
can identify which locations have the same value.
·
Contours
are also a useful surface representation because they allow you to
simultaneously visualize flat and steep areas (distance between contours) and
ridges and valleys (converging and diverging polylines).
·
The
example below shows an input elevation dataset and the output contour dataset.
·
The
areas where the contours are closer together indicate the steeper locations.
·
They
correspond with the areas of higher elevation (in white on the input elevation
dataset).

Input
elevation dataset Output
contour dataset
The
Contour attribute table contains an elevation attribute for each contour polyline.

What is
slope?
The Slope
function calculates the maximum rate of change between each cell and its
neighbors - for example, the steepest downhill descent for the cell (the maximum
change in elevation over distance between the cell and its eight neighbors).
Every cell in the output raster has a slope value. The lower
the slope value, the flatter the terrain; the higher the slope value, the
steeper the terrain. The output slope dataset can be calculated as
percent slope or degree of slope.
Degree of slope = Θ, where tan Θ = rise / run
Percent slope = (rise / run) * 100

Degree of slope: 30, 45, 76
Percent slope: 58, 100, 401
·
When
the slope angle equals 45 degrees, the rise is equal to the run.
·
Expressed
as a percentage, the slope of this angle is 100 percent.
·
Note
that as the slope approaches vertical (90°), the percentage slope approaches
infinity.
·
The
Slope function is most frequently run on an elevation dataset, as the following
diagrams show.
·
Steeper
slopes are shaded red on the output slope dataset.
·
The
Slope function can also be used with other types of continuous data, such as population, to
identify sharp changes in value.
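
The relationship between degree and percent slope, and one common way of estimating slope from a cell's eight neighbors, can be sketched as follows. The weighted differences over the 3 x 3 window are an assumption made for this illustration, and the function names are invented.

```python
import math


def degrees_to_percent(degrees):
    """Percent slope = tan(angle) * 100."""
    return math.tan(math.radians(degrees)) * 100


def slope_degrees(window, cell_size):
    """Slope at the center of a 3 x 3 window [[a, b, c], [d, e, f], [g, h, i]]."""
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Weighted rate of change in the x (west-east) and y (north-south) directions.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    rise_over_run = math.hypot(dz_dx, dz_dy)
    return math.degrees(math.atan(rise_over_run))


# Elevation rises 10 units per 10 unit cell toward the east: rise equals run.
window = [[0, 10, 20],
          [0, 10, 20],
          [0, 10, 20]]
angle = slope_degrees(window, cell_size=10)
percent = degrees_to_percent(angle)
```

Because the rise equals the run in this window, the result is a 45 degree slope, or 100 percent.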
What is
aspect?
·
Aspect
identifies the steepest downslope direction from each
cell to its neighbors.
·
It
can be thought of as slope direction or the compass direction a hill faces.
·
It is
measured clockwise in degrees from 0 (due north) to 360 (again due north,
coming full circle).
·
The
value of each cell in an aspect dataset indicates the direction the cell’s
slope faces.
·
Flat
slopes have no direction and are given a value of -1.
·
The
diagram below shows an input elevation dataset and the output aspect raster.

Why use the
Aspect function?
With the Aspect
function, you can:
·
Find
all north-facing slopes on a mountain as part of a search for the best slopes
for ski runs.
·
Calculate
the solar illumination for each location in a region as part of a study to
determine the diversity of life at each site.
·
Find
all southerly slopes in a mountainous region to identify locations where the
snow is likely to melt first as part of a study to identify those residential
locations that are likely to be hit by meltwater
first.
·
Identify
areas of flat land to find an area for a plane to land in an emergency.
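
Aspect can be sketched as the compass direction of the steepest descent, computed from the same kind of 3 x 3 window used for slope. The weighted differences and the function name `aspect_degrees` are assumptions for this illustration; rows are taken to run north to south and columns west to east.

```python
import math


def aspect_degrees(window):
    """Compass direction of steepest descent at the center of a 3 x 3 window.

    Returns degrees clockwise from north, or -1 for a flat cell.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    dz_east = (c + 2 * f + i) - (a + 2 * d + g)    # change toward the east
    dz_south = (g + 2 * h + i) - (a + 2 * b + c)   # change toward the south
    if dz_east == 0 and dz_south == 0:
        return -1
    # Downslope vector: east component is -dz_east, north component is dz_south.
    return math.degrees(math.atan2(-dz_east, dz_south)) % 360


east_facing = aspect_degrees([[2, 1, 0], [2, 1, 0], [2, 1, 0]])
south_facing = aspect_degrees([[2, 2, 2], [1, 1, 1], [0, 0, 0]])
flat = aspect_degrees([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
```

A surface that drops toward the east returns 90, one that drops toward the south returns 180, and a flat window returns -1.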
Hillshade
· The Hillshade
function obtains the hypothetical illumination of a surface by
determining illumination values for each cell in a raster.
·
It
does this by setting a position for a hypothetical light source and calculating
the illumination values of each cell in relation to neighboring cells.
·
It
can greatly enhance the visualization of a surface for analysis or
graphical display.
· By default, shadow and light are shades of
gray associated with integers from 0 to 255 (increasing from black to white).
Azimuth is the angular direction of the sun,
measured from north in clockwise degrees from 0 to 360. An azimuth of 90 is
east. The default is 315 (NW).
Altitude is the slope or angle of the illumination
source above the horizon. The units are in degrees, from 0 (on the horizon) to
90 degrees (overhead). The default is 45 degrees.
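
How azimuth and altitude combine with a cell's slope and aspect can be sketched with a widely used illumination formula. This is a hedged illustration for a single cell, not necessarily the exact calculation Spatial Analyst performs; the function name `hillshade` is invented, and the defaults match those stated above.

```python
import math


def hillshade(slope_deg, aspect_deg, azimuth_deg=315.0, altitude_deg=45.0):
    """Illumination value (0 to 255) for one cell from its slope and aspect."""
    zenith = math.radians(90.0 - altitude_deg)  # sun angle measured from overhead
    slope = math.radians(slope_deg)
    # Angle between the sun's direction and the direction the slope faces.
    relative = math.radians(azimuth_deg - aspect_deg)
    value = 255.0 * (math.cos(zenith) * math.cos(slope)
                     + math.sin(zenith) * math.sin(slope) * math.cos(relative))
    return max(0, round(value))


flat = hillshade(0, 0)         # flat cell under the default sun position
facing = hillshade(45, 315)    # 45 degree slope facing the sun
away = hillshade(45, 135)      # same slope facing directly away from the sun
```

A slope tilted directly toward the light source receives the brightest value (255), a flat cell receives an intermediate gray, and the same slope turned away from the sun falls to 0.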
Using hillshading for display
· By placing an elevation raster on top
of a created hillshade, then making the elevation
raster transparent, you can create realistic images of the landscape.
· Add other layers, such as roads or streams, to further increase the