Using Geographic Codes to Import Data into MapPoint 2004
Greg Slayden of Microsoft shows how to use standard geographic codes (FIPS, NUTS, ISO-3166, and Canadian SGC) to import data files into Microsoft MapPoint 2004
Microsoft® MapPoint® 2004
United States FIPS Codes
Canadian SGC Codes
European NUTS Codes
Postal Codes and Other Abbreviations
One of the great strengths of Microsoft® MapPoint® 2004 is its capability to
display your custom data on a map. Any data in which a row represents a piece of
geography lends itself to map display. When custom data is displayed on a map it
can provide users with insight that no table, chart, or graph can match.
The MapPoint Import Data Wizard is a powerful tool that can import your
geographic data and quickly display it for you in a new way. To create a map,
the Import Data Wizard matches the records in your dataset to geographical
areas, such as states, counties, countries/regions, and others, on a MapPoint
map. This matching occurs on the second page of the wizard, where you specify
which columns in your dataset the wizard should use for matching.
The default method the wizard uses for matching is a string match based on
the name of the geographical area. For example, the string "Alabama" corresponds
to the state of Alabama on the MapPoint map. String matches often work and
provide acceptable results, but accuracy is reduced if datasets contain place
names that are misspelled, that vary in spelling based on language, or that are
duplicates. In addition, abbreviations or codes are sometimes used instead of
complete strings to save space in the database.
To eliminate the limitations of string matching, MapPoint recognizes a wide
variety of standard geographic code schemes and abbreviations as valid values
for matching. Many datasets use these codes. Using them to match your data will
result in a more accurate data-import process.
Figure 1 shows the second page of the Import Data Wizard. Note that the data
type of the first column is set to State, because the values in that column are
Federal Information Processing Standard (FIPS) codes (which are described in
detail later) for United States states. MapPoint recognizes the 01 in the first
row as being equal to the state of Alabama and matches the record to the correct
geographic area. Note that the data does not contain state names or
abbreviations, although those values would also work.
Figure 1. Data import with FIPS codes
MapPoint recognizes the following types of coding schemes:
- United States Federal Information Processing Standard (FIPS) codes
- Canadian Standard Geographical Classification (SGC) codes
- European nomenclature des unités territoriales statistiques (NUTS) codes
- International Organization for Standardization (ISO)-3166 codes
- Postal codes and other abbreviations
The following sections describe these code schemes and outline how to use
them with MapPoint 2004.
United States FIPS Codes
With the North America version of MapPoint 2004, you can import and map data
to five levels of geography in the United States: state, county, Metropolitan
Statistical Area (MSA), ZIP Code, and census tract. You can use FIPS codes for
all levels. FIPS codes are extensively used in all United States government data
products, notably those of the Census Bureau. Many non-government companies and
organizations also use these codes.
FIPS codes appear to be numeric, but they actually must be treated as text,
with leading zeros preserved at all times. For example, in Microsoft Excel, you
must format a column containing FIPS codes as text. If you cut and paste cells
from an Excel spreadsheet directly into MapPoint, the text formatting and
leading zeros will be lost. Instead, open the Excel workbook from the first page
of the Import Data Wizard rather than pasting the selected cells directly.
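If a dataset is prepared by a script before it reaches the Import Data Wizard, the same rule applies: keep the codes as text. The following Python sketch shows one way to restore leading zeros that were lost when codes were stored as numbers. The function name `pad_fips` and the sample values are illustrative assumptions and are not part of MapPoint.

```python
def pad_fips(code, width):
    """Return a FIPS code as zero-padded text of the given width.

    Accepts integers or strings. Raises ValueError if the value is not
    purely numeric or is longer than the expected width.
    """
    text = str(code).strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a numeric code: {code!r}")
    if len(text) > width:
        raise ValueError(f"{code!r} is longer than {width} digits")
    return text.zfill(width)


# State codes are two digits wide; combined county codes are five.
print(pad_fips(1, 2))       # 01
print(pad_fips("1001", 5))  # 01001
```

Writing the padded strings to a text or CSV file, rather than to numeric cells, keeps the zeros intact until import.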
The following sections describe the FIPS codes for each level of geography.
State
Both the United States Census Bureau and MapPoint treat the District of
Columbia as a state equivalent, so for data mapping purposes, there are actually
51 states.
State FIPS codes are two digits long, ranging from 01 for Alabama to 56 for
Wyoming. When importing data into MapPoint, you can use the State column header
for a column that contains full state names, standard two-letter postal
abbreviations, or two-digit state FIPS codes.
County
As of 2003, there were 3,141 counties and county equivalents in the United
States. County equivalents include parishes in Louisiana, independent cities in
Virginia, boroughs and census areas in Alaska, and other similar entities.
MapPoint 2004 has correct geography for all counties and county equivalents,
including changes that occurred since the 2000 census.
FIPS codes for counties are three-digit codes that are unique within each
state. Therefore, to allow for full national coverage with one set of codes,
MapPoint uses five-digit FIPS codes for counties: two-digit state codes suffixed
by three-digit county codes. For example, Autauga County, Alabama has a FIPS
code of 01001, and Weston County, Wyoming has a FIPS code of 56045.
County FIPS codes look like ZIP Codes, but they are different. Take care to
avoid confusing the two sets of codes.
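The five-digit county code is a plain concatenation, so it can be built or taken apart with string slicing. The sketch below illustrates this in Python; the function names are hypothetical helpers and are not part of any MapPoint API.

```python
def county_fips(state_code, county_code):
    """Join a two-digit state code and a three-digit county code."""
    state = str(state_code).zfill(2)
    county = str(county_code).zfill(3)
    if len(state) != 2 or len(county) != 3 or not (state + county).isdigit():
        raise ValueError("expected a 2-digit state and a 3-digit county code")
    return state + county


def split_county_fips(code):
    """Split a five-digit county FIPS code into (state, county) parts."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a five-digit county FIPS code: {code!r}")
    return code[:2], code[2:]


print(county_fips("01", "001"))    # 01001
print(split_county_fips("56045"))  # ('56', '045')
```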
Metropolitan Statistical Area
An MSA is a geographical area, defined by the federal Office of Management
and Budget (OMB), that represents the metropolitan area of a city. In most of
the United States, MSAs consist of a county or a group of counties. However, in
the six New England States (CT, ME, MA, NH, RI, and VT), MSAs are defined by
city and town.
Note: Entities called New England County Metropolitan Areas (NECMAs) are
composed of entire counties and are useful for those who want a set of MSAs
that follow county boundaries. MapPoint does not support NECMAs and uses
regular MSAs for the New England states.
An MSA can be either a "regular" MSA or a Primary Metropolitan Statistical
Area (PMSA), the difference being that PMSAs can be combined into a Consolidated
Metropolitan Statistical Area (CMSA), which is roughly equivalent to a
megalopolis. For example, Dallas and Fort Worth, TX, are PMSAs that are combined
into the Dallas-Fort Worth CMSA. MapPoint treats PMSAs the same way as regular
MSAs and does not support mapping of CMSAs.
In mid-2003 the OMB issued a new set of MSAs and other similar entities.
These were issued too late to be included in MapPoint 2004. Therefore, the MSAs
included in MapPoint 2004 are those that existed from 1999 to 2003 and were used
for the 2000 census. These MSAs remain the current standard for most
You can import MSA data into MapPoint using standard FIPS MSA codes (which
also include PMSAs). Each MSA has a unique four-digit number, ranging from 0040
for the Abilene, TX MSA to 9360 for the Yuma, AZ MSA.
ZIP Code
MapPoint can import data based on five-digit United States ZIP Codes. The
familiar five-digit ZIP Code is also used as the FIPS code for a ZIP Code area,
so there is rarely an issue with importing this kind of data. MapPoint does not
store the city name for a ZIP Code and can only make a match based on the ZIP
Code itself.
It is important to note that ZIP Codes were designed for mail delivery, not
for data analysis and mapping. ZIP Codes change frequently and their boundaries
are not well defined in many rural areas. Institutional ZIP Codes, P.O. box-only
ZIP Codes, and other anomalies can prevent many sets of ZIP Code data from being
imported completely. MapPoint uses the most current set of ZIP Code boundaries
possible, but you will often find gaps or unmatched codes in large datasets.
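Because matching is based on the five-digit code, it can help to clean a ZIP Code column before import. The following Python sketch assumes that dropping a ZIP+4 extension and restoring lost leading zeros is acceptable for your data; `normalize_zip` is a hypothetical helper, and the sample values only illustrate the format.

```python
import re


def normalize_zip(value):
    """Return a five-digit ZIP Code string, or None if the value cannot be one."""
    text = str(value).strip()
    # Accept 1 to 5 digits, optionally followed by a ZIP+4 extension ("-1234").
    match = re.fullmatch(r"(\d{1,5})(?:-\d{4})?", text)
    if match is None:
        return None
    return match.group(1).zfill(5)


print(normalize_zip(2134))          # 02134
print(normalize_zip("98052-6399"))  # 98052
print(normalize_zip("N/A"))         # None
```

Values that come back as `None` can be reviewed by hand. Codes that are well formed but absent from MapPoint (for example, P.O. box-only ZIP Codes) will still show up as unmatched in the wizard.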
Census Tract
A census tract is a small geographical area that generally contains between
1,500 and 8,000 demographically similar people, with an optimum size of 4,000.
Unlike ZIP Codes, whose boundaries are not always well defined, the boundaries
of census tracts are precise. The United States Census Bureau redefines census
tracts every ten years for each new census; the 2000 census divided the United
States into 65,320 tracts.
Census tracts have several advantages over ZIP Codes: they are more stable,
have exact boundaries, and have roughly equal populations. The main disadvantage
of census tracts is that very few people know what census tract they live in,
making ZIP Codes much more useful for many customer-data scenarios.
Every census tract has a number, which is unique within a county. The number
consists of a main number ranging from 1 to 9999, and an optional suffix with
two decimal places. Examples of census tract numbers are 1, 1.01, 234.08, and
9999.99. A FIPS code for a census tract contains six digits, with zeros filling
in the digits that are missing from the census tract number. For example, the
FIPS code for census tract 1 is 000100; tract 1.01 is 000101; tract 234.08 is
023408, and so on.
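The conversion from tract number to six-digit code amounts to multiplying by 100 and zero-padding the result. The Python sketch below follows that rule and uses `Decimal` to avoid floating-point rounding; the function name `tract_fips` is an illustrative assumption.

```python
from decimal import Decimal, InvalidOperation


def tract_fips(tract_number):
    """Convert a census tract number such as "234.08" to its six-digit code."""
    try:
        value = Decimal(str(tract_number).strip())
    except InvalidOperation:
        raise ValueError(f"not a tract number: {tract_number!r}") from None
    hundredths = value * 100
    # At most two decimal places, and a main number between 1 and 9999.
    if hundredths != hundredths.to_integral_value() or not 100 <= hundredths <= 999999:
        raise ValueError(f"tract number out of range: {tract_number!r}")
    return f"{int(hundredths):06d}"


for number in ("1", "1.01", "234.08", "9999.99"):
    print(number, "->", tract_fips(number))
```

With the examples from the text, this produces 000100, 000101, 023408, and 999999.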
Because census tract numbers are only unique within a county, the full FIPS
code that is unique within the United States is an eleven-digit number composed
of the state FIPS code, the county FIPS code, and the census tract FIPS code.
For example, the full code for tract 228.03 in King County (code 033) in
Washington State (code 53) is 53033022803. This eleven-digit code is the code
MapPoint recognizes when importing census tract data.
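As a rough illustration, the eleven-digit code can be assembled from its three parts, or split back into them, with simple string operations. The helper names in this Python sketch are hypothetical.

```python
def full_tract_fips(state, county, tract):
    """Concatenate state (2), county (3), and tract (6) codes into 11 digits."""
    parts = (state, county, tract)
    widths = (2, 3, 6)
    for part, width in zip(parts, widths):
        if len(part) != width or not part.isdigit():
            raise ValueError(f"expected {width} digits, got {part!r}")
    return "".join(parts)


def split_tract_fips(code):
    """Split an eleven-digit tract code into (state, county, tract)."""
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"not an eleven-digit tract code: {code!r}")
    return code[:2], code[2:5], code[5:]


# Tract 228.03 in King County (033), Washington State (53).
print(full_tract_fips("53", "033", "022803"))  # 53033022803
print(split_tract_fips("53033022803"))         # ('53', '033', '022803')
```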
If your census tract dataset contains only the six-digit FIPS codes, you also
need to specify which columns hold the state and county values. You can mix and
match the codes in any way when more than one column is required for data
import. For example, when importing census tract data, you can use postal
abbreviations for state, FIPS codes for county, and regular census tract numbers
(for example, 228.03) for census tracts.
Note that older versions of MapPoint do not support the import of census
tract data based on FIPS codes. However, they do support FIPS codes for
importing state, county, ZIP Code, and MSA data.
Canadian SGC Codes
Statistics Canada, the Canadian government agency that conducts the census in
Canada, publishes a code scheme called the Standard Geographical Classification
(SGC). SGC has become the standard encoding for Canadian geographical areas. The
SGC scheme uses three geographical levels: province and territory, Canadian
census division, and Canadian census subdivision.
Province and Territory
MapPoint contains geography for all 13 Canadian provinces and territories.
You can match data based on name, standard alphabetic/postal SGC code (for
example, QC, ON, BC, and so on), or numeric SGC code. The numeric codes range
from 10 for Newfoundland and Labrador to 59 for British Columbia and 62 for
Nunavut Territory. Note that MapPoint does not recognize NL as the new
alphabetic code for the province of Newfoundland and Labrador; instead, MapPoint
recognizes the old NF code used prior to 2002.
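If your data already uses NL, one possible workaround is to substitute NF before import. The Python sketch below assumes the province codes are in a column you can preprocess; `mappoint_province_code` is a hypothetical helper name.

```python
def mappoint_province_code(code):
    """Return the alphabetic province code to use when importing into MapPoint 2004."""
    normalized = code.strip().upper()
    # MapPoint 2004 recognizes the pre-2002 code NF rather than NL.
    return "NF" if normalized == "NL" else normalized


print(mappoint_province_code("NL"))  # NF
print(mappoint_province_code("qc"))  # QC
```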
Canadian Census Division
MapPoint includes geography for the 288 census divisions used by the 1996
Canadian census. Census divisions correspond to counties in most of eastern
Canada, to regional districts in British Columbia, and to census-created
geographies with no administrative reality in most of central Canada.
Census divisions have a four-digit SGC code whose first two digits are equal
to the province/territory SGC code. MapPoint imports data based on this
four-digit code. For example, the Greater Vancouver Regional District has an SGC
code of 5915, where the 59 matches the SGC province code for British
Columbia.
Canadian Census Subdivision
For the 1996 census, Canada was divided into 5,984 census subdivisions. These
subdivisions correspond to cities, towns, townships, and other local
governmental units and have widely varying populations. For example, the city of
Montreal is a census subdivision with over 1 million people, while many rural
subdivisions have only a few hundred inhabitants.
MapPoint correctly imports census subdivision data using a seven-digit SGC
code. This code consists of the two-digit province/territory code, the two-digit
census division code, and the three-digit census subdivision code. For example,
the city of Montreal has an SGC code of 2466025; 24 represents Quebec, 66
represents the census division for the Montreal Urban Community, and 025
represents the city itself.
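Because each SGC level extends the code of the level above it, a seven-digit subdivision code also yields the province and census division codes. The Python sketch below illustrates that nesting; the type and function names are assumptions made for this example.

```python
from typing import NamedTuple


class SgcSubdivision(NamedTuple):
    province: str     # two-digit province/territory code
    division: str     # four-digit census division code (province + division)
    subdivision: str  # full seven-digit census subdivision code


def parse_sgc_subdivision(code):
    """Derive the codes for all three SGC levels from a seven-digit code."""
    text = str(code).strip()
    if len(text) != 7 or not text.isdigit():
        raise ValueError(f"not a seven-digit SGC code: {code!r}")
    return SgcSubdivision(text[:2], text[:4], text)


montreal = parse_sgc_subdivision("2466025")
print(montreal.province)  # 24
print(montreal.division)  # 2466
```

This can be handy when the same dataset needs to be mapped at the province, census division, and census subdivision levels.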
European NUTS Codes
NUTS is a French acronym for "nomenclature des unités territoriales
statistiques" (in English, nomenclature of the statistical territorial units).
The European Union uses NUTS codes as the standard for its data products, and
this coding scheme is being used increasingly as a trans-European standard. A
NUTS code is alphanumeric and consists of a two-letter country prefix and a
series of numbers, letters, or both.
NUTS codes are arranged in a hierarchy. NUTS-1 areas are the largest and are
divided into NUTS-2 areas, and so on. NUTS levels for a specific country do not
always correspond to actual administrative units such as states and counties,
and the same NUTS level often corresponds to different political geography in
different countries. For example, the states of Germany correspond to the NUTS-1
level, but the states of Austria correspond to the NUTS-2 level. The NUTS-1
level in Austria consists of artificial groups of states that have no political
reality. MapPoint does not support NUTS levels that do not correspond to actual
administrative units within a particular country. Therefore, MapPoint supports
the NUTS-1 level in Germany but not in Austria.
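In the NUTS scheme, each level adds one character after the two-letter country prefix, so the level of a code can be read from its length. That rule comes from the NUTS convention itself rather than from MapPoint. The following Python sketch applies it; `parse_nuts` is a hypothetical helper, and the sample codes only illustrate the format.

```python
import re

# Two-letter country prefix followed by one to three letters or digits.
_NUTS_PATTERN = re.compile(r"([A-Z]{2})([0-9A-Z]{1,3})")


def parse_nuts(code):
    """Return (country_prefix, level) for a NUTS-1, NUTS-2, or NUTS-3 code."""
    match = _NUTS_PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a NUTS-1 to NUTS-3 code: {code!r}")
    country, suffix = match.groups()
    return country, len(suffix)


print(parse_nuts("DE1"))    # ('DE', 1)
print(parse_nuts("AT13"))   # ('AT', 2)
print(parse_nuts("FR101"))  # ('FR', 3)
```

Checking the level before import helps confirm that a column holds the level MapPoint supports for that country.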
The Europe version of MapPoint 2004 correctly imports datasets with NUTS
codes as administrative-unit identifiers (as noted in the following table). For
example, if you have a dataset with 96 records for the departments of France and
use the NUTS-3 code as the geographic identifier, you use the Import Data Wizard
drop-down header of Department for that column and the data will be imported
correctly.
The following table summarizes the countries and administrative levels that
MapPoint supports.
||First-level administrative units
||NUTS-2, 9 states
||NUTS-1, 3 regions
|NUTS-3, 10 provinces
||NUTS-3, 15 counties
||NUTS-2, 22 regions
|NUTS-3, 96 departments
MapPoint also recognizes standard department codes; for example, 75 for
Paris.
||NUTS-1, 16 states
|NUTS-3, 439 Landkreise
||NUTS-2, 20 regions
||NUTS-2, 12 provinces
||NUTS-2, 17 autonomias
ISO-3166 Codes
The International Organization for Standardization (ISO) publishes code
schemes for countries/regions of the world and for first-level administrative
divisions:
- The scheme for countries/regions is documented in the standard ISO-3166-1,
"Codes for the representation of names of countries and their subdivisions –
Part 1: Country codes."
- The scheme for first-level administrative divisions is documented in the
standard ISO-3166-2, "Codes for the representation of names of countries and
their subdivisions – Part 2: Country subdivision code."
The ISO-3166-1 standard has two sets of codes for the countries/regions of
the world—a two-letter code, which is more frequently used, and a three-letter
code. With the North America and Europe versions of MapPoint, you can import
data for the countries/regions of the world based on both sets of codes.
However, some small island groups may exist where the ISO definition of a
country/region does not match the one used by MapPoint.
So, if your data has ISO-3166 country codes as record identifiers, such as US
for the United States, CA for Canada, and FR for France, MapPoint will recognize
them and match your data to the correct countries/regions.
Many other code schemes are used for the countries/regions of the world; for
example, the Distinguishing Sign code used on automobile stickers, the
International Olympic Committee country codes, the Internet top-level domain
suffixes, and so on. These schemes vary from the ISO-3166 standard and can cause
confusion. MapPoint supports only the ISO-3166 codes.
First-Level Administrative Unit
The ISO-3166-2 standard lists codes for first-level administrative areas
(states, provinces, prefectures, and so on) in the form of a two-letter
country/region code, a hyphen, and then an administrative-area code. For
example, ISO-3166-2 lists Alabama as US-AL and Brittany, France, as FR-E. The North
America and Europe versions of MapPoint both support the ISO-3166-2 codes for
first-level administrative areas in the following countries/regions: Argentina,
Australia, Belarus, Brazil, Canada, China, France, Germany, Italy, Japan, South
Korea, Mexico, Nigeria, Norway, Poland, Romania, Spain, Sweden, Switzerland,
Turkey, United Kingdom, United States, and Serbia and Montenegro.
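The country/region prefix and the administrative-area code can be separated at the hyphen, which is useful for checking that a column really contains ISO-3166-2 values. The Python sketch below assumes area codes of one to three letters or digits; `split_iso_3166_2` is a hypothetical helper.

```python
def split_iso_3166_2(code):
    """Split an ISO-3166-2 code such as "US-AL" into (country, area)."""
    country, sep, area = code.strip().upper().partition("-")
    valid = (
        sep == "-"
        and len(country) == 2
        and country.isalpha()
        and 1 <= len(area) <= 3
        and area.isalnum()
    )
    if not valid:
        raise ValueError(f"not an ISO-3166-2 code: {code!r}")
    return country, area


print(split_iso_3166_2("US-AL"))  # ('US', 'AL')
print(split_iso_3166_2("FR-E"))   # ('FR', 'E')
```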
Postal Codes and Other Abbreviations
Many countries use special codes and abbreviations for their administrative
divisions, and MapPoint supports several other schemes for data import that are
not discussed in this document. The following table lists the other code and
abbreviation schemes that MapPoint recognizes.
|Country/region|Code or abbreviation scheme|Example|
|Brazil|Two-letter state abbreviations|RJ for Rio de Janeiro|
|Canada|Two-letter province/territory abbreviations|QC for Quebec|
|France|Two-digit department codes|75 for Paris|
|France|Five-digit numeric commune codes|47148 for Leyritz-Moncassin|
|Germany|Two-letter state abbreviations|HH for Hamburg|
|Italy|Two-letter province abbreviations|MI for Milano|
|Mexico|Three-letter state abbreviations|JAL for Jalisco|
|Poland|Two-letter province abbreviations|SL for Slaskie|
|Spain|Two-letter province abbreviations|SA for Salamanca|
|Spain|Two-digit province code|37 for Salamanca|
|Switzerland|Two-letter canton abbreviations|VS for Valais|
|United States|Two-letter state postal abbreviations|CA for California|
Importing data into MapPoint with standard code schemes has many advantages
over using string matching. Matches are precise, leaving little to chance and
reducing the number of unmatched records. Whenever possible, you should use the
supported code schemes as your standard way of importing data into MapPoint.