Using Geographic Codes to Import Data into MapPoint 2004

Greg Slayden of Microsoft shows how to use standard geographic codes (FIPS, NUTS, ISO-3166, and Canadian SGC) to import data files into Microsoft MapPoint 2004

Greg Slayden
Microsoft Corporation

Applies to:
   Microsoft® MapPoint® 2004

Contents

Introduction
United States FIPS Codes
Canadian SGC Codes
European NUTS Codes
ISO-3166 Codes
Postal Codes and Other Abbreviations
Conclusion
Additional Resources

Introduction

One of the great strengths of Microsoft® MapPoint® 2004 is its capability to display your custom data on a map. Any data in which a row represents a piece of geography lends itself to map display. When custom data is displayed on a map it can provide users with insight that no table, chart, or graph can match.

The MapPoint Import Data Wizard is a powerful tool that can import your geographic data and quickly display it for you in a new way. To create a map, the Import Data Wizard matches the records in your dataset to geographical areas, such as states, counties, countries/regions, and others, on a MapPoint map. This matching occurs on the second page of the wizard, where you specify which columns in your dataset the wizard should use for matching.

The default method the wizard uses for matching is a string match based on the name of the geographical area. For example, the string "Alabama" corresponds to the state of Alabama on the MapPoint map. String matches often work and provide acceptable results, but accuracy is reduced if datasets contain place names that are misspelled, that vary in spelling based on language, or that are duplicates. In addition, abbreviations or codes are sometimes used instead of complete strings to save space in the database.

To eliminate the limitations of string matching, MapPoint recognizes a wide variety of standard geographic code schemes and abbreviations as valid values for matching. Many datasets use these codes. Using them to match your data will result in a more accurate data-import process.

Figure 1 shows the second page of the Import Data Wizard. Note that the data type of the first column is set to State, because the values in that column are Federal Information Processing Standard (FIPS) codes (which are described in detail later) for United States states. MapPoint recognizes the 01 in the first row as being equal to the state of Alabama and matches the record to the correct geographic area. Note that the data does not contain state names or abbreviations, although those values would also work.

Figure 1. Data import with FIPS codes

MapPoint recognizes the following types of coding schemes:

  • United States Federal Information Processing Standard (FIPS) codes
  • Canadian Standard Geographical Classification (SGC) codes
  • European nomenclature des unités territoriales statistiques (NUTS) codes
  • International Organization for Standardization (ISO)-3166 codes
  • Postal codes and other abbreviations

The following sections describe these code schemes and outline how to use them with MapPoint 2004.

United States FIPS Codes

With the North America version of MapPoint 2004, you can import and map data to five levels of geography in the United States: state, county, Metropolitan Statistical Area (MSA), ZIP Code, and census tract. You can use FIPS codes for all levels. FIPS codes are extensively used in all United States government data products, notably those of the Census Bureau. Many non-government companies and organizations also use these codes.

FIPS codes appear to be numeric, but they actually must be treated as text, with leading zeros preserved at all times. For example, in Microsoft Excel, you must format a column containing FIPS codes as text. If you cut and paste cells from an Excel spreadsheet directly into MapPoint, the text formatting and leading zeros will be lost. Instead, open the Excel workbook on the first page of the Import Data Wizard and do not paste your selected cells directly.
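As a minimal sketch of the leading-zero rule, the following Python helper restores a FIPS code that was stored as a number before the file is handed to the Import Data Wizard. The function name pad_fips is hypothetical and is not part of MapPoint; it only illustrates treating the codes as fixed-width text.

```python
def pad_fips(value, width):
    """Return a FIPS code as zero-padded text of the given width.

    Hypothetical helper for cleaning a data file before import;
    it is not a MapPoint API.
    """
    text = str(value).strip()
    if not text.isdigit():
        raise ValueError(f"not a numeric FIPS code: {value!r}")
    if len(text) > width:
        raise ValueError(f"code {text!r} is longer than {width} digits")
    return text.zfill(width)


# A state code that lost its leading zero when it was stored as a number.
print(pad_fips(1, 2))     # 01
print(pad_fips("56", 2))  # 56
```

Writing the padded values back to a text column keeps the codes intact when the wizard opens the file.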

The following sections describe the FIPS codes for each level of geography.

State

Both the United States Census Bureau and MapPoint treat the District of Columbia as a state equivalent, so for data mapping purposes, there are actually 51 states.

State FIPS codes are two digits long, ranging from 01 for Alabama to 56 for Wyoming. When importing data into MapPoint, you can use the State column header for a column that contains full state names, standard two-letter postal abbreviations, or two-digit state FIPS codes.

County

As of 2003, there were 3,141 counties and county equivalents in the United States. County equivalents include parishes in Louisiana, independent cities in Virginia, boroughs and census areas in Alaska, and other similar entities. MapPoint 2004 has correct geography for all counties and county equivalents, including changes that occurred since the 2000 census.

FIPS codes for counties are three-digit codes that are unique within each state. Therefore, to allow for full national coverage with one set of codes, MapPoint uses five-digit FIPS codes for counties: two-digit state codes suffixed by three-digit county codes. For example, Autauga County, Alabama has a FIPS code of 01001, and Weston County, Wyoming has a FIPS code of 56045.
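The composition of the five-digit county code can be sketched in a few lines of Python. The helper name county_fips is an assumption made for illustration; the two example values are the ones given above.

```python
def county_fips(state_code, county_code):
    """Join a two-digit state code and a three-digit county code.

    Hypothetical helper, not a MapPoint API. Inputs may be numbers or
    text; both parts are zero-padded before they are joined.
    """
    state = str(state_code).strip().zfill(2)
    county = str(county_code).strip().zfill(3)
    if not (state.isdigit() and len(state) == 2):
        raise ValueError(f"invalid state FIPS code: {state_code!r}")
    if not (county.isdigit() and len(county) == 3):
        raise ValueError(f"invalid county FIPS code: {county_code!r}")
    return state + county


print(county_fips("01", "001"))  # 01001
print(county_fips(56, 45))       # 56045
```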

County FIPS codes look like ZIP Codes, but they are different. Take care to avoid confusing the two sets of codes.

MSA

An MSA is a geographical area, defined by the federal Office of Management and Budget (OMB), that represents the metropolitan area of a city. In most of the United States, MSAs consist of a county or a group of counties. However, in the six New England states (CT, ME, MA, NH, RI, and VT), MSAs are defined by city and town.

Note  Entities called New England County Metropolitan Areas (NECMAs) are composed of entire counties and are useful for those who want a set of MSAs that follow county boundaries. MapPoint does not support NECMAs and uses regular MSAs for the New England states.

An MSA can be either a "regular" MSA or a Primary Metropolitan Statistical Area (PMSA), the difference being that PMSAs can be combined into a Consolidated Metropolitan Statistical Area (CMSA), which is roughly equivalent to a megalopolis. For example, Dallas and Fort Worth, TX, are PMSAs that are combined into the Dallas-Fort Worth CMSA. MapPoint treats PMSAs the same way as regular MSAs and does not support mapping of CMSAs.

In mid-2003 the OMB issued a new set of MSAs and other similar entities. These were issued too late to be included in MapPoint 2004. Therefore, the MSAs included in MapPoint 2004 are those that existed from 1999 to 2003 and were used for the 2000 census. These MSAs remain the current standard for most metropolitan-area datasets.

You can import MSA data into MapPoint using standard FIPS MSA codes (which also include PMSAs). Each MSA has a unique four-digit number, ranging from 0040 for the Abilene, TX MSA to 9360 for the Yuma, AZ MSA.

ZIP Code

MapPoint can import data based on five-digit United States ZIP Codes. The familiar five-digit ZIP Code is also used as the FIPS code for a ZIP Code area, so there is rarely an issue with importing this kind of data. MapPoint does not store the city name for a ZIP Code and can only make a match based on the ZIP Code itself.

It is important to note that ZIP Codes were designed for mail delivery, not for data analysis and mapping. ZIP Codes change frequently and their boundaries are not well defined in many rural areas. Institutional ZIP Codes, P.O. box-only ZIP Codes, and other anomalies can prevent many sets of ZIP Code data from being imported completely. MapPoint uses the most current set of ZIP Code boundaries possible, but you will often find gaps or unmatched codes in large datasets.

Census Tract

A census tract is a small geographical area that generally contains between 1,500 and 8,000 demographically similar people, with an optimum size of 4,000. Unlike ZIP Codes, whose boundaries are not always well defined, the boundaries of census tracts are precise. The United States Census Bureau redefines census tracts every ten years for each new census; the 2000 census divided the United States into 65,320 tracts.

Census tracts have several advantages over ZIP Codes: they are more stable, have exact boundaries, and have roughly equal populations. The main disadvantage of census tracts is that very few people know what census tract they live in, making ZIP Codes much more useful for many customer-data scenarios.

Every census tract has a number, which is unique within a county. The number consists of a main number ranging from 1 to 9999, and an optional suffix with two decimal places. Examples of census tract numbers are 1, 1.01, 234.08, and 9999.99. A FIPS code for a census tract contains six digits, with zeros filling in the digits that are missing from the census tract number. For example, the FIPS code for census tract 1 is 000100; tract 1.01 is 000101; tract 234.08 is 023408, and so on.
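The conversion from a census tract number to its six-digit FIPS code amounts to multiplying by 100 and zero-padding. The following Python sketch shows one way to do that; the function name tract_fips is hypothetical, and Decimal is used only to avoid floating-point rounding on values such as 234.08.

```python
from decimal import Decimal


def tract_fips(tract_number):
    """Convert a census tract number such as '234.08' to six-digit text.

    Hypothetical helper, not a MapPoint API. The tract number is read as
    a decimal with at most two decimal places, in the range 1 to 9999.99.
    """
    number = Decimal(str(tract_number).strip())
    if number != number.quantize(Decimal("0.01")):
        raise ValueError(f"more than two decimal places: {tract_number!r}")
    hundredths = int(number * 100)
    if not 100 <= hundredths <= 999999:
        raise ValueError(f"tract number out of range: {tract_number!r}")
    return f"{hundredths:06d}"


for tract in ("1", "1.01", "234.08", "9999.99"):
    print(tract, "->", tract_fips(tract))
# 1 -> 000100
# 1.01 -> 000101
# 234.08 -> 023408
# 9999.99 -> 999999
```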

Because census tract numbers are only unique within a county, the full FIPS code that is unique within the United States is an eleven-digit number composed of the state FIPS code, the county FIPS code, and the census tract FIPS code. For example, the full code for tract 228.03 in King County (code 033) in Washington State (code 53) is 53033022803. This eleven-digit code is the code MapPoint recognizes when importing census tract data.
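As an illustration, the eleven-digit code can be assembled from, or split back into, its three parts with simple string operations. The function names below are assumptions made for this sketch; the King County example is the one from the preceding paragraph.

```python
def full_tract_fips(state, county, tract):
    """Join state (2 digits), county (3) and tract (6) FIPS codes.

    Hypothetical helper, not a MapPoint API. All parts must already be
    zero-padded text.
    """
    parts = ((state, 2), (county, 3), (tract, 6))
    for text, width in parts:
        if len(text) != width or not text.isdigit():
            raise ValueError(f"expected {width} digits, got {text!r}")
    return state + county + tract


def split_tract_fips(code):
    """Split an eleven-digit tract code into (state, county, tract)."""
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"expected 11 digits, got {code!r}")
    return code[:2], code[2:5], code[5:]


print(full_tract_fips("53", "033", "022803"))  # 53033022803
print(split_tract_fips("53033022803"))         # ('53', '033', '022803')
```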

If your census tract dataset contains only the six-digit FIPS codes, you also need to specify which columns hold the state and county values. You can mix and match the codes in any way when more than one column is required for data import. For example, when importing census tract data, you can use postal abbreviations for state, FIPS codes for county, and regular census tract numbers (for example, 228.03) for census tracts.

Note that older versions of MapPoint do not support the import of census tract data based on FIPS codes. However, they do support FIPS codes for importing state, county, ZIP Code, and MSA data.

Canadian SGC Codes

Statistics Canada, the Canadian government agency that conducts the census in Canada, publishes a code scheme called the Standard Geographical Classification (SGC). SGC has become the standard encoding for Canadian geographical areas. The SGC scheme uses three geographical levels: province and territory, Canadian census division, and Canadian census subdivision.

Province and Territory

MapPoint contains geography for all 13 Canadian provinces and territories. You can match data based on name, standard alphabetic/postal SGC code (for example, QC, ON, BC, and so on), or numeric SGC code. The numeric codes range from 10 for Newfoundland and Labrador to 59 for British Columbia and 62 for Nunavut Territory. Note that MapPoint does not recognize NL as the new alphabetic code for the province of Newfoundland and Labrador; instead, MapPoint recognizes the old NF code used prior to 2002.

Canadian Census Division

MapPoint includes geography for the 288 census divisions used by the 1996 Canadian census. Census divisions correspond to counties in most of eastern Canada, to regional districts in British Columbia, and to census-created geographies with no administrative reality in most of central Canada.

Census divisions have a four-digit SGC code whose first two digits are equal to the province/territory SGC code. MapPoint imports data based on this four-digit code. For example, the Greater Vancouver Regional District has an SGC code of 5915, where the 59 matches the SGC province code for British Columbia.

Canadian Census Subdivision

For the 1996 census, Canada was divided into 5,984 census subdivisions. These subdivisions correspond to cities, towns, townships, and other local governmental units and have widely varying populations. For example, the city of Montreal is a census subdivision with over 1 million people, while many rural subdivisions have only a few hundred inhabitants.

MapPoint correctly imports census subdivision data using a seven-digit SGC code. This code consists of the two-digit province/territory code, the two-digit census division code, and the three-digit census subdivision code. For example, the city of Montreal has an SGC code of 2466025; 24 represents Quebec, 66 represents the census division for the Montreal Urban Community, and 025 represents the city itself.
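The nesting of the SGC codes can be sketched in Python as follows. The function name split_sgc_subdivision is hypothetical; the sketch simply slices the seven-digit code into the three parts described above and rebuilds the four-digit census division code from the first two.

```python
def split_sgc_subdivision(code):
    """Split a seven-digit SGC census subdivision code.

    Hypothetical helper, not a MapPoint API. Returns the province or
    territory code, the census division part, and the subdivision part.
    """
    code = str(code).strip()
    if len(code) != 7 or not code.isdigit():
        raise ValueError(f"expected 7 digits, got {code!r}")
    return code[:2], code[2:4], code[4:]


province, division, subdivision = split_sgc_subdivision("2466025")
print(province, division, subdivision)  # 24 66 025
# The four-digit census division code is the province code plus the division part.
print(province + division)              # 2466
```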

European NUTS Codes

NUTS is a French acronym for "nomenclature des unités territoriales statistiques" (in English, nomenclature of the statistical territorial units). The European Union uses NUTS codes as the standard for its data products, and this coding scheme is being used increasingly as a trans-European standard. A NUTS code is alphanumeric and consists of a two-letter country prefix and a series of numbers, letters, or both.

NUTS codes are arranged in a hierarchy. NUTS-1 areas are the largest and are divided into NUTS-2 areas, and so on. NUTS levels for a specific country do not always correspond to actual administrative units such as states and counties, and the same NUTS level often corresponds to different political geography in different countries. For example, the states of Germany correspond to the NUTS-1 level, but the states of Austria correspond to the NUTS-2 level. The NUTS-1 level in Austria consists of artificial groups of states that have no political reality. MapPoint does not support NUTS levels that do not correspond to actual administrative units within a particular country. Therefore, MapPoint supports the NUTS-1 level in Germany but not in Austria.
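Because each level of the hierarchy adds one character after the two-letter country prefix, the NUTS level of a code can be read from its length. The following Python sketch relies on that assumption and on a hypothetical function name, parse_nuts; the sample codes are examples that appear in the table in this article. Note that the sketch only reports the level and does not say whether MapPoint supports that level for the country.

```python
def parse_nuts(code):
    """Return (country prefix, NUTS level) for a NUTS code.

    Hypothetical helper, not a MapPoint API. Assumes one character per
    level after the two-letter country prefix, for levels 1 to 3.
    """
    code = code.strip().upper()
    country, rest = code[:2], code[2:]
    if not (len(country) == 2 and country.isalpha()):
        raise ValueError(f"invalid country prefix in {code!r}")
    if not (rest.isalnum() and 1 <= len(rest) <= 3):
        raise ValueError(f"invalid NUTS code: {code!r}")
    return country, len(rest)


print(parse_nuts("DE6"))    # ('DE', 1)
print(parse_nuts("AT31"))   # ('AT', 2)
print(parse_nuts("DEA5C"))  # ('DE', 3)
```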

The Europe version of MapPoint 2004 correctly imports datasets with NUTS codes as administrative-unit identifiers (as noted in the following table). For example, if you have a dataset with 96 records for the departments of France and use the NUTS-3 code as the geographic identifier, you use the Import Data Wizard drop-down header of Department for that column and the data will be imported correctly.

The following table summarizes the countries and administrative levels that MapPoint supports.

Country      First-level administrative units       Second-level administrative units
Austria      NUTS-2, 9 states (example: AT31)       Not supported
Belgium      NUTS-1, 3 regions (example: BE1)       NUTS-3, 10 provinces (example: BE31)
Denmark      NUTS-3, 15 counties (example: DK00A)   Not supported
France       NUTS-2, 22 regions (example: FR52)     NUTS-3, 96 departments (example: FR826)
Germany      NUTS-1, 16 states (example: DE6)       NUTS-3, 439 Landkreise (example: DEA5C)
Italy        NUTS-2, 20 regions (example: IT51)     Not supported
Netherlands  NUTS-2, 12 provinces (example: NL21)   Not supported
Spain        NUTS-2, 17 autonomias (example: ES11)  Not supported

For the departments of France, MapPoint also recognizes standard department codes; for example, 75 for Paris.

ISO-3166 Codes

The International Organization for Standardization (ISO) publishes code schemes for countries/regions of the world and for first-level administrative divisions:

  • The scheme for countries/regions is documented in the standard ISO-3166-1, "Codes for the representation of names of countries and their subdivisions – Part 1: Country codes."
  • The scheme for first-level administrative divisions is documented in the standard ISO-3166-2, "Codes for the representation of names of countries and their subdivisions – Part 2: Country subdivision code."

Country/Region

The ISO-3166-1 standard has two sets of codes for the countries/regions of the world: a two-letter code, which is more frequently used, and a three-letter code. With the North America and Europe versions of MapPoint, you can import data for the countries/regions of the world based on both sets of codes. However, for some small island groups, the ISO definition of a country/region may not match the one used by MapPoint.

So, if your data has ISO-3166 country codes as record identifiers, such as US for the United States, CA for Canada, and FR for France, MapPoint will recognize them and match your data to the correct countries/regions.

Many other code schemes are used for the countries/regions of the world; for example, the Distinguishing Sign code used on automobile stickers, the International Olympic Committee country codes, the Internet top-level domain suffixes, and so on. These schemes vary from the ISO-3166 standard and can cause confusion. MapPoint supports only the ISO-3166 codes.

First-Level Administrative Unit

The ISO-3166-2 standard lists codes for first-level administrative areas (states, provinces, prefectures, and so on) in the form of a two-letter country/region code, a hyphen, and then an administrative-area code. For example, ISO-3166-2 lists Alabama as US-AL and Brittany, France, as FR-E. The North America and Europe versions of MapPoint both support the ISO-3166-2 codes for first-level administrative areas in the following countries/regions: Argentina, Australia, Belarus, Brazil, Canada, China, France, Germany, Italy, Japan, South Korea, Mexico, Nigeria, Norway, Poland, Romania, Spain, Sweden, Switzerland, Turkey, United Kingdom, United States, and Serbia and Montenegro.
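The structure of an ISO-3166-2 code (country/region code, hyphen, administrative-area code) can be checked with a short Python sketch before a file is imported. The function name parse_iso_3166_2 is hypothetical, and the check is purely structural: it does not verify that the code is actually assigned or that MapPoint supports the country/region.

```python
def parse_iso_3166_2(code):
    """Split an ISO-3166-2 code such as 'US-AL' into its two parts.

    Hypothetical helper, not a MapPoint API. Checks the shape only:
    two letters, a hyphen, then one to three letters or digits.
    """
    country, hyphen, area = code.strip().upper().partition("-")
    if not (hyphen and len(country) == 2 and country.isalpha()):
        raise ValueError(f"invalid country/region part in {code!r}")
    if not (area.isalnum() and 1 <= len(area) <= 3):
        raise ValueError(f"invalid administrative-area part in {code!r}")
    return country, area


print(parse_iso_3166_2("US-AL"))  # ('US', 'AL')
print(parse_iso_3166_2("FR-E"))   # ('FR', 'E')
```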

Postal Codes and Other Abbreviations

Many countries use special codes and abbreviations for their administrative divisions, and MapPoint supports several other schemes for data import that are not discussed in this document. The following table lists the other code and abbreviation schemes that MapPoint recognizes.

Country/region  Code or abbreviation scheme                  Example
Brazil          Two-letter state abbreviations               RJ for Rio de Janeiro
Canada          Two-letter province/territory abbreviations  QC for Quebec
France          Two-digit department codes                   75 for Paris
France          Five-digit numeric commune codes             47148 for Leyritz-Moncassin
Germany         Two-letter state abbreviations               HH for Hamburg
Italy           Two-letter province abbreviations            MI for Milano
Mexico          Three-letter state abbreviations             JAL for Jalisco
Poland          Two-letter province abbreviations            SL for Slaskie
Spain           Two-letter province abbreviations            SA for Salamanca
Spain           Two-digit province code                      37 for Salamanca
Switzerland     Two-letter canton abbreviations              VS for Valais
United States   Two-letter state postal abbreviations        CA for California

Conclusion

Importing data into MapPoint with standard code schemes has many advantages over using string matching. Matches are precise, leaving little to chance and reducing the number of unmatched records. Whenever possible, you should use the supported code schemes as your standard way of importing data into MapPoint 2004.

Additional Resources

Click the following links to learn more about the geographic code schemes discussed in this article:


Author: Greg Slayden
Email: gregsla(AT)microsoft.com
URL: http://www.microsoft.com/mappoint
Greg Slayden is a software design engineer at Microsoft Corporation, where he has worked for the MapPoint Business Unit for over 6 years and built maps for the Expedia Web site, the Encarta World Atlas, and MapPoint. He has over 15 years of experience working with geographic and demographic data.




   © 1999-2012 MP2K. Questions and comments to: website@mp2kmag.com
  Microsoft and MapPoint 2002/2004/2006/2009/2010/2011/2013 are either trademarks or registered trademarks of Microsoft.