1.         INTRODUCTION

 

Wolters Kluwer Financial Services – PCi division (“WKFS-PCi”) makes and sells Mapping Software and Mapping Data to Financial Institutions to help them comply with the Community Reinvestment Act (CRA) and the Home Mortgage Disclosure Act (HMDA).  CRA and HMDA are United States federal laws.

 

As the Geo-Information Specialist at WKFS-PCi, I am responsible for the programs and data that allow our clients to generate maps.  This responsibility includes monitoring the business model used for delivering programs and data to our clients.

 

The Internet, and programs associated with it (email, browsers, file transfer protocol, etc.), have greatly expanded the ability of organizations to distribute data, and to interact with data, over great distances.  If WKFS-PCi is to remain competitive in the business of selling Mapping Software, it must determine whether it is using the best methods to deliver data.

 

1.1.      Software and Data

 

The Software, originally developed in 1995, is installed onto the user’s computer, and the Data can be loaded onto the user’s computer, or onto any computer to which the user is networked.  Microsoft SQL Server is the database engine, and the Mapping Software is built from various programs available from MapInfo.

 

The Mapping Data is primarily derived from the TIGER line files from the US Census Bureau (Washington, DC), with some additional layers from Tele Atlas (Lebanon, NH).  Appendix 7.1. lists the layers in the US-level 2006 Mapping Data product.  The size of the data is 5.2 gigabytes, and its compressed “Install Program” takes up 5 cds.  All customers (at the Nationwide-level) receive the same 5.2 gigabytes of data.

 

Appendix 7.2. describes the Mapping Software and the Mapping Data, and explains the importance of a Bank’s Mapping Activities.

 

1.2.      Problems

 

Problems associated with the size of the Mapping Data (5.2 gigs):

 

When new Mapping Data is created, hundreds of sets of cds must be made and distributed throughout the US (including Alaska, Hawaii, and Puerto Rico).  It would save time and money ($5,000 for each shipment to all customers) if we could avoid production and delivery of the actual cds.

 

The data requires space on a user’s machine, or on another drive owned by the institution.  There are storage concerns at our users’ locations (“Where can we put this?”).

 

 

1.3.      Purpose

 

The goal of this research is to answer three questions:

 

1. Can the Internet be used as a channel to distribute data from WKFS-PCi to the client?

 

2. Can the Internet be used as a network between software on a user’s computer and data residing on WKFS-PCi’s computer?

 

3. Can an Internet-based Mapping Program, using data residing on WKFS-PCi’s computer, be used effectively?

 

If any one of these questions can be answered in the affirmative, then money will be saved.  Additional benefits will include:

 

More timely delivery of data.  If the data can be delivered directly from my hard drive to a user’s computer, over the Internet, the user will have access to that data immediately; there will be no wait for shipping companies to deliver the data.

 

Better quality control.  If the data can be delivered directly from my hard drive to a user’s computer, over the Internet, the user will have access to data identical to what is on my computer.  The “production process” is eliminated (the “production process” is defined as burning the data onto a cd).  “Bad” cds are rare, but they do appear from time to time: although they visually “look good”, their media is corrupt, and they cannot be used.  There are no “bad cds” if the data sets are not burned onto cds.

 

Allow customers easy access to the most-recent data.  Data sets are often released on an annual schedule (once a year), or even less frequently.  Improvements are made to the data, however, on an on-going basis.  If the client can access the most-recent data from a Web Server, they do not have to wait until the “official release”, which may be many months in the future.

 

1.4.      Project Assistance

 

Assistance was requested for addressing Questions 1 and 2.  This consisted of questioning IT personnel, at both 30 Winter St, Boston, MA and 130 Turner St, Waltham, MA, for the correct procedures to access remote computers, and working with remote personnel (St. Cloud, MN).

 

1.5.      Limitations

 

Particular limitations to the research were as follows:

 

  • Testing was done using the resources available.  These include “average” computers (generally 2 to 4 years old), and “generic” internal network configurations and firewall protocols (no experimental or beta software or hardware).  Additional discussion of the resources is included in the “Definition of Terms” (bandwidth) and in the conditions described in answering Question 1.

 

  • It is outside the scope of this study to describe in detail implementation of any Internet-based technology solutions, i.e. the actual programming of an Internet-based Mapping Server solution.

 

2.         LITERATURE REVIEW

 

A literature review was performed consisting of library research, Web research and interviews.  The object of the review was to identify procedures that would allow answering the questions-of-study.

 

The resources listed in the Bibliography were consulted, and used to help establish procedures allowing for the testing of the Hypotheses.  Nothing was specifically found addressing the Hypotheses, probably due to the continuously-changing nature of computer networks and Internet protocols.

 

3.         HYPOTHESES

 

Hypothesis 1: The Mapping Data can be distributed over the Internet.

 

Hypothesis 2: The physical location of the Mapping Data, in relation to the Mapping Program, is not a concern.

 

Hypothesis 3:  Google Maps or Google Earth (Internet-based Mapping Programs that are currently available) can be used as an effective Mapping Engine.

 

 

4.         METHODS OF INVESTIGATION

 

Hypothesis 1: The Mapping Data can be distributed over the Internet.

 

To test this hypothesis, I will create a shared folder on another WKFS-PCi employee’s computer in St. Cloud, MN (1,200 miles away).  Using Windows Explorer, I will then attempt to copy the Northeast Region Mapping Data into the shared folder.
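The copy test can also be scripted so that the file count, size, and elapsed time are recorded together.  The following is only a minimal sketch, not the procedure used in this study (which relied on Windows Explorer); the function name and the example paths in the comments are hypothetical.

```python
import shutil
import time
from pathlib import Path


def timed_copy(source_dir: Path, target_dir: Path) -> dict:
    """Copy a folder tree and report file count, size, elapsed time, and rate."""
    files = [p for p in source_dir.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)

    start = time.perf_counter()
    shutil.copytree(source_dir, target_dir, dirs_exist_ok=True)  # Python 3.8+
    elapsed = time.perf_counter() - start

    kb_per_second = (total_bytes / 1024) / elapsed if elapsed > 0 else float("inf")
    return {
        "files": len(files),
        "megabytes": total_bytes / 1024**2,
        "seconds": elapsed,
        "kb_per_second": kb_per_second,
    }


# Example call with hypothetical paths (a local data folder and a remote share):
#   report = timed_copy(Path(r"C:\MappingData\Northeast"),
#                       Path(r"\\remote-pc\shared\Northeast"))
#   print(report)
```

Because the target is simply a folder path, the same function works for a local folder, an internal network share, or a share reached over a wide-area connection.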

 

Hypothesis 2: The physical location of the Mapping Data, in relation to the Mapping Program, is not a concern.

 

To test this hypothesis, I will make the same map under three different scenarios.  The first time, I will be using the Mapping Data and Mapping Program installed on my local machine.  The second time, I will be using the Mapping Data in St. Cloud, MN, but using the Mapping Program in our Boston office.  The third time, I will be using the Mapping Data in St. Cloud, MN, but using the Mapping Program in our Waltham office.

 

I will complete this table:

 

Mapping Data Location     Mapping Program Location     Time to make CT Map
Boston, MA                Boston, MA
St. Cloud, MN             Boston, MA
St. Cloud, MN             Waltham, MA

 

Hypothesis 3:  Google Maps or Google Earth (Internet-based Mapping Programs that are currently available) can be used as an effective Mapping Engine.

 

To test this hypothesis, I will 1) acquire licenses for both Google Maps and Google Earth, 2) contract for a Web Server capable of servicing the Google Maps API, and 3) learn enough programming to be able to create the necessary maps.

 

5.         RESULTS

 

Hypothesis 1: The Mapping Data can be distributed over the Internet.

 

In the Boston office, using Boston’s T1 line, it took 30 minutes to copy 280 files (297 megs) [please see CONCLUSIONS for a discussion of “Boston’s T1 line versus Waltham’s T3 line”].  At that rate (about 9.9 megs per minute), it would take roughly 9 hours to transfer the 5.2 gigs of US-level 2006 Mapping Data.

 

To confirm my findings, I switched the test to the Waltham office.  There, using Waltham’s T3 line, I downloaded a zipped TIGER file (Middlesex County, Massachusetts; 13.3 megs) from the Census Bureau’s web site (http://www2.census.gov/geo/tiger/tiger2006fe/MA/).  It took 91 seconds, meaning a rate of 149.6 kilobytes per second.

 

At that rate, it would take about 10.1 hours to transfer the 5.2 gigs of US-level 2006 Mapping Data.
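The estimates in this section extrapolate a timed sample to the full 5.2 gigs.  The arithmetic can be reproduced with the short sketch below.  It assumes binary units (1 gig = 1,024 megs, 1 meg = 1,024 kilobytes), which is the convention that reproduces the 149.6 kilobytes-per-second figure quoted above; the function names are illustrative only.

```python
MB_PER_GB = 1024  # assumption: binary units, consistent with the 149.6 KB/s figure
KB_PER_MB = 1024


def rate_kb_per_second(sample_mb: float, sample_seconds: float) -> float:
    """Observed transfer rate of a timed sample, in kilobytes per second."""
    return sample_mb * KB_PER_MB / sample_seconds


def hours_to_transfer(total_gb: float, sample_mb: float, sample_seconds: float) -> float:
    """Extrapolate a timed sample to the time needed for the whole data set."""
    total_kb = total_gb * MB_PER_GB * KB_PER_MB
    return total_kb / rate_kb_per_second(sample_mb, sample_seconds) / 3600


# Timed samples reported in this section.
boston = hours_to_transfer(5.2, sample_mb=297, sample_seconds=30 * 60)
waltham = hours_to_transfer(5.2, sample_mb=13.3, sample_seconds=91)
print(f"Boston sample:  {boston:.1f} hours")
print(f"Waltham sample: {waltham:.1f} hours")
```

The extrapolation assumes the sampled rate holds for the entire transfer, which a shared office connection does not guarantee.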

 

Hypothesis 2: The physical location of the Mapping Data, in relation to the Mapping Program, is not a concern.

 

Once the Mapping Data had been copied to the St. Cloud, MN location, I opened the CRA Wiz program.  There is a setting in the program that allows the user to specify the network path to the Mapping Data.  For test 1 in Boston, I set the path to my own local machine.  For tests 2 and 3, I set the path to the computer in St. Cloud, MN (using machines in Boston and Waltham, respectively).

 

After setting the Mapping Data path, using the Mapping Program, I made a thematic map of the 815 Census Tracts in Connecticut, shading them by “Census Tract Income Level as a % of MSA Median Income”.

 

 

Map 1: Demographic Data for 815 Census Tracts in Connecticut

 

With the Mapping Data on my local computer, the map took 15 seconds to draw.

With the Mapping Data 1,200 miles away from the Boston office, the map took only about 1 minute to draw; from the Waltham office (T3 line), however, the map took 60 minutes to draw.

 

Mapping Data Location     Mapping Program Location     Time to make CT Map
Boston, MA                Boston, MA                   15 seconds
St. Cloud, MN             Boston, MA                   60 seconds
St. Cloud, MN             Waltham, MA                  60 minutes
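To put the three timings on a common footing, each can be expressed as a multiple of the local (Boston data, Boston program) draw time.  The sketch below uses only the times reported in the table; the variable names are illustrative.

```python
# Draw times from the table, keyed by (data location, program location), in seconds.
timings_seconds = {
    ("Boston, MA", "Boston, MA"): 15,
    ("St. Cloud, MN", "Boston, MA"): 60,
    ("St. Cloud, MN", "Waltham, MA"): 60 * 60,
}

baseline = timings_seconds[("Boston, MA", "Boston, MA")]
slowdown = {key: seconds / baseline for key, seconds in timings_seconds.items()}

for (data_location, program_location), factor in slowdown.items():
    print(f"data in {data_location}, program in {program_location}: "
          f"{factor:g}x the local draw time")
```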

 

Hypothesis 3:  Google Maps or Google Earth (Internet-based Mapping Programs that are currently available) can be used as an effective Mapping Engine.

 

I contacted Google and acquired a license for Google Maps.  To acquire a “Google Maps license”, you need to sign up for a Google Maps API key (http://www.google.com/apis/maps/).  To quote the Google Maps API home page:

 

The Google Maps API lets you embed Google Maps in your own web pages with JavaScript. You can add overlays to the map (including markers and polylines) and display shadowed "info windows" just like Google Maps.  The Maps API is a free beta service, available for any web site that is free to consumers.

 

 

One important part of the agreement is that the key must be used on a “web site that is free to consumers”.  That entailed contracting for a Web Server (a computer acting as a node on the Internet, with its own URL).  I accomplished that, and developed a map which displayed client data in the Boston, MA area.

 

I also contacted Google and acquired a license for Google Earth Pro ($400) (http://earth.google.com/earth_pro.html).  Anyone may download a free version of Google Earth (http://earth.google.com/index.html), which can be used to view Google Earth and any layers that other people have created; if you want to import files from a Geographic Information System (MapInfo .tab files, for example), you need to purchase Google Earth Pro.  After acquiring the Google Earth Pro license, I imported points (Census Tract Numbers), lines (Census Tract Boundary files), and polygons (Census Tracts shaded by Income as a % of MSA Median).

 

6.         CONCLUSIONS

 

Hypothesis 1: The Mapping Data can be distributed over the Internet.

 

Conclusion 1: At the transfer rates measured, distributing the 5.2 gigabytes of Mapping Data would take many hours for each customer, which is unacceptable.  Therefore, the Mapping Data cannot be distributed over the Internet.

 

Hypothesis 2: The physical location of the Mapping Data, in relation to the Mapping Program, is not a concern.

 

Conclusion 2:

 

Boston’s T1 line versus Waltham’s T3 line

In the Summer of 2006, the PCi Boston office had two T1 lines (~3.0 Mbits/second transfer rate) available for ~100 users, but all the servers were local.  As of November 1, 2006, the Waltham office also had two T1 lines, but the number of people had increased to ~250, and all our servers had been moved to either Texas or Rhode Island.  This configuration caused such a slow Internet connection that the Mapping Program stopped.  Waltham received an upgrade to a T3 line (45 Mbits/second transfer rate) in late November 2006, but, as of November 22, 2006, the results were very disappointing.

 

Because of IT’s configuration of our network at our Boston office, the St. Cloud mapping was (relatively) very quick.  Unfortunately, in the Waltham office, the network configuration is radically different.  The results are very disappointing, even though Waltham is using a T3 line (45 Mbits/second transfer rate).  The problem is bandwidth (see Appendix 7.3. Definition of Terms), and 60 minutes to draw a map of Connecticut is unacceptable.

 

Because of the amount of time that it would take to create a map of the Census Tracts in Connecticut (ranging from 1 minute to 60 minutes), the policy should be that the Mapping Data be installed on the user’s computer, or at least on a computer to which the user has a very fast internal (behind the company’s Firewall) connection.

 

Hypothesis 3:  Google Maps or Google Earth (Internet-based Mapping Programs that are currently available) can be used as an effective Mapping Engine.

 

Google Maps

 

Map 2: Client Data displayed in Google Maps

 

Over the last two years, Google has had 100 million people download their Google Earth program (and there are more than 30,000 people who have downloaded the developer’s code), and Google’s Google Maps program is ubiquitous. Their “coolness” factor is very high, and our clients, especially new clients, are more-than-familiar with both interfaces. This directly translates into specific user-preferences for Presentation/Delivery: “when you make a map, put the data [points, boundaries] on Google Earth!” (or Google Maps!) 

 

The map above is an example of such a “client-requested display” (Google Maps).  The programming can be implemented quickly, and the functioning of the program (creating a small XML file of latitude/longitude coordinates, and displaying that file with a Google Maps background) is relatively fast.  It allows the client to switch display modes (Map/Satellite/Hybrid), as well as zoom-in/zoom-out, and pan.  Therefore, Google Maps is a good Internet-based Mapping Engine for the display of point data.
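The “small XML file of latitude/longitude coordinates” mentioned above can be produced with a few lines of code.  The sketch below is an illustration only: the element and attribute names (markers, marker, label, lat, lng) are an assumed layout, not the file format used in this study, and the sample points are made up rather than client data.

```python
import xml.etree.ElementTree as ET


def build_marker_xml(points):
    """Return an XML string with one <marker> element per (label, lat, lng) point."""
    root = ET.Element("markers")
    for label, lat, lng in points:
        ET.SubElement(root, "marker", {
            "label": label,
            "lat": f"{lat:.6f}",
            "lng": f"{lng:.6f}",
        })
    return ET.tostring(root, encoding="unicode")


# Made-up sample points in the Boston, MA area (not client data).
sample_points = [
    ("Branch A", 42.3601, -71.0589),
    ("Branch B", 42.3496, -71.0746),
]
marker_xml = build_marker_xml(sample_points)
print(marker_xml)
```

A file like this stays small because it carries only the points to be displayed; the background map is supplied by Google Maps.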

 

The programming for additional functionality (display of boundaries, thematic cartography) is complicated, and will need to be done at a later time.  Having said that, it is possible to display Boundaries in both Google Maps and Google Earth.

 

Map 3: blue Census Tract Boundaries (Suffolk County, MA) displayed in Google Maps

 

The map above shows blue Census Tract boundaries in Suffolk County, MA. 

 

Research indicates that it may not be possible to display Census Tract Numbers in Google Maps.

 

Google Earth

 

Map 4: Census Tract Boundaries and Numbers (Chicago, IL) displayed in Google Earth

 

Google Maps or Google Earth can be used for a Mapping Engine, but (at this time) only for certain selected functionalities.  Google Maps can certainly be used to display point data, and can probably be used to view boundaries.  Unfortunately, thematic mapping on Google Maps, and text display of Census Tract Numbers, are not on track for current development.

 

On the other hand, Google Earth has a very nice interface for displaying boundaries and text, but is more complicated in dealing with point data.  Both platforms can display Thematic Mapping, but it is a cumbersome process, and not slated for current development at WKFS-PCi.

 

An additional factor in the comparison between Google Earth and Google Maps: Google Earth is a desktop-program that must be installed on the user’s computer, while Google Maps is simply accessed through a Web Browser (a separate installation is not necessary).  The IT department at a client’s institution may object to “additional software being installed on a user’s machine”, but not object to “accessing a program through a web browser, just as long as nothing gets installed locally”.

 

Conclusion 3: Yes, Google Maps or Google Earth can be used for a Mapping Engine, but (at this time) only for certain selected functionalities.  If the clients demand a more fully-featured Mapping solution (thematic cartography), then neither Google Maps nor Google Earth can be used for a Mapping Engine at this time.

 

Closing Remark:  Although the questions addressed in this study were not answered with enthusiastic YESs, the door is certainly not closed on the future.  Do we, in 2007, stand at the threshold of unbridled bandwidth?  Or, as discussed in Jonathan Zittrain’s writing, are we at a point where a conservative response to the anarchy of the Internet will “prove decisive in closing today’s open computing environments”?

 

7.       Appendix

 

7.1. Mapping Data Layers

 

Labels:
-  Town Names                                   -  Custom Labels

Points:
-  Branch & Deposit Data                        -  Airport
-  US Post Offices                              -  Hospital
-  Church                                       -  Custom Coordinates
-  School

Lines:
-  Tract Border: Streets                        -  All Major Roads
-  Tract Border: Railroad                       -  Interstate Highways
-  Tract Border: Water                          -  US Highways
-  Streets (Low Income Tracts)                  -  State and County Highways
-  Streets (Moderate Income Tracts)             -  Railroads
-  Streets

Boundaries:
-  Outline                                      -  Hospital Area
-  FEMA Data                                    -  Golf Courses
-  Census Designated Place                      -  Federal Land
-  Census Tracts 1990                           -  Indian Reservation
-  Census Tracts 2000                           -  Cemetery
-  Census Tracts 2004                           -  Military Installation
-  ZIP Codes                                    -  National Park
-  Town (Minor Civil Division)                  -  State Park
-  104th Congressional Districts (1995-1996)    -  County 2000
-  106th Congressional Districts (1999-2000)    -  County 2004
-  108th Congressional Districts (2003-2004)    -  MSA 2000
-  109th Congressional Districts (2005-2006)    -  MSA 2004
-  School Area                                  -  State
-  Airport Area                                 -  Water

 

In addition to the Mapping Data, there are 4 additional layers available in a map: the thematic layer, the thematic overlay, a dot density layer, and a pie chart layer.

 

7.2. Description of Mapping Software and the Mapping Data

 

WKFS-PCi sells computer software that helps a bank with its compliance with the Community Reinvestment Act (CRA) and Home Mortgage Disclosure Act (HMDA).  A Government Examiner visits the Bank every few years to assess their compliance with the CRA.  Upon completion of a CRA examination, an overall CRA rating is assigned using a four-tiered rating system.  These ratings are: Outstanding, Satisfactory, Needs to Improve, and Substantial Noncompliance.

 

Although there are currently no penalties associated with CRA ratings, a bank’s CRA rating is a very important factor in normal bank activities relating to mergers and acquisitions, and branch openings and closings.  Because the Secretary of State for each State has Approval power for those activities, that office can deny a bank’s application/petition to buy another bank, to sell itself to another bank, or even to open or close branches.  The Secretary of State uses an institution’s CRA Rating as a prime factor in these judgments.

 

The Mapping Software allows the Bank to view their Loan Application activity against a background of Census Tracts shaded by Demographic Data.

 

Map 5: Sample Bank and Demographic Data for downtown Boston, MA

 

In this map, a Sample Bank’s loan-application activity is represented by Pie Charts in each census tract, and the Census Tracts are shaded by “Income Level as a % of MSA Median” (4 regulatory-defined ranges).
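The four income ranges can be expressed as a simple classification rule.  The cut-offs used below (under 50%, 50% to under 80%, 80% to under 120%, and 120% or more of the MSA median) are the ranges defined in the CRA regulations; they are not restated elsewhere in this paper, so treat them as an assumption to verify against the current regulation.  The function name and sample incomes are illustrative.

```python
def income_level(tract_income: float, msa_median_income: float) -> str:
    """Classify a census tract by its income as a percentage of the MSA median."""
    if msa_median_income <= 0:
        raise ValueError("MSA median income must be positive")
    percent = 100.0 * tract_income / msa_median_income
    if percent < 50:
        return "Low"
    if percent < 80:
        return "Moderate"
    if percent < 120:
        return "Middle"
    return "Upper"


# Made-up incomes compared against a made-up MSA median of 75,000.
for tract_income in (30_000, 55_000, 80_000, 130_000):
    print(tract_income, income_level(tract_income, 75_000))
```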

 

Bank Activity:  The Pie Charts show that the Sample Bank had a maximum of 210 loan applications in census tract 108, a similar count in census tract 703, and smaller counts in census tracts 201, 612, 107, 101.01, 102.02, 106, 707, 706, 708, and 105.

 

Slices in the Pie Charts:

  • Yellow slice represents the count of applications that were Originated,
  • Orange slice represents the count of applications that were Denied, and
  • Blue slice represents the count of applications that were neither Originated nor Denied (Approved but Not Accepted, Withdrawn, Closed Incomplete, Purchased, Preapproval Denied, or Preapproval Not Accepted)
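The grouping of application outcomes into three slices can be sketched as a tally per census tract.  This is an illustration of the grouping described above, not the Mapping Software’s own code; the function names are hypothetical and the sample records are made up.

```python
from collections import Counter, defaultdict

# Outcomes that fall into the "neither Originated nor Denied" slice.
OTHER_OUTCOMES = {
    "Approved but Not Accepted",
    "Withdrawn",
    "Closed Incomplete",
    "Purchased",
    "Preapproval Denied",
    "Preapproval Not Accepted",
}


def slice_color(outcome: str) -> str:
    """Map an application outcome to its pie-chart slice color."""
    if outcome == "Originated":
        return "Yellow"
    if outcome == "Denied":
        return "Orange"
    if outcome in OTHER_OUTCOMES:
        return "Blue"
    raise ValueError(f"unknown outcome: {outcome!r}")


def slices_by_tract(applications):
    """Count applications per slice color for each census tract."""
    tallies = defaultdict(Counter)
    for tract, outcome in applications:
        tallies[tract][slice_color(outcome)] += 1
    return {tract: dict(counts) for tract, counts in tallies.items()}


# Made-up records for illustration only: (census tract, outcome).
sample_applications = [
    ("108", "Originated"),
    ("108", "Denied"),
    ("108", "Withdrawn"),
    ("108", "Originated"),
    ("703", "Purchased"),
]
print(slices_by_tract(sample_applications))
```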

 

Demographics: The map shows that although the Back Bay and Copley Square areas (census tracts 105, 106, 107, and 108) are classified as Upper Income, the neighborhoods transition to Middle Income (census tracts 101.01, 102.02, 707, and 708), then to Moderate Income (census tracts 102.01, 104.01, 104.02, 705, and 709), and then to Low Income (census tracts 103, 808, 806, 805, 804, and 711).  If a bank contains all these census tracts in its Assessment Area, an analysis of its HMDA data (annual reporting of loan application locations, by census tract) will reveal the bank’s compliance with the Community Reinvestment Act.  In this example, the Sample Bank, because it only has applications from Middle-Income and Upper-Income Census Tracts, would be in danger of receiving a Substantial Noncompliance rating, the lowest of the four CRA ratings described above.  This could have grave consequences related to any strategic planning at the Bank (opening and closing branches, purchasing another bank, or selling itself to another bank).

 

7.3. Definition of Terms

 

bandwidth

For this paper, I define “bandwidth” as a combination of two factors: “size-of-pipe” and “management-of-pipe”.  The “size-of-pipe” factor is easy to determine: a T1 line can handle an Internet connection speed of 1.544 Mbps (megabits-per-second), and a T3 line handles an Internet connection speed of 44.736 Mbps.  The “management-of-pipe” factor, however, is much more difficult to determine: it falls into the “if it looks like a bird, and flies, it is a bird” school of analysis.  In this case, the “management-of-pipe” factor is estimated by speaking with IT representatives (both in Waltham and Texas), and with various WKFS-PCi internal users.  The consensus is that our Internet connections are slower in Waltham than they were in Boston, ranging from “twice as slow” to “an order of magnitude” slower (10 times as slow).
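The “size-of-pipe” factor sets a best-case floor on transfer time, and the gap between that floor and a measured rate gives a rough sense of the “management-of-pipe” factor.  The sketch below works this out for the 5.2-gigabyte data set and for the 149.6 kilobytes-per-second download measured in Waltham.  It assumes line rates in decimal megabits (1 Mbps = 1,000,000 bits per second) and data sizes in binary units; the function names are illustrative.

```python
LINE_RATE_MBPS = {"T1": 1.544, "T3": 44.736}  # nominal line rates


def best_case_hours(total_gb: float, line_mbps: float) -> float:
    """Transfer time if the whole line rate were available to a single transfer."""
    total_bits = total_gb * 1024**3 * 8
    return total_bits / (line_mbps * 1_000_000) / 3600


def utilization(measured_kb_per_second: float, line_mbps: float) -> float:
    """Fraction of the nominal line rate that a measured transfer actually achieved."""
    measured_bits_per_second = measured_kb_per_second * 1024 * 8
    return measured_bits_per_second / (line_mbps * 1_000_000)


for line, mbps in LINE_RATE_MBPS.items():
    print(f"{line}: best case {best_case_hours(5.2, mbps):.2f} hours for 5.2 gigabytes")

share = utilization(149.6, LINE_RATE_MBPS["T3"])
print(f"Measured Waltham download used {share:.1%} of the T3 line rate")
```

A measured rate far below the nominal rate points to sharing, routing, or configuration effects rather than to the size of the line itself.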

 

CRA

The Community Reinvestment Act (CRA), enacted by Congress in 1977, is intended to encourage depository institutions to help meet the credit needs of the communities in which they operate.  Depository institutions are inspected by government representatives periodically to ensure compliance with the Act.

 

HMDA

The Home Mortgage Disclosure Act (HMDA), enacted by Congress in 1975 and implemented by the Federal Reserve Board’s Regulation C, requires lending institutions to report public loan data.  The deadline for each annual reporting is March 1 of the following year.

 

Mapper module

The phrase “Mapper module” represents the entire product (Mapping Software plus Mapping Data).

 

WKFS-PCi

“WKFS-PCi” stands for Wolters Kluwer Financial Services – PCi division.  The PCi division makes and sells database and mapping software (and data) to bank compliance officers, and their regulators.  Every bank is chartered to do business in a particular geographic area (the City of Boston, or Suffolk County, for example); this is known as its “Assessment Area”.  Although the bank is not expected to make loans everywhere throughout its Assessment Area, the government expects it to, at a minimum, solicit business throughout that area.  The most direct way to analyze a bank’s lending throughout an area is to create a map showing its loans in that area.  The software allows clients to create reports and maps.

 

7.4.  Future Work and Development

 

Thematic Mapping is possible on both Google Earth and Google Maps, but will not be put into production at WKFS-PCi at this time.  Should the company some day decide to move in this direction, it can review these examples of Thematic Mapping that I have prepared.

 

Map 6: Demographic Data for Census Tracts in Los Angeles, CA displayed in Google Earth

This Thematic Map of the Beverly Hills/West Hollywood section of Los Angeles, CA has Census Tracts shaded by “Income as a % of MSA Income”:  yellow = Low Income, green = Moderate Income, blue = Middle Income, and purple = Upper Income.

 

Map 7: Demographic Data for Census Tracts in Suffolk County, MA displayed in Google Maps

 

This Thematic Map of Suffolk County, MA has Census Tracts shaded by “Income as a % of MSA Income”.

 

Map 8: Demographic Data for Census Tracts in Los Angeles, CA displayed in Google Earth

 

An exciting visualization prospect is “extruded boundaries” coupled with Google Earth’s “pan/zoom/tilt abilities”. The yellow Low-Income Census Tracts are easily identifiable:

 

Since the layers associated with Thematic Mapping must be created and either reside on the WKFS-PCi Mapping Server (in the case of Google Maps), or reside on the client’s computer (in the case of Google Earth), development and delivery of such Internet-based solutions are for the future.
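For Google Earth, a thematic layer of this kind is ultimately a KML file in which each Census Tract is a shaded polygon, optionally extruded to a height.  The sketch below builds one such polygon.  It is an illustration under stated assumptions, not the process used for the maps above (which were imported through Google Earth Pro): the function name, style id, sample coordinates, and height are made up, and the KML 2.2 namespace is assumed.  KML colors are written as aabbggrr hex, so a half-transparent yellow is 7f00ffff.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # assumption: KML 2.2 namespace


def tract_polygon_kml(name, ring, color_aabbggrr, height_m):
    """Build a KML document holding one shaded, extruded polygon.

    ring is a closed list of (longitude, latitude) pairs (first point == last).
    """
    ET.register_namespace("", KML_NS)

    def tag(local_name):
        return f"{{{KML_NS}}}{local_name}"

    kml = ET.Element(tag("kml"))
    document = ET.SubElement(kml, tag("Document"))

    style = ET.SubElement(document, tag("Style"), {"id": "tractStyle"})
    poly_style = ET.SubElement(style, tag("PolyStyle"))
    ET.SubElement(poly_style, tag("color")).text = color_aabbggrr

    placemark = ET.SubElement(document, tag("Placemark"))
    ET.SubElement(placemark, tag("name")).text = name
    ET.SubElement(placemark, tag("styleUrl")).text = "#tractStyle"

    polygon = ET.SubElement(placemark, tag("Polygon"))
    ET.SubElement(polygon, tag("extrude")).text = "1"
    ET.SubElement(polygon, tag("altitudeMode")).text = "relativeToGround"
    outer = ET.SubElement(polygon, tag("outerBoundaryIs"))
    linear_ring = ET.SubElement(outer, tag("LinearRing"))
    ET.SubElement(linear_ring, tag("coordinates")).text = " ".join(
        f"{lon},{lat},{height_m}" for lon, lat in ring
    )
    return ET.tostring(kml, encoding="unicode")


# Made-up rectangle near Los Angeles, CA; not a real Census Tract boundary.
sample_ring = [
    (-118.40, 34.07), (-118.39, 34.07), (-118.39, 34.08),
    (-118.40, 34.08), (-118.40, 34.07),
]
kml_text = tract_polygon_kml("Sample Low-Income Tract", sample_ring, "7f00ffff", 200)
print(kml_text)
```

Generating one such Placemark per Census Tract, with the color chosen by income level, would give a layer that could reside on the client’s computer as described above.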

 

 

 

 

 

 

 

 

 

8.  Bibliography

 

Building Websites with Joomla!, Graf, Hagen, 2006, Packt Publishing, Birmingham, UK

 

Google Maps Hacks, Gibson, Rich & Erie, Schuyler, 2006, O’Reilly Media, Inc., Sebastopol, CA

 

Hacking Google Maps and Google Earth, Brown, Martin C., 2006, Wiley Publishing, Inc., Indianapolis, IN

 

Beginning MapServer: Open Source GIS Development, Kropla, Bill, 2005, Apress, Berkeley, CA

 

Mapping Hacks, Erie, Schuyler & Gibson, Rich & Walsh, Jo, 2005, O’Reilly Media, Inc., Sebastopol, CA

 

Web Mapping Illustrated, Mitchell, Tyler, 2005, O’Reilly Media, Inc., Sebastopol, CA

 

About Face 2.0: The Essentials of Interaction Design, Cooper, Alan, 2003, Wiley Publishing, Inc., Indianapolis, IN

 

ArcUser Magazine, ESRI, Redlands, CA

 

“The Revenge of Geography”, The Economist, 2003

 

“The View from Google Earth”, Wagner, Mary Jo, Geospatial Solutions, May 1, 2006

 

“Google’s not-so-very-secret weapon”, Markoff, John and Hansell, Saul, The New York Times, June 13, 2006

 

“Using Microsoft MapPoint 2004 with Microsoft Windows SharePoint Services”, Buchanan, Nancy, www.microsoft.com website

 

www.ffiec.gov website

 

Zittrain, Jonathan, “The Generative Internet”, Harvard Law Review, Vol. 119, p. 1974, May 2006.  Available at SSRN: http://ssrn.com/abstract=847124