GIS Challenges Abound
Developing applications for Geographical Information Systems is all the rage these days. But the work has its problems; we take a closer look at them, along with some possible solutions.
Techies love venturing into application development areas that are seen to be fun, as well as sought after by consumers. This perhaps explains the scramble among the tech community to enter the space of Geographical Information Systems (GIS) applications for GPS-enabled mobile phones. But in truth, the utility of GIS extends far beyond digital mapping to help solve cartography problems. Any environmentalist will tell you that GIS is an essential tool that enables the modelling, analysis and management of fast-degrading and scarce spatially related natural resources. And while the masses using GIS-based location applications want relatively simple tools, the needs of more serious users call for a full-scale GIS software environment.
But developing a GIS application is not a lark! Here’s a rundown of the difficulties that an application developer is likely to face, as well as the solutions suggested by experts.
Capturing data
Data collection often influences the cost of implementing a new GIS application, simply because of the sheer volume of data that must be accumulated and edited. A GIS application not only uses map geometry generated by vectorisation technology, but also requires feature information to be added (identifying a lake or river, for example) so as to generate meaningful topology and associate user data. This difficulty has led to the creation of raster databases based on data generated by remote sensors (satellites), rasterised paper maps generated by scanning, and grid data derived from digital terrain models. Unlike the difficulty of data capture faced by developers creating vector-based databases, remote image processing systems face computational problems in dealing with the sheer bulk of data available.
It is increasingly believed that an ideal GIS system is a hybrid, integrated system based on a raster-vector database that facilitates the manipulation, viewing and analysis of both forms of data in a seamless environment. According to Rajesh C. Mathur, president, ESRI India, the voluminous data entailed in such a hybrid system is best handled by an RDBMS (relational database management system) that supports spatial geometry in a seamless manner, such as ESRI’s ArcSDE, which handles both vector and raster data. A developer using it would only have to identify the spatial relationships and structures that are common in the voluminous data, and the analytical techniques that are specific to each form of spatial data (vector and raster). This process is made easier by the wide range of geodata services ArcSDE provides for data extraction, replication, warehousing, mining and synchronisation. ArcSDE also provides a framework and the tools to manage large spatial datasets in an RDBMS, such as IBM DB2, IBM Informix, Oracle, Microsoft Access, Microsoft SQL Server and PostgreSQL. ArcSDE implements data warehousing and data mining sub-routines through its unique design.
Handling huge volumes of spatially related data
A vector GIS system may require disk space between 50 KB and 5 MB to represent one square kilometre of a city area at a scale of 1:5000, while the same area represented by raster data may need 10 MB. As Jithesh P. Joseph, director, Maptell Geosystems, explains, “The total disk space needed depends on the vector or raster format you use to store the data, as well as the scale of the data. Here again, the size of vector data depends on the density of features, the number of layers available and the data attributes added to each geometric object.”
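To get a feel for what these per-square-kilometre figures mean for a whole city, the arithmetic can be sketched as below. The 400 sq km city area is a hypothetical input chosen for illustration, the function name is invented, and 1 MB is taken as 1,024 KB; only the per-square-kilometre figures come from the numbers quoted above.

```python
# Per-square-kilometre figures quoted above for city mapping at 1:5000.
VECTOR_KB_PER_KM2 = (50, 5 * 1024)  # 50 KB to 5 MB (assuming 1 MB = 1024 KB)
RASTER_KB_PER_KM2 = 10 * 1024       # 10 MB


def estimate_storage_mb(area_km2):
    """Return (vector_low, vector_high, raster) storage estimates in MB."""
    low_kb, high_kb = VECTOR_KB_PER_KM2
    return (
        area_km2 * low_kb / 1024,
        area_km2 * high_kb / 1024,
        area_km2 * RASTER_KB_PER_KM2 / 1024,
    )


# Hypothetical city of 400 sq km (an assumption, not a figure from the article).
vec_low, vec_high, raster = estimate_storage_mb(400)
print(f"vector: {vec_low:.1f} MB to {vec_high:.1f} MB, raster: {raster:.1f} MB")
```

Even on these rough numbers, the vector estimate varies by two orders of magnitude depending on feature density, which is the point Joseph makes about layers and attributes.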
A conventional DBMS would strain to perform analytical tasks on such voluminous data. Most popular commercial RDBMSs are not up to the task of handling geographic, as opposed to alphanumeric, data. For instance, a query language like SQL is easy to apply to typical search queries involving equality or comparison operators and a user-specified object, but not so when searching for, say, objects ‘inside’ a given polygon or ‘near’ another object. A large polygonal area, or a part of it, is far more difficult to process than any sort of linear shape, especially when the system is required to process only a select few of its millions of points. On top of the difficulty of using unsuitable operators to search data that is hard to index, SQL would retrieve information far more slowly than desired.
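To see why an ‘inside’ query is a different kind of operation from an equality or comparison test, consider a minimal sketch of the classic ray-casting point-in-polygon test. This is a textbook technique used here for illustration, not something any particular GIS product is claimed to use; the function names and sample data are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    from (x, y) towards +infinity crosses; an odd count means inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def objects_inside(objects, polygon):
    """Naive 'inside' query: test every object against every polygon edge."""
    return [name for name, (x, y) in objects.items()
            if point_in_polygon(x, y, polygon)]


# Hypothetical data: a square district and three point features.
district = [(0, 0), (10, 0), (10, 10), (0, 10)]
wells = {"A": (5, 5), "B": (15, 5), "C": (2, 8)}
print(objects_inside(wells, district))
```

The query visits every object and every edge of the polygon, so without a spatial index its cost grows with both the number of objects and the number of polygon vertices. That is the slowdown described above.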
This problem may be surmounted by building a special-purpose database system to handle the graphics, which would be broken down into map sheets, with map management techniques to facilitate their use. Map sheets are considered easy to file and index using the standardised referencing system of SOI (Survey of India) or NRSA (National Remote Sensing Agency) map grid systems (see box), and hence amenable to a swift search, provided sheets representing closely located geographical areas are clustered on a disk so as to have similar addresses. But map sheets effectively put paid to a user’s requirement of a single, continuous, seamless landscape. As with paper maps, map sheets cause difficulties when users need to view objects spanning more than one sheet. Besides, sheets of a single size are a makeshift way of dealing with data of variable density: an urban area may need a much larger scale than a rural landscape to be depicted accurately.
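The idea of indexing by map sheet, and the trouble with objects that span sheets, can be sketched with a simple uniform grid. The quarter-degree sheet size and the (row, column) numbering below are assumptions made for illustration; they are not the SOI or NRSA referencing schemes.

```python
import math

# Hypothetical sheet size in degrees; NOT the SOI or NRSA grid definition.
SHEET_SIZE_DEG = 0.25


def sheet_id(lon, lat):
    """Return the (row, col) of the grid sheet containing a point."""
    col = math.floor(lon / SHEET_SIZE_DEG)
    row = math.floor(lat / SHEET_SIZE_DEG)
    return (row, col)


def sheets_for_bbox(min_lon, min_lat, max_lon, max_lat):
    """Return every sheet touched by an object's bounding box."""
    r0, c0 = sheet_id(min_lon, min_lat)
    r1, c1 = sheet_id(max_lon, max_lat)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# A point lookup resolves to exactly one sheet ...
print(sheet_id(77.6, 12.97))
# ... but a modest-sized object can straddle several sheets.
print(sheets_for_bbox(77.6, 12.9, 77.8, 13.1))
```

Finding the sheet for a point is a constant-time calculation, which is what makes sheets quick to file and search. The second call shows the cost: one object here touches four sheets, each of which must be fetched and stitched together before the user sees a continuous feature.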
Map sheets may cause an explosion of data or slowdowns when the system performs overlay analysis. An overlay analysis typically answers this type of question—“Show agricultural areas with clayey soil where the ground slope is less than four degrees.” In order to answer this question, the system would have to efficiently combine topology and geometry, and the resulting areas may often be polygonal areas spanning multiple map sheets.
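On raster data, the overlay question above reduces to a cell-by-cell combination of three layers. The following is a minimal sketch on tiny hand-made grids; the layer values, category names and function name are all hypothetical, and a real system would work on far larger grids and on vector polygons as well.

```python
# Three co-registered raster layers covering the same 2 x 3 grid of cells.
# All values are made-up sample data.
land_use = [["agri", "agri", "urban"],
            ["agri", "forest", "agri"]]
soil = [["clay", "sand", "clay"],
        ["clay", "clay", "clay"]]
slope_deg = [[2.0, 1.5, 3.0],
             [5.0, 1.0, 3.9]]


def overlay(land_use, soil, slope_deg, max_slope=4.0):
    """Mark cells that are agricultural, clayey and flatter than max_slope."""
    return [
        [lu == "agri" and s == "clay" and sl < max_slope
         for lu, s, sl in zip(lu_row, s_row, sl_row)]
        for lu_row, s_row, sl_row in zip(land_use, soil, slope_deg)
    ]


result = overlay(land_use, soil, slope_deg)
for row in result:
    print(row)
```

Each layer must be read in full for the area of interest, which is why an overlay across many map sheets multiplies the data the system has to load and combine.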
Development of a customised front-end system
The limitations of dealing with map sheets suggest that a GIS application should ideally be based on a single database system engineered for optimum performance. However, this would not satisfy users who may already have invested significantly in popular commercial databases. The ideal solution would be a virtual front-end system that integrates well with two or more back-end DBMSs, such as an SQL database and a graphical DBMS, by constructing queries that address both databases and thus preserve their individual capabilities.
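A rough sketch of such a front-end is given below. It assumes an in-memory SQLite database standing in for the commercial SQL back-end and a plain dictionary of bounding boxes standing in for the graphical DBMS; the class names, table layout and sample parcels are all hypothetical.

```python
import sqlite3


class AttributeStore:
    """Stand-in for the SQL back-end holding alphanumeric attributes."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE parcels (id INTEGER PRIMARY KEY, land_use TEXT)")

    def add(self, parcel_id, land_use):
        self.conn.execute("INSERT INTO parcels VALUES (?, ?)",
                          (parcel_id, land_use))

    def ids_with_land_use(self, land_use):
        rows = self.conn.execute(
            "SELECT id FROM parcels WHERE land_use = ?", (land_use,))
        return {row[0] for row in rows}


class GeometryStore:
    """Stand-in for the graphical back-end; stores bounding boxes only."""

    def __init__(self):
        self.boxes = {}  # parcel id -> (min_x, min_y, max_x, max_y)

    def add(self, parcel_id, bbox):
        self.boxes[parcel_id] = bbox

    def ids_intersecting(self, window):
        wx0, wy0, wx1, wy1 = window
        return {pid for pid, (x0, y0, x1, y1) in self.boxes.items()
                if x0 <= wx1 and x1 >= wx0 and y0 <= wy1 and y1 >= wy0}


class FrontEnd:
    """Sends each half of a query to the back-end suited to it."""

    def __init__(self, attrs, geoms):
        self.attrs, self.geoms = attrs, geoms

    def find(self, land_use, window):
        matches = (self.attrs.ids_with_land_use(land_use)
                   & self.geoms.ids_intersecting(window))
        return sorted(matches)


attrs, geoms = AttributeStore(), GeometryStore()
for pid, use, bbox in [(1, "agri", (0, 0, 2, 2)),
                       (2, "agri", (5, 5, 7, 7)),
                       (3, "urban", (1, 1, 3, 3))]:
    attrs.add(pid, use)
    geoms.add(pid, bbox)

front_end = FrontEnd(attrs, geoms)
print(front_end.find("agri", (0, 0, 4, 4)))
```

The attribute condition goes to SQL, the spatial condition goes to the geometry store, and the front-end merges the two result sets on a shared identifier. Each back-end keeps doing what it is good at.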
Since every organisation using GIS has specific requirements, the front-end language should also enable easy customisation of the user interface and functionality. As it is, professional GIS users tend to complain of a mismatch between the structure of a typical GIS package and their own approach: GIS users focus on data, building data layers or making maps, whereas most applications revolve around data functions.
It is generally believed that a developer needing to customise a GIS system would have to be comfortable with a wide range of operating system commands and languages. Further, the front-end of a GIS system should be developed in a language that provides facilities covering the normal requirements of an operating system, customisation and applications programming, and that is available on most systems; that has a friendly syntax allowing casual users to write simple programs for quick online customisation; and that ensures larger programs are in a form that is easy to read and debug.
“But none of the languages available can deliver complete functionality,” says Joseph, and goes on to point out that the selection of a language is a relative process depending upon the type of the application and the platforms you want to support. “Other important aspects to look for include industry acceptance, development schedules and code maintainability. When you consider the online support and longevity, along with the above facts, MS Visual Studio .NET (C++, C#, VB.NET) is a good choice. Another good combination is C++ and Qt,” he adds.
Version management
Another typical GIS application development problem pertains to lengthy transaction times, which make version management a difficult task. When a large database is accessed simultaneously by multiple users, it is easy to ‘lock’ the database to preserve its integrity while one user is editing data, but only if the transaction is performed quickly. Locking distinguishes two versions of the database: the one before the user starts the transaction, and the one at the point of committing the changes the user has made.
But managing these two versions in a GIS system is difficult because of the duration of transactions: one transaction can take days, weeks or even months to complete. Obviously, the database cannot be locked for that long. At the same time, the sheer size of the database rules out the option of copying it to preserve its integrity. One approach could be for a user to check out subsets of the database and check in the changes when the job is done. However, this gives rise to problems if two concurrent users make incompatible changes to the same subsets of the database. Working round this difficulty becomes tougher because test runs of an application are conducted with a limited number of users, and hence the situation prevailing in a production environment may not manifest itself during a pilot period.
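The check-out and check-in approach, and the conflict it can produce, can be sketched as follows. This is a minimal illustration that assumes a per-feature revision counter to detect incompatible changes; the class names, method names and sample features are hypothetical and do not describe any specific product.

```python
class ConflictError(Exception):
    """Raised when a checked-out feature was changed by someone else."""


class VersionedStore:
    def __init__(self, features):
        # feature id -> (revision number, value)
        self._data = {fid: (0, value) for fid, value in features.items()}

    def check_out(self, feature_ids):
        """Hand the user a working copy; nothing is locked."""
        return {fid: self._data[fid] for fid in feature_ids}

    def check_in(self, working_copy, edits):
        """Apply edits only if none of the edited features has moved on."""
        conflicts = [fid for fid in edits
                     if self._data[fid][0] != working_copy[fid][0]]
        if conflicts:
            raise ConflictError(conflicts)
        for fid, value in edits.items():
            revision = self._data[fid][0]
            self._data[fid] = (revision + 1, value)

    def value(self, fid):
        return self._data[fid][1]


store = VersionedStore({"road_1": "2 lanes", "road_2": "1 lane"})
user_a = store.check_out(["road_1"])
user_b = store.check_out(["road_1", "road_2"])  # overlaps with user A

store.check_in(user_a, {"road_1": "4 lanes"})   # A finishes first
try:
    store.check_in(user_b, {"road_1": "closed"})  # B edited the same feature
    conflict_detected = False
except ConflictError:
    conflict_detected = True

store.check_in(user_b, {"road_2": "2 lanes"})   # a non-overlapping edit is fine
print(conflict_detected, store.value("road_1"), store.value("road_2"))
```

Nothing is locked while the users work, so long transactions do not block anyone. The price is that the second user to check in an overlapping change has to reconcile it by hand, and as noted above, such collisions may only surface once many users share a production database.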
Joseph observes, “The spatial data that forms the back-end of almost all GIS applications is an independent entity that may be updated without making any changes in the application. When we use flat file formats in the back-end, the version information can be recorded in a metadata document associated with it. A better practice is to use any of the standard spatial databases to store the data since most of them support spatial data versioning. Middleware like ‘ArcSDE’ incorporating Version Reconcile Services may also be used to implement version-enabled spatial data tables in RDBMS. The version information can also be stored by including temporal data fields in the spatial data table structure. In this case, a Boolean flag is used to denote the latest data in addition to admit and retire time stamps.”
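The last option Joseph mentions, temporal fields plus a Boolean flag, might look something like the sketch below. It uses an in-memory SQLite table with geometry stored as plain text; the table name, column names and helper functions are assumptions made for illustration, not a schema taken from any spatial database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout: one row per version of a feature.
conn.execute("""
    CREATE TABLE roads (
        feature_id   INTEGER,
        geometry_wkt TEXT,
        admit_time   TEXT,     -- when this version became current
        retire_time  TEXT,     -- when it was superseded (NULL if current)
        is_latest    INTEGER   -- Boolean flag: 1 for the current version
    )
""")


def save_version(conn, feature_id, geometry_wkt, timestamp):
    """Retire the current version, if any, and admit the new one."""
    conn.execute(
        "UPDATE roads SET retire_time = ?, is_latest = 0 "
        "WHERE feature_id = ? AND is_latest = 1",
        (timestamp, feature_id))
    conn.execute(
        "INSERT INTO roads VALUES (?, ?, ?, NULL, 1)",
        (feature_id, geometry_wkt, timestamp))


def version_at(conn, feature_id, timestamp):
    """Return the geometry that was current at the given time, or None."""
    row = conn.execute(
        "SELECT geometry_wkt FROM roads "
        "WHERE feature_id = ? AND admit_time <= ? "
        "AND (retire_time IS NULL OR retire_time > ?)",
        (feature_id, timestamp, timestamp)).fetchone()
    return row[0] if row else None


# ISO 8601 strings sort chronologically, so plain text comparison works.
save_version(conn, 1, "LINESTRING(0 0, 1 1)", "2020-01-01T00:00:00")
save_version(conn, 1, "LINESTRING(0 0, 2 2)", "2021-06-01T00:00:00")

old = version_at(conn, 1, "2020-06-01T00:00:00")
latest = conn.execute(
    "SELECT geometry_wkt FROM roads WHERE feature_id = 1 AND is_latest = 1"
).fetchone()[0]
print(old, "|", latest)
```

The flag makes the everyday query for current data cheap, while the admit and retire time stamps let the application reconstruct the map as it stood on any earlier date.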
Mathur adds, “Version management tools are essential for a team working on one aspect of a project to continue its long and complex geo-database transactions without disrupting the work of another team. A version reconcile system must be capable of associating a version to a job for the life of the job, provide tools to ensure data is always referenced from the appropriate version, as well as offer management tools to track edits in detail.”
So, these are the most pressing issues a GIS application developer will face. Developers who work their way around these will find themselves contributing more significantly to the application development process.