Standard data interchange - enabling two computers to talk to each other - has been an ongoing problem for the US Federal Government because of its penchant for customised information technology systems. Now, with the push to put more Government services on the World Wide Web - a place where broad data access should be the rule - the compatibility problem is compounded.
But the process for connecting systems may become much simpler. Extensible Markup Language (XML) is emerging as a universal data communications standard capable of moving information between any two applications. The specification makes it simpler for applications to exchange information by defining a common method of identifying and packaging data.
XML can be applied to just about any application and is expected to become the primary data interchange method for all systems. "XML solves a pervasive problem for IT managers: data exchange," said Jonathan Sacks, vice president of marketing at SoftQuad Software, a Toronto XML developer. However, the technology is new and unpolished, so most government agencies are merely testing its capabilities with an eye toward widespread deployment in a few years.
XML is a natural extension of the Internet, and it focuses on standard interfaces so that users can access information located anywhere on a local- or wide-area network. The Web-based authoring language enables users to create documents that run on networks and can be shared.
Adding XML capabilities to an application is similar to incorporating Open Database Connectivity drivers to any type of database product: a programmer has to make an application aware of the feature and then funnel all data exchanges through it. But the standard isn't limited to databases. In fact, XML works with any product that manipulates or exchanges information.
Consequently, support for it is showing up in an array of products and applications, with the business-to-business (B2B) marketplace a key area of emphasis. "XML has the potential to replace electronic data interchange [EDI] as the primary method different organisations use to exchange information," said Charles Allen, co-founder of US-based Webmethods.
EDI enables agencies to transfer documents, but it has been expensive and difficult to implement and maintain. Various industry groups spent a long time just hammering out standard EDI formats, yet the exchanges entail painstaking mapping of one field to another, a complex skill that takes time to master. And if anything changes, all the mapping, or custom programming, has to be redone.
XML is much easier to deploy, and almost any programmer can become proficient with the authoring language in a matter of hours. An EDI system can take a company six months to implement, at a price that could run into six figures, whereas a comparable XML system can be implemented in weeks for a few thousand dollars.
As a result, XML has become a key feature in many emerging B2B trading exchanges, including Web sites where suppliers and customers swap ordering information. Companies such as Ariba, Bowstreet Software, Commerce One, Extricity Software and Harbinger rely heavily on the technology in systems that enable companies and consumers to purchase items such as paper clips and medical supplies. But XML's usage extends beyond the B2B space.
Any enterprise with multiple computer systems and applications can use XML to help them exchange information. Because most government agencies have a mix of systems - such as legacy mainframes, client/server applications and Web-based systems - XML could help integrate their applications.
Agencies will have no trouble finding tools for such work. Small companies view XML as a way to help grow their business, so Bluestone Software, eBusiness Technologies (a division of Inso Corp), Information Architects Syndication, OnDisplay, SoftQuad Software and Webmethods have been focusing on this area.
And the standard's potential has not been lost on well-established vendors. "IBM, Microsoft and Oracle are all moving quickly to incorporate XML support in their products," said Uttam Narsu, a senior analyst at Giga Information Group.
Government agencies also see XML's potential. The US Census Bureau views XML as a possible cure to its ongoing data integration woes. Since the mid-1990s, the agency has been searching for a way to simplify sharing data among its applications. In 1998, the agency began to build a corporate metadata repository, a central database that will identify where all of its survey information is located, what specific records are in each database or file, and in what format the information was stored.
"We needed a method of making it simple for different applications to find and modify the information about how and where our data was stored," said Samuel Highsmith, a principal researcher at the government agency. "XML has that potential."
Census relies on Oracle as its primary database management system and used the Oracle XML Development Kit to design the repository. The agency developed a common Web interface so applications could create, edit, browse and exchange metadata information.
The first project to take advantage of the features is the 2002 Economic Census, which consists of 450 surveys. The agency is designing applications to make the collected statistics available to government agencies and citizens in a variety of formats: flat files for use by databases, CD-ROMs and Web interfaces.
"When writing new applications, programmers no longer have to figure out how to identify and classify data; XML has already been done for them," Highsmith said.
XML is also being incorporated in a variety of applications, so users may not even realise it is helping them transfer data. For instance, Lotus Development added XML to its Domino servers, and Microsoft did the same with its Word software.
Frontline Solutions is an Idaho-based systems integrator with a Federal Government focus. In 1997, the company began building an application to automate the process of issuing requests for proposals for the US Air Force Aeronautical Systems Center.
The agency had created its RFPs with Microsoft Word. Because one to two dozen individuals would work with the documents, it became difficult to keep track of the most current version.
In July 1999, the company delivered EZ-RFP, an XML RFP authoring system that produces documents that are formatted in HTML and can be published directly to the Web. EZ-RFP also includes features that enable users to track changes to RFPs, such as author and reviewer contributions and internal comments, and it can cross-reference all of those elements.
Although XML has great potential, it is a little rough around the edges. Government agencies may find XML products buggy, typical of any first-generation software technology. They also may not find all of their products have XML support, which forces them to either build it themselves or wait for their suppliers to deliver it.
An important element within the XML specification is the Document Type Definition. DTDs define how an item such as a purchase order is formatted. In order for two applications to exchange information, they have to support common DTDs. "A lot of different groups have been involved in defining DTDs for various industries," said Bob Gruder, chief executive officer at Information Architects Syndication. "Some companies are waiting for de facto standards to emerge, and that has slowed XML deployment."
Because data interchange is a complex area, complementary standards are under development. "XML will become more functional once [Extensible Style Language] and [XML Linking Language] are fully defined later this year," said Mike Bray, president and chief executive officer of Frontline Solutions.
Because there are so many issues, most organisations are proceeding cautiously. "Many Government organisations see the need for adding XML support to their systems, are including it in their RFPs and running XML in pilot operations, but there aren't a lot using it in production systems yet," said Cameron O'Rourke, a marketing technologist in the public-sector group at Oracle.
Yet there is optimism about the future. "XML will be widely adopted because it offers a simple way for two persons with computers running different operating systems and applications to exchange information," said Chuck Hellings, vice president of sales and marketing at BuildWBT.com, a US-based supplier of online training systems.
XML in a nutshell
In technical terms, Extensible Markup Language (XML) is considered a metalanguage. That means it is a language for defining other markup languages: its tags carry data that describes other information, such as how it is structured and how it can be exchanged between servers and clients via a network. By providing those features, XML makes it much simpler for programmers to develop applications that move information from system to system.
To link two systems, agencies currently rely on point-to-point connections in which an application uses an application program interface to access and download data on another computer. Although effective, that technique creates a couple of problems. Because every connection between two systems requires custom programming, an agency with multiple data sources must allocate a lot of resources to develop those links.
Maintenance is also an issue. Whenever the logic or data in one system changes, the accompanying interface often also needs to be altered. Consequently, programmers spend much of their time ensuring that the interfaces between different applications function properly rather than enhancing the systems themselves.
Also, point-to-point interfaces do not deal well with problem exchanges. If bits are lost or data parameters don't match up exactly, an application will often drop parts - or all - of a transmission.
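To put a rough number on the custom-programming burden of point-to-point links, the short Python sketch below counts how many interfaces are needed when every pair of systems is wired together directly. For comparison, it assumes (this is an illustration, not a figure from any agency) that a shared data format would need only one adapter per system. The function names are hypothetical.

```python
from math import comb


def point_to_point_links(n_systems: int) -> int:
    """Distinct pairs among n systems; each pair needs its own custom interface."""
    return comb(n_systems, 2)


def shared_format_adapters(n_systems: int) -> int:
    """Assumption: with one common format, each system needs a single adapter."""
    return n_systems


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n, point_to_point_links(n), shared_format_adapters(n))
```

Under these assumptions, 10 systems would need 45 pairwise interfaces but only 10 adapters, and the gap widens as systems are added.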
Enter XML. Whenever an application receives a file, it gets descriptions about how the data is structured. That way, a program can more easily determine how to process the data.
The standard has a couple of underlying components that make it work. XML Document Type Definitions (DTDs) define the structure of a document or file: which elements it contains and how they are arranged. For example, the DTD for an electronic payment might specify customer name, address, account number, balance and bill number. Two systems with consistent DTDs can exchange files as easily as two PCs equipped with Microsoft Word.
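As a rough illustration of the electronic payment example, the Python sketch below shows what a DTD and a matching document might look like. The element names, the sample values and the helper functions are all hypothetical. Python's standard-library parser does not validate against a DTD, so the DTD appears only as text and the layout check is written by hand to mimic what a validating parser would enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD for the payment example; shown for illustration, not parsed.
PAYMENT_DTD = """\
<!ELEMENT payment (customer_name, address, account_number, balance, bill_number)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT account_number (#PCDATA)>
<!ELEMENT balance (#PCDATA)>
<!ELEMENT bill_number (#PCDATA)>
"""

# Hypothetical document that follows the layout above.
PAYMENT_XML = """\
<payment>
  <customer_name>Jane Doe</customer_name>
  <address>12 Example Street</address>
  <account_number>000123</account_number>
  <balance>42.50</balance>
  <bill_number>B-7</bill_number>
</payment>
"""

REQUIRED_FIELDS = [
    "customer_name", "address", "account_number", "balance", "bill_number",
]


def matches_payment_layout(xml_text: str) -> bool:
    """Hand-written stand-in for DTD validation: same fields, same order."""
    root = ET.fromstring(xml_text)
    if root.tag != "payment":
        return False
    return [child.tag for child in root] == REQUIRED_FIELDS


def read_payment(xml_text: str) -> dict:
    """Return the payment fields as a dictionary keyed by element name."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    if matches_payment_layout(PAYMENT_XML):
        print(read_payment(PAYMENT_XML))
```

Because the receiving program looks fields up by name, it does not depend on byte positions or column offsets, which is the point of agreeing on a common definition.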
XML Style Sheets tell applications how to present information to different devices. One style sheet may outline how to present a report to a user to view with a Web browser, while a second could illustrate how to send the report to a user with a wireless phone.
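The idea of one document with several presentations can be sketched in a few lines of Python. This is not XSL itself, which is a separate language; the two functions below merely stand in for two style sheets, and the report contents and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical report; the same data feeds both presentations.
REPORT_XML = """\
<report>
  <title>Monthly Summary</title>
  <item name="Orders">120</item>
  <item name="Returns">4</item>
</report>
"""


def render_for_browser(xml_text: str) -> str:
    """Stand-in for a browser style sheet: emit a small HTML table."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        f"<tr><td>{escape(item.get('name'))}</td><td>{escape(item.text)}</td></tr>"
        for item in root.findall("item")
    )
    return f"<h1>{escape(root.findtext('title'))}</h1><table>{rows}</table>"


def render_for_phone(xml_text: str) -> str:
    """Stand-in for a wireless-phone style sheet: emit one short text line."""
    root = ET.fromstring(xml_text)
    parts = [f"{item.get('name')}: {item.text}" for item in root.findall("item")]
    return root.findtext("title") + " | " + ", ".join(parts)


if __name__ == "__main__":
    print(render_for_browser(REPORT_XML))
    print(render_for_phone(REPORT_XML))
```

The data is written once; only the presentation rules differ per device.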
XML components enable programmers to spend less time worrying about data exchanges and more time on application functionality.
Who's behind XML?
Formed in October 1994, the World Wide Web Consortium (W3C) has become a central force behind the development of Internet-based standards, including Extensible Markup Language (XML). Membership in the international organisation has grown to 400 organisations, including academic institutions, research groups and vendors.
The consortium's goal is to design standards that promote application interoperability. Since inception, the W3C has developed more than 20 technical specifications for the Web's infrastructure, including HTML. Any member can propose that the group target a specific area, and W3C members regularly review proposals for work, which are called Activity proposals. Whenever there is a consensus among the members to pursue an item, the W3C initiates a new activity. Currently, there are more than 30 W3C Working Groups.
Work on the XML 1.0 specification began in September 1996, and the standard was published in February 1998. Because data interchange is a complex area, the group has identified other specifications that are being added to the XML infrastructure. In effect, XML has become a family of related specifications. XML 1.0 itself defines the basics, such as tags and attributes, while a growing set of optional modules provides guidelines for other programming tasks.
Some of them include:
- XLink describes a standard way to add hyperlinks to a file.
- XPointer and XFragments are syntaxes for pointing to parts of a document.
- XSL is an advanced language for expressing style sheets.
- XML Namespaces is a specification that describes how to associate a URL with every single tag and attribute in a document.
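To make the last item concrete, here is a minimal Python sketch of how namespaces let two vocabularies use the same tag name without colliding. The two example.com addresses and the document contents are hypothetical. Python's standard-library parser exposes each tag as `{address}name`, which shows the association directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace addresses; any unique address would do.
PO_NS = "http://example.com/purchase-order"
HR_NS = "http://example.com/staff"

DOC = f"""\
<po:order xmlns:po="{PO_NS}" xmlns:hr="{HR_NS}">
  <po:title>Paper clips</po:title>
  <hr:title>Purchasing officer</hr:title>
</po:order>
"""


def qualified_tags(xml_text: str) -> list:
    """Return the root tag and its children's tags in {namespace}name form."""
    root = ET.fromstring(xml_text)
    return [root.tag] + [child.tag for child in root]


def find_title(xml_text: str, prefix: str) -> str:
    """Look up a <title> element in the namespace bound to the given prefix."""
    root = ET.fromstring(xml_text)
    return root.find(f"{prefix}:title", {"po": PO_NS, "hr": HR_NS}).text


if __name__ == "__main__":
    print(qualified_tags(DOC))
    print(find_title(DOC, "po"))
    print(find_title(DOC, "hr"))
```

Both child elements are called title, yet a program can tell them apart because each is tied to a different address.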