© 2002 THE XML HANDBOOK™

XML in Office

Introductory Discussion

❚ Meet the family! Word, Excel, InfoPath, Access, and FrontPage

❚ Information capture and reuse

❚ End-user data connection

❚ Data-driven application enhancement


Chapter 3

This chapter is an overview of the XML features of Office. We discuss the XML-enabled Office products (Word, Excel, Access, FrontPage, and the newly introduced InfoPath) in the context of several information sharing scenarios.

But the products are really just the supporting cast. The true stars are the advances that XML in Office brings to:

❚ information capture and reuse;

❚ end-user data connection; and

❚ data-driven application enhancement.

3.1 | Information capture and reuse

For all the valuable abstract data that is managed in database systems, there is even more that is hidden in rendered word processing documents. That fact represents an enormous intellectual property loss for enterprises, of course, but it also represents a nuisance and a time-waster for the information workers who work with those documents.


© 2004 XML IN OFFICE 2003

Consider the articles written for a company’s websites and newsletters. Every one is likely to contain a title, author, and date within it, but more often than not that information has to be retyped, or individually copied and pasted, to get it into a catalog entry. That’s because there is no reliable way for a computer to recognize those data items in order to extract them.

3.1.1 Word processing

In contrast, look at Figure 3-1, which shows an article being edited in Microsoft Word.

Figure 3-1 Word document showing optional tag icons and task pane with XML structure


The article is actually an XML document that conforms to a schema of the user’s choosing, in this case article. The user has opted to display icons that represent the start- and end-tags. Note that there are distinct elements for the title, author, and date.

Solution developers can use the XML elements to check and normalize information as it is entered, whether or not the tag icons are displayed. An application, for example, could notify the user if the text entered for a date element isn’t really a valid date. Or it could automatically supply the current year if none was entered.
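To make the idea concrete, here is a minimal Python sketch of such a check, run against a small article instance. The element names follow the article example; the accepted date formats and the helper name normalize_date are our own assumptions for illustration, not part of Word or of any Office API.

```python
from datetime import date, datetime
import xml.etree.ElementTree as ET

# Hypothetical article instance; the date was typed without a year.
ARTICLE = """<article>
  <title>XML in Office</title>
  <author>Ellen</author>
  <date>March 14</date>
</article>"""


def normalize_date(text, today=None):
    """Return an ISO date string, or None if text is not a valid date.

    Assumed input formats: 'March 14, 2004' or 'March 14' (no year).
    When the year is missing, the current year is supplied.
    """
    today = today or date.today()
    text = text.strip()
    try:
        return datetime.strptime(text, "%B %d, %Y").date().isoformat()
    except ValueError:
        pass
    try:
        # Append the current year before parsing, so that the day is
        # checked against the right calendar (February 29, for example).
        parsed = datetime.strptime(f"{text}, {today.year}", "%B %d, %Y")
    except ValueError:
        return None
    return parsed.date().isoformat()


root = ET.fromstring(ARTICLE)
date_elem = root.find("date")
normalized = normalize_date(date_elem.text, today=date(2004, 6, 1))
if normalized is None:
    print("Please enter a valid date.")
else:
    date_elem.text = normalized
```

Because the date sits in its own element, the check needs no guesswork about where the date is in the text; it only has to decide whether the element’s content is acceptable.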

The right-hand pane is called the task pane; it can be used for various purposes. In the figure, the top of the task pane shows the XML structure of the document. At the bottom is a list of the types of element that are valid at the current point in the document, according to the article schema.

The document is also a normal Word document, so Word’s formatting features can be used in the usual way.

There are three ways to save this document as XML:

WordML
WordML is Word’s native XML file format. It preserves the Word document just as the DOC format would, including formatting and hyperlinks. However, it doesn’t include any of the article markup, so we won’t discuss this option further here. (We cover it in Chapter 5, “Rendering and presenting XML documents”, on page 86.)

custom XML
The document can be saved as an XML document conforming to a custom schema; in this case, article. A custom schema would normally be defined by an enterprise, or by a committee set up by an industry to which the enterprise belongs. For that reason, it would be designed to preserve the abstract data needed for the user’s applications. For example, the title, author, and date can easily be identified by software and extracted for use in a catalog of articles.

mixed XML
The saved document could contain both WordML and the article markup, since the two are in different namespaces. This option preserves the formatting applied by the user, while still preserving the abstract data and distinguishing it from the rendition information.
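The following Python sketch shows why separate namespaces make the catalog extraction straightforward even when formatting markup is mixed in. The document is greatly simplified and is not the structure Word actually writes; the article namespace URI is invented, and the WordML URI is our assumption of the Word 2003 value.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions for this sketch: W stands in for the
# WordML namespace and A for the enterprise's article schema.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
A = "http://example.com/ns/article"

# Greatly simplified mixed document: article elements wrap WordML runs.
MIXED = f"""<w:body xmlns:w="{W}" xmlns:a="{A}">
  <a:article>
    <a:title><w:p><w:r><w:rPr><w:b/></w:rPr><w:t>XML in Office</w:t></w:r></w:p></a:title>
    <a:author><w:p><w:r><w:t>Ellen</w:t></w:r></w:p></a:author>
    <a:date><w:p><w:r><w:t>2004-03-14</w:t></w:r></w:p></a:date>
  </a:article>
</w:body>"""


def catalog_entry(xml_text):
    """Collect the text of each article-namespace field, ignoring formatting."""
    root = ET.fromstring(xml_text)
    article = root.find(f"{{{A}}}article")
    entry = {}
    for field in ("title", "author", "date"):
        elem = article.find(f"{{{A}}}{field}")
        # itertext() walks through the nested WordML runs and yields text only.
        entry[field] = "".join(elem.itertext()).strip()
    return entry


print(catalog_entry(MIXED))
```

The catalog software looks only for names in the article namespace, so the bold title and any other rendition markup pass through without affecting the extracted data.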

In our example, the article is the entire Word document, but that isn’t a requirement. It is possible to intersperse short XML documents within a larger Word document. For example, a travel guide might include multiple XML structures that describe hotels, with subelements for the name, address, number of rooms, rates, etc.

Using XML with Word documents enables companies to capture more of the intellectual property that is created informally by individuals and work groups, and that typically remains inaccessible to enterprise information systems. As XML, that property becomes a portable asset that can be reused as needed.1

3.1.2 Forms

For many purposes, a data entry form is more suitable for information capture than a typically larger and less constrained word processing document. InfoPath lets you design and use forms that are really XML documents that conform to your own custom schemas.2

Figure 3-2 shows the layout of an order form in InfoPath’s design mode. The structure of the order schema is shown in the task pane on the right, from which element types can be dragged onto the form.

Note that there is only one item line in the form design. Because the order schema allows item elements to be repeated, a user entering data will be able to add item lines as needed. Had customer elements been repeatable, the form would expand to allow insertion of the group of customer information fields.

Unlike Word, InfoPath generates an XSLT stylesheet to control the rendering of the form. The formatting can even be based on the data entered in the form. For example, the dialog box in Figure 3-3 specifies that negative prices should be shown in a different color.
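InfoPath expresses such rules in the XSLT it generates, which we do not reproduce here. The Python sketch below is only an analogue of the logic: one rendering rule applied to however many item elements the order contains, with the color chosen from the data. The child element names description and price and the choice of red are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical order instance with a repeated item element.
ORDER = """<order>
  <item><description>Widget</description><price>19.95</price></item>
  <item><description>Refund</description><price>-5.00</price></item>
</order>"""


def render_rows(xml_text):
    """Render one HTML table row per item; negative prices are shown in red."""
    rows = []
    for item in ET.fromstring(xml_text).findall("item"):
        description = item.findtext("description", default="")
        price = float(item.findtext("price", default="0"))
        # The formatting decision depends on the data, not on the form design.
        style = ' style="color: red"' if price < 0 else ""
        rows.append(
            f"<tr><td>{escape(description)}</td>"
            f"<td{style}>{price:.2f}</td></tr>"
        )
    return rows


for row in render_rows(ORDER):
    print(row)
```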

1 We cover the details in the chapter “Creating and editing XML documents”.

2 InfoPath is available in the Office Professional Enterprise Edition and individually.


InfoPath is described in detail in Chapter 9, “Designing and using forms”, on page 180.

Figure 3-2 InfoPath design interface with data source in task pane

Figure 3-3 InfoPath conditional formatting dialog


3.1.3 Relational data

XML elements, whether captured in Word or Excel or InfoPath (or any other way, for that matter), are as well-defined and predictable as the columns and tables of a database. XML documents of all kinds are therefore a source of information as rich as any other operational data store. Companies can aggregate, parse, search, manage, and reuse the data in documents in the same way they do the transactional data that is typically captured for relational databases.

They can also import the document data into a database and use it in conjunction with data from other sources. In addition, they can export DBMS data as XML documents.

Figure 3-4, for example, shows the options Access offers when exporting data as XML. You can specify which tables and records to export and how to sort and/or transform them.
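The same choices (which table, which records, what sort order) can be sketched in a few lines of Python. Here sqlite3 stands in for the DBMS, and the table, its columns, and the dataroot wrapper element are our own choices for the sketch; it does not claim to match the document Access writes.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table(conn, table, where="1=1", order_by="rowid"):
    """Export selected rows of one table as an XML element tree.

    Each row becomes an element named after the table; each column becomes
    a child element. `table`, `where`, and `order_by` are trusted input here;
    a real tool would validate them.
    """
    cursor = conn.execute(
        f"SELECT * FROM {table} WHERE {where} ORDER BY {order_by}"
    )
    columns = [col[0] for col in cursor.description]
    root = ET.Element("dataroot")
    for row in cursor:
        record = ET.SubElement(root, table)
        for name, value in zip(columns, row):
            ET.SubElement(record, name).text = "" if value is None else str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expense (id INTEGER, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expense VALUES (?, ?, ?)",
    [(1, "lodging", 240.0), (2, "airfare", 415.5), (3, "meals", 38.25)],
)

# Choose which records to export and how to sort them.
root = export_table(conn, "expense", where="amount > 100", order_by="amount DESC")
print(ET.tostring(root, encoding="unicode"))
```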

Figure 3-5 shows the options for exporting a schema as XML. You can choose whether or not to export the schema, and whether it should be exported within the data document or as an independent schema document.3

Figure 3-4 Access dialog for exporting data as XML

3.2 | End-user data connection

Including custom XML elements in Office documents presents companies with new opportunities for business process integration.

For example, an end-user can connect directly to enterprise systems and data sources using a Web services interface. The products do the heavy lifting: natively, in the case of InfoPath, and via VBA or an external extension, in the case of Word and Excel.

The data from the Web service can be cached by the product, which can then disconnect from the server. The user still maintains the ability to work with that data, even while disconnected. For this reason, Microsoft refers to Word, InfoPath, and Excel as smart clients.4 Once reconnected to the corporate server, the smart client can update the data sources.

Figure 3-5 Access dialog for exporting schema as XML

3 See the chapter “Access databases and XML” for details on Access.

The individual end-user also benefits from this ability to search for specific information and to aggregate information from multiple sources. It eliminates such time-consuming, error-prone tasks as:

❚ opening and closing files to find information;

❚ cutting and pasting information between documents; and

❚ searching for labels to combine data in like fields.

3.2.1 Spreadsheets

Consider how a smart client can assist in the creation and processing of expense reports.

Prior to leaving on a trip, during which she won’t have access to the corporate network, Ellen opens the Excel worksheet shown in Figure 3-6. Its cells are mapped to the element types of the expenseReport schema, as shown in the task pane.

Each mapped element type corresponds to an area of the worksheet, either a single cell or a column of cells. The mapping allows XML data to be imported into, and exported from, the appropriate areas of the worksheet.
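A mapping of this kind can be pictured as a small table from element types to worksheet areas. The Python sketch below imports an XML document into a dictionary of cells using such a table. The cell addresses and the element names inside expenseReport are assumptions, and the worksheet is simulated rather than driven through Excel.

```python
import xml.etree.ElementTree as ET

# Hypothetical map: element path -> single cell, or column letter plus
# first row for a repeating element.
SINGLE_CELLS = {"employeeNumber": "B1", "purpose": "B2"}
REPEATING = {
    "path": "expense",
    "first_row": 5,
    "columns": {"date": "A", "category": "B", "amount": "C"},
}

REPORT = """<expenseReport>
  <employeeNumber>1234</employeeNumber>
  <purpose>Customer visit</purpose>
  <expense><date>2004-03-14</date><category>lodging</category><amount>240.00</amount></expense>
  <expense><date>2004-03-15</date><category>meals</category><amount>38.25</amount></expense>
</expenseReport>"""


def import_xml(xml_text):
    """Place XML data into worksheet cells according to the map."""
    root = ET.fromstring(xml_text)
    sheet = {}
    for path, cell in SINGLE_CELLS.items():
        sheet[cell] = root.findtext(path)
    # Each occurrence of a repeating element fills the next row of its columns.
    for offset, elem in enumerate(root.findall(REPEATING["path"])):
        row = REPEATING["first_row"] + offset
        for child, column in REPEATING["columns"].items():
            sheet[f"{column}{row}"] = elem.findtext(child)
    return sheet


sheet = import_xml(REPORT)
print(sheet["B1"], sheet["C6"])
```

Single-occurrence elements land in one cell each, while every occurrence of the repeating element takes the next row of its mapped columns.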

Ellen enters her employee number and the business purpose of her trip. The other cells that are visible in the example are populated automatically from the enterprise data store.

❚ Her name comes from the human resources records, based on the employee number that she entered.

❚ The lodging and airfare amounts come from the bookings made by the travel department.

❚ The per diem is based on the location in the hotel booking.

❚ The mileage is the calculated distance between the airport and Ellen’s home address, also taken from the human resources records.

4 What Microsoft calls a “smart client” is essentially what the industry calls a “rich client”. Perhaps Microsoft believes that if you’re smart, you ought to be rich. Our readers, smart by definition, should agree!


Ellen adds more items to the worksheet during the trip, even without the network connection. When she returns, she completes the report and exports the mapped cells as an XML document conforming to the expenseReport schema.5
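The export direction can be sketched the same way: walk the mapped cells and build the document from them. As before, this Python fragment simulates the worksheet with a dictionary; the cell addresses, element names, and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical worksheet contents after the trip, keyed by cell address.
SHEET = {
    "B1": "1234", "B2": "Customer visit",
    "A5": "2004-03-14", "B5": "lodging", "C5": "240.00",
    "A6": "2004-03-15", "B6": "taxi", "C6": "22.50",
}
SINGLE_CELLS = {"employeeNumber": "B1", "purpose": "B2"}
COLUMNS = {"date": "A", "category": "B", "amount": "C"}
FIRST_ROW = 5


def export_xml(sheet):
    """Build an expenseReport document from the mapped cells only."""
    root = ET.Element("expenseReport")
    for name, cell in SINGLE_CELLS.items():
        ET.SubElement(root, name).text = sheet.get(cell, "")
    row = FIRST_ROW
    # Walk down the mapped columns until the first empty row.
    while any(f"{col}{row}" in sheet for col in COLUMNS.values()):
        expense = ET.SubElement(root, "expense")
        for name, col in COLUMNS.items():
            ET.SubElement(expense, name).text = sheet.get(f"{col}{row}", "")
        row += 1
    return root


report = export_xml(SHEET)
print(ET.tostring(report, encoding="unicode"))
```

Cells that are not mapped never reach the exported document, which is what keeps it conforming to the schema regardless of what else is on the worksheet.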

3.2.2 Web pages

Ellen next wants to submit her report for management approval. It needs five levels of sign-off, so she decides to post it to an internal website where all the managers can read it.6

Figure 3-6 Excel worksheet and task pane with XML source map

5 We cover Excel’s XML features in Chapter 7, “Using XML data in spreadsheets”.

6 Yes, in most organizations five levels of management would put her over the top!


Figure 3-7 shows how FrontPage is used to design a Web page based on an XML document. Elements can be dragged from the expenseReport structure in the task pane onto the main page, and styles and other formatting options can be used to present the data.

Any XML document could be used as the data source, as could databases and Web services. Sorting, grouping, filtering, and conditional formatting of the data are supported.

FrontPage generates an XSLT stylesheet from the WYSIWYG display.7

After Ellen’s expense report is approved, the XML document is sent to the accounting department’s system, which deposits the reimbursement in her bank account.

Figure 3-7 FrontPage website view with data view details in task pane

7 The FrontPage XML features are covered in the chapter “Publishing XML to the Web with FrontPage”.


3.3 | Data-driven application enhancement

There are ways to enhance Ellen’s experience with her expense report. Raising the per diem would undoubtedly provide the most satisfaction, but we have to stick to improvements enabled by mapping XML to the spreadsheet.

Solution developers have several ways to take advantage of the document knowledge that XML gives them: custom renditions, smart tags, and smart documents.

3.3.1 Custom renditions

The classic technique for data-driven application enhancement dates back to the dawn of markup languages, when GML first separated abstract data from presentation. It is to render the same data in different ways for different tasks or users.

In our scenario, for example, a report to the accounting department might contain only the summary totals from Ellen’s spreadsheet, while her management gets to see every detail.
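As a small illustration of the principle, the Python sketch below produces both renditions from one expense document. The element names and amounts are invented, and a production solution would more likely use a stylesheet per rendition.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical expense data; the same document feeds both renditions.
REPORT = """<expenseReport>
  <expense><category>lodging</category><amount>240.00</amount></expense>
  <expense><category>meals</category><amount>38.25</amount></expense>
  <expense><category>meals</category><amount>41.75</amount></expense>
</expenseReport>"""


def detail_rendition(root):
    """Management view: one line per expense item."""
    return [
        f"{e.findtext('category')}: {float(e.findtext('amount')):.2f}"
        for e in root.findall("expense")
    ]


def summary_rendition(root):
    """Accounting view: totals per category only."""
    totals = defaultdict(float)
    for e in root.findall("expense"):
        totals[e.findtext("category")] += float(e.findtext("amount"))
    return [f"{category}: {total:.2f}" for category, total in sorted(totals.items())]


root = ET.fromstring(REPORT)
print(detail_rendition(root))
print(summary_rendition(root))
```

Neither function alters the data; each simply chooses what to show, which is the whole point of keeping abstract data separate from presentation.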

3.3.2 Smart tags

Office has a facility called smart tags that allows actions to be associated with words and phrases in a document. The “tags” don’t have to be delimited, as XML tags are. Instead, they are defined by a program or by a lookup table that contains character strings and their associated actions. The product recognizes the matching strings in the document.8
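The lookup-table approach is easy to picture in code. This Python sketch is not the Office smart tag API; it only illustrates the recognition step, with a made-up table of strings and action names.

```python
import re

# Hypothetical lookup table: character strings and their associated actions.
SMART_TAGS = {
    "Ellen": ["Show employee record", "Send email"],
    "expense report": ["Open expense worksheet"],
}

# One alternation of all table entries, longest first so that longer
# strings win over their own prefixes.
_PATTERN = re.compile(
    "|".join(re.escape(s) for s in sorted(SMART_TAGS, key=len, reverse=True))
)


def recognize(text):
    """Return (start, matched string, actions) for every smart tag in text."""
    return [
        (m.start(), m.group(), SMART_TAGS[m.group()])
        for m in _PATTERN.finditer(text)
    ]


for start, string, actions in recognize("Ellen filed her expense report on Monday."):
    print(start, string, actions)
```

Note that nothing in the text is marked up in advance: the strings are found purely by matching, and the list of actions travels with each match.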

Usually, an icon is displayed when the cursor is over a smart tag. When the user clicks on it, the associated list of actions pops up.

8 Instead of a specific character string, recognition could be based on a kind of pattern for strings that computer scientists call a regular expression. In this case, any character string in the document that matched the pattern would be recognized and considered a smart tag.


It is also possible to define smart tags so that an action will take place automatically when the tag is recognized. For example, recognition of the employee number in Ellen’s worksheet invoked an action to send the Web service request for her employee name, which was entered into the appropriate worksheet cell.

3.3.3 Smart documents

The ultimate data-driven application enhancement is to respond intelligently to user input, offering context-sensitive actions and guidance, suggesting content, and providing supporting data or links to related information.

The XML facilities we’ve looked at so far can be used in combination to approach that goal. Add a customized task pane and even more can be done. You have what in Office-speak is called a smart document solution.

For example, as a user moves the cursor to different elements in a document, the task pane could display help details, related data, tools to work with the document, or related graphics. Ellen could click in a lodging cell and see the hotel contact information displayed in the task pane. She could then click on the hotel’s email address and send a quick note advising the hotel of her arrival time.9
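At its core, this behavior is a dispatch from the element under the cursor to the content the task pane should show. The Python sketch below models only that dispatch; the handler names, the hotel directory, and the shape of the pane content are all hypothetical, and the real smart document interfaces are not shown.

```python
# Hypothetical hotel directory that a real solution would fetch from a Web service.
HOTELS = {"Grand Plaza": {"phone": "555-0100", "email": "desk@grandplaza.example"}}


def lodging_pane(value):
    """Task pane content for a lodging cell: hotel contact details and actions."""
    hotel = HOTELS.get(value)
    if hotel is None:
        return {"help": "No booking found for this hotel.", "actions": []}
    return {
        "help": f"{value}: phone {hotel['phone']}",
        "actions": [f"mailto:{hotel['email']}"],
    }


def default_pane(value):
    """Fallback content for elements without a dedicated handler."""
    return {"help": "Enter a value for this field.", "actions": []}


# Dispatch table: element name of the selected cell -> handler.
PANE_HANDLERS = {"lodging": lodging_pane}


def on_selection_change(element_name, value):
    """Called when the cursor moves; returns what the task pane should show."""
    handler = PANE_HANDLERS.get(element_name, default_pane)
    return handler(value)


print(on_selection_change("lodging", "Grand Plaza"))
```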

3.3.4 Using the Office tools

While all these features sound useful, there is some setup required. Fortunately, you have this book to guide you through the process. We have identified typical implementation tasks, each of which is explained in a chapter. In the context of these tasks, we present the XML features of each of the Office products. Finally, in the last part, we cover in detail some of the technologies, such as schemas and stylesheets, you may use to get there.

9 See the chapter “Developing Office XML applications” for more on smart tags and smart documents.
