I need to convert a number of Word documents (Word 2003 XML) to DocBook 5.1 (30 documents, approx. 80 pages each). I have written a stylesheet for this and it works so far. However, I am stuck on the following problem:
The documents contain many lists. Word XML marks each list item (<w:listPr>), but as far as I can see it does not indicate where a list begins and ends; there are only the individual list items.
In XSLT I can capture the list items (<listitem>), but I don't know how to wrap them in the enclosing list element (<itemizedlist>).
One way might be to collect the lists with for-each-group or similar and copy the text content of the nodes into the target document. But the list items contain other formatting and elements, such as <w:instrText> (DocBook: <indexterm>), which must not be lost.
How can I handle this?
Word 2003 XML source (excerpt)
<w:p>
<w:pPr>
<w:pStyle w:val="2Standard"/>
<w:listPr>
<w:ilvl w:val="0"/>
<w:ilfo w:val="14"/>
<wx:t wx:val="·"/>
<wx:font wx:val="Symbol"/>
</w:listPr>
</w:pPr>
<w:r>
<w:t>die Prognose der Wirtschaft</w:t>
</w:r>
<w:r>
<w:fldChar w:fldCharType="begin"/>
</w:r>
<w:r>
<w:instrText> XE "Wirtschaft"</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="end"/>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="2Standard"/>
<w:listPr>
<w:ilvl w:val="0"/>
<w:ilfo w:val="14"/>
<wx:t wx:val="·"/>
<wx:font wx:val="Symbol"/>
</w:listPr>
</w:pPr>
<w:r>
<w:t>die Beratung der Politik.</w:t>
</w:r>
</w:p>
Desired Output
<itemizedlist>
<listitem>
<para>die Prognose der Wirtschaft
<indexterm><primary>Wirtschaft</primary></indexterm>
</para>
</listitem>
<listitem>
<para>die Beratung der Politik.</para>
</listitem>
</itemizedlist>
First Stylesheet approach
<xsl:template match="w:p">
<xsl:choose>
<xsl:when test="w:pPr/w:listPr/w:ilvl/@w:val = '0'">
<listitem>
<para>
<xsl:apply-templates select="w:r"/>
</para>
</listitem>
</xsl:when>
<xsl:otherwise>
<para>
<xsl:apply-templates/>
</para>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="w:r">
<xsl:choose>
<xsl:when test="w:instrText">
<indexterm>
<primary>
<xsl:apply-templates select="*/text()"/>
</primary>
</indexterm>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="w:t"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
I think it should be possible with an approach along the lines of
This transforms
into
Please consider providing namespace-well-formed samples/snippets next time.
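A minimal sketch of the grouping idea, assuming an XSLT 2.0 (or later) processor such as Saxon and assuming the list paragraphs are adjacent siblings inside a common container element. The namespace URIs are the standard WordprocessingML 2003 and DocBook 5 ones. Treating every run of adjacent paragraphs that carry <w:listPr> as one <itemizedlist> is an assumption about the input, not something the excerpt confirms. Because each group is processed with apply-templates rather than by copying text, inline content such as the index term is preserved.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
    xmlns="http://docbook.org/ns/docbook"
    exclude-result-prefixes="w wx">

  <xsl:output indent="yes"/>

  <!-- Any element that has w:p children: group its child elements so that
       adjacent list paragraphs end up in the same group.
       Assumption: list items of one list are direct, adjacent siblings. -->
  <xsl:template match="*[w:p]">
    <xsl:for-each-group select="*"
        group-adjacent="boolean(self::w:p/w:pPr/w:listPr)">
      <xsl:choose>
        <!-- Grouping key is true: a run of list paragraphs -->
        <xsl:when test="current-grouping-key()">
          <itemizedlist>
            <xsl:apply-templates select="current-group()"/>
          </itemizedlist>
        </xsl:when>
        <!-- Grouping key is false: everything else is processed unwrapped -->
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <!-- A list paragraph becomes a listitem containing a para -->
  <xsl:template match="w:p[w:pPr/w:listPr]">
    <listitem>
      <para>
        <xsl:apply-templates select="w:r"/>
      </para>
    </listitem>
  </xsl:template>

  <!-- Any other paragraph becomes a plain para -->
  <xsl:template match="w:p">
    <para>
      <xsl:apply-templates select="w:r"/>
    </para>
  </xsl:template>

  <!-- A run holding field instructions: take the text between the first
       pair of double quotes, e.g. XE "Wirtschaft" gives Wirtschaft.
       Assumption: every w:instrText is an XE index entry of this shape. -->
  <xsl:template match="w:r[w:instrText]">
    <indexterm>
      <primary>
        <xsl:value-of select="substring-before(
            substring-after(w:instrText, '&quot;'), '&quot;')"/>
      </primary>
    </indexterm>
  </xsl:template>

  <!-- Ordinary runs contribute only their text; runs that contain only
       w:fldChar markers produce nothing -->
  <xsl:template match="w:r">
    <xsl:apply-templates select="w:t"/>
  </xsl:template>

</xsl:stylesheet>
```

If two different lists can follow each other directly, one option would be to group on the w:ilfo value instead of a boolean, so that a change of list id starts a new <itemizedlist>. Nested levels (w:ilvl greater than 0) are not handled in this sketch and would need a further grouping step.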