How to remove prefix from every attribute in xml using python

Question

How to remove prefix from every attribute in xml using python

103 views Asked by dp808139 At 14 April 2023 at 11:44

I am trying to generate xml file from pandas dataframe using pd.to_xml() method.

I have data in SQL-Server and trying to read from it.

id	first_name	last_name	email	gender	ip_address
1	Erika	Pickless	[email protected]	female	133.146.172.220
2	Victor	Stodd	[email protected]	male	111.125.132.0
3	Becky	Berrisford	[email protected]	female	200.62.247.237

RawData = pd.read_sql_query('''select * from RFP.dbo.MOCK_DATA;''', conn)


RawData.to_xml('output.xml', attr_cols=['id', 'first_name', 'last_name', 'email', 
'gender', 'ip_address'],namespaces={"soap": "http://example1.com", "wsse": 
"http://docs.example2.org/wss/"}, prefix="soap")

I am getteing the following output

<?xml version='1.0' encoding='utf-8'?>
<soap:data xmlns:soap="http://example1.com" xmlns:wsse="http://docs.example2.org/wss/">
  <soap:row soap:index="0" soap:id="1" soap:first_name="Erika" soap:last_name="Pickless" soap:email="[email protected]" soap:gender="Female" soap:ip_address="133.146.172.220"/>
  <soap:row soap:index="1" soap:id="2" soap:first_name="Victoir" soap:last_name="Stodd" soap:email="[email protected]" soap:gender="Male" soap:ip_address="111.125.132.0"/>
  <soap:row soap:index="2" soap:id="3" soap:first_name="Becky" soap:last_name="Berrisford" soap:email="[email protected]" soap:gender="Female" soap:ip_address="200.62.247.237"/>
</soap:data>

The above output is almost correct but, I want little different output: Issue: The prefix "soap" is attached to every attribute in the above output.

But I want following output:

<?xml version='1.0' encoding='utf-8'?>
<soap:data xmlns:soap="http://example1.com" xmlns:wsse="http://docs.example2.org/wss/">
  <soap:row index="0" id="1" first_name="Erika" last_name="Pickless" email="[email protected]" gender="Female" ip_address="133.146.172.220"/>
  <soap:row index="1" id="2" first_name="Victoir" last_name="Stodd" email="[email protected]" gender="Male" ip_address="111.125.132.0"/>
  <soap:row index="2" id="3" first_name="Becky" last_name="Berrisford" email="[email protected]" gender="Female" ip_address="200.62.247.237"/>    
</soap:data>

Original Q&A

There are 1 answers

**Parfait** · Accepted Answer · 2023-04-15T23:05:48+00:00

Since XML is an open-ended design standard and Pandas cannot support all possible output specifications with default arguments, you need a customized solution.

Therefore consider running XSLT, the special-purpose language designed to transform XML files, which is supported in DataFrame.to_xml() using the default lxml parser and stylesheet argument. Below XSLT will run after your raw output to remove namespace prefixes from all attributes:

XSLT (save as .xsl script, a special .xml file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- IDENTITY TRANSFORM -->
    <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
    </xsl:template>

    <!-- REMOVE NAMESPACE PREFIXES FROM ALL ATTRIBUTES -->
    <xsl:template match="@*">
      <xsl:attribute name="{local-name()}">
        <xsl:value-of select="."/>
      </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>

Online Demo

Python

RawData.to_xml(
    "output.xml", 
    attr_cols=["id", "first_name", "last_name", "email", "gender", "ip_address"],
    namespaces={"soap": "http://example1.com", "wsse": "http://docs.example2.org/wss/"}, 
    prefix="soap",
    stylesheet="style.xsl"
)

TechQA.

How to remove prefix from every attribute in xml using python

There are 1 answers

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in XML

Related Questions in EXPORT-TO-XML

Popular Questions

Trending Questions