
Retrieve tags from xml using python


I have employee data in an employeedata.xml file. My sample data looks like this:

 <?xml version="1.0" standalone="yes"?>
    <DocumentElement>
      <_x005B_dbo_x005D_._x005B_employeedata_x005D_>
        <RowID>11148</RowID>
        <ItemID>966109</ItemID>
        <Mappings>[]</Mappings>
        <Groups>93664,68349</Groups>
        <GroupKey>7003142</GroupKey>
        <ParentItemID>351908</ParentItemID>
        <JobID>30318</JobID>
        <Action>Employee</Action>
        <Employee_Name>John Travis</Employee_Name>
        <mail_id>[email protected]</mail_id>
        <...>...</...>  
        <Action>Experience</Action>
        <...>...</...>
        <...>...</...>
        <...>...</...>

I'm using Python 3.8.10 and read the file with the following code:

with open('employeedata.xml', 'r') as f:
    data = f.read()

I would like to store my XML in a pandas DataFrame where the value of each Action tag goes in an Action column and the names of the tags that follow it go in another column, Field. My sample output would look like this:

|    Action    |    Field    |
|--------------|-------------|
|    Employee  |Employee_Name|
|    Employee  |mail_id      |
|    Employee  |....         |
|    Employee  |....         |
|    Experience|....         |
|    Experience|....         |

Can you please suggest how I should do this?


There are 2 answers

0
pabludo8 On BEST ANSWER

You are probably looking for Python's built-in XML parser module, xml.etree.ElementTree.

In your case:

import xml.etree.ElementTree as ET
tree = ET.parse('employeedata.xml')
root = tree.getroot()
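
From there, one possible way to get the Action/Field table is to walk the children of each record in document order and remember the most recent Action value. This is only a sketch under some assumptions: tags that appear before the first Action (RowID, ItemID, ...) are skipped because they do not show up in the desired output, the helper name action_field_rows is made up for illustration, and the inline sample (including the Years tag and the placeholder mail address) is hypothetical data standing in for the real file.

```python
import xml.etree.ElementTree as ET
import pandas as pd

def action_field_rows(root):
    """Yield (Action, Field) pairs for every tag that follows an <Action> tag."""
    for record in root:                  # each direct child of the root is one record
        current_action = None
        for child in record:             # children are visited in document order
            if child.tag == "Action":
                current_action = child.text
            elif current_action is not None:
                # Assumption: tags before the first <Action> are ignored
                yield current_action, child.tag

# Hypothetical inline sample; with the real file use ET.parse('employeedata.xml').getroot()
sample = """<DocumentElement>
  <_x005B_dbo_x005D_._x005B_employeedata_x005D_>
    <RowID>11148</RowID>
    <Action>Employee</Action>
    <Employee_Name>John Travis</Employee_Name>
    <mail_id>user@example.com</mail_id>
    <Action>Experience</Action>
    <Years>5</Years>
  </_x005B_dbo_x005D_._x005B_employeedata_x005D_>
</DocumentElement>"""

root = ET.fromstring(sample)
df = pd.DataFrame(list(action_field_rows(root)), columns=["Action", "Field"])
print(df.to_string(index=False))
```

If the file holds several records, each one restarts with no current Action, so tags from one record are never attributed to an Action of the previous record.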
0
Hermann12 On

pandas can read the XML file directly:

import pandas as pd

df = pd.read_xml("employeedata.xml", xpath="_x005B_dbo_x005D_._x005B_employeedata_x005D_")
print(df[['Action', 'Employee_Name']].to_string(index=False))

If you like, you can also print the whole DataFrame.

Output:

Action Employee_Name
Experience   John Travis