I am trying to achieve a conversion of some JSON data to CSV for a later analysis.
Issue is due to the n-nested levels, I think, and mainly for the last stage of the conversion process.
So far, I have been able to create a proper data rows mapping, but it fails once I try to include the respective data into each single rows for x'000s of times.
I downloaded the data from a website and use it from an offline file in my desktop, but the JSON data comes from: https://ws.cso.ie/public/api.restful/PxStat.Data.Cube_API.ReadDataset/HSA06/JSON-stat/1.0/en
It comes from Irish Data Department: https://data.gov.ie/dataset/hsa06-average-price-of-houses
Last code I have is:
import json
import csv
local_file_path = "/Users/piuno/OneDrive - Asociación Club Taller Cultural/full_data_housing.json"
# Load JSON data from file
with open(local_file_path, 'r') as json_file:
data = json.load(json_file)
# Writing to csv
with open('output.csv', mode='w', newline='') as file:
writer = csv.writer(file)
# Write header
header = ['Statistic', 'Quarter', 'Area', 'Value']
writer.writerow(header)
# Write data rows
statistic_mapping = data['dataset']['dimension']['STATISTIC']['category']['label']
quarter_mapping = data['dataset']['dimension']['TLIST(Q1)']['category']['label']
area_mapping = data['dataset']['dimension']['C02343V02817']['category']['label']
values = data['dataset']['value']
print("Statistical Mapping:", statistic_mapping)
print("Quarter Mapping:", quarter_mapping)
print("Area Mapping:", area_mapping)
for i in range(len(values)):
if values[i] is not None:
statistic_index = i // (len(data['dataset']['dimension']['TLIST(Q1)']['category']['index']) * len(data['dataset']['dimension']['C02343V02817']['category']['index']))
quarter_index = (i % (len(data['dataset']['dimension']['TLIST(Q1)']['category']['index']) * len(data['dataset']['dimension']['C02343V02817']['category']['index']))) // len(data['dataset']['dimension']['C02343V02817']['category']['index'])
area_index = i % len(data['dataset']['dimension']['C02343V02817']['category']['index'])
row = [statistic_mapping[str(statistic_index)], quarter_mapping[str(quarter_index)], area_mapping[str(area_index)], values[i]]
writer.writerow(row)
With that, it returns properly the prints as it shows in the image below:

The expected result would be something like:

I don't know what's failing or missing in the loop to look up for the specific value and then return the specific data as shown in the second image.
You are trying to unfold the JSON file. Try transforming the json to a tree and then transversing the tree in the correct way to obtain each row.