How to import a custom JSON encoder class into the Data Catalog

I have a DataFrame that stores lists in one of its columns. I am saving the DataFrame, with all of its columns, as JSON using this catalog entry:

config_new:
  type: json.JSONDataSet
  filepath: data/01_raw/new_config.json
  save_args:
    indent: 6

All columns are saved correctly except the one holding lists, which is written out as a string:

"T": [{
  "Col1": "9",
  "Col2": "["7","9","0","5"]"
}]

As shown above, the Col2 list is written out as a string.

I wrote the following JSON encoder class in a Python script and saved it under src:

import json

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, list):
            return obj.to_json()
        return super().default(obj)
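Worth noting for an encoder like this: the json module only calls default() for objects it cannot serialize natively, and both list and str are handled natively. The sketch below uses a hypothetical TracingEncoder (not part of the original code) to record what actually reaches default(). It assumes, as one possible cause, that the column may hold strings that merely look like lists.

```python
import json

calls = []  # type name of every object that json hands to default()


class TracingEncoder(json.JSONEncoder):
    """Hypothetical encoder used only to observe when default() runs."""

    def default(self, obj):
        calls.append(type(obj).__name__)
        if isinstance(obj, set):
            # Sets are not natively serializable, so default() is consulted.
            return sorted(obj)
        return super().default(obj)


# A real list is serialized natively; default() is never consulted.
as_list = json.dumps({"Col2": ["7", "9"]}, cls=TracingEncoder)

# A set is unknown to json, so default() is called once.
as_set = json.dumps({"Col2": {"7", "9"}}, cls=TracingEncoder)

# A string that merely looks like a list is also native: it stays a string.
stringified = json.dumps({"Col2": str(["7", "9"])}, cls=TracingEncoder)

print(calls)
print(as_list)
print(stringified)
```

If the column turns out to contain such strings, no encoder will change them, and they would need to be parsed back into lists before saving.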

Updated my config to:

config_new:
  type: json.JSONDataSet
  filepath: data/01_raw/new_config.json
  save_args:
    indent: 6
    cls: custom_encoder.CustomEncoder

However, CustomEncoder is not recognized, and saving fails with an error saying that a str cannot be called.
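A minimal sketch of where an error like this can come from, assuming the dataset forwards save_args to the json module unchanged (an assumption about the dataset's internals, not something confirmed in this thread). A value read from YAML is a plain string, and json tries to instantiate whatever it receives as cls:

```python
import json

payload = {"T": [{"Col1": "9", "Col2": ["7", "9", "0", "5"]}]}

# What the YAML entry yields: cls is the text of the class path, not a class.
save_args = {"indent": 6, "cls": "custom_encoder.CustomEncoder"}

try:
    json.dumps(payload, **save_args)
    error_message = None
except TypeError as exc:
    # json calls cls(...) to build the encoder, which fails for a string.
    error_message = str(exc)

print(error_message)
```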

I am not sure how to import the class into the catalog.

1 Answer

Answer by datajoely

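You're on the right track, but values in the YAML catalog arrive as plain strings, so cls: custom_encoder.CustomEncoder is passed along as a str rather than as a class. Instead, subclass json.JSONDataSet, apply the custom encoder inside that subclass, and then reference the subclass by its class path in the type field of your YAML catalog entry.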

https://docs.kedro.org/en/stable/data/how_to_create_a_custom_dataset.html