The original title of this question was going to be
Why does the Python jsons library produce serialized output with underscores prepended to the field names?
However while writing the question I realized why this is happening. Now the question has become more about whether or not this can be prevented and if so how to go about doing so.
Here are some examples to demonstrate the problem:
Example 1: Dictionary serialized as expected
with open('tmp.txt', 'w') as ofile:
d = {'key': 'value'}
ofile.write(jsons.dumps(d))
# produces:
# {"key": "value"}
Example 2: Class (object) serialized as expected
class Test:
def __init__(self, value:str) -> None:
self.value = value
with open('tmp.txt', 'w') as ofile:
test = Test('value')
ofile.write(jsons.dumps(test))
# produces:
# {"value": "value"}
Example 3: As soon as the concept of properties are introduced, we begin to see duplicated output
class TestProperty:
def __init__(self, value:str) -> None:
self.value = value
@property
def value(self) -> str:
return self._value
@value.setter
def value(self, the_value) -> None:
self._value = the_value
with open('tmp.txt', 'w') as ofile:
test = TestProperty('value')
ofile.write(jsons.dumps(test))
# produces:
# {"_value": "value", "value": "value"}
Example 4: If we swap out the properties concept for getter/setter functions, the problem goes away
class TestFunction:
def __init__(self, value:str) -> None:
self.value = value
def get_value(self) -> str:
return self.value
def set_value(self, the_value) -> None:
self.value = the_value
with open('tmp.txt', 'w') as ofile:
test = TestFunction('value')
ofile.write(jsons.dumps(test))
# produces:
# {"value": "value"}
What problems are created?
As Example 3 shows, if properties are introduced, the data is effectively duplicated.
- I don't fully understand why this is happening
- I thought
propertiesshould work like functions, as in Example 4, but they don't and cause a fieldvalueto be serialized - I understand why
_valueis being serialized in Example 3, it is an attribute of the class, it is the field name where we are storing the actual data values
There are three potential issues with this:
- Harder to inspect messages when debugging. The data is duplicated making the JSON message harder to read and using up screen space (which might become important if the message to be inspected is large)
- Slower message transfer rates (sending twice as much data over a network)
- Wasted disk space (might become important if storing large quantities of data, because available space to store records is reduced by a factor of 2)
Why is the data duplicated in Example 3 and is there a solution to this?
strip_privates=Trueremoves attributes starting with an underscore.