Pandas Dataframe - enforce data properties


I would like to enforce properties on pandas data tables. Mostly a "uniqueness" of "primary keys" in the table would be interesting.

Is there a way to ensure such properties without having to call a validation function? It would be preferable that pandas throws an error if ANY modification to the table breaches the defined rules.

I already found the package pandera, but it only works when a validation function is called. The checks are not enforced on the table at all times.


There is 1 answer

D.L On

You can use the set_index() function to set a column or set of columns as the index of a DataFrame.

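Note that set_index() does not check for uniqueness by default. If you pass verify_integrity=True and the column has duplicates, pandas will raise a ValueError. (A KeyError is raised only when the named column does not exist.)

The question has no actual code to debug, so I post a boilerplate example of what it might look like:

import pandas as pd
df = pd.read_csv("data.csv")
# verify_integrity=True raises ValueError if "id" contains duplicates
df = df.set_index("id", verify_integrity=True)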

There are different variants on this.
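One variant that comes closer to what the question asks is sketched below. It assumes pandas 1.2 or newer, where set_flags(allows_duplicate_labels=False) marks the object so that operations producing duplicate index labels raise DuplicateLabelError. The data and the helper names try_duplicate_key and try_duplicate_on_load are made up for illustration. The pandas documentation describes this flag as experimental and says not every method propagates it, so treat it as a partial safeguard rather than a full primary-key constraint.

```python
import pandas as pd
from pandas.errors import DuplicateLabelError

# Hypothetical data; "id" plays the role of the primary key.
df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# verify_integrity=True makes set_index check the new index for duplicates.
df = df.set_index("id", verify_integrity=True)

# Disallow duplicate labels on the resulting object (pandas >= 1.2).
df = df.set_flags(allows_duplicate_labels=False)


def try_duplicate_key(frame):
    """Attempt a modification that would turn key 1 into a second key 2."""
    try:
        frame.rename(index={1: 2})
    except DuplicateLabelError:
        return "rejected"
    return "accepted"


def try_duplicate_on_load():
    """Attempt to index a table whose "id" column already has duplicates."""
    dup = pd.DataFrame({"id": [1, 1], "value": ["a", "b"]})
    try:
        dup.set_index("id", verify_integrity=True)
    except ValueError:
        return "rejected"
    return "accepted"


print(try_duplicate_key(df))
print(try_duplicate_on_load())
```

Both checks live on the index, so this only helps when the "primary key" is the index (or a MultiIndex for composite keys). For rules on ordinary columns, an explicit validation step such as pandera is still needed.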