Using describe() method to exclude a column


I am new to using Python with data sets and am trying to exclude a column ("id") from the output of describe(). How do I go about this using describe() and its exclude parameter?


There are 4 answers

0
Ajay A On BEST ANSWER

describe() works on data types: you can include or exclude columns by data type, not by column name. If your id column is the only column of its data type, then

df.describe(exclude=[datatype])
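
As a minimal sketch of the dtype-based route, assume a made-up frame in which id is the only int64 column (the column names and values are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical data: 'id' is the only int64 column, the others are float64.
df = pd.DataFrame({
    "id": [101, 102, 103],
    "height": [1.70, 1.82, 1.65],
    "weight": [68.0, 80.5, 59.2],
})
df["id"] = df["id"].astype("int64")  # make the dtype explicit on every platform

# exclude takes dtypes, so every int64 column is left out of the summary.
summary = df.describe(exclude=["int64"])
print(summary)
```

Keep in mind that this would also hide any other int64 column, which is why selecting by column name is often the simpler route.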

Or, if you just want to leave the column(s) out of describe(), try this:

cols = set(df.columns) - {'id'}
df1 = df[list(cols)]
df1.describe()

Ta-da, it's done. For more info, see the pandas documentation for describe().
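
One thing to be aware of: a set does not keep column order, so the columns of df1 may come out shuffled. A small sketch that keeps the original order by using a list comprehension instead of a set (the data is made up for illustration):

```python
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "height": [1.70, 1.82, 1.65],
    "weight": [68.0, 80.5, 59.2],
})

# Keep every column except 'id', in the original left-to-right order.
cols = [c for c in df.columns if c != "id"]
df1 = df[cols]
summary = df1.describe()
print(summary)
```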

1
robdev91 On

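Use output.drop(columns=['id']).describe(). Passing exclude=['id'] does not work, because exclude expects data types, not column names.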
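
Since exclude matches data types rather than column names, a common way to leave out a column by name is to drop it before summarizing. A minimal sketch, where output is a made-up DataFrame standing in for the asker's data:

```python
import pandas as pd

# Hypothetical data; 'output' mirrors the variable name used above.
output = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [0.5, 0.7, 0.9],
})

# drop() returns a new DataFrame, so 'output' itself still has the 'id' column.
summary = output.drop(columns=["id"]).describe()
print(summary)
```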

0
Joao Gabriel Fekete On

You can do that by slicing your original DF to remove the 'id' column. One way is through .iloc. Suppose the 'id' column is the first column of your DF; then you could do this:

df.iloc[:,1:].describe()

The first colon selects all rows; the second slice, 1:, selects every column from the second one onward.
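
A short, self-contained sketch of the positional approach, assuming a made-up frame where id sits in position 0:

```python
import pandas as pd

# Hypothetical data with 'id' as the first column.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "height": [1.70, 1.82, 1.65],
    "weight": [68.0, 80.5, 59.2],
})

# All rows, and every column from position 1 onward (skipping 'id' at position 0).
summary = df.iloc[:, 1:].describe()
print(summary)
```

This relies on id being the first column; if its position might change, selecting by name is safer.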

0
Zamboni On

Although somebody already responded with an example from the official docs, which is more than enough, I just want to add this, since it might help a few people:

If your DataFrame is large (say, hundreds of columns), removing one or two columns might not be enough. Instead, create a smaller DataFrame holding only what you're interested in and go from there.

Example of removing 2+ columns:

columns_to_keep = set(your_bigger_data_frame.columns) - {'column_1', 'column_2', 'column3', 'etc'}
your_new_smaller_data_frame = your_bigger_data_frame[list(columns_to_keep)]
your_new_smaller_data_frame.describe()

If your DataFrame is small or medium sized, you already know every column, and you only need a few of them, just create a new DataFrame and then apply describe().

Here is an example that reads a .csv file and then keeps only the portion of the DataFrame that holds what you need:

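import pandas as pd
# Raw string, so '\f' in the path is not read as an escape sequence
df = pd.read_csv(r'.\docs\project\file.csv')
# Keep only the columns you need
df = df[['column_1', 'column_2', 'column_3', 'etc']]
df.describe()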
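
If only a few columns are needed, read_csv can also skip the rest at load time through its usecols parameter. A sketch under assumptions: the CSV content below is invented and is read from an in-memory string, so the example runs without a file on disk.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file.
csv_text = """id,column_1,column_2,column_3
1,10,0.5,a
2,20,0.7,b
3,30,0.9,c
"""

# usecols loads only the listed columns; 'id' and 'column_3' are never read in.
df = pd.read_csv(io.StringIO(csv_text), usecols=["column_1", "column_2"])
summary = df.describe()
print(summary)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.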