Extracting X_train and y_train from tabular data where a column is target and another columns are features with its corresponding ID

Question

Extracting X_train and y_train from tabular data where a column is target and another columns are features with its corresponding ID

38 views Asked by Muhammad Ikhwan Perwira At 18 December 2023 at 11:53

I have this DataFrame:

timestep	ID	x1	x2	y
1	1	2	3	0
2	1	4	5	0
...	...	...	...	...
24	1	5	5	0
...	...	...	...	...
1	9	2	3	1
2	9	4	5	1
...	...	...	...	...
24	9	2	2	1

I expect it converted to the numpy ndarray X_train with shape (9, 24, 2). Where, 9 represent how many IDs. 24 represent n-rows aka timestep. 2 represents how many features (x1 and x2)

It same to the y_train with shape (9, 1) where 9 represents how many IDs and 1 represents scalar value since all y will be same for all 24 timesteps

The minimal reproducible example for this, when accessing X_train[0,:,:] it will returns table with ID=1 with 24 rows (timesteps) like this:

x1	x2
2	3
4	5
...	...
5	5

(the last 24-th row)

Original Q&A

There are 1 answers

**mozway** · Answer 1 · 2023-12-18T11:56:55+00:00

First sort the rows by ID/timestep with sort_values, then convert to_numpy and reshape:

out = (df.sort_values(by=['ID', 'timestep'])[['x1', 'x2']]
         .to_numpy().reshape((9, 24, 2))
       )

Or, for an arbitrary number of features/columns:

n = df[['ID', 'timestep']].nunique()

out = (df.sort_values(by=['ID', 'timestep'])
         .filter(like='x')
         .to_numpy().reshape((n['ID'], n['timestamp'], -1))
       )

NB. assuming no combinations of ID/timestamps is missing/duplicated

For y, since the IDs are already sorted:

y_train = df.drop_duplicates(subset='ID')['y'].to_numpy()

TechQA.

Extracting X_train and y_train from tabular data where a column is target and another columns are features with its corresponding ID

There are 1 answers

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in NUMPY

Related Questions in TABULAR

Popular Questions

Trending Questions

timestep	ID	x1	x2	y
1	1	2	3	0
2	1	4	5	0
...	...	...	...	...
24	1	5	5	0
...	...	...	...	...
1	9	2	3	1
2	9	4	5	1
...	...	...	...	...
24	9	2	2	1

timestep	ID	x1	x2	y
1	1	2	3	0
2	1	4	5	0
...	...	...	...	...
24	1	5	5	0
...	...	...	...	...
1	9	2	3	1
2	9	4	5	1
...	...	...	...	...
24	9	2	2	1

timestep	ID	x1	x2	y
1	1	2	3	0
2	1	4	5	0
...	...	...	...	...
24	1	5	5	0
...	...	...	...	...
1	9	2	3	1
2	9	4	5	1
...	...	...	...	...
24	9	2	2	1