I have two tables. The first reference table is below:
| Name | Target | Bonus |
|------|--------:|------:|
| Joe | 40 | 46 |
| Phil | 38 | 42 |
| Dean | 65 | 70 |
The Python code to generate the table is:
# Data for the table
data = {
'Name': ['Joe', 'Phil', 'Dean'],
'Target': [40, 38, 65],
'Bonus': [46, 42, 70]
}
# Creating the DataFrame
ref = pd.DataFrame(data)
My second table is below:
| week | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023 | Target | 40 | 65 |
| 11/6/2023 | Bonus | 46 | 70 |
| 11/6/2023 | Score | 33 | 71 |
| 11/13/2023 | Target | 40 | NaN |
| 11/13/2023 | Bonus | 46 | NaN |
| 11/13/2023 | Score | 45 | NaN |
| 11/20/2023 | Target | 40 | 65 |
| 11/20/2023 | Bonus | 46 | 70 |
| 11/20/2023 | Score | 35 | 68 |
| 11/27/2023 | Target | NaN | 65 |
| 11/27/2023 | Bonus | NaN | 70 |
| 11/27/2023 | Score | NaN | 44 |
| 12/4/2023 | Target | 40 | 65 |
| 12/4/2023 | Bonus | 46 | 70 |
| 12/4/2023 | Score | 42 | 66 |
The Python code to generate this table is:
# Data for the new table
data = {
'week': ['11/6/2023', '11/6/2023', '11/6/2023', '11/13/2023', '11/13/2023', '11/13/2023',
'11/20/2023', '11/20/2023', '11/20/2023', '11/27/2023', '11/27/2023', '11/27/2023',
'12/4/2023', '12/4/2023', '12/4/2023'],
'Metrics': ['Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
'Target', 'Bonus', 'Score', 'Target', 'Bonus', 'Score',
'Target', 'Bonus', 'Score'],
'Joe': [40, 46, 33, 40, 46, 45, 40, 46, 35, None, None, None, 40, 46, 42],
'Dean': [65, 70, 71, None, None, None, 65, 70, 68, 65, 70, 44, 65, 70, 66]
}
# Creating the DataFrame
df = pd.DataFrame(data)
As you can see Dean has a week where his Target, Bonus, and Score cells are blank. So does Joe in a later week. In these specific instances where the cell is NaN I want to populate them using the following rules:
- Get Target and Bonus cell values for each person from the first reference table and populate the NaN cell accordingly.
- Set the Score cell equal to the Target cell value for the person.
My desired output table would look like this:
| week | Metrics | Joe | Dean |
|------------|---------|----:|-----:|
| 11/6/2023 | Target | 40 | 65 |
| 11/6/2023 | Bonus | 46 | 70 |
| 11/6/2023 | Score | 33 | 71 |
| 11/13/2023 | Target | 40 | 65 |
| 11/13/2023 | Bonus | 46 | 70 |
| 11/13/2023 | Score | 45 | 65 |
| 11/20/2023 | Target | 40 | 65 |
| 11/20/2023 | Bonus | 46 | 70 |
| 11/20/2023 | Score | 35 | 68 |
| 11/27/2023 | Target | 40 | 65 |
| 11/27/2023 | Bonus | 46 | 70 |
| 11/27/2023 | Score | 40 | 44 |
| 12/4/2023 | Target | 40 | 65 |
| 12/4/2023 | Bonus | 46 | 70 |
| 12/4/2023 | Score | 42 | 66 |
Only one block of NaN per column at most
Another possible solution, which loops through the
dfcolumns corresponding to each person and, for each block ofNaN(identified byloc), assigns the corresponding block of values inref(also identified byloc):General case
In case there is more than a single block of
NaNper person, we need to change the code slightly:Edit
To satisfy the new requirement of the OP: Instead of order Target, Bonus and Score, it is needed the order Bonus, Target and Score. In such a case, we need to readjust the previous code:
Output: