How to add prefix/suffix on a repeatable dictionary key in Python


Is there a way to keep all repeated (duplicate) keys by adding a prefix or suffix? In the example below, the Address key appears 3 times, but the number of repetitions may vary (1 to 3). I want to get the expected output shown below, where a suffix makes each key unique. Currently, dict.update overwrites the value each time the key repeats.

list = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:[email protected]']
dic = {}
for i in list:
    j=i.split(':')
    dic.update({j[0]:j[1]})
print(dic)

Current output: {'name': 'John', 'age': '25', 'Address': 'Washington', 'email': '[email protected]'}

Expected output: {'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': '[email protected]'}

There are 4 answers

2
Suraj Shourie On BEST ANSWER

You can use something like this:

list_ = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:[email protected]']

dic = {}
for i in list_:
    j = i.split(':')
    key_ = j[0]
    count = 0 # counts the number of duplicates
    while key_ in dic:
        count += 1
        key_ = j[0] + str(count)
    dic[key_] = j[1]

Output (the first occurrence keeps the bare key, so the suffixes start at the second occurrence):

{'name': 'John',
 'age': '25',
 'Address': 'Chicago',
 'Address1': 'Phoenix',
 'Address2': 'Washington',
 'email': '[email protected]'}

PS. Don't use list as a variable name. It is not a keyword but the name of a built-in type, and assigning to it shadows that type for the rest of the scope.
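
To illustrate the PS above, here is a minimal sketch of what goes wrong once the name list is rebound. The function name shadowing_demo is hypothetical and exists only to keep the shadowing local to this example.

```python
# Demonstrates why reusing the built-in name `list` for a variable causes trouble.
def shadowing_demo():
    list = ['name:John', 'age:25']  # this local name now hides the built-in type
    try:
        list('abc')  # attempts to call the list object, not the built-in type
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)
```

Renaming the variable (for example to list_ or lst, as the answers here do) avoids the problem.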

2
fferri On

Don't use list as a variable name. list is the name of a Python built-in class, and it is used in the following solution. I renamed your list variable to l.

This solution consists of first building a multidict (using collections.defaultdict(list)) to store the multiple values:

import collections
d = collections.defaultdict(list)
for entry in l:
    # maxsplit=1 splits only at the first ':' so the value may contain colons
    key, value = entry.split(':', 1)
    d[key].append(value)

Now d contains (shown here as a plain dict):

{'name': ['John'], 'age': ['25'], 'Address': ['Chicago', 'Phoenix', 'Washington'], 'email': ['[email protected]']}

Then iterate over the items of d; if a key has more than one value, append a numeric suffix to it:

output = {}
for key, values in d.items():
    if len(values) > 1:
        for i, value in enumerate(values):
            output[f'{key}{i+1}'] = value
    else:
        output[key] = values[0]

output:

{'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': '[email protected]'}
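
The two steps above can be combined into one helper. This is only a sketch: the function name suffix_duplicate_keys and the sep parameter are made up for illustration, and the sample list is a shortened version of the data in the question.

```python
from collections import defaultdict

def suffix_duplicate_keys(entries, sep=':'):
    """Return a dict in which keys occurring more than once get 1-based suffixes."""
    # Step 1: group all values under their original key.
    grouped = defaultdict(list)
    for entry in entries:
        key, value = entry.split(sep, 1)  # split only at the first separator
        grouped[key].append(value)

    # Step 2: keep unique keys as they are, number the repeated ones.
    result = {}
    for key, values in grouped.items():
        if len(values) == 1:
            result[key] = values[0]
        else:
            for n, value in enumerate(values, start=1):
                result[f'{key}{n}'] = value
    return result

sample = ['name:John', 'age:25', 'Address:Chicago', 'Address:Phoenix', 'Address:Washington']
print(suffix_duplicate_keys(sample))
```

Because the values are grouped first, all suffixed variants of a key end up next to each other in the result, at the position where the key first appeared.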

0
Andrej Kesely On

Another solution (this iterates over the list only once):

lst = [
    "name:John",
    "age:25",
    "Address:Chicago",
    "Address:Phoenix",
    "Address:Washington",
    "email:[email protected]",
]

cnts, out = {}, {}
for k, v in map(lambda s: s.split(":"), lst):
    c = cnts.get(k, 0)
    if c == 0:
        out[k] = v
    elif c == 1:
        out[f"{k}1"] = out.pop(k)
        out[f"{k}2"] = v
    else:
        out[f"{k}{c + 1}"] = v

    cnts[k] = c + 1

print(out)

Prints:

{
    "name": "John",
    "age": "25",
    "Address1": "Chicago",
    "Address2": "Phoenix",
    "Address3": "Washington",
    "email": "[email protected]",
}

0
Alain T. On

You could first separate the keys from the values into two lists, then make the list of keys unique by adding suffixes, and finally combine the unique keys with the values into a dictionary:

data = ['name:John','age:25','Address:Chicago',
        'Address:Phoenix','Address:Washington','email:[email protected]']

keys,values = zip(*(s.split(":") for s in data))
keys        = [ k+str(keys[:i].count(k))*(keys.count(k)>1) 
                for i,k in enumerate(keys,1) ]
dic         = dict(zip(keys,values))

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': '[email protected]'}

Note that this does not cover cases where the suffixed keys clash with original keys. For example: ["Address1:...","Address:...","Address:..."] would produce a duplicate "Address1" by adding a suffix to the "Address" key. If that situation could exist in your data, a different approach would be needed.
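
One possible way to handle such clashes, sketched under the assumption that it is acceptable to skip a suffix that is already in use. The function name unique_keys and the city values in the sample are hypothetical; this is not part of the approach above.

```python
from collections import Counter

def unique_keys(pairs):
    """Suffix duplicated keys, skipping any suffixed name that is already taken."""
    counts = Counter(k for k, _ in pairs)
    taken = set(counts)   # every original key is reserved up front
    next_n = {}           # next suffix number to try for each duplicated key
    out = {}
    for k, v in pairs:
        if counts[k] == 1:
            out[k] = v    # unique keys are kept unchanged
            continue
        n = next_n.get(k, 1)
        while f'{k}{n}' in taken:
            n += 1        # skip suffixes that collide with an existing key
        new_key = f'{k}{n}'
        taken.add(new_key)
        next_n[k] = n + 1
        out[new_key] = v
    return out

pairs = [('Address1', 'Denver'), ('Address', 'Chicago'), ('Address', 'Phoenix')]
print(unique_keys(pairs))
```

With this sample, the original Address1 entry is kept and the two Address entries become Address2 and Address3, so no value is overwritten. Whether skipping numbers like this is the right behavior depends on how the keys are consumed later.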

Alternatively, you can use a dictionary to group values in lists associated with each key and then expand this group dictionary to produce distinct keys:

grp = dict()
grp.update( (k,grp.get(k,[])+[v]) for s in data for k,v in [s.split(":")] )
dic = { k+str(i or ''):v for k,g in grp.items() 
                         for i,v in enumerate(g,len(g)>1) }

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': '[email protected]'}

Although grp itself may actually be easier to manipulate in subsequent code:

print(grp)

{'name':    ['John'], 
 'age':     ['25'], 
 'Address': ['Chicago', 'Phoenix', 'Washington'], 
 'email':   ['[email protected]']}
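
As a small illustration of why the grouped form can be more convenient, here are a few lookups on it. The grp literal below is a shortened copy of the output above, and the variable names are made up for this sketch.

```python
# Shortened copy of the grouped dictionary shown above.
grp = {
    'name': ['John'],
    'age': ['25'],
    'Address': ['Chicago', 'Phoenix', 'Washington'],
}

first_address = grp['Address'][0]        # first value recorded for the key
address_count = len(grp['Address'])      # how many times the key appeared
repeated = [k for k, vals in grp.items() if len(vals) > 1]  # keys seen more than once

print(first_address, address_count, repeated)
```

With suffixed keys, the same questions would require parsing the key names again to find out which entries belong together.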