This article shows how to aggregate the values of one column in Python pandas, using other specified columns as the grouping key, with example code.

Sample data:

dictionary =[{'Flow': 100, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 90, 'Location': 'Europe', 'Name': 'B1'},
            {'Flow': 20, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 70, 'Location': 'Europe', 'Name': 'B1'}]

Desired aggregated result (Flow summed for each Location/Name pair):

new_dictionary =[{'Flow': 120, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 160, 'Location': 'Europe', 'Name': 'B1'}]

Using groupby, sum, and to_dict

import pandas as pd
dictionary =[{'Flow': 100, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 90, 'Location': 'Europe', 'Name': 'B1'},
            {'Flow': 20, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 70, 'Location': 'Europe', 'Name': 'B1'}]
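# Group by Location and Name, sum Flow, then convert back to a list of dicts.
# 'records' yields one dict per row; 'dict' would instead return a
# column-oriented mapping ({column: {index: value}}), not the list shown above.
print(pd.DataFrame(dictionary)
   .groupby(['Location', 'Name'], as_index=False)
   .Flow.sum()
   .to_dict('records'))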
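
By default, groupby sorts the group keys, so the Europe row is printed before the USA row. If the groups should appear in the order they first occur in the input, as in the desired result above, passing sort=False is one option. The following is a minimal sketch under that assumption; the function name aggregate_flow and the variable names records and result are illustrative and not part of the original example.

```python
import pandas as pd

records = [
    {'Flow': 100, 'Location': 'USA', 'Name': 'A1'},
    {'Flow': 90, 'Location': 'Europe', 'Name': 'B1'},
    {'Flow': 20, 'Location': 'USA', 'Name': 'A1'},
    {'Flow': 70, 'Location': 'Europe', 'Name': 'B1'},
]


def aggregate_flow(rows):
    """Sum Flow per (Location, Name), keeping groups in first-seen order."""
    return (
        pd.DataFrame(rows)
        # sort=False keeps groups in order of first appearance
        # instead of sorting them by key.
        .groupby(['Location', 'Name'], as_index=False, sort=False)['Flow']
        .sum()
        # One dict per aggregated row.
        .to_dict('records')
    )


result = aggregate_flow(records)
print(result)
```

Each dict in the output carries the grouping columns first and the summed Flow last; since dict equality ignores key order, this still compares equal to the desired result.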

Alternatively, without pandas, using itertools.groupby:

from itertools import groupby
from operator import itemgetter
dictionary =[{'Flow': 100, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 90, 'Location': 'Europe', 'Name': 'B1'},
            {'Flow': 20, 'Location': 'USA', 'Name': 'A1'},
            {'Flow': 70, 'Location': 'Europe', 'Name': 'B1'}]

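grouper = ['Location', 'Name']
key = itemgetter(*grouper)
# itertools.groupby only merges adjacent items, so sort by the same key first.
# Note that list.sort() reorders the input list in place.
dictionary.sort(key=key)
# Rebuild one dict per group: the key columns plus the summed Flow.
print([{**dict(zip(grouper, k)), 'Flow': sum(map(itemgetter('Flow'), g))}
    for k, g in groupby(dictionary, key=key)])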
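
If sorting the input is undesirable, the same totals can be accumulated in a plain dict keyed by the grouping tuple. This is a hedged sketch rather than part of the original answer; the helper name aggregate_flow_plain and its keys and value parameters are hypothetical. It leaves the input list untouched and returns groups in first-seen order.

```python
def aggregate_flow_plain(rows, keys=('Location', 'Name'), value='Flow'):
    """Sum `value` per combination of `keys` without sorting the input."""
    totals = {}
    for row in rows:
        # Build the grouping key, e.g. ('USA', 'A1').
        group = tuple(row[k] for k in keys)
        totals[group] = totals.get(group, 0) + row[value]
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return [{**dict(zip(keys, group)), value: total}
            for group, total in totals.items()]


rows = [
    {'Flow': 100, 'Location': 'USA', 'Name': 'A1'},
    {'Flow': 90, 'Location': 'Europe', 'Name': 'B1'},
    {'Flow': 20, 'Location': 'USA', 'Name': 'A1'},
    {'Flow': 70, 'Location': 'Europe', 'Name': 'B1'},
]
summary = aggregate_flow_plain(rows)
print(summary)
```

Compared with the itertools.groupby version, this makes a single pass over the data and does not depend on the rows being sorted beforehand.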
