Filtering a Python Pandas DataFrame by Multiple Conditions

This article shows several ways to filter the rows of a Pandas DataFrame by multiple conditions, with example code for each.

import pandas as pd

df = pd.DataFrame(
    [[1990, 7, 1000], [1990, 8, 2500], [1990, 9, 2500], [1990, 9, 1500],
     [1991, 1, 250], [1991, 2, 350], [1991, 3, 350], [1991, 7, 450]],
    columns=['year', 'month', 'data1'],
)

Sample data:

year    month    data1
1990      7      1000
1990      8      2500
1990      9      2500
1990      9      1500
1991      1      250
1991      2      350
1991      3      350
1991      7      450

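A multi-condition filter looks like this. The example below keeps every row except those where year is 1990 and month is 7: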

df = df.loc[(df.year != 1990) | (df.month != 7)]
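
As a minimal sketch of how the boolean operators behave in this filter: each comparison needs its own parentheses, because `|`, `&` and `~` bind more tightly than `!=` and `==`. By De Morgan's law, the `|` form is equivalent to negating an `&` of the two equalities. The variable names `keep_or` and `keep_not_and` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1990, 7, 1000], [1990, 8, 2500], [1990, 9, 2500], [1990, 9, 1500],
     [1991, 1, 250], [1991, 2, 350], [1991, 3, 350], [1991, 7, 450]],
    columns=['year', 'month', 'data1'],
)

# Keep rows where the year is not 1990 OR the month is not 7
keep_or = df.loc[(df.year != 1990) | (df.month != 7)]

# Equivalent form: negate "year is 1990 AND month is 7"
keep_not_and = df.loc[~((df.year == 1990) & (df.month == 7))]

# Both filters select the same rows
print(keep_or.equals(keep_not_and))
print(keep_or)
```

Only the row with year 1990 and month 7 is removed. Excluding several (year, month) pairs this way would require one such clause per pair, which is what the approaches below avoid.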

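1. Using apply and isin

Both examples in this section drop the rows whose (year, month) pair is (1990, 7), (1990, 8), or (1991, 1).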

import pandas as pd

df = pd.DataFrame([[1990,7,1000],[1990,8,2500],[1990,9,2500],[1990,9,1500],[1991,1,250],[1991,2,350],[1991,3,350],[1991,7,450]], columns =['year','month','data1'])

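# Turn each row's (year, month) into a tuple, then mark the rows whose tuple is not in the exclusion list
mask = ~df[['year', 'month']].apply(tuple, axis=1).isin([(1990, 7), (1990, 8), (1991, 1)])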
print(df[mask])

Or:

import pandas as pd

df = pd.DataFrame([[1990,7,1000],[1990,8,2500],[1990,9,2500],[1990,9,1500],[1991,1,250],[1991,2,350],[1991,3,350],[1991,7,450]], columns =['year','month','data1'])
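# Encode each (year, month) pair as one integer, e.g. (1990, 7) -> 199007, and test membership on that
mask = ~(df.year*100 + df.month).isin({199007, 199008, 199101})
print(df[mask])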

Output:

   year  month  data1
2  1990      9   2500
3  1990      9   1500
5  1991      2    350
6  1991      3    350
7  1991      7    450
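
Another way to express the same pair-wise exclusion, offered here only as a sketch and not one of the approaches shown above, is to build a `MultiIndex` from the two columns and call its `isin` method with a list of tuples. The names `excluded`, `pairs` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1990, 7, 1000], [1990, 8, 2500], [1990, 9, 2500], [1990, 9, 1500],
     [1991, 1, 250], [1991, 2, 350], [1991, 3, 350], [1991, 7, 450]],
    columns=['year', 'month', 'data1'],
)

# (year, month) pairs to exclude
excluded = [(1990, 7), (1990, 8), (1991, 1)]

# Build a MultiIndex of (year, month) pairs, one entry per row of df
pairs = pd.MultiIndex.from_frame(df[['year', 'month']])

# MultiIndex.isin returns a NumPy boolean array; negate it to keep the other rows
mask = ~pairs.isin(excluded)

result = df[mask]
print(result)
```

This avoids the row-by-row `apply(tuple, axis=1)` call and does not depend on the columns being integers that can be packed into a single number.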

2. Using merge

import pandas as pd

df = pd.DataFrame([[1990,7,1000],[1990,8,2500],[1990,9,2500],[1990,9,1500],[1991,1,250],[1991,2,350],[1991,3,350],[1991,7,450]], columns =['year','month','data1'])
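# Inner-merge on year and month to find the index labels of the rows to exclude, then drop those labels
out = df.drop(df.reset_index().merge(pd.DataFrame({'year':[1990,1990,1991],'month':[7,8,1]}))['index'])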
print(out)

Or:

import pandas as pd

df = pd.DataFrame([[1990,7,1000],[1990,8,2500],[1990,9,2500],[1990,9,1500],[1991,1,250],[1991,2,350],[1991,3,350],[1991,7,450]], columns =['year','month','data1'])
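# Left-merge with indicator=True, keep the rows found only in df, then drop the helper _merge column
out = df.merge(pd.DataFrame({'year':[1990,1990,1991],'month':[7,8,1]}), indicator=True, how='left').loc[lambda x: x['_merge'] == 'left_only'].drop(columns='_merge')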
print(out)

Output:

   year  month  data1
2  1990      9   2500
3  1990      9   1500
5  1991      2    350
6  1991      3    350
7  1991      7    450
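
The left merge with `indicator=True` can be wrapped in a small helper, sketched below. The function name `exclude_rows` and its parameters are hypothetical, and the sketch assumes `df` has an unnamed index and no column called `index` or `_merge`. It carries the original index through the merge with `reset_index`, so the result keeps the original row labels even when they are not 0, 1, 2, and so on.

```python
import pandas as pd

def exclude_rows(df, to_exclude, on):
    """Return the rows of df whose values in the `on` columns do not appear in to_exclude.

    Hypothetical helper; assumes df has an unnamed index and no 'index' or '_merge' column.
    """
    merged = df.reset_index().merge(
        to_exclude[on].drop_duplicates(),  # de-duplicate so no row of df is repeated
        on=on,
        how='left',
        indicator=True,
    )
    # 'left_only' marks rows of df that had no match in to_exclude
    kept = merged.loc[merged['_merge'] == 'left_only']
    # Restore the original index labels and remove the helper column
    return kept.set_index('index').rename_axis(None).drop(columns='_merge')

df = pd.DataFrame(
    [[1990, 7, 1000], [1990, 8, 2500], [1990, 9, 2500], [1990, 9, 1500],
     [1991, 1, 250], [1991, 2, 350], [1991, 3, 350], [1991, 7, 450]],
    columns=['year', 'month', 'data1'],
)
to_exclude = pd.DataFrame({'year': [1990, 1990, 1991], 'month': [7, 8, 1]})

out = exclude_rows(df, to_exclude, on=['year', 'month'])
print(out)
```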
