DataFrame.nsmallest(self, n, columns, keep='first') → 'DataFrame'
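Return the first n rows ordered by columns in ascending order.
Return the first n rows with the smallest values in columns, in ascending order. Columns that are not specified are returned as well, but are not used for ordering.
This method is equivalent to df.sort_values(columns, ascending=True).head(n), but more performant.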
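The equivalence stated above can be checked directly. The following is a minimal sketch, assuming a small illustrative frame whose population values are all distinct; ties are avoided on purpose, because the two approaches are not guaranteed to order tied rows identically.

```python
import pandas as pd

# Illustrative data with no duplicate population values (an assumption of this sketch)
df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 337000, 11300],
        "GDP": [1937894, 2583560, 12011, 17036, 182],
    },
    index=["Italy", "France", "Malta", "Iceland", "Nauru"],
)

# Partial selection of the 3 smallest rows
via_nsmallest = df.nsmallest(3, "population")

# Full sort followed by truncation
via_sort = df.sort_values("population", ascending=True).head(3)

# Compare values, dtypes and index of the two results
same = via_nsmallest.equals(via_sort)
print(same)
```

The full sort orders every row before discarding most of them, which is why nsmallest is usually the better choice when n is small relative to the number of rows.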
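Parameters:
    n : int
        Number of items to retrieve.
    columns : list or str
        Column name or names to order by.
    keep : {'first', 'last', 'all'}, default 'first'
        Where there are duplicate values:
        1) first : take the first occurrence.
        2) last : take the last occurrence.
        3) all : do not drop any duplicates, even if it means selecting more than n items.
           New in version 0.24.0.
Returns:
    DataFrame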
Examples
1) Select the three rows with the smallest values in the population column

    import pandas as pd

    # Create the DataFrame
    df = pd.DataFrame({
        'population': [59000000, 65000000, 434000, 434000, 434000,
                       337000, 11300, 11300, 11300],
        'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
        'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
    }, index=["Italy", "France", "Malta", "Maldives", "Brunei",
              "Iceland", "Nauru", "Tuvalu", "Anguilla"])

    # The 3 rows with the smallest population
    print(df.nsmallest(3, 'population'))
2) Use keep='last' so that, among tied values, the last occurrences are taken first

    import pandas as pd

    # Create the DataFrame
    df = pd.DataFrame({
        'population': [59000000, 65000000, 434000, 434000, 434000,
                       337000, 11300, 11300, 11300],
        'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
        'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
    }, index=["Italy", "France", "Malta", "Maldives", "Brunei",
              "Iceland", "Nauru", "Tuvalu", "Anguilla"])

    # Ties are resolved in favour of the last occurrences
    print(df.nsmallest(3, 'population', keep='last'))
3) Use keep='all' to keep every row tied for the smallest values (the result is not capped at 3 rows)

    import pandas as pd

    # Create the DataFrame
    df = pd.DataFrame({
        'population': [59000000, 65000000, 434000, 434000, 434000,
                       337000, 11300, 11300, 11300],
        'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
        'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
    }, index=["Italy", "France", "Malta", "Maldives", "Brunei",
              "Iceland", "Nauru", "Tuvalu", "Anguilla"])

    # No tied row is dropped, even if that yields more than n rows
    print(df.nsmallest(3, 'population', keep='all'))
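Because exactly three rows share the smallest population in this data, n=3 returns three rows under every keep setting. As a hedged sketch of how the settings differ, the same data can be queried with n=2 (a value chosen here only for illustration), so that the tie straddles the cutoff.

```python
import pandas as pd

df = pd.DataFrame({
    'population': [59000000, 65000000, 434000, 434000, 434000,
                   337000, 11300, 11300, 11300],
    'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
    'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
}, index=["Italy", "France", "Malta", "Maldives", "Brunei",
          "Iceland", "Nauru", "Tuvalu", "Anguilla"])

# Three rows tie at population 11300, but only 2 rows are requested
first_rows = df.nsmallest(2, 'population', keep='first')
last_rows = df.nsmallest(2, 'population', keep='last')
all_rows = df.nsmallest(2, 'population', keep='all')

# Row counts per setting; keep='all' is allowed to exceed n
counts = {
    'first': len(first_rows),
    'last': len(last_rows),
    'all': len(all_rows),
}
print(counts)
print(all_rows)
```

With keep='first' and keep='last' the result is cut to n rows and the setting decides which tied rows survive; with keep='all' every tied row is returned.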
4) Order by multiple columns: first by population, then by GDP

    import pandas as pd

    # Create the DataFrame
    df = pd.DataFrame({
        'population': [59000000, 65000000, 434000, 434000, 434000,
                       337000, 11300, 11300, 11300],
        'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
        'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
    }, index=["Italy", "France", "Malta", "Maldives", "Brunei",
              "Iceland", "Nauru", "Tuvalu", "Anguilla"])

    # Ties in population are broken by GDP
    print(df.nsmallest(3, ['population', 'GDP']))
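To make the effect of the second column visible, the following sketch contrasts a single-column call with a two-column call on the same data, using n=2 (an illustrative choice) so that the tie in population has to be broken.

```python
import pandas as pd

df = pd.DataFrame({
    'population': [59000000, 65000000, 434000, 434000, 434000,
                   337000, 11300, 11300, 11300],
    'GDP': [1937894, 2583560, 12011, 4520, 12128, 17036, 182, 38, 311],
    'alpha-2': ["IT", "FR", "MT", "MV", "BN", "IS", "NR", "TV", "AI"]
}, index=["Italy", "France", "Malta", "Maldives", "Brunei",
          "Iceland", "Nauru", "Tuvalu", "Anguilla"])

# Only population is used: tied rows are taken in their original order
by_population = df.nsmallest(2, 'population')

# GDP breaks the tie among the rows with population 11300
by_population_gdp = df.nsmallest(2, ['population', 'GDP'])

single_order = list(by_population.index)
multi_order = list(by_population_gdp.index)
print(single_order)
print(multi_order)
```

In the two-column call the rows with the lowest GDP among the tied populations come first, so the order of the returned rows differs from the single-column call.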