Introduction to the nsmallest() Function
In data analysis and manipulation, the ability to sort and extract the top or bottom values from a dataset is crucial. The nsmallest() function in the Python pandas library extracts the n smallest elements from a pandas Series, or the rows of a DataFrame with the n smallest values in the given columns. It is particularly useful with large datasets: when n is small relative to the size of the data, it is faster than sorting everything and taking the first n rows.
In this tutorial, we will explore the pandas nsmallest() function in detail. We’ll cover its syntax, parameters, and usage, with practical examples.
Table of Contents
- Overview of the nsmallest() Function
- Syntax of nsmallest()
- Parameters of nsmallest()
- Examples of Using nsmallest()
  - Example 1: Extracting Smallest Values from a Series
  - Example 2: Finding Smallest Values in a DataFrame Column
- Conclusion
1. Overview of the nsmallest() Function
The nsmallest() function is a built-in pandas method that extracts the n smallest values from a pandas Series or DataFrame. It returns the specified number of elements sorted in ascending order by value. Each element keeps its original index label, but the result is not in the dataset's original order.
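As a quick check of the ordering, the sketch below compares nsmallest() with sorting the whole Series and keeping the first n rows. The Series `s`, its labels, and its values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data, chosen only to illustrate the ordering.
s = pd.Series([40, 10, 30, 20], index=["a", "b", "c", "d"])

# The two smallest values, in ascending order, with their original labels.
result = s.nsmallest(2)
print(result)

# Sorting everything and keeping the first two rows gives the same Series.
expected = s.sort_values().head(2)
print(result.equals(expected))
```

The index labels ("b" and "d" here) are what let you trace each value back to its position in the original data.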
2. Syntax of nsmallest()
The syntax of the nsmallest() function is as follows:
Series.nsmallest(n=5, keep='first')
DataFrame.nsmallest(n, columns, keep='first')
n
: The number of smallest values to extract. For a Series it defaults to 5.
columns
: (DataFrame only) The column label, or list of labels, to order by.
keep
: This parameter determines how ties are handled. It can take one of three values: 'first' (default), 'last', or 'all'.
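To illustrate the n argument, here is a small sketch on a made-up Series `s`. Calling Series.nsmallest() with no arguments uses the default n=5, and asking for more values than the Series holds simply returns every value in ascending order.

```python
import pandas as pd

# Hypothetical values, used only to show how n behaves.
s = pd.Series([7, 3, 9, 1, 5, 8, 2])

default_five = s.nsmallest()    # n defaults to 5 for a Series
all_sorted = s.nsmallest(100)   # n is larger than len(s)

print(default_five.tolist())
print(all_sorted.tolist())
```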
3. Parameters of nsmallest()
Let’s take a closer look at the parameters of the nsmallest() function:
n
: This parameter specifies the number of smallest values to extract. For Series.nsmallest() it is optional and defaults to 5; DataFrame.nsmallest() requires it. Passing 0 returns an empty result. If n is greater than the length of the Series, the function returns all values in ascending order.
keep
: This parameter determines which entries are kept when several share the same value at the cutoff. It accepts three possible values:
'first'
: This is the default behavior. Among tied values, those that appear first are kept.
'last'
: Among tied values, those that appear last are kept.
'all'
: This keeps every entry tied with the largest selected value, so the result may contain more than n elements.
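The keep options only matter when there is a tie at the cutoff. The following sketch uses a made-up Series in which three entries share the value 2, so that asking for the 2 smallest values forces a choice among them.

```python
import pandas as pd

# Hypothetical Series with a tie at the cutoff: labels b, d, and e all hold 2.
s = pd.Series([3, 2, 1, 2, 2], index=["a", "b", "c", "d", "e"])

first = s.nsmallest(2, keep="first")  # the 1, then the first tied 2
last = s.nsmallest(2, keep="last")    # the 1, then the last tied 2
every = s.nsmallest(2, keep="all")    # the 1, then every tied 2

print(list(first.index))
print(list(last.index))
print(list(every.index))
```

With keep="all" the result has four entries even though n is 2, because all three tied values are retained.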
4. Examples of Using nsmallest()
Now, let’s dive into some practical examples to demonstrate how to use the nsmallest() function effectively.
Example 1: Extracting Smallest Values from a Series
Suppose you have a pandas Series containing the scores of students in a mathematics test. You want to extract the 3 smallest scores. Here’s how you can use the nsmallest() function to achieve this:
import pandas as pd
# Create a sample Series
scores = pd.Series([85, 92, 78, 64, 90, 72, 56, 88, 80])
# Extract the 3 smallest scores
smallest_scores = scores.nsmallest(3)
print("Smallest Scores:")
print(smallest_scores)
Output:
Smallest Scores:
6    56
3    64
5    72
dtype: int64
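In this example, the nsmallest(3) call returns a new Series containing the 3 smallest scores, 56, 64, and 72, in ascending order. The left-hand column shows each score's index label in the original Series (6, 3, and 5).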
Example 2: Finding Smallest Values in a DataFrame Column
Let’s consider a more complex scenario involving a DataFrame. Suppose you have a DataFrame containing information about various products, including their prices. You want to extract the 5 products with the lowest prices. Here’s how you can use the nsmallest() function in this context:
import pandas as pd
# Create a sample DataFrame
data = {
'Product': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
'Price': [25, 18, 30, 15, 12, 40, 28, 10, 22, 8]
}
df = pd.DataFrame(data)
# Extract the 5 products with the lowest prices
smallest_prices = df['Price'].nsmallest(5)
print("Products with Lowest Prices:")
print(df[df['Price'].isin(smallest_prices)])
Output:
Products with Lowest Prices:
  Product  Price
1       B     18
3       D     15
4       E     12
7       H     10
9       J      8
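In this example, nsmallest(5) is applied to the 'Price' column of the DataFrame. The resulting Series contains the 5 smallest prices, and the .isin() method filters the original DataFrame down to the rows whose price is among them. Because this is a filter, the rows keep their original DataFrame order rather than being sorted by price, and if other products shared one of those prices, more than 5 rows would be returned.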
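pandas also provides DataFrame.nsmallest(), which takes the column name (or list of names) to rank by and returns whole rows, so the .isin() step is not needed. Here is a minimal sketch using the same sample data. The rows come back ordered by price, and with the default keep='first' the result never exceeds n rows.

```python
import pandas as pd

# Same sample data as in the example above.
df = pd.DataFrame({
    "Product": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
    "Price": [25, 18, 30, 15, 12, 40, 28, 10, 22, 8],
})

# Rank rows by the 'Price' column and keep the 5 cheapest.
cheapest = df.nsmallest(5, "Price")
print(cheapest)
```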
5. Conclusion
In this tutorial, we explored the pandas nsmallest() function, which is a valuable tool for extracting the n smallest values from a pandas Series or DataFrame. We discussed its syntax and parameters and worked through examples that illustrate its usage in real-world scenarios.
By using the nsmallest() function, you can efficiently identify and analyze the smallest values in your datasets and base your decisions on them.