pandas get row by index

2 min read 11-10-2024
Accessing Data with Precision: A Guide to Pandas iloc and loc for Row Selection

pandas, a data manipulation library for Python, offers several ways to access data. One common task is selecting specific rows from a DataFrame, which is where the iloc and loc indexers come in handy.

Understanding iloc and loc

Both iloc and loc enable you to select rows (and columns) from your DataFrame, but they differ in their indexing mechanisms:

  • iloc: This indexer is position-based. It selects rows and columns by their integer position in the DataFrame, starting from 0, much like accessing elements in a list. As with list slicing, the stop position of a slice is excluded.
  • loc: This indexer is label-based. It selects rows and columns by their row and column labels, much like accessing elements in a dictionary. It also accepts a boolean mask, and a label slice includes both of its endpoints.
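
The two indexers only give different answers once row labels stop matching row positions. A minimal sketch of that situation, assuming a small hypothetical DataFrame (the scores name and its labels 10, 20, 30 are illustrative and not part of the sales example that follows):

```python
import pandas as pd

# Hypothetical data: row labels are deliberately not 0, 1, 2
scores = pd.DataFrame({'score': [88, 92, 75]}, index=[10, 20, 30])

by_position = scores.iloc[0]   # first row, whatever its label is
by_label = scores.loc[10]      # the row whose label is 10

# Both expressions refer to the same row here
print(by_position['score'], by_label['score'])

# scores.loc[0] would raise KeyError, because no row is labeled 0
```

With the default 0, 1, 2, ... index, positions and labels coincide, which is why the two are easy to confuse.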

Examples:

Let's illustrate the differences with a practical example. Consider a DataFrame representing sales data:

import pandas as pd

data = {'Product': ['Laptop', 'Keyboard', 'Mouse', 'Monitor'],
        'Price': [1200, 50, 25, 300],
        'Quantity': [10, 20, 50, 15]}

df = pd.DataFrame(data)

print(df)

    Product  Price  Quantity
0    Laptop   1200        10
1  Keyboard     50        20
2     Mouse     25        50
3   Monitor    300        15

Selecting Rows with iloc:

To select the first row (position 0), use:

first_row = df.iloc[0]
print(first_row)

Product     Laptop
Price         1200
Quantity        10
Name: 0, dtype: object

To select the second and third rows (positions 1 and 2), slice with 1:3. As with Python lists, the stop position is excluded:

second_third_rows = df.iloc[1:3]
print(second_third_rows)

    Product  Price  Quantity
1  Keyboard     50        20
2     Mouse     25        50
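
iloc also accepts negative positions and lists of positions. The sketch below rebuilds the same sales DataFrame so it runs on its own; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Keyboard', 'Mouse', 'Monitor'],
                   'Price': [1200, 50, 25, 300],
                   'Quantity': [10, 20, 50, 15]})

last_row = df.iloc[-1]          # negative positions count from the end
picked_rows = df.iloc[[0, 2]]   # a list selects non-adjacent rows
one_row_frame = df.iloc[[0]]    # a one-element list returns a DataFrame, not a Series

print(last_row['Product'])
print(picked_rows['Product'].tolist())
print(type(one_row_frame).__name__)
```

Passing a single integer returns a Series, while passing a list or a slice returns a DataFrame, which matters when later code expects one or the other.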

Selecting Rows with loc:

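To select the row whose Product value is 'Keyboard', pass a boolean mask to loc. ('Keyboard' is a column value here, not a row label; the row labels are still 0 to 3.)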

keyboard_row = df.loc[df['Product'] == 'Keyboard']
print(keyboard_row)

    Product  Price  Quantity
1  Keyboard     50        20

To select all rows with a price greater than 100, use:

expensive_rows = df.loc[df['Price'] > 100]
print(expensive_rows)

   Product  Price  Quantity
0   Laptop   1200        10
3  Monitor    300        15
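
The two loc calls above filter on column values with boolean masks. loc can also look rows up directly by index label once a column is promoted to the index. A minimal sketch, assuming Product values are unique (the by_product name is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Keyboard', 'Mouse', 'Monitor'],
                   'Price': [1200, 50, 25, 300],
                   'Quantity': [10, 20, 50, 15]})

# Assumption: Product values are unique, so they can serve as row labels
by_product = df.set_index('Product')

keyboard = by_product.loc['Keyboard']          # one label returns a Series
subset = by_product.loc[['Laptop', 'Mouse']]   # a list of labels returns a DataFrame

print(keyboard['Price'])
print(subset.index.tolist())
```

set_index returns a new DataFrame and leaves df unchanged, so the original integer labels remain available on df.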

When to Use iloc and loc

  • iloc: Use when you need to select rows by their integer position in the DataFrame, regardless of what the row labels are.
  • loc: Use when you want to select rows by their labels, or filter rows with a boolean condition. This is particularly useful when the DataFrame has meaningful row labels.
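
One practical difference to keep in mind when choosing between them is how slices treat the stop value. A short sketch on the same sales data, where the default labels happen to equal the positions (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Keyboard', 'Mouse', 'Monitor'],
                   'Price': [1200, 50, 25, 300],
                   'Quantity': [10, 20, 50, 15]})

positional = df.iloc[1:3]   # positions 1 and 2; the stop position is excluded
labelled = df.loc[1:3]      # labels 1, 2 and 3; the stop label is included

print(len(positional), len(labelled))
```

The same slice text therefore returns a different number of rows depending on the indexer.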

Important Notes:

  • iloc: This indexer uses zero-based positions, so the first row is at position 0 whatever its label is.
  • loc: This indexer uses the actual row labels, which may not be numbers and may not match the row positions (for example, after filtering or sorting).
  • Both iloc and loc can also be used to select specific columns. For example, df.iloc[:, 0] selects the first column.
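
Row and column selectors can be combined in a single call, separated by a comma. A brief sketch on the same sales data (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Keyboard', 'Mouse', 'Monitor'],
                   'Price': [1200, 50, 25, 300],
                   'Quantity': [10, 20, 50, 15]})

first_column = df.iloc[:, 0]        # all rows, first column by position
price_column = df.loc[:, 'Price']   # all rows, one column by name

# Rows chosen by a boolean mask, columns chosen by name
expensive = df.loc[df['Price'] > 100, ['Product', 'Price']]

# A single cell: row position 2, column position 1
mouse_price = df.iloc[2, 1]

print(first_column.tolist())
print(expensive.shape)
print(mouse_price)
```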

By understanding these differences and applying the appropriate method, you gain more control and flexibility when working with your pandas DataFrames.
