What is Indexing in Pandas?
- Indexing in Pandas means selecting particular rows and columns from a DataFrame.
- Indexing in Pandas works much like indexing a Python list or a NumPy array.
- There are two different methods of indexing in Pandas:
- loc - label based selection
- iloc - index-based selection
Index-Based Selection
- Index-based selection means selecting data by its numerical position in the DataFrame.
- iloc is used for selecting data based on numerical position.
- The syntax for the iloc operator is df.iloc[ ], where df is the name of a DataFrame. Pass the numerical positions of the rows and columns you want to select inside the square brackets.
df.iloc[ ]
- Do you remember the indexing in a NumPy array? If not, don’t worry; you will soon see its implementation on a dataset.
- Import the Pandas library and load the ‘exam_scores.csv’ file.
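The loading step can be sketched as follows. The contents of ‘exam_scores.csv’ are not shown in this section, so the column names and values below are hypothetical stand-ins, read from an in-memory string so that the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'exam_scores.csv'; the real file's columns are not shown here.
csv_text = """name,math,science
Asha,82,91
Ben,75,68
Chen,93,88
Dara,64,79
"""

# With the real file on disk this would be: df = pd.read_csv('exam_scores.csv')
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the DataFrame
print(df.shape)    # (number of rows, number of columns)
print(df.index)    # default integer index starting at 0
```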
Selecting data using iloc:
- You can also pass a list of indexes.
- You can also pass negative indexes, which count from the end.
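A minimal sketch of these iloc patterns, assuming a small made-up DataFrame (the names and scores are illustrative only, not taken from the real dataset):

```python
import pandas as pd

# Hypothetical exam data, used only for illustration.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dara"],
    "math": [82, 75, 93, 64],
    "science": [91, 68, 88, 79],
})

first_row = df.iloc[0]            # single row by position, returned as a Series
one_value = df.iloc[0, 1]         # row 0, column 1 -> a single value
first_two = df.iloc[0:2]          # rows 0 and 1 (the end position is excluded)
picked = df.iloc[[0, 2], [0, 2]]  # lists of row positions and column positions
last_row = df.iloc[-1]            # negative position counts from the end
last_col = df.iloc[:, -1]         # every row, last column

print(picked)
```

As with a NumPy array, the row selector comes first and the column selector second, separated by a comma.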
Label-Based Selection
- Label-based selection selects data by row or column labels (names) rather than by position. This matters when the DataFrame has meaningful row labels or when you want to refer to columns by name.
- Label-based selection is made with loc.
- Do you remember the default integer indexing of a DataFrame and the indexes/names that you changed in the previous topics?
- loc and iloc are conceptually similar. The difference is that iloc always selects by integer position, regardless of the index labels, while loc selects by the index labels themselves.
- loc is used for selecting data based on the data index value/name, not the numerical positions.
- The syntax for loc is similar to iloc: df.loc[ ], where df is the DataFrame name.
df.loc[ ]
- We will learn about its implementation on a dataset soon.
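In the meantime, here is a minimal sketch of loc next to iloc. It assumes a made-up DataFrame whose row labels are student names (illustrative only).

```python
import pandas as pd

# Hypothetical exam data with custom row labels replacing the default 0..3 index.
df = pd.DataFrame(
    {"math": [82, 75, 93, 64], "science": [91, 68, 88, 79]},
    index=["Asha", "Ben", "Chen", "Dara"],
)

one_value = df.loc["Ben", "math"]                   # row label, column label
some_rows = df.loc[["Asha", "Chen"], ["science"]]   # lists of labels
label_slice = df.loc["Asha":"Chen"]                 # label slice includes the end label
position_slice = df.iloc[0:2]                       # position slice excludes the end position

print(len(label_slice), len(position_slice))
```

One difference worth noting: a label slice with loc includes its end label, while a position slice with iloc excludes its end position.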