Learning Objectives
- Reading a data file
- Writing a data file
Reading Data Files
- It is useful to be able to create a DataFrame by hand, but in practice we rarely build our data that way. We usually work with data that already exists.
- Data exists in several formats. The most basic of these is the CSV file. CSV stands for comma-separated values.
What is a CSV file?
- CSV files are normally created by programs that handle large amounts of data. They are a convenient way to export data from spreadsheets and databases and import or use it in other programs.
- CSV is a simple file format that stores tabular data (numbers and text), such as the contents of a spreadsheet or database table, in plain text.
- Each line of the file is a data record.
- Each record consists of one or more fields, separated by commas.
- The use of the comma as a field separator is the source of the name for this file format.
What does a CSV file look like?
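As a rough illustration, the sketch below holds the text of a small CSV file in a Python string and splits it into records and fields. The column names and values are made up for this example and do not come from any real dataset.

```python
# Hypothetical CSV content, made up for illustration.
# The first line holds the column names; each later line is one record.
csv_text = """name,age,city
Alice,30,London
Bob,25,Paris"""

# Each line is one record; commas separate the fields within a record.
records = [line.split(",") for line in csv_text.splitlines()]

for record in records:
    print(record)
```

Splitting on commas by hand only works for simple data like this. Fields that themselves contain commas are normally wrapped in quotes, which is one reason to use a proper CSV reader instead.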
Working with CSV files in Python
- For working with CSV files, Python has a built-in module named csv.
- However, a common method for working with CSV files is using Pandas. It makes importing and analyzing data much easier.
- One crucial feature of Pandas is its ability to write and read Excel, CSV, and many other types of files.
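For comparison, here is a minimal sketch of reading CSV data with the built-in csv module. The data is hypothetical, and io.StringIO is used as a stand-in for an open file so that the example runs without any file on disk.

```python
import csv
import io

# Hypothetical CSV content; io.StringIO behaves like an open text file.
csv_text = "name,age,city\nAlice,30,London\nBob,25,Paris\n"

# DictReader uses the first row as field names and yields one dict per record.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# Every value comes back as a string, so numbers must be converted by hand.
ages = [int(row["age"]) for row in rows]

print(rows[0]["name"])
print(ages)
```

The manual type conversion is part of what Pandas handles automatically when it loads a CSV file into a DataFrame.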
Pandas read_csv
- Pandas functions such as read_csv() let you work with files effectively.
- The read_csv() function reads the CSV file into a DataFrame object.
- A CSV file is similar to a two-dimensional table, and the DataFrame object represents a two-dimensional tabular view.
- The most basic way to read a CSV file in Pandas is to pass its filename to read_csv().
- The filename is given as the first argument, as either a relative or an absolute path.
- The function can do much more than this: its arguments can change the returned DataFrame substantially.
- For instance, read_csv() can read a CSV file from a URL as well as from a local path, and it can import only selected columns so that we don't have to edit the DataFrame later.
- These behaviors are controlled by the arguments the function takes.
- We don't need to memorize all the arguments, though. Let's look at a few important ones below.
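A few common uses of read_csv() can be sketched as follows. The data is hypothetical, and io.StringIO stands in for a file on disk so that the example is self-contained; with a real file you would pass its path instead. The arguments shown (usecols, nrows) are only a small selection.

```python
import io

import pandas as pd

# Hypothetical CSV content, made up for illustration.
csv_text = "name,age,city\nAlice,30,London\nBob,25,Paris\n"

# Most basic call: one argument, the file to read.
# With a real file this would be e.g. pd.read_csv("people.csv"),
# where "people.csv" is a hypothetical filename.
df = pd.read_csv(io.StringIO(csv_text))

# usecols imports only the listed columns.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["name", "age"])

# nrows limits how many data rows are read.
first_row = pd.read_csv(io.StringIO(csv_text), nrows=1)

print(df)
print(subset)
print(first_row)
```

Selecting columns at import time means the unwanted ones never enter the DataFrame, so there is nothing to drop afterwards.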
Pandas to_csv with example
- The easiest way to write DataFrames to CSV files is using the Pandas to_csv function.
- Syntax: DataFrame.to_csv(path_or_buf), where path_or_buf is the path of the output file.
- If you want to export without the index, add index=False.
- Example:
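The sketch below shows to_csv with and without the index, using a small made-up DataFrame. To keep the example self-contained it calls to_csv without a path, in which case the CSV text is returned as a string instead of being written to disk. The filename in the final comment is hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame, made up for illustration.
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25]})

# Called without a path, to_csv returns the CSV text as a string.
with_index = df.to_csv()

# index=False leaves out the row labels (0, 1, ...).
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)

# To write a file instead, pass a path ("people.csv" is a hypothetical name):
# df.to_csv("people.csv", index=False)
```

Comparing the two printed results shows the extra leading column that the index adds, which is often unwanted when the file will be read by another program.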
Comprehensive Tutorial
- Must Read
A comprehensive tutorial on Pandas for beginners:
https://www.learndatasci.com/tutorials/python-pandas-tutorial-complete-introduction-for-beginners/
Slide Download Link
You can download the slides for this topic from here.