Histogram
What is a Histogram?
A histogram is a graphical display of data using bars (rectangles) of different heights.
Parts of a Histogram
- The title: The title describes the information included in the histogram.
- X-axis: The X-axis is divided into intervals that show the range of values the measurements fall into. These intervals are also called bins.
- Y-axis: The Y-axis shows the frequency, that is, the number of times values occurred within each interval on the X-axis.
- The bars: The height of the bar shows the number of times the values occurred within the interval, while the width of the bar shows the covered interval.
Example: Height of Orange Trees
- You measure the height of every tree in the orchard in centimeters (cm)
- The heights vary from 100 cm to 340 cm
- You decide to put the results into groups of 50 cm:
- The 100 to just below 150 cm range,
- The 150 to just below 200 cm range, and so on.
- So, a tree that is 260 cm tall is added to the "250-300" range.
- And here is the result:
- You can see (for example) that there are 30 trees from 150 cm to just below 200 cm tall. You just created a histogram!
- Source: https://www.mathsisfun.com/data/histograms.html
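The grouping step in this example can be sketched in a few lines of plain Python. The heights below are made up for illustration (they are not the orchard data), and `bin_low_edge` is a hypothetical helper name; the sketch only shows how each tree lands in a 50 cm group.

```python
from collections import Counter

# Made-up tree heights in cm, for illustration only
heights_cm = [100, 120, 155, 170, 199, 200, 260, 310, 340]

BIN_START = 100  # lowest height covered by the groups
BIN_WIDTH = 50   # each group spans 50 cm


def bin_low_edge(height):
    """Return the lower edge of the 50 cm group that `height` falls into."""
    return BIN_START + ((height - BIN_START) // BIN_WIDTH) * BIN_WIDTH


# Count how many trees fall into each group, keyed by the group's lower edge
counts = Counter(bin_low_edge(h) for h in heights_cm)

for low in sorted(counts):
    print(f"{low} to just below {low + BIN_WIDTH} cm: {counts[low]} trees")
```

With this helper, a tree that is 260 cm tall is counted in the group that starts at 250 cm, matching the "250-300" range described above.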
Creating a Histogram
Matplotlib can be used to create histograms using the hist() method.
Parameters:
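- x: an (n,) array or a sequence of (n,) arrays. This takes either a single array or a sequence of arrays that are not required to be of the same length.
- bins: an integer, a sequence, or a string. An integer gives the number of equal-width bins, while a sequence gives the bin edges, which may be unequally spaced.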
If the bins are:
[1, 2, 3, 4]
then the first bin is [1, 2) (including 1, but excluding 2) and the second [2, 3). The last bin, however, is [3, 4], which includes 4.
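As a minimal sketch of the bin-edge rule just described, the snippet below passes the edges `[1, 2, 3, 4]` to `plt.hist()` and inspects the counts it returns. The sample values are made up, and the `Agg` backend is an assumption chosen so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up sample values, for illustration only
values = [1, 1.5, 2, 2.5, 3, 3.5, 4]
bin_edges = [1, 2, 3, 4]

plt.figure()
# hist() returns the count per bin, the bin edges, and the drawn bars
counts, edges, patches = plt.hist(values, bins=bin_edges, edgecolor="black")

plt.title("Histogram with explicit bin edges")
plt.xlabel("Value")
plt.ylabel("Frequency")

# The value 4 is counted in the last bin because that bin is closed on the right
print(counts)

# In an interactive session call plt.show(); use plt.savefig("histogram.png") to write a file
```

Here the values 1 and 1.5 fall in [1, 2), the values 2 and 2.5 fall in [2, 3), and 3, 3.5 and 4 all fall in the last bin [3, 4].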
Applications of Histograms
- Histograms are a very common type of plot when looking at data like height and weight, stock prices, waiting time for a customer, etc., which are continuous in nature.
- Histograms are good for showing general distributional features of dataset variables. You can see roughly where the peaks of the distribution are, whether the distribution is skewed or symmetric and if there are any outliers.
Bar Plot
What is a Bar Plot?
Bar charts are one of the most common graphs used to show data associated with categorical variables.
Let's see some ways to display a bar graph with matplotlib:
Creating a Vertical Bar Plot
Pyplot provides a bar() method to make bar graphs. It takes the following arguments: the categorical variables, their values, and a color (if you want to specify one).
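A minimal sketch of a vertical bar plot follows. The categories and values are made up for illustration, and the `Agg` backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up categorical data, for illustration only
fruits = ["Apples", "Bananas", "Cherries"]
quantities = [5, 3, 8]

plt.figure()
# Categories go on the x-axis, their values set the bar heights
bars = plt.bar(fruits, quantities, color="skyblue")

plt.title("Fruit quantities")
plt.xlabel("Fruit")
plt.ylabel("Quantity")

# In an interactive session call plt.show() to display the figure
```

The `color` argument is optional; leaving it out uses the default color cycle.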
Creating a Horizontal Bar Plot
Making a horizontal bar chart is also really simple using the plt.barh() method.
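The same kind of data can be drawn horizontally with `plt.barh()`, as in this short sketch. The data is again made up, and the `Agg` backend is an assumption for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up categorical data, for illustration only
fruits = ["Apples", "Bananas", "Cherries"]
quantities = [5, 3, 8]

plt.figure()
# With barh() the categories go on the y-axis and the values set the bar lengths
bars = plt.barh(fruits, quantities, color="salmon")

plt.title("Fruit quantities")
plt.xlabel("Quantity")
plt.ylabel("Fruit")

# In an interactive session call plt.show() to display the figure
```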
Bar Charts with multiple quantities
When comparing several quantities across the values of one variable, we might want a bar chart in which each quantity gets bars of its own color.
We can plot several sets of bars side by side by adjusting the bars' width and positions.
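One common way to do this, sketched below, is to shift each set of bars by half a bar width around shared x positions. The product names and numbers are made up, and the width of 0.4 is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: two quantities measured for the same three quarters
quarters = ["Q1", "Q2", "Q3"]
product_a = [10, 15, 12]
product_b = [8, 11, 14]

x = np.arange(len(quarters))  # one base position per quarter
width = 0.4                   # bar thickness (arbitrary choice)

plt.figure()
# Shift each series by half a bar width so the pairs sit side by side
bars_a = plt.bar(x - width / 2, product_a, width=width, label="Product A")
bars_b = plt.bar(x + width / 2, product_b, width=width, label="Product B")

plt.xticks(x, quarters)  # put the quarter names under each pair of bars
plt.ylabel("Units")
plt.legend()

# In an interactive session call plt.show() to display the figure
```

Because the bars are centered on their x positions, the right edge of each "Product A" bar touches the left edge of the matching "Product B" bar.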
Stacked Bar Charts
The stacked bar chart stacks bars that represent different groups on top of each other. The height of the resulting bar shows the combined result of the groups.
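A stacked chart can be sketched by passing the first group's values as the `bottom` of the second group. The group names and numbers below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up data: two groups measured for the same three quarters
quarters = ["Q1", "Q2", "Q3"]
group_a = [10, 15, 12]
group_b = [8, 11, 14]

plt.figure()
lower = plt.bar(quarters, group_a, label="Group A")
# bottom=group_a makes each Group B bar start where the Group A bar ends
upper = plt.bar(quarters, group_b, bottom=group_a, label="Group B")

plt.ylabel("Units")
plt.legend()

# In an interactive session call plt.show() to display the figure
```

The top of each stacked bar sits at the sum of the two groups, for example 10 + 8 = 18 for Q1.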
Applications of Bar Charts
Bar graphs are used to compare things between different groups or to track changes over time. Look at the bar chart below representing the most in-demand tech skills for data engineers.
Bar Chart vs Histogram
Histograms are a great way to show results of continuous data, such as:
- weight
- height
- how much time something takes, etc.
But when the data is in categories (such as Country or Favorite Movie), we should use a Bar Chart. Have a look at the next part describing the difference between histogram and bar chart:
Pie Chart
What is a Pie Chart?
- A pie chart (or a circle chart) is a circular statistical graphic, which is divided into slices (wedges) to illustrate numerical proportion.
- Imagine a pizza where different slices contain different toppings. The bigger the slice, the more of that topping is present.
Parameters of a pie chart
- x: The wedge sizes.
- labels: A sequence of strings providing the labels for each wedge.
- colors: A sequence of colors through which the pie chart will cycle. If None, the colors in the currently active cycle are used.
- autopct: A format string (or a function) used to label the wedges with their numeric value. The label is placed inside the wedge. If it is a format string, the label will be fmt % pct.
We can also pass in arguments to customize our pie chart to show a shadow, explode a part of it, tilt it at an angle, or do a lot more exciting things!
Don’t get overwhelmed by the terms being used; it will all become clear in the section below.
Creating a Pie Chart
A pie chart can be made using the pie() method.
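The sketch below puts the parameters listed earlier together, along with `explode`, `shadow`, and `startangle` for the customizations mentioned. The spending categories and amounts are made up, and the specific colors and angle are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up monthly spending, for illustration only
categories = ["Rent", "Food", "Transport", "Other"]
amounts = [40, 30, 20, 10]

plt.figure()
wedges, texts, autotexts = plt.pie(
    amounts,                      # x: the wedge sizes
    labels=categories,            # one label per wedge
    colors=["tab:blue", "tab:orange", "tab:green", "tab:red"],
    autopct="%1.1f%%",            # show each wedge's percentage inside it
    explode=(0.1, 0, 0, 0),       # pull the first wedge slightly out
    shadow=True,                  # draw a shadow beneath the pie
    startangle=90,                # rotate so the first wedge starts at the top
)
plt.axis("equal")  # keep the pie circular rather than elliptical

# In an interactive session call plt.show() to display the figure
```

When `autopct` is given, `pie()` returns the wedges, the label texts, and the percentage texts, which can be restyled afterwards if needed.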
Applications of Pie Charts
- A pie chart is best used when trying to work out the composition of something. If you have categorical data then using a pie chart would work really well as each slice can represent a different category. A good example of a pie chart can be seen below.
- Another good use for a pie chart would be to compare areas of growth within a business such as turnover, profit and exposure.
- Try creating a pie chart to see how much money you spend in different areas.