Histograms in R ⚡ [create, customize, bins, add curves...] (2023)

A histogram is the chart most commonly used to represent continuous data. It is a bar plot that shows the frequency of measurements grouped into intervals, obtained by counting the number of observations that fall in each interval. In addition, the height of each bar is determined by the ratio of the frequency to the interval width. In this tutorial, we'll cover how to create a histogram using the R programming language.

  • How to make a histogram in R? The hist function
  • Change the color of the histogram
  • Breaks in the R histogram
  • Histogram with two variables in R
  • Add a normal curve to the histogram
  • Add density lines to the histogram
  • Combination: Histograms and Boxplots in R
  • Histogram in R with ggplot2
  • Plotly histogram

How to make a histogram in R? The hist function

If you're reading this, you're probably wondering how to draw a histogram in R. To explain the steps of building a histogram in R, we'll use the following data, which represents the distance in yards traveled by a golf ball after it is hit.

distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9, 259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5, 260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

You can plot a histogram in R using the hist function. By default, the function creates a frequency histogram.

hist(distance, main = "frequency histogram") # frequency

However, if you set the prob argument to TRUE, you will get a density histogram.

hist(distance, prob = TRUE, main = "density histogram") # density
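
The link between the frequency and the density version can be checked numerically. The following is a minimal sketch that repeats the golf data so it runs on its own; the names h, widths and manual_density are illustrative only. Calling hist with plot = FALSE returns the computed bins without drawing anything.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

# Compute the histogram without plotting it
h <- hist(distance, plot = FALSE)

n <- length(distance)
widths <- diff(h$breaks)  # width of each bin

# Density of a bin = count / (n * bin width)
manual_density <- h$counts / (n * widths)
print(all.equal(h$density, manual_density))

# The bar areas of a density histogram add up to 1
print(sum(h$density * widths))
```

In other words, the frequency histogram shows h$counts, while the density histogram shows h$density, rescaled so that the total area of the bars is one.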

You can also add a grid to the histogram using the grid function, as follows:

hist(distance, prob = TRUE)
grid(nx = NA, ny = NULL, lty = 2, col = "gray", lwd = 1)
hist(distance, prob = TRUE, add = TRUE, col = "white")

Note that you need to plot the histogram twice in order for the grid to appear below the main graph.

Change the color of the histogram

Now that you know how to make a histogram in R, you can also customize it. If you want to change the color of the bins, set the col argument to the color you prefer. As with other plots, you can customize many features of the chart, such as the title, the axes, the font size...

hist(distance, col = "lightblue")
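
As an illustration of the other customizations mentioned above, here is a small sketch, assuming the same golf data. The title, axis labels and size values are arbitrary choices made for this example, not settings taken from the tutorial.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

h <- hist(distance,
          col = "lightblue",          # fill color of the bars
          border = "white",           # color of the bar borders
          main = "Golf ball distance",
          xlab = "Distance (yards)",
          ylab = "Frequency",
          cex.main = 1.2,             # title size
          cex.lab = 0.9,              # axis label size
          cex.axis = 0.8,             # tick label size
          las = 1)                    # horizontal tick labels
```

The hist function also returns the computed histogram invisibly, so assigning the result to h keeps the breaks and counts available for later use.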

Breaks in the R histogram

Histograms are useful for representing the underlying distribution of the data if the number of bins is selected properly. However, choosing the number of bins (or the bin width) can be tricky:

  1. With few bins, too many observations are grouped together.
  2. With many bins, there will be few observations in each bin, which increases the variability of the resulting graph.

There are several rules for determining the number of bins. In R, the Sturges method is used by default. If you want to change the number of bins, you can set the breaks argument to the desired number.

par(mfrow = c(1, 3))
hist(distance, breaks = 2, main = "Few bins")
hist(distance, breaks = 50, main = "Too many bins")
hist(distance, main = "Sturges method")
par(mfrow = c(1, 1))
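
To make the rules more concrete, the sketch below compares three of the rules that ship with base R: nclass.Sturges, nclass.scott and nclass.FD (Freedman-Diaconis). It assumes the same golf data, and the variable names are illustrative. Note that hist treats a number of breaks as a suggestion only, because the break points are rounded to "pretty" values.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

n <- length(distance)

# Sturges rule: ceiling(log2(n) + 1)
k_sturges <- nclass.Sturges(distance)
k_manual <- ceiling(log2(n) + 1)

# Two alternative rules available in base R
k_scott <- nclass.scott(distance)
k_fd <- nclass.FD(distance)
print(c(Sturges = k_sturges, Scott = k_scott, FD = k_fd))

# The rules can also be selected by name through the breaks argument
par(mfrow = c(1, 3))
hist(distance, breaks = "Sturges", main = "Sturges")
hist(distance, breaks = "Scott", main = "Scott")
hist(distance, breaks = "FD", main = "Freedman-Diaconis")
par(mfrow = c(1, 1))
```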

You can also choose the width of the histogram bins using the plug-in method (Wand, 1995) implemented in the KernSmooth library, as follows:

# Plug-in method
# install.packages("KernSmooth")
library(KernSmooth)
bin_width <- dpih(distance)
nbins <- seq(min(distance) - bin_width, max(distance) + bin_width, by = bin_width)
hist(distance, breaks = nbins, main = "Plug-in")

Histogram with two variables in R

Setting the add argument to TRUE allows you to overlay a histogram on another plot. For example, you can create an R histogram by group with the following code block:

set.seed(1)
x <- rnorm(1000)    # first group
y <- rnorm(1000, 1) # second group
hist(x, main = "two variables")
hist(y, add = TRUE, col = rgb(1, 0, 0, 0.5))

The rgb function sets the color from its red, green and blue channels, and the alpha argument sets the transparency. When overlaying plots, it is advisable to use transparent colors so that you can see the plot behind.
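
A variation on the idea above is sketched here, under the assumption that both groups should be drawn with the same bins so that their bars line up. The names common_breaks, col_x and col_y are illustrative, and the choice of about 30 bins is arbitrary.

```r
set.seed(1)
x <- rnorm(1000)    # first group
y <- rnorm(1000, 1) # second group

# Semi-transparent colors: the fourth value (alpha) controls the opacity
col_x <- rgb(0, 0, 1, alpha = 0.5)
col_y <- rgb(1, 0, 0, alpha = 0.5)

# Shared break points that cover the range of both groups
common_breaks <- pretty(range(c(x, y)), n = 30)

hist(x, breaks = common_breaks, col = col_x, main = "Two variables")
hist(y, breaks = common_breaks, col = col_y, add = TRUE)
legend("topright", legend = c("x", "y"), fill = c(col_x, col_y), bty = "n")
```

Without shared breaks, each call to hist picks its own bins, which can make overlapping bars hard to compare.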

Add a normal curve to the histogram

To draw a normal curve over the histogram you can use the dnorm and lines functions, as follows:

hist(distance, prob = TRUE, main = "normal curve histogram")
x <- seq(min(distance), max(distance), length = 40)
f <- dnorm(x, mean = mean(distance), sd = sd(distance))
lines(x, f, col = "red", lwd = 2)
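
The code above relies on the density scale. If you prefer to keep a frequency histogram, one possible approach is to rescale the normal curve by the number of observations times the bin width. The sketch below assumes equal-width bins and the same golf data; the names bin_width and scale_factor are illustrative.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

# Frequency histogram; keep the returned object to read the breaks
h <- hist(distance, main = "Frequency histogram with normal curve")

x <- seq(min(distance), max(distance), length.out = 40)

# Expected count in a bin is roughly n * bin width * density
bin_width <- diff(h$breaks)[1]
scale_factor <- length(distance) * bin_width

f <- dnorm(x, mean = mean(distance), sd = sd(distance)) * scale_factor
lines(x, f, col = "red", lwd = 2)
```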

Add density lines to the histogram

To add a density curve over the histogram you can use the lines function to draw the curve and the density function to compute the underlying nonparametric (kernel) density estimate.

hist(distance, freq = FALSE, main = "density curve")
lines(density(distance), lwd = 2, col = "red")

Bandwidth selection for fitting nonparametric densities is an area of intense research. Also note that, by default, the density function uses a Gaussian kernel. For more information, call ?density.
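
To see how these choices affect the curve, the sketch below compares the default settings with two alternatives offered by the density function: the Sheather-Jones bandwidth (bw = "SJ") and the Epanechnikov kernel. It assumes the same golf data; the object names and colors are illustrative.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

d_default <- density(distance)                          # Gaussian kernel, bw = "nrd0"
d_sj <- density(distance, bw = "SJ")                    # Sheather-Jones bandwidth
d_epa <- density(distance, kernel = "epanechnikov")     # different kernel, default bandwidth

hist(distance, freq = FALSE, main = "Density curves")
lines(d_default, lwd = 2, col = "red")
lines(d_sj, lwd = 2, col = "blue", lty = 2)
lines(d_epa, lwd = 2, col = "darkgreen", lty = 3)
legend("topleft", c("default", "bw = SJ", "epanechnikov"), bty = "n",
       lty = 1:3, lwd = 2, col = c("red", "blue", "darkgreen"))

# Bandwidths actually used
print(c(default = d_default$bw, SJ = d_sj$bw))
```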

We can wrap the previous code in a function that automatically generates histograms with normal and density curves:

histDenNorm <- function(x, main = "") {
  hist(x, prob = TRUE, main = main)         # histogram
  lines(density(x), col = "blue", lwd = 2)  # density
  x2 <- seq(min(x), max(x), length = 40)
  f <- dnorm(x2, mean(x), sd(x))
  lines(x2, f, col = "red", lwd = 2)        # normal
  legend("topright", c("histogram", "density", "normal"), box.lty = 0,
         lty = 1, col = c("black", "blue", "red"), lwd = c(1, 2, 2))
}

Now you can test the behavior of the function on sample data.

set.seed(1)
# Normal data
x <- rnorm(n = 5000, mean = 110, sd = 5)
# Exponential data
y <- rexp(n = 3000, rate = 1)
par(mfcol = c(1, 2))
histDenNorm(x, main = "Histogram X")
histDenNorm(y, main = "Histogram Y")
par(mfcol = c(1, 1))

Combination: Histograms and Boxplots in R

You can add a boxplot on top of the histogram by calling par(new = TRUE) between the plots.

hist(distance, probability = TRUE, ylab = "", main = "", col = rgb(1, 0, 0, alpha = 0.5), axes = FALSE)
axis(1) # add horizontal axis
par(new = TRUE)
boxplot(distance, horizontal = TRUE, axes = FALSE, lwd = 2, col = rgb(0, 0, 0, alpha = 0.2))

You can also add a normal or density curve to the previous plot.

Histogram in R with ggplot2

To create a histogram with the ggplot2 package you need to use the ggplot + geom_histogram functions and pass the data as a data.frame. Inside aes, specify the name of the data frame variable.

# install.packages("ggplot2")
library(ggplot2)
ggplot(data.frame(distance), aes(x = distance)) +
  geom_histogram(color = "gray", fill = "white")

The plot returns a message stating that the histogram was calculated using 30 bins. This is because, by default, ggplot does not use the Sturges method.

Now we want to calculate the number of bins using the Sturges method, as the hist function does, and set it with the breaks argument. Note that you can also set the binwidth argument if you prefer.

# Compute the breaks as the hist() function does
nbreaks <- pretty(range(distance), n = nclass.Sturges(distance), min.n = 1)
ggplot(data.frame(distance), aes(x = distance)) +
  geom_histogram(breaks = nbreaks, color = "gray", fill = "white")

As you can see, this is equal to the first histogram.
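
This equivalence can also be checked in base R without drawing anything. The following is a minimal sketch, assuming the same golf data; it compares the break points obtained from pretty and nclass.Sturges with the ones that hist computes by default. The names nbreaks and h are illustrative.

```r
# Golf ball distances in yards (same data as in the tutorial)
distance <- c(241.1, 284.4, 220.2, 272.4, 271.1, 268.3, 291.6, 241.6, 286.1, 285.9,
              259.6, 299.6, 253.1, 239.6, 277.8, 263.8, 267.2, 272.6, 283.4, 234.5,
              260.4, 264.2, 295.1, 276.4, 263.1, 251.4, 264.0, 269.2, 281.0, 283.2)

# Breaks computed by hand, following the Sturges rule
nbreaks <- pretty(range(distance), n = nclass.Sturges(distance), min.n = 1)

# Breaks chosen by hist() with its default settings
h <- hist(distance, plot = FALSE)

print(all.equal(h$breaks, nbreaks))
```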

With ggplot2 you can also add a density curve with the geom_density function. Also, if you want to fill the area under the curve, set the fill argument to your preferred color and alpha to the color transparency. Note that you have to set a new aes inside geom_histogram, as follows:

# Recent ggplot2 versions prefer aes(y = after_stat(density)) over ..density..
ggplot(data.frame(distance), aes(x = distance)) +
  geom_histogram(aes(y = ..density..), breaks = nbreaks, color = "gray", fill = "white") +
  geom_density(fill = "black", alpha = 0.2)

Plotly histogram

Another way to create a histogram is to use the plotly package (an R adaptation of the JavaScript plotly library), which creates plots in an interactive format. For example, you can run the following commands:

# install.packages("plotly")
library(plotly)
# Frequency histogram
fig <- plot_ly(x = distance, type = "histogram")
fig
# Histogram normalized to probabilities
fig <- plot_ly(x = distance, type = "histogram", histnorm = "probability")
fig

Article information

Author: Rev. Leonie Wyman

Last Updated: 27/05/2023
