R NA - What Are <NA> Values in R? (2023)

Does your data contain NA, <NA>, or NaN values? It's not the end of the world, but alarm bells should start ringing!

In R (or RStudio), NA stands for Not Available. Each cell of your data that displays NA is a missing value.

Not available values are sometimes enclosed by < and >, i.e. <NA>. This happens when the vector or column that contains the NA is a factor.

In R, NA must be distinguished from NaN. NaN stands for Not a Number and represents an undefined or unrepresentable value. It occurs, for example, when you divide zero by zero.
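As a minimal sketch of this distinction in base R (the variable names are chosen for illustration only): dividing zero by zero yields NaN, dividing a nonzero number by zero yields Inf, and is.na treats both NA and NaN as missing while is.nan flags only NaN.

```r
# NA marks a missing value; NaN marks an undefined numeric result
zero_by_zero <- 0 / 0      # NaN: mathematically undefined
one_by_zero  <- 1 / 0      # Inf: infinite, but not NaN
missing_val  <- NA_real_   # a missing numeric value

values <- c(missing_val, zero_by_zero, one_by_zero)

# is.na() is TRUE for NA and NaN; is.nan() is TRUE only for NaN
na_flags  <- is.na(values)
nan_flags <- is.nan(values)

print(na_flags)
print(nan_flags)
```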

Consider the following example in R:

# Create some example variables
x1 <- c(7, 9, NA, 2, 5)
x2 <- as.factor(c(NA, 2, NA, 1, 1))
x3 <- c(4, NaN, 0, 9, 8)
x4 <- c(6, 1, 5, 5, 7)

# Create a data frame
data <- data.frame(x1, x2, x3, x4)

Table 1: Example data with NA, <NA>, and NaN in R

In column X1 of the example data, the third row is missing a value. The missing value is displayed as NA because the column is numeric.

Column X2 has two missing values, in the first and third rows. They are displayed as <NA> because the second column is a factor.

The third column, X3, is of class numeric (same as X1). The second entry of this column is not a number and is therefore displayed with the code NaN.

The fourth column, X4, is complete and therefore contains neither NA nor NaN.

Important functions for dealing with NAs

Below I will show you some of the most important methods and functions of the R programming language for handling missing data. I'll use the example data frame we created above.

na.omit

The na.omit function is used to exclude rows that contain one or more missing values from a data set.

na.omit(data)
#   x1 x2 x3 x4
# 4  2  1  9  5
# 5  5  1  8  7

na.omit can also be used to remove NA from a vector...

na.omit(data$x1)
# [1] 7 9 2 5

...or to a list.

# Create some data frames and matrices
data_1 <- data[ , 1:2]
data_2 <- data[1:3, 3:4]
data_3 <- matrix(ncol = 2, c(0, NA, -4, 3, 2, 1))

# Store data frames and matrices in a list
data_list <- list(data_1, data_2, data_3)

# Create an empty list
data_list_na.omit <- list()

# The for loop removes rows with NA from every list element
for(i in 1:length(data_list)) {
  data_list_na.omit[[i]] <- na.omit(data_list[[i]])
}

Note: With such a for loop, any function can be applied to the elements of a list (not just na.omit).
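The same idea can also be sketched with lapply, which applies a function to every list element and returns a new list. The list below is a hypothetical stand-in for illustration, not the data used above.

```r
# Hypothetical example list: one data frame and one matrix, both containing NA
example_list <- list(
  data.frame(a = c(1, NA, 3), b = c("x", "y", NA)),
  matrix(c(0, NA, -4, 3, 2, 1), ncol = 2)
)

# lapply() calls na.omit() on every list element and returns a new list
example_list_complete <- lapply(example_list, na.omit)

# Number of rows that remain in each element
rows_left <- sapply(example_list_complete, nrow)
print(rows_left)
```

Compared with the for loop, no empty result list has to be created first.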

na.rm

na.rm is used to remove NAs from your data within a function call by setting na.rm = TRUE. For example, na.rm can be used in combination with mean...

mean(data$x1, na.rm = TRUE)
# [1] 5.75

...and max.

max(data$x1, na.rm = TRUE)
# [1] 9
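To see why na.rm matters, here is a small sketch with a standalone vector (the same values as x1, redefined so the snippet runs on its own). Without na.rm = TRUE, the missing value propagates into the result.

```r
x <- c(7, 9, NA, 2, 5)

mean_default <- mean(x)                # NA: the missing value propagates
mean_na_rm   <- mean(x, na.rm = TRUE)  # NA is dropped before averaging
sum_na_rm    <- sum(x, na.rm = TRUE)   # the same option works for sum()

print(mean_default)
print(mean_na_rm)
print(sum_na_rm)
```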

use

Often confusing: the cor function uses the use option instead of na.rm.

cor(data$x1, data$x3, use = "complete.obs")
# [1] -0.9011271
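The use argument accepts several settings. The sketch below compares the default with "complete.obs" and "pairwise.complete.obs" on a hypothetical data frame d, constructed only for this illustration.

```r
# Hypothetical data with NAs in different rows
d <- data.frame(a = c(1, 2, 3, 4, NA),
                b = c(2, 4, 6, NA, 10),
                c = c(5, 3, NA, 1, 0))

# Default (use = "everything"): any NA makes the result NA
cor_default <- cor(d$a, d$b)

# Only rows that are complete in ALL columns are used
cor_complete <- cor(d, use = "complete.obs")

# For each pair of columns, all rows complete in THAT pair are used
cor_pairwise <- cor(d, use = "pairwise.complete.obs")

print(cor_default)
print(cor_pairwise)
```

With "pairwise.complete.obs", each correlation may be based on a different subset of rows, which is worth keeping in mind when comparing coefficients.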

complete.cases

The complete.cases function creates a logical vector in which TRUE indicates a complete row of the data matrix.

complete.cases(data)
# [1] FALSE FALSE FALSE  TRUE  TRUE

You can also use this function for casewise deletion (the same result as na.omit).

data[complete.cases(data), ]
#   x1 x2 x3 x4
# 4  2  1  9  5
# 5  5  1  8  7
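complete.cases can also be applied to a subset of columns, so that rows are dropped only when the variables of interest are missing. A minimal sketch, using a hypothetical data frame d that mirrors three columns of the example data:

```r
d <- data.frame(x1 = c(7, 9, NA, 2, 5),
                x2 = c(NA, 2, NA, 1, 1),
                x4 = c(6, 1, 5, 5, 7))

# Check completeness only for x1 and x4; NAs in x2 are ignored
keep <- complete.cases(d[, c("x1", "x4")])
d_subset <- d[keep, ]

print(keep)
print(d_subset)
```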

is.na

is.na is also used to identify missing values via TRUE and FALSE (TRUE indicates NA). Unlike complete.cases, is.na preserves the dimensions of the data matrix.

is.na(data)
#         x1    x2    x3    x4
# [1,] FALSE  TRUE FALSE FALSE
# [2,] FALSE FALSE  TRUE FALSE
# [3,]  TRUE  TRUE FALSE FALSE
# [4,] FALSE FALSE FALSE FALSE
# [5,] FALSE FALSE FALSE FALSE

!is.na

!is.na (preceded by !) is the opposite of is.na.

!is.na(data)
#         x1    x2    x3    x4
# [1,]  TRUE FALSE  TRUE  TRUE
# [2,]  TRUE  TRUE FALSE  TRUE
# [3,] FALSE FALSE  TRUE  TRUE
# [4,]  TRUE  TRUE  TRUE  TRUE
# [5,]  TRUE  TRUE  TRUE  TRUE

which

In combination with the which function, you can use logical vectors to find the positions of missing values.

which(is.na(data$x1))
# [1] 3
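For a whole data frame, which can return row and column positions when arr.ind = TRUE is set. The following is a sketch on a hypothetical two-column data frame d:

```r
d <- data.frame(x1 = c(7, 9, NA, 2, 5),
                x3 = c(4, NaN, 0, 9, 8))

# arr.ind = TRUE returns one (row, col) pair per missing cell
na_positions <- which(is.na(d), arr.ind = TRUE)
print(na_positions)
```

Here the missing cells are row 3 of the first column and row 2 of the second column (NaN also counts as missing for is.na).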

sum

Another advantage of logical vectors is that they make it possible to count the missing values. The sum function can be used with is.na to count NA values in R.

sum(is.na(data$x1))
# [1] 1
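To count missing values for every column at once, colSums can be combined with is.na. A minimal sketch, with the example data redefined as d so that the snippet is self-contained:

```r
d <- data.frame(x1 = c(7, 9, NA, 2, 5),
                x2 = as.factor(c(NA, 2, NA, 1, 1)),
                x3 = c(4, NaN, 0, 9, 8),
                x4 = c(6, 1, 5, 5, 7))

na_per_column <- colSums(is.na(d))    # number of missing values per column
na_share      <- colMeans(is.na(d))   # proportion of missing values per column

print(na_per_column)
print(na_share)
```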

summary

The summary function provides another way to count NA values in a data frame, column, matrix, or vector.

summary(data)

Table 2: The summary function in R counts the NAs in each column

In Table 2, the number of NAs is shown in the bottom cell of each column that contains missing values.

Combine complete data with rbind and na.omit

The functions rbind and na.omit can be combined so that only complete rows are kept when two data sets are stacked.

# Create 2 data sets; data_merge_2 contains an NA
data_merge_1 <- data.frame(x1 = c(5, 9, 8), x2 = c(1, 2, 3))
data_merge_2 <- data.frame(x1 = c(2, NA, 8), x2 = c(6, 9, 3))

# Merge the data sets and keep only complete rows
data_merge <- na.omit(rbind(data_merge_1, data_merge_2))
data_merge   # Print merged data

Remove NA, NaN, and Inf in R

It is also possible to exclude all rows with NA, NaN, and/or Inf values.

# Create data with NA, NaN, and Inf
data_inf <- data
data_inf[5, 4] <- Inf

# Remove NA, NaN, and Inf
# (apply() converts the data frame to a character matrix here,
# so the row maximum is compared with the string "Inf")
data_no_inf <- data_inf[complete.cases(data_inf) & apply(data_inf, 1, max) != "Inf", ]
data_no_inf   # Print complete subset
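As an alternative sketch, is.finite can flag NA, NaN, Inf, and -Inf in the numeric columns directly, without relying on string comparison. The object names d, numeric_cols, all_finite, and d_clean are introduced here for illustration and assume a data frame with at least two numeric columns.

```r
d <- data.frame(x1 = c(7, 9, NA, 2, 5),
                x2 = as.factor(c(NA, 2, NA, 1, 1)),
                x3 = c(4, NaN, 0, 9, 8),
                x4 = c(6, 1, 5, 5, Inf))

# Identify the numeric columns
numeric_cols <- vapply(d, is.numeric, logical(1))

# is.finite() is FALSE for NA, NaN, Inf, and -Inf;
# a row passes if none of its numeric cells is non-finite
all_finite <- rowSums(!sapply(d[numeric_cols], is.finite)) == 0

# complete.cases() additionally covers NAs in non-numeric columns
d_clean <- d[complete.cases(d) & all_finite, ]
print(d_clean)
```

Only the fourth row remains, because it is the only row without NA, NaN, or Inf.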

Recode values as NA

Sometimes existing values need to be recoded as NA. If you want to replace a certain value with NA, you can do it as follows.

data_NA <- data                # Copy the data
data_NA[data_NA == 1] <- NA    # Recode the value 1 as NA

If you want to recode specific cells of the data matrix as NA, you can do so as follows.

data_NA2 <- data         # Copy the data
data_NA2[1, 3] <- NA     # Recode row 1, column 3 as NA
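When several codes stand for missing data, %in% can recode them in one step. The sketch below uses a hypothetical survey variable income in which -99 and 999 are assumed to be missing-value codes.

```r
# Hypothetical survey variable; -99 and 999 are assumed missing-value codes
income <- c(2500, -99, 3100, 999, 1800)

income_clean <- income
income_clean[income_clean %in% c(-99, 999)] <- NA   # recode both codes as NA

print(income_clean)
```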

Replace NA

Logical vectors can also be used to replace NA with other values such as 0.

vect_example <- data$x1
vect_example[is.na(vect_example)] <- 0
vect_example
# [1] 7 9 0 2 5

Impute missing values

Missing data imputation replaces missing values with new values. Imputation has many advantages over deleting rows or columns with missing values.

The example below uses predictive mean matching. However, many other imputation methods are available, e.g. regression imputation or mean imputation.

install.packages("mice")   # Install the mice package in R
library("mice")            # Load the mice package

imp <- mice(data,          # Impute the data
            m = 1, seed = 123)
data_imp <- complete(imp)  # Store the imputed data set
data_imp                   # Print the imputed data
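For comparison, mean imputation can be sketched in base R without any package. This is a deliberately simple technique and not the predictive mean matching used above; it tends to understate the variability of the imputed variable. The data frame d below is a hypothetical two-column copy of the example data.

```r
d <- data.frame(x1 = c(7, 9, NA, 2, 5),
                x3 = c(4, NaN, 0, 9, 8))

d_mean_imp <- d
for (col in names(d_mean_imp)) {
  missing <- is.na(d_mean_imp[[col]])   # TRUE for NA and NaN
  # Replace missing cells with the mean of the observed values
  d_mean_imp[[col]][missing] <- mean(d_mean_imp[[col]], na.rm = TRUE)
}

print(d_mean_imp)
```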

Video example - how to deal with NA values

Need more help with NA values in R? Then be sure to check out the following video from my YouTube channel dedicated to statistical programming.

In this video I explain how to deal with incomplete data. I show easy-to-understand examples and explain how to apply functions such as is.na, na.omit, and na.rm.

I want to hear from you

I've shown you my favorite ways to handle NA values in R.

Now I'd like to hear about your experience.

Which of the methods above is your favorite? Do you use any other methods that I left out?

Let me know in the comments!

Appendix

The header graphic of this page shows a correlation plot of two variables. Missing cases are marked with NA.

The graphic can be created in R with the code below.

N <- 50000                            # Sample size

x <- rnorm(N)                         # X variable
y <- rnorm(N)                         # Y variable

par(bg = "#353436")                   # Set the background color
par(mar = c(0, 0, 0, 0))              # Remove the space around the plot

plot(x, y,                            # Plot observations
     col = "#1b98e0")

points(x[1:15], y[1:15],              # Plot missing values
       pch = 16, cex = 5, col = "#353436")

text(x[1:15], y[1:15],                # Write NA into each missing value
     "NA", col = "red")

6 Comments

  • Narasinghana

    May 1, 2019 at 8:46 am

    Hi Joachim,
    My data has a separate column called "Treatment" that contains 1) empty cells, 2) drug, 3) diet, 4) unknown, and 5) nothing.
    I would like to create a parallel column called "treatment_n" in which drug is replaced by 1 and everything else by 0.
    Can you help me?
    Thank you
    Nara

    • Joachim

      May 1, 2019 at 2:27 pm

      Hi Nara,

      That's a great question. I made an example to simulate your problem. You can copy/paste the following code into RStudio and run it yourself:

      # Example data
      data <- data.frame(treatment = c("drug", NA, "drug", "diet", "diet", "unknown", ""))

      # Create a new column treatment_n
      data$treatment_n <- 0

      # Replace drug with 1
      data$treatment_n[data$treatment == "drug"] <- 1

      # Print the final data frame
      data
      #   treatment treatment_n
      # 1      drug           1
      # 2      <NA>           0
      # 3      drug           1
      # 4      diet           0
      # 5      diet           0
      # 6   unknown           0
      # 7                     0

      I hope it helps!

      Regards,

      Joachim

  • Adecola Ovoyemi

    October 24, 2020 at 6:27 pm

    Hi Joachim,

    In my data, "NA" is an actual value (the ISO two-letter country code for Namibia). How can I prevent R from treating it as <NA>?

    • Joachim

      October 25, 2020 at 6:52 am

      Hi Adecola,

      You can specify "NA" as a character string or factor level. R distinguishes between "NA" and NA.

      For example:

      people <- c("NA", NA, "z", "National Standard", NA)

      The first element is treated as the country code, while the second and last elements are treated as missing data.

      Say hello to Namibia from Germany!

      Joachim

  • Omar, MA

    February 15, 2022 at 5:35 am

    It's fantastic, productive and cost-effective.

    How can I use maximum likelihood or expectation maximization techniques to deal with missing data?

    • Joachim

      February 15, 2022 at 8:32 am

      Hi Omar,

      Thank you for your kind words!

      I have never done this myself, but the mlmi package seems to provide functions for maximum likelihood multiple imputation in R.

      Regards,
      Joachim

I am Joachim Schock. On this site, I offer statistical tutorials and programming codes in Python and R.

Statistics Globe

Related guides

Remove empty data frame rows in R (2 examples)

na_if R function of the dplyr package (2 examples) | Convert values to NA

Article information

Author: Carlyn Walter

Last Updated: 27/06/2023
