Error in file(file, "rt") : invalid 'description' argument in complete.cases program

I am writing an R function that reads a directory full of files and reports the number of completely observed cases in each data file. The function returns a data frame where the first column is the name of the file and the second column is the number of complete cases.

For example:

id nobs
1  108
2  345
...
etc

Here is the function I wrote:

complete <- function(directory, id = 1:332) {

  for(i in 1:332) {
    path<-paste(directory,"/",id,".csv",sep="")
    mydata<-read.csv(path)
    #nobs<-nrow(na.omit(mydata))
    nobs<-sum(complete.cases(mydata))
    i<-i+1
  }

  completedata<-c(id,nobs)
}

I execute the function:

complete("specdata",id=1:332)

but I’m getting this error:

Error in file(file, "rt") : invalid 'description' argument

I also tried the traceback() function to debug my code and it gives this output:

traceback()
# 4: file(file, "rt") at #6
# 3: read.table(file = file, header = header, sep = sep, quote = quote, 
#    dec = dec, fill = fill, comment.char = comment.char, ...) at #6
# 2: read.csv(path) at #6
# 1: complete("specdata", id = 1:332)

7 Answers

It’s hard to tell without a completely reproducible example, but I suspect your problem is this line:

path<-paste(directory,"/",id,".csv",sep="")

id here is a vector, so path becomes a vector of character strings, and when you call read.csv you’re passing it all the paths at once instead of just one. Try changing the above line to

path<-paste(directory,"/",id[i],".csv",sep="")

and see if that works.
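
To show how that fix fits into the whole function, here is a minimal sketch. It assumes the files are named 1.csv, 2.csv, and so on inside directory. It also loops over seq_along(id), stores one count per file, and returns a data frame, which goes beyond the one-line change suggested above.

```r
# Sketch only: assumes files are named "<id>.csv" inside `directory`.
complete <- function(directory, id = 1:332) {
  nobs <- integer(length(id))            # one slot per requested id
  for (i in seq_along(id)) {
    # id[i] is a single value, so path is a single string
    path <- paste(directory, "/", id[i], ".csv", sep = "")
    mydata <- read.csv(path)
    nobs[i] <- sum(complete.cases(mydata))
  }
  data.frame(id = id, nobs = nobs)       # returned as the function's value
}
```

Calling complete("specdata", id = 1:3) would then read only the first three files and return three rows.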

It seems you have a problem with your file path. You are passing the full vector id = c(1:332) to the file path name. If your files are named 1.csv, 2.csv, 3.csv, etc., you can change this line:

path<-paste(directory,"/",id,".csv",sep="")

to

path<-paste(directory,"/",i,".csv",sep="")

and leave out or rework the id input of your function.

Instead of using a for loop to read the data in, you can try sapply. For example:

mydata <- sapply(path, read.csv)

Since path is a vector, sapply will iterate over it and apply read.csv to each element, so there is no need for the for loop and your code will be much cleaner.

If every file has the same columns, the result is a matrix of lists with one column per file and one row per variable, from which you can extract the observations.

For example, mydata[2,1][[1]] returns the second variable of the first file. Remember that the rows are your variables and the columns are your files.
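
If only the counts are needed, the same idea can be sketched with vapply, which applies a function to every path and checks that each result is a single integer. The helper name count_complete is made up for this illustration, and paths is assumed to be a character vector of existing CSV files.

```r
# Hypothetical helper: number of complete rows in each CSV file.
count_complete <- function(paths) {
  vapply(
    paths,
    function(p) sum(complete.cases(read.csv(p))),  # one integer per file
    integer(1),
    USE.NAMES = FALSE
  )
}
```

The result lines up with paths, so data.frame(id = id, nobs = count_complete(path)) would give the table described in the question.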

I am working on the exact same problem. The files in the directory "specdata" are named 001.csv, 002.csv, ... 099.csv, all the way up to 332.csv. However, when you pass id=1, the file name becomes 1.csv, which does not exist in the directory. Try using this function to get the path of each id file (it looks in the current working directory, so that must be the directory holding the data files):

filepaths <- function (id){
    allfiles = list.files(getwd())
    file.path(getwd(), allfiles[id])
}
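
Another way to get zero-padded names, if the files really are named 001.csv through 332.csv, is to build them with sprintf. The helper name padded_paths below is made up for this sketch.

```r
# Hypothetical helper: build "<directory>/001.csv"-style paths from numeric ids.
padded_paths <- function(directory, id) {
  # "%03d" pads each id with leading zeros to a width of three digits
  file.path(directory, sprintf("%03d.csv", id))
}

padded_paths("specdata", c(1, 20, 332))
# "specdata/001.csv" "specdata/020.csv" "specdata/332.csv"
```

Unlike indexing into list.files(), this does not depend on the working directory or on what other files happen to be present.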

I had this problem because I was trying to run a for loop against the data frame and not a vector:

  ids <- th[th$nobs > threshold,]
  for(i in ids) {

This is what the variable ids looks like:

     id nobs
2     2 1041
154 154 1095
248 248 1005

It should have been:

  ids <- th[th$nobs > threshold,]
  for(i in ids$id) {
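
The difference is easy to see on a small example. A data frame is a list of columns, so looping over it yields one whole column per iteration rather than one id. The values below are copied from the output above; the variable names seen_df and seen_ids are only for illustration.

```r
# Toy data frame with the same shape as `ids` above.
ids <- data.frame(id = c(2, 154, 248), nobs = c(1041, 1095, 1005))

# Looping over the data frame: each iteration receives an entire column.
seen_df <- list()
for (i in ids) seen_df[[length(seen_df) + 1]] <- i

# Looping over the id column: each iteration receives a single id.
seen_ids <- c()
for (i in ids$id) seen_ids <- c(seen_ids, i)
```

Here seen_df ends up holding two vectors of length three, while seen_ids holds the three ids. Passing a whole column into paste() produces several paths at once, which is what read.csv rejects.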

I ran into the same problem with this statement:

Browse[2]> read.csv(list.files(".", "XCMS-annotated-diffreport--.*csv$"), row.names = 1)
Error in file(file, "rt") : invalid 'description' argument

Then I found that two different csv files in the same directory matched the pattern:

Browse[2]> list.files(".", "XCMS-annotated-diffreport--.*csv$")
[1] "XCMS-annotated-diffreport--1-vs-2-Y.csv" "XCMS-annotated-diffreport--1-vs-2.csv"  

When I deleted one of the files, it worked again.
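
A small guard can make this failure easier to diagnose than the error from file(). The sketch below checks how many files matched before calling read.csv. The helper name read_single_csv is hypothetical.

```r
# Hypothetical helper: read a CSV only when the pattern matches exactly one file.
read_single_csv <- function(dir, pattern, ...) {
  matches <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(matches) != 1) {
    stop("expected exactly one matching file, found ", length(matches),
         ": ", paste(basename(matches), collapse = ", "))
  }
  read.csv(matches, ...)   # extra arguments such as row.names are passed through
}
```

With two matching files this stops with a message naming both, instead of handing a length-two vector to read.csv.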

Change the object id to i, because you are in a for loop whose iteration variable is i. That is, change path<-paste(directory,"/",id,".csv",sep="") to path<-paste(directory,"/",i,".csv",sep="").
