Unable to read CSV in R from Hadoop Cluster


Sankar Mittapally
Hello,

I am new to big data environments. We have set up a Hadoop cluster with Zeppelin, and I am now trying to read a CSV file but get the error below. The file has full permissions.

Error in file(file, "rt"): cannot open the connection


This is the procedure I am following.

%dep
z.reset()
// Add spark-csv package
z.load("com.databricks:spark-csv_2.10:1.2.0")

%spark2.r
data <- read.csv(file='test/test.csv',header=TRUE,sep='\t')

Please let me know how to fix this. Thanks
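
Note on the error: base R's read.csv opens its path through file() on the local filesystem of the machine running the R interpreter, relative to that process's working directory. It does not read HDFS paths. A minimal diagnostic sketch, assuming the file may live in HDFS rather than on the Zeppelin host; the helper name can_read_locally is hypothetical and not part of the original post:

```r
# Hypothetical helper: TRUE only if `path` exists on the LOCAL filesystem
# and the current user has read permission (file.access returns 0 on success).
can_read_locally <- function(path) {
  file.exists(path) && file.access(path, mode = 4) == 0
}

# Relative path from the post; it is resolved against getwd().
path <- "test/test.csv"
cat("Working directory:", getwd(), "\n")

if (can_read_locally(path)) {
  # Same call as in the post: tab-separated file with a header row.
  data <- read.csv(file = path, header = TRUE, sep = "\t")
  str(data)
} else {
  message("Not readable from the local filesystem: ", path,
          " (if the file is stored in HDFS, base R read.csv cannot open it)")
}
```

If the check fails, the file permissions in HDFS are not the issue; the path simply does not exist where the R process is looking.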



Re: Unable to read CSV in R from Hadoop Cluster

Georg Heiler
Maybe you should try sparklyr with Spark 2, which already includes the CSV package: http://spark.rstudio.com
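
A rough sketch of what this suggestion could look like, assuming sparklyr is installed and a Spark 2 cluster is reachable. The wrapper name read_hdfs_tsv, the table name, the master value, and the HDFS path in the usage comments are all hypothetical placeholders, not details from this thread:

```r
# Hypothetical wrapper around sparklyr::spark_read_csv for a tab-separated
# file with a header row. `sc` is a connection returned by spark_connect().
read_hdfs_tsv <- function(sc, path, name = "test_data") {
  sparklyr::spark_read_csv(
    sc,
    name      = name,   # name under which the table is registered in Spark
    path      = path,   # resolved by Spark, so an hdfs:// URI is valid here
    header    = TRUE,
    delimiter = "\t"    # matches sep = '\t' in the original read.csv call
  )
}

# Usage (needs a running cluster; all values below are assumptions):
# sc   <- sparklyr::spark_connect(master = "yarn")
# data <- read_hdfs_tsv(sc, "hdfs:///user/<your-user>/test/test.csv")
# head(data)
# sparklyr::spark_disconnect(sc)
```

The result is a reference to a Spark table rather than a local data.frame, so the data stays on the cluster until it is explicitly collected.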

(In reply to Sankar Mittapally <[hidden email]>, Mon, 5 June 2017, 10:10)


Re: Unable to read CSV in R from Hadoop Cluster

Sankar Mittapally
No luck, Georg.


On Mon, Jun 5, 2017 at 4:09 PM, Georg Heiler <[hidden email]> wrote:
Maybe you should try sparklyr with Spark 2, which already includes the CSV package: http://spark.rstudio.com
