Monthly Archives: July 2015

How to import org.apache.spark.sql.SQLContext.implicits in Spark : error “value toDF is not a member of org.apache.spark.rdd.RDD”

I am doing a mini project for my company using Spark/Scala and have been stuck with the error mentioned in the title for a couple of days. Googling that error suggested to import org.apache.spark.sql.SQLContext.implicits, and that’s what I did:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.sql._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext.implicits

import org.apache.spark.SparkConf
object TestSQLContext {
[…..]
def main(args:Array[String]) {
[…..]
}
}

And that was the start of the problem: my application started to give a new error:

object SQLContext is not a member of package org.apache.spark.sql
[error] Note: class SQLContext exists, but it has no companion object.

The problem is, none of those online posts mention that we need to create an instance of org.apache.spark.sql.SQLContext before being able to use its members and methods. This is the right way to do it:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.sql._
import org.apache.spark.SparkConf
object Hi {

case class DimC(ID:Int, Name:String, City:String, EffectiveFrom:Int, EffectiveTo:Int)

def main(args:Array[String]) {
val conf = new SparkConf().setAppName(“LoadDW”)
val sc = new SparkContext(conf)
val sqlContext= new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

val fDimCustomer = sc.textFile(“DimCustomer.txt”)

var dimCustomer1 =   fDimCustomer.map(_.split(‘,’)).map(r=>DimC(r(0).toInt,r(1),r(2),r(3).toInt,r(4).toInt)).toDF

dimCustomer1.registerTempTable(“Cust_1”)

val customers = sqlContext.sql(“select * from Cust_1”)

customers.show()
}
}

Hope this post helps and please do not hesitate to ask your questions in comments section.

Cheers.

 

 

Advertisements