Showing posts with label Oracle. Show all posts
Showing posts with label Oracle. Show all posts

Saturday, December 8, 2018

Reading JDBC(Oracle/SQL Server) Data using Spark

Reading JDBC table data using Spark Scala


import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName(appName).setMaster(master)
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

Reading JDBC:

sc.sqlContext .read.format("jdbc")
                               .option("url",<jdbc URL>)
                               .option("dbtable",<table name>)
                               .option("user",<user name>)
                               .option("password",<password>)
                               .option("driver",<Driver Name>)
                               .load

Driver Details:
SQL Server: "com.microsoft.sqlserver.jdbc.SQLServerDriver"
Oracle: "oracle.jdbc.driver.OracleDriver"

URL Details:
SQL Server: "jdbc:sqlserver://<serverName><instanceName><:><portNumber>;<property1=value>;<property2=value>"
Oracle: "jdbc:oracle:thin:@localhost:1521:orcl"

** Please change the local host of Oracle JDBC URL depending on your server IP.
** You can use User Name and Password in place of Property1 and Property2 in time of  SQL Server JDBC URL creation. 




JSON File Parsing Using Spark Scala

Parsing JSON file in Spark:


First create object of Spark Context.


import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext


val conf = new SparkConf().setAppName(appName).setMaster(master)
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

sc.sqlContext.read.option("multiline",<is_multiline>)
                              .option("mode","PERMISSIVE")
                              .option("quote","\"")
                              .option("escape","\u0000")
                              .json(<path>)


** is_multiline:Boolean = True/False
** path = JSON file Path