Advanced Scala: Implicits

First draft of my Implicits lecture.

For about the last 3-4 years I've been on a journey that wouldn't have started if it weren't for a series of presentations at my local Austin "Scala Enthusiasts" meetup, starting with the one where I was first introduced to the concept of "implicits".

Now, years later, and after reading several books, scratching my head, and butting it into walls a whole bunch, I've been feeling the desire to create a lecture aimed at the person I was at the very beginning of the journey. You see, a lot of the time I knew I wanted to understand these interesting things (monads and monoids and typeclasses), but I lacked a roadmap that put everything together. I would grasp at dribs and drabs of concepts, while missing the map that would show me where I was headed and what some of the traps and esoteric syntax were all about.

The above video weighs in at 1 hour 25 minutes, and it's my first attempt at delivering the first of two planned lectures. I'm planning on re-recording it (having stumbled and mumbled through this first dry run), and I might break it into a few smaller parts. But rather than letting the perfect be the enemy of the good, I'm just going to post it now.

Also, here are the slides in PDF form:

Advanced_Scala_Implicits_reduced

Thinking of a Scala Option as a List

Just wanted to show a tiny tidbit of cool and tidy code, and this is one of those things that become intuitive as you start working with lists and flatMap (i.e. Monads, even if you don't know the term yet) and working toward the wizardry of true FP.

So I’ve got a list (or Seq or similar collection) and I want to append an object to it. Something like:

val name: Option[String] = Some("Dude")
val greetingWords: Seq[String] = List("Hi", "there", "Dude")

But we’ve got the hitch that the object we want to append is an Option[String]. Maybe it comes from a “getName” function that may not be able to get a name. We could always write the code like this…

val name: Option[String] = Some("Dude")
val greetingWords: Seq[String] = Seq("Hi", "there") ++ { 
  name match {
    case Some(name) => Seq(name)
    case None => Seq.empty[String]
  }
}

…but that’s ugly and inelegant.

So here's the elegant way of approaching this: Option[A] can be thought of as a special kind of list that holds either one value or zero values, depending on whether it's a Some or a None! So in fact, you can skip all that boilerplate and append the Option as though it were another Seq (in Scala 2, the implicit conversion Option.option2Iterable is what makes this compile, a nice tie-in to the implicits topic above)…

scala> Seq("hi","there") ++ Some("dude")
val res3: Seq[String] = List(hi, there, dude)

scala> Seq("hi","there") ++ None
val res4: Seq[String] = List(hi, there)

Clever, huh?
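The same intuition pays off elsewhere. Since an Option behaves like a zero-or-one-element collection, a sequence of Options collapses neatly with flatten, and flatMap happily accepts a function that returns an Option. A quick REPL sketch (the res numbers are just illustrative):

scala> Seq(Some("hi"), None, Some("dude")).flatten
val res5: Seq[String] = List(hi, dude)

scala> Seq("hi", "there").flatMap(w => Some(w.toUpperCase))
val res6: Seq[String] = List(HI, THERE)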

Spark 101 for Scala Users

A quick hands-on intro to Spark for Scala users.

I'll format this into a more detailed presentation later (so feel free to check back and bug me if I haven't gotten around to it), but here are some immediate things you may be interested in if you saw my Austin Scala Enthusiasts Meetup presentation…

Here’s a link to the PDF of the slides I talked to.

Running Zeppelin via a Docker container

docker run --name zeppelin \
  -p 8080:8080 -p 4040:4040 \
  -v $HOME/spark/data:/data \
  -v $HOME/spark/logs:/logs \
  -v $HOME/spark/notebook:/notebook \
  -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
  -e ZEPPELIN_LOG_DIR='/logs' \
  -e ZEPPELIN_INTP_JAVA_OPTS="-Dspark.driver.memory=4G" \
  -e ZEPPELIN_INTP_MEM="-Xmx4g" \
  -d apache/zeppelin:0.9.0 /zeppelin/bin/zeppelin.sh
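Once the container is up, the Zeppelin UI should be reachable at http://localhost:8080 (per the port mapping above), and your notebooks get persisted under $HOME/spark/notebook on the host.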

Running Spark via a Docker container

docker run --name spark -v $HOME/spark/data:/data -p 4040:4040 -it mesosphere/spark bin/spark-shell
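This drops you straight into a Spark shell, which starts with a SparkContext bound to sc and a SparkSession bound to spark. A quick sanity check (the output below is a sketch; exact formatting may vary by version):

scala> sc.parallelize(1 to 5).map(_ * 2).sum
res0: Double = 30.0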

For a basic Spark SBT project

build.sbt:

import Dependencies._

ThisBuild / scalaVersion     := "2.12.11"
ThisBuild / version          := "0.1.0-SNAPSHOT"
ThisBuild / organization     := "com.example"
ThisBuild / organizationName := "Meetup Spark Example"
ThisBuild / scalacOptions ++= Seq("-language:higherKinds")

lazy val root = (project in file("."))
  .settings(
    name := "SparkCatScratch",
    libraryDependencies ++= Seq(scalaTest % Test, sparkCore, sparkSQL, catsCore, catsFree, catsMTL)
  )

console / initialCommands :=
  s"""
    |import cats._, cats.data._, cats.implicits._, org.apache.spark.sql.SparkSession
    |val spark = SparkSession.builder().master("local").getOrCreate()
    |""".stripMargin

console / cleanupCommands := "spark.close()"

project/Dependencies.scala:

import sbt._

object Dependencies {

  val sparkVersion = "2.4.5"
  val catsVersion = "2.0.0"

  lazy val scalaTest = "org.scalatest" %% "scalatest" % "3.0.8"
  lazy val sparkCore = "org.apache.spark" %% "spark-core" % sparkVersion
  lazy val sparkSQL = "org.apache.spark" %% "spark-sql" % sparkVersion
  lazy val catsCore = "org.typelevel" %% "cats-core" % catsVersion
  lazy val catsFree = "org.typelevel" %% "cats-free" % catsVersion
  lazy val catsMTL = "org.typelevel" %% "cats-mtl-core" % "0.7.0"
}
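With the settings above, running sbt console should drop you into a REPL that already has cats imported and a local SparkSession bound to spark (that's what the initialCommands setting does). The snippet below shows the same setup done by hand, in case you're starting from a bare console.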

Starting Spark in the SBT console:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").getOrCreate()
val sc = spark.sparkContext
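From there, a minimal smoke test, written as a hedged sketch (import spark.implicits._ provides the $"..." column syntax):

import spark.implicits._

val df = spark.range(10)           // Dataset[java.lang.Long] with a single "id" column
df.count()                         // should return 10
df.filter($"id" % 2 === 0).show()  // prints only the even ids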