[Spark] The Catalyst Optimizer in Apache Spark SQL
Engine-level performance optimization when using Spark SQL / Optimized Query Plan
Posted by
Wonyong Jang
on May 03, 2021 ·
7 mins read
[Spark] Implementing Apache Spark DataFrames
Main DataFrame operations / groupBy / UDF (User-Defined Function) / join
Posted by
Wonyong Jang
on May 02, 2021 ·
16 mins read
[Spark] Apache Spark SQL and DataFrame
RDD vs DataFrame / Catalyst Optimizer / Tungsten execution engine / Encoder
Posted by
Wonyong Jang
on May 01, 2021 ·
5 mins read
[Scala] Exception handling (Option, Either, Try)
Several ways to handle NullPointerException
Posted by
Wonyong Jang
on April 29, 2021 ·
8 mins read
[Scala] collection API
Traversable, Seq, List, Array, Vector, map, flatMap, takeWhile, take, groupBy
Posted by
Wonyong Jang
on April 23, 2021 ·
11 mins read
[Spark] Streaming Graceful Shutdown
How to gracefully shut down a Spark Streaming job / differences between SIGKILL, SIGTERM, and SIGINT
Posted by
Wonyong Jang
on April 19, 2021 ·
9 mins read
[Spark] Streaming Data Sources
Kafka, Kinesis / Direct, Receiver Based Data Sources, Fault Tolerance / Backpressure
Posted by
Wonyong Jang
on April 15, 2021 ·
7 mins read