Scala, together with the Spark framework, forms a rich and powerful data processing ecosystem. Modern Scala Projects is a journey into the depths of this ecosystem. The machine learning (ML) projects presented in this book enable you to create practical, robust data analytics solutions, with an emphasis on automating data workflows with the Spark ML pipeline API. This book showcases carefully selected functional libraries and other constructs from Scala to help readers roll out their own scalable data processing frameworks. The projects in this book enable data practitioners across all industries to gain insights into data that will help organizations gain a strategic and competitive advantage.
Modern Scala Projects focuses on the application of supervised learning ML techniques that classify data and make predictions. You'll begin by working on a project to predict a class of flower by implementing a simple machine learning model. Next, you'll create a cancer diagnosis classification pipeline, followed by projects delving into stock price prediction, spam filtering, fraud detection, and a recommendation engine.
By the end of this book, you will be able to build efficient data science projects that fulfil your software requirements.
What you will learn
Create pipelines to extract data or analytics and visualizations
Automate your process pipeline with jobs that are reproducible
Extract intelligent data efficiently from large, disparate datasets
Automate the extraction, transformation, and loading of data
Develop tools that collate, model, and analyze data
Maintain the integrity of data as data flows become more complex
Develop tools that predict outcomes based on “pattern discovery”
Build really fast and accurate machine-learning models in Scala

Who this book is for
Modern Scala Projects is for Scala developers who would like to gain some hands-on experience with some interesting real-world projects. Prior programming experience with Scala is necessary.