Scala High Performance Programming by 2016
Author:2016
Language: eng
Format: epub, mobi
Publisher: Packt Publishing
Data clean up
The return algorithm is now blazingly fast. That is, blazingly fast to return incorrect results! Remember that we still have to handle some edge cases and clean up the input data. Our algorithm only works if there is exactly one midpoint per minute, and Dave informed us that we are likely to see more than one midpoint computed for the same minute.
To handle this problem, we create a dedicated MidpointSeries module and make sure that an instance of MidpointSeries, wrapping a series of Midpoint instances, is properly created without duplicates:
import scala.annotation.tailrec

class MidpointSeries private(val points: Vector[Midpoint]) extends AnyVal

object MidpointSeries {

  private def removeDuplicates(v: Vector[Midpoint]): Vector[Midpoint] = {

    @tailrec
    def loop(
      current: Midpoint,
      rest: Vector[Midpoint],
      result: Vector[Midpoint]): Vector[Midpoint] = {
      // current plus every immediately following midpoint in the same minute
      val sameTime = current +: rest.takeWhile(_.time == current.time)
      val average = sameTime.map(_.value).sum / sameTime.size
      val newResult = result :+ Midpoint(current.time, average)
      // skip the midpoints that were just averaged
      rest.drop(sameTime.size - 1) match {
        case h +: r => loop(h, r, newResult)
        case _ => newResult
      }
    }

    v match {
      case h +: rest => loop(h, rest, Vector.empty)
      case _ => Vector.empty
    }
  }

  def fromExecution(executions: Vector[Execution]): MidpointSeries = {
    new MidpointSeries(removeDuplicates(
      executions.map(Midpoint.fromExecution)))
  }
}
Our removeDuplicates method uses a tail-recursive inner loop (refer to Chapter 3, Unleashing Scala Performance). The loop groups consecutive midpoints that share the same execution time, calculates the average value of each group, and builds a new series from these averages. Because takeWhile only inspects adjacent elements, this relies on the input being ordered by time. Our module provides a fromExecution factory method to build an instance of MidpointSeries from a Vector of Execution. This factory method calls removeDuplicates to clean up the data.
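To see the averaging behavior in isolation, here is a minimal, self-contained sketch. It is not the book's code: Midpoint is a hypothetical simplified stand-in (an Int minute and a Double value), DedupSketch and DedupDemo are illustrative names, and the loop uses span in place of the takeWhile and drop pair. It assumes the input is already sorted by time.

```scala
import scala.annotation.tailrec

// Hypothetical, simplified stand-in for the book's Midpoint type:
// time is a minute index, value is the midpoint price.
final case class Midpoint(time: Int, value: Double)

object DedupSketch {
  // Collapses each run of consecutive midpoints sharing the same minute
  // into a single midpoint holding their average value.
  // Assumes the input is sorted by time.
  def removeDuplicates(v: Vector[Midpoint]): Vector[Midpoint] = {
    @tailrec
    def loop(
      remaining: Vector[Midpoint],
      result: Vector[Midpoint]): Vector[Midpoint] =
      remaining.headOption match {
        case None => result
        case Some(current) =>
          // span splits off the leading run that matches current.time
          val (sameTime, rest) = remaining.span(_.time == current.time)
          val average = sameTime.map(_.value).sum / sameTime.size
          loop(rest, result :+ Midpoint(current.time, average))
      }
    loop(v, Vector.empty)
  }
}

object DedupDemo {
  def main(args: Array[String]): Unit = {
    val raw = Vector(Midpoint(1, 10.0), Midpoint(1, 12.0), Midpoint(2, 20.0))
    // The two midpoints for minute 1 are replaced by their average, 11.0.
    println(DedupSketch.removeDuplicates(raw))
    // prints Vector(Midpoint(1,11.0), Midpoint(2,20.0))
  }
}
```

Using span avoids the size arithmetic in rest.drop(sameTime.size - 1), at the cost of examining the head element a second time on each pass.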
To improve our module, we add our previous computeReturns method to the MidpointSeries class. That way, once constructed, an instance of MidpointSeries can be used to compute any return series:
class MidpointSeries private(val points: Vector[Midpoint]) extends AnyVal {

  def returns(rollUp: MinuteRollUp): Vector[Return] = {
    for {
      i <- (rollUp.value until points.size).toVector
    } yield Return.fromMidpoint(points(i - rollUp.value), points(i))
  }
}
This is the same code that we previously wrote, but this time, we are confident that points does not contain duplicates. Note that the constructor is marked private, so the only way to create an instance of MidpointSeries is via our factory method. This guarantees that it is impossible to create an instance of MidpointSeries with a "dirty" Vector. You release this new version of the program, wish good luck to Dave and his team, and leave for a well-deserved lunch break.
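The combination of a private constructor and a cleaning factory can be sketched end to end as follows. Everything here is an assumption made for illustration: the simplified Midpoint, MinuteRollUp, and Return types, the fractional-change formula in Return.fromMidpoint, and the fromMidpoints factory, which deduplicates with groupBy where the book uses its tail-recursive removeDuplicates. The extends AnyVal clause is left out to keep the sketch focused on construction.

```scala
// Hypothetical simplified types; the book's definitions may differ.
final case class Midpoint(time: Int, value: Double)
final case class MinuteRollUp(value: Int)
final case class Return(value: Double)

object Return {
  // Assumed formula: fractional change between two midpoints.
  def fromMidpoint(start: Midpoint, end: Midpoint): Return =
    Return((end.value - start.value) / start.value)
}

// The private constructor means only the companion object can call `new`.
final class MidpointSeries private (val points: Vector[Midpoint]) {
  // Pairs each point with the point rollUp.value positions earlier.
  def returns(rollUp: MinuteRollUp): Vector[Return] =
    (rollUp.value until points.size).toVector.map { i =>
      Return.fromMidpoint(points(i - rollUp.value), points(i))
    }
}

object MidpointSeries {
  // Illustrative factory: averages the midpoints of each minute and
  // returns them ordered by time, so `points` never holds duplicates.
  def fromMidpoints(raw: Vector[Midpoint]): MidpointSeries = {
    val cleaned = raw
      .groupBy(_.time)
      .toVector
      .sortBy { case (time, _) => time }
      .map { case (time, group) =>
        Midpoint(time, group.map(_.value).sum / group.size)
      }
    new MidpointSeries(cleaned)
  }
}

object ReturnsDemo {
  def main(args: Array[String]): Unit = {
    val series = MidpointSeries.fromMidpoints(Vector(
      Midpoint(0, 100.0), Midpoint(1, 110.0),
      Midpoint(1, 110.0), Midpoint(2, 121.0)))
    // The duplicate for minute 1 is gone, leaving three points.
    println(series.points.size)
    // One-minute returns: one value per consecutive pair of points.
    println(series.returns(MinuteRollUp(1)).size)
    // new MidpointSeries(Vector.empty) would not compile here:
    // the constructor is not accessible outside the companion.
  }
}
```

Note that returns still indexes by position, so it treats neighboring points as one minute apart; the factory removes duplicates but does nothing about minutes that have no midpoint at all.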
As you return, you are surprised to find Vanessa, one of the data scientists, waiting at your desk. "The return series code still doesn't work", she says. The team was so excited to finally be given a working algorithm that they decided to skip lunch to play with it. Unfortunately, they discovered some inconsistencies in the results. You try to collect as much data as possible, and spend an hour looking at the invalid results that Vanessa is talking about. You notice that they all involve trade executions for two specific symbols: FOO and BAR. A surprisingly small number of trades is recorded for these symbols, and it is not unusual for several minutes to elapse between trade executions.
