
The C++ techniques you need for $600k hedge fund jobs

Jobs writing C++ code for high-frequency trading (HFT) firms and hedge funds can pay very well indeed. Headhunters put compensation (salary and bonus) for such roles at up to $600k+ a few years ago and say this still holds. But simply knowing C++ is not enough: the language is comparatively fast out of the box, but for low-latency trading applications you need to know how to make it really fast.


Paul Bilokon, a former director at Deutsche Bank, visiting professor at Imperial College London, and chief scientific advisor at Thalesians Marine Ltd, says that if you want an integral role as a C++ developer on an HFT team, familiarity with low-latency C++ is usually mandatory. Although some firms use programmable FPGAs to achieve ultra-low latency, Bilokon says this can be complicated because FPGAs require specialized hardware knowledge and languages like Lucid, VHDL and Verilog. "Unless the company is prepared to invest in FPGAs in the long term (both in terms of research and development and ongoing support) it is probably a wise decision to get the most mileage (low latency) out of C++," he tells us.

However, information on low latency C++ can be hard to come by. A paper* released last year by Bilokon and one of his PhD students, Burak Gunduz, looks at 12 techniques for reducing latency in C++ code, as follows:

  1. Lock-free programming: a concurrent-programming paradigm in which multi-threaded algorithms avoid mutual-exclusion mechanisms, such as locks, when arbitrating access to shared resources.
  2. Single instruction, multiple data (SIMD) instructions: instructions that exploit the parallel processing power of contemporary CPUs by performing the same operation on multiple data elements simultaneously.
  3. Mixing data types: when a computation involves both float and double types, implicit conversions are required; keeping a computation entirely in float improves performance.
  4. Signed vs unsigned: ensuring consistent signedness in comparisons to avoid implicit conversions.
  5. Prefetching: explicitly loading data into cache before it is needed to reduce data-fetch delays, particularly in memory-bound applications.
  6. Branch reduction: helping the CPU predict conditional branch outcomes so code can be executed speculatively.
  7. Slowpath removal: minimizing execution of rarely executed code paths and keeping them out of the hot instruction stream.
  8. Short-circuiting: ordering logical expressions so evaluation ceases as soon as the final result is determined.
  9. Inlining: incorporating the body of a function at each point the function is called, reducing function-call overhead and enabling further optimisation by the compiler.
  10. Constexpr: computations marked constexpr are evaluated at compile time, enabling constant folding and efficient code execution by eliminating runtime calculations.
  11. Compile-time dispatch: techniques like template specialization or function overloading so that optimised code paths are chosen at compile time based on type or value, avoiding runtime dispatch.
  12. Cache warming: preloading data into the CPU cache before it's needed, to minimize memory-access time and boost program responsiveness.

[Chart: effectiveness of the 12 techniques. Source: C++ design patterns for low-latency applications including high-frequency trading]

The effectiveness of these techniques is shown in the chart above: cache warming and constexpr can bring efficiency improvements of up to 90%, while consistent signedness in comparisons yields only a 12.5% improvement.

If you're interested in the topic, Bilokon also suggests watching the 2019 conference video by Carl Cook and Nimrod Sapir at QSpark, a provider of low-latency trading platforms, shown here:

 

*C++ design patterns for low-latency applications including high-frequency trading, by Paul Bilokon and Burak Gunduz. Code on GitHub: 0burak/imperial_hft.

Have a confidential story, tip, or comment you’d like to share? Contact: +44 7537 182250 (SMS, Whatsapp or voicemail). Telegram: @SarahButcher. Click here to fill in our anonymous form, or email editortips@efinancialcareers.com. Signal also available.

Bear with us if you leave a comment at the bottom of this article: all our comments are moderated by human beings. Sometimes these humans might be asleep, or away from their desks, so it may take a while for your comment to appear. Eventually it will – unless it’s offensive or libelous (in which case it won’t.)

 

Author: Sarah Butcher, Global Editor
  • SebL
    18 July 2024

    Awesome share thanks. I think adding some GPGPUs techniques here could be fun, such as leveraging the power of compute queues using Vulkan. Do you have something on that topic?

  • 40k-DeathJester
    12 July 2024

    Compiler will do a lot of this for you.

  • mborges
    12 July 2024

    It is "SIMD" for "Single Instruction Multiple Data". Did an AI tried to invent something new here?

  • ananta.dev
    12 July 2024

    Very useful. I'm actually printing this, and I very rarely print anything these days.
