Google Cloud engineer (big data)

  • Negotiable
  • London, England, United Kingdom
  • Contract, Full time
  • HSBC Bank plc
  • 18 Aug 2018

  • Experienced software development professional with strong exposure to Big Data environments
  • Programming experience in Java, Scala, and Spark
  • Google Cloud

GB&M Big Data is a Global Banking and Markets initiative that is part of the Group Data Strategy to transform the way we govern, manage and use all our data to its full potential across HSBC.

Assets developed as part of GB&M Big Data are designed to support HSBC at a Group level. These assets include a Data Lake for GB&M and CMB: a single virtual pool of client, transaction, product, instrument, pricing and portfolio data. The lake is used to deliver solutions to business requirements, offering Big Data as a service to the business.

The Spark Data Engineer will be part of the core Big Data technology and design team. The successful candidate will be entrusted with developing solutions and identifying design ideas that enable the software to meet its acceptance and success criteria, and will work with architects and BAs to build data components in the Big Data environment.

As a key member of the technical team alongside Engineers, Data Scientists and Data Users, you will be expected to define and contribute at a high level to many aspects of our collaborative Agile development process:

  • Software design, Java development, and automated testing of new and existing components in a dynamic Agile, DevOps environment
  • Promoting development standards, code reviews, mentoring, knowledge sharing
  • Product and feature design, scrum story writing
  • Data Engineering and Management
  • Product support & troubleshooting
  • Implementing tools and processes, and handling performance, scale, availability, accuracy and monitoring
  • Liaison with BAs to ensure that requirements are correctly interpreted and implemented. Liaison with Testers to ensure that they understand how requirements have been implemented, so that they can be effectively tested.
  • Participation in regular planning and status meetings. Input to the development process through involvement in Sprint reviews and retrospectives. Input into system architecture and design.
  • Peer code reviews.
  • 3rd line support.


  • Experienced in Java, Scala and/or Python in Unix/Linux environments, both on-premises and in the cloud
  • Java development and design using Java 1.7/1.8. Advanced understanding of core features of Java and when to use them
  • Experience with most of the following technologies (Apache Hadoop, Scala, Apache Spark, Spark streaming, YARN, Kafka, Hive, HBase, Presto, Python, ETL frameworks, MapReduce, SQL, RESTful services).
  • Sound working knowledge of Unix/Linux platforms
  • Hands-on experience building data pipelines using the Hadoop components Sqoop, Hive, Pig, Spark and Spark SQL.
  • Must have experience developing HiveQL and UDFs for analysing semi-structured/structured datasets
  • Experience with time-series/analytics databases such as Elasticsearch
  • Experience with industry standard version control tools (Git, GitHub), automated deployment tools (Ansible & Jenkins) and requirement management in JIRA
  • Exposure to Agile project methodology, as well as other methodologies (such as Kanban)
  • Understanding of data modelling techniques using relational and non-relational techniques
  • Coordination between onsite and offshore teams
  • Experience debugging code issues and communicating the highlighted differences to the development team/architects
  • Understanding or experience of Cloud design patterns


  • Google Technologies and Big Data
  • Forward-thinking, independent, creative and self-sufficient; able to work with limited documentation, with exposure to testing complex, multi-tiered integrated applications. Ability to work on own initiative with minimal supervision and on multiple tasks simultaneously
  • Excellent communication, interpersonal and decision-making skills
  • Strong team-working skills, working in global teams across multiple time zones
  • Good knowledge of data warehouse concepts
  • Basic knowledge of scheduling tools.
  • Knowledge of the Software Development Life Cycle (SDLC) and methodologies such as DevOps, Agile, Scrum, Waterfall and iterative processes


  • professional software development experience and at least 4 years within a Big Data environment
  • years of programming experience in Java, Scala, and Spark.
  • Proficient in SQL and relational database design.
  • Elasticsearch experience (Elasticsearch/Logstash/Kibana, etc.)
  • Agile and DevOps experience: at least 2 years
  • Project planning.
  • Google Cloud Platform, or other cloud vendor