Roles and Responsibilities:

  • Building end-to-end data pipelines for ML models and other data-driven solutions, such that the pipeline is directly usable for deployment/implementation.
  • Building and maintaining data pipelines: data cleaning, transformation, roll-up, pre-processing, etc.
  • Building/developing data insight solutions for various teams like Credit, Collections, Distribution, Vigilance, HR, etc.
  • Building automation solutions using Python, SQL, Docker, etc. as required.

Technical Skills

Must have:

1. Primary skill set:

  • High proficiency in Python coding along with good knowledge of SQL (joins, nested queries, etc.)
  • Data analysis experience: the ability to understand and identify the data points and data acquisition mechanism for structured and unstructured data (text/JSON/XML) for a machine learning data pipeline.
  • Knowledge of Python libraries such as Pandas and SQLAlchemy (or other Python SQL-related libraries) [good to have: matplotlib, NumPy, SciPy, scikit-learn, NLTK].
  • Working knowledge of Git repositories (any of GitHub, GitLab, etc.)

2. Data management skill sets:

  • Ability to understand data models and create ETL jobs using Python scripts.
  • Ability to automate regular data acquisition, application processes, etc. using Python scripts.

Good to have (must be open to learning if not already familiar)

  • Web API technology:
  • Experience in REST API development using any of Django, Flask, FastAPI, etc. (this will be highly appreciated)
  • Deployment of web APIs on the cloud using Docker (this will be highly appreciated).

Other useful skills

  • Working knowledge of Linux (this will be highly appreciated).
  • Should be able to work on problems independently or with minimal support.
  • Knowledge of big data concepts and Spark (PySpark).
  • Cloud experience (AWS/Azure/GCP)

Apply for this position

Drop files here or click to upload. Maximum allowed file size is 2 MB.
Allowed type(s): .pdf, .doc, .docx