Server Setup Guide

This guide describes how to deploy and run the data-products on the server.

Setup and execution of data-products on the server

Each data-product is an independent Spark job launched via spark-submit to generate reports or perform data migrations. All data sources and dependency libraries must therefore be present on the server before a data-product is executed.
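The sketch below shows the general shape of such a spark-submit invocation. It is illustrative only: the main class, jar path, job id and config file are assumptions and will differ per deployment; the deployed run scripts wrap the actual command.

```bash
# Illustrative only — the main class, jar path, job id and config location
# are assumptions; the deployed run scripts encapsulate the real invocation.
spark-submit --master local[*] \
  --class <data-product-main-class> \
  <path-to>/lern-data-products.jar \
  progress-exhaust "$(cat <path-to>/job-config.json)"
```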

Building the data-products

Job Path: Build/Lern/LernDataProducts

Deploying the data-products

Job Path: Deploy/{{env}}/Lern/LernDataProducts

Params:

  • module - selects which part of the data-products deployment to run

    1. lern-dataproducts - to deploy the data-products

    2. lern-dataproducts-spark-cluster - to deploy the data-products to a Spark cluster (such as an HDInsight cluster)

    3. cronjobs - to update the cron jobs on the server

  • remote - the Spark server to which the selected module is deployed

Cron jobs

The data-products run in daemon (background) mode and are triggered on a schedule by cron jobs; an example crontab entry is sketched below.
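The entry below is illustrative only: the script path, job id and log location are assumptions, and the actual entries are installed by the cronjobs deployment module described above.

```bash
# Illustrative crontab entry — script path, job id and log location are
# assumptions; the cronjobs deployment module installs the real entries.
# Run the progress-exhaust data-product every day at 02:00.
0 2 * * * /mount/data/analytics/scripts/run-job.sh progress-exhaust >> /mount/data/analytics/logs/progress-exhaust.log 2>&1
```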

Provisioning Postgres DB for exhaust job execution

The exhaust jobs in the data-products use the job_request table in the Postgres DB to track exhaust job requests.

Job Path: Provision/{{env}}/DataPipeline/PostgresDbUpdate
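After provisioning, the table can be inspected with psql as a quick sanity check. In the sketch below the host, database, user and column names are assumptions and must be adapted to your environment.

```bash
# Illustrative check of exhaust requests — host, database, user and column
# names are assumptions; verify against your provisioned schema.
psql -h <postgres-host> -U <db-user> -d <analytics-db> \
  -c "SELECT request_id, job_id, status FROM job_request LIMIT 10;"
```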

Running a data-product through Jenkins

In production, the data-products are triggered on the server by cron jobs. For development and testing, the Jenkins job below can be used to trigger a run with the respective job id (see the trigger sketch after the parameter list).

Job Path: Deploy/{{env}}/Lern/LernAnalyticsReplayJobs

Params:

  • job_type - run-job

  • job_id - the specific job id (e.g. admin-user-reports, progress-exhaust)

  • batch_identifier - the specific batch id

  • start_date - data consumption start date; not required for Lern data-products

  • end_date - data consumption end date; not required for Lern data-products

  • private_branch - the specific private branch

  • branch_or_tag - the public branch or tag to deploy
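The job can also be triggered remotely through the Jenkins API. The sketch below is illustrative: the Jenkins host, credentials and parameter values are assumptions; filling the same parameters in the Jenkins UI is equivalent.

```bash
# Illustrative remote trigger — Jenkins host, credentials and parameter
# values are assumptions; the Jenkins UI with the same parameters works too.
curl -X POST -u "<user>:<api-token>" \
  "https://<jenkins-host>/job/Deploy/job/<env>/job/Lern/job/LernAnalyticsReplayJobs/buildWithParameters" \
  --data-urlencode "job_type=run-job" \
  --data-urlencode "job_id=progress-exhaust" \
  --data-urlencode "batch_identifier=<batch-id>" \
  --data-urlencode "private_branch=<private-branch>" \
  --data-urlencode "branch_or_tag=<public-branch-or-tag>"
```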

Running a data-product using a shell command

The data-product can also be executed directly on the server with a shell command; a sketch of such an invocation is given below.
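A minimal sketch, assuming the deployment installs a run script under the analytics scripts directory: the script name, path and log location are assumptions, so check the installed cron entries for the exact invocation on your server.

```bash
# Minimal sketch — script path, name and log location are assumptions;
# the installed cron entries show the exact invocation for your server.
cd /mount/data/analytics/scripts
nohup ./run-job.sh progress-exhaust > /mount/data/analytics/logs/progress-exhaust.log 2>&1 &
```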
