Collection Summary Job

The 'Collection Summary Snapshot' data source in Druid holds the metrics that various reports are built from. The Collection Summary Job reads user and course data and saves the result as a blob. That blob is then indexed into the 'Collection Summary Snapshot' data source in Druid for analysis and retrieval.

Data provider:

Cassandra

  1. user_enrolments

  2. course_batch

API

  1. Druid ingestion task trigger API

  2. Content search API
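
As an illustration of the ingestion task trigger step, the sketch below builds (without sending) a request against Druid's task submission endpoint, `POST /druid/indexer/v1/task`. The Overlord address, the `build_task_request` helper, and the placeholder spec are assumptions made for the example; the spec the job actually submits is the Collection Summary Ingestion Spec.

```python
import json
import urllib.request

# Assumption: hypothetical Overlord address; replace with your deployment's host.
OVERLORD_URL = "http://localhost:8090"


def build_task_request(ingestion_spec: dict,
                       overlord_url: str = OVERLORD_URL) -> urllib.request.Request:
    """Build, but do not send, a POST request that submits an ingestion task."""
    return urllib.request.Request(
        url=f"{overlord_url}/druid/indexer/v1/task",
        data=json.dumps(ingestion_spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder only: the real body is the Collection Summary Ingestion Spec.
placeholder_spec = {"type": "index_parallel", "spec": {}}
request = build_task_request(placeholder_spec)

# To actually submit the task, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the payload easy to inspect or log before the task is triggered.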

Collection Summary Ingestion Spec:

Collection Summary Snapshot Spec:

| Dimension in Druid | Column Label | Data Type | Description |
| --- | --- | --- | --- |
| batch_id | batchid | String | Unique batch identifier |
| batch_name | batchname | String | Name of the batch |
| content_channel | channel | String | Name of the channel |
| collection_org_name | organisation | String | Published-by name, course publisher name, or tenant |
| collection_id | courseid | String | Unique collection identifier |
| collection_name | collectionname | String | Name of the course |
| batch_start_date | startdate | String | Start date of the batch |
| batch_end_date | enddate | String | End date of the batch |
| user_org | orgname | String | User organisation name |
| content_status | contentstatus | String | State of the course, e.g. Live, Draft |
| total_enrolment | enrolleduserscount | Long | Number of users enrolled in the course |
| total_completion | completionuserscount | Long | Number of users who have completed the course |
| total_certificates_issued | certificateissuedcount | Long | Number of users who received a certificate for the course |
| user_state | state | String | Name of the state |
| user_district | district | String | Name of the district |
| keywords | keywords | List[String] | Keywords/tags assigned to the course |
| timestamp | timestamp | Long | Timestamp of when the report was generated |
| has_certificate | hascertified | Boolean | Whether the batch is certified or not |
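
The mapping between column labels and Druid dimensions can be sketched as a simple rename step. This is a minimal illustration that assumes the column labels are the field names in the job's output and that each is renamed to the matching Druid dimension; the `to_druid_record` helper and the sample row are hypothetical.

```python
# Column label -> Druid dimension, taken from the dimension table.
COLUMN_TO_DIMENSION = {
    "batchid": "batch_id",
    "batchname": "batch_name",
    "channel": "content_channel",
    "organisation": "collection_org_name",
    "courseid": "collection_id",
    "collectionname": "collection_name",
    "startdate": "batch_start_date",
    "enddate": "batch_end_date",
    "orgname": "user_org",
    "contentstatus": "content_status",
    "enrolleduserscount": "total_enrolment",
    "completionuserscount": "total_completion",
    "certificateissuedcount": "total_certificates_issued",
    "state": "user_state",
    "district": "user_district",
    "keywords": "keywords",
    "timestamp": "timestamp",
    "hascertified": "has_certificate",
}


def to_druid_record(row: dict) -> dict:
    """Rename known column labels to Druid dimension names; drop unknown fields."""
    return {
        COLUMN_TO_DIMENSION[label]: value
        for label, value in row.items()
        if label in COLUMN_TO_DIMENSION
    }


# Hypothetical sample row, for illustration only.
sample_row = {"batchid": "batch-001", "courseid": "course-001", "enrolleduserscount": 25}
druid_record = to_druid_record(sample_row)
```

Note that `courseid` maps to `collection_id` and `hascertified` maps to `has_certificate`, so the rename is not a purely mechanical insertion of underscores.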
