DataStage Scenario-Based Question – Find the Documents Each Candidate Did Not Submit

DataStage Scenario Based Question:

Candidates are filling in forms at some institute (xyz) and submitting photocopies of their documents. Not every candidate submits the same set: one may submit a single document, another may submit three, and so on.

You have the list of all documents that a candidate could possibly submit.

The job is to find out which documents each candidate has not submitted, using only two inputs: the list of candidates with their submitted documents, and the list of all documents.

Input Data:

List of Documents (Docs)

Docs
a
b
c
d
e

List of Candidates and their Documents (CandidateID, Document)

CandidateID Document
1 c
1 e
2 a
2 d
2 e

Solution:

STEP1: Read both input files using Sequential File stages.

datastage-scenario-job-input-data-docs-list

datastage-scenario-job-input-data-docs

STEP2: Use a Column Generator stage on each input to add a new column (DUMMY) that holds the same value in every row. This column will be used to join the data from the two input files.

The two files have no key column in common, which is why we take this approach.

datastage-scenario-column-generator-new-column

datastage-scenario-column-generator-column-meta-data

datastage-scenario-column-generator-column-mapping
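
As a rough illustration outside DataStage, the effect of STEP2 can be sketched in Python. The in-memory lists stand in for the two sequential files, and the constant `DUMMY_VALUE` and the helper `add_dummy` are assumptions made for this sketch, not settings taken from the job.

```python
# Hypothetical in-memory stand-ins for the two sequential file inputs.
docs = [{"Docs": d} for d in ["a", "b", "c", "d", "e"]]
candidates = [
    {"CandidateID": 1, "Document": "c"},
    {"CandidateID": 1, "Document": "e"},
    {"CandidateID": 2, "Document": "a"},
    {"CandidateID": 2, "Document": "d"},
    {"CandidateID": 2, "Document": "e"},
]

# Assumed constant: any fixed value works as long as both sides use the same one.
DUMMY_VALUE = 1


def add_dummy(rows):
    """Return copies of the rows with a constant DUMMY column appended."""
    return [{**row, "DUMMY": DUMMY_VALUE} for row in rows]


docs_with_key = add_dummy(docs)
candidates_with_key = add_dummy(candidates)
```

Because both sides now share a column with identical values, a later join on DUMMY has something to match on even though the files have no real key in common.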

STEP3: On the candidates input, remove duplicates on the CandidateID key column so that each candidate appears only once.

remove-duplicates-stage-on-candidate-id

STEP4: Join the two inputs on the DUMMY column generated by the Column Generator stages. Since every row carries the same DUMMY value, the join pairs each candidate with every document in the list.

datastage-scenario-join-stage-documents
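
STEP3 and STEP4 together amount to de-duplicating the candidates and then cross-joining them with the document list. A minimal Python sketch of that idea follows; the DUMMY value of 1 and the variable names are assumptions for illustration only.

```python
# Inputs as they might look after the Column Generator stages (DUMMY = 1 is assumed).
docs = [{"Docs": d, "DUMMY": 1} for d in ["a", "b", "c", "d", "e"]]
candidates = [{"CandidateID": cid, "DUMMY": 1} for cid in (1, 1, 2, 2, 2)]

# STEP3 equivalent: keep the first row seen for each CandidateID.
seen_ids = set()
unique_candidates = []
for row in candidates:
    if row["CandidateID"] not in seen_ids:
        seen_ids.add(row["CandidateID"])
        unique_candidates.append(row)

# STEP4 equivalent: inner join on DUMMY. Every row has the same DUMMY value,
# so each candidate is paired with every document.
all_pairs = [
    {"CandidateID": c["CandidateID"], "Document": d["Docs"]}
    for c in unique_candidates
    for d in docs
    if c["DUMMY"] == d["DUMMY"]
]

print(len(all_pairs))
```

With 2 distinct candidates and 5 documents, the join yields 10 candidate/document pairs, which is the full set of submissions that could have happened.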

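STEP5: Use a Lookup stage to look up each CandidateID and Document pair from the join output against the original list of candidates and their submitted documents.

Pairs that find no match are sent to the reject link. This rejected data is our final required data: the documents each candidate has not submitted.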

datastage-scenario-lookup-stage-documents
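
The lookup-with-reject logic can be sketched in Python as a membership test against the set of submitted pairs. This is only an approximation of what the Lookup stage does; the names `matched` and `rejected` are chosen here to mirror the normal and reject links.

```python
# Every possible (CandidateID, Document) pair, as produced by the join in STEP4.
all_pairs = [(cid, doc) for cid in (1, 2) for doc in ["a", "b", "c", "d", "e"]]

# Reference data: the pairs that were actually submitted (the original input file).
submitted = {(1, "c"), (1, "e"), (2, "a"), (2, "d"), (2, "e")}

matched = []   # plays the role of the normal output link
rejected = []  # plays the role of the reject link
for pair in all_pairs:
    if pair in submitted:
        matched.append(pair)
    else:
        rejected.append(pair)

print(rejected)  # [(1, 'a'), (1, 'b'), (1, 'd'), (2, 'b'), (2, 'c')]
```

The rejected list is the answer to the question: candidate 1 is missing a, b and d, and candidate 2 is missing b and c.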

STEP6: The rejected data from the Lookup stage (the documents not submitted) is:

datastage-scenario-lookup-stage-documents-rejects-real-output

The normal output from the Lookup stage (the documents that were submitted) is:

datastage-scenario-lookup-stage-documents-normal-output

This would be the final design of the job.

datastage-scenario-job-design-documents-examples
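
To sanity-check the job's reject output independently of DataStage, the expected result can also be derived with a per-candidate set difference. This is a different technique from the dummy-column join used in the job, shown here only as a cross-check on the sample data; all names are illustrative.

```python
from collections import defaultdict

# Sample inputs from the question.
all_docs = ["a", "b", "c", "d", "e"]
submitted_rows = [(1, "c"), (1, "e"), (2, "a"), (2, "d"), (2, "e")]

# Group the submitted documents by candidate.
submitted_by_candidate = defaultdict(set)
for candidate_id, document in submitted_rows:
    submitted_by_candidate[candidate_id].add(document)

# Missing documents = all documents minus the ones the candidate submitted.
missing = {
    candidate_id: sorted(set(all_docs) - documents)
    for candidate_id, documents in submitted_by_candidate.items()
}

print(missing)  # {1: ['a', 'b', 'd'], 2: ['b', 'c']}
```

Note that, like the job itself, this only reports candidates who appear in the input at least once; a candidate who submitted nothing is not in either file and so cannot be detected.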
