How to move a dataset from one server to another in IBM InfoSphere DataStage

As developers, we sometimes need the same dataset on another DataStage server, for example when moving from a Development server to a Production server, or from Production to QA.

But moving a dataset from one server to another is not easy, because a dataset is not a normal flat file such as a .txt or .csv file.

Even if you move the file with the '.ds' extension that you see on the Unix server, that file is not the actual dataset holding the data. It is only a descriptor (header) file that points to the data files. If you are not sure about this, go through the dataset management post once.

If you still need to move the dataset, follow the process below.

There are two options for moving a dataset:

1. Create a job that reads the dataset and writes to a sequential file.

Then move the sequential file to the new system and create a job to read the sequential file and write to a dataset.

This is the easiest way to move a dataset across servers.


2. As long as both systems can reach each other over rsh/ssh, you can set up a configuration file that uses a node pool for each system.

As an example:
{
  node "node1"
  {
    fastname "server1"
    pools "Serv1" ""
    resource disk "/sample/resource" {pools ""}
    resource scratchdisk "/sample/scratch" {pools ""}
  }
  node "node2"
  {
    fastname "server1"
    pools "Serv1"
    resource disk "/sample/resource" {pools ""}
    resource scratchdisk "/sample/scratch" {pools ""}
  }
  node "node3"
  {
    fastname "server1"
    pools "Serv1"
    resource disk "/sample/resource" {pools ""}
    resource scratchdisk "/sample/scratch" {pools ""}
  }
  node "node4"
  {
    fastname "server2"
    pools "Serv2"
    resource disk "/sample/resource" {pools ""}
    resource scratchdisk "/sample/scratch" {pools ""}
  }
  node "node5"
  {
    fastname "server2"
    pools "Serv2"
    resource disk "/sample/resource" {pools ""}
    resource scratchdisk "/sample/scratch" {pools ""}
  }
}
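
If you have many nodes, a configuration file like the one above can be generated rather than typed by hand. The following is only a minimal Python sketch, not part of DataStage: the function names render_node and render_config are made up for illustration, and the node names, hosts, pools, and paths are simply the sample values from the example above.

```python
# Hypothetical helper: renders a parallel configuration file as text.
# Node names, hosts, pools, and paths are the sample values from this post.

def render_node(name, fastname, pools,
                disk="/sample/resource", scratch="/sample/scratch"):
    """Return the text of one node entry."""
    pool_list = " ".join(f'"{p}"' for p in pools)
    return (
        f'  node "{name}"\n'
        "  {\n"
        f'    fastname "{fastname}"\n'
        f"    pools {pool_list}\n"
        f'    resource disk "{disk}" {{pools ""}}\n'
        f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
        "  }"
    )

def render_config(nodes):
    """Wrap all node entries in the outer braces of the configuration file."""
    body = "\n".join(render_node(*node) for node in nodes)
    return "{\n" + body + "\n}\n"

# Three nodes on the source server (pool Serv1), two on the target (pool Serv2).
NODES = [
    ("node1", "server1", ["Serv1", ""]),
    ("node2", "server1", ["Serv1"]),
    ("node3", "server1", ["Serv1"]),
    ("node4", "server2", ["Serv2"]),
    ("node5", "server2", ["Serv2"]),
]

config_text = render_config(NODES)
print(config_text)
```

The printed text can be saved as the configuration file that the job uses. Adjust the disk and scratch paths to match your own servers.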

Then create a job on server1 with this design:

dataset > copy > dataset

In the Copy stage, open the Advanced tab and go to Node Pool and Resource Constraints. Select Node Pool in the constraint column, then select Serv2 (the target node pool).


Once the job has run, the data will be on the target system, but the header file for the target dataset will still be on the source server.

You will have to move the header file to your target system, keeping the same path.
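
One way to copy the header file while keeping the same path is scp, since the two servers already talk over ssh. This is only a sketch under assumptions: the post does not say which copy tool to use, the function name build_header_copy_command is made up, and the path /sample/datasets/customers.ds is a hypothetical example. The script only builds and prints the command, so you can check it before running it yourself.

```python
import shlex

def build_header_copy_command(header_path, target_host, user=None):
    """Build an scp command that copies the header file to the same path
    on the target host. Nothing is executed here."""
    remote = f"{target_host}:{header_path}"
    if user:
        remote = f"{user}@{remote}"
    # -p asks scp to preserve modification times and file modes.
    return ["scp", "-p", header_path, remote]

# Hypothetical header file path; server2 is the target host from the example.
cmd = build_header_copy_command("/sample/datasets/customers.ds", "server2")
print(shlex.join(cmd))
```

Run the printed command on the source server, then check on the target server that the header file sits at the same path.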
