Pig Latin Parameters in Store Command


I have developed a Pig Latin script that accepts a parameter, $colour.

I have loaded in my dataset and successfully filtered it based on this parameter.

Now I am trying to store the output, using the parameter as a folder name in the STORE command, like so:

STORE Final_Relation INTO '/output/colour/'$colour'' USING PigStorage();

This is giving me the following error:

ERROR 1200: <file TestScript, line 32, column 63> mismatched input 'blue' expecting SEMI_COLON

One thing I will add is that the colour here has a hyphen in it, e.g. blue-grey, although the same applied to the first colour, e.g. red-orange, so I'm not sure whether this is relevant. (I just found it odd that the error message contains only the first part of the string, 'blue', and not the full string, 'blue-grey'.)

As an alternative, I thought it might be OK to store everything in the colour folder, using the following command:

STORE Final_Relation INTO '/output/colour' USING PigStorage();

But when I do this and run my script a second time (it works fine the first time), I get the error:

Output Location Validation Failed for: '/output/colour More info to follow:
Output directory hdfs://sandbox-hdp.XXXX.com:XXXX/output/colour already exists

This seems to put me in an awkward situation as:

  • I can't create subfolders dynamically using the parameter.
  • I can't put all the output into the same folder.
  • My real data has thousands of variations of colour, so creating folders manually is impractical.

There is 1 answer

PetyrBaelish On

@pauljcg has answered the question. The format of my output string needed to be:

'/output/colour/$colour'

I was incorrectly placing additional quotes around the parameter.
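
For anyone hitting the same error, here is a minimal sketch of the corrected statement. Pig substitutes parameters as plain text before parsing, so with the extra quotes the path literal ended right after '/output/colour/' and the parser then met the bare word blue, which would explain why the error message showed only that part. The command-line invocation in the comment is an assumption about how the script is run (TestScript is the file name from the error message), and the commented-out rmf line is an optional suggestion of mine for the "already exists" error on reruns, not something from the original script.

```pig
-- Assumed invocation, passing the parameter on the command line:
--   pig -param colour=blue-grey TestScript

-- Optional (my assumption about intent): force-remove any earlier output for
-- this colour so a rerun does not fail with "Output directory ... already exists".
-- rmf /output/colour/$colour

-- $colour sits inside one pair of quotes; it is replaced textually before parsing,
-- so this becomes '/output/colour/blue-grey'.
STORE Final_Relation INTO '/output/colour/$colour' USING PigStorage();
```

Each distinct colour value then gets its own output directory, so there is no need to create the folders by hand.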

P.S. I don't know how to promote a comment to an answer; otherwise I would have done so.