Powershell: import complex PRICAT csv file

Background

A while ago I was asked to help with importing tyre data from several different manufacturers into a local PIM system.

For exchanging data electronically the tyre industry have adopted the EDIFACT subset EANCOM PRICAT for the trade, particularly for the tyre trade by the term "EDIWheel". EDIWheel process begin with an electronic product and price catalogue (PRICAT) then orders (ORDERS), confirmation of orders (ORDRSP), delivery dispatch (DESADV) and electronic invoice (INVOIC).

The EDI documents I had to process was PRICAT and invoice. PRICAT is basically a CSV file with some extra information on the first rows of the file.

Delimiter in the file was ";". The extra rows in the csv file prevented me from simply reading the file with the built-in powershell function import-csv.

EDIWheel PRICAT exists in two different versions, B2 (2008) or B4 (2016). In my case the format will be the older B2 format.

Since the only script language available, due to the IT-policy, was Windows Powershell I had to parse the PRICAT with Powershell and store the data in the PIM system

B2 or B4?

First step was to agree on what version of PRICAT to use. Some of the manufacturers offered only B2 so we decided to go with that. They are both structured the same way but B4 have additional parameters and extra values for some of the B2 parameters. They are both a CSV with extra information on the top two lines. EDIWheel foundation call this top information "HDS". 

HDS looks like this



The documentation is copyrighted and can be obtained from the Ediweel web site. Read more Ediwheel.net

Description of HDS in below table

Parameter

Status

Format

Length

Example

Comment

Record type

Mandatory

AN*

3

HDS

 

Document ID

Mandatory

AN

17

DED_BRV_010607_1

Pricelist/Catalogue Identification

Date

Mandatory

AN

12

20070601

Date or Date and Time of Pricelist/Catalogue creation, FORMAT="YYYYMMDD" or "YYYYMMDDHHMM"

Currency

Optional

AN

3

EUR

 

Supplier ID

Mandatory

AN

13

4399901585232

Code to identify the supplier defined by third parties (ILN, DUNS, ...) or

customer.

Receiver ID

Optional

AN

13

142145

Customer Number

Country code

Mandatory

AN

3

DE

Country code, ISO 3166, e.g. DE, GB

EDIWheel version

Mandatory

AN

35

B2

 

Catalogue type

Optional

AN

1

P

If blank = full catalogue. P= partial catalogue

* AlphaNumeric

From row three and onwards the product information is stored in POH records.



Step 1: Read the HDS from the file with powershell


There is a built in function in powershell to import a CSV, called import-CSV but the HDS record will make powershell refuse to import the file as CSV. However there is other built in functions to read from files. After a lot of Googling and trial and error i found away to only load the second row with the built in function Get-Content.


1
2
3
4
5
6
# Load the CSV-file specified in $filename. 
# Two first rows contain file info. Read only row 2 [-1]. 
# Store in $header variable and then split into array and store in $data 

$header = (Get-Content -Path $filename -First 2)[-1]
$data = $header.Split(';')


From $data we can now read the parameters we need in our system. In my case that is the manufacturer, currency and country.

I also had to translate $SUPPLIER_ID to the internal ID of the manufacturer but I will not show how as that is company internal.

1
2
3
4
5
6
7
8
#Currency
$currency = $data[3]

#Supplier_ID as the ID of the manufacturer
$SUPPLIER_ID= $data[4]

#Country
$country = $data[6]


Step 2: Read the POH rows from the file with powershell

From row 3 an onwards the file is a standard CSV files so I first tried Import-CSV function but I couldn't get the result I wanted so I tried with Get-Content instead which has built in function for skipping rows when reading the file.

After reading the file from row 3 and onwards I used ConvertFrom-CSV to be able to parse the data with powershell.

First I declare which columns to read, I did not need all the information in the PRICAT file. The parameter $desiredColumns is an array with the column names in the CSV file.

1
2
3
4
5
#Column headers is row 3
$desiredColumns = 'POS','SUPPLIER_CODE','DESCRIPTION_1','WEIGHT','BRAND_TEXT','PRODUCT_TYPE','VEHICLE_TYPE','NET_VALUE','NV_VALID_FROM','RECYCLING_FEE','NOISE_PERFORMANCE','NOISE_CLASS_TYPE','ROLLING_RESISTANCE','WET_GRIP', 'EC_VEHICLE_CLASS', 'EU_DIRECTIVE_NUMBER'

#Load the PRICAT file, skip first two rows and only store columns we need
$pricat = Get-Content -Path $filename | Select-Object -Skip 2 | Out-String | ConvertFrom-Csv -Delimiter (';') | Select $desiredColumns


Now it is only a matter of looping through each $row of the loaded $pricat and insert it into the database. In this example I am using Invoke-Sqlcmd that comes with the MS SQL tools. In the real case I had to store it in a old PIM database using a text based flat file format.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#Loop through all rows
foreach($row in $pricat) {

    $SQL="INSERT INTO tyre_info (POS,supplier_id,country,currency,description,weight,brand,product_type,vehicle_type,net_price,price_valid_from, rec_fee,noise,noise_type,rol_resistance,wet_grip,EC_veh_class,EU_directive)"

    $SQL+=" VALUES($row.POS,$SUPPLIER_ID,$country,$currency,$row.DESCRIPTION,$row.WEIGHT,$row.BRAND_TEXT,$row.PRODUCT_TYPE,$row.VEHICLE_TYPE,$row.NET_VALUE,$row.NV_VALID_FROM,$row.RECYCLING_FEE,$row.NOISE_PERFORMANCE,$row.NOISE_CLASS_TYPE,$row.ROLLING_RECISTANCE,$row.WET_GRIP,$row.EC_VEHICLE_CLASS,$row.EU_DIRECTIVE_NUMBER)"

    #Execute the SQL using Invoke-SqlCmd from MS SQL tools
    Invoke-Sqlcmd -Query $SQL -ServerInstance "MyServer\Instance"

}
 




Comments

Popular posts from this blog

Creating an old school amiga demo bitmap scroll in Python 3 and pygame 2

MS SQL server: CTE example