-    HOW TO: LOADING CSV FILES    

Loading CSV files with LONG fields

Need to load data into a Uniface system that has fields longer than 8 Kbytes? Whilst Uniface baulks at the task, the UD6/CMtool Driver incorporates a facility to let you do just this..

The UD6/CMtool Driver is designed to store your Uniface repository in, just like you would store it in Oracle or Sybase. However when you store your repository with UD6/CMtool Driver, plain ASCII text files are produced, which in turn, can be managed by a Source Code Manager or Configuration Management tool (PVCS, ClearCase etc).

What has this got to do with CSV files?

For the UD6/CMtool Driver to store your Uniface source code in text files, it must perform several technical feats that NO OTHER DRIVER or solution offers. This is why UD6/CMtool Driver is the best solution for storing your Uniface source code.

However a side effect of this, is that it has the 'knowledge' to be able to take a delimited text file (e.g.: CSV) with LONG fields and load it into Uniface in a way that Uniface can understand it.

Uniface cannot understand these long fields on its own, because it imposes a maximum record size of 8 Kbytes, and anything longer than that, must be split into 8 Kbytes 'chunks'. Behind the scenes the UD6/CMtool Driver transparently performs these tasks.

A simple example

Lets take a simple example, loading a tab delimited text file with the following fields:

Entity name (filename): EXPDATA

Field name Data Type Interface In Db?
PK N N9 Y
F1 S C10 Y
F2 S C* Y
F3 S C* Y
F4 S C20 Y
D4 D D N

The first thing to note is that the Date field in the file (F4), is considered a String, and another (dummy) field called D4 is defined. In the READ trigger for this entity we can define some code to format this:

read
if ($status) done
d4=$date(f4)

The date needs to be handled this way due to translation difficulties with most date formats. Combination Date/Time fields do not appear to have the same translation difficulties, but careful experimentation is advised.

Assignment file and overflows settings

For the UD6/CMtool driver to load the tab delimited file into this entity definition, some settings must be specified in your assignment file. Please see the online manual (help/mhhelp.hlp) for more detailed descriptions of these parameters:

[ENTITIES]
EXPDATA.* $UD6:EXPDATA.*

[DRIVERS]
USYS$UD6_PARAMS=triggertag disable, mmf binary, convert csv, convert sep \t

The UD6/CMtool driver has three (3) additional configuration files that are installed by default in your Uniface bin directory. One of these is called 'overflows', and it also must be modified.

This configuration file tells the UD6/CMtool driver which tables have LONG fields, and which is the FIRST long field in that table. So for our example, we would add a line to the overflows file:

:OEXPDATA:F2+

This line tells the UD6/CMtool Driver that the entity EXPDATA's first variable length (LONG) field is called F2. Each line must begin with a colon (:) and end in a plus sign (+). The two fields are separated by a colon (:).

If the table name is renamed in the assignment file then an additional step must be taken, e.g.:

[ENTITIES]
EXPDATA.* $UD6:EXP.*

In this case BOTH the table names OEXPDATA, and OEXP must be declared in the overflows file, e.g.:

:OEXPDATA:F2+
:OEXP:F2+

Loading the file

You can now create a form component using the IDF, that displays this record, and compile it. You may need to restart the IDF to set correct UD6_PARAMS and the overflows definitions.

To load the file, it MUST have the extension XML, so our example file is called expdata.xml.

Loading the file is as simple as pressing retrieve.

You can even use retrieve profiles. However you cannot STORE CSV files. If you press Store the file will be converted to XML format (the native format for the UD6/CMtool Driver). The demo driver is limited to retrieving 11 rows of CSV data, additional rows are discarded.

The file may be processed using that form, or you may simply wish to load it into your database. To load the file into your database, you need to use the TRX format as an intermediary step (you can avoid this depending on the other settings in your assignment file, this technique is described for the sake of its simplicity).

If you have not analysed the model, these commands may not work.

$idf /cpy ud6:expdata.* trx:expdata.trx:

Now comment out the line in the assignment file that redirects EXPDATA to the path $UD6.

[ENTITIES]
;EXPDATA.* $UD6:EXPDATA.*

It is now possible to load the trx data to your default path:

$idf /cpy trx:expdata.trx def:

 

National Language Support / Extended characters

Since the UD6/CMtool Driver is designed to store the Uniface repository, it makes certain assumptions about your data. One of these is that LONG fields will be in TRX format. This is a technique that the Uniface repository uses to ensure compatibility across the many diverse systems it runs on.

However, if you are using 'extended characters', or national language support (NLS) in the file you are loading, the format will be incompatible.

The UD6/CMtool Driver can support this, however the options you need to enable will corrupt your source code if it is stored using the UD6/CMtool Driver. If you are not using the UD6/CMtool Driver to store your source code, then there will be no problem.

If you are using the UD6/CMtool Driver to store your source code, you will need to create a separate assignment file for the settings described here, and an IDF shortcut that uses the new assignment file. Using this 'new' assignment file and shortcut for actually running your data conversion and forms etc, and the 'normal one' for defining the model, building the form component etc will ensure your source code is not corrupted.

How to enable extended characters in the variable part of the file (the LONG Fields)

If the extended characters are in fields BEFORE the variable length field, then you need to take NO ACTION. The extended characters will be loaded without difficulty.

If the extended characters are in the LONG field(s) or after the LONG field(s) then you need to define the entity a little differently, and make some additional assignments in the [DRIVERS] section of your assignment file.

Assignment file settings

For the UD6/CMtool driver to load the tab delimited file into this entity definition, some settings must be specified in your assignment file. Please see the online manual for more detailed descriptions of these parameters:

[ENTITIES]
EXPDATA.* $UD6:EXPDATA.*

[DRIVERS]
; These settings are:
; * all on one line
; * not to be used to store your source code!
;
USYS$UD6_PARAMS=triggertag disable, mmf binary, convert csv, convert sep \t, translatechars binary,translate disable

Please ensure that the settings are exactly as described, including the order of the parameters.

Entity definition

Since Uniface will internally store ALL the long fields, and fields after the first LONG field in a single field, it is now necessary to define string identifiers, so Uniface will know where one field starts and the next begins. Field identifiers can only be used with string fields, so any fields after the first LONG field must now be variable length strings.

This process of many fields becoming one, is transparent to you, the user. When the file is loaded into Uniface, you will see the correct information in each field.

Entity name (filename): EXPDATA

Field name Data Type Interface In dB?
PK N N9 Y
F1 S C10 Y
F2 S VC*VI\} # Y
F3 S VC*VI\} $ Y
F4 S VC*VI\} % Y
D4 D D N

The first thing to note is that field identifier is different for each field. These field identifiers must match, exactly, the field number (do not count non-database fields). The identifier is always, backquote, right brace, space, followed by an ASCII character corresponding to the field number:

Field Pos'n ASCII char Field Pos'n ASCII char Field Pos'n ASCII char
1 ! 22 6 43 K
2 \ 23 7 44 L
3 # 24 8 45 M
4 $ 25 9 46 N
5 % 26 : 47 O
6 & 27 ; 48 P
7 ' 28 < 49 Q
8 ( 29 = 50 R
9 ) 30 > 51 S
10 * 31 ? 52 T
11 + 32 @ 53 U
12 , 33 A 54 V
13 - 34 B 55 W
14 . 35 C 56 X
15 / 36 D 57 Y
16 0 37 E 58 Z
17 1 38 F 59 [
18 2 39 G 60 \
19 3 40 H 61 ]
20 4 41 I 62 ^
21 5 42 J 63 _

Loading the extended character set file

You can now create a form component using the IDF, that displays this record, and compile it. You may need to restart the IDF to set correct UD6_PARAMS.

Loading the file is as simple as pressing retrieve.

You can even use retrieve profiles. However you cannot STORE CSV files. If you press Store the file will be converted to XML format (the native format for the UD6/CMtool Driver). The demo driver is limited to retrieving 11 rows of CSV data, additional rows are discarded.

Further Information

How the driver works with Source Code Control
Format of the files
Modifying the XML files directly
Differences between UD6/CMtool and UD7/XML-DATA and W3C XML-DATA format
Division of the entities

UD6 Overview
How to use the driver with a configuration management solution

$Revision: 1.6.4.3 $ $Date: 2003/09/16 17:52:02 $[zum Seitenanfang]