ETL Process configuration
One of the main features of Spago4Q is the capability to do ETL operations.
ETL operations are:
- Extract data from outside sources
- Transforming it to fit operation needs
- Loading it into the end target
They can be configure and realize following these steps:
- Configure the data source
- Configure and create the target database
- Define the extraction and transformation process
We assume that you login as Spago4Q administrator pointing the browser to:
Login as:
| User | Password |
|---|
| s4qadmin | s4qadmin |
The Main Page is presented, from the "Main Menu" select the "Extractors" item and than click on :
- Source Types to create and configure the data source
- Extraction process to create and configure the extraction process
- Data Interfaces to create an interface
- Scripts to create a transformation scripts
- Import Export to import and export configurations from a Spago4Q installation to another one
The section exlplains how to configure the data source. The data source is the place where will be performed the extraction. To create the data source you have to create and configure:
- The source type: where to specify the type of data source (XML, Data Base, etc.) and configure the engine (Java class) that extract the data.
- The data source.
- The data source parameters: specify the parameter to access the data source.
In this guide, as example, a Data Base data source will be created and configured.
Create a Source Type
From the "Main Menu" select the "Extractors" item and than click on the Source Types menu item and you will enter into the list of existing Source Types.
In order to create a new Source Type click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Source Type and it identifies it.
- Description: describe the Source Type
- Extractor class: it is the name of the Java class that perform the extraction. In this version you can use the following extractor classes:
| Extractor class | it.eng.spago4q.extractors.db.DbExtractor | This class extracts data from a Database using a SQL-92 query |
| Data Source parameters | USER | The username that has the access to the Database |
| | PASSWORD | The password of the user |
| | URL | The url that identifies the Database |
| | DRIVER | The driver to perform the queries |
| Operation Parameters | QUERY | The query to do when spago4Q execute an extraction |
Enter the required information and click on the "Save" button

. Than click on the "Back" button

and you will see the new Source Type just inserted in Source Types list.
Create a Data Source
In Spago4Q a Data Source is an instance of a Source Type with its own parameter. To create a Data Source you have to perform these operations:
- Click on the menu item "Source Types" and you will enter into the list of existing Source Types.
- Click on the "View Data Sources" button
and you will enter into the list of existing Data Sources that use a specific Source Type.

In order to create a new Data Source click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Data Source and it identifies it.
- Description: Describe the Data Source.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Data Source just inserted in Data Sources list.
Create Data Source parameters
To perform an extraction the Extraction class (that is a field of a Source Type) needs same parameters that describes how the extractor can connect to the Data Source. A Data Source is an instance of a Source Type with its own parameters. To specify the parameters of a Data Source you have to do these operations:
- Click on the menu item "Source Types" and you will enter into the list of existing Source Types.
- Click on the "View Data Sources" button
and you will enter into the list of existing Data Sources that use a specific Source Type.
- Click on the "View Data Sources Parameters" button
and you will enter into the list of existing Data Source Parameters of a specific Data Source.

In order to create a new Data Source Parameter click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Data Source Parameter and it identifies it.
- Value: is the value of the field.
- Cript: if it is selected, when you click on the "Save" button
, the value will be stored and encoded in Spago4Q platform.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Data Source Parameter just inserted in Data Source Parameters list.
To configure a Data Source that implements a Source Type that has it.eng.spago4q.extractors.db.DbExtractor as extractor class you have to insert the following parameters:
The section explains how to configure the Data Interfaces that are the data structure (fact tables) inside the target database that are used to store the extracted data. To create the fact table for a Data Interface you have to follow these steps:
- Create a Data Interface where you define the name of the fact table.
- Create the Data Interface field. An interface field represents a column of the fact table.
This guide describes how to create an Interface type that represent this table:
| FT_PROJECT_MAN |
| Field |
Type |
Key |
| PRJ |
VARCHAR |
YES |
| AC |
DOUBLE |
NO |
| PV |
DOUBLE |
NO |
| EV |
DOUBLE |
NO |
Create an Interface Type
An Interface Type represent a fact table that will be create in the database. This table will be populated with the data extracted from the extraction process. To create an Interface Type you have to do these operations:
- from the "Main Menu" select the "Extractors" item and than Click on the menu item "Interfaces" and you will enter into the list of existing Interface Types.
In order to create a new Interface Type click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Interface Type and it identifies it.
- Description: describe the interface type
- Table Name: it is the name of the fact table to create. When you create a fact table, a prefix like "FT_" will be automatically added.
In the example explained in the figure a fact table named "FT_PROJECT_MAN" will be create.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Interface Type just inserted in Interface Type list.
Create interface fields
To replicate the table explained before in the example you have to add to the Interface Type these fields:
| PRJ |
STRING |
| AC |
DOUBLE |
| PV |
DOUBLE |
| EV |
DOUBLE |
To Associate same field to an Interface type you have to do these operations:
- Click on the menu item "Interfaces" and you will enter into the list of existing Interface Types.
- Select the interface type that you want to add the new fields (in the example select "PROJECT_MANAGMENT") and click the "View Interface Fields" button
and you will enter into the list of Interface Fields of a specific Interface Type.
In order to create a new Interface field click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:
In this form you can insert these fields:
- Name (mandatory): it is the label of the Interface Field and it identifies it. It represents the name of a column of the fact table specify by the Interface Type.
- Field Type: it is the type of the column that will be created in the fact table. The type that you can select are:
- String
- Integer
- Double
- Boolean (0,1)
- Date (Specify in the format of the Database where the fact table will be created)
- Key: specify if this column identify a row of the Data base
- Sensible: If it is checked if an extraction produce two rows with the same values as key and different values as sensible field produce the insertion of two rows. If an extraction of two rows with the same values as key and same value as sensible field produce the insertion of only one row.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Interface field just inserted in Interface fields list.
If you want to replicate the example explained before you have to insert the following interface field:
| Name |
Field Type |
Sensible |
Key |
| PRJ |
STRING |
false |
true |
| AC |
DOUBLE |
true |
false |
| PV |
DOUBLE |
true |
false |
| EV |
DOUBLE |
true |
false |
Domains value
Sometime an extraction can produce a column of a finite set of values. In the following example the extraction produced this data:
| ID |
proposal |
approved/rejected |
| 1 |
proposal1 |
rejected |
| 2 |
proposal2 |
approved |
| 3 |
proposal3 |
approved |
The column "approved/rejected" can have only this set of value : {approved, rejected}.
To optimize the quantity of data that will store from an extraction, Spago4Q has the capability to define a Set of value that can be repeated in an extraction. When an extraction will be performed Spago4Q replace this value with the identifier of the value that has be specified in the Domain Value.
To associate a Domain Value to an interface field you have to do this operations:
- Click on the menu item "Interfaces" and you will enter into the list of existing Interface Types.
- Select the interface type and click the "View Interface Fields" button
and you will enter into the list of Interface Fields of a specific Interface Type.

Select the interface field that you want to associated a Domain Value and click the "View Interface Fields" button

and you will enter into the list of Interface Fields of a specific Interface Type.

In order to create a new Domain Value click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:
- In this form you can insert this field:
- Value: identify one of the value in the set.

Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Domain Value just inserted in Interface fields list. To complete the example insert two Domain Value:
When you execute an extraction that produce the value show before, the column "approved/rejected", if it is associated with an interface field that has the domain value just inserted will be produce this result:
| ID |
proposal |
approved/rejected |
| 1 |
proposal1 |
{ID Domain Value rejected} |
| 2 |
proposal2 |
{ID Domain Value approved} |
| 3 |
proposal3 |
{ID Domain Value approved} |
Where {ID Domain Value rejected/approved} will be the identifier of the data set with the value rejected/approved.
Create the fact table
After you defined the Interface Type and the Interface Fields that represent the Fact Table you can create the fact table in the Data Base. To create the Fact Table in the Data Base you have to do these operations:
- Click on the menu item "Interfaces" and you will enter into the list of existing Interface Types.
- Click the "Select" button
of the interface type that represents the fact table that you want to create and you will enter in a form like the following one:

To create the fact table click the "Create Fact Table" button

. If you check in the database you will find a table named as "FT_"+ the name specify in the "Table Name" field of the Interface Type with the field specify in the Interface Fields associated of this Interface Type.

If you want to add or modify the fields of an interface type you have to delete the fact table and recreate it. To Delete a fact table click the "Drop Fact Table" button

(all the data in the fact table will be erased).
An extraction process is a set of operations. Every operation extracts, transforms, and loads data from a data source to a fact table. To define an extraction process you have to do this operations:
- Define an extraction process.
- Define and associate same operations to the extraction process.
- Define and associate same parameter to an operation.
- Map the data extracted from the data source to the column of the fact table.
- Execute the extraction process.
To create an Extraction Process you have to do these operations:
- from the "Main Menu" select the "Extractors" item and than click on the menu item "Extraction Process" and you will enter into the list of existing extraction processes.
Click on the "View Operations" button

and you will enter into the list of existing Operations associated to a specific Extraction Process:

In order to create a new Extraction Process click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Extraction Process and it identifies it.
- Description: describe the Extraction Process
- Coordinator class: Specify the Java Class that performs the extraction operations. By default is it.eng.spago4q.extractors.DefaultCoordinator.
- Periodicity: In this field you can select the frequency when the extraction process will be executed. If you select a periodicity automatically the extraction process will be scheduled. If you want to remove the extraction process from the scheduler set this field to empty.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Extraction Process just inserted in Extraction Processes list.
Define and associate same operations to the extraction process
An Extraction Process will be executed same operations. In an operation you can specify:
- What you can extract from a Data source
- How to map and transform the data extracted from a Data Source in an Interface Type (Fact table)
This guide explains how to create an Operation that perform a query in a Data Base (defined in a Data Source) and map the value extracted to the fact table (defined in a Interface Type). The first thing to do is create and associate an operation to an Extraction process. To create and associate an Operation to an extraction process you have to follow these steps:
- click on the menu item "Extraction Process" and enter into the list of existing extraction processes.
- click on the to enter the operations list for that extraction process

In order to create a new operation click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert this fields:
- Name (mandatory): it is the label of the Operation and it identifies it.
- Description: describe the operation.
- Data Source(mandatory): Using this combo you can associate a Data Source to the Operation.
- Interface Type (mandatory): Using this combo you can associate an Interface type to the Operation.
In this example the operation named PRJ_MANAGMENT_OPERATION will extract from the S4Q_DS_DATA_BASE and put the data to the fact table described from the PROJECT_MANAGMENT Interface Type.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Operation just inserted in Operations list.
Define and associate same parameters to an operation
To define what the operation has to extract you have to define same parameters that describe what the extractor class (Defined in Source Type and execute from the operation) has to return. The extractor class described in this guide in section X needs this operation parameters:
| Extractor class |
it.eng.spago4q.extractors.db.DbExtractor |
This class extracts data from a Database using a SQL-92 query |
| Operation Parameters |
QUERY |
The query to do when spago4Q execute an extraction |
To create an Operation Parameter you have to do these operations:
- Click on the menu item "Extraction Process" and enter into the list of existing extraction processes.
- Click on the "View Operations" button
and you will enter into the list of existing Operations associated to a specific Extraction Process.
- Click on the "View Operation Parameters" button
and you will enter into the list of existing Operation Parameters associated to a specific Operation.

In order to create a new operation parameter click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:
In this form you can insert the fields:
- Name (mandatory): it is the label of the Operation Parameter and it identifies it. This name will be defined by the documentation of the extractor class. (in this example the name will be QUERY)
- Value: specify the value of the operation parameter.
- Cript: if it is checked the data will be saved and cripted in the Data Base.
In this example the parameter define a query that select all the fields from a table named DT_PROJECT_MAN in the Database defined by the Data Source named S4Q_DS_DATA_BASE.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Operation Parameter just inserted in Operation Parameters list.
Test Operation
TODO
Map the data extracted from the data source to the column of the fact table
When an operation is executed it produced same data that will be insert in the fact table defined in the Interface Type. To Map the data extracted to the field of an Interface Type you have to do these operations:
- Click on the menu item "Extraction Process" and enter into the list of existing extraction processes.
- Click on the "View Operations" button
and you will enter into the list of existing Operations associated to a specific Extraction Process.
- Click on the "View Operation Field" button
and you will enter into the list of existing Operation Field associated to a specific Operation.

In order to create a new operation field click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the field extracted.
- Script: The combo defines a script that has the ability to transform a data extract from the data source before the insertion in the fact table.
- Interface Field: it is the field of the Interface Type (the column of a Fact Table) where to put the data extracted from the operation.
Enter the required information and click on the "Save" button

. Than click the "Back" button

and you will see the new Operation Field just inserted in Operation Parameters list.
Sometime the data produced from the extraction process are incompatible with the format of the fact table. Sometime the data produced from the extraction process has to be transform before the insertion in the fact table. To perform these operations Spago4q has the capability to define a script and associated it to an operation field (as explained before). To create a script you have to do these operations:
*from the "Main Menu" select the "Extractors" item and than click on the menu item "Script" and enter into the list of existing Script.
In order to create a new Script click on the "Insert" button

in the top right corner of the page and you will enter in a form like the following one:

In this form you can insert the fields:
- Name (mandatory): it is the label of the Script.
- Description: it describes the script
- Script Type: The combo defines the type of the script:
- Groovy: if you select this value you will insert a groovy script in the "Script" field. The groovy script will be implemented the "it.eng.spago4q.extractors.script.IMappingScript" interface.
- Date Converter: If you select this value you will insert in the "Script" field the date format of the value extracted. This type of script is used to convert a date in the format compatible with the one used by the fact table.
For example if you want to convert a boolean value to 1 if the value is true and 0 otherwise you have to insert this value in the form:
| name |
BooleanConverter |
| description |
Convert true to 1 0 oderwise |
| Script type |
Groovy |
| Script |
import it.eng.spago4q.extractors.script.IMappingScript;
public class Tester implements IMappingScript {
public String execute(String value) {
String toReturn = "0";
if (value == "true")
toReturn = "1";
return toReturn;
}
Object o = new Tester();
|
If you want to test the script you can click on "Save and Test" button

. When you click this button should be appear a pop up like this

Insert an input value (like "true") and click on the "OK" button. You should see at the bottom of the form this information:

If the script produces the result that you expected click on the "Save" button

. Than click the "Back" button

and you will see the list of the scripts.
you
When you've finished the configuration of an extraction process (Data Sources, Interface Type and Operations), before the execution of the extraction process, you can check the configuration using a report that summarize all the configuration objects and parameters. To generate this report you have to do these operations:
Click on the menu item "Extraction Process" and enter into the list of existing extraction processes.

Click on the "Detail Process Report" button

and you will see a pdf report that summarizes all the configuration of the extraction process.
From the "Main Menu" select the "Extractors" item and than "Import Export"
Main Menu -> Extractors -> Import Export
- On the left side of the page insert a name for the package (Es: extractionConfiguration)
- flag the operation(s) name you want to export
- click on the image (up-right of the export frame)
- save the package