Question & Answer
Question
How can the Db2 user use Db2 Federation to query the data from flat files such as file.txt, file.csv, and file.xls?
Answer
Based on the Db2 external table, you can use Db2 federation server to make complex queries (JOIN/UNION) from multiple flat files at remote data sources.
Description
Federation can now support on accessing data from various file types (delimited, txt, csv, excel) with JOIN, UNION or INSERT. The matrix below shows the federation supported operations between different types/locations of files:
Data source | Remote external table on txt | Remote external table on csv | Remote external table on xls |
Remote external table on txt | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on csv | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on xls | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Table 1 - Operations between remote external tables
Data source | Local external table on txt | Local external table on csv | Local external table on xls |
Remote external table on txt | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on csv | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on xls | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Table 2 - Operations between remote and local external tables
Data source | Local Db2 table on txt | Local Db2 table on csv | Local Db2 table on xls |
Remote external table on txt | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on csv | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Remote external table on xls | JOIN/UNION/INSERT | JOIN/UNION/INSERT | JOIN/UNION/INSERT |
Table 3 - Operations between remote external table and local Db2 table
Through the three tables above, we have the summary as following:
(1) Db2 external table can be leveraged for Federation to access to different types of flat files like txt, csv, and xls.
(2) Federation user can create Nicknames on either local or remote external tables, which will be treated like local Db2 tables.
(3) Comparing to external table, only federation can give the user the capability to do federated query (JOIN/UNION) across multiple data sources when at least one of the data sources is txt, csv, or xls flat files.
Examples
1) Prepare external table on remote data source, in this case the remote data source is FLOWER.CN.IBM.COM, the Db2 port number is 53001
CONNECT TO SVT_DB
CREATE EXTERNAL TABLE STUDENT1(ID INT, NAME VARCHAR(20)) USING (DATAOBJECT '/HOME/DB2INST3/FLATFILE0416/STUDENT1.TXT')
CREATE EXTERNAL TABLE STUDENT2(ID INT, NAME VARCHAR(20)) USING (DATAOBJECT '/HOME/DB2INST3/FLATFILE0416/STUDENT2.CSV')
INSERT INTO STUDENT1 VALUES(13,'ZHANGSA')
INSERT INTO STUDENT2 VALUES(14,'LISI')
For more external table information, please refer to CREATE EXTERNAL TABLE statement
2) Catalog DB on remote Db2 data source to local Db2
UNCATALOG NODE FLOWER
CATALOG TCPIP NODE FLOWER REMOTE FLOWER.CN.IBM.COM SERVER 53001
UNCATALOG DB REMOTE
CATALOG DB SVT_DB AS REMOTE AT NODE FLOWER
DB2STOP FORCE
DB2START
3) Create federation objects like Wrapper, Server, and Nicknames for the external tables on the remote Db2
CONNECT TO TESTDB
CREATE WRAPPER FENCED_DRDA LIBRARY 'LIBDB2DRDA.SO' OPTIONS(DB2_FENCED 'Y')
CREATE SERVER DRDA1 TYPE DB2/UDB VERSION 9 WRAPPER FENCED_DRDA AUTHORIZATION "DB2INST3" PASSWORD "PASSW0RD" OPTIONS(DBNAME 'REMOTE')
CREATE NICKNAME REMOTE_NK_STUDENT1 FOR DRDA1."DB2INST3"."STUDENT1"
CREATE NICKNAME REMOTE_NK_STUDENT2 FOR DRDA1."DB2INST3"."STUDENT2"
4) Query with join on 2 external tables on the remote Db2 data source
SELECT * FROM REMOTE_NK_STUDENT1 FULL JOIN REMOTE_NK_STUDENT2 ON REMOTE_NK_STUDENT1.ID = REMOTE_NK_STUDENT2.ID
5) Query with union on 2 external tables on the remote Db2 data source
SELECT * FROM REMOTE_NK_STUDENT1 UNION ALL SELECT * FROM REMOTE_NK_STUDENT2
6) Insert
INSERT INTO REMOTE_NK_STUDENT1 SELECT * FROM REMOTE_NK_STUDENT2
Was this topic helpful?
Document Information
Modified date:
20 August 2018
UID
ibm10728355