Unable to ingest .csv files in real time

Created by Victor Diaz on Thu, 08/22/2019 - 11:03
Published URL:
https://www.ibm.com/support/pages/node/1071628
1071628

Troubleshooting


Problem

The source files arrive every 5 minutes, and the PI latency is set to 5 minutes. The .csv files are ingested by PI during historic data ingestion, but ingestion fails in real time with an error saying that the time stamp is set in the future. The data is in the UTC timezone, and the model is also in the UTC timezone. The analytics server is in the CEST timezone.

Symptom

The data-source log file, from 'Prometheus' in this case, shows:
2019-08-11 22:49:00,001 ERROR [FileSourceReader]  ERR_COULD_NOT_PROCESS: Could not parse/post-process data entry, null or invalid format of a value(s); total lines failed to process: 1.
com.ibm.tivoli.netcool.pa.mediation.MediationException: ERR_COULD_NOT_PROCESS
    at com.ibm.tivoli.netcool.pa.mediation.source.file.FileSourceReader.processFile(FileSourceReader.java:671)
    at com.ibm.tivoli.netcool.pa.mediation.source.file.FileSourceReader.read(FileSourceReader.java:391)
    at com.ibm.tivoli.netcool.pa.mediation.source.DataReader.executeMe(DataReader.java:156)
    at com.ibm.tivoli.netcool.pa.mediation.RunnableStage.run(RunnableStage.java:104)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
    at java.util.concurrent.FutureTask.run(FutureTask.java:277)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:785)
    at com.ibm.streams.operator.internal.runtime.OperatorThreadFactory$2.run(OperatorThreadFactory.java:137)
Caused by: com.ibm.tivoli.netcool.pa.processing.FileProcessingException: ERR_INVALID_FUTURE_TS
    at com.ibm.tivoli.netcool.pa.mediation.source.file.FileSourceReader.processFile(FileSourceReader.java:631)
    ... 9 more

Cause

The .csv file contains timestamps from a period ahead of the interval in its file name. For example, if the file name covers 12:35 to 12:40, the file content can range from 12:35:00 to 12:39:59.9999999.
This is explained in the documentation:
 
The end time is optional and can be at or after the last time stamp in the file. If the end time in the file name is at the same time as the start of the next interval, this end time must be after the latest time stamp in the file. For example if the end time of 10:05 is at the start of the next interval, the latest time stamp in the file can be 10:04:59.
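
The rule quoted above can be written as a small check. The following Python sketch is illustrative only: the function name and its parameters are hypothetical and are not part of Predictive Insights.

```python
from datetime import datetime
from typing import Optional


def end_time_is_valid(latest_ts: datetime,
                      end_time: datetime,
                      next_interval_start: Optional[datetime] = None) -> bool:
    """Return True if end_time satisfies the documented rule for latest_ts.

    Hypothetical helper: all names here are assumptions for illustration.
    """
    if next_interval_start is not None and end_time == next_interval_start:
        # The end time coincides with the start of the next interval,
        # so it must be strictly after the latest time stamp in the file.
        return end_time > latest_ts
    # Otherwise the end time may be at or after the last time stamp.
    return end_time >= latest_ts
```

With an end time of 10:05 that is also the start of the next interval, a latest time stamp of 10:04:59 passes the check, while a latest time stamp of 10:05:00 does not.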

Diagnosing The Problem

Confirm that the timestamps in each .csv file fall within the interval designated by its file name.
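
One way to perform this check outside the product is a short script. The following is a minimal sketch that makes several assumptions: the column name `timestamp`, the time stamp format, and the function name are hypothetical and must be adapted to the actual files. The interval is treated as half-open (start inclusive, end exclusive), which assumes the end time in the file name is also the start of the next interval.

```python
import csv
import io
from datetime import datetime


def find_out_of_interval_rows(csv_text, start, end,
                              ts_column="timestamp",
                              ts_format="%Y-%m-%d %H:%M:%S"):
    """Return (line_number, raw_timestamp) for rows outside [start, end).

    ts_column and ts_format are assumptions; adjust them to the real file.
    """
    bad_rows = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        raw = row[ts_column]
        ts = datetime.strptime(raw, ts_format)
        # A row is valid only if start <= ts < end.
        if not (start <= ts < end):
            bad_rows.append((reader.line_num, raw))
    return bad_rows


# Example: the interval from the file name is 12:35 to 12:40.
sample = (
    "timestamp,value\n"
    "2019-08-11 12:35:00,1\n"
    "2019-08-11 12:39:59,2\n"
    "2019-08-11 12:40:00,3\n"
)
result = find_out_of_interval_rows(
    sample,
    start=datetime(2019, 8, 11, 12, 35),
    end=datetime(2019, 8, 11, 12, 40),
)
print(result)
```

Any row reported by the script has a time stamp that belongs to a different interval than the one in the file name. To check a file on disk, read its content into a string first and pass the start and end times taken from the file name.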

Resolving The Problem

Adjust the content of the .csv files so that the timestamps they contain match the intervals in their file names.

Document Location

Worldwide

[{"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Product":{"code":"SSJQQ3","label":"IBM Operations Analytics - Predictive Insights"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB45","label":"Automation"}}]

Historical Number

TS002599646

Document Information

Modified date:
22 August 2019

UID

ibm11071628