Estimation of Error Distribution for Multi-source Data without Ground Truth Data using Modified Appr

Aug 24, 2017

Before this study, one challenge in measuring the accuracy of multi-source data was the need for ground truth (baseline) data, because the accuracy of each data source is defined as the difference between the truth and that source's measurements. Identifying a ground truth source is itself a challenge, since verifying its accuracy requires an even more accurate baseline. This study proposes a methodology that estimates the error distribution of each data source by aggregating measurements from multiple sources. Approximate Bayesian Computation was adopted and modified to construct the error distributions through simulation. In a simulated experiment, the proposed model outperformed the conventional alternative, which evaluates a data source by gathering error information against benchmark data. A sensitivity analysis explores how model performance varies with sample size, number of data sources, and distribution type. The proposed model is limited to a one-dimensional variable and assumes independence between data sources, but the basic approach might be readily extended to other applications.
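To make the idea of estimating error distributions without ground truth more concrete, here is a minimal sketch of plain rejection-style Approximate Bayesian Computation. It is not the modified procedure from the study, whose details are not given here. It assumes zero-mean Gaussian errors, independent sources, a uniform prior on each source's error standard deviation, and a summary statistic built from pairwise differences between sources (the unknown truth cancels out of those differences). The names `pairwise_diff_std` and `abc_rejection`, and all parameter values, are hypothetical choices for illustration.

```python
import numpy as np


def pairwise_diff_std(measurements):
    """Summary statistic: std of the difference for each pair of sources.

    measurements has shape (n_samples, n_sources). The true value cancels
    in each difference, so no ground truth is needed.
    """
    n_sources = measurements.shape[1]
    return np.array([
        np.std(measurements[:, i] - measurements[:, j])
        for i in range(n_sources)
        for j in range(i + 1, n_sources)
    ])


def abc_rejection(observed, n_draws=10000, keep=100, sigma_max=5.0, seed=0):
    """Return the `keep` prior draws whose simulated summaries are closest
    to the observed summaries (assumed: zero-mean Gaussian errors)."""
    rng = np.random.default_rng(seed)
    n_samples, n_sources = observed.shape
    target = pairwise_diff_std(observed)

    # Assumed prior: each source's error std is uniform on [0, sigma_max].
    sigmas = rng.uniform(0.0, sigma_max, size=(n_draws, n_sources))
    distances = np.empty(n_draws)
    for k in range(n_draws):
        # Simulate errors only; the truth would cancel in the differences.
        errors = rng.normal(0.0, sigmas[k], size=(n_samples, n_sources))
        distances[k] = np.linalg.norm(pairwise_diff_std(errors) - target)

    # Keep the draws with the smallest distance as an approximate posterior.
    return sigmas[np.argsort(distances)[:keep]]


# Synthetic demonstration: three sources measuring the same hidden quantity.
rng = np.random.default_rng(42)
true_sigmas = np.array([0.5, 1.0, 2.0])   # hypothetical error stds
truth = rng.uniform(40.0, 60.0, size=200)  # hidden from the estimator
observed = truth[:, None] + rng.normal(0.0, true_sigmas, size=(200, 3))

accepted = abc_rejection(observed)
posterior_mean = accepted.mean(axis=0)
print("approximate posterior mean of error std per source:", posterior_mean)
```

With at least three independent sources, the pairwise differences carry enough information to separate the individual error spreads, which is why the sketch uses three. The accepted draws give only a coarse approximation; a tighter acceptance rule or more draws would sharpen it at the cost of run time.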

ITS

Intelligent Transportation Scientist
