• Conference
  • CESI - Hors LINEACT

Conference: Paper with proceedings at an international conference

Data cleaning is one of the most important tasks in data science and machine learning. It mitigates many problems in datasets, such as high time complexity and added noise. In large datasets, outliers are extreme values that deviate from the overall pattern of a sample. They usually indicate variability in measurements or experimental errors. Depending on whether the variable is numeric or categorical, different techniques can be used to study its distribution and detect outliers, such as histograms, box plots, and the Z-score. This work aims to develop a model-based method to detect undesirable points in a 3D point cloud representing a building. Our proposed method filters outliers using the Z-score, which is well known in statistics as the standard score. The idea behind this concept is to determine whether a data value is above or below the average, and by how much. More specifically, the Z-score indicates how many standard deviations a data point lies from the mean.
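To make the Z-score idea concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. For a value x with mean μ and standard deviation σ, the Z-score is z = (x - μ) / σ. The abstract does not say which quantity the score is computed on, so this sketch assumes it is applied independently to each coordinate axis of the point cloud. The function name `zscore_filter`, the cutoff of 3 standard deviations, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def zscore_filter(points, threshold=3.0):
    """Keep points whose per-axis Z-score magnitude is below `threshold`.

    Assumption: the score is computed per coordinate axis (x, y, z).
    Returns the filtered points and the boolean inlier mask.
    """
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)           # per-axis mean
    std = points.std(axis=0)             # per-axis population standard deviation
    std = np.where(std == 0, 1.0, std)   # avoid division by zero on a flat axis
    z = np.abs((points - mean) / std)    # |z| for every coordinate
    mask = (z < threshold).all(axis=1)   # inlier only if every axis passes
    return points[mask], mask


# Synthetic example (hypothetical data): a dense cluster plus one far-away point.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(size=(1000, 3)), [[50.0, 50.0, 50.0]]])
inliers, mask = zscore_filter(cloud)
print(f"kept {len(inliers)} of {len(cloud)} points")
```

Because the mean and standard deviation are themselves pulled toward extreme values, a single global pass like this can miss outliers in heavily contaminated clouds. Computing the score on local neighborhoods or on residuals to a fitted model is a common alternative, though whether the paper does so is not stated in the abstract.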