Data Cleaner

Ecommerce websites are affected by bots, fraud, tests, repeated orders, and atypically large orders. This data can distort analysis and machine learning models. To address these challenges, DataMilk has developed outlier detection algorithms that filter out the outliers that cause issues for analytics systems.

An outlier request can be sent to determine whether the current session or anonymous shopper falls outside typical expected behavior or is a bot. It can also be used to check whether a purchase is classified as an outlier, which indicates that it should be excluded from statistical analysis for the given reason.

Configuration

Default API Configuration applies. There are no additional API Configuration properties.

cleanData

Note that when the SDK is used from a browser, it automatically sends the current context information about the user and page, including the session they are currently in.

export enum CleaningReason {
  Bot = 'Bot',
  DuplicateOrder = 'DuplicateOrder',
  ExtremeValue = 'ExtremeValue',
  UnusualUsagePattern = 'UnusualUsagePattern',
  InvalidOrderValue = 'InvalidOrderValue',
}

export interface OrderInfo {
  amount: number;
  currency: string;
  id: string;
}

// Context information is sent automatically about the current user session and URL.
export interface DataCleanerRequest extends DmRequest {
  order?: OrderInfo;
}

export interface DataCleanerResponse extends DmResponse {
  dirty: boolean;

  // If dirty, the reason.
  reason?: CleaningReason;
}
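
As an illustration of how the response fields above might be consumed, the following is a minimal sketch that tallies a batch of responses by cleaning reason. It deliberately does not call the SDK: the `summarize` helper, the `CleaningSummary` type, and the `'Unspecified'` bucket are hypothetical names introduced only for this example, and the types are re-declared locally without the `DmRequest`/`DmResponse` base interfaces, whose fields are not shown on this page.

```typescript
// Local copies of the types documented above, so the sketch is self-contained.
// Assumption: the DmResponse base interface is omitted because its fields
// are not listed on this page.
enum CleaningReason {
  Bot = 'Bot',
  DuplicateOrder = 'DuplicateOrder',
  ExtremeValue = 'ExtremeValue',
  UnusualUsagePattern = 'UnusualUsagePattern',
  InvalidOrderValue = 'InvalidOrderValue',
}

interface DataCleanerResponse {
  dirty: boolean;
  // If dirty, the reason.
  reason?: CleaningReason;
}

// Hypothetical summary shape, not part of the SDK.
interface CleaningSummary {
  clean: number;
  dirtyByReason: Record<string, number>;
}

// Hypothetical helper: counts clean responses and groups dirty ones by reason.
// Dirty responses without a reason are counted under 'Unspecified'.
function summarize(responses: DataCleanerResponse[]): CleaningSummary {
  const summary: CleaningSummary = { clean: 0, dirtyByReason: {} };
  for (const response of responses) {
    if (!response.dirty) {
      summary.clean += 1;
      continue;
    }
    const key: string = response.reason ?? 'Unspecified';
    summary.dirtyByReason[key] = (summary.dirtyByReason[key] ?? 0) + 1;
  }
  return summary;
}
```

A caller could pass the responses collected for a set of sessions or orders to `summarize` and keep only the records whose response has `dirty` set to `false` when feeding an analytics pipeline.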