Outlier Analysis Worker
Processes outlier detection jobs to identify statistical outliers in spatial data.
Overview
The outlier analysis worker identifies features whose values are statistically unusual, using either the z-score method or the MAD (Median Absolute Deviation) method.
Job Type
outlier_analysis
Input Parameters
{
  "dataset_id": 123,
  "value_field": "income",
  "method": "zscore",
  "threshold": 2.0
}
Parameters
dataset_id (required): Source dataset ID
value_field (required): Numeric field to analyze
method (optional): "zscore" or "mad" (default: "zscore")
threshold (optional): Z-score threshold or MAD multiplier (default: 2.0)
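The parameter rules above can be sketched as a small validation step. This is an illustrative Python sketch, not the worker's actual code: the function name normalize_params is hypothetical, and rejecting non-positive thresholds is an assumption that the rules above do not state.

```python
from typing import Any

ALLOWED_METHODS = {"zscore", "mad"}


def normalize_params(params: dict[str, Any]) -> dict[str, Any]:
    """Check required parameters and fill in the documented defaults."""
    for key in ("dataset_id", "value_field"):
        if key not in params:
            raise ValueError(f"missing required parameter: {key}")

    # Documented default: "zscore"
    method = params.get("method", "zscore")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")

    # Documented default: 2.0
    threshold = float(params.get("threshold", 2.0))
    if threshold <= 0:
        # Assumption: a non-positive threshold is treated as invalid.
        raise ValueError("threshold must be positive")

    return {
        "dataset_id": int(params["dataset_id"]),
        "value_field": str(params["value_field"]),
        "method": method,
        "threshold": threshold,
    }
```

Calling normalize_params({"dataset_id": 123, "value_field": "income"}) returns the same parameters with method set to "zscore" and threshold set to 2.0.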
Output
Creates a new dataset with outlier analysis results:
Original features marked as outliers
Outlier score (z-score or MAD score)
Outlier flag
Original attributes preserved
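As a rough illustration of the output shape, the sketch below attaches a score and a flag to each feature while keeping its existing attributes. The field names outlier_score and outlier_flag, the dictionary representation of a feature, and the choice to keep every feature (rather than only the flagged ones) are assumptions made for the example; the real output schema may differ.

```python
def annotate_features(features, scores, threshold=2.0):
    """Return copies of the features with a score and flag added."""
    annotated = []
    for feature, score in zip(features, scores):
        row = dict(feature)  # shallow copy keeps the original attributes
        row["outlier_score"] = score                 # hypothetical field name
        row["outlier_flag"] = abs(score) > threshold  # hypothetical field name
        annotated.append(row)
    return annotated
```

The input features are left unmodified, so the source dataset is not affected by the annotation step.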
Methods
Z-Score Method
Calculates standardized z-scores:
Mean and standard deviation calculated
Z-score = (value - mean) / standard_deviation
Features with |z-score| > threshold are outliers
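The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the worker's implementation: the function name is hypothetical, the population standard deviation (dividing by n) is an assumption since the steps above do not say which variant is used, and returning no outliers when the standard deviation is zero is also an assumption.

```python
import math


def zscore_outliers(values, threshold=2.0):
    """Return a (z_score, is_outlier) pair for each value."""
    n = len(values)
    if n == 0:
        return []

    # First pass: mean and standard deviation.
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

    if std == 0:
        # Assumption: with identical values, nothing is flagged.
        return [(0.0, False) for _ in values]

    # Second pass: score each value and compare against the threshold.
    results = []
    for v in values:
        z = (v - mean) / std
        results.append((z, abs(z) > threshold))
    return results
```

For the values [10, 12, 11, 13, 12, 11, 100] with the default threshold, only the last value is flagged.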
MAD Method
Uses Median Absolute Deviation:
Median calculated
MAD = median(|value - median|)
Modified z-score = 0.6745 * (value - median) / MAD
Features with |modified z-score| > threshold are outliers
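A minimal Python sketch of the MAD steps above follows. The function name is hypothetical. Because the formula divides by MAD, the sketch needs a rule for MAD = 0 (which happens when more than half of the values equal the median); flagging nothing in that case is an assumption, not documented worker behavior.

```python
from statistics import median


def mad_outliers(values, threshold=2.0):
    """Return a (modified_z_score, is_outlier) pair for each value."""
    if not values:
        return []

    med = median(values)
    mad = median([abs(v - med) for v in values])

    if mad == 0:
        # Assumption: avoid division by zero by flagging nothing.
        return [(0.0, False) for _ in values]

    results = []
    for v in values:
        score = 0.6745 * (v - med) / mad
        results.append((score, abs(score) > threshold))
    return results
```

For the values [10, 12, 11, 13, 12, 11, 100], the median is 12 and the MAD is 1, so only the value 100 exceeds the default threshold.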
Example
# Enqueue an outlier analysis job via API
curl -X POST "https://example.com/api/analysis/outlier_run.php" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": 123,
    "value_field": "income",
    "method": "zscore",
    "threshold": 2.0
  }'
Background Jobs
This analysis runs as a background job. The worker:
Fetches queued outlier_analysis jobs
Validates input parameters
Calculates statistics (mean/std or median/MAD)
Identifies outliers
Creates output dataset
Marks job as completed
Performance Considerations
Processing time depends on dataset size
Z-score method requires two passes (mean/std, then scoring)
MAD method is more robust: the median and MAD are less affected by extreme values than the mean and standard deviation
Consider filtering null values before analysis
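One way to filter null values before analysis is sketched below. It assumes features are available as dictionaries keyed by field name, and it also treats missing fields, non-numeric values, NaN and infinities as unusable; both the helper name and those extra rules are assumptions made for the example.

```python
import math


def split_usable(features, value_field):
    """Separate features with a usable numeric value from the rest."""
    usable, skipped = [], []
    for feature in features:
        value = feature.get(value_field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and math.isfinite(value):
            usable.append(feature)
        else:
            # None, missing field, non-numeric, NaN or infinite value
            skipped.append(feature)
    return usable, skipped
```

Only the usable features would then be passed to the statistics step, so a single null does not break the mean or median calculation.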