{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# M-Estimators for Robust Linear Modeling" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from statsmodels.compat import lmap\n", "import numpy as np\n", "from scipy import stats\n", "import matplotlib.pyplot as plt\n", "\n", "import statsmodels.api as sm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* An M-estimator minimizes the objective function\n", "\n", "$$Q(e_i, \\rho) = \\sum_i~\\rho \\left (\\frac{e_i}{s}\\right )$$\n", "\n", "where $\\rho$ is a symmetric function applied to the scaled residuals $e_i / s$.\n", "\n", "* The effect of $\\rho$ is to reduce the influence of outliers.\n", "* $s$ is an estimate of the scale of the residuals.\n", "* The robust estimates $\\hat{\\beta}$ are computed by iteratively reweighted least squares (IRLS)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Several weighting functions are available in `sm.robust.norms`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "norms = sm.robust.norms" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def plot_weights(support, weights_func, xlabels, xticks):\n", "    # Plot weights_func evaluated on support and return the Axes for further styling.\n", "    fig = plt.figure(figsize=(12, 8))\n", "    ax = fig.add_subplot(111)\n", "    ax.plot(support, weights_func(support))\n", "    ax.set_xticks(xticks)\n", "    ax.set_xticklabels(xlabels, fontsize=16)\n", "    ax.set_ylim(-0.1, 1.1)\n", "    return ax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Andrew's Wave" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function weights in module statsmodels.robust.norms:\n", "\n", "weights(self, z)\n", " Andrew's wave weighting function for the IRLS algorithm\n", " \n", " The psi function 
scaled by z\n", " \n", " Parameters\n", " ----------\n", " z : array_like\n", " 1d array\n", " \n", " Returns\n", " -------\n", " weights : ndarray\n", " weights(z) = sin(z/a)/(z/a) for \\|z\\| <= a*pi\n", " \n", " weights(z) = 0 for \\|z\\| > a*pi\n", "\n" ] } ], "source": [ "help(norms.AndrewWave.weights)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": 