Timestamp:
04/10/12 16:21:31
Author:
Ales Erjavec <ales.erjavec@…>
Branch:
default
Message:

BUGFIX: Fixed 'to_string' method to work even if model statistics are not available.
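
The idea behind this fix can be illustrated with a small standalone sketch. It is not the code from this changeset: `format_coef_table` and `significance_marker` are hypothetical names, and the significance thresholds are an assumption read off the "Signif. codes" legend. The sketch builds one column per statistic, drops every column whose statistic is `None`, and prints the legend only when p-values exist.

```python
def significance_marker(p):
    """Map a p-value to a marker (thresholds assumed from the legend line)."""
    if p < 0.001:
        return "***"
    elif p < 0.01:
        return "**"
    elif p < 0.05:
        return "*"
    elif p < 0.1:
        return "."
    return " "


def format_coef_table(names, coefficients, std_error=None,
                      t_scores=None, p_vals=None):
    """Format a coefficient table, omitting statistics that are None."""
    def fmt_floats(values):
        # Pre-format numbers so every remaining cell is a plain string.
        if values is None:
            return None
        return ["%10.3f" % v for v in values]

    labels = ["Variable", "Coeff Est", "Std Error", "t-value", "p", ""]
    stars = None
    if p_vals is not None:
        stars = [significance_marker(p) for p in p_vals]
    columns = [list(names), fmt_floats(coefficients), fmt_floats(std_error),
               fmt_floats(t_scores), fmt_floats(p_vals), stars]

    # Keep only the (label, column) pairs whose data is available.
    kept = [(label, col) for label, col in zip(labels, columns)
            if col is not None]

    # The name column grows with the longest variable name (minimum 10).
    name_len = max([len(n) for n in names] + [10])
    widths = [name_len] + [10] * (len(kept) - 1)
    if stars is not None:
        widths[-1] = 5  # narrow trailing column for the markers

    def row(cells):
        return " ".join("%*s" % (w, c) for w, c in zip(widths, cells))

    lines = [row([label for label, _ in kept])]
    for i in range(len(names)):
        lines.append(row([col[i] for _, col in kept]))
    if p_vals is not None:
        lines.append("Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1")
    return "\n".join(lines)
```

Calling `format_coef_table(["Intercept", "CRIM"], [36.459, -0.108])` yields a two-column table with no legend instead of raising an error.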

File:
1 edited

--- Orange/regression/linear.py (r10643)
+++ Orange/regression/linear.py (r10777)
@@ -9,8 +9,10 @@
 
 
-`Linear regression <http://en.wikipedia.org/wiki/Linear_regression>`_ is a statistical regression method
-which tries to predict a value of a continuous response (class) variable based on the values of several predictors.
-The model assumes that the response variable is a linear combination of the predictors, the task of linear regression
-is therefore to fit the unknown coefficients.
+`Linear regression <http://en.wikipedia.org/wiki/Linear_regression>`_
+is a statistical regression method which tries to predict a value of
+a continuous response (class) variable based on the values of several
+predictors. The model assumes that the response variable is a linear
+combination of the predictors, the task of linear regression is
+therefore to fit the unknown coefficients.
 
 To fit the regression parameters on housing data set use the following code:
@@ -49,9 +51,9 @@
 ::
 
-    Actual: 24.000, predicted: 30.004
-    Actual: 21.600, predicted: 25.026
-    Actual: 34.700, predicted: 30.568
-    Actual: 33.400, predicted: 28.607
-    Actual: 36.200, predicted: 27.944
+    Actual: 24.00, predicted: 30.00
+    Actual: 21.60, predicted: 25.03
+    Actual: 34.70, predicted: 30.57
+    Actual: 33.40, predicted: 28.61
+    Actual: 36.20, predicted: 27.94
 
 =========================
@@ -70,5 +72,5 @@
 ::
 
-    Variable  Coeff Est  Std Error    t-value          p
+      Variable  Coeff Est  Std Error    t-value          p
      Intercept     36.459      5.103      7.144      0.000   ***
           CRIM     -0.108      0.033     -3.287      0.001    **
@@ -85,6 +87,4 @@
              B      0.009      0.003      3.467      0.001   ***
          LSTAT     -0.525      0.051    -10.347      0.000   ***
-    Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1
-
 
 
@@ -93,13 +93,13 @@
 ===================
 
-To use stepwise regression initialize learner with stepwise=True.
-The upper and lower bound for significance are contolled with
-add_sig and remove_sig.
+To use stepwise regression initialize learner with ``stepwise=True``.
+The upper and lower bound for significance are controlled with
+``add_sig`` and ``remove_sig``.
 
 .. literalinclude:: code/linear-example.py
    :lines: 20-23,25
 
-As you can see from the output the non-significant coefficients
-have been removed from the output
+As you can see from the output, the non-significant coefficients
+have been removed from the model.
 
 ::
@@ -119,6 +119,4 @@
            TAX     -0.012      0.003     -3.493      0.001   ***
     Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1
-
-
 
 """
@@ -467,11 +465,4 @@
 
         """
-        from string import join
-        labels = ('Variable', 'Coeff Est', 'Std Error', 't-value', 'p')
-        lines = [join(['%10s' % l for l in labels], ' ')]
-
-        fmt = "%10s " + join(["%10.3f"] * 4, " ") + " %5s"
-        if not self.p_vals:
-            raise ValueError("Model does not contain model statistics.")
         def get_star(p):
             if p < 0.001: return  "*" * 3
@@ -481,21 +472,50 @@
             else: return " "
 
-        if self.intercept == True:
-            stars =  get_star(self.p_vals[0])
-            lines.append(fmt % ('Intercept', self.coefficients[0],
-                                self.std_error[0], self.t_scores[0],
-                                self.p_vals[0], stars))
-            for i in range(len(self.domain.attributes)):
-                stars = get_star(self.p_vals[i + 1])
-                lines.append(fmt % (self.domain.attributes[i].name,
-                             self.coefficients[i + 1], self.std_error[i + 1],
-                             self.t_scores[i + 1], self.p_vals[i + 1], stars))
+        labels = ("Variable", "Coeff Est", "Std Error", "t-value", "p", "")
+        names = [a.name for a in self.domain.attributes]
+
+        if self.intercept:
+            names = ["Intercept"] + names
+
+        float_fmt = "%10.3f"
+        float_str = float_fmt.__mod__
+
+        coefs = map(float_str, self.coefficients)
+        if self.std_error is not None:
+            std_error = map(float_str, self.std_error)
         else:
-            for i in range(len(self.domain.attributes)):
-                stars = get_star(self.p_vals[i])
-                lines.append(fmt % (self.domain.attributes[i].name,
-                             self.coefficients[i], self.std_error[i],
-                             self.t_scores[i], self.p_vals[i], stars))
-        lines.append("Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1")
+            std_error = None
+
+        if self.t_scores is not None:
+            t_scores = map(float_str, self.t_scores)
+        else:
+            t_scores = None
+
+        if self.p_vals is not None:
+            p_vals = map(float_str, self.p_vals)
+            stars = [get_star(p) for p in self.p_vals]
+        else:
+            p_vals = None
+            stars = None
+
+        columns = [names, coefs, std_error, t_scores, p_vals, stars]
+        labels = [label for label, c in zip(labels, columns) if c is not None]
+        columns = [c for c in columns if c is not None]
+        name_len = max([len(name) for name in names] + [10])
+        fmt_name = "%%%is" % name_len
+        lines = [" ".join([fmt_name % labels[0]] + \
+                          ["%10s" % l for l in labels[1:]])
+                 ]
+
+        if p_vals is not None:
+            fmt = fmt_name + " " + " ".join(["%10s"] * (len(labels) - 2)) + " %5s"
+        else:
+            fmt = fmt_name + " " + " ".join(["%10s"] * (len(labels) - 1))
+
+        for i in range(len(names)):
+            lines.append(fmt % tuple([c[i] for c in columns]))
+
+        if self.p_vals is not None:
+            lines.append("Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1")
         return "\n".join(lines)
 
@@ -599,3 +619,2 @@
     c = LinearRegressionLearner(table)
     print c
-