• StatisCLAS: Methods for Statistical Classification

    • Help file: .pdf
    • Code (the help file is included) in different compression formats:
      .tar.gz      .rar      .zip

    Last update: 29th May 2013

    What Is New:

    • Two previous packages, codeDFMforFD and codeRFDforTS, have been merged, and the code has been modified to make it easier to maintain and reuse: the six main scripts of the new package are different but share many lines.
    • The principal steps or tasks of the algorithms are identified by a number, so that only those parts (e.g. the functional distance or the transformation) need to be changed, or new methods can be designed. Each method is determined by a few numbers and a name.
    • The value of a parameter (the number of blocks for time series, or the differentiation order for curves) can be optimized in each run. In this case, a measure of the minimizing power is shown for each value.
    • When there is more than one method, the most appropriate one can be selected automatically in each run. In this case, a measure of the minimizing power is shown for each method.
    • For given samples it is possible, in the learning scheme determined by the previous method selection, to select the most appropriate data transformation, distance, type of discriminant vector, multivariate classification submethod, et cetera. Some of these ideas were outlined in my doctoral thesis.
    • Theoretical explanations have been included to show how the methodologies and scripts work, e.g. effects on the quality of the estimates, or when a larger number of runs is necessary.
    • Some code has been written to reduce the computational effort (what is described in the two previous items can also be used for this purpose) or to run checks and warn the user.
    • A new kind of script has been added to obtain labels instead of error rates. The user can classify new data by using different types of call.
    • The preprogrammed stochastic models, for processes or functions, have been generalized by using coefficients.
    • Finally, several simulation exercises have been included as examples and to illustrate some concepts.
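
The per-run parameter optimization mentioned in the list above can be illustrated with a small sketch. It is not the package's code: it is written in Python, all function names are hypothetical, a nearest-centroid rule stands in for the classification method, and a leave-one-out error rate stands in for the measure shown for each candidate value. It only shows the general idea of trying several differentiation orders for curves and keeping the one with the smallest estimated error.

```python
from math import sqrt


def differentiate(curve, order):
    """Apply finite differences `order` times (order 0 returns a copy)."""
    for _ in range(order):
        curve = [b - a for a, b in zip(curve, curve[1:])]
    return list(curve)


def centroid(curves):
    """Pointwise mean of a list of equally long curves."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def distance(u, v):
    """Euclidean distance between two discretized curves."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def loo_error_rate(curves, labels, order):
    """Leave-one-out error rate of a nearest-centroid rule (an assumed
    stand-in classifier) applied to the differentiated curves."""
    data = [differentiate(c, order) for c in curves]
    errors = 0
    for i, (x, y) in enumerate(zip(data, labels)):
        rest = [(c, l) for j, (c, l) in enumerate(zip(data, labels)) if j != i]
        centroids = {
            lab: centroid([c for c, l in rest if l == lab])
            for lab in sorted({l for _, l in rest})
        }
        predicted = min(centroids, key=lambda lab: distance(x, centroids[lab]))
        errors += predicted != y
    return errors / len(data)


def select_order(curves, labels, candidates):
    """Return the candidate order with the smallest estimated error,
    together with the error rate obtained for every candidate."""
    rates = {d: loo_error_rate(curves, labels, d) for d in candidates}
    best = min(rates, key=rates.get)
    return best, rates


# Toy data (invented for illustration): the two groups differ in slope,
# while vertical offsets hide that difference in the raw curves.
offsets = [0, 10, 20]
group_a = [[o + 1 * t for t in range(5)] for o in offsets]
group_b = [[o + 2 * t for t in range(5)] for o in offsets]
curves = group_a + group_b
labels = ["A"] * 3 + ["B"] * 3

best, rates = select_order(curves, labels, candidates=[0, 1, 2])
for d, rate in rates.items():
    print(f"order {d}: estimated error rate {rate:.2f}")
print("selected order:", best)
```

The same loop could run over methods instead of parameter values, which is the idea behind automatic method selection; how the package itself measures and compares candidates is described in its help file.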

  • I have collaborated on the R code of some bioinformatics tools for the Spanish National Cancer Research Center (CNIO).

Other works

David Casado de Lucas

Last update: November 2014