Why scikit-learn, NumPy, and pandas

scikit-learn provides the machine learning algorithms.
scikit-learn works on NumPy n-dimensional arrays (ndarray) and NumPy data types.
pandas provides the DataFrame and an R-like syntax for working with tabular data.

So the workflow is: read the data, load it into a pandas DataFrame, convert it to NumPy arrays, and then leverage scikit-learn.
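A minimal sketch of that read, pandas, NumPy, scikit-learn workflow. The CSV content, the column names (x1, x2, y) and the choice of LinearRegression are assumptions made up for illustration; any estimator would slot into the last step the same way.

```python
import io

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical CSV content (assumption); normally this would be a file path.
# The target was constructed as y = x1 + 2 * x2.
CSV_TEXT = """x1,x2,y
1.0,2.0,5.0
2.0,0.0,2.0
3.0,1.0,5.0
4.0,3.0,10.0
"""

# 1. Read the data into a pandas DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# 2. Convert the feature and target columns to NumPy arrays.
X = df[["x1", "x2"]].to_numpy()
y = df["y"].to_numpy()

# 3. Hand the arrays to a scikit-learn estimator.
model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)
```

scikit-learn estimators generally also accept a DataFrame directly, so step 2 is shown explicitly here mainly to make the hand-off visible.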

https://www.datascienceretreat.com/courses/numpy-scipy-pandas-scikit-learn?locale=en

NumPy and SciPy took Python from a general-purpose programming language to a very powerful matrix-oriented one. pandas brought data frames to Python, and data frames are one of the core concepts in modern data analysis. Building on top of these data structures, scikit-learn brought strong implementations of best-of-breed algorithms, all behind a standardized API. Nowadays, Python is a language of choice for many data scientists.

Preprocessing with Pandas

    Reading data
    Selecting columns and rows
    Filtering
    Vectorized string operations
    Missing values
    Handling time
    Time series
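The pandas topics above can be sketched on a tiny, made-up table. The CSV content and the column names (date, city, temp) are assumptions for illustration only, and filling missing values with the column mean is just one possible strategy.

```python
import io

import pandas as pd

# Hypothetical data (assumption): messy city names and one missing temperature.
CSV_TEXT = """date,city,temp
2021-01-01, berlin ,1.5
2021-01-02,Berlin,
2021-01-03,PARIS,4.0
2021-01-04,paris,6.0
"""

# Reading data, parsing the date column as datetimes.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# Vectorized string operations: trim whitespace and lowercase every value.
df["city"] = df["city"].str.strip().str.lower()

# Missing values: fill the gap with the mean of the observed temperatures.
df["temp"] = df["temp"].fillna(df["temp"].mean())

# Handling time: derive a new column from the datetime accessor.
df["weekday"] = df["date"].dt.day_name()

# Selecting columns and filtering rows in one .loc call.
paris = df.loc[df["city"] == "paris", ["date", "temp"]]

# Time series: index by date and resample into 2-day bins.
two_day_mean = df.set_index("date")["temp"].resample("2D").mean()

print(paris)
print(two_day_mean)
```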

NumPy, SciPy

    Arrays
    Indexing, Slicing, and Iterating
    Reshaping
    Shallow vs deep copy
    Broadcasting
    Indexing (advanced)
    Matrices
    Matrix decompositions
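A short sketch of several of the array topics above, using NumPy only (SciPy offers further decompositions in scipy.linalg, not shown here). The array values are arbitrary, and SVD is chosen as just one example of a matrix decomposition.

```python
import numpy as np

# Arrays and reshaping: reshaping a contiguous array returns a view.
a = np.arange(12)
m = a.reshape(3, 4)

# Slicing gives a view (shallow copy): writes show up in m and in a.
first_col = m[:, 0]
first_col[0] = 100

# copy() gives an independent (deep) copy: writes do not touch m.
deep = m.copy()
deep[0, 0] = -1

# Broadcasting: the (4,) vector of column means is stretched across 3 rows.
centered = m - m.mean(axis=0)

# Advanced indexing: boolean masks and integer lists return copies.
evens = m[m % 2 == 0]
picked = m[[0, 2], [1, 3]]  # elements at (0, 1) and (2, 3)

# Matrices and a decomposition: SVD, then rebuild the matrix from its factors.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)
reconstructed = U @ np.diag(s) @ Vt

print(m)
print(centered)
```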

Scikit-learn

    Feature extraction
    Classification
    Regression
    Clustering
    Dimension reduction
    Model selection
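A sketch that chains a few of the scikit-learn topics above: preprocessing, dimension reduction, classification and model selection. The iris dataset, the pipeline steps and the parameter grid are illustrative assumptions, not a recommended setup.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, split into train and held-out test parts.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling, dimension reduction (PCA) and a classifier in one pipeline.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Model selection: cross-validated search over an arbitrary small grid.
param_grid = {
    "pca__n_components": [2, 3],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# Accuracy of the best pipeline on the held-out test data.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Every estimator shares the same fit / predict / score interface, which is what lets the pipeline and the grid search treat the steps interchangeably.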



