Installing scikit-learn
There are different ways to get scikit-learn installed:
- Install the version of scikit-learn provided by your operating system or Python distribution. This is the quickest option for those who have operating systems that distribute scikit-learn.
- Install an official release. This is the best approach for users who want a stable version number and aren’t concerned about running a slightly older version of scikit-learn.
- Install the latest development version. This is best for users who want the latest-and-greatest features and aren’t afraid of running brand-new code.
Note
If you wish to contribute to the project, it’s recommended you
install the latest development version.
Installing an official release
Scikit-learn requires:
- Python (>= 2.6 or >= 3.3),
- NumPy (>= 1.6.1),
- SciPy (>= 0.9).
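As a rough illustration, the requirements listed above could be checked from Python before installing. This is a minimal sketch and not part of scikit-learn: the helper names (parse_version, meets_minimum, python_is_supported, check_requirements) are hypothetical, and the comparison only looks at the leading numeric components of each version string.

```python
import importlib
import re
import sys

# Minimum versions taken from the requirements list above.
MINIMUMS = {"numpy": "1.6.1", "scipy": "0.9"}


def parse_version(text):
    """Turn '1.6.1' or '0.13.0rc1' into a tuple of its leading integers."""
    numbers = []
    for part in text.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.9' and '0.9.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def python_is_supported(version_info=None):
    """Check the 'Python (>= 2.6 or >= 3.3)' requirement."""
    major, minor = tuple(version_info or sys.version_info)[:2]
    if major == 2:
        return minor >= 6
    return (major, minor) >= (3, 3)


def check_requirements():
    """Map each dependency to True/False, or None if it is not importable."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        report[name] = meets_minimum(module.__version__, minimum)
    return report


if __name__ == "__main__":
    print("python:", python_is_supported())
    for name, status in sorted(check_requirements().items()):
        print(name + ":", status)
```

A value of None in the report means the package could not be imported at all, which on most platforms means it still has to be installed.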
Windows
First you need to install numpy and scipy from their own official installers.
Wheel packages (.whl files) for scikit-learn from PyPI can be installed with the pip utility. Open a console and type the following to install or upgrade scikit-learn to the latest stable release:
pip install -U scikit-learn
Mac OSX
Scikit-learn and its dependencies are all available as wheel packages for OSX:
pip install -U numpy scipy scikit-learn
Linux
At this time scikit-learn does not provide official binary packages for Linux, so you have to build from source.
Installing build dependencies
Installing from source requires you to have installed the scikit-learn runtime dependencies, Python development headers and a working C/C++ compiler. Under Debian-based operating systems, which include Ubuntu, you can install all these requirements by issuing:
sudo apt-get install build-essential python-dev python-setuptools \
python-numpy python-scipy \
libatlas-dev libatlas3gf-base
sudo update-alternatives --set libblas.so.3 \
/usr/lib/atlas-base/atlas/libblas.so.3
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/atlas-base/atlas/liblapack.so.3
Note
In order to build the documentation and run the example code contained in
this documentation you will need matplotlib:
sudo apt-get install python-matplotlib
Note
The above installs the ATLAS implementation of BLAS
(the Basic Linear Algebra Subprograms library).
Ubuntu 11.10 and later, and recent (testing) versions of Debian,
offer an alternative implementation called OpenBLAS.
Using OpenBLAS can give speedups in some scikit-learn modules, but OpenBLAS versions prior to 0.2.8-4 can freeze joblib/multiprocessing, so using it is not recommended unless you know what you’re doing.
If you do want to use OpenBLAS, then replacing ATLAS only requires a couple of commands. ATLAS has to be removed, otherwise NumPy may not work:
sudo apt-get remove libatlas3gf-base libatlas-dev
sudo apt-get install libopenblas-dev
sudo update-alternatives --set libblas.so.3 \
/usr/lib/openblas-base/libopenblas.so.0
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/lapack/liblapack.so.3
On yum-based distributions such as Red Hat and CentOS, install the dependencies using:
sudo yum -y install gcc gcc-c++ numpy python-devel scipy
Building scikit-learn with pip
This is usually the fastest way to install or upgrade to the latest stable release:
pip install --user --install-option="--prefix=" -U scikit-learn
The --install-option="--prefix=" flag is only required if Python has a distutils.cfg configuration with a predefined prefix= entry.
From source package
Download the source package from http://pypi.python.org/pypi/scikit-learn/ , unpack the sources and cd into the source directory.
This package uses distutils, which is the default way of installing Python modules. The install command is:
python setup.py install
Third party distributions of scikit-learn
Some third-party distributions are now providing versions of scikit-learn integrated with their package-management systems. These can make installation and upgrading much easier for users since the integration includes the ability to automatically install dependencies (numpy, scipy) that scikit-learn requires.
The following is an incomplete list of Python and OS distributions that provide their own version of scikit-learn.
Debian and derivatives (Ubuntu)
The Debian package is named python-sklearn (formerly python-scikits-learn) and can be installed using the following command:
sudo apt-get install python-sklearn
A quick-‘n’-dirty way of rolling your own .deb package is to use stdeb.
Python(x,y) for Windows
The Python(x,y) project distributes scikit-learn as an additional plugin, which can be found in the Additional plugins page.
Canopy and Anaconda for all supported platforms
Canopy and Anaconda both ship a recent version of scikit-learn, in addition to a large set of scientific Python libraries.
MacPorts for Mac OSX
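The MacPorts package is named py<XY>-scikit-learn, where XY denotes the Python version. It can be installed by typing one of the following commands:
sudo port install py26-scikit-learn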
sudo port install py27-scikit-learn
Arch Linux
Arch Linux’s package is provided through the official repositories as python-scikit-learn for Python 3 and python2-scikit-learn for Python 2. It can be installed by typing one of the following commands:
# pacman -S python-scikit-learn
# pacman -S python2-scikit-learn
Building on Windows
To build scikit-learn on Windows you need a working C/C++ compiler in addition to numpy, scipy and setuptools.
Picking the right compiler depends on the version of Python (2 or 3) and the architecture of the Python interpreter, 32-bit or 64-bit. You can check the Python version and architecture by running the following in a cmd or powershell console:
python --version
python -c "import struct; print(struct.calcsize('P') * 8)"
For 32-bit Python it is possible to use the standalone installers for Microsoft Visual C++ Express 2008 for Python 2 or Microsoft Visual C++ Express 2010 for Python 3.
Once installed you should be able to build scikit-learn without any particular configuration by running the following command in the scikit-learn folder:
python setup.py install
The Windows SDKs include the MSVC compilers both for 32 and 64-bit architectures. They come as a GRMSDKX_EN_DVD.iso file that can be mounted as a new drive with a setup.exe installer in it.
- For Python 2 you need SDK v7.0: MS Windows SDK for Windows 7 and .NET Framework 3.5 SP1
- For Python 3 you need SDK v7.1: MS Windows SDK for Windows 7 and .NET Framework 4
cmd /E:ON /V:ON /K
SET DISTUTILS_USE_SDK=1
SET MSSdk=1
"C:\Program Files\Microsoft SDKs\Windows\v7.0\Setup\WindowsSdkVer.exe" -q -version:v7.0
"C:\Program Files\Microsoft SDKs\Windows\v7.0\Bin\SetEnv.cmd" /x64 /release
python setup.py install
Replace /x64 by /x86 to build for 32-bit Python instead of 64-bit Python.
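The compiler choices described in this section can be summarized in a few lines of Python. This is only a sketch under the assumptions stated in the text above: the function names interpreter_bits and suggest_toolchain are hypothetical, and the mapping simply restates the pairings given here (Visual C++ Express for 32-bit Python, the Windows SDKs for 64-bit Python).

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


def suggest_toolchain(python_major, bits):
    """Restate the compiler pairing described in the text (illustrative only)."""
    if python_major not in (2, 3):
        raise ValueError("expected Python major version 2 or 3")
    if bits == 32:
        # Standalone Visual C++ Express installers for 32-bit Python.
        if python_major == 2:
            return "Microsoft Visual C++ Express 2008"
        return "Microsoft Visual C++ Express 2010"
    if bits == 64:
        # The Windows SDKs ship the MSVC compilers for 64-bit builds.
        return "Windows SDK v7.0" if python_major == 2 else "Windows SDK v7.1"
    raise ValueError("expected a 32-bit or 64-bit interpreter")


if __name__ == "__main__":
    print(suggest_toolchain(sys.version_info[0], interpreter_bits()))
```

The pointer-size check is the same one used by the python -c one-liner earlier in this section.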
The .whl package and .exe installers can be built with:
pip install wheel
python setup.py bdist_wheel bdist_wininst -b doc/logos/scikit-learn-logo.bmp
It is possible to use MinGW (a port of GCC to Windows OS) as an alternative to MSVC for 32-bit Python. Note that extensions built with mingw32 cannot be redistributed as reusable packages, because they depend on GCC runtime libraries that are typically not installed in end users’ environments.
To force the use of a particular compiler, pass the --compiler flag to the build step:
python setup.py build --compiler=my_compiler install
Bleeding Edge
See the section Retrieving the latest code on how to get the development version. Then follow the previous instructions to build from source depending on your platform.
Testing
Testing scikit-learn once installed
Testing requires the nose library. After installation, the package can be tested by executing the following from outside the source directory:
$ nosetests -v sklearn
C:\Python34\python.exe -c "import nose; nose.main()" -v sklearn
Ran 3246 tests in 260.618s
OK (SKIP=20)
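Before running the full test suite, it can help to confirm which copy of the package Python actually imports, since the tests above are meant to run against the installed package rather than the source tree. The following is a minimal sketch; describe_install is a hypothetical helper, not something scikit-learn provides.

```python
import importlib


def describe_install(name="sklearn"):
    """Return the version and location of an importable package, else None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        "path": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    info = describe_install("sklearn")
    if info is None:
        print("scikit-learn is not importable from this interpreter")
    else:
        print("scikit-learn", info["version"], "imported from", info["path"])
```

If the reported path points into the source directory rather than site-packages, the tests are probably being run from inside the source folder.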
Testing scikit-learn from within the source folder
Scikit-learn can also be tested without having the package installed. For this you must compile the sources inplace from the source directory:
python setup.py build_ext --inplace
nosetests -v sklearn/
make in
make test
pip install --editable .