5. Installation Instructions¶
5.1. Prerequisites¶
A modern C++ compiler that supports the C++11 standard, such as the latest release of the GNU or clang compilers
Elemental, a high-performance library for dense, distributed linear algebra, whose own dependencies (MPI, libflame, and an accelerated BLAS library) are covered in Section 5.1.1
Python 2.7, including the following libraries:
- numpy
- scipy
- cython version 0.22
5.1.1. Elemental¶
Elemental can make use of MPI parallelization if available. This is generally advantageous for large problems. The SmallK code is also internally parallelized to take full advantage of multiple CPU cores for maximum performance. SmallK does not currently support distributed computation. However, future updates are planned that provide this capability. Please see the About page for information regarding distributed versions of many of the algorithms within SmallK.
We strongly recommend that users install both the HybridRelease and PureRelease builds of Elemental. OpenMP is enabled in the HybridRelease build and disabled in the PureRelease build. Why install both? For smaller problems the overhead of OpenMP parallelization can cause code to run slower than without it, whereas for large problems it generally helps. There is no clear transition point between where it helps and where it hurts, so we encourage users to experiment with both builds to find the one that performs best for their typical problems.
We also recommend that users clearly separate the different build types as well as the versions of Elemental on their systems. Elemental is under active development, and new releases can introduce changes to the API that are not backwards-compatible with previous releases. To minimize build problems and overall hassle, we recommend that Elemental be installed so that the different versions and build types are cleanly separated.
Thus, two builds of Elemental are needed, each created in its own build folder. The build that uses internal OpenMP parallelization is called a HybridRelease build; the build that does not is called a PureRelease build. The debug build is called a PureDebug build. The HybridRelease build is best for large problems, where the problem size is large enough to overcome the OpenMP parallel overhead. In the instructions that follow, set the Elemental version to that specified in the README.html file. Note that the files will be installed in /usr/local/elemental/[version]/[build type].
The SmallK software supports the latest stable release of Elemental, version 0.85 and above.
5.1.1.1. How to Install Elemental on MacOSX¶
On MacOSX we recommend using Homebrew as the package manager. Homebrew does not require sudo privileges for package installation, unlike other package managers such as MacPorts. Thus the chances of corrupting vital system files are greatly reduced using Homebrew.
It is convenient to be able to view hidden files (like .file) in the MacOSX Finder. To do so run the following at the command line:
defaults write com.apple.finder AppleShowAllFiles -bool YES
To revert back to hiding hidden files, set the Boolean flag to NO:
defaults write com.apple.finder AppleShowAllFiles -bool NO
If you use Homebrew, ensure that your PATH is configured to search Homebrew’s installation directory first. Homebrew’s default installation location is /usr/local/bin, so that location needs to be first on your path. To check, run this command from a terminal window:
cat /etc/paths
If the first entry is not /usr/local/bin, you will need to edit the /etc/paths file. This is a system file, so first create a backup. Move the line /usr/local/bin so that it is on the first line of the file. Save the file, then close the terminal session and start a new one so that the path changes take effect.
We also recommend running the following commands regularly (for example, daily) to refresh your brewed installations:
brew update
brew upgrade
brew cleanup
brew doctor
These commands keep your Homebrew-installed software up to date and diagnose any issues with the installations.
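The PATH ordering check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample file contents are made up for the example, and the code works on a string rather than editing /etc/paths itself.

```python
def homebrew_is_first(paths_text, expected="/usr/local/bin"):
    """Return True if the first non-empty line of paths_text equals expected."""
    entries = [line.strip() for line in paths_text.splitlines() if line.strip()]
    return bool(entries) and entries[0] == expected


def move_to_front(paths_text, expected="/usr/local/bin"):
    """Return the contents with expected moved to (or added as) the first line."""
    entries = [line.strip() for line in paths_text.splitlines() if line.strip()]
    reordered = [expected] + [e for e in entries if e != expected]
    return "\n".join(reordered) + "\n"


# Hypothetical contents of /etc/paths, used only for illustration.
sample = "/usr/bin\n/bin\n/usr/local/bin\n/usr/sbin\n/sbin\n"

print(homebrew_is_first(sample))
print(move_to_front(sample), end="")
```

To apply it to a real system, the text could be read from /etc/paths and the reordered result written back by hand after making a backup, as recommended above.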
5.1.1.1.1. OSX:Install the latest GNU compilers¶
Elemental and SmallK both require a modern C++ compiler compliant with the C++11 standard. We recommend that you install the latest stable version of the clang and GNU C++ compilers. To do this, first install the XCode command line tools with this command:
xcode-select --install
If this command produces an error, download and install XCode from the AppStore, then repeat the command. If that should still fail, install the command line tools from the XCode preferences menu. After the installation completes, run this command from a terminal window:
clang++ --version
You should see output similar to this:
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
The latest version of the GNU compiler at the time of writing is g++-7 (gcc 7.1.0), which is provided by the gcc Homebrew package. In addition to the gcc package, Homebrew also provides a gcc49 package from the homebrew/versions tap. If this alternative gcc49 package is installed on your system, it will prevent Homebrew from symlinking the gcc package correctly. We recommend uninstalling the gcc49 versioned package and just using the gcc package instead. The Fortran compiler provided with the gcc package will also be configured to properly build numpy, which is required for the Python interface to SmallK.
If you need to uninstall the gcc49 package, run the following commands:
brew uninstall gcc49
brew cleanup
brew doctor
Then install the gcc package as follows:
brew install gcc
The Apple-provided gcc and g++ will not be overwritten by this installation. The new compilers will be installed into /usr/local/bin as gcc-7, g++-7, and gfortran-7. The Fortran compiler is needed for the installation of MPI and for building the Python interface to SmallK.
5.1.1.1.2. OSX:Install MPI Tools¶
Install the latest version of mpich with Homebrew as follows:
brew install mpich
We recommend installing mpich rather than Open MPI because of some advantages of mpich (earlier versions of Elemental used Open MPI, which can also be installed with Homebrew). Elemental 0.85 (discussed below) uses mpich. For a discussion of Open MPI versus mpich, see: http://stackoverflow.com/questions/2427399/mpich-vs-openmpi
5.1.1.1.3. OSX:Install libFlame¶
Next we detail the installation of the high-performance numerical library libflame. The library can be obtained from the libflame git repository on GitHub.
It is important to perform the git clone into a directory that is NOT called flame, since this can cause name conflicts with the installation. Typically, the clone is performed into a directory called libflame, but any directory name other than flame will work.
To obtain the latest version of the FLAME library, clone the FLAME git repository with this command:
git clone https://github.com/flame/libflame.git
Run the configure script in the top-level FLAME directory as follows (assuming the install path is /usr/local/flame and the Homebrew compilers installed above):
./configure --prefix=/usr/local/flame --with-cc=/usr/local/bin/gcc-7 --with-ranlib=/usr/local/bin/gcc-ranlib-7
A complete list of configuration options can be obtained by running ./configure --help.
After the configuration process completes, build the FLAME library as follows:
make -j4
The -j4 option tells Make to use four processes to perform the build. This number can be increased if you have a more capable system. Install libflame with the following command:
make install
The FLAME library is now installed.
5.1.1.1.4. OSX:Install Elemental¶
Here is a recommended installation scheme for Elemental:
Choose a directory for the root of the Elemental installation. For example, this may be:
/usr/local/elemental
Download one of the SmallK-supported releases of Elemental, unzip and untar the distribution, and cd to the top-level directory of the unzipped distribution. This directory will be denoted by UNZIP_DIR in the following instructions.
We now recommend using Elemental 0.85 or later. Earlier versions will no longer be supported.
5.1.1.1.4.1. HybridRelease Build¶
From the Elemental-0.85 directory, run the following commands to create a local build directory for the HybridRelease build and change into it:
mkdir build_hybrid
cd build_hybrid
Use the following CMake command for the HybridRelease build, replacing <VERSION_STRING> with your Elemental version (for example, 0.85):
cmake -D CMAKE_INSTALL_PREFIX=/usr/local/elemental/<VERSION_STRING>/HybridRelease \
-D CMAKE_BUILD_TYPE=HybridRelease \
-D CMAKE_CXX_COMPILER=/usr/local/bin/g++-7 \
-D CMAKE_C_COMPILER=/usr/local/bin/gcc-7 \
-D CMAKE_Fortran_COMPILER=/usr/local/bin/gfortran-7 \
-D MATH_LIBS="/usr/local/flame/lib/libflame.a;-framework Accelerate" \
-D ELEM_EXAMPLES=ON -D ELEM_TESTS=ON ..
Note that we have installed g++-7 into /usr/local/bin and libFLAME into /usr/local/flame. Alter these paths, if necessary, to match the installation location on your system.
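Because the two builds differ only in the build type and the install prefix, it can help to see the CMake options as a parameterized list. The following Python 3 sketch assembles the same options shown above; the function name and its parameters are illustrative only and are not part of Elemental or SmallK.

```python
import shlex


def elemental_cmake_args(version, build_type, gnu_suffix="7",
                         compiler_dir="/usr/local/bin",
                         math_libs="/usr/local/flame/lib/libflame.a;-framework Accelerate",
                         root="/usr/local/elemental"):
    """Assemble the CMake arguments for one Elemental build.

    The defaults mirror the paths assumed in this section; adjust them to
    match the installation locations on your system.
    """
    prefix = "{}/{}/{}".format(root, version, build_type)
    return [
        "cmake",
        "-D", "CMAKE_INSTALL_PREFIX=" + prefix,
        "-D", "CMAKE_BUILD_TYPE=" + build_type,
        "-D", "CMAKE_CXX_COMPILER={}/g++-{}".format(compiler_dir, gnu_suffix),
        "-D", "CMAKE_C_COMPILER={}/gcc-{}".format(compiler_dir, gnu_suffix),
        "-D", "CMAKE_Fortran_COMPILER={}/gfortran-{}".format(compiler_dir, gnu_suffix),
        "-D", "MATH_LIBS=" + math_libs,
        "-D", "ELEM_EXAMPLES=ON",
        "-D", "ELEM_TESTS=ON",
        "..",
    ]


# Print a shell-safe command line for each build type.
for build in ("HybridRelease", "PureRelease"):
    args = elemental_cmake_args("0.85", build)
    print(" ".join(shlex.quote(a) for a in args))
```

The printed command is meant to be run from inside the corresponding build directory, since the final argument refers to the parent source directory.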
Once the CMake configuration step completes, you can build Elemental from the generated Makefiles with the following command:
make -j4
The -j4 option tells Make to use four processes to perform the build. This number can be increased if you have a more capable system.
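As a rough guide for choosing the number after -j, one common approach is to match it to the number of CPU cores. The Python sketch below is an assumption about what a reasonable choice looks like, not a SmallK or Elemental requirement; the function name and the reserve parameter are made up for illustration.

```python
import os


def make_jobs_flag(reserve=0):
    """Return a -jN flag for make, with N based on the number of CPU cores.

    reserve is the number of cores to leave free for other work.
    """
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return "-j{}".format(max(1, cores - reserve))


print("make", make_jobs_flag())
```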
After the build completes, install elemental as follows:
make install
For Elemental version 0.85 and later, you need to set up your system to find the Elemental dynamic libraries. There are two methods, depending on your version of Mac OSX; Method 2 is preferred:
- Method 1: If your Mac OSX is earlier than Sierra, then, in your startup script (~/.bash_profile) or in a terminal window, enter the following command on a single line, replacing VERSION_STRING as above:
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:/usr/local/elemental/VERSION_STRING/HybridRelease/lib/
- Method 2: If your Mac OSX is Sierra or later, Apple’s System Integrity Protection (SIP) will prevent using the DYLD_LIBRARY_PATH variable. We strongly discourage disabling SIP as a workaround. Instead, enter the following command on a single line in a terminal window, replacing <VERSION_STRING> as above:
ln -s /usr/local/elemental/<VERSION_STRING>/HybridRelease/lib/*.dylib* /usr/local/lib
This will symlink the required Elemental libraries.
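To make the effect of the symlink step concrete, here is a small Python 3 sketch that performs the same kind of linking inside a temporary directory. The library file names are placeholders invented for the demonstration, and the helper function is not part of SmallK or Elemental.

```python
import glob
import os
import tempfile


def link_libraries(lib_dir, target_dir, pattern="*.dylib*"):
    """Symlink every file in lib_dir matching pattern into target_dir.

    Existing files or links in target_dir are left untouched.
    Returns the list of links that were created.
    """
    linked = []
    for source in sorted(glob.glob(os.path.join(lib_dir, pattern))):
        link = os.path.join(target_dir, os.path.basename(source))
        if os.path.lexists(link):
            continue  # do not overwrite anything that is already there
        os.symlink(source, link)
        linked.append(link)
    return linked


# Demonstration in a throwaway directory; the file names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = os.path.join(tmp, "HybridRelease", "lib")
    target_dir = os.path.join(tmp, "lib")
    os.makedirs(lib_dir)
    os.makedirs(target_dir)
    for name in ("libexample_a.dylib", "libexample_b.0.dylib", "notes.txt"):
        open(os.path.join(lib_dir, name), "w").close()
    created = link_libraries(lib_dir, target_dir)
    print([os.path.basename(p) for p in created])
```

Only the files matching the pattern are linked, and running the function a second time creates nothing new because the links already exist.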
5.1.1.1.4.2. PureRelease Build¶
Run these commands to create a build directory for the PureRelease build:
cd ..
mkdir build_pure
cd build_pure
Then repeat the CMake configuration process, this time with the following command for the PureRelease build:
cmake -D CMAKE_INSTALL_PREFIX=/usr/local/elemental/<VERSION_STRING>/PureRelease \
-D CMAKE_BUILD_TYPE=PureRelease -D CMAKE_CXX_COMPILER=/usr/local/bin/g++-7 \
-D CMAKE_C_COMPILER=/usr/local/bin/gcc-7 \
-D CMAKE_Fortran_COMPILER=/usr/local/bin/gfortran-7 \
-D MATH_LIBS="/usr/local/flame/lib/libflame.a;-framework Accelerate" \
-D ELEM_EXAMPLES=ON -D ELEM_TESTS=ON ..
Repeat the build commands and install this build of Elemental.
For Elemental version 0.85 and later, you need to set up your system to find the Elemental dynamic libraries for the PureRelease build as well. There are two methods, depending on your version of Mac OSX; Method 2 is preferred:
- Method 1: If your Mac OSX is earlier than Sierra, then, in your startup script (~/.bash_profile) or in a terminal window, enter the following command on a single line, replacing VERSION_STRING as above:
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:/usr/local/elemental/VERSION_STRING/PureRelease/lib/
- Method 2: If your Mac OSX is Sierra or later, Apple’s System Integrity Protection (SIP) will prevent using the DYLD_LIBRARY_PATH variable. We strongly discourage disabling SIP as a workaround. Instead, enter the following command on a single line in a terminal window, replacing <VERSION_STRING> as above:
ln -s /usr/local/elemental/<VERSION_STRING>/PureRelease/lib/*.dylib* /usr/local/lib
This will symlink the required Elemental libraries. If links from the HybridRelease build already exist in /usr/local/lib, remove them first, since ln -s will not overwrite existing files.
The two builds of Elemental are now complete.
To test the installation, follow Elemental’s test instructions for the SVD test to verify that Elemental is working correctly.
5.1.1.2. How to Install Elemental on Linux¶
We strongly recommend using a package manager for your Linux distribution for installation and configuration of the required dependencies. We cannot provide specific installation commands for every variant of Linux, so we specify the high-level steps below. The following was tested on a system with Ubuntu 16.04 installed.
5.1.1.2.1. Linux:Install the latest GNU compilers¶
We recommend installation of the latest stable release of the GNU C++ compiler, which is g++-6 at the time of this writing.
Also, install the latest version of GNU Fortran, which is needed for the installation of the Message Passing Interface (MPI) tools.
5.1.1.2.2. Linux:Install MPI Tools¶
Elemental version 0.85 and higher uses mpich for its MPI implementation. Install it as follows:
sudo apt-get update
sudo apt-get install mpich
This completes the installation of the MPI tools. Note that the Open MPI implementation of MPI could also be used for the following installations.
5.1.1.2.3. Linux:Install libFlame¶
Next we detail the installation of the high-performance numerical library libflame. The library can be obtained from the libflame git repository on GitHub.
It is important to perform the git clone into a directory that is NOT called flame, since this can cause name conflicts with the installation. We normally clone into a directory called libflame, but any directory name other than flame will work.
To obtain the latest version of the FLAME library, clone the FLAME git repository with this command:
git clone https://github.com/flame/libflame.git
Run the configure script in the top-level FLAME folder as follows (assuming you want to install to /usr/local/flame; if not, change the prefix path):
./configure --prefix=/usr/local/flame --with-cc=/usr/local/bin/gcc-6 --with-ranlib=/usr/local/bin/gcc-ranlib-6
A complete list of configuration options can be obtained by running:
./configure --help
Then build and install the code as follows:
make -j4
make install
This completes the installation of the FLAME library.
5.1.1.2.4. Linux:Install an accelerated BLAS library¶
It is essential to link Elemental with an accelerated BLAS library for maximum performance. Linking Elemental with a ‘reference’ BLAS implementation will cripple performance, since the reference implementations are designed for correctness not speed.
If you do not have an accelerated BLAS on your system, you can download and build OpenBLAS. Download, unzip, and untar the tarball (version 0.2.19 as of this writing) and cd into the top-level folder. Build OpenBLAS with this command, assuming you have a 64-bit system:
make BINARY=64 USE_OPENMP=1
Install with this command, assuming the installation directory is /usr/local/openblas/0.2.19/:
make PREFIX=/usr/local/openblas/0.2.19/ install
This completes the installation of OpenBLAS.
5.1.1.2.5. Linux:Install Elemental¶
Here is our suggested installation scheme for Elemental:
We strongly recommend that users install both the HybridRelease and PureRelease builds of Elemental. OpenMP is enabled in the HybridRelease build and disabled in the PureRelease build. Why install both? For smaller problems the overhead of OpenMP parallelization can cause code to run slower than without it. On the other hand, for large problems it generally helps. However, there is no clear transition point between where it helps and where it hurts. Thus, we encourage users to experiment with both builds to find the one that performs best for their typical problems.
Another strong recommendation is that users clearly separate the different build types as well as the versions of Elemental on their systems. Elemental is under active development, and new releases can introduce changes to the API that are not backwards compatible with previous releases. To minimize build problems and overall hassle, we recommend that Elemental be installed so that the different versions and build types are cleanly separated.
Choose a directory for the root of the Elemental installation. A good choice is:
/usr/local/elemental
Download one of the SmallK-supported releases of Elemental (see above), unzip and untar the distribution, and cd to the top-level folder of the unzipped distribution. This directory will be denoted by UNZIP_DIR in the following instructions.
Note that Elemental version 0.85 or later is the version currently supported; earlier versions are not supported. If an earlier version is needed for Linux, use the following instructions.
For the first step of the installation, for Elemental versions prior to 0.85, we need to fix a few problems with the CMake configuration files. Open the following file in a text editor:
UNZIP_DIR/cmake/tests/OpenMP.cmake
On the first line of the file, change:
if(HYBRID)
to this:
if(ELEM_HYBRID)
Next, open this file in a text editor:
UNZIP_DIR/cmake/tests/Math.cmake
Near the first line of the file, change:
if(PURE)
to this:
if(ELEM_PURE)
Save both files.
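The two manual edits above are simple text substitutions. As an illustration only, the following Python sketch shows the substitution applied to an in-memory string; the helper name and the sample file contents are invented for the example, and the sketch does not touch any files.

```python
def rename_condition(cmake_text, old, new):
    """Replace the first occurrence of if(old) with if(new) in cmake_text."""
    return cmake_text.replace("if({})".format(old), "if({})".format(new), 1)


# Made-up stand-in for the start of a CMake test file.
sample = "if(HYBRID)\n  # OpenMP checks go here\nendif()\n"

patched = rename_condition(sample, "HYBRID", "ELEM_HYBRID")
print(patched, end="")
```

Applying the function a second time leaves the text unchanged, because if(ELEM_HYBRID) no longer contains the string if(HYBRID).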
Run these commands to create the required directories for the build types:
mkdir build_hybrid
mkdir build_pure
5.1.1.2.5.1. HybridRelease build¶
From the Elemental-<VERSION> folder, run the following command to change to the local build folder for the HybridRelease build:
cd build_hybrid
Use the following CMake command for the HybridRelease build:
cmake -D CMAKE_INSTALL_PREFIX=/usr/local/elemental/<VERSION>/HybridRelease \
-D CMAKE_BUILD_TYPE=HybridRelease -D CMAKE_CXX_COMPILER=/usr/local/bin/g++-6 \
-D CMAKE_C_COMPILER=/usr/local/bin/gcc-6 \
-D CMAKE_Fortran_COMPILER=/usr/local/bin/gfortran-6 \
-D MATH_LIBS="/usr/local/flame/lib/libflame.a;-L/usr/local/openblas/0.2.19/lib -lopenblas -lm" \
-D ELEM_EXAMPLES=ON -D ELEM_TESTS=ON ..
Note that we have installed g++-6 into /usr/local/bin and libFLAME into /usr/local/flame. Alter these paths, if necessary, to match the installation location on your system.
If this command does not work on your system, you may need to define the BLAS_LIBS and/or GFORTRAN_LIB config options.
Version 0.85 of Elemental has an error in one of its cmake files. The file is:
Elemental-0.85/cmake/tests/CXX.cmake
Modify the first line of this file from:
include(FindCXXFeatures)
to:
include_directories(FindCXXFeatures)
since FindCXXFeatures is now a directory. After this change, Elemental should build without errors.
Once the CMake configuration step completes, you can build Elemental from the generated Makefiles with the following command:
make -j4
The -j4 option tells Make to use four processes to perform the build. This number can be increased if you have a more capable system.
After the build completes, install elemental as follows:
make install
After installing Elemental version 0.85, set up the system to find the Elemental shared library. Either in the startup script (~/.bashrc) or in a terminal window, enter the following command on a single line, replacing VERSION_STRING with your Elemental version:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/elemental/VERSION_STRING/HybridRelease/lib/
5.1.1.2.5.2. PureRelease build¶
After this, run these commands to change to the build folder for the PureRelease build:
cd ..
cd build_pure
Then repeat the CMake configuration process, this time with the following command for the PureRelease build:
cmake -D CMAKE_INSTALL_PREFIX=/usr/local/elemental/<VERSION>/PureRelease \
-D CMAKE_BUILD_TYPE=PureRelease -D CMAKE_CXX_COMPILER=/usr/local/bin/g++-6 \
-D CMAKE_C_COMPILER=/usr/local/bin/gcc-6 \
-D CMAKE_Fortran_COMPILER=/usr/local/bin/gfortran-6 \
-D MATH_LIBS="/usr/local/flame/lib/libflame.a;-L/usr/local/openblas/0.2.19/lib -lopenblas -lm" \
-D ELEM_EXAMPLES=ON -D ELEM_TESTS=ON ..
If this command does not work on your system, you may need to define the BLAS_LIBS and/or GFORTRAN_LIB config options.
Repeat the build commands and install this build of Elemental. Then, if you installed a version of Elemental prior to the 0.84 release, edit the /usr/local/elemental/<version>/PureRelease/conf/ElemVars file and replace the CXX line as indicated above.
If Elemental version 0.85 or later was installed, set up the system to find the Elemental shared library for the PureRelease build. Enter the following command in a terminal window on a single line, replacing VERSION_STRING with your Elemental version:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/elemental/VERSION_STRING/PureRelease/lib/
Note: set this variable to point to either the HybridRelease or the PureRelease build of the Elemental shared library whenever you want to use SmallK.
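Switching between the two builds amounts to changing which lib directory appears in LD_LIBRARY_PATH. The Python sketch below shows one way to compute the new value; the function name is hypothetical, and dropping entries from other Elemental builds under the same root is a design choice of this example, not something SmallK requires.

```python
def with_elemental_lib(current, version, build_type, root="/usr/local/elemental"):
    """Return an LD_LIBRARY_PATH value that points at one Elemental build.

    Entries for other builds under the same root are dropped so that only
    the requested build's lib directory remains.
    """
    lib_dir = "{}/{}/{}/lib".format(root, version, build_type)
    entries = [e for e in current.split(":") if e]
    entries = [e for e in entries if not e.rstrip("/").startswith(root + "/")]
    entries.append(lib_dir)
    return ":".join(entries)


# Example: switch from the HybridRelease build to the PureRelease build.
before = "/opt/lib:/usr/local/elemental/0.85/HybridRelease/lib/"
after = with_elemental_lib(before, "0.85", "PureRelease")
print("export LD_LIBRARY_PATH=" + after)
```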
This completes the two builds of Elemental.
To test the installation, follow Elemental’s test instructions for the SVD test to verify that Elemental is working correctly.
5.1.2. Installation of Python libraries¶
Note: the following section for installing the Python libraries can be skipped if not needed.
5.1.2.1. OSX:Install Python libraries¶
5.1.2.1.1. Install Python scientific packages¶
Assuming that you have used brew to install gcc, as indicated earlier, you can run the following commands to install the necessary libraries:
brew install python
brew install numpy
brew install scipy
To check your installation, run:
brew test numpy
IMPORTANT: Check to see that your numpy installation has correctly linked to the needed BLAS libraries.
Ensure that you are running the correct python:
which python
This should print out /usr/local/bin/python. Open a python terminal by typing python at the command line and run the following:
import numpy as np
np.__config__.show()
You should see something similar to the following:
lapack_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3']
    define_macros = [('NO_ATLAS_INFO', 3)]
blas_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Header']
    define_macros = [('NO_ATLAS_INFO', 3)]
If you are using OpenBLAS, you should see that indicated as well.
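If you prefer to check the configuration text programmatically rather than by eye, a simple keyword search is enough for a first pass. The sketch below is only a heuristic: the function name is made up, it looks for just the two libraries mentioned in this section, and it works on text you have already captured from the configuration output.

```python
def detect_blas(config_text):
    """Return the accelerated BLAS names mentioned in a numpy config dump.

    Only the two libraries discussed in this section are recognized.
    """
    text = config_text.lower()
    return [name for name in ("accelerate", "openblas") if name in text]


# Sample text copied from the expected output shown above.
sample_config = """lapack_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3']
    define_macros = [('NO_ATLAS_INFO', 3)]
"""

print(detect_blas(sample_config))
```

An empty result suggests that numpy may be linked against a reference BLAS and that the installation should be reviewed.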
5.1.2.1.2. Install Cython: a Python interface to C/C++¶
First install the Python Package Index utility, pip. Many Python packages are configured to use this package manager, Cython being one:
brew install pip
Only Cython 0.22 is supported at this time. To check which version is installed on your system, use these commands:
$ python
>>> import Cython
>>> Cython.__version__
'0.22'
>>>
To install Cython version 0.22 (if not already installed):
pip uninstall cython
pip install cython==0.22
Check the version of cython as above to ensure that Cython version 0.22 is installed.
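The same version check can be scripted. This is a minimal sketch in which the function names and messages are invented for illustration; it relies only on the Cython.__version__ attribute used in the interactive check above and on the 0.22 requirement stated in this section.

```python
def installed_cython_version():
    """Return the installed Cython version string, or None if it is missing."""
    try:
        import Cython
    except ImportError:
        return None
    return Cython.__version__


def describe_cython(version, required="0.22"):
    """Return a short message comparing version with the required one."""
    if version is None:
        return "Cython is not installed; install version {}".format(required)
    if version == required:
        return "Cython {} found".format(required)
    return "Cython {} found; SmallK expects {}".format(version, required)


print(describe_cython(installed_cython_version()))
```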
5.1.2.2. Linux:Install Python libraries¶
The Python libraries can easily be installed via pip and apt-get with the following commands:
apt-get install python-pip
pip install numpy
apt-get install python-scipy
pip uninstall cython
pip install cython==0.22
This also ensures that Cython version 0.22 is installed, which is the currently supported version. The Makefile assumes an installation path of /usr/local/lib/python2.7/site-packages for the compiled library file. If you are not using apt-get to install your packages, you will need to tell the Makefile where the appropriate site-packages directory is located on your system. Setting the SITE_PACKAGES_DIR command line variable when running make accomplishes this. If this doesn’t work, an alternative is to add a line to the .bash_profile file (always back up first):
export SITE_PACKAGES_DIR="<path to lib/python2.7>/site-packages/"
This allows for special installations of Python such as Continuum Analytics’ Anaconda distribution site-packages to be accessed.
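If you are unsure where the site-packages directory of a given Python installation is, the interpreter itself can report it. The sketch below uses the standard sysconfig module; the helper names are made up, and choosing the "platlib" entry (the location for compiled extension modules) is an assumption about where pysmallk.so belongs.

```python
import os
import sysconfig


def site_packages_dir():
    """Return the directory where this Python installs compiled extensions."""
    return sysconfig.get_paths()["platlib"]


def export_line(path):
    """Format the shell line that sets SITE_PACKAGES_DIR to path."""
    return 'export SITE_PACKAGES_DIR="{}"'.format(path)


print(export_line(site_packages_dir()))
```

Run it with the same Python interpreter that will import pysmallk, since each installation reports its own directory.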
5.2. Build and Installation of SmallK¶
5.2.1. Obtain the Source Code¶
The source code for the SmallK library can be downloaded from the SmallK repository on GitHub. Once downloaded, uncompress the tarball and follow the installation instructions below.
5.2.2. Build the SmallK library¶
After downloading and unpacking the code tarball, cd into the top-level libsmallk1_<version> directory, where version is MAJOR.MINOR.PATCH (for example 1.6.2). The makefiles assume that you followed our suggested installation plan for Elemental. If this is NOT the case you will need to do one of the following:
- Create an environment variable called ELEMENTAL_INSTALL_DIR which contains the path to the root folder of your Elemental installation
- Define the variable ELEMENTAL_INSTALL_DIR on the make command line
- Edit the SmallK makefile so that it can find your Elemental installation
Assuming that the default install locations are acceptable, build the SmallK code by running this command from the root directory of the distribution:
make all PYSMALLK=1 ELEMVER=0.85
or:
make all PYSMALLK=0 ELEMVER=0.85
This will build the SmallK and pysmallk libraries and several command-line applications (pysmallk is optional; see the section Installation of Python libraries above for setup of the Python libraries). These are:
- libsmallk.a, the SmallK library
- preprocess_tf, a command-line application for processing and scoring term-frequency matrices
- matrixgen, a command-line application for generating random matrices
- nmf, a command-line application for NMF
- hierclust, a command-line application for fast hierarchical clustering
- flatclust, a command-line application for flat clustering via NMF
- pysmallk.so, if PYSMALLK=1 (0: default), the Python-wrapped SmallK library, making SmallK available via Python
5.2.3. Install the SmallK library¶
To install the code, run this command to install to the default location, which is /usr/local/smallk:
make install PYSMALLK=1 ELEMVER=0.85
or:
make install PYSMALLK=0 ELEMVER=0.85
This will install the binary files listed above into the /usr/local/smallk/bin directory, which needs to be on your path if you want to run the executables from anywhere on your system without typing the full path. To install the binary code to a different location, either create an environment variable called SMALLK_INSTALL_DIR and set it equal to the desired installation location prior to running the install command, or supply a prefix argument:
make prefix=/path/to/smallk install
If PYSMALLK=1, this will install pysmallk.so into the site-packages directory associated with the Python binary. This is the Python installed by brew install python, as discussed above, or wherever the Python distribution is installed on the system (for example, Continuum’s Anaconda Python distribution is installed in the user’s home directory). To install the Python library to a different location, create an environment variable called SITE_PACKAGES_DIR and set it equal to the desired installation location prior to running the install command, or supply this as an argument for make:
make SITE_PACKAGES_DIR=/path/to/site-packages install
Or, as a last resort, you can edit the top-level SmallK makefile to conform to the installation scheme of your system. You may need root privileges to do the installation, depending on where you choose to install it.
Before testing the installation, note that the test code needs access to data. The data is kept in a separate GitHub repository, smallk_data, so that cloning the code does not also pull in the large amount of data.
5.2.4. Check the build and installation¶
To test the build, run this command with DATA_DIR set to wherever the SmallK data repository was cloned:
make check PYSMALLK=1 ELEMVER=0.85 DATA_DIR=../smallk_data
or:
make check PYSMALLK=0 ELEMVER=0.85 DATA_DIR=../smallk_data
This will run a series of tests, none of which should report a failure. Sample output from a run of these tests can be found in section SmallK Test Results.
Note: if you installed Elemental version 0.85, you will need to configure your system to find the Elemental shared library. See the Elemental installation instructions above for information on how to do this.
The command-line applications can be built individually by running the appropriate make command from the top-level SmallK directory. These commands are:
To build the smallk library only: make libsmallk
To build the preprocessor only: make preprocessor
To build the matrix generator only: make matrixgen
To build the nmf only: make nmf
To build hierclust only: make hierclust
To build flatclust only: make flatclust
To build pysmallk only: make pysmallk
This completes the SmallK NMF library installation.
5.4. Matrix file formats¶
The SmallK software supports comma-separated value (CSV) files for dense matrices and Matrix Market files for sparse matrices.
For example, the 5x3 dense matrix:
42 47 52
43 48 53
44 49 54
45 50 55
46 51 56
would be stored in a CSV file as follows:
42,47,52
43,48,53
44,49,54
45,50,55
46,51,56
The matrix is loaded exactly as it appears in the file. Internally, SmallK stores dense matrices in column-major order. Sparse matrices are stored in compressed column format.
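To illustrate what column-major storage means for the example matrix above, the following Python 3 sketch parses the CSV text and lists the entries in column-major order. It is an illustration of the storage order only and is not SmallK's loader; the function names are made up for the example.

```python
import csv
import io


def load_dense_csv(text):
    """Parse CSV text into a list of rows of floats, one row per line."""
    rows = [[float(v) for v in row] for row in csv.reader(io.StringIO(text)) if row]
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("all rows must have the same number of columns")
    return rows


def column_major(rows):
    """Flatten a row-wise matrix into column-major order."""
    n_cols = len(rows[0])
    return [row[j] for j in range(n_cols) for row in rows]


csv_text = "42,47,52\n43,48,53\n44,49,54\n45,50,55\n46,51,56\n"
matrix = load_dense_csv(csv_text)
flat = column_major(matrix)
print(len(matrix), "x", len(matrix[0]))
print(flat)
```

For this matrix the column-major sequence runs 42, 43, 44, 45, 46, then 47 through 51, then 52 through 56: each column of the file is stored contiguously.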
5.5. Disclaimer¶
This software is a work in progress. It will be updated throughout the course of the XDATA program with additional algorithms and examples. The distributed NMF factorization routine uses sequential algorithms, but it replaces the matrices and matrix operations with distributed versions. The GA Tech research group is working on proper distributed NMF algorithms, and when such algorithms are available they will be added to the library. Thus, the performance of the distributed code should be viewed as being the baseline for our future distributed NMF implementations.
5.6. Contact Info¶
For comments, questions, bug reports, suggestions, etc., contact: