Mahout 1.0 Features by Engine
| | Single Machine | MapReduce | Spark | H2O | Flink |
|---|---|---|---|---|---|
| Mahout Math-Scala Core Library and Scala DSL | |||||
| Mahout Distributed BLAS. Distributed Row Matrix API with R- and Matlab-like operators. Distributed ALS, SPCA, SSVD, thin-QR. Similarity Analysis. | | | x | x | in development |
| Mahout Interactive Shell | |||||
| Interactive REPL shell for Spark-optimized Mahout DSL | | | x | | |
| Collaborative Filtering with CLI Drivers | |||||
| User-Based Collaborative Filtering | x | x | |||
| Item-Based Collaborative Filtering | x | x | x | ||
| Matrix Factorization with ALS | x | x | |||
| Matrix Factorization with ALS on Implicit Feedback | x | x | |||
| Weighted Matrix Factorization, SVD++ | x | ||||
| Classification with CLI Drivers | |||||
| Logistic Regression - trained via SGD | x | ||||
| Naive Bayes / Complementary Naive Bayes | x | in development | in development | ||
| Random Forest | x | ||||
| Hidden Markov Models - single machine | x | ||||
| Multilayer Perceptron - single machine | x | ||||
| Clustering with CLI Drivers | |||||
| Canopy Clustering | deprecated | deprecated | |||
| k-Means Clustering | x | x | |||
| Fuzzy k-Means | x | x | |||
| Streaming k-Means | x | x | |||
| Spectral Clustering | x | ||||
| Dimensionality Reduction with CLI Drivers - note: most Scala-based dimensionality reduction algorithms are available through the Math-Scala Core Library for all engines | |||||
| Singular Value Decomposition | x | x | |||
| Lanczos Algorithm | x | x | |||
| Stochastic SVD | x | x | |||
| PCA (via Stochastic SVD) | x | x | |||
| QR Decomposition | x | x | |||
| Topic Models | |||||
| Latent Dirichlet Allocation | x | x | |||
| Miscellaneous | |||||
| RowSimilarityJob | x | x | |||
| ConcatMatrices | x | ||||
| Collocations | x | ||||
| Sparse TF-IDF Vectors from Text | x | ||||
| XML Parsing | x | ||||
| Email Archive Parsing | x | ||||
| Lucene Integration | x | ||||
| Evolutionary Processes | x | ||||