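This page is maintained and continuously updated by the Deep Learning Chinese Community (studydl.com). For detailed descriptions and official homepages of the software listed here, follow the "Learn more" link at the end of the article. The mainstream deep learning frameworks currently include TensorFlow, Caffe, PyTorch, Keras, MXNet, and CNTK; their GitHub statistics are shown in the figure below.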
Figure: GitHub statistics for each open-source framework
In addition to these mainstream frameworks, we have compiled the following list of software.
C

General-Purpose Machine Learning
Recommender - A C library for product recommendations that uses collaborative filtering.

Computer Vision
CCV - C-based/Cached/Core Computer Vision Library, a modern computer vision library.
VLFeat - VLFeat is an open-source library of computer vision algorithms, with a Matlab toolbox.

C++

Computer Vision
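OpenCV - The most widely used vision library. It has C++, C, Python, and Java interfaces and supports Windows, Linux, Android, and Mac OS.
DLib - DLib has C++ and Python interfaces for face recognition and object detection.
EBLearn - Eblearn is an object-oriented C++ library that implements various machine learning models.
VIGRA - VIGRA is a cross-platform machine vision and machine learning library that can process data of arbitrary dimensionality, with Python bindings.

General-Purpose Machine Learning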
MLPack - A scalable C++ machine learning library.
DLib - Designed to be easy to embed in other systems.
encog-cpp
shark
Vowpal Wabbit (VW) - A fast out-of-core learning system.
sofia-ml - A suite of fast incremental algorithms.
Shogun - The Shogun Machine Learning Toolbox
Caffe - A deep learning framework with a clear structure, good readability, and high speed.
CXXNET - A lean framework whose core code is under 1,000 lines.
XGBoost - A gradient boosting library optimized for parallel computation.
CUDA - This is a fast C++/CUDA implementation of convolutional [DEEP LEARNING]
Stan - A probabilistic programming language implementing full Bayesian statistical inference with Hamiltonian Monte Carlo sampling
BanditLib - A simple Multi-armed Bandit library.
Timbl - Implements several memory-based algorithms, of which IB1-IG (a KNN classification algorithm) and IGTree (a decision tree) are widely used in NLP.

Natural Language Processing
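MIT Information Extraction Toolkit - C, C++, and Python tools for named entity recognition and relation extraction.
CRF++ - An open-source implementation of conditional random fields, usable for word segmentation, part-of-speech tagging, and similar tasks.
CRFsuite - CRFsuite is an implementation of conditional random fields, usable for part-of-speech tagging and similar tasks.
BLLIP Parser - Also known as the Charniak-Johnson parser.
colibri-core - A set of C++ libraries, command-line tools, and Python bindings that efficiently implement n-grams and skipgrams.
ucto - A multilingual tokenizer that supports Unicode-aware regular expressions and the FoLiA format.
libfolia - C++ library for the FoLiA format
MeTA - MeTA: ModErn Text Analysis, for mining data from massive amounts of text.

Machine Translation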
EGYPT (GIZA++)
Moses
pharaoh
SRILM
NiuTrans
jane
SAMT

Speech Recognition
Kaldi - Kaldi is a C++ toolkit released under the Apache License v2.0. It is intended for speech recognition research.

Sequence Analysis
ToPS - This is an object-oriented framework that facilitates the integration of probabilistic models for sequences over a user-defined alphabet.

Java

Natural Language Processing
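Cortical.io - Retina: an API that performs complex NLP operations (disambiguation, classification, streaming text filtering, etc.), as fast and intuitive as the brain.
CoreNLP - Stanford CoreNLP provides a set of natural language analysis tools that can take raw English text as input and give the base forms of words.
Stanford Parser - A parser is a program that works out the grammatical structure of sentences.
Stanford POS Tagger - A part-of-speech tagger.
Stanford Named Entity Recognizer - Stanford NER is a Java implementation of a named entity recognizer.
Stanford Word Segmenter - Tokenization of raw text is a standard preprocessing step for many NLP tasks.
Tregex, Tsurgeon and Semgrex - Tregex (short for "tree regular expressions") is a utility for matching patterns in trees, based on tree relationships and regular-expression matches on nodes.
Stanford Phrasal: A Phrase-Based Translation System - Stanford Phrasal is a state-of-the-art statistical phrase-based machine translation system, written in Java.
Stanford English Tokenizer - A tokenizer divides text into a sequence of tokens, which roughly correspond to "words".
Stanford Tokens Regex
Stanford Temporal Tagger - SUTime is a library for recognizing and normalizing time expressions.
Stanford SPIED - Starting from a seed set, iteratively uses patterns to learn entities from unlabeled text.
Stanford Topic Modeling Toolbox - Topic modeling tools that social scientists use to analyze datasets.
Twitter Text Java - A Java implementation of Twitter's text processing library.
MALLET - A Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications.
OpenNLP - A machine-learning-based toolkit for natural language processing.
LingPipe - A toolkit for computational linguistics.
ClearTK - ClearTK provides a framework for developing statistical natural language processing components, built on top of Apache UIMA.
Apache cTAKES - Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source system for extracting information from electronic medical records and clinical text.

General-Purpose Machine Learning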
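aerosolve - A machine learning library designed from the ground up by Airbnb, with an emphasis on ease of use.
Datumbox - A framework for rapid development of machine learning and statistical applications.
ELKI - Data mining tools (unsupervised learning: clustering, outlier detection, etc.).
Encog - An advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation and can also use a GPU to further speed up processing. A GUI-based workbench is also provided.
H2O - A machine learning engine that supports distributed systems such as Hadoop and Spark as well as personal computers; its API can be called from R, Python, Scala, and REST/JSON.
htm.java - A general machine learning library using Numenta's Cortical Learning Algorithm.
java-deeplearning - A distributed deep learning platform for Java, Clojure, and Scala.
JAVA-ML - A general machine learning library for Java with a common interface for all algorithms.
JSAT - Provides many machine learning algorithms for classification, regression, clustering, and more.
Mahout - Distributed machine learning tools.
Meka - An open-source implementation of multi-label classification and evaluation methods, built as an extension to Weka.
MLlib in Apache Spark - Spark's distributed machine learning library.
Neuroph - A lightweight Java neural network framework.
ORYX - Lambda Architecture Framework that uses Apache Spark and Apache Kafka for real-time large-scale machine learning.
RankLib - A library of learning-to-rank algorithms.
Stanford Classifier - A classifier is a machine learning tool that will take data items and place them into one of k classes.
SmileMiner - Statistical Machine Intelligence & Learning Engine
SystemML - A flexible, scalable machine learning language.
WalnutiQ - An object-oriented model of the human brain.
Weka - WEKA is a collection of machine learning algorithms for data mining tasks.

Speech Recognition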
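CMU Sphinx - An open-source toolkit for speech recognition; the speech recognition library is written entirely in Java.

Data Analysis and Visualization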
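Hadoop - Hadoop/HDFS
Spark - Spark is a fast, general-purpose engine for large-scale data processing.
Impala - Real-time queries for Hadoop.
DataMelt - Mathematics software covering numerical computation, statistics, symbolic computation, data analysis, and data visualization.
Dr. Michael Thomas Flanagan's Java Scientific Library

Deep Learning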
Deeplearning4j - Scalable, industrial-strength deep learning that makes use of parallel GPUs.

Python

Computer Vision
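Scikit-Image - A collection of image processing algorithms in Python.
SimpleCV - An open-source computer vision framework that gives access to several high-performance computer vision libraries, such as OpenCV. It runs on Mac, Windows, and Ubuntu Linux.
Vigranumpy - Python bindings for the VIGRA C++ computer vision library.

Natural Language Processing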
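NLTK - A leading platform for building Python programs that work with human language data.
Pattern - A web mining module for Python, with tools for natural language processing, machine learning, and more.
Quepy - Transforms natural language questions into queries in a database query language.
TextBlob - Provides a consistent API for common natural language processing (NLP) tasks. It is built on NLTK and Pattern and works well with both.
YAlign - A sentence aligner that extracts parallel sentences from comparable corpora.
jieba - Chinese word segmentation tool.
SnowNLP - A library for processing Chinese text.
loso - Chinese word segmentation tool.
genius - A Chinese word segmentation tool based on conditional random fields.
KoNLPy - Korean natural language processing.
nut - Natural language understanding toolkit.
Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser).
PyNLPl - A Python library for natural language processing. It also contains tools for parsing common NLP formats such as FoLiA, as well as ARPA language models, Moses phrasetables, and GIZA++ alignments.
python-ucto - Python bindings for ucto (a Unicode-aware, rule-based tokenizer).
python-frog - Python bindings for Frog: part-of-speech tagging, lemmatisation, dependency parsing, and NER for Dutch.
python-zpar - Python bindings for ZPar (a statistical part-of-speech tagger, constituency parser, and dependency parser for English).
colibri-core - Python bindings for a C++ library that efficiently extracts n-grams and skipgrams.
spaCy - Industrial-strength NLP with Python and Cython.
PyStanfordDependencies - A Python interface for converting Penn Treebank trees to Stanford Dependencies.

General-Purpose Machine Learning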
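machine learning - Builds a support vector machine API compatible with both a web interface and a programmatic interface. The corresponding datasets are stored in a SQL database, and the generated prediction models are stored in a NoSQL database.
XGBoost - Python bindings for the eXtreme Gradient Boosting (Tree) library.
Featureforge - A set of tools for creating and testing machine learning features, with a scikit-learn-compatible API.
scikit-learn - A Python module for machine learning built on SciPy.
metric-learn - A Python module for metric learning.
SimpleAI - Implements many of the artificial intelligence algorithms described in the book "Artificial Intelligence: A Modern Approach". It focuses on providing an easy-to-use, well-documented, and tested library.
astroML - A machine learning and data mining library for astronomy.
graphlab-create - A library built on a disk-backed DataFrame that implements a variety of machine learning models (regression, clustering, recommender systems, graph analytics, etc.).
BigML - A library for communicating with external servers.
pattern - Web data mining module.
NuPIC - Numenta Platform for Intelligent Computing.
Pylearn2 - A machine learning library based on Theano.
keras - A neural network library based on Theano.
hebel - A GPU-accelerated deep learning library in Python.
Chainer - A flexible neural network framework.
gensim - Easy-to-use topic modeling tools.
topik - Topic modeling toolkit.
PyBrain - Another Python Machine Learning Library.
Crab - A flexible, fast recommender engine.
python-recsys - A Python tool for implementing a recommender system.
Restricted Boltzmann Machines - Restricted Boltzmann machines in Python.
CoverTree - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree
nilearn - Machine learning library for neuroimaging.
Shogun - Shogun Machine Learning Toolbox
Pyevolve - A genetic algorithm framework.
Caffe - A deep learning framework with a clear structure, good readability, and high speed.
breze - Deep neural networks based on Theano.
pyhsmm - Approximate unsupervised inference in Bayesian hidden Markov models and explicit-duration hidden semi-Markov models, focusing on the Bayesian nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
mrjob - Lets Python programs run on Hadoop.
SKLL - A simplified interface to scikit-learn that makes it easy to run experiments.
neurolab - Optimizes machine learning pipelines. Think of it as your data science assistant, automating much of the tedious work in machine learning.

Data Analysis and Visualization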
SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
NumPy - A fundamental package for scientific computing with Python.
Numba - Python JIT (just in time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.
NetworkX - A high-productivity software for complex networks.
Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
Open Mining - Business Intelligence (BI) in Python (Pandas web interface)
PyMC - Markov Chain Monte Carlo sampling toolkit.
zipline - A Pythonic algorithmic trading library.
PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.
SymPy - A Python library for symbolic mathematics.
statsmodels - Statistical modeling and econometrics in Python.
astropy - A community Python library for Astronomy.
matplotlib - A Python 2D plotting library.
bokeh - Interactive Web Plotting for Python.
plotly - Collaborative web plotting for Python and matplotlib.
vincent - A Python to Vega translator.
d3py - A plotting library for Python, based on D3.js.
ggplot - Same API as ggplot2 for R.
ggfortify - Unified interface to ggplot2 popular R packages.
Kartograph.py - Rendering beautiful SVG maps in Python.
pygal - A Python SVG Charts Creator.
PyQtGraph - A pure-python graphics and GUI library built on PyQt4 / PySide and NumPy.
pycascading
Petrel - Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python.
Blaze - NumPy and Pandas interface to Big Data.
emcee - The Python ensemble sampling toolkit for affine-invariant MCMC.
windML - A Python Framework for Wind Energy Analysis and Prediction
vispy - GPU-based high-performance interactive OpenGL 2D/3D data visualization library
cerebro2 - A web-based visualization and debugging platform for NuPIC.
NuPIC Studio - An all-in-one NuPIC Hierarchical Temporal Memory visualization and debugging super-tool!
SparklingPandas - Pandas on PySpark (POPS)
Seaborn - A python visualization library based on matplotlib
bqplot - An API for plotting in Jupyter (IPython)

Common Lisp

General-Purpose Machine Learning
mgl - Neural networks (boltzmann machines, feed-forward and recurrent nets), Gaussian Processes
mgl-gpr - Evolutionary algorithms
cl-libsvm - Wrapper for the libsvm support vector machine library

Clojure

Natural Language Processing
Clojure-openNLP - Natural Language Processing in Clojure (opennlp)
Infections-clj - Rails-like inflection library for Clojure and ClojureScript

General-Purpose Machine Learning
Touchstone - Clojure A/B testing library
Clojush - The Push programming language and the PushGP genetic programming system implemented in Clojure
Infer - Inference and machine learning in Clojure
Clj-ML - A machine learning library for Clojure built on top of Weka and friends
Encog - Clojure wrapper for Encog (v3) (Machine-Learning framework that specializes in neural-nets)
Fungp - A genetic programming library for Clojure
Statistiker - Basic Machine Learning algorithms in Clojure.
clortex - General Machine Learning library using Numenta's Cortical Learning Algorithm
comportex - Functionally composable Machine Learning library using Numenta's Cortical Learning Algorithm

Data Analysis and Visualization
Incanter - Incanter is a Clojure-based, R-like platform for statistical computing and graphics.
PigPen - Map-Reduce for Clojure.
Envision - Clojure Data Visualisation library, based on Statistiker and D3

Matlab

Computer Vision
Contourlets - MATLAB source code that implements the contourlet transform and its utility functions.
Shearlets - MATLAB code for shearlet transform
Curvelets - The Curvelet transform is a higher dimensional generalization of the Wavelet transform designed to represent images at different scales and different angles.
Bandlets - MATLAB code for bandlet transform
mexopencv - Collection and a development kit of MATLAB mex functions for OpenCV library

Natural Language Processing
NLP - An NLP library for Matlab

General-Purpose Machine Learning
t-Distributed Stochastic Neighbor Embedding - t-SNE is an award-winning technique for dimensionality reduction that is particularly well suited to visualizing high-dimensional data.
Spider - The spider is intended to be a complete object-oriented environment for machine learning in Matlab.
LibSVM - A well-known library for support vector machines.
LibLinear - A Library for Large Linear Classification
Caffe - A deep learning framework with a clear structure, good readability, and high speed.
Pattern Recognition Toolbox - A complete object-oriented environment for machine learning in Matlab.
Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly with MATLAB.

Data Analysis and Visualization
matlab_gbl - MatlabBGL is a Matlab package for working with graphs.
gamic - Efficient pure-Matlab implementations of graph algorithms to complement MatlabBGL's mex functions.

.NET

Computer Vision
OpenCVDotNet - A wrapper for the OpenCV project to be used with .NET applications.
Emgu CV - Cross platform wrapper of OpenCV which can be compiled in Mono to be run on Windows, Linux, Mac OS X, iOS, and Android.
AForge.NET - Open source C# framework for developers and researchers in the fields of Computer Vision and Artificial Intelligence. Development has now shifted to GitHub.
Accord.NET - Together with AForge.NET, this library can provide image processing and computer vision algorithms to Windows, Windows RT and Windows Phone. Some components are also available for Java and Android.

Natural Language Processing
Stanford.NLP for .NET - A full port of Stanford NLP packages to .NET and also available precompiled as a NuGet package.

General-Purpose Machine Learning
Accord-Framework - A complete framework for machine learning, computer vision, computer audition, signal processing, and statistical applications.
Accord.MachineLearning - Support Vector Machines, Decision Trees, Naive Bayesian models, K-means, Gaussian Mixture models and general algorithms such as Ransac, Cross-validation and Grid-Search for machine-learning applications. This package is part of the Accord.NET Framework.
DiffSharp - An automatic differentiation (AD) library providing exact and efficient derivatives (gradients, Hessians, Jacobians, directional derivatives, and matrix-free Hessian- and Jacobian-vector products) for machine learning and optimization applications. Operations can be nested to any level, meaning that you can compute exact higher-order derivatives and differentiate functions that are internally making use of differentiation, for applications such as hyperparameter optimization.
Vulpes - Deep belief and deep learning implementation written in F# that leverages CUDA GPU execution with Alea.cuBase.
Encog - An advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation. Encog can also make use of a GPU to further speed processing time. A GUI based workbench is also provided to help model and train neural networks.
Neural Network Designer - DBMS management system and designer for neural networks. The designer application is developed using WPF, and is a user interface which allows you to design your neural network, query the network, create and configure chat bots that are capable of asking questions and learning from your feedback. The chat bots can even scrape the internet for information to return in their output as well as to use for learning.

Data Analysis and Visualization
numl - numl is a machine learning library intended to ease the use of standard modeling techniques for both prediction and clustering.
Math.NET Numerics - Numerical foundation of the Math.NET project, aiming to provide methods and algorithms for numerical computations in science, engineering and every day use. Supports .Net 4.0, .Net 3.5 and Mono on Windows, Linux and Mac; Silverlight 5, WindowsPhone/SL 8, WindowsPhone 8.1 and Windows 8 with PCL Portable Profiles 47 and 344; Android/iOS with Xamarin.
Sho - Sho is an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (in .NET) to enable fast and flexible prototyping. The environment includes powerful and efficient libraries for linear algebra as well as data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.

Ruby

Natural Language Processing
Treat - Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I've encountered so far for Ruby
Ruby Linguistics - Linguistics is a framework for building linguistic utilities for Ruby objects in any language. It includes a generic language-independent front end, a module for mapping language codes into language names, and a module which contains various English-language utilities.
Stemmer - Expose libstemmer_c to Ruby
Ruby Wordnet - This library is a Ruby interface to WordNet
Raspel - raspell is an interface binding for ruby
UEA Stemmer - Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing
Twitter-text-rb - A library that does auto linking and extraction of usernames, lists and hashtags in tweets

General-Purpose Machine Learning
Ruby Machine Learning - Some Machine Learning algorithms, implemented in Ruby
Machine Learning Ruby
jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.

Data Analysis and Visualization
rsruby - Ruby - R bridge
data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby
ruby-plot - gnuplot wrapper for ruby, especially for plotting roc curves into svg files
plot-rb - A plotting library in Ruby built on top of Vega and D3.
scruffy - A beautiful graphing toolkit for Ruby
SciRuby
Glean - A data management tool for humans
Bioruby
Arel

Misc
Big Data For Chimps
Listof - Community based data collection, packed in gem. Get list of pretty much anything (stop words, countries, non words) in txt, json or hash. Demo/Search for a list

R

General-Purpose Machine Learning
ahaz - ahaz: Regularization for semiparametric additive hazards regression
arules - arules: Mining Association Rules and Frequent Itemsets
bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets
bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases)
bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package
Boruta - Boruta: A wrapper algorithm for all-relevant feature selection
bst - bst: Gradient Boosting
C50 - C50: C5.0 Decision Trees and Rule-Based Models
caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
Clever Algorithms For Machine Learning
CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation
CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
Cubist - Cubist: Rule- and Instance-Based Regression Modeling
e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
earth - earth: Multivariate Adaptive Regression Spline Models
elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA
ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
evtree - evtree: Evolutionary Learning of Globally Optimal Trees
fpc - fpc: Flexible procedures for clustering
frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks
GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting
gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS
gbm - gbm: Generalized Boosted Regression Models
glmnet - glmnet: Lasso and elastic-net regularized generalized linear models
glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models
grplasso - grplasso: Fitting user specified models with Group Lasso penalty
grpreg - grpreg: Regularization paths for regression models with grouped covariates
h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deeplearning, Random forests, GBM, KMeans, PCA, GLM
hda - hda: Heteroscedastic Discriminant Analysis
Introduction to Statistical Learning
ipred - ipred: Improved Predictors
kernlab - kernlab: Kernel-based Machine Learning Lab
klaR - klaR: Classification and visualization
lars - lars: Least Angle Regression, Lasso and Forward Stagewise
lasso2 - lasso2: L1 constrained estimation aka 'lasso'
LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library
LogicReg - LogicReg: Logic Regression
Machine Learning For Hackers
maptree - maptree: Mapping, pruning, and graphing tree models
mboost - mboost: Model-Based Boosting
medley - medley: Blending regression models, using a greedy stepwise approach
mlr - mlr: Machine Learning in R
mvpart - mvpart: Multivariate partitioning
ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models
nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models
oblique.tree - oblique.tree: Oblique Trees for Classification Data
pamr - pamr: Pam: prediction analysis for microarrays
party - party: A Laboratory for Recursive Partytioning
partykit - partykit: A Toolkit for Recursive Partytioning
penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant
penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions
quantregForest - quantregForest: Quantile Regression Forests
randomForest - randomForest: Breiman and Cutler's random forests for classification and regression
randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC)
rattle - rattle: Graphical user interface for data mining in R
rda - rda: Shrunken Centroids Regularized Discriminant Analysis
rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces
REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data
relaxo - relaxo: Relaxed Lasso
rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
rgp - rgp: R genetic programming framework
Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
ROCR - ROCR: Visualizing the performance of scoring classifiers
RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories
rpart - rpart: Recursive Partitioning and Regression Trees
RPMM - RPMM: Recursively Partitioned Mixture Model
RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
RWeka - RWeka: R/Weka interface
RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection
SDDA - SDDA: Stepwise Diagonal Discriminant Analysis
SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
svmpath - svmpath: the SVM Path algorithm
tgp - tgp: Bayesian treed Gaussian process models
tree - tree: Classification and regression trees
varSelRF - varSelRF: Variable selection using random forests
XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library
Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly to R.

Data Analysis and Visualization
ggplot2 - A data visualization package based on the grammar of graphics.

Scala

Natural Language Processing
ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
Breeze - Breeze is a numerical processing library for Scala.
Chalk - Chalk is a natural language processing library.
FACTORIE - FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.

Data Analysis and Visualization
MLlib in Apache Spark - Distributed machine learning library in Spark
Scalding - A Scala API for Cascading
Summing Bird - Streaming MapReduce with Scalding and Storm
Algebird - Abstract Algebra for Scala
xerial - Data management utilities for Scala
simmer - Reduce your data. A unix filter for algebird-powered aggregation.
PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
Wolfe - Declarative Machine Learning

General-Purpose Machine Learning
Conjecture - Scalable Machine Learning in Scalding
brushfire - Distributed decision tree ensemble learning in Scala
ganitha - scalding powered machine learning
adam - A genomics processing engine and specialized file format built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
bioscala - Bioinformatics for the Scala programming language
BIDMach - CPU and GPU-accelerated Machine Learning Library.
Figaro - a Scala library for constructing probabilistic models.
H2O Sparkling Water - H2O and Spark interoperability.