Wikimedia currently does not provide enough server capacity to generate a PDF version, but one is available on Google Drive.
- Introduction and Main Principles
- Machine learning
- Data analysis
- Occam's razor
- Curse of dimensionality
- No free lunch theorem
- Accuracy paradox
- Overfitting
- Regularization (machine learning)
- Inductive bias
- Data dredging
- Ugly duckling theorem
- Uncertain data
- Background and Preliminaries
- Knowledge discovery in databases
- Knowledge discovery
- Data mining
- Predictive analytics
- Predictive modelling
- Business intelligence
- Reactive business intelligence
- Business analytics
- Pattern recognition
- Reasoning
- Abductive reasoning
- Inductive reasoning
- First-order logic
- Inductive logic programming
- Reasoning system
- Case-based reasoning
- Textual case-based reasoning
- Causality
- Search Methods
- Nearest neighbor search
- Stochastic gradient descent
- Beam search
- Best-first search
- Breadth-first search
- Hill climbing
- Grid search
- Brute-force search
- Depth-first search
- Tabu search
- Anytime algorithm
- Statistics
- Exploratory data analysis
- Covariate
- Statistical inference
- Algorithmic inference
- Bayesian inference
- Base rate
- Bias (statistics)
- Gibbs sampling
- Cross-entropy method
- Latent variable
- Maximum likelihood
- Maximum a posteriori estimation
- Expectation–maximization algorithm
- Expectation propagation
- Kullback–Leibler divergence
- Generative model
- Main Learning Paradigms
- Supervised learning
- Unsupervised learning
- Active learning (machine learning)
- Reinforcement learning
- Multi-task learning
- Transduction
- Explanation-based learning
- Offline learning
- Online learning model
- Online machine learning
- Hyperparameter optimization
- Classification Tasks
- Classification in machine learning
- Concept class
- Features (pattern recognition)
- Feature vector
- Feature space
- Concept learning
- Binary classification
- Decision boundary
- Multiclass classification
- Class membership probabilities
- Calibration (statistics)
- Concept drift
- Prior knowledge for pattern recognition
- Iris flower data set (Classic data sets)
- Online Learning
- Margin Infused Relaxed Algorithm
- Semi-supervised learning
- Semi-supervised learning
- One-class classification
- Coupled pattern learner
- Lazy learning and nearest neighbors
- Lazy learning
- Eager learning
- Instance-based learning
- Cluster assumption
- K-nearest neighbor algorithm
- IDistance
- Large margin nearest neighbor
- Decision Trees
- Decision tree learning
- Decision stump
- Pruning (decision trees)
- Mutual information
- Adjusted mutual information
- Information gain ratio
- Information gain in decision trees
- ID3 algorithm
- C4.5 algorithm
- CHAID
- Information Fuzzy Networks
- Grafting (decision trees)
- Incremental decision tree
- Alternating decision tree
- Logistic model tree
- Random forest
- Linear Classifiers
- Linear classifier
- Margin (machine learning)
- Margin classifier
- Soft independent modelling of class analogies
- Statistical classification
- Statistical classification
- Probability matching
- Discriminative model
- Linear discriminant analysis
- Multiclass LDA
- Multiple discriminant analysis
- Optimal discriminant analysis
- Fisher kernel
- Discriminant function analysis
- Multilinear subspace learning
- Quadratic classifier
- Variable kernel density estimation
- Category utility
- Evaluation of Classification Models
- Data classification (business intelligence)
- Training set
- Test set
- Synthetic data
- Cross-validation (statistics)
- Loss function
- Hinge loss
- Generalization error
- Type I and type II errors
- Sensitivity and specificity
- Precision and recall
- F1 score
- Confusion matrix
- Matthews correlation coefficient
- Receiver operating characteristic
- Lift (data mining)
- Stability in learning
- Feature Creation and Optimization
- Data Pre-processing
- Discretization of continuous features
- Feature engineering
- Feature selection
- Feature extraction
- Dimension reduction
- Principal component analysis
- Multilinear principal-component analysis
- Multifactor dimensionality reduction
- Targeted projection pursuit
- Multidimensional scaling
- Nonlinear dimensionality reduction
- Kernel principal component analysis
- Kernel eigenvoice
- Gramian matrix
- Gaussian process
- Kernel adaptive filter
- Isomap
- Manifold alignment
- Diffusion map
- Elastic map
- Locality-sensitive hashing
- Spectral clustering
- Minimum redundancy feature selection
- Clustering
- Cluster analysis
- K-means clustering
- K-means++
- K-medians clustering
- K-medoids
- DBSCAN
- Fuzzy clustering
- BIRCH (data clustering)
- Canopy clustering algorithm
- Cluster-weighted modeling
- Clustering high-dimensional data
- Cobweb (clustering)
- Complete-linkage clustering
- Constrained clustering
- Correlation clustering
- CURE data clustering algorithm
- Data stream clustering
- Dendrogram
- Determining the number of clusters in a data set
- FLAME clustering
- Hierarchical clustering
- Information bottleneck method
- Lloyd's algorithm
- Nearest-neighbor chain algorithm
- Neighbor joining
- OPTICS algorithm
- Pitman–Yor process
- Single-linkage clustering
- SUBCLU
- Thresholding (image processing)
- UPGMA
- Evaluation of Clustering Methods
- Rand index
- Dunn index
- Davies–Bouldin index
- Jaccard index
- MinHash
- K q-flats
- Rule Induction
- Decision rules
- Rule induction
- Classification rule
- CN2 algorithm
- Decision list
- First Order Inductive Learner
- Association rules and Frequent Item Sets
- Association rule learning
- Apriori algorithm
- Contrast set learning
- Affinity analysis
- K-optimal pattern discovery
- Ensemble Learning
- Ensemble learning
- Ensemble averaging
- Consensus clustering
- AdaBoost
- Boosting
- Bootstrap aggregating
- BrownBoost
- Cascading classifiers
- Co-training
- CoBoosting
- Gaussian process emulator
- Gradient boosting
- LogitBoost
- LPBoost
- Mixture model
- Product of Experts
- Random multinomial logit
- Random subspace method
- Weighted Majority Algorithm
- Randomized weighted majority algorithm
- Graphical Models
- Graphical model
- State transition network
- Bayesian Learning Methods
- Naive Bayes classifier
- Averaged one-dependence estimators
- Bayesian network
- Variational message passing
- Markov Models
- Markov model
- Maximum-entropy Markov model
- Hidden Markov model
- Baum–Welch algorithm
- Forward–backward algorithm
- Hierarchical hidden Markov model
- Markov logic network
- Markov chain Monte Carlo
- Markov random field
- Conditional random field
- Predictive state representation
- Learning Theory
- Computational learning theory
- Version space
- Probably approximately correct learning
- Vapnik–Chervonenkis theory
- Shattering (machine learning)
- VC dimension
- Minimum description length
- Bondy's theorem
- Inferential theory of learning
- Rademacher complexity
- Teaching dimension
- Subclass reachability
- Sample exclusion dimension
- Unique negative dimension
- Uniform convergence (combinatorics)
- Witness set
- Support Vector Machines
- Kernel methods
- Support vector machine
- Structural risk minimization
- Empirical risk minimization
- Kernel trick
- Least squares support vector machine
- Relevance vector machine
- Sequential minimal optimization
- Structured SVM
- Regression analysis
- Outline of regression analysis
- Regression analysis
- Dependent and independent variables
- Linear model
- Linear regression
- Least squares
- Linear least squares (mathematics)
- Local regression
- Additive model
- Antecedent variable
- Autocorrelation
- Backfitting algorithm
- Bayesian linear regression
- Bayesian multivariate linear regression
- Binomial regression
- Canonical analysis
- Censored regression model
- Coefficient of determination
- Comparison of general and generalized linear models
- Compressed sensing
- Conditional change model
- Controlling for a variable
- Cross-sectional regression
- Curve fitting
- Deming regression
- Design matrix
- Difference in differences
- Dummy variable (statistics)
- Errors and residuals in statistics
- Errors-in-variables models
- Explained sum of squares
- Explained variation
- First-hitting-time model
- Fixed effects model
- Fraction of variance unexplained
- Frisch–Waugh–Lovell theorem
- General linear model
- Generalized additive model
- Generalized additive model for location, scale and shape
- Generalized estimating equation
- Generalized least squares
- Generalized linear array model
- Generalized linear mixed model
- Generalized linear model
- Growth curve
- Guess value
- Hat matrix
- Heckman correction
- Heteroscedasticity-consistent standard errors
- Hosmer–Lemeshow test
- Instrumental variable
- Interaction (statistics)
- Isotonic regression
- Iteratively reweighted least squares
- Kitchen sink regression
- Lack-of-fit sum of squares
- Leverage (statistics)
- Limited dependent variable
- Linear probability model
- Mallows's Cp
- Mean and predicted response
- Mixed model
- Moderation (statistics)
- Moving least squares
- Multicollinearity
- Multiple correlation
- Multivariate probit
- Multivariate adaptive regression splines
- Newey–West estimator
- Non-linear least squares
- Nonlinear regression
- Logistic Regression
- Logit
- Multinomial logit
- Logistic regression
- Bio-inspired Methods
- Bio-inspired computing
- Metaheuristic and the search algorithms listed there
- Swarm intelligence and the methods listed there
- Particular algorithms:
- Particle swarm optimization
- Ant colony optimization algorithms
- Artificial immune system
- Firefly algorithm, 2008
- Cuckoo search, 2009
- Bat algorithm, 2010
- Evolutionary Algorithms
- Evolvability (computer science)
- Evolutionary computation
- Evolutionary algorithm
- Genetic algorithm
- Chromosome (genetic algorithm)
- Crossover (genetic algorithm)
- Fitness function
- Evolutionary data mining
- Genetic programming
- Learnable Evolution Model
- Stochastic diffusion search (SDS)
- Neural Networks
- Neural network
- Artificial neural network
- Artificial neuron
- Types of artificial neural networks
- Perceptron
- Multilayer perceptron
- Activation function
- Self-organizing map
- Attractor network
- ADALINE
- Adaptive Neuro Fuzzy Inference System
- Adaptive resonance theory
- IPO underpricing algorithm
- ALOPEX
- Artificial Intelligence System
- Autoassociative memory
- Autoencoder
- Backpropagation
- BCPNN
- Bidirectional associative memory
- Biological neural network
- Boltzmann machine
- Restricted Boltzmann machine
- Cellular neural network
- Cerebellar Model Articulation Controller
- Committee machine
- Competitive learning
- Compositional pattern-producing network
- Computational cybernetics
- Computational neurogenetic modeling
- Confabulation (neural networks)
- Cortical column
- Counterpropagation network
- Cover's theorem
- Cultured neuronal network
- Dehaene-Changeux Model
- Delta rule
- Early stopping
- Echo state network
- The Emotion Machine
- Evolutionary Acquisition of Neural Topologies
- Extension neural network
- Feed-forward
- Feedforward neural network
- Generalized Hebbian Algorithm
- Generative topographic map
- Group method of data handling
- Growing self-organizing map
- Memory-prediction framework
- Helmholtz machine
- Hierarchical temporal memory
- Hopfield network
- Hybrid neural network
- HyperNEAT
- Infomax
- Instantaneously trained neural networks
- Interactive Activation and Competition
- Leabra
- Learning Vector Quantization
- Lernmatrix
- Linde–Buzo–Gray algorithm
- Liquid state machine
- Long short-term memory
- Madaline
- Modular neural networks
- MoneyBee
- Neocognitron
- Nervous system network models
- NETtalk (artificial neural network)
- Neural backpropagation
- Neural coding
- Neural cryptography
- Neural decoding
- Neural gas
- Neural Information Processing Systems
- Neural modeling fields
- Neural oscillation
- Neurally controlled animat
- Neuroevolution of augmenting topologies
- Neuroplasticity
- Ni1000
- Nonspiking neurons
- Nonsynaptic plasticity
- Oja's rule
- Optical neural network
- Phase-of-firing code
- Promoter based genetic algorithm
- Pulse-coupled networks
- Quantum neural network
- Radial basis function
- Radial basis function network
- Random neural network
- Recurrent neural network
- Reentry (neural circuitry)
- Reservoir computing
- Rprop
- Semantic neural network
- Sigmoid function
- SNARC
- Softmax activation function
- Spiking neural network
- Stochastic neural network
- Synaptic plasticity
- Synaptic weight
- Tensor product network
- Time delay neural network
- U-Matrix
- Universal approximation theorem
- Winner-take-all
- Winnow (algorithm)
- Reinforcement learning
- Reinforcement learning
- Markov decision process
- Bellman equation
- Q-learning
- Temporal difference learning
- SARSA
- Multi-armed bandit
- Apprenticeship learning
- Predictive learning
- Text Mining
- Text mining
- Natural language processing
- Document classification
- Bag of words model
- N-gram
- Part-of-speech tagging
- Sentiment analysis
- Information extraction
- Topic model
- Concept mining
- Semantic analysis (machine learning)
- Automatic summarization
- String kernel
- Biomedical text mining
- Never-Ending Language Learning
- Structure Mining
- Structure mining
- Structured learning
- Structured prediction
- Sequence mining
- Sequence labeling
- Process mining
- Advanced Learning Tasks
- Multi-label classification
- Automated machine learning (AutoML)
- Classifier chains
- Web mining
- Anomaly detection
- Anomaly Detection at Multiple Scales
- Local outlier factor
- Novelty detection
- GSP Algorithm
- Optimal matching
- Record linkage
- Meta learning (computer science)
- Learning automata
- Learning to rank
- Multiple-instance learning
- Statistical relational learning
- Relational classification
- Data stream mining
- Alpha algorithm
- Syntactic pattern recognition
- Multispectral pattern recognition
- Algorithmic learning theory
- Deep learning
- Bongard problem
- Learning with errors
- Parity learning
- Inductive transfer
- Granular computing
- Conceptual clustering
- Formal concept analysis
- Biclustering
- Information visualization
- Co-occurrence networks
- Applications
- Problem domain
- Recommender system
- Collaborative filtering
- Profiling (information science)
- Speech recognition
- Stock forecast
- Activity recognition
- Data Analysis Techniques for Fraud Detection
- Molecule mining
- Behavioral targeting
- Proactive Discovery of Insider Threats Using Graph Analysis and Learning
- Robot learning
- Computer vision
- Facial recognition system
- Outlier detection
- Anomaly detection
- Novelty detection