Algorithm Porfolios Combination of stochastic algorithms Boosting as a Metaphor for Algorithm Design Algorithm portfolios Learning techniques for Automatic Algorithm Portfolio Selection The purpose of this paper is to show that a well
known machine learning technique based on Decision Trees
can be e®ectively used to select the best approach (in terms
of e±ciency) in an algorithm portfolio for a particular case
study: the Bid Evaluation Problem (BEP) in Combinatorial
Auctions. In particular, we are interested in deciding when
to use a Constraint Programming (CP) approach and when
an Integer Programming (IP) approach, on the basis of the
structure of the instance considered. Di®erent instances of
the same problem present a di®erent structure, and one as-
pect (e.g. feasibility or optimality) can prevail on the other.
We have extracted from a set of BEP instances, a number of
parameters representing the instance structure. Some of them
(few indeed) precisely identify the best strategy and its cor-
responding tuning to be used to face that instance. We will
show that this approach is very promising, since it identi¯es
the most e±cient algorithm in the 90% of the cases.
Anytime Algorithms
Papers
Operational Rationality through Compilation of Anytime Algorithms An important and largely ignored aspect of real-time decision making is the capability of agents to factor the cost of deliberation into the decision making process. I have developed an efficient model that creates this capability. The model uses as basic components anytime algorithms whose quality of results improves gradually as computation time increases. The main contribution of this work is a compilation process that extends the property of gradual improvement from the level of single algorithms to the level of complex systems.
Improved Boosting Algorithms Using Confidence-rated Predictions Imporevements to AdaBoost and its application to multiclass learning Online Choice of Active Learning Algorithms This work is concerned with the question of how to combine online an ensemble of active learners so as to expedite the learning progress in pool-based active learning. We develop an active-learning master algorithm, based on a known competitive algorithm for the multi-armed bandit problem. A major challenge in successfully choosing top performing active learners online is to reliably estimate their progress during the learning session. To this end we propose a simple maximum entropy criterion that provides effective estimates in realistic settings. We study the performance of the proposed master algorithm using an ensemble containing two of the best known active-learning algorithms as well as a new algorithm. The resulting active-learning master algorithm is empirically shown to consistently perform almost as well as and sometimes outperform the best algorithm in the ensemble on a range of classification problems. Popular Ensemble Methods: An Empirical Study Empirical evaluation of boosting and bagging
People
Jerome H. Friedman Mainly statistical analysis of boosting and bagging approaches. Identified ada boost as additive modeling of a logistic function Schapire AdaBoost inventor
Resources
Boosting Basic resources regarding boosting and other ensemble methods. Contains some good tutorials and selected papers. It is though a little outdated. Introduction to Boosting Site containing some introductory papers regarding boosting.
Two complementary patterns to build multi-expert systems The purpose of this paper is to present two complementary patterns to build multi-expert systems, that
is systems that need to integrate large, heterogeneous specialized modules, and implement complex, non
deterministic control strategies.
The rst pattern summarizes the blackboard pattern presented in [1]. The second pattern builds on the
model of dynamic control where a blackboard system has a repertoire of independent domain and control
knowledge sources, a control plan expressing the system's desirable behavior, and a meta-controller that
chooses at each point in time the currently enabled knowledge source that best matches the control plan.
A system implementing this architectural pattern is able to dynamically change its control strategies as a
response to unpredictable events.
Journal of Artificial Intelligence Research Excellent journal available online. It addresses most of the field of rigorous artificial intelligence. Journal of Machine Learning Research Excellent journal available online. It focuses mainly on the problems of machine learning and pattern recognition. PCAI Practically oriented Artificial Intelligence journal. Some issues are for free.
People
Stuart Russel One of AIMA authors, great source of general AI information
Decision Making And Problem Solving Interesting Article by Herbert Simon, creator of the General Problem Solver on principles behind Subjective Expected Utility in various fields.
Resources
Optimal Ordered Problem Solver abstract: We introduce a general and in a certain sense time-optimal way of solving one problem after another, efficiently searching the space of programs that compute solution candidates, including those programs that organize and manage and adapt and reuse earlier acquired knowledge. The Optimal Ordered Problem Solver OOPS draws inspiration from Universal Search designed for single problems and universal Turing machines. It spends part of the total search time for a new problem on testing programs that exploit previous solution-computing programs in computable ways. If the new problem can be solved faster by copy-editing/invoking previous code than by solving the new problem from scratch, then OOPS will find this out. If not, then at least the previous solutions will not cause much harm. We introduce an efficient, recursive, backtracking-based way of implementing OOPS on realistic computers with limited storage. Experiments illustrate how OOPS can greatly profit from metalearning or metasearching, that is, searching for faster search procedures.
Knowledge Representation
Papers
Complexity Results and Approximation Schemes for MAP Explanations Abstract: MAP is the problem of finding a most probable instantiation of a set of variables given evidence. MAP has always been perceived to be significantly harder than the related problems of computing the probability of a variable instantiation Pr, or the problem of computing the most probable explanation (MPE). This paper investigates the complexity of MAP in Bayesian networks. Specifically, we show that MAP is complete for NP^PP and provide further negative complexity results for algorithms based on variable elimination. We also show that MAP remains hard even when MPE and Pr become easy. For example, we show that MAP is NP-complete when the networks are restricted to polytrees, and even then can not be effectively approximated.
Given the difficulty of computing MAP exactly, and the difficulty of approximating MAP while providing useful guarantees on the resulting approximation, we investigate best effort approximations. We introduce a generic MAP approximation framework. We provide two instantiations of the framework; one for networks which are amenable to exact inference Pr, and one for networks for which even exact inference is too hard. This allows MAP approximation on networks that are too complex to even exactly solve the easier problems, Pr and MPE. Experimental results indicate that using these approximation algorithms provides much better solutions than standard techniques, and provide accurate MAP estimates in many cases.
In Defense Of One-Vs-All Classification We consider the problem of multiclass classification. Our main thesis is that a simple "one-vs-all" scheme is as accurate as any other approach, assuming that the underlying binary classifiers are well-tuned regularized classifiers such as support vector machines. This thesis is interesting in that it disagrees with a large body of recent published work on multiclass classification. We support our position by means of a critical review of the existing literature, a substantial collection of carefully controlled experimental work, and theoretical arguments. Recent Advances in Predictive Machine Learning Prediction involves estimating the unknown value of an attribute of a system under study given the
values of other measured attributes. In prediction (machine) learning the prediction rule is derived
from data consisting of previously solved cases. Most methods for predictive learning were originated
many years ago at the dawn of the computer age. Recently two new techniques have emerged that
have revitalized the ¯eld. These are support vector machines and boosted decision trees. This
paper provides an introduction to these two new methods tracing their respective ancestral roots to
standard kernel methods and ordinary decision trees. The Expectation Maximization Algorithm Explanation of EM algorithm Tutorial on Support Vector Machines Abstract. We briefly describe the main ideas of statistical learning theory, support
vector machines (SVMs), and kernel feature spaces.We place particular emphasis
on a description of the so-called º-SVM, including details of the algorithm
and its implementation, theoretical results, and practical applications.
Sharing Metainformation to Guide Cooperative Search Among Heterogeneous Reusable Agents Abstract—A reusable agent is a self-contained computational system that implements some specific expertise and that can be
embedded into diverse applications requiring that expertise. Systems composed of heterogeneous reusable agents are potentially
highly adaptable, maintainable, and affordable, assuming that integration issues such as information sharing, coordination, and
conflict management can be effectively addressed. In this article, we investigate sharing metalevel search information to improve
system performance, specifically with respect to how sharing affects the quality of solutions and the runtime efficiency of a reusableagent
system. We first give a formal description of shareable metainformation in systems where agents have private knowledge and
databases and where agents are specifically intended to be reusable. We then present and analyze experimental results from a
mechanical design system for steam condensers that demonstrate performance improvements related to information sharing and
assimilation. Finally, we discuss the practical benefits and limitations of information sharing in application systems comprising
heterogeneous reusable agents. Issues of pragmatic interest include determining what types of information can realistically be
shared and determining when the costs of sharing outweigh the benefits.
People
Victor Lesser His major research focus is on the control and organization of complex AI systems. Professor Lesser is a Fellow of the American Association of Artificial Intelligence (AAAI), and is considered a leading researcher in the areas of blackboard systems for interpretation, distributed AI, and real-time AI.
Online learning
People
Schapire One of the inventors of the boosting. Also conrtains information on prediction with expert advice and statistical decomposition.
Using Genetic Algorithms to Solve NP Complete Problems Abstract
A strategy for using Genetic Algorithms (GAs) to
solve NP-complete problems is presented. The key
aspect of the approach taken is to exploit the observation
that, although all NP-complete problems are
equally difficult in a general computational sense,
some have much better GA representations than others,
leading to much more successful use of GAs on
some NP-complete problems than on others. Since
any NP-complete problem can be mapped into any
other one in polynomial time, the strategy described
here consists of identifying a canonical NP-complete
problem on which GAs work well, and solving other
NP-complete problems indirectly by mapping them
onto the canonical problem. Initial empirical results
are presented which support the claim that the
Boolean Satisfiability Problem (SAT) is a GAeffective
canonical problem, and that other NPcomplete
problems with poor GA representations
can be solved efficiently by mapping them first onto
SAT problems.
Boosting Combinatorial Search Through Randomization Abstract
Unpredictability in the running time of complete search
procedures can often be explained by the phenomenon of
“heavy-tailed cost distributions”, meaning that at any time
during the experiment there is a non-negligible probability
of hitting a problem that requires exponentially more time to
solve than any that has been encountered before (Gomes et
al. 1998a). We present a general method for introducing controlled
randomization into complete search algorithms. The
“boosted” search methods provably eliminate heavy-tails to
the right of the median. Furthermore, they can take advantage
of heavy-tails to the left of the median (that is, a nonnegligible
chance of very short runs) to dramatically shorten
the solution time. We demonstrate speedups of several orders
of magnitude for state-of-the-art complete search procedures
running on hard, real-world problems. Recursive Approximation of the High Dimensional max Function Abstract. An alternative smoothing method for the high dimensional max
function has been studied. The proposed method is a recursive extension of
the two dimensional smoothing functions. In order to analyze the proposed
method, a theoretical framework related to smoothing methods has been discussed.
Moreover, we support our discussion by considering some application
areas. This is followed by a comparison with an alternative well-known
smoothing method. Universal Duality in Conic Convex Programs
Resources
Continuous Optimization Methods Lecture notes. Steepest descent, newton method, quasi-newton methods, cnstrained optimization, penalty and barrier methods, linearly constrained problems, duality theory Cross-Entropy Method Very interesting generic method for multi-extremal optimization and rare-event simulation. IBROW The objective of IBROW3 is to develop intelligent brokers that are able to distributively configure reusable components into knowledge systems through the World-Wide Web. The WWW is changing the nature of software development to a distributive plug & play process, which requires a new kind of managing software: intelligent software brokers.
IBROW3 will integrate research on heterogeneous DB, interoperability and Web technology with knowledge-system technology and ontologies.
Mathematical Programming Glossary Glossary of terms regarding mathematical programming. Very thorough. Optimization Online Optimization Online is a repository of eprints about optimization and related topics. Problem Reformulation and Search A Past project in cooperation with iLog. Dealt with performance depending on the formulation of the problem in constraint programming domain. Trying to formulate JSP as VRP and reversely.
How to Use Expert Advice Abstract. We analyze algorithms that predict a binary value by combining the predictions of several
prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no
assumptions about the way the sequence of bits to be predicted is generated. We measure the
performance of the algorithm by the difference between the expected number of mistakes it makes on
the bit sequence and the expected number of mistakes made by the best expert on this sequence,
where the expectation is taken with respect to the randomization in the predictions. We show that the
minimum achievable difference is on the order of the square root of the number of mistakes of the
best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have
matching leading constants in most cases. We then show how this leads to certain kinds of pattern
recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also compare our analysis to the case in which log loss is used instead of
the expected number of mistakes. Local Experts Combination through Density Decomposition In this paper we describe a divide-and-combine strategy for decomposition of a complex prediction problem into simpler local sub-problems. We firstly sh ow how to perform a soft decomposition via clustering of input data. Such decomposition leads to a partition of the input space into several regions wh ich may overlap. Therefore, to each region is assigned a local predictor (or exp ert) which is trained only on local data. To construct a solution to the glob al prediction problem, we combine the local experts using two approaches: weighted averaging where the outputs of local experts are weighted by their p rior densities, and nonlinear adaptive combination where the pooling pa rameters are obtained through minimization of a global error. To illustrate the validity of our approach, we show simulation results for two classification task s, vowels and phonemes, using local experts which are Multi-Layer Perceptrons (MLP) and Support Vector Machines (SVM). We compare the results obta ined using the two local combination modes with the results obtained using a g lobal predictor and a linear combination of global predictors.
Sampling
Papers
Safe and Effective Importance Sampling We present two improvements on the technique of importance sampling. First we show that importance sampling from a mixture of densities, using those densities as control variates, results in a useful upper bound on the asymptotic variance. That bound is a small multiple of the asymptotic variance of importance sampling from the best single component density. This allows one to benefit from the great variance reductions obtainable by importance sampling, while protecting against the equally
A Decision Theoretic Approach to Adaptive Problem Solving This article argues that it is desirable, and possible, to construct general problem solving techniques
that automatically adapt to the characteristics of a specific application. Adaptive problem solving
is a means of reconciling two seemingly contradictory needs. On the one hand, general purpose techniques
can ease much of the burden of developing a application and satisfy the oft argued need for
declarative and modular knowledge representation. On the other hand, general purpose approaches
are ill-suited to the specialized demands of individual applications. General approaches have proven
successful, only after a tedious cycle of manual experimentation and modification. Adaptive Problem-Solving for Large-Scale Scheduling Problems: A Case Study Abstract
Although most scheduling problems are NP-hard, domain specific techniques perform well in
practice but are quite expensive to construct. In adaptive problem-solving, domain specific
knowledge is acquired automatically for a general problem solver with a flexible control architecture.
In this approach, a learning system explores a space of possible heuristic methods for one well-suited
to the eccentricities of the given domain and problem distribution. In this article, we discuss an
application of the approach to scheduling satellite communications. Using problem distributions
based on actual mission requirements, our approach identifies strategies that not only decrease the
amount of CPU time required to produce schedules, but also increase the percentage of problems that
are solvable within computational resource limitations.
Multi-Tac
An Overview of Learning in the Multi-TAC System Abstract: this paper we give an overview of the Multi-TAC system focusing on its learning mechanisms. We present one of the areas of current research in the project, and summarize some empirical results comparing the programs written by Multi-TAC to those written by humans
A Combinatorial Scheme for Developing Efficient Composite Solvers Abstract:
Many fundamental problems in scientific computing have more than one solution method. It is not uncommon for alternative solution methods to represent different tradeoffs between solution cost and reliability. Furthermore, the performance of a solution method often depends on the numerical properties of the problem instance and thus can vary dramatically across application domains. In such situations, it is natural to consider the construction of a multi-method composite solver to potentially improve both the average performance and reliability. In this paper,we provide a combinatorial framework for developing such composite solvers. We provide analytical results for obtaining an optimal composite from a set of methods with normalized measures of performance and reliability. Our empirical results demonstrate the effectiveness of such optimal composites for solving large, sparse linear systems of equations.
A Formal Framework for Speedup Learning from Problems and Solutions Abstract: Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned knowledge, namely control rules and macro-operators, and prove theorems that identify sufficient conditions for learning in each representation. Our proofs are constructive in that they are accompanied with learning algorithms. Our framework captures both empirical and explanation-based speedup learning in a unified fashion. We illustrate our framework with implementations in two domains: symbolic integration and Eight Puzzle. This work integrates many strands of experimental and theoretical work in machine learning, including empirical learning of control rules, macro-operator learning, Explanation-Based Learning (EBL), and Probably Approximately Correct (PAC) Learning.
Exploratory analysis of speedup learning data using expectation maximization Abstract: Experimental evaluations of speedup learning methods have in the oast used non-parametric hypothesis testing to determine whether or not learning is beneficial. We show here how to obtain deeper insight into the comparative performance of learning methods through a complementary parametric approach to data analysis. In this approach experimental data is used to estimate values for the parameters of a statistical model of the performance of a problem solver. To model problem solvers that use speedup learning methods, we propose a two component lenear model that captures hoe learned knowledge may accelerate the solution of some problems while leaving solution to of thers relatively unchanged. We show how to apply expectation maximization (EM), a statistical technique, to fit this kind os multi-component model. EM allows us to fit the model in the presence os censored data, a methodological difficulty common to experiments involving speedup learning. Optimal Schedules for Parallelizing Anytime Algorithms: The Case of Shared Resources Abstract: The performance of anytime algorithms can be improved by simultaneously solving several instances of algorithm-problem pairs. These pairs may include different instances of a problem (such as starting from a different initial state), different algorithms (if several alternatives exist), or several runs of the same algorithm (for non-deterministic algorithms). In this paper we present a methodology for designing an optimal scheduling policy based on the statistical characteristics of the algorithms involved. We formally analyze the case where the processes share resources (a single-processor model), and provide an algorithm for optimal scheduling. We analyze, theoretically and empirically, the behavior of our scheduling algorithm for various distribution types. Finally, we present empirical results of applying our scheduling algorithm to the Latin Square problem.
Speedup Learning for Repair-based Search Abstract
Repair-based search algorithms start with an initial solution and attempt to improve it by iteratively
applying repair operators. Such algorithms can often handle large-scale problems that may
be difficult for systematic search algorithms. Nevertheless, the computational cost of solving such
problems is still very high. We observed that many of the repair steps applied by such algorithms
are redundant in the sense that they do not eventually contribute to finding a solution. Such redundant
steps are particularly harmful in repair-based search, where each step carries high cost due to
the very high branching factor typically associated with it.
Accurately identifying and avoiding such redundant steps would result in faster local search
without harming the algorithm’s problem-solving ability. In this paper we propose a speedup learning
methodology for attaining this goal. It consists of the following steps: defining the concept
of a redundant step; acquiring this concept during off-line learning by analyzing solution paths
for training problems, tagging all the steps along the paths according to the redundancy definition
and using an induction algorithm to infer a classifier based on the tagged examples; and using the
acquired classifier to filter out redundant steps while solving unseen problems.
Our algorithm was empirically tested on instances of real-world employee timetabling problems
(ETP). The problem solver to be improved is based on one of the best methods for solving
some large ETP instances. Our results show a significant improvement in speed for test problems
that are similar to the given example problems. Statistical Methods For Analyzing Speedup Learning Experimens Trying to get hold of the document. By O. Etzioni and R. Etzioni appared in Machine Learning Journal.
Fusion of Domain Knowledge with Data for Decision Support Special issue of JRML on combining background knowledge and data. May be related to some kind of speedup learning. Learning for Planning and Problem Solving University of Texas reasearch group. Some phD theses, mainly about learning to plan. Prodigy The PRODIGY Project, led by Manuela Veloso (originally began by Jaime Carbonell), is an architecture for planning and learning. Currently, work on Prodigy includes explanation-based learning, partial evaluation, experimentation, graphical knowledge acquisition, automatic abstraction, mixed-initiative planning, case-based reasoning and a number of real-world domains. Self-Adapting Large-scale Solver Architecture SALSA is a software project that aims to assist applications in finding suitable linear and nonlinear system solvers based on analysis of the application-generated data. The SALSA system features heuristic decision making, based on a database of performance results that can tune the heuristics over time.
Soar Soar is a general cognitive architecture for developing systems that exhibit intelligent behavior. Researchers all over the world are using Soar.
Additional scilab functions Additional funtions for scilab Scilab A free replacement for matlab. Has similar syntax and also available limited translator from matlab files.