Probability and Statistics Seminar, Department of Mathematics and Statistics, Boston University
Please check the weekly seminar schedule at http://www.bu.edu/stat/seminar/
for possible changes and updates. Below is a tentative schedule.
[ Schedule for Fall 2012 | Schedule for Spring 2013 |
Schedule for Fall 2013 | Schedule for Spring 2014 | Schedule for Fall 2014 | Schedule for Spring 2015 | Schedule for Fall 2015]
Schedule for Fall 2012:
For the Fall of 2012, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Luc Rey-Bellet, Department of Mathematics and Statistics, University of Massachusetts, Amherst, Thursday 13 Sep 2012 ROOM CHANGE: B21 (BASEMENT)!
Title: Irreversibility, entropy production, and fluctuations.
Abstract: In the past 15 years there have been a number of results
on the structure of non-equilibrium steady states in statistical mechanics
models. From a mathematical point of view, non-equilibrium means lack of
time-reversibility or lack of detailed balance. In this talk we will explain what kind
of general results one can obtain for irreversible (Markov or deterministic) processes
and illustrate these results with a number of physical examples.
- Speaker: Konstantinos Spiliopoulos, Department of Mathematics and Statistics, Boston University, Thursday 20 Sep 2012
Title: Escaping from an attractor: importance sampling and rest points
Abstract: Questions such as understanding transitions between metastable equilibrium states of stochastic dynamical systems and computing transition times have attracted
a lot of attention in both the probability and applied mathematics communities, and at the same time are generic questions in disciplines such as chemical physics and biology.
However, despite the substantial developments of the last five decades in both theory and algorithms, very little is known on how to design and rigorously analyze
provably efficient Monte Carlo methods for rare event problems, such as the probability of escape from an equilibrium and transition to another one, when rest points play a
key role. Even though several algorithms do exist, they have been applied only to specific systems and have not been rigorously analyzed. Therefore, it is unclear
when they work and how one should efficiently design them. In this talk, I will discuss importance sampling schemes for the estimation of finite time exit probabilities
of small noise diffusions that
involve escape from an equilibrium. We build importance sampling schemes with provably good performance both pre-asymptotically, i.e., for fixed size of the noise, and asymptotically, i.e., as the size of the noise goes to zero, and that do not degrade as the time horizon gets large. Extensive simulation studies demonstrate the theoretical results.
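As a rough illustration of the basic mechanism, the sketch below estimates a small-noise exit probability by simulating under a tilted drift and reweighting with the Girsanov likelihood ratio. The dynamics, the constant tilt c, and all parameters are invented for illustration; a naive constant tilt like this is exactly the kind of scheme that can degrade, not the provably efficient construction discussed in the talk.

    import numpy as np

    # Estimate P(X exits [-1,1] before T) for dX = -X dt + sqrt(eps) dW, X_0 = 0,
    # by simulating under the tilted drift -x + c and reweighting each exiting
    # path by the Girsanov likelihood ratio dP/dQ.
    rng = np.random.default_rng(0)
    eps, T, dt, c = 0.25, 1.0, 1e-3, 2.0      # illustrative parameters
    n_steps, n_paths = int(T / dt), 2000

    weights = np.zeros(n_paths)
    for i in range(n_paths):
        x, llr = 0.0, 0.0
        for _ in range(n_steps):
            dw = np.sqrt(dt) * rng.standard_normal()      # Q-Brownian increment
            x += (-x + c) * dt + np.sqrt(eps) * dw
            # log dP/dQ accumulates -(c/sqrt(eps)) dW_Q - c^2/(2 eps) dt
            llr += -(c / np.sqrt(eps)) * dw - c**2 / (2 * eps) * dt
            if abs(x) >= 1.0:                             # exit event
                weights[i] = np.exp(llr)
                break

    p_hat = weights.mean()
    print(f"exit probability estimate: {p_hat:.3e} "
          f"(s.e. {weights.std() / np.sqrt(n_paths):.1e})")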
- Speaker: Jing Zhang, Department of Statistics, Yale University, Thursday 27 Sep 2012
Title: Detecting and understanding combinatorial mutation patterns responsible for HIV drug resistance
Abstract: We propose a systematic approach for a better understanding of how HIV viruses employ various combinations of mutations to resist drug treatments, which is critical to developing new drugs and optimizing the use of existing drugs. By probabilistically modeling mutations in the HIV-1 protease or reverse transcriptase (RT) isolated from drug-treated patients, we present a statistical procedure that first detects mutation combinations associated with drug resistance and then infers detailed interaction structures of these mutations. The molecular basis of our statistical predictions is further studied by using molecular dynamics simulations and free energy calculations. We have demonstrated the usefulness of this systematic procedure on three HIV drugs (Indinavir, Zidovudine, and Nevirapine), discovered unique interaction features between viral mutations induced by these drugs, and revealed the structural basis of such interactions. More advanced Bayesian models are also developed for transmitted drug resistance and cross-resistance for multiple drugs.
This is joint work with Tingjun Hou, Wei Wang, and Jun S. Liu.
Ref: 1. Zhang, J., Hou, T., Wang, W., Liu, J.S. (2010). Detecting and understanding combinatorial mutation patterns responsible for HIV drug resistance. PNAS 107, 1321.
2. Zhang, J., Hou, T., Liu, Y., Chen, G., Yang, X., Liu, J.S., Wang, W. (2012). Systematic investigation on interactions for HIV drug resistance and cross-resistance among protease inhibitors. Accepted by Journal of Proteome Science & Computational Biology.
- Speaker: Andrew Papanicolaou, Department of Operations Research and Financial Engineering,
Princeton University, Thursday 4 Oct 2012
Title: Dimension reduction of the Bellman equations for maximum expected utility with partial information in discrete time
Abstract: The full availability of information in financial markets is something that is often
assumed when working with models. However, parameters such as an asset's volatility and rate of
return are not known and need to be estimated from past data. In this regard, the optimization
of expected utility of wealth over a set of admissible trading strategies becomes a filtering problem,
wherein the investor must use the filtration generated by past events to make the optimal decision for
future returns. It turns out that this non-Markovian problem can be Markovianized once the dynamics
of the filter are determined, but this Markovianized problem requires optimization over an infinite
dimensional field. However, there is a class of perturbation models for which the Markovianized problem
is well-approximated by an unperturbed finite dimensional problem. This approximation to the
perturbed problem is analyzed, and there is found to be an information premium in the market.
- Speaker: Ivan Corwin, Clay Mathematics Institute, Department of Mathematics, MIT, and Microsoft Research, Thursday 11 Oct 2012
Title: Beyond the Gaussian Universality Class
Abstract: The Gaussian central limit theorem says that for a wide class of stochastic systems, the bell curve (Gaussian distribution) describes the statistics for random fluctuations of important observables. In this talk I will look beyond this class of systems to a collection of probabilistic models which include random growth models, polymers, particle systems, matrices and stochastic PDEs, as well as certain asymptotic problems in combinatorics and representation theory. I will explain in what ways these different examples all fall into a single new universality class with a much richer mathematical structure than that of the Gaussian.
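One concrete member of this universality class is last-passage percolation with exponential weights (the corner growth model), whose passage-time fluctuations follow the Tracy-Widom law rather than the bell curve. The toy simulation below (sizes and replication counts chosen arbitrarily) exhibits the telltale nonzero skewness:

    import numpy as np

    # Last-passage percolation: G(i,j) = w(i,j) + max(G(i-1,j), G(i,j-1)).
    # The fluctuations of G(N,N) are Tracy-Widom, hence visibly skewed,
    # unlike Gaussian fluctuations.
    rng = np.random.default_rng(1)
    N, reps = 64, 500
    samples = np.empty(reps)
    for rep in range(reps):
        w = rng.exponential(1.0, size=(N, N))
        G = np.zeros((N, N))
        for i in range(N):
            for j in range(N):
                prev = max(G[i-1, j] if i else 0.0, G[i, j-1] if j else 0.0)
                G[i, j] = prev + w[i, j]
        samples[rep] = G[-1, -1]

    z = (samples - samples.mean()) / samples.std()
    print("sample skewness:", np.mean(z**3))   # positive skew, unlike a Gaussian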
- Speaker: Philippe Rigollet, Department of Operations Research and Financial Engineering, Princeton University, Thursday 25 Oct 2012
Title: Deviation optimal model selection using greedy algorithms
Abstract: A statistical problem of model selection for regression can be
simply described as a stochastic optimization problem where the objective is
quadratic and the domain finite or countable.
To solve this problem it is now known that, contrary to the principle of
empirical risk minimization, one should seek a solution in the convex hull
of the domain. This idea is implemented by exponential weights that are
known to solve the problem in expectation, but they are, surprisingly,
sub-optimal in deviation. We propose a new formulation called Q-aggregation
that consists in minimizing a penalized version of the original criterion
but for which the penalty vanishes at the points of interest. This approach
leads to efficient greedy algorithms in the spirit of Frank-Wolfe but for
which stronger bounds can be derived.
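As a reminder of the exponential-weights baseline the abstract takes as its starting point, here is a toy aggregation over a finite dictionary (data, dictionary, and temperature are all illustrative; the Q-aggregation penalty itself is not implemented):

    import numpy as np

    # Exponential-weights aggregation: weight each candidate model by its
    # exponentiated negative empirical risk and predict with the convex
    # combination, instead of selecting the single empirical risk minimizer.
    rng = np.random.default_rng(2)
    n, sigma = 200, 0.5
    x = np.linspace(0, 1, n)
    y = np.sin(2 * np.pi * x) + sigma * rng.standard_normal(n)

    dictionary = [np.sin(2 * np.pi * x), np.cos(2 * np.pi * x),
                  x, x**2, np.ones(n)]                     # candidate models
    risks = np.array([np.mean((y - f) ** 2) for f in dictionary])

    beta = 4 * sigma**2 / n                # temperature (illustrative choice)
    w = np.exp(-(risks - risks.min()) / beta)
    w /= w.sum()
    f_hat = sum(wi * fi for wi, fi in zip(w, dictionary))  # point in the convex hull
    print("weights:", np.round(w, 3))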
- Speaker: Lee Jones, UMass Lowell, Thursday 1 November 2012
Title: Order statistics probability rates and some new results for
statistical inference from transactional data in queuing systems
Abstract: Efficient algorithms were initially developed for
computing the probability that the order statistics of n i.i.d. uniform random variables lie in a given n-dimensional
rectangular region in order to calculate the cumulative distribution of the Kolmogorov statistic. These
algorithms were rediscovered and used to find expected queue length (and other queue performance measures)
in a queuing system from the set of recorded start/stop service data in a time interval in the
interior of which each server who became free was immediately
reengaged by a waiting customer. With most practical data there are
time gaps between the recorded service completion and the recorded
start of service with a waiting customer. These may be due to customer
delay in engaging a free server, to server delay in availability to
the next in queue, or to both. We propose models for the various
delays. By generalizing the order statistics probability
computational problem and developing feasible algorithms for its
solution we can give confidence intervals for queue performance
measures for practical transactional data.
- Speaker: Bud Mishra, The Courant Institute of Mathematical Sciences, NYU, Thursday 8 November 2012---CANCELLED
Title: Towards Cancer Hybrid Automata
Abstract: Recently, we introduced Cancer Hallmark Automata, a formalism to model the progression of cancers through discrete phenotypes (so-called “hallmarks”). The classification of various cancers using stages and hallmarks has become common in the biology literature, but primarily as an organizing principle, and not as an executable formalism. The precise computational model developed here aims to exploit this untapped potential, namely, through automatic verification of progression models (e.g., consistency, causal connections, etc.), classification of unreachable or unstable states (e.g., “anti-hallmarks”) and computer-generated (individualized or universal) therapy plans. This talk builds on a phenomenological approach, and as such does not need to model the biochemistry underlying the progression. Rather, it abstractly models transition timings between hallmarks as well as the effects of drugs and clinical tests, and thus allows formalization of temporal statements about the progression as well as notions of timed therapies. The model proposed here is ultimately based on hybrid automata (with multiple clocks), for which relevant verification and planning algorithms exist in the literature.
"Towards Cancer Hybrid Automata," (with L. Olde Loohuis and A. Witzel), First International Workshop on Hybrid Systems and Biology: HSB 2012, Newcastle upon Tyne, UK, September 3, 2012.
- Speaker: Samuel Kou, Department of Statistics, Harvard University, Thursday 15 November 2012
Title: Optimal Shrinkage Estimation in Heteroscedastic Hierarchical Models
Abstract: Hierarchical models are powerful statistical tools widely used in scientific and engineering applications. The homoscedastic (equal variance) case has been extensively studied, and it is well known that shrinkage estimates, the James-Stein estimate in particular, have nice theoretical (e.g., risk) properties. The heteroscedastic (unequal variance) case, on the other hand, has received less attention, even though it frequently appears in real applications. It is not clear how to construct an "optimal" shrinkage estimate. In this talk, we study this problem. We introduce a class of shrinkage estimates, inspired by Stein's unbiased risk estimate. We will show that this class is asymptotically optimal in the heteroscedastic case. We apply the estimates to real examples and observe excellent numerical results.
This talk is based on joint work with Lawrence Brown and Xianchao Xie.
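For orientation, the homoscedastic James-Stein estimate recalled in the abstract can be checked in a few lines (the heteroscedastic SURE-type estimators from the talk are not reproduced here):

    import numpy as np

    # James-Stein shrinkage for X ~ N(theta, I_p), p >= 3: shrinking toward
    # the origin uniformly beats the MLE in total squared-error risk.
    rng = np.random.default_rng(3)
    p, reps = 10, 20000
    theta = np.linspace(-1, 1, p)                  # arbitrary true means
    X = theta + rng.standard_normal((reps, p))     # unit-variance observations

    norm2 = np.sum(X**2, axis=1, keepdims=True)
    js = (1 - (p - 2) / norm2) * X                 # James-Stein estimate

    risk_mle = np.mean(np.sum((X - theta) ** 2, axis=1))
    risk_js = np.mean(np.sum((js - theta) ** 2, axis=1))
    print(f"MLE risk ~ {risk_mle:.2f}, James-Stein risk ~ {risk_js:.2f}")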
- Speaker: Clayton Scott, Department of Electrical Engineering and Computer Science, University of Michigan, Thursday 29 November 2012
Title: Classification with Asymmetric Label Noise
Abstract: In many real-world classification problems, the labels of training examples are randomly corrupted. That is, the set of training examples for each class is contaminated by examples of the other class. Existing approaches to this problem assume that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. We introduce a general framework for classification with label noise that eliminates these assumptions. In particular, we identify necessary and sufficient distributional assumptions for the existence of a consistent estimator of the optimal risk, with associated estimation strategies. We find that learning in the presence of label noise is possible even when the class-conditional distributions overlap and the label noise is not symmetric. A key to our approach is a universally consistent estimator of the maximal proportion of one distribution that is present in another, or equivalently, of the so-called "separation distance" between two distributions. The methodology is motivated by a problem in nuclear particle classification.
- Speaker: Erhan Bayraktar, Department of Mathematics, University of Michigan, Thursday 6 December 2012
Title: Quickest Search over Brownian Channels
Abstract: In this paper we resolve an open problem proposed by Lai, Poor, Xin, and Georgiadis (2011, IEEE Transactions on Information Theory). Consider a sequence of Brownian Motions with unknown drift equal to one or zero, which may be observed one at a time. We give a procedure for finding, as quickly as possible, a process which is a Brownian Motion with nonzero drift. This original quickest search problem, in which the filtration itself is dependent on the observation strategy, is reduced to a single filtration impulse control and optimal stopping problem, which is in turn reduced to an optimal stopping problem for a reflected diffusion, which can be explicitly solved.
Joint work with Ross Kravitz.
Schedule for Spring 2013:
For the Spring of 2013, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Ioannis Karatzas, Department of Mathematics and Department of Statistics, Columbia University, Thursday 21 February 2013
Title: DIFFUSIONS WITH RANK-BASED CHARACTERISTICS
Abstract: Imagine you run two Brownian-like particles on the real line. At any given time,
you assign drift g and dispersion \sigma to the laggard; and you assign drift -h and
dispersion \rho to the leader. Here g, h, \rho and \sigma are given nonnegative constants
with \rho^2 + \sigma^2 = 1 and g + h > 0.
Is the martingale problem for the resulting infinitesimal generator
\[
\mathcal{L} \,=\,
\mathbf{1}_{\{x_1 > x_2\}} \left( \frac{\rho^2}{2} \frac{\partial^2}{\partial x_1^2} + \frac{\sigma^2}{2} \frac{\partial^2}{\partial x_2^2} - h \frac{\partial}{\partial x_1} + g \frac{\partial}{\partial x_2} \right)
+ \mathbf{1}_{\{x_1 \le x_2\}} \left( \frac{\sigma^2}{2} \frac{\partial^2}{\partial x_1^2} + \frac{\rho^2}{2} \frac{\partial^2}{\partial x_2^2} + g \frac{\partial}{\partial x_1} - h \frac{\partial}{\partial x_2} \right)
\]
well-posed? If so, what is the probabilistic structure of the resulting two-dimensional
diffusion process? What are its transition probabilities? What does it look like when
time is reversed? Questions like these arise in the context of systems of diffusions
interacting through their ranks; see, for instance, [1], [6], [8]. They become a lot more
interesting, if one poses them for several particles instead of just two.
The construction we carry out involves features of Brownian motion with "bang-bang"
drift [7], as well as of "skew Brownian motion" [4], [2]. Surprises are in store
when one sets up a system of stochastic differential equations for this planar diffusion
and then tries to decide questions of strength and/or weakness (cf. [2] for a one-dimensional
analogue); also when one looks at the time-reversal of the diffusion.
There are also very strong connections with the recent work [9] on the so-called
"perturbed Tanaka equations".
I'll try to explain what we know about all this, then pose a few open questions.
(This talk covers joint work with E. Robert Fernholz, Tomoyuki Ichiba, Vilmos Prokaj
and Mykhaylo Shkolnikov.)
- Speaker: Ping Li, Department of Statistical Science, Cornell University, joint seminar with the Hariri Institute, Thursday 7 March 2013
Title: Exact Sparse Recovery with L0 Projections
Abstract: Many applications concern sparse signals, for example,
detecting anomalies from the differences between consecutive images
taken by surveillance cameras. In general, anomaly events are
sparse. This talk focuses on the problem of recovering a K-sparse
signal in N dimensions (coordinates). Classical theories in
compressed sensing say the required number of measurements is M = O(K
log N). In our most recent work on L0 projections, we show that an
idealized algorithm needs about M = 5K measurements, regardless of N.
In particular, 3 measurements suffice when K = 2 nonzeros.
Practically, our method is very fast, accurate, and very robust
against measurement noise. Even without sufficient
measurements, the algorithm can still accurately reconstruct a
significant portion of the nonzero coordinates, without catastrophic
failures (unlike popular methods such as linear programming). This
is joint work with Cun-Hui Zhang at Rutgers University. Paper URL:
http://stat.cornell.edu/~li/Stable0CS/Stable0CS.pdf
- Speaker: Herold Dehling, Department of Mathematics, Ruhr-Universität Bochum, Thursday 21 March 2013
Title: Empirical Process CLT for Markov Chains and Dynamical Systems
Abstract: In our talk we present some recent developments concerning the empirical process central limit theorem for dependent data that do not satisfy any of the classical mixing conditions. Our results are applicable, e.g., to Markov chains and certain dynamical systems. As a special example, we can prove the empirical process CLT for ergodic torus automorphisms. (Joint work with Olivier Durieu, Marco Tusche and Dalibor Volny)
- Speaker: Luke W. Miratrix, Department of Statistics, Harvard University, Thursday 28 March 2013
Title: An introspection on using sparse regression techniques to analyze text
Abstract: In this talk, I propose a general framework for topic-specific
summarization of large text corpora, and illustrate how it can be used for
analysis in two quite different contexts: legal decisions on workers'
compensation claims (to understand relevant case law) and an OSHA database
of occupation-related accident reports (to search for high risk
circumstances). Our summarization framework, built on sparse
classification methods, is a lightweight and flexible tool that offers a
compromise between simple word frequency based methods currently in wide
use, and more heavyweight, model-intensive methods such as Latent
Dirichlet Allocation (LDA). For a particular topic of interest (e.g.,
emotional disability, or chemical gas), we automatically label documents
as being either on- or off-topic, and then use sparse classification
methods to predict these labels with the high-dimensional counts of all
the other words and phrases in the documents. The resulting small set of
phrases found to be predictive is then harvested as the summary. Using a
branch-and-bound approach, this method can be extended to allow for
phrases of arbitrary length, which allows for potentially rich
summarization. I further discuss how a focus on specific aspects of the
corpus and the purpose of the summaries can inform choices of
regularization parameters and constraints on the model. Overall, I argue
that sparse methods have much to offer text analysis, and hope that this
work opens the door for a new branch of research in this important field.
- Speaker: Manfred Denker, Department of Mathematics, Penn State University, Thursday 4 April 2013
Title: Von Mises statistics for a measure preserving transformation.
Abstract: Let $T$ be a measure preserving transformation on a probability space. I will present three theorems on the almost sure and weak convergence of sums of the form
$$ \sum_{0 \le i_k \le n,\; k = 1, \ldots, d} h(T^{i_1}x, \ldots, T^{i_d}x), $$
where $h$ is a kernel of $d$ variables.
- Speaker: Hongzhe Li, Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Thursday 11 April 2013
Title: Robust Segment Identification in Next-Generation Sequencing Data
Abstract: Copy number variants (CNVs) are alterations of the DNA of a
genome that result in the cell having fewer or more than two
copies of segments of the DNA. CNVs correspond to relatively
large regions of the genome, ranging from about one kilobase
to several megabases, that are deleted or duplicated. Motivated
by CNV analysis based on next generation sequencing
data, we consider the problem of detecting and identifying
sparse short segments hidden in a long linear sequence of
data with an unspecified noise distribution. We propose a
computationally efficient method that provides a robust and
near-optimal solution for segment identification over a wide
range of noise distributions. We theoretically quantify the
conditions for detecting the segment signals and show that the
method near-optimally estimates the signal segments whenever
it is possible to detect their existence. Simulation studies
are carried out to demonstrate the efficiency of the method
under different noise distributions. We present results from
a CNV analysis of a HapMap Yoruban sample to further illustrate
the theory and the methods.
- Speaker: Tanya Berger-Wolf, Department of Computer Science, University of Illinois, Thursday 18 April 2013
Title: Analysis of Dynamic Interaction Networks
Abstract: From gene interactions and brain activity to high school friendships and zebras grazing
together, large, noisy, and highly dynamic networks of interactions are everywhere.
Unfortunately, in this domain, our ability to analyze data lags substantially behind our
ability to collect it. In this talk I will show how computational approaches can be part
of every stage of the scientific process of understanding how entities interact, from
data collection (by using our network sampling framework which results in representative
samples for many network problems) to hypothesis formulation (using unique clustering and
pattern discovery methods), leading to novel scientific insights.
- Speaker: Kavita Ramanan, Division of Applied Mathematics, Brown University, Thursday 25 April 2013
Title: Asymptotic analysis of a class of stochastic networks
Abstract: Finite-dimensional diffusions have been successfully used as tractable approximations to gain insight into a certain class of queueing systems. On the other hand, we show that many classes of queueing systems, including many-server queues with general service distributions, are more naturally modeled by measure-valued processes. We describe asymptotic limit theorems for these measure-valued processes and describe the insight they provide into the performance of the original networks.
Schedule for Fall 2013:
For the Fall of 2013, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Evan Johnson, Computational Biomedicine, Boston University, Thursday 12 September 2013
Title: Adaptive factor analysis models for assessing drug sensitivity and pathway activation in individual patient samples
Abstract: The development of personalized treatment regimes is an active area of current research in genomics. The focus of our research is to investigate core biological components that contribute to disease prognosis and development, and to develop latent variable models to accurately determine optimal therapeutic regimens for individual patients. To accomplish this aim, we have developed an adaptive Bayesian factor analysis model that integrates in vitro experimental data into our models while still allowing for the refinement and adaptation of drug or pathway profiles within each patient cohort and individual, efficiently accounting for cell-type specific pathway differences or any “rewiring” due to cancer deregulation. Our modeling approach serves an essential role in our attempts to develop a comprehensive and integrated set of relevant, biologically interpretable computational tools for genomic studies in personalized medicine. We are currently working on a variety of applications using data from cancer and pulmonary disease with the potential to be extremely important in treating patients with these diseases.
- Speaker: Jiashun Jin, Department of Statistics, Carnegie Mellon University, Thursday 26 September 2013
Title: Fast Network Community Detection by SCORE
Abstract: Consider a network where the nodes split into
K different communities. The community labels for the nodes are unknown and it is of major interest to estimate
them (i.e., community detection). The Degree Corrected Block Model (DCBM) is a popular
network model. How to detect communities with the DCBM is an interesting problem,
where the main challenge lies in the degree heterogeneity.
We propose Spectral Clustering On Ratios-of-Eigenvectors (SCORE) as a new
approach to community detection. Compared to existing spectral methods, the main
innovation is to use the entry-wise ratios between the first few leading eigenvectors
for community detection. The central surprise is that the effect of degree heterogeneity
is largely ancillary, and can be effectively removed by taking such entry-wise ratios.
We have applied SCORE to the well-known web blogs data and the statistics
co-author network data which we have collected very recently. We find that SCORE
is competitive both in computation and in performance. On top of that, SCORE is
conceptually simple and has the potential for extensions in various directions. Additionally,
we have identified several interesting communities of statisticians, including
what we call the "Object Bayesian community", "Theoretic Machine Learning Community",
and the "Dimension Reduction Community".
We develop a theoretical framework where we show that under mild regularity
conditions, SCORE stably yields consistent community detection. At the core of the
analysis is the recent development in Random Matrix Theory (RMT), where the
matrix-form Bernstein inequality is especially helpful.
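The core of SCORE is short enough to sketch. On a toy two-community DCBM (all parameters invented for illustration), the entry-wise ratios of the second leading eigenvector to the first cancel the degree parameters, after which a one-dimensional clustering of the ratios recovers the communities (the method uses k-means on the ratio matrix for general K):

    import numpy as np

    # SCORE sketch: leading eigenvectors of the adjacency matrix, entry-wise
    # ratios xi_2/xi_1 to remove degree heterogeneity, then cluster the ratios.
    rng = np.random.default_rng(4)
    n = 400
    labels = np.repeat([0, 1], n // 2)
    theta = rng.uniform(0.3, 1.5, n)               # degree heterogeneity
    P = np.where(labels[:, None] == labels[None, :], 0.9, 0.2)
    prob = np.clip(0.2 * np.outer(theta, theta) * P, 0, 1)
    A = (rng.uniform(size=(n, n)) < prob).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                    # symmetric, no self-loops

    vals, vecs = np.linalg.eigh(A)
    xi1, xi2 = vecs[:, -1], vecs[:, -2]            # two leading eigenvectors
    r = xi2 / xi1                                  # SCORE ratios
    guess = (r > np.median(r)).astype(int)         # 1-d clustering for K = 2
    err = min(np.mean(guess != labels), np.mean(guess == labels))
    print(f"misclassification rate ~ {err:.3f}")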
- Speaker: Soumyadip Ghosh, IBM Research, Thursday 3 October 2013
Title: Optimal Sampling in Stochastic Recursions
Abstract: We refer to classical iterative algorithms such as quasi-Newton recursions, trust-region methods, and fixed-point recursions as "Stochastic" recursions when they involve quantities (functions, their gradients, Hessians etc.) that can only be estimated using a simulation oracle. The primary motivating settings are the Stochastic Root Finding problem that seeks the zero for a simulation-estimated function, and the closely related Simulation Optimization problem that seeks a minimum. The estimation quality of the simulation oracle depends on the effort expended in the simulation: in a typical scenario where a Central Limit Theorem applies, estimation error drops to zero at the canonical $\sqrt{n}$ rate with sample size $n$. We address the central question that arises in the practical context where the primary computational burden in the stochastic recursion is the Monte Carlo sampling procedure: how should sampling proceed within stochastic recursion iterates in order to ensure that the identified candidate solutions remain consistent to the true solution, and more importantly, when can we ensure that sampling is efficient, that is, converges at the fastest possible rate. The answer involves a trade-off between the two types of error inherent in the iterates: the deterministic error due to the recursion algorithm and the "stochastic" component due to sampling. We characterize the relationship between sample sizing and convergence rates, and demonstrate that consistency and efficiency are intimately coupled with the speed of the underlying recursion, with faster algorithms yielding a wider regime of "optimal" sampling rates.
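A toy version of the sampling-within-recursion trade-off (problem, schedule, and constants all invented for illustration): each Newton-type iterate uses Monte Carlo estimates of the function and its derivative, with the per-iteration sample size grown geometrically so that sampling error contracts along with the algorithmic error:

    import numpy as np

    # Stochastic root finding: solve g(x) = E[max(x - Z, 0)] - c = 0 with
    # Z ~ Exp(1), i.e., x - 1 + exp(-x) = c for x >= 0.  Each iteration uses
    # sample averages of g(x) and g'(x) = P(Z < x), with the sample size
    # grown geometrically across iterations.
    rng = np.random.default_rng(5)
    c, x = 0.3, 1.0                  # target level and starting point
    n = 100                          # initial per-iteration sample size
    for k in range(12):
        z = rng.exponential(1.0, size=n)
        g_hat = np.mean(np.maximum(x - z, 0.0)) - c
        dg_hat = max(np.mean(z < x), 1e-3)   # guard against a tiny derivative
        x -= g_hat / dg_hat                  # Newton-type step
        n = int(1.5 * n)                     # geometric sample-size growth
    print(f"estimate {x:.4f} vs true root ~ 0.8889")  # solves x + exp(-x) = 1.3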
- Speaker: Stephan Sturm, Department of Mathematical Sciences, Worcester Polytechnic Institute, Thursday 10 October 2013
Title: From Smile Wings to Market Risk Measures
Abstract: The left tail of the implied volatility skew, coming from quotes on out-of-the-money put options, can be thought to reflect the market's assessment of the risk of a huge drop in stock prices. We analyze how this market information can be integrated into the theoretical framework of convex monetary measures of risk. In particular, we make use of indifference pricing by dynamic convex risk measures, which are given as solutions of backward stochastic differential equations (BSDEs), to establish a link between these two approaches to risk measurement. We derive a characterization of the implied volatility in terms of the solution of a nonlinear PDE and provide a small time-to-maturity expansion. This procedure allows one to choose convex risk measures in a conveniently parametrized class, distorted entropic dynamic risk measures, such that the asymptotic volatility skew under indifference pricing can be matched with the market skew. This is joint work with Ronnie Sircar.
- Speaker: Yu Gu, Department of Applied Mathematics and Physics, Columbia University, Thursday 17 October 2013
Title: Weak Convergence Approach to a Parabolic Equation with Large Random Potential
Abstract: Solutions to partial differential equations with highly oscillatory, large random potential have been shown to converge either to homogenized, deterministic limits or to stochastic limits depending on the statistical properties of the potential. We obtain the convergence rate in the homogenization setting. The derivations are based on a
Feynman-Kac representation, an invariance principle for Brownian motion in random scenery, and a quantitative version of the martingale CLT. Joint work with Guillaume Bal.
- Speaker: Marvin K. Nakayama, Computer Science Department, New Jersey Institute of Technology, Thursday 24 October 2013
Title: Efficient Simulation of Risk and its Error: Confidence Intervals for Quantiles When Using Variance-Reduction Techniques
Abstract: The p-quantile of a continuous random variable is the constant for which exactly p of the mass of its distribution lies to the left of the quantile; e.g., the median is the 0.5-quantile. Quantiles are widely used to assess risk. For example, a project manager may want to determine a time T such that the project has a 95% chance of completing by T, which is the 0.95-quantile. In finance, where a quantile is known as a value-at-risk, analysts frequently measure risk with the 0.99-quantile of a portfolio’s loss. For complex stochastic models, analytically computing a quantile often is not possible, so simulation is employed. In addition to providing a point estimate for a quantile, we also want to measure the simulation estimate's error, and this is typically done by giving a confidence interval (CI) for the quantile. Indeed, the U.S. Nuclear Regulatory Commission requires that licensees of nuclear power plants demonstrate compliance using a “95/95 criterion,” which entails ensuring (with 95% confidence) that a 0.95-quantile lies below a mandated limit.
In this talk we present some methods for constructing CIs for a quantile estimated via simulation. Unfortunately, crude Monte Carlo often produces wide CIs, so analysts often apply variance-reduction techniques (VRTs) in simulations to decrease the error. The first approach forms a CI using a finite difference; the second applies a procedure known as sectioning, which is closely related to batching. The asymptotic validity of both CIs follows from a so-called Bahadur representation, which shows that a quantile estimator can be approximated by a linear transformation of a probability estimator. We have established Bahadur representations for a broad class of VRTs, including antithetic variates, control variates, replicated Latin hypercube sampling, and importance sampling. We present some empirical results comparing the different CIs.
This work is supported by NSF grants CMMI-0926949, CMMI-1200065, and DMS-1331010.
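The sectioning construction mentioned above is simple to sketch (distribution and sizes illustrative): split the outputs into b sections, estimate the quantile within each section, and build a t-interval around the overall estimate:

    import numpy as np
    from scipy import stats

    # Sectioning CI for a p-quantile: section estimates are centered at the
    # overall quantile estimate (this is what distinguishes sectioning from
    # plain batching) and a Student-t interval is formed.
    rng = np.random.default_rng(6)
    n, b, p = 10000, 10, 0.95
    x = rng.lognormal(size=n)              # stand-in for simulation output

    q_all = np.quantile(x, p)              # point estimate from all the data
    q_sec = np.quantile(x.reshape(b, -1), p, axis=1)
    s = np.sqrt(np.sum((q_sec - q_all) ** 2) / (b - 1))
    half = stats.t.ppf(0.975, b - 1) * s / np.sqrt(b)
    print(f"95% CI for the {p}-quantile: {q_all:.3f} +/- {half:.3f}")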
- Speaker: Luke Bornn, Department of Statistics, Harvard University, Thursday 31 October 2013
Title: Towards the Derandomization of Markov chain Monte Carlo for Bayesian Inference
Abstract: In this talk, I will explore the current trend towards conducting
Bayesian inference through Markov chain Monte Carlo (MCMC)
algorithms which exhibit convergence at a rate faster than $n^{-1/2}$
by derandomizing components of the algorithm. For instance, herded
Gibbs sampling (Bornn et al., 2013) can be shown to exhibit
convergence in certain settings at a $n^{-1}$ rate. These
algorithms exhibit remarkable similarity to existing MCMC
algorithms; as an example, herded Gibbs sampling is equivalent to
the Wang-Landau algorithm with various specified tuning parameters,
and with the random sampling replaced with an argmax step. We
demonstrate that many such MCMC algorithms lie in a middle-ground
between vanilla Gibbs samplers and deterministic algorithms by using
clever auxiliary variable schemes to induce both negatively
correlated samples as well as force exploration of the parameter
space. Based on this observation, we propose several new algorithms
which exploit elements of both MCMC and deterministic algorithms to
improve exploration and convergence.
- Speaker: Peter I. Frazier, School of Operations Research
and Information Engineering, Cornell University, NOTE: Friday 8 November 2013 (Joint seminar with CISE), 3:00 PM to 4:00 PM,
8 St. Mary's Street, Room 211. Refreshments served at 2:45.
Title: Bayesian Methods for Simulation Optimization
Abstract: We consider simulation optimization, in which we wish to solve an
optimization problem whose objective function can only be evaluated using
stochastic simulation. When the simulator is large and time-consuming, the
time to solve a simulation optimization problem is gated by the number of
simulation replications required. One increasingly popular approach to
algorithm development for such problems is to place a Bayesian prior
distribution on the underlying objective function, and to value potential
function evaluations, or collections of function evaluations, according to
the probability distribution of the improvement they would provide. We
provide an overview of this class of algorithms, discussing links to
decision theory and Markov decision processes, and present an application
to the design of cardiovascular bypass grafts.
- Speaker: Benjamin Kedem, Department of Mathematics, University of Maryland, College Park, Thursday 14 November 2013
Title: Estimation of Small Tail Probabilities in Food Safety and Bio-Surveillance
Abstract: In food safety and bio-surveillance, it is often desired to estimate the probability that a contaminant such as an insecticide or pesticide
exceeds very high unsafe thresholds. The probability or chance in question is then very small. To estimate such a probability we need information about large values. However, in many cases the data do not contain information about exceedingly large contamination levels, which ostensibly makes the problem impossible to solve. A solution is provided whereby more information about small tail probabilities is obtained by combining the real data with computer generated data. The method provides short but reliable interval estimates from moderately large samples. Examples are given in terms of DDT derivatives and chlorpyrifos found in fish, mussel, and sediments, and in terms of mercury levels obtained from males and females of all ages from 1 to 150 years.
- Speaker: David F. Anderson, Department of Mathematics, University of Wisconsin-Madison, Thursday 21 November 2013
Title: Stochastic analysis of biochemical reaction networks with absolute concentration robustness
Abstract: It has recently been shown that structural conditions on the reaction network, rather than a fine-tuning of system parameters, often suffice to impart "absolute concentration robustness" on a wide class of biologically relevant, deterministically modeled mass-action systems [Shinar and Feinberg, Science, 2010]. Many biochemical networks, however, operate on a scale insufficient to justify the assumptions of the deterministic mass-action model, which raises the question of whether the long-term dynamics of the systems are being accurately captured when the deterministic model predicts stability. I will discuss recent results that show that fundamentally different conclusions about the long-term behavior of such systems are reached if the systems are instead modeled with stochastic dynamics and a discrete state space. Specifically we characterize a large class of models which exhibit convergence to a positive robust equilibrium in the deterministic setting, whereas trajectories of the corresponding stochastic models are necessarily absorbed by a set of states that reside on the boundary of the state space (i.e. an extinction event). The results are proved with a combination of methods from stochastic processes and chemical reaction network theory.
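The canonical two-species example from this literature makes the contrast concrete: in the network A + B -> 2B, B -> A, the deterministic model holds A robustly at beta/alpha, while every stochastic trajectory is eventually absorbed with B extinct. A short Gillespie simulation (rate constants and copy numbers invented for illustration) shows the absorption:

    import numpy as np

    # Gillespie (SSA) simulation of A + B -> 2B (rate alpha) and B -> A
    # (rate beta).  Deterministically A is robust at beta/alpha = 100;
    # stochastically the chain is absorbed at B = 0.
    rng = np.random.default_rng(7)
    alpha, beta = 0.01, 1.0          # illustrative rate constants
    a, b = 50, 50                    # initial copy numbers (a + b conserved)
    t = 0.0
    while b > 0:
        r1, r2 = alpha * a * b, beta * b      # reaction propensities
        total = r1 + r2
        t += rng.exponential(1.0 / total)     # time to the next reaction
        if rng.uniform() < r1 / total:
            a, b = a - 1, b + 1               # A + B -> 2B
        else:
            a, b = a + 1, b - 1               # B -> A
    print(f"absorbed at t = {t:.1f} with A = {a}, B = {b}")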
- Speaker: Matthew T. Harrison, Division of Applied Mathematics, Brown University, Thursday 5 December 2013
Title: Robust inference for nonstationary spike trains
Abstract: The coordinated spiking activity of simultaneously recorded neurons
can reveal clues about the dynamics of neural information processing,
about the mechanisms of brain disorders, and about the underlying
anatomical microcircuitry. Statistical models and methods play an
important role in these investigations. In cases where the scientific
questions require disambiguating dependencies across multiple spatial
and temporal scales, conditional inference can be used to create
procedures that are strikingly robust to nonstationarity, model
misspecification, and incidental parameters problems, which are common
neurostatistical challenges. Examples include testing for cell
assembly dynamics in human epilepsy data and learning putative
anatomical networks from spike train data in behaving rodents.
Schedule for Spring 2014:
For the Spring of 2014, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Mark van der Laan, Biostatistics and Statistics, UC Berkeley, Thursday 20 February 2014
Title: Targeted Learning of Optimal Individualized Treatment Rules
Abstract: Suppose we observe n independent and identically distributed observations of a
time-dependent random variable consisting of baseline covariates, initial treatment and
censoring indicator, intermediate covariates, subsequent treatment and censoring
indicator, and a final outcome. For example, this could be data generated by a
sequentially randomized controlled trial, where subjects are sequentially randomized to a
first line and second line treatment, possibly assigned in response to an intermediate
biomarker, and are subject to right-censoring. We consider data adaptive estimation of an
optimal dynamic multiple time-point treatment rule defined as the rule that maximizes the
mean outcome under the dynamic treatment, where the candidate rules are restricted to
only respond to a user-supplied subset of the baseline and intermediate covariates. This
estimation problem is addressed in a statistical model for the data distribution that is
nonparametric beyond possible knowledge about the treatment and censoring mechanism. In
addition, we provide a targeted minimum loss-based estimator of the mean outcome under
the optimal rule, with corresponding statistical inference. Both estimation problems
addressed contrast with the current literature, which relies on parametric assumptions.
We also present cross-validated TMLE estimators of data adaptive target parameters such
as the mean outcome under a data adaptive fit of the optimal rule.
Practical performance of the methods is demonstrated with some simulations.
- Speaker: Jeremy Achin, DataRobot, Thursday 27 February 2014
Title: Applied Data Science: Extracting Maximum Value from Real-World Data
Abstract: This talk is about extracting maximum value from real-world data using modern statistical and machine learning techniques. Real-world data is diverse, messy, and spread out across many data sources. Extracting maximum value equates to using the data to make the most accurate predictions possible on out-of-sample examples.
The talk will focus on a single case study in which we predict diabetes in undiagnosed patients using their medical records. The dataset comes from a Kaggle competition sponsored by Practice Fusion: http://www.kaggle.com/c/pf2012-diabetes.
- Speaker: Liming Feng, University of Illinois at Urbana-Champaign, Department of Industrial and Enterprise Systems Engineering, Thursday 6 March 2014
Title: Hilbert Transform and Options Valuation
Abstract: Transform methods have been widely used for options valuation in models with explicit characteristic functions. We explore the analyticity of the characteristic functions and propose Hilbert transform based schemes for the valuation of European, American and path dependent options and Monte Carlo simulation from analytic characteristic functions. The schemes are based on sinc expansions of functions analytic in a horizontal strip in the complex plane. They are very easy to implement. Despite the simplicity, they are very accurate, with exponentially decaying errors. Numerical examples illustrate the effectiveness of these schemes.
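As a small reminder of how option-style probabilities fall out of a characteristic function (this is plain Gil-Pelaez inversion used as a sanity check under Black-Scholes, not the sinc/Hilbert-transform schemes of the talk):

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    # Gil-Pelaez inversion: P(ln S_T > ln K) from the characteristic function
    # of ln S_T, checked against the Black-Scholes closed form N(d2).
    S0, K, r, sigma, T = 100.0, 110.0, 0.02, 0.25, 1.0
    mu = np.log(S0) + (r - 0.5 * sigma**2) * T

    def phi(u):                            # characteristic function of ln S_T
        return np.exp(1j * u * mu - 0.5 * sigma**2 * T * u**2)

    integrand = lambda u: (np.exp(-1j * u * np.log(K)) * phi(u) / (1j * u)).real
    p = 0.5 + quad(integrand, 1e-12, 50)[0] / np.pi

    d2 = (np.log(S0 / K) + (r - 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    print(f"Gil-Pelaez: {p:.6f}   closed form: {norm.cdf(d2):.6f}")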
- Speaker: Lie Wang, Department of Mathematics, MIT, Tuesday 18 March 2014. NOTE: UNUSUAL DAY!
Title: Multivariate Regression with Calibration
Abstract: We propose a new method named calibrated multivariate
regression (CMR) for fitting high dimensional multivariate regression
models. Compared to existing methods, CMR calibrates the
regularization for each regression
task with respect to its noise level so that it is simultaneously
tuning insensitive and achieves an improved finite sample performance.
We also develop an efficient smoothed proximal gradient algorithm to
implement it. Theoretically, it is proved
that CMR achieves the optimal rate of convergence in parameter
estimation. We illustrate the usefulness of CMR by thorough numerical
simulations and show that CMR consistently outperforms existing
multivariate regression methods. We also apply CMR on a brain activity
prediction problem and find that CMR even outperforms the handcrafted
models created by human experts.
- Speaker: Xinyun Chen, Applied Mathematics and Statistics, Stony Brook University, Thursday 20 March 2014
Title: Perfect sampling and gradient simulation of Queueing Networks
Abstract: Perfect sampling is a Monte Carlo technique to generate samples from the stationary distribution of Markov processes without any bias. We develop a perfect sampling algorithm for a class of queueing models called stochastic fluid networks, as used in communication network and data processing systems. Our framework can be combined with infinitesimal perturbation analysis to simulate the gradient of the stationary queue length with no bias. Therefore, our perfect sampling algorithm can be used in sensitivity analysis and simulation optimization for resource allocation in the network. In the end, we will discuss the potential extension of our algorithm to reflected Brownian motion and generalized Jackson network.
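Perfect sampling in its simplest form is worth seeing once. The sketch below runs Propp-Wilson coupling from the past for a monotone reflected random walk on {0,...,N} (chain and parameters invented for illustration; the algorithm in the talk targets stochastic fluid networks, which requires considerably more structure):

    import numpy as np

    # Coupling from the past: run chains from the top and bottom states with
    # common random numbers from time -T, doubling T until they coalesce;
    # the coalesced state at time 0 is an exact draw from stationarity.
    rng = np.random.default_rng(8)
    N, p = 20, 0.4

    def step(x, u):                        # monotone update rule
        return min(x + 1, N) if u < p else max(x - 1, 0)

    def cftp():
        T, us = 1, []
        while True:
            # extend further into the past, reusing the old randomness
            us = list(rng.uniform(size=T - len(us))) + us
            lo, hi = 0, N
            for u in us:                   # run from time -T up to 0
                lo, hi = step(lo, u), step(hi, u)
            if lo == hi:                   # all states have coalesced
                return lo
            T *= 2

    draws = [cftp() for _ in range(200)]
    print("mean of exact stationary draws:", np.mean(draws))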
- Speaker: Scott Robertson, Department of Mathematics, Carnegie Mellon University, Thursday 3 April 2014
Title: Continuous Time Perpetuities and the Time Reversal of Diffusions.
Joint work with Kostas Kardaras, LSE.
Abstract: In this talk we consider the problem of obtaining the
distribution of a continuous time perpetuity, where the non-discounted
cash flow rate is determined by an ergodic diffusion. Using results
regarding the time reversal of diffusions, we identify the distribution of
the perpetuity with the invariant measure associated to a certain
(different) ergodic diffusion. This enables efficient estimation of the
distribution via simulation and, in certain instances, an explicit formula
for the distribution. Time permitting, we will talk about how Large
Deviations Principles and results concerning Couplings of diffusions can
be used to estimate rates of convergence, thus providing upper bounds for
how long simulations must be run when obtaining the distribution.
- Speaker: Harrison Zhou, Department of Statistics, Yale University, Thursday 10 April 2014
Title: Asymptotic Normality and Efficiency In Estimation of High-dimensional Graphical Models
Abstract: In this talk we will first introduce an asymptotically normal and efficient result for estimation of high-dimensional Gaussian graphical models under a sparseness assumption, which is shown to be not only sufficient but also necessary, and then present some preliminary analogous results for the Ising model.
- Speaker: Ryan Adams, School of Engineering and Applied Science, Harvard University, Thursday 17 April 2014
Title: Accelerating Exact MCMC with Subsets of Data
Abstract: One of the challenges of building statistical models for large data
sets is balancing the correctness of inference procedures against
computational realities. In the context of Bayesian procedures, the
pain of such computations has been particularly acute as it has
appeared that algorithms such as Markov chain Monte Carlo necessarily
need to touch all of the data at each iteration in order to arrive at
a correct answer. Several recent proposals have been made to use
subsets (or "minibatches") of data to perform MCMC in ways analogous
to stochastic gradient descent. Unfortunately, these proposals have
only provided approximations, although in some cases it has been
possible to bound the error of the resulting stationary distribution.
In this talk I will discuss two new, complementary algorithms for
using subsets of data to perform faster MCMC. In both cases, these
procedures yield stationary distributions that are exactly the desired
target posterior distribution. The first of these, "Firefly Monte
Carlo", is an auxiliary variable method that uses randomized subsets
of data to achieve valid transition operators, with connections to
recent developments in pseudo-marginal MCMC. The second approach I
will discuss, parallel predictive prefetching, uses subsets of data to
parallelize Markov chain Monte Carlo across multiple cores, while
still leaving the target distribution intact. These methods have both
yielded significant gains in wallclock performance in sampling from
posterior distributions with millions of data points.
- Speaker: Ofer Harel, Department of Statistics, University of Connecticut, Thursday 24 April 2014
Title: Generating multiple imputation from multiple models to reflect missing data mechanism uncertainty: Application to a longitudinal clinical trial.
Abstract: We present a framework for generating multiple imputations for continuous variables when
the missing data are assumed to be nonignorably missing. Imputations are generated from
more than one imputation model in order to incorporate uncertainty regarding the
missing data mechanism. Parameter estimates based on the different imputation models are
combined using rules for nested multiple imputation. Through the use of simulation, we
investigate the impact of missing data mechanism uncertainty on post-imputation
inferences and show that incorporating this uncertainty can increase the coverage of
parameter estimates. We apply our method to a longitudinal clinical trial of low-income
women with depression where nonignorably missing data were a concern. We show that
different assumptions regarding the missing data mechanism can have a substantial impact
on inferences. Our method provides a simple approach for formalizing subjective notions
regarding nonresponse so that they can be easily stated, communicated, and compared.
This is joint work with Juned Siddique and Catherine Crespi.
Schedule for Fall 2014:
For the Fall of 2014, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Jose Blanchet, IEOR, Columbia University, Thursday 11 September 2014
Title: Strong Monte Carlo for Multidimensional SDEs via Rough Path Analysis
Abstract: Underlying there is a multidimensional SDE X(.) driven by Brownian motion. A strongly simulatable approximation to X(.) is a sequence of processes {X_n(.)} which are piecewise constant, with finitely many discontinuities for each n, and such that the uniform norm between X(.) and X_n(.) on the compact set [0,1] is less than 1/n with probability one. The probability one statement is crucial. Strong Monte Carlo approximations have been known basically for one-dimensional diffusions and related processes. We provide the first strongly simulatable approximations for multidimensional SDEs. The construction leverages the theory of rough paths, and novel simulation techniques for times that look into the infinite future of a sequence of information often used to approximate SDEs.
- Speaker: Georgios Tripodis, School of Public Health, Boston University, Thursday 18 September 2014. NOTE: Seminar takes place in MCS B21!!!
Title: Predicting the cognitive status of an aging population
Absract: Cognitive trajectories are characterized by tremendous heterogeneity in rates of change. We utilized a subset of the NACC dataset to estimate cognitive trajectories in order to investigate possible causes of differences in variability among normal controls.
We analyzed data from 298 cases that were free from any cognitive impairment for at least 2 visits from the National Alzheimer Coordinating Center (NACC). 149 cases remained normal for at least 2 visits following our observation period, while 149 cases were diagnosed subsequently with Mild Cognitive Impairment (MCI). For all cases, we consider only time points when their cognitive status was normal. The groups were matched by age, sex, education and total number of visits. We used an innovative statistical method of dynamic factor models developed by the authors on the NACC neuropsychological battery. Based on a large array of test scores (MMSE, logical memory: immediate and delayed, digits backward and forward, animals, vegetables, TRAILS A and B, Boston naming test and WAIS), we estimated one latent composite trajectory for each individual. We then used linear mixed effect models to compare differences between groups in their rate of cognitive decline. We hypothesized that there will be differences in the cognitive trajectory between the two groups during their normal state.
Factor analytic models are limited to cross-sectional datasets, ignoring any longitudinal or dynamic analysis. The latent cognitive index is a weighted average of past and present scores of neuropsychological tests. These weights are a function of the between-subject variability as well as the correlation between tests. Measures that are highly correlated with other measures will get higher weight. Moreover, measures that show increased between-subject variability will receive higher weight. Current factor analytic methods do not use any information from within-subject variability over time. If we do not account for time variability we may over- or under-inflate the weights. Past observations of measures that are stable over time will be discounted. Tests with rates of change that are highly correlated with other tests' rates of change will receive more weight. The estimated cognitive trajectory shows significant differences in the rate of change (p-value=0.0003). The cases that remain in a normal cognitive status show significant improvement over time (estimate=-.06, p-value=0.01), indicating a probable learning effect. The cases that will convert to MCI show no improvement in their cognitive trajectory during the period in which they are assigned normal cognition (estimate=-.003, p-value=0.79). These data suggest that there is a probable learning effect in repeated testing only for those that remain in a normal cognitive status. For the cases that will convert to MCI in the future, there is no improvement in their cognitive trajectory. These differences may be used for a more timely diagnosis of MCI.
- Speaker: Victor de la Pena, Department of Statistics, Columbia University, NOTE: Friday 26 September 2014 (Joint seminar with CISE), 3:00 PM to 4:00 PM, 8 St. Mary's Street, Room 210. Refreshments served at 2:45.
Title: Dependence Measures: A Perspective
Abstract: In recent years there has been increasing interest in the development of
new measures of dependence. In this talk I will provide an overview of some
of these results including work developed using copulas as well as the
distance covariance. Finally, I will introduce a general framework that
includes several of the known dependence measures. (Joint work with Y. Liu
(Google) and T. Zheng (Columbia).)
- Speaker: Neil Shephard, Department of Statistics and Department of Economics, Harvard University, Thursday 2 October 2014
Title: Low Latency Financial Data: Continuous Time Analysis of Fleeting Discrete Price Moves
Abstract: Computer-based automated trading dominates many of the most important financial markets. Extracting information from the order and trading flow from such markets is important for trading at high frequency, for policy, regulation and forensic finance. What is distinctive about this area is that the policy, the regulation, the policing and the trading focus is often on the very short term, frequently over time intervals which may be much less than a second. At very short time scales, for most important markets, such low latency data is dominated by three essential aspects: (i) prices are crucially discrete, due to the market's tick structure, (ii) prices change in continuous time, (iii) a high proportion of price changes are fleeting, reversed in a fraction of a second. But the econometrician's cupboard is practically bare, for there are nearly no models or techniques which focus on all of these features, putting the role of the impact of time at center stage. In this paper we develop a novel continuous time framework which captures these types of low latency environments in an analytically tractable, semi-parametric manner where the role of calendar time is straightforward to calculate.
- Speaker: Josh Reed, Stern School of Business, NYU, Friday 10 October 2014 at MCS 148. NOTE: Special DAY!!
Title: Series Expansions for the All-time Maximum of alpha-stable Random Walks
Abstract: We study random walks whose increments are alpha-stable distributions with shape parameter 1 < alpha < 2. Specifically, assuming a mean increment size which is negative, we provide series expansions in terms of the mean increment size for the probability that the all-time maximum of an alpha-stable random walk is equal to zero and, in the totally skewed to the left case of beta=-1, for the expected value of the all-time maximum of an alpha-stable random walk. Our proofs also cover the Gaussian case of alpha=2 and beta=0 for which previous results have already been obtained in the literature using different techniques. Key ingredients in our proofs are Spitzer's identity for random walks and Zolotarev's integral representation for the CDF of an alpha-stable random variable. We also discuss an application of our results to a problem arising in queueing theory. This is joint work with Cliff Hurvich.
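For reference, Spitzer's identity in the form relevant to the all-time maximum: with $S_n$ the random walk, $M_n = \max(0, S_1, \ldots, S_n)$ and $S_n^+ = \max(S_n, 0)$,
\[
\sum_{n=0}^{\infty} t^n \, E\big[e^{i\theta M_n}\big] \;=\; \exp\left( \sum_{n=1}^{\infty} \frac{t^n}{n} \, E\big[e^{i\theta S_n^+}\big] \right), \qquad 0 < t < 1.
\]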
- Speaker: Michael Dietze, Earth and Environment, Boston University , Thursday 16 October 2014
Title: Ecological Forecasting: An Emerging Challenge.
Abstract: Understanding how terrestrial ecosystems will respond to climate change is one of the most critical scientific questions of our time. This is not only because these ecosystems provide the natural resources and ecosystem services our species depends upon for survival, but because feedbacks from the terrestrial biosphere are one of the greatest sources of uncertainty in climate change projections. Reducing uncertainty requires not only a better understanding of the basic science involved, but also a systematic effort to synthesize existing knowledge, quantify uncertainties, and target measurements where they maximize new information. In this effort ecologists are increasingly being called upon to make quantitative, data-driven forecasts using sophisticated statistical tools and computer models. Such models are not only tools for forecasting but also represent a mathematical formalization of our current understanding of how ecosystems function. As such they provide a critical scaffold for assimilating a diverse array of data types on different spatial and temporal scales which cannot otherwise be directly compared. My work within the nascent field of ecological forecasting is heavily focused on the assimilation of data into terrestrial biosphere models as a means of quantifying, partitioning, and reducing uncertainty about how terrestrial ecosystems will respond to climate change. In this talk I will highlight work done in my lab to confront process-based ecosystem models with data and introduce some of the tools we have been developing to manage model-data fusion. I will also discuss the nature of the ecological forecasting problem, how it differs from other forecasting problems (e.g. weather forecasting), and some of the open statistical challenges in this emerging discipline.
- Speaker: Nalini Ravishanker, Department of Statistics, University of Connecticut, Thursday 23 October 2014
Title: Estimating Function Approach for Nonlinear Time Series.
Abstract: The framework of martingale estimating functions (Godambe, 1985) provides an optimal approach to inference for linear and nonlinear time series based on the first two conditional moments of the observed process. In situations where information about higher-order conditional moments of the process is also available, combined (linear and quadratic) estimating functions are more informative. This approach is especially useful in practice when recursive estimates of model parameters can be derived, resulting in a computationally fast estimation procedure. The approach is illustrated for different classes of nonlinear time series models, such as generalized duration models and random coefficient autoregressive models with heavy-tailed errors, which are useful in financial data analysis.
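For orientation, the basic object here (my paraphrase of the standard setup) is the optimal linear martingale estimating function: with conditional mean $\mu_t(\theta) = E[y_t \mid \mathcal{F}_{t-1}]$, martingale difference $m_t(\theta) = y_t - \mu_t(\theta)$ and conditional variance $\sigma_t^2(\theta)$, the estimate solves
$$ g_n(\theta) = \sum_{t=1}^{n} \frac{\partial \mu_t(\theta)}{\partial \theta}\, \sigma_t^{-2}(\theta)\, m_t(\theta) = 0 . $$
The combined estimating functions referred to in the abstract augment this with the quadratic martingale differences $s_t(\theta) = m_t^2(\theta) - \sigma_t^2(\theta)$, taking $g^c_n(\theta) = \sum_t \bigl( a_{t-1}\, m_t(\theta) + b_{t-1}\, s_t(\theta) \bigr)$ with optimally chosen predictable weights $a_{t-1}$ and $b_{t-1}$.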
- Speaker: Gustavo A. Schwenkler, School of Management, Boston University, Thursday 30 October 2014
Title: Simulated Likelihood Estimators for Discretely Observed Jump-Diffusions.
Abstract: This paper develops an unbiased Monte Carlo approximation to the transition density of a jump-diffusion process with state-dependent drift, volatility, jump intensity, and jump magnitude. The approximation is used to construct a likelihood estimator of the parameters of a jump-diffusion observed at fixed time intervals that need not be short. The estimator is asymptotically unbiased for any sample size. It has the same large-sample asymptotic properties as the true but uncomputable likelihood estimator. Numerical results illustrate its computational advantages.
- Speaker: Vladas Pipiras, Department of Statistics and Operations Research, University of North Carolina, Thursday 6 November 2014
Title: Quadratic programming in synthesis of stationary Gaussian fields
Abstract: Circulant matrix embedding is one of the most popular and efficient methods for the exact generation of a Gaussian stationary univariate series, given its autocovariance function. Although circulant matrix embedding has also been used for the generation of Gaussian stationary random fields, there are many practical covariance structures of random fields where the classical embedding method breaks down, in the sense that some of the eigenvalues of the covariance embedding are negative. In this talk, I will discuss several approaches to modifying the classical circulant matrix embedding so that all the eigenvalues are nonnegative. In one such approach, feasible circulant embeddings are constructed based on a quadratic optimization problem with linear inequality constraints, with an objective function measuring the distance of the covariance embedding to the targeted covariance structure over the domain of interest. A well-known interior-point optimization strategy, the primal log-barrier method, can be suitably adapted to solve the quadratic problem faster than commercial solvers. The talk is based on joint work with S. Kechagias (University of North Carolina), H. Helgason (University of Iceland), and P. Abry (ENS Lyon).
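As background, here is a minimal sketch of the classical univariate circulant embedding method that the talk generalizes (the function name and normalization are mine); the error branch is exactly the negative-eigenvalue failure mode the talk addresses.

    import numpy as np

    def circulant_embedding_sample(acf, rng=None):
        # Exact sample of a stationary Gaussian series with autocovariances
        # acf[0..n-1]: embed them in a circulant matrix of size 2n-2 whose
        # first row is (r_0, ..., r_{n-1}, r_{n-2}, ..., r_1).
        rng = np.random.default_rng(rng)
        n = len(acf)
        row = np.concatenate([acf, acf[-2:0:-1]])
        lam = np.fft.fft(row).real              # circulant eigenvalues
        if lam.min() < 0:
            raise ValueError("embedding is not nonnegative definite")
        m = len(row)
        eps = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)
        w = np.fft.ifft(np.sqrt(lam) * eps) * np.sqrt(m)
        return np.sqrt(2) * w.real[:n]          # ~ N(0, Toeplitz(acf))

    # Example: AR(1)-type autocovariances r_k = 0.8^k embed cleanly.
    x = circulant_embedding_sample(0.8 ** np.arange(1024))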
- Speaker: Lizhen Lin, Department of Statistics and Data Sciences, University of Texas, Thursday 13 November 2014
Title: Robust and scalable inference using median posteriors.
Abstract: While theoretically justified and computationally efficient point estimators have been developed in robust estimation for many problems, robust Bayesian analogues are not sufficiently well understood. We propose a novel approach to Bayesian analysis that is provably robust to the presence of outliers in the data and often has noticeable computational advantages over standard methods. Our approach is based on the idea of splitting the data into several non-overlapping subsets, evaluating the posterior distribution given each subset of the data, and then combining the resulting subset posterior measures by taking their geometric median. The resulting measure, called the median posterior, is the ultimate object used for inference. We show several strong theoretical results for the median posterior, including concentration rates and provable robustness. We illustrate and validate the method through experiments on simulated and real data. [Joint work with Stas Minker, Sanvesh Srivastava and David Dunson]
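A toy sketch of the splitting-and-combining recipe follows. For simplicity the geometric median is taken over subset posterior means rather than over the posterior measures themselves, which is a simplification of the actual method; the conjugate normal model and all constants are illustrative.

    import numpy as np

    def geometric_median(points, iters=100, tol=1e-9):
        # Weiszfeld's algorithm for the geometric median of the rows of `points`.
        y = points.mean(axis=0)
        for _ in range(iters):
            d = np.maximum(np.linalg.norm(points - y, axis=1), tol)
            w = 1.0 / d
            y_new = (w[:, None] * points).sum(axis=0) / w.sum()
            if np.linalg.norm(y_new - y) < tol:
                break
            y = y_new
        return y

    # Normal mean with N(0, 10^2) prior, unit observation variance, and a
    # batch of gross outliers that land in one subset.
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(1.0, 1.0, 900), rng.normal(50.0, 1.0, 9)])
    subsets = np.array_split(data, 10)
    post_means = np.array([[s.sum() / (len(s) + 0.01)] for s in subsets])
    print(geometric_median(post_means))   # close to 1.0 despite the outliers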
- Speaker: Vic Patragenarou, Department of Statistics, Florida State University, Thursday 20 November 2014
Title: All about Statistics as far as Objects on Sample Spaces are concerned.
Abstract: Noncategorical observations, when regarded as points on a stratified space, lead to a nonparametric data analysis extending data analysis on manifolds. In particular, given a probability measure on a sample space with a manifold stratification, one may define the associated Fr\'echet function, Fr\'echet total variance and Cartan mean set. The sample counterparts of these parameters have a more nuanced asymptotic behavior than in nonparametric data analysis on manifolds. This allows for the most inclusive data analysis known to date. Unlike in the case of manifolds, Fr\'echet sample means on stratified spaces, such as graphs, may stick to a lower-dimensional stratum, a new dimension-reduction phenomenon. The downside of stickiness is that it yields a less meaningful interpretation of the analysis. To compensate for this, an extrinsic data analysis that is more sensitive to the input data is suggested. In this paper one explores analysis of data on low-dimensional stratified spaces, via simulations. An example of extrinsic analysis on phylogenetic tree data is also given. This is joint work with Leif Ellingson (Texas Tech), Harrie Hendricks (Radboud University, Nijmegen, Netherlands) and Paul San Valentin (Florida State University).
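For readers outside the area, the standard definitions used above: given a probability measure $\mu$ on a metric space $(M, d)$, the Fr\'echet function is
$$ F(p) = \int_M d(p, x)^2 \, \mu(dx), \qquad p \in M , $$
its minimum value is the Fr\'echet total variance, and the (possibly non-unique) set of minimizers is the Fr\'echet mean set; on a Euclidean space this recovers the ordinary mean and variance.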
- Speaker: Shuyang (Ray) Bai, Department of Mathematics and Statistics, Boston University, Thursday 4 December 2014
Title: Self-similar processes with stationary increments on Wiener chaos.
Abstract: Self-similar processes with stationary increments are important because they exhaust the scaling limits of sums of stationary sequences. In this talk, we introduce a broad class of such processes represented by multiple stochastic integrals, called the generalized Hermite processes. We show that sums of certain nonlinear long-memory stationary sequences scale to these generalized Hermite processes. We then look at one particular example of a generalized Hermite process represented by a double stochastic integral, and show some interesting limit phenomena of this process as its parameters approach critical values. Some of the tools used, involving recent developments connecting the Malliavin calculus and Stein's method, will be briefly introduced along the way.
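As a point of reference (quoted from memory), the classical Hermite process of order $k$ with Hurst index $H \in (1/2, 1)$ admits the multiple Wiener integral representation, up to a normalizing constant $c$,
$$ Z^{(k)}_H(t) = c \int'_{\mathbb{R}^k} \int_0^t \prod_{j=1}^{k} (s - x_j)_+^{-(\frac{1}{2} + \frac{1-H}{k})} \, ds \; dB(x_1) \cdots dB(x_k), $$
where the prime indicates that integration excludes the diagonals $x_i = x_j$; the generalized processes of the talk replace this product kernel with more general homogeneous kernels.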
- Speaker: Ramis Movassagh, Department of Mathematics, MIT and Northwestern University, Thursday 11 December 2014
Title: Eigenvalues of Sums of Matrices from Free Probability Theory and Their Stochastic Dynamics
Abstract: The method of "Isotropic Entanglement" (IE), inspired by free probability theory and random matrix theory, predicts the eigenvalue distribution of quantum many-body systems with generic interactions. At its heart is a "Slider", which interpolates between two extrema by matching fourth moments. The first extreme treats the non-commuting terms classically; the second treats them isotropically, meaning that the eigenvectors are in generic positions. We prove that the interpolation is universal. We then show that free probability theory also captures the density of states of the Anderson model with arbitrary disorder, and with high accuracy. The theory will be illustrated by numerical experiments. Lastly, and time permitting, we shall present a very recent result applicable to non-Hermitian models: we prove that the complex conjugate eigenvalues of a real asymmetric matrix "attract" in response to additive real randomness. The motion of the eigenvalues can be seen as a many-body system; we derive their stochastic dynamics in the complex plane.
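A quick numerical illustration of the free-probability ingredient (the matrices and sizes are my illustrative choices): conjugating one matrix by a Haar-random rotation puts its eigenvectors in generic position, so the spectrum of the sum approximates the free convolution of the two spectra, visibly different from the classical sum of commuting matrices.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    a = np.diag(rng.choice([-1.0, 1.0], size=n))     # spectrum: atoms at +/-1
    b = np.diag(rng.choice([-1.0, 1.0], size=n))
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    q *= np.sign(np.diag(r))                         # make q Haar distributed
    free_sum = np.linalg.eigvalsh(a + q @ b @ q.T)   # ~ arcsine law on (-2, 2)
    classical_sum = np.diag(a) + np.diag(b)          # atoms at -2, 0, 2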
Schedule for Spring 2015:
For the Spring of 2015, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Pierre Nyquist, Division of Applied Mathematics, Brown University, Thursday 26 February 2015
Title: Min-max representations of viscosity solutions of Hamilton-Jacobi equations and applications in rare-event simulation
Abstract: Rare events are a hindrance to stochastic simulation in situations where one is interested in quantities that are determined mainly by events of small probability. One of the more successful ways to overcome this is importance sampling, a technique used to reduce the variance of standard Monte Carlo. In the last decade, through the works of Dupuis, Wang and collaborators (2004 and onwards), it has been understood that the design of efficient simulation algorithms is intimately connected to subsolutions of the Hamilton-Jacobi equation associated with the underlying stochastic system. We will discuss a duality relation between the Ma\~n\'e potential and a functional common in control theory, referred to as Mather's action functional in weak KAM theory, in the context of convex and state-dependent Hamiltonians. The duality is used to obtain min-max representations of viscosity solutions of first-order Hamilton-Jacobi equations. These representations suggest a way to construct viscosity subsolutions, which in turn are good candidates for designing efficient rare-event simulation algorithms. The application to rare-event simulation is illustrated by the problem of computing escape probabilities for small-noise diffusions and Markov jump processes with state-dependent jumps.
- Speaker: David Gamarnik, MIT Sloan School of Management, MIT, Friday 6 March 2015 (Joint seminar with CISE), 3:00 PM to 4:00 PM, 8 St. Mary's Street, Room 210. Refreshments served at 2:45.
Title: Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Problems
Abstract: We will discuss the problem of designing algorithms for solving randomly generated constraint satisfaction problems, such as the random K-SAT problem, the random graph coloring problem, and similar problems. We establish a fundamental barrier on the power of local algorithms to solve such problems, despite some conjectures put forward in the past. We show that a broad class of local algorithms, including the so-called Belief Propagation and Survey Propagation algorithms, cannot find satisfying assignments in a variant of the random K-SAT problem called the NAE-K-SAT problem above a certain asymptotic threshold, below which even simple algorithms succeed with high probability. Our negative results exploit the fascinating geometry of the solution space of random constraint satisfaction problems, which was first predicted heuristically by physicists and has now been confirmed by rigorous methods. According to this picture, the solution space exhibits a clustering property whereby feasible solutions tend to cluster with respect to the underlying Hamming distance. This clustering property creates a barrier for local algorithms.
- Speaker: Wei Biao Wu, Department of Statistics, University of Chicago, Thursday 19 March 2015
Title: $L^2$ Asymptotic Theory for High-Dimensional Data
Abstract: I will present an asymptotic theory for $L^2$ norms of sample mean vectors of high-dimensional data. An invariance principle for the $L^2$ norm is derived under conditions that involve a delicate interplay between the dimension $p$, the sample size $n$ and the moment condition. Under proper normalization, central and non-central limit theorems are obtained. To perform the related statistical inference, I will propose a plug-in calibration method and a re-sampling procedure to approximate the distributions of the $L^2$ norms. The results will be applied to multiple testing and to inference of covariance matrix structures.
- Speaker: Ramon Van Handel, Department of Operations Research and Financial Engineering, Princeton University, Thursday 26 March 2015
Title: How large is the norm of a random matrix?
Abstract: Understanding the spectral norm of random matrices is a problem of basic interest in several areas of pure mathematics (probability theory, functional analysis, combinatorics) and in applied mathematics, statistics, and computer science. While the spectral norm of classical random matrix models is well understood, existing methods almost always fail to be sharp in the presence of nontrivial structure. In this talk, I will discuss new bounds on the norm of random matrices with independent entries that are sharp under mild conditions. These bounds shed significant light on the nature of the problem, and make it possible to effortlessly address otherwise nontrivial problems such as identifying the phase transition of the spectral edge of random band matrices.
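A numerical illustration in the spirit of these bounds (the bound is quoted from memory and the band-matrix example is my choice): for a Gaussian matrix with independent entries $X_{ij} = b_{ij} g_{ij}$, the spectral norm is governed by $\sigma = \max_i (\sum_j b_{ij}^2)^{1/2}$ up to a logarithmic correction involving $\sigma_* = \max_{ij} |b_{ij}|$.

    import numpy as np

    rng = np.random.default_rng(0)
    n, bandwidth = 1000, 50
    i, j = np.indices((n, n))
    b = (np.abs(i - j) <= bandwidth).astype(float)   # a random band matrix
    x = b * rng.standard_normal((n, n))
    sigma = np.sqrt((b ** 2).sum(axis=1)).max()      # largest row L2 norm
    sigma_star = np.abs(b).max() * np.sqrt(np.log(n))
    # When the band is much wider than log n, the norm is close to 2*sigma.
    print(np.linalg.norm(x, 2), 2 * sigma, sigma_star)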
- Speaker: Peter Bull, DrivenData (American Statistical Education Association), Thursday 2 April 2015
Title: Using your powers for good: Data science in the social sector
Abstract: Just like every major corporation today, nonprofits and governments have more data than ever before. And just like those corporations, they are eager to tap into the power of their data. But the social sector doesn't have the same resources to attract talent. Jeff Hammerbacher, Chief Scientist at Cloudera, put it best: "The best minds of my generation are thinking about how to make people click ads. That sucks." At DrivenData our goal is to make the world suck a little less by empowering impact organizations to get the most from their data. Peter Bull, co-founder of DrivenData, will speak on the ways in which statistics, computer science, and machine learning can be applied to challenges in the social sector. The talk will address both the big-picture context of the data-for-good movement and an in-depth case study of the methods that won DrivenData's recent machine learning competition on smart school budgeting. It's an exciting time for people who love data: methods are improving, computational costs are decreasing, storage and transport are cheaper, and the talent pool is growing. It's up to the data geeks to use these powers for good.
- Speaker: Fan Zhuo, Department of Economics, Boston University, Thursday 9 April 2015
Title: Likelihood Ratio Based Tests for Markov Regime Switching
Abstract: Regime switching models provide a flexible framework for modeling sudden and recursive shifts in dynamic relationships and have influenced thinking in both the economics and the finance literature. Although there has been persistent interest in applying likelihood ratio based tests to detect regime switches (e.g., Hansen, 1992 and Garcia, 1998), the asymptotic distributions of such tests have remained an enigma. This paper considers such tests and establishes their asymptotic distributions in the context of nonlinear models permitting multiple switching parameters. The analysis simultaneously addresses three difficulties: (i) some nuisance parameters are unidentified under the null hypothesis, (ii) the null hypothesis yields a local maximum, and (iii) the conditional regime probabilities follow stochastic processes that can only be expressed recursively. The important work of Cho and White (2007) addressed only the first two difficulties, while this paper shows that addressing the third can lead to substantially higher testing power when the regimes are serially dependent. Besides obtaining the tests' asymptotic distributions, this paper obtains four sets of results that can be of independent interest: (1) a characterization of the conditional regime probabilities and their derivatives with respect to the model's parameters, (2) a high-order approximation to the log-likelihood ratio permitting multiple switching parameters, (3) a refinement of the asymptotic distribution that provides better approximations in finite samples, and (4) a unified algorithm to simulate the critical values. In linear models, all the elements needed for the algorithm can be computed analytically. Finally, the above results reveal that some bootstrap procedures can be inconsistent and that standard information criteria, such as AIC and BIC, can be sensitive to the hypothesis and the model's structure.
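For concreteness, the recursively defined conditional regime probabilities in difficulty (iii) are those of the Hamilton (1989) filter; a minimal two-regime Gaussian version (my notation, not the paper's) is sketched below.

    import numpy as np

    def hamilton_filter(y, p, mu, sigma):
        # Filtered regime probabilities P(s_t = k | y_1..y_t) for a model
        # y_t | s_t = k ~ N(mu[k], sigma[k]^2) with 2x2 transition matrix p,
        # where p[i, j] = P(s_t = j | s_{t-1} = i).
        xi, out, loglik = np.full(2, 0.5), [], 0.0
        for yt in y:
            pred = p.T @ xi                   # one-step-ahead regime probs
            dens = np.exp(-0.5 * ((yt - mu) / sigma) ** 2) \
                   / (np.sqrt(2 * np.pi) * sigma)
            joint = pred * dens
            loglik += np.log(joint.sum())     # log-likelihood increment
            xi = joint / joint.sum()
            out.append(xi)
        return np.array(out), loglik

The likelihood ratio test then compares the maximized log-likelihood of the switching model against the one-regime null; the asymptotics of that comparison are what the talk resolves.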
- Speaker: Natesh Pillai, Department of Statistics, Harvard University, Thursday 16 April 2015
Title: Some aspects of shrinkage priors in high dimensions
Abstract: In this talk we explore some aspects of shrinkage priors in high-dimensional Bayesian inference. These prior distributions (constructed as an alternative to spike-and-slab priors) are popular because the corresponding MCMC algorithms mix very quickly. However, little is known about their statistical efficiency. We present some results in this direction and also give a new prior which is both statistically and computationally efficient. We will also discuss some open problems.
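One canonical member of this family, stated as a concrete example (the horseshoe prior of Carvalho, Polson and Scott; the talk's results concern such global-local constructions more broadly):
$$ \beta_j \mid \lambda_j, \tau \sim N(0, \lambda_j^2 \tau^2), \qquad \lambda_j \sim C^{+}(0,1), \qquad \tau \sim C^{+}(0,1), $$
where $C^{+}(0,1)$ is the standard half-Cauchy distribution, in contrast to the spike-and-slab prior $\beta_j \sim (1-\pi)\,\delta_0 + \pi\, N(0, \sigma_\beta^2)$.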
- Speaker: Francesco Mainardi, Department of Physics, University of Bologna, Thursday 23 April 2015
Title: Brownian motion and anomalous diffusion revisited via a fractional Langevin equation
Abstract: In this talk the Brownian motion is revisited on the basis of the fractional Langevin equation, which turns out to be a particular case of the generalized Langevin equation introduced by Kubo in 1966. The importance of this approach is that it models the Brownian motion more realistically than the usual one based on the classical Langevin equation, in that it also takes into account the retarding effects due to hydrodynamic back-flow, i.e. the added mass and the Basset memory drag. We provide the analytical expressions of the correlation functions (both for the random force and the particle velocity) and of the mean squared particle displacement. The random force is shown to be represented by a superposition of the usual white noise with a "fractional" noise. The velocity correlation function is no longer a simple exponential but exhibits a slower decay, proportional to $t^{-3/2}$ for long times, which indeed is more realistic. Finally, the mean squared displacement is shown to maintain, for sufficiently long times, the linear behaviour typical of normal diffusion, with the same diffusion coefficient as in the classical case. However, the Basset history force induces a retarding effect in the establishment of the linear behaviour, which in some cases could appear as a manifestation of anomalous diffusion, to be correctly interpreted in experimental measurements.
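For orientation, Kubo's generalized Langevin equation reads (standard form, quoted from memory)
$$ m\,\dot v(t) = -\int_0^t \gamma(t-s)\, v(s)\, ds + F(t), \qquad \langle F(t)\, F(s) \rangle = k_B T\, \gamma(|t-s|), $$
the second identity being the fluctuation-dissipation relation. The fractional Langevin equation corresponds to a power-law memory kernel $\gamma(t) \propto t^{-1/2}$, which encodes the Basset drag and produces the $t^{-3/2}$ velocity-correlation decay mentioned above.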
- Speaker: Zsolt Pajor-Gyulai, Department of Mathematics, University of Maryland at College Park, Thursday 30 April 2015. NOTE: Seminar will be in MCS B25!!!
Title: From averaging to homogenization in cellular flows - an exact description of the transition
Abstract: We consider a two-parameter averaging-homogenization type elliptic problem together with the stochastic representation of the solution. A limit theorem is derived for the corresponding diffusion process and a precise description of the two-parameter limit behavior for the solution of the PDE is obtained. Joint work with M. Hairer and L. Koralov.
Schedule for Fall 2015:
For the Fall of 2015, the seminar usually meets on Thursdays at 4-5pm in Room MCS 148, 111 Cummington Street.
- Speaker: Liliya Zax, Department of Mathematics and Statistics, Boston University, Thursday 10 September 2015, 4-5pm at MA B33
Title: Statistics application in industry: financial institutions and tech companies
Abstract: In my presentation I will share some aspects of my statistics-related experience in different industries, namely in financial and technology companies. We will discuss some specific statistical problems that are of interest to industry, the statistical tools used to try to solve those problems, and the statistical challenges practitioners face. The goal of the presentation is to help students better understand how the knowledge and skills they acquire in their academic programs can later be applied should they choose to continue their careers in industry.
- Speaker: Leu Guo, College of Communication, Boston University, Thursday 17 September 2015
Title: The power of message networks: Semantic network analysis of media effects in Twittersphere during the 2012 U.S. presidential election.
Abstract: Do traditional news media still lead public opinion in this digital age? This talk will present a study that explores how media such as newspapers and television set the public agenda through constructing message networks. Semantic network analysis and big-data analytics were used to examine a large dataset collected on Twitter during the 2012 U.S. presidential election.
- Speaker: Philippe Rigollet, Department of Mathematics, MIT, Thursday 1 October 2015
Title: Batched Bandits
Abstract: Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches already gives regret bounds close to minimax optimal, and we also evaluate the number of trials in each batch. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits. [Joint with V. Perchet, S. Chassang and E. Snowberg]
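To make the batch constraint concrete, here is a two-batch explore-then-commit policy for a two-armed Gaussian bandit; the batch sizes and parameters are illustrative, and this simplest instance is not the paper's policy.

    import numpy as np

    rng = np.random.default_rng(0)
    means, T, explore = np.array([0.5, 0.6]), 10_000, 500

    # Batch 1: pull each arm `explore` times; no feedback is used mid-batch.
    rewards = [rng.normal(means[k], 1.0, explore) for k in (0, 1)]
    best = int(np.argmax([r.mean() for r in rewards]))

    # Batch 2: commit to the empirically best arm for all remaining rounds.
    total = sum(r.sum() for r in rewards) \
            + rng.normal(means[best], 1.0, T - 2 * explore).sum()
    print("empirical regret:", T * means.max() - total)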
- Speaker: John Harlim, Department of Mathematics, Penn State University, Thursday 8 October 2015
Title: Diffusion Forecast: A Nonparametric Modeling Approach
Abstract: I will discuss a nonparametric modeling approach for forecasting stochastic dynamical systems on low-dimensional manifolds. In the limit of large data, this approach converges to a Galerkin projection of the semigroup solution of the backward Kolmogorov equation of the underlying dynamics on a basis adapted to the invariant measure. This approach allows one to evolve the probability distribution of non-trivial dynamical systems with equation-free modeling. I will also discuss nonparametric filtering methods, leveraging the diffusion forecast in a Bayesian framework to initialize the forecasting distribution given noisy observations.
- Speaker: Pierre Jacob, Department of Statistics, Harvard University, Thursday 15 October 2015
Title: Estimation of the Derivatives of Functions That Can Only Be Evaluated With Noise
Abstract: Iterated filtering methods have recently been introduced to perform maximum likelihood parameter estimation in state-space models, and they only require being able to simulate the latent Markov model according to its prior distribution. They rely on an approximation of the score vector, for general statistical models, based upon an artificial posterior distribution, and bypass the calculation of any derivative. We show here that this score estimator can be derived from a simple application of Stein's lemma, and that an additional application of this lemma provides an original derivative-free estimator of the observed information matrix. These methods tackle the general problem of estimating the first two derivatives of a function that can only be evaluated point-wise with some noise. We compare these new methods with finite difference schemes and make connections with proximal mappings. In particular, we look at the bias and variance of these estimators, the effect of the variance of the noise, and the effect of the dimension of the parameter space.
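The core trick is easy to state: by Stein's lemma, for $\theta' \sim N(\theta, \sigma^2 I)$ one has $E[f(\theta')(\theta' - \theta)] / \sigma^2 = E[\nabla f(\theta')]$, so averaging noisy function values against the Gaussian perturbations estimates a smoothed gradient without any differentiation. A generic sketch of this idea (not the paper's exact iterated-filtering construction):

    import numpy as np

    def smoothed_grad(f, theta, sigma=0.1, n=10_000, rng=None):
        # Estimates grad f(theta) via E[f(theta + sigma*eps) * eps] / sigma,
        # where eps ~ N(0, I); f may be evaluated only with noise.
        rng = np.random.default_rng(rng)
        eps = rng.standard_normal((n, theta.size))
        vals = np.array([f(theta + sigma * e) for e in eps])
        return (vals[:, None] * eps).mean(axis=0) / sigma

    # Example: noisy evaluations of f(x) = ||x||^2, whose gradient is 2x.
    noisy_f = lambda x: x @ x + np.random.default_rng().normal(0.0, 0.1)
    print(smoothed_grad(noisy_f, np.array([1.0, -2.0])))   # roughly [2, -4]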
- Speaker: Jian Zhou, Department of Mathematical Sciences, Worcester Polytechnic Institute, Thursday 22 October 2015
Title: Volatility Inference Using High-Frequency Financial Data and Efficient Computations
Abstract: The field of high-frequency finance has evolved rapidly over the past few decades. One focal point is volatility modeling and analysis for high-frequency financial data, which plays a major role in finance and economics. In this talk, we focus on the statistical inference problem for large volatility matrices using high-frequency financial data, and propose a methodology to tackle this problem under various settings. We illustrate the methodology with high-frequency price data on stocks traded on the New York Stock Exchange in 2013. The theory and numerical results show that our approach performs well while pooling together the strengths of regularization and estimation from a high-frequency finance perspective.
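As a baseline from this literature (my illustrative stand-in, not the speaker's estimator): the realized covariance matrix of synchronized intraday returns, followed by a crude spectral regularization step.

    import numpy as np

    rng = np.random.default_rng(0)
    p_assets, n_obs = 5, 390                    # e.g., 1-minute bars in a day
    true_cov = 0.1 * np.eye(p_assets) + 0.02
    returns = rng.multivariate_normal(np.zeros(p_assets),
                                      true_cov / n_obs, n_obs)
    rcov = returns.T @ returns                  # realized covariance matrix
    lam, vecs = np.linalg.eigh(rcov)
    lam = np.maximum(lam, 0.2 * lam.max())      # ad hoc eigenvalue floor
    rcov_reg = vecs @ np.diag(lam) @ vecs.T     # regularized estimate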
- Speaker: Markos Katsoulakis, Department of Mathematics and Statistics, UMass Amherst, Thursday 29 October 2015
Title: Path-space information metrics for uncertainty quantification and coarse-graining of molecular systems
Abstract: We present path-space, information-theory-based sensitivity analysis, uncertainty quantification and variational inference methods for complex high-dimensional stochastic dynamics, including chemical reaction networks with hundreds of parameters, Langevin-type equations and lattice kinetic Monte Carlo. We establish their connections with goal-oriented methods in terms of new, sharp uncertainty quantification inequalities that scale appropriately at both long times and for high-dimensional state and parameter spaces. The combination of the proposed methodologies makes it possible to (a) tackle non-equilibrium processes, typically associated with coupled physicochemical mechanisms or boundary conditions, such as reaction-diffusion problems, where even the steady states are unknown altogether, e.g. do not have a Gibbs structure. The path-wise information theory tools also (b) yield a surprisingly simple, tractable and easy-to-implement approach to quantify and rank parameter sensitivities, and (c) provide reliable parameterizations for coarse-grained molecular systems based on fine-scale data, along with rational model selection through path-space (dynamics-based) variational inference methods.
- Speaker: Iddo Ben-Ari, Department of Mathematics, University of Connecticut, Thursday 5 November 2015
Title: The Bak-Sneppen Model of Biological Evolution and Related Models
Abstract: The Bak-Sneppen model is a Markovian model for biological evolution that was introduced as an example of self-organized criticality. In this model, a population of size N evolves according to the following rule. The population is arranged on a circle, or more generally on a connected graph. Each individual is assigned a random fitness, uniform on [0,1], independent of the fitnesses of the other individuals. At each unit of time, the least fit individual and its neighbors are removed from the population and replaced by new individuals. Despite being extremely simple, the model is known to be very challenging, and the evidence for self-organized criticality provided by Bak and Sneppen was obtained through numerical simulations. I will review the main rigorous results on this model, mostly due to R. Meester and his coauthors, and present some new results and open problems. I will then turn to recent and more tractable variants of the model, in which on the one hand the spatial structure is relaxed, while on the other hand the population size is random. I will focus on the functional central limit theorem for the model, which has a somewhat unusual form.
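The update rule is simple enough to simulate in a few lines; the sketch below is mine, and the critical threshold in the comment is the value reported in simulation studies.

    import numpy as np

    def bak_sneppen(n=200, steps=100_000, rng=None):
        # N sites on a circle; at each step the least-fit site and its two
        # neighbours receive fresh Uniform(0,1) fitnesses.
        rng = np.random.default_rng(rng)
        fit = rng.uniform(size=n)
        for _ in range(steps):
            i = int(np.argmin(fit))
            fit[[(i - 1) % n, i, (i + 1) % n]] = rng.uniform(size=3)
        return fit

    # Self-organized criticality: the stationary fitness distribution is
    # close to uniform on (f_c, 1), with f_c approximately 0.667 in simulations.
    print(np.quantile(bak_sneppen(), [0.01, 0.5]))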
- Speaker: Mokshay Madiman, Department of Mathematical Sciences, University of Delaware, Thursday 12 November 2015
Title: Optimal Concentration of Information for Log-Concave Distributions
Abstract: It was shown by Bobkov and the speaker that for a random vector X in R^n drawn from a log-concave density e^{-V}, the information content per coordinate, namely V(X)/n, is highly concentrated about its mean. Their argument was nontrivial, involving the localization technique, and gave suboptimal exponents, but it was sufficient to demonstrate that high-dimensional log-concave measures are in a sense close to uniform distributions on the annulus between two nested convex sets. We will present recent work that obtains an optimal concentration bound in this setting (optimal even in the constant terms, not just the exponent) using very simple techniques, and outline the proof. Applications that motivated the development of these results include high-dimensional convex geometry and random matrix theory, and we will outline these applications as well. Based on (multiple) joint works with Sergey Bobkov, Matthieu Fradelizi, and Liyao Wang.
- Speaker: Youssef M. Marzouk, Department of Aeronautics and Astronautics, MIT, Thursday 19 November 2015
Title: Transport maps for Bayesian computation
Abstract: We will discuss how transport maps, i.e., deterministic couplings between probability measures, can enable useful new approaches to Bayesian computation. A first use involves a combination of optimal transport and Metropolis correction; here, we use continuous transportation to transform typical MCMC proposals into adapted non-Gaussian proposals, both local and global. Second, we discuss a variational approach to Bayesian inference that constructs a deterministic transport map from a reference distribution to the posterior, without resorting to MCMC. Independent and unweighted posterior samples can then be obtained by pushing forward reference samples through the map. Making either approach efficient in high dimensions, however, requires identifying and exploiting low-dimensional structure. We present new results relating the sparsity of transport maps to the conditional independence structure of the target distribution, and discuss how this structure can be revealed through the analysis of certain average derivative functionals. A connection between transport maps and graphical models yields many useful algorithms for efficient ordering and decomposition, here generalized to the continuous and non-Gaussian setting. The resulting inference algorithms involve either the direct identification of sparse maps or the composition of low-dimensional maps and rotations. We demonstrate our approaches on Bayesian inference problems arising in spatial statistics and in partial differential equations. This is joint work with Matthew Parno and Alessio Spantini.
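As a toy illustration of the variational, MCMC-free use of transport maps: fit a monotone one-dimensional map $T$ pushing a N(0,1) reference to a target density known up to a constant, by minimizing the KL divergence of the pushforward from the target, which up to an additive constant equals $-E[\log \pi(T(Z)) + \log T'(Z)]$. The cubic parameterization, Gaussian target and optimizer below are my illustrative choices, not the authors' construction.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    z = rng.standard_normal(5000)            # reference samples from N(0,1)

    def log_pi(x):                           # target: N(3, 0.5^2), up to a constant
        return -0.5 * ((x - 3.0) / 0.5) ** 2

    def neg_elbo(params):
        a, log_b, c = params                 # T(z) = a + b z + c z^3, b, c > 0
        b, c = np.exp(log_b), abs(c)
        x = a + b * z + c * z ** 3
        log_det = np.log(b + 3 * c * z ** 2) # log T'(z); positivity => monotone
        return -np.mean(log_pi(x) + log_det) # KL(T#rho || pi) up to a constant

    res = minimize(neg_elbo, x0=[0.0, 0.0, 0.1], method="Nelder-Mead")
    a, log_b, c = res.x
    samples = a + np.exp(log_b) * z + abs(c) * z ** 3   # unweighted draws
    print(samples.mean(), samples.std())                # roughly 3.0 and 0.5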
- Speaker: Shuyang (Ray) Bai, Department of Mathematics and Statistics, Boston University, Thursday 3 December 2015
Title: Self-normalized resampling for time series
Abstract: Inference procedures for the mean of a stationary time series are usually quite different depending on the strength of the dependence as well as the heavy-tailedness of the model. In this talk, combining the ideas of resampling and self-normalization, we introduce a unified procedure which is valid under various different model assumptions. The procedure avoids estimation of any nuisance parameter and requires only the choice of one bandwidth. Simulation examples will be given to illustrate its performance, and the asymptotic theory will also be introduced. This is joint work with Murad S. Taqqu and Ting Zhang.
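One flavor of the self-normalization idea, in a Shao-type form quoted from memory (the resampling component of the talk is omitted): normalize the centered mean by a quantity built from the same partial sums, so that the unknown long-run variance cancels in the limit and no nuisance parameter needs to be estimated.

    import numpy as np

    def self_normalized_stat(x, mu0):
        # sqrt(n) * (mean - mu0) / V_n, with the self-normalizer
        # V_n^2 = n^{-2} * sum_t (S_t - (t/n) S_n)^2 built from partial sums.
        n = len(x)
        s = np.cumsum(x)
        v2 = np.sum((s - np.arange(1, n + 1) / n * s[-1]) ** 2) / n ** 2
        return (x.mean() - mu0) / np.sqrt(v2 / n)

    # AR(1) series with mean 0; the statistic has a pivotal, non-normal limit.
    rng = np.random.default_rng(0)
    e, x = rng.standard_normal(5000), np.zeros(5000)
    for t in range(1, 5000):
        x[t] = 0.5 * x[t - 1] + e[t]
    print(self_normalized_stat(x, 0.0))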
- Speaker: Vidhu Prasad, University of Massachusetts Lowell, Thursday 10 December 2015
Title: Towers, Codes and Approximate Conjugacy
Abstract: Consider the following question about an irrational rotation $T$ of the unit circle and a mixing Markov chain: is there a partition of the circle (indexed by the state space of the Markov chain) so that the itinerary process given by $T$ and the partition has the distribution of the given Markov chain? Furthermore, this holds for any aperiodic measure-preserving transformation (not just irrational rotations): the existence of "tower structures" for $T$ is equivalent to the coding property above (the existence of a partition which is moved like the Markov chain by $T$), and the latter property is equivalent to an "almost conjugacy" property for $T$. The "tower property" is a generalization of one of the truly basic results in ergodic theory: the Kakutani-Rokhlin Lemma.