Department of Statistics & Operations Research

Statistics Seminars

2017/2018

To subscribe to the list, please follow this link or send email to 12345yekutiel@post.tau.ac.il54321 (remove numbers unless you are a spammer…)

Second Semester

5 March

James A. Evans, University of Chicago 

19 March

Nalini Ravishanker, University of Connecticut

4 June

Giles Hooker, Cornell

11 June

Judith Somekh, Haifa University

First Semester

23 October

Adam Kapelner, City University of New York

 

Harmonizing Fully Optimal Designs with Classic Randomization in Fixed Trial Experiments

6 November

Daniel Nevo, TAU

 

LAGO: The adaptive Learn-As-you-GO design for multi-stage intervention studies

27 November

Liran Katzir, Final Ltd.

 

Social network size estimation via sampling

25 December

Bella Vakulenko-Lagun, Harvard

 

Some methods to recover from selection bias in survival data

1 January

Meir Feder, TAU

 

Universal Learning for Individual Data

8 January

Adi Berliner Senderey, Clalit

 

Effective implementation of evidence-based medicine in Healthcare

Seminars are held on Tuesdays at 10:30 am in the Schreiber Building, Room 309 (see the TAU map). The seminar organizer is Daniel Yekutieli.

To join the seminar mailing list or for any other inquiries, please call (03)-6409612 or email 12345yekutiel@post.tau.ac.il54321 (remove numbers unless you are a spammer…)

Seminars from previous years

ABSTRACTS

·         Daniel Nevo, TAU

 

LAGO: The adaptive Learn-As-you-GO design for multi-stage intervention studies

 

In large-scale public-health intervention studies, the intervention is a package consisting of multiple components. The intervention package is chosen in a small pilot study and then implemented in a large-scale setup. However, for various reasons I will discuss, this approach can lead to an implementation failure.

In this talk, I will present a new design, called the learn-as-you-go (LAGO) adaptive design. In the LAGO design, the intervention package is adapted in stages during the study based on past outcomes. Typically, an effective intervention package is sought while minimizing cost. The main complication when analyzing data from a LAGO study is that interventions in later stages depend upon the outcomes in the previous stages. Under the setup of logistic regression, I will present asymptotic theory for LAGO studies and tools that can be used by researchers in practice. The LAGO design will be illustrated via an application to the BetterBirth Study, which aimed to improve maternal and neonatal outcomes in India.
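
For intuition only, here is a minimal sketch of one LAGO-style stage update under a logistic-regression outcome model (the function and variable names are illustrative, not taken from the talk): the outcome model is refit on all data accumulated so far, and the next-stage package is chosen to reach a target success probability at minimal cost. The actual LAGO methodology also accounts for the dependence between stages when doing inference; this sketch only conveys the adapt-as-you-go loop.

import numpy as np
from itertools import product
from sklearn.linear_model import LogisticRegression

def next_stage_package(X_hist, y_hist, component_grid, costs, target_prob=0.9):
    """Illustrative (hypothetical) LAGO-style stage update.

    X_hist         : (n, p) array of intervention packages used so far
    y_hist         : (n,) array of binary outcomes observed so far
    component_grid : list of candidate levels for each of the p components
    costs          : (p,) array of per-unit component costs
    """
    # Refit the logistic outcome model on the accumulated data.
    model = LogisticRegression().fit(X_hist, y_hist)

    best_pkg, best_cost = None, np.inf
    # Enumerate candidate packages (fine for small grids; a real design
    # would use constrained optimization rather than brute force).
    for pkg in product(*component_grid):
        pkg = np.asarray(pkg, dtype=float)
        p_success = model.predict_proba(pkg.reshape(1, -1))[0, 1]
        if p_success >= target_prob and costs @ pkg < best_cost:
            best_pkg, best_cost = pkg, float(costs @ pkg)
    return best_pkg  # None means no candidate met the target at this stage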

·         Adam Kapelner, City University of New York

 

Harmonizing Fully Optimal Designs with Classic Randomization in Fixed Trial Experiments

 

There is a movement in design of experiments away from the classic randomization put forward by Fisher, Cochran and others to one based on optimization. In fixed-sample trials comparing two groups, measurements of subjects are known in advance and subjects can be divided optimally into two groups based on a criterion of homogeneity or "imbalance" between the two groups. These designs are far from random. This talk seeks to understand the benefits and the costs relative to classic randomization in the context of different performance criteria, such as Efron's worst-case analysis. Under the criterion that we motivate, randomization beats optimization. However, the optimal design is shown to lie between these two extremes. Much-needed further work will provide a procedure to find this optimal design in different scenarios in practice. Until then, it is best to randomize.
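
As a rough, self-contained illustration of the trade-off discussed above (a sketch under simplified assumptions, using a plain Mahalanobis imbalance criterion rather than the criteria from the talk): an "optimized" allocation can make the observed covariate imbalance far smaller than a single random split, which is exactly the gain being weighed against worst-case robustness.

import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))          # subject covariates, known before assignment

def imbalance(X, assign):
    """Mahalanobis-type distance between the two groups' covariate means."""
    diff = X[assign == 1].mean(axis=0) - X[assign == 0].mean(axis=0)
    return float(diff @ np.linalg.solve(np.cov(X, rowvar=False), diff))

def random_split():
    a = np.zeros(n, dtype=int)
    a[rng.choice(n, n // 2, replace=False)] = 1   # classic 50/50 randomization
    return a

# Classic randomization: take one random split as-is.
classic = random_split()

# Crude stand-in for an optimization-based design: keep the most
# balanced of many candidate splits.
optimized = min((random_split() for _ in range(2000)), key=lambda a: imbalance(X, a))

print("imbalance, classic randomization:", round(imbalance(X, classic), 4))
print("imbalance, optimized allocation :", round(imbalance(X, optimized), 4))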

·         Liran Katzir, a financial algorithms researcher at Final Ltd.

 

Social network size estimation via sampling

 

This presentation addresses the problem of estimating the number of users in online social networks. While such networks occasionally publish user numbers, there are good reasons to validate their reports. The proposed algorithm can also estimate the cardinality of network sub-populations. Since this information is seldom voluntarily divulged, algorithms must limit themselves to the social networks' public APIs; no other external information can be assumed. Additionally, due to obvious traffic and privacy concerns, the number of API requests must also be severely limited. Thus, the main focus is on minimizing the number of API requests needed to achieve good estimates. Our approach is to view a social network as an undirected graph and use the public interface to produce a random walk. By counting the number of collisions, an estimate is produced using a non-uniform-sampling version of the birthday paradox. The algorithms are validated on several publicly available social network datasets.
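
For concreteness, a minimal sketch of a collision-based size estimate from random-walk samples (assuming, as is standard for a simple random walk on an undirected graph, that the samples arrive approximately from the degree-proportional stationary distribution; variable names are illustrative):

from collections import Counter

def estimate_network_size(node_ids, degrees):
    """Estimate the number of nodes from repeated random-walk samples.

    node_ids : ids of the sampled nodes (with repetitions), e.g. taken from
               a long random walk thinned to reduce correlation
    degrees  : degree of each sampled node, as reported by the public API
    """
    # Collisions: pairs of samples that landed on the same node.
    counts = Counter(node_ids)
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    if collisions == 0:
        raise ValueError("no collisions yet -- take more samples")

    # Reweight by 1/degree to correct for the walk over-sampling
    # high-degree nodes (the non-uniform birthday-paradox argument).
    sum_deg = float(sum(degrees))
    sum_inv_deg = sum(1.0 / d for d in degrees)
    return sum_deg * sum_inv_deg / (2.0 * collisions)

With uniform samples the reweighting is unnecessary and the classic birthday-paradox estimate r(r-1)/(2·collisions) applies, where r is the number of samples.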

·         Bella Vakulenko-Lagun, Harvard

 

Some methods to recover from selection bias in survival data

 

We consider several study designs resulting in truncated survival data. First, we look at a study with delayed entry, where the left truncation time and the lifetime of interest are dependent. The critical assumption behind standard methods for truncated data is quasi-independence or factorization; if this condition does not hold, the standard methods cannot be used. We address one specific scenario that can result in dependence between truncation and event times: covariate-induced dependent truncation. While in regression models for time-to-event data this type of dependence does not present any problem, in nonparametric estimation of the lifetime distribution P(X), ignoring the dependence might cause bias. We propose two methods that are able to account for this dependence and allow consistent estimation of P(X).

 

Our estimators for dependently truncated data will be inefficient if we use them when there is no dependence between truncation and event times. Therefore it is important to test for independence. The common wisdom is that we can test for quasi-independence, that is, "independence in the observable region". We derived two other conditions, called factorization conditions, which are indistinguishable from quasi-independence given the data at hand. This means that in the standard analysis of truncated data, when we assume quasi-independence, we ultimately make an untestable assumption in order to estimate the distribution of the target lifetime. This non-identifiability problem has not been recognized before.
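
For reference, conditions of this type state that the joint density of the truncation time T and the lifetime X factorizes over the observable region; one common way to write this (notation added here for exposition, not taken from the talk) is

f_{T,X}(t, x) \;=\; c\, h(t)\, g(x), \qquad t \le x,

with c a normalizing constant and h, g functions that need not equal the true marginal densities. Because the data never leave the region t \le x, different factorizations of the same observable-region density can lead to different estimates of the distribution of X while remaining indistinguishable from the sample, which is the identifiability issue described above.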

 

Finally, we consider retrospectively ascertained time-to-event data resulting in right truncation, and discuss estimation of regression coefficients in the Cox model. We suggest an approach that incorporates external information in order to solve the problem of non-positivity that often arises with right-truncated data.

·         Meir Feder, TAU

 

 

Universal Learning for Individual Data

 

Universal learning is considered from an information-theoretic point of view, following the universal prediction approach originated by Solomonoff, Kolmogorov, Rissanen, Cover, Ziv and others and developed in the 90's by Feder and Merhav. Interestingly, the extension to learning is not straightforward. In previous works we considered on-line learning and supervised learning in a stochastic setting. Yet, the most challenging case is batch learning, where prediction is done on a test sample once the entire training data is observed, in the individual setting where the features and labels, of both the training and test data, are specific individual quantities.

 

Our results provide schemes that, for any individual data, compete with a "genie" (or reference) that knows the true test label. We suggest design criteria and develop the corresponding universal learning schemes, where the main proposed scheme is termed Predictive Normalized Maximum Likelihood (pNML). We demonstrate that pNML learning and its variations provide robust, "stable" learning solutions that outperform the current leading approach based on Empirical Risk Minimization (ERM). Furthermore, the pNML construction provides a pointwise indication of learnability: it measures the uncertainty in learning the specific test instance from the given training examples, letting the learner know when it does not know.
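
A minimal sketch of the pNML idea for a finite label set (assuming, purely for illustration, a logistic-regression hypothesis class; this is a simplified rendering, not the authors' implementation): each candidate label is scored by the probability that its own best-fitting model assigns to it at the test point, and the scores are renormalized. The log of the normalizer is the pointwise regret, the learnability measure referred to above.

import numpy as np
from sklearn.linear_model import LogisticRegression

def pnml_predict(X_train, y_train, x_test, labels=(0, 1)):
    """Predictive Normalized Maximum Likelihood over a finite label set."""
    scores = []
    for y in labels:
        # Append the test point with the candidate label and refit (the "genie" fit).
        X_aug = np.vstack([X_train, x_test.reshape(1, -1)])
        y_aug = np.append(y_train, y)
        model = LogisticRegression().fit(X_aug, y_aug)
        # Probability this genie assigns to its own candidate label at the test point.
        proba = model.predict_proba(x_test.reshape(1, -1))[0]
        scores.append(proba[np.where(model.classes_ == y)[0][0]])
    scores = np.asarray(scores)
    normalizer = scores.sum()
    return scores / normalizer, float(np.log(normalizer))  # probabilities, regret

A large normalizer (high regret) means that no single model in the class explains the test point decisively, i.e. the prediction for that particular point is uncertain.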

 

Joint work with Yaniv Fogel and Koby Bibas

·         Adi Berliner Senderey, Clalit

 

Effective implementation of evidence-based medicine in Healthcare

 

Two projects illustrating the use of data for determining effective treatment policies are presented.

 

1. Machine Learning in Healthcare – Shifting the Focus to Fairness – by Noam Barda

This project deals with an algorithm for improving fairness in predictive models. The method is meant to address concerns regarding potential unfairness of prediction models toward groups that are underrepresented in the training dataset and thus might receive uncalibrated scores. The algorithm was implemented on widely used risk models, including the ACC/AHA 2013 model for cardiovascular events and the FRAX model for osteoporotic fractures, and tested on a large real-world sample. Based on joint work with Noa Dagan, Guy Rothblum, Gal Yona, Ran Balicer and Eitan Bachmat.

 

2. Rates of Ischemic Stroke, Death and Bleeding in Men and Women with Non-Valvular Atrial Fibrillation – by Adi Berliner Senderey

Data regarding the thromboembolic risk and differences in outcomes in men and women with non-valvular atrial fibrillation (NVAF) are inconsistent. The aim of the present study is to evaluate differences in treatment strategies and in the risk of ischemic stroke, death, and bleeding between men and women in a large, population-based cohort of individuals with NVAF. Based on joint work with Yoav Arnson, Moshe Hoshen, Adi Berliner Senderey, Orna Reges, Ran Balicer, Morton Leibowitz, Meytal Avgil Tsadok and Moti Haim.