DSAA 2016 Program

October 16


October 17


Opening [in Salle De Bal - Lower Lobby]


Keynote: Yoshua Bengio - Deep Learning and AI [in Salle De Bal - Lower Lobby]




Classification [in Salles Maisonneuve A]

194 research On the Evaluation of Outlier Detection and One-Class Classification Methods, Lorne Swersky, Henrique O. Marques, Joerg Sander, Ricardo J. G. B. Campello and Arthur Zimek

79 research Active Semi-Supervised Classification based on Multiple Clustering Hierarchies, Antonio J.L. Batista, Ricardo J. G. B. Campello and Joerg Sander

110 research Combining Static and Dynamic Features for Multivariate Sequence Classification, Anna Leontjeva, Ilya Kuzovkin

61 research Correcting Relational Bias to Improve Classification in Sparsely-Labeled Networks, Joshua King, Luke McDowell

67 research Hyperparameter Optimization Machines, Lars Schmidt-Thieme, Martin Wistuba, Nicolas Schilling


Networks [in Salles Maisonneuve B, C]

152 research Temporal Network Change Detection Using Network Centrality, Yoshitaro Yonamoto, Kai Morino and Kenji Yamanishi

255 research Harvester: Influence Optimization in Symmetric Interaction Networks, Sergey Ivanov and Panagiotis Karras

297 research Pattern Matching Trajectories for Investigative Graph Searches, Benjamin Hung, Anura Jayasumana and Vidarshana Bandara

240 research A Framework for Description and Analysis of Approximate Triangle Counting Algorithms, Mostafa Haghir Chehreghani

180 research Limiting the diffusion of information by a selective PageRank-preserving approach, Grigorios Loukides, Robert Gwadera


Anonymity, Fraud, and Privacy [in Salles Maisonneuve D]

30 research An Exploratory Statistical Cusp Catastrophe Model, Ding-Geng Chen, Xinguang Chen and Kai Zhang

34 research Using Loglinear Model for Discrimination Discovery and Prevention, Yongkai Wu and Xintao Wu

145 applications Fraud Detection in Energy Consumption: A Supervised Approach, Bernat Coma-Puig, Josep Carmona, Ricard Gavalda, Santiago Alcoverro, Victor Martin

270 applications Anomaly Detection in Automobile Control Network Data with Long Short-Term Memory Networks, Adrian Taylor, Sylvain Leblanc and Nathalie Japkowicz

220 applications Anonymizing NYC Taxi Data: Does It Matter? Marie Douriez, Harish Doraiswamy, Claudio Silva and Juliana Freire


SS1: Statistical Learning for Data Science [in Salles Maisonneuve E, F]

39 SS1-SLDS A Distributed Decision Tree Algorithm and Its Implementation on Big Data Platforms, Jingxiang Chen, Tao Wang, Ralph Abbey and Joseph Pingenot

81 SS1-SLDS Analysing the History of Autism Spectrum Disorder using Topic Models, Adham Beykikhoshk, Dinh Phung, Ognjen Arandelovic, Svetha Venkatesh

181 SS1-SLDS Sparse Linear Discriminant Analysis in Structured Covariates Space, Sandra Safo and Qi Long

199 SS1-SLDS Informative Priors and Bayesian Computation, Shirin Golchi

230 SS1-SLDS Causal structure learning with reduced partial correlation thresholding, Arjun Sondhi and Ali Shojaie




High-dimensional data [in Salles Maisonneuve A]

253 research Infinite Langevin Mixture Modeling and Feature Selection, Ola Amayri and Nizar Bouguila

29 research Efficient Identification of Tanimoto Nearest Neighbors, David Anastasiu, George Karypis

37 research Parallel Least-Squares Policy Iteration, Jun-Kun Wang, Shou-De Lin

56 research Dilation of Chisini-Jensen-Shannon Divergence, Piyush Sharma and Gary Holness

100 research Projecting "better than randomly": How to reduce the dimensionality of very large datasets in a way that outperforms random projections, Michael Wojnowicz, Glenn Chisholm, Xuan Zhao and Matt Wolff


Social Media and Crowd [in Salles Maisonneuve B, C]

18 research Task Composition in Crowdsourcing, Sihem Amer-Yahia, Ria Borromeo, Eric Gaussier, Vincent Leroy, Julien Pilourdault and Motomichi Toyama

272 research On the Role of Mentions on Tweet Virality, Soumajit Pramanik, Maximilien Danisch, Qinna Wang, Anand Kumar, Sumanth Bandi, Jean-Loup Guillaume and Bivas Mitra

60 applications Mining Pre-Exposure Prophylaxis Trends in Social Media, Patrick Breen, Jane Kelly, Tim Heckman and Shannon Quinn

112 research Overlapping Target Event and Story Line Detection of Online Newspaper Articles, Yifang Wei, Lisa Singh, Brian Gallagher and David Butler

117 research Online Collaborative Prediction of Regional Vote Results, Vincent Etter, Mohammad Emtiyaz Khan, Patrick Thiran and Matthias Grossglauser


SS2: Health Data Science [in Salles Maisonneuve D]

305 SS1-SLDS Nonparametric Adjoint-Based Inference for Stochastic Differential Equations, Harish S. Bhat, R. W. M. A. Madushani

118 SS2-HDS Actitracker: A Smartphone-based Activity Recognition System for Improving Health and Well-Being, Gary Weiss, Jeffrey Lockhart, Tony Pulickal, Paul McHugh, Isaac Ronan and Jessica Timko

258 SS2-HDS The Highly Adaptive Lasso Estimator, David Benkeser, Mark van der Laan

285 SS2-HDS Meeting Health Care Research Needs in a Kimball Integrated Data Warehouse, Robert Hart and Mu-Hsing Kuo

309 SS2-HDS MedCare: Leveraging Medication Similarity for Disease Prediction, Dipanwita Dasgupta, Nitesh V. Chawla


SS3: Environmental and Geo-spatial Data Analytics [in Salles Maisonneuve E, F]

52: Reserve Price Optimization at Scale, Daniel Austin, Sam Seljan, Julius Monello and Stephanie Tzeng

164 SS3-EnGeoData Efficient Large Scale Clustering based on Data Partitioning, Malika Bendechache, Nhien-An Le-Khac and M-Tahar Kechadi

200 SS3-EnGeoData Traffic Risk Mining Using Partially Ordered Non-negative Matrix Factorization, Taito Lee, Shin Matsushima and Kenji Yamanishi

219 SS3-EnGeoData On the Use of Ontology as a priori Knowledge into Constrained Clustering, Hatim Chahdi, Nistor Grozavu, Isabelle Mougenot, Laure Berti-Equille and Younès Bennani

227 SS3-EnGeoData Maritime Pattern Extraction from AIS data using a Genetic Algorithm, Andrej Dobrkovic, Maria-Eugenia Iacob and Jos Van Hillegersberg




Industry Keynote: Xin Fu (LinkedIn): Path to 400M Members: LinkedIn’s Data Powered Journey [in Salle De Bal - Lower Lobby]


Industry invited talk: Abesh Bhattacharjee (InfoSys): Predictive Maintenance of Automobile Parts using distributed data store and classification techniques [in Salle De Bal - Lower Lobby]


October 18


Keynote: Juliana Freire - Democratizing Urban Data Analysis [in Salle De Bal - Lower Lobby]




Temporal Analytics [in Salles Maisonneuve A]

114 research Continuous Monitoring of A/B Tests without Pain: Optional Stopping in Bayesian Testing, Alex Deng, Jiannan Lu, Shouyuan Chen

304 research Learning Temporal Dependence from Time-Series Data with Latent Variables, Baosen Zhang, Hossein Hosseini, Radha Poovendran, Sreeram Kannan

243 research Trend Detection based Regret Minimization for Bandit Problems, Paresh Nakhe, Rebecca Reiffenhaeuser

262 applications A Symbolic Tree Model for Oil and Gas Production Prediction Using Time-Series Production Data, Bingjie Wei, Helen Pinto, Xin Wang

128 research Resampling Strategies for Imbalanced Time Series, Luis Torgo, Nuno Moniz, Paula Branco


Scale [in Salles Maisonneuve B, C]

70 research Performance Improvement of MapReduce Process Using Limited Node Block Placement Policy, Sungchul Lee, Juyeon Jo and Yoohwan Kim

149 research Closest Interval Join Using MapReduce, Qiang Zhang, Andy He, Chris Liu and Eric Lo

290 research EM*: An EM algorithm for Big Data, Hasan Kurban, Mark Jenne, Mehmet M. Dalkilic

36 research Efficient Sampling-based ADMM for Distributed Data, Jun-Kun Wang, Shou-De Lin

298 research A Parallel Framework for Grid-based Bottom-up Subspace Clustering, Poonam Goyal, Sonal Kumari, Shubham Singh, Vivek Choudhary, Sundar Balasubramaniam and Navneet Goyal


SS4: Emotion and Sentiment in Intelligent Systems and Big Social Data Analysis [in Salles Maisonneuve D]

105 SS4-SentISData Connecting Opinions to Opinion-Leaders: A case study on Brazilian political protests, Ramon Vieira, Alan Neves, Fernando Mourão, Leonardo Rocha, Srinivasan Parthasarathy, Bortik Bandyopadhyay and Dárlinton B. F. Carvalho

129 SS4-SentISData Exploiting a Bootstrapping Approach for Annotating Emotions in Texts Automatically, Lea Canales, Carlo Strapparava, Ester Boldrini and Patricio Martínez-Barco

143 SS4-SentISData An Anatomy of Hate: Identifying Hate Speech in Social Media, Brian Carignan, Howard Needham

251 SS4-SentISData Senpy: A Pragmatic Linked Sentiment Analysis Framework, J. Fernando Sánchez-Rada and Carlos A. Iglesias

287 SS4-SentISData Word Segmentation Algorithms and Combined Lexical Resources for Identifying Hashtag Types, Credell Simeon, Howard Hamilton, Robert Hilderman


Tutorial 1: Continuous Measurement of Quality of Data Streams [in Salles Maisonneuve E, F]




Search and Mining [in Salles Maisonneuve A]

103 research Impact of Query Sample Selection Bias on Information Retrieval System Ranking, Massimo Melucci

295 research Mining Research Problems from Scientific Literature, Chanakya Aalla, Vikram Pudi

80 research Perceived, Projected, and True Investment Expertise: Not All Experts Provide Expert Recommendations, Amit Shavit, Sameena Shah

189 applications A Multi-granularity Pattern-based Sequence Classification Framework for Educational Data, Mohammad Jaber, Peter Wood, Panagiotis Papapetrou and Ana González-Marcos

17: Testing Interestingness Measures in Practice: A Large-Scale Analysis of Buying Patterns, Martin Kirchgessner, Vincent Leroy, Sihem Amer-Yahia, Sashwat Mishra


Relational and Structured Data [in Salles Maisonneuve B, C]

98 research Inconsistent Node Flattening for Improving Top-down Hierarchical Classification, Azad Naik, Huzefa Rangwala

225 research Learning Multifaceted Latent Activities from Heterogeneous Mobile Data, Thanh Binh Nguyen, Vu Nguyen, Thuong Nguyen, Svetha Venkatesh, Mohan Kumar and Dinh Phung

316 research The synthetic data vault, Neha Patki, Roy Wedge and Kalyan Veeramachaneni

58 research A Decision Tree-based Approach for Categorizing Spatial Database Query Results, Xiangfu Meng, Xiaoyan Zhang and Jinguang Sun

245 research The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain, Trey Grainger, Khalifeh Aljadda, Mohammed Korayem and Andries Smith


SS5: Game Data Science + SS6: Stat and Math Tools for DM [in Salles Maisonneuve D]

69 SS5-GDS Using players' Gameplay Action-Decision Profiles to prescribe training: Reducing training costs with Serious Games Analytics, Christian Loh, I-Hung Li

186 SS5-GDS What did I do Wrong in my MOBA Game?: Mining Patterns Discriminating Deviant Behaviours, Olivier Cavadenti, Victor Codocedo, Jean-Francois Boulicaut and Mehdi Kaytoue

212 SS5-GDS On The "Tiny yet Real Happiness" Phenomenon in The Mobile Games Market, Po-Heng Chen, Yi-Pei Tu and Kuan-Ta Chen

139 SS6-SMTDM The Uniqueness and Greedy Method for Quadratic Compressive Sensing, Jun Fan, Lingchen Kong, Liqun Wang, Naihua Xiu

274 SS6-SMTDM Robust Online Time Series Prediction with Recurrent Neural Networks, Tian Guo, Zhao Xu, Xin Yao, Karl Aberer and Koichi Funaya


Tutorial 1: Continuous Measurement of Quality of Data Streams (continued) [in Salles Maisonneuve E, F]




Trends & Controversy + Panel [in Salle De Bal - Lower Lobby]


October 19


Award ceremony + DSAA17 [in Salle De Bal - Lower Lobby]


Keynote: David Donoho [in Salle De Bal - Lower Lobby]




Predictive Analytics [in Salles Maisonneuve A]

293 research Prediction engineering: Enabling agile predictive analytics, Max Kanter, Kalyan Veeramachaneni and Owen Gillespie

310 research Trane: A Language to Express Predictive Problems, Benjamin Schreck, Kalyan Veeramachaneni

250 applications Detecting Inaccurate Predictions of Pediatric Surgical Durations, Zhengyuan Zhou, Daniel Miller, Neal Master, David Scheinker, Nicholas Bambos and Peter Glynn

188 applications Advanced Analytics for Train Delay Prediction Systems by Including Exogenous Weather Data, Luca Oneto, Emanuele Fumeo, Giorgio Clerico, Renzo Canepa, Federico Papa, Carlo Dambra, Nadia Mazzino and Davide Anguita

82 applications Waiting to be Sold: Prediction of Time-Dependent House Selling Probability, Mansurul Bhuiyan, Mohammad Al Hasan


SS7: Data Science for Agricultural Decision Support Systems [in Salles Maisonneuve B, C]

SS7-DS4ADSS Data science and digital agriculture at The Climate Corporation, Steve Sain (invited)

185 SS7-DS4ADSS Disease Detection and Severity Estimation in Cotton Plant from Unconstrained Images, Aditya Parikh, Mehul Raval, Chandrasinh Parmar and Sanjay Chaudhary

273 SS7-DS4ADSS Digital Knowledge Ecosystem for achieving Sustainable Agriculture Production: A Case Study from Sri Lanka, Athula Ginige, Anusha Walisadeera, Tamara Ginige, Lasanthi De Silva, Pasquale Di Giovanni, Maneesh Mathai, Jeevani Goonethilaka, Gihan Wikramanayake, Giuliana Vitiello, Monica Sebillo, Genny Tortora, Deborah Richards and Ramesh Jain


Tutorial 2: Model Selection and Error Estimation without the Agonizing Pain [in Salles Maisonneuve D]


Tutorial 3: Similarity Search on Time Series Data: Past, Present and Future [in Salles Maisonneuve E, F]




Business Intelligence [in Salles Maisonneuve A]

84 applications BOTS: Behavior-oriented Optimal Time Segmentation for Personalized Rules of Mobile Phone Users, Iqbal Sarker

48 applications Customer Simulation for Direct Marketing Experiments, Yegor Tkachenko, Mykel Kochenderfer and Krzysztof Kluza

124 applications Online Experimentation Diagnosis and Troubleshooting Beyond AA Validation, Zhenyu Zhao, Miao Chen, Don Matheson and Maria Stone

140 applications Role Models: Mining Role Transitions Data in IT Project Management, Girish Palshikar, Sachin Pawar and Nitin Ramrakhiyani

92 applications Deconstructing Domain Names to Reveal Latent Topics, Cheryl Flynn, Kenneth Shirley, Wei Wang


SS8: Big Behavioral Data Analytics [in Salles Maisonneuve B, C]

232: Uncovering the Bitcoin blockchain: an analysis of the full users graph, Damiano Di Francesco Maesa, Andrea Marino and Laura Ricci

146 SS8-BBDA Data-driven Sales Leads Prediction for Everything-as-a-Service in the Cloud, Chul Sung, Bo Zhang, Chunhui Higgins and Yoonsuck Choe

159 SS8-BBDA Churn Prediction in Mobile Social Games: Towards a Complete Assessment Using Survival Ensembles, Africa Perianez, Alain Saas, Anna Guitart and Colin Magne

209 SS8-BBDA Web Behavior Analysis Using Sparse Non-Negative Matrix Factorization, Akihiro Demachi, Shin Matsushima and Kenji Yamanishi

218 SS8-BBDA EBM: Evidence-Based Behavioral Model for Calendar Schedules of Individual Mobile Users, Iqbal H. Sarker


Tutorial 2: Model Selection and Error Estimation without the Agonizing Pain (continued) [in Salles Maisonneuve D]


Tutorial 3: Similarity Search on Time Series Data: Past, Present and Future (continued) [in Salles Maisonneuve E, F]





