2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29 - November 1, 2015

2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29 - November 1, 2015. IEEE, 2015. [doi]

Conference: bigdataconf2015

How big data changes statistical machine learning. Léon Bottou. 1 [doi]

Moving past the "Wild West" era for Big Data. H. V. Jagadish. 2 [doi]

Conquering Big Data with Spark. Ion Stoica. 3 [doi]

Online and on-demand partitioning of streaming graphs. Ioanna Filippidou, Yannis Kotidis. 4-13 [doi]

Learning to accurately COUNT with query-driven predictive analytics. Christos Anagnostopoulos, Peter Triantafillou. 14-23 [doi]

Practical message-passing framework for large-scale combinatorial optimization. Inho Cho, Soya Park, Sejun Park, Dongsu Han, Jinwoo Shin. 24-31 [doi]

Rewriting complex SPARQL analytical queries for efficient cloud-based processing. Padmashree Ravindra, HyeongSik Kim, Kemafor Anyanwu. 32-37 [doi]

Concept hierarchies and human navigation. Salvador Aguiñaga, Aditya Nambiar, Zuozhu Liu, Tim Weninger. 38-45 [doi]

Iteratively refining SVMs using priors. Enric Junqué de Fortuny, Theodoros Evgeniou, David Martens, Foster J. Provost. 46-52 [doi]

Towards scalable quantile regression trees. Harish S. Bhat, Nitesh Kumar, Garnet J. Vaz. 53-60 [doi]

Super-CWC and super-LCC: Super fast feature selection algorithms. Kilho Shin, Tetsuji Kuboyama, Takako Hashimoto, Dave Shepard. 61-67 [doi]

Considerations and recommendations for data availability for data analytics for manufacturing. Don Libes, Seungjun Shin, Jungyub Woo. 68-75 [doi]

ScaleGraph: A high-performance library for billion-scale graph analytics. Toyotaro Suzumura, Koji Ueno. 76-84 [doi]

System and architecture level characterization of big data applications on big and little core server architectures. Maria Malik, Setareh Rafatirah, Avesta Sasan, Houman Homayoun. 85-94 [doi]

Data streaming algorithms for the Kolmogorov-Smirnov test. Ashwin Lall. 95-104 [doi]

Techniques for fast and scalable time series traffic generation. Jilong Kuang, Daniel G. Waddington, Changhui Lin. 105-114 [doi]

Energy-efficient acceleration of big data analytics applications using FPGAs. Katayoun Neshatpour, Maria Malik, Mohammad Ali Ghodrat, Avesta Sasan, Houman Homayoun. 115-123 [doi]

Workload scheduling in distributed stream processors using graph partitioning. Lorenz Fischer, Abraham Bernstein. 124-133 [doi]

Evaluating different distributed-cyber-infrastructure for data and compute intensive scientific application. Arghya Kusum Das, Seung-Jong Park, Jae-Ki Hong, Wooseok Chang. 134-143 [doi]

Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join. Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, Philippas Tsigas. 144-153 [doi]

When computing meets heterogeneous cluster: Workload assignment in graph computation. Jilong Xue, Zhi Yang, Shian Hou, Yafei Dai. 154-163 [doi]

A scalable parallel XQuery processor. E. Preston Carman Jr., Till Westmann, Vinayak R. Borkar, Michael J. Carey, Vassilis J. Tsotras. 164-173 [doi]

Computing load aware and long-view load balancing for cluster storage systems. Guoxin Liu, Haiying Shen, Haoyu Wang. 174-183 [doi]

Distributed Frank-Wolfe under pipelined stale synchronous parallelism. Nam-Luc Tran, Thomas Peel, Sabri Skhiri. 184-192 [doi]

Evaluating cloud frameworks on genomic applications. Michele Bertoni, Stefano Ceri, Abdulrahman Kaitoua, Pietro Pinoli. 193-202 [doi]

Towards green cloud computing: Demand allocation and pricing policies for cloud service brokerage. Chenxi Qiu, Haiying Shen, Liuhua Chen. 203-212 [doi]

Elastic complex event processing exploiting prediction. Nikos Zacheilas, Vana Kalogeraki, Nikolaos Zygouras, Nikolaos Panagiotou, Dimitrios Gunopulos. 213-222 [doi]

PortHadoop: Support direct HPC data processing in Hadoop. Xi Yang, Ning Liu, Bo Feng, Xian-He Sun, Shujia Zhou. 223-232 [doi]

Machine learning at the limit. John Canny, Huasha Zhao, Bobby Jaros, Ye Chen, Jiangchang Mao. 233-242 [doi]

Performance characterization and acceleration of in-memory file systems for Hadoop and Spark applications on HPC clusters. Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda. 243-252 [doi]

Panopticon: A lock broker architecture for scalable transactions in the datacenter. Serafettin Tasci, Murat Demirbas. 253-262 [doi]

Toward locality-aware scheduling for containerized cloud services. Dongfang Zhao, NagaPramod Mandagere, Gabriel Alatorre, Mohamed Mohamed, Heiko Ludwig. 263-270 [doi]

ATOM: Automated tracking, orchestration and monitoring of resource usage in infrastructure as a service systems. Min Du, Feifei Li. 271-278 [doi]

Composable and efficient functional big data processing framework. Dongyao Wu, Sherif Sakr, Liming Zhu, Qinghua Lu. 279-286 [doi]

Hybrid active learning for non-stationary streaming data with asynchronous labeling. Hyunjoo Kim, Sriganesh Madhvanath, Tong Sun. 287-292 [doi]

Octopus: A multi-job scheduler for GraphLab. Srikant Padala, Dinesh Kumar, Arun Raj, Janakiram Dharanipragada. 293-298 [doi]

Spark deployment and performance evaluation on the MareNostrum supercomputer. Rubén Tous, Anastasios Gounaris, Carlos Tripiana, Jordi Torres, Sergi Girona, Eduard Ayguadé, Jesús Labarta, Yolanda Becerra, David Carrera, Mateo Valero. 299-306 [doi]

G-Storm: GPU-enabled high-throughput online data processing in Storm. Zhenhua Chen, Jielong Xu, Jian Tang, Kevin A. Kwiat, Charles A. Kamhoua. 307-312 [doi]

Chronos: Failure-aware scheduling in shared Hadoop clusters. Orcun Yildiz, Shadi Ibrahim, Tran Anh Phuong, Gabriel Antoniu. 313-318 [doi]

An architecture for stream OLAP exploiting SPE and OLAP engine. Kousuke Nakabasami, Toshiyuki Amagasa, Salman Ahmed Shaikh, Franck Gass, Hiroyuki Kitagawa. 319-326 [doi]

Two-mode data distribution scheme for heterogeneous storage in data centers. Wei Xie, Jiang Zhou, Mark Reyes, Jason Noble, Yong Chen. 327-332 [doi]

A predictive scheduling framework for fast and distributed stream data processing. Teng Li, Jian Tang, Jielong Xu. 333-338 [doi]

A scalable implementation of information theoretic feature selection for high dimensional data. Anthony Kleerekoper, Michael Pappas, Adam Pocock, Gavin Brown, Mikel Luján. 339-346 [doi]

Edge importance identification for energy efficient graph processing. S. M. Faisal, G. Tziantzioulis, A. M. Gok, Nikolaos Hardavellas, Seda Ogrenci Memik, Srinivasan Parthasarathy. 347-354 [doi]

Regular expression acceleration on the Micron Automata Processor: Brill tagging as a case study. Keira Zhou, Jack Wadden, Jeffrey J. Fox, Ke Wang, Donald E. Brown, Kevin Skadron. 355-360 [doi]

Parallel in-memory trajectory-based spatiotemporal topological join. Suprio Ray, Angela Demke Brown, Nick Koudas, Rolando Blanco, Anil K. Goel. 361-370 [doi]

Spatially clustered join on heterogeneous scientific data sets. Bin Dong, Surendra Byna, Kesheng Wu. 371-380 [doi]

Recommending missing sensor values. Chung-Yi Li, Wei-Lun Su, Todd G. McKenzie, Fu-Chun Hsu, Shou-de Lin, Jane Yung-jen Hsu, Phillip B. Gibbons. 381-390 [doi]

The roles of network communities in social information diffusion. Cheng-Te Li, Yu-Jen Lin, Mi-Yen Yeh. 391-400 [doi]

Big data entity resolution: From highly to somehow similar entity descriptions in the Web. Vasilis Efthymiou, Kostas Stefanidis, Vassilis Christophides. 401-410 [doi]

Parallel meta-blocking: Realizing scalable entity resolution over large, heterogeneous data. Vasilis Efthymiou, George Papadakis, George Papastefanatos, Kostas Stefanidis, Themis Palpanas. 411-420 [doi]

Slingshot: A modular framework for designing data processing systems. Bogdan Simion, Daniel N. Ilha, Suprio Ray, Leslie Barron, Angela Demke Brown, Ryan Johnson. 421-430 [doi]

LabBook: Metadata-driven social collaborative data analysis. Eser Kandogan, Mary Roth, Peter M. Schwarz, Joshua Hui, Ignacio Terrizzano, Christina Christodoulakis, Renée J. Miller. 431-440 [doi]

TrustMR: Computation integrity assurance system for MapReduce. Huseyin Ulusoy, Murat Kantarcioglu, Erman Pattuk. 441-450 [doi]

AccountableMR: Toward accountable MapReduce systems. Huseyin Ulusoy, Murat Kantarcioglu, Erman Pattuk, Lalana Kagal. 451-460 [doi]

TKSimGPU: A parallel top-K trajectory similarity query processing algorithm for GPGPUs. Eleazar Leal, Le Gruenwald, Jianting Zhang, Simin You. 461-469 [doi]

A transaction model for management of replicated data with multiple consistency levels. Anand Tripathi, Bhagavathi Dhass Thirunavukarasu. 470-477 [doi]

Quadtree-based lightweight data compression for large-scale geospatial rasters on multi-core CPUs. Jianting Zhang, Simin You, Le Gruenwald. 478-484 [doi]

DSDQuery DSI - Querying scientific data repositories with structured operators. Roee Ebenstein, Gagan Agrawal. 485-492 [doi]

Brown Dog: Leveraging everything towards autocuration. Smruti Padhy, Greg Jansen, Jay Alameda, Edgar F. Black, Liana Diesendruck, Mike Dietze, Praveen Kumar, Rob Kooper, Jong Lee, Rui Liu, Richard Marciano, Luigi Marini, Dave Mattson, Barbara S. Minsker, Chris Navarro, Marcus Slavenas, William Sullivan, Jason Votava, Inna Zharnitsky, Kenton McHenry. 493-500 [doi]

Cost-efficient partitioning of spatial data on cloud. Afsin Akdogan, Saratchandra Indrakanti, Ugur Demiryurek, Cyrus Shahabi. 501-506 [doi]

BigFUN: A performance study of big data management system functionality. Pouria Pirzadeh, Michael J. Carey, Till Westmann. 507-514 [doi]

A flexible QoS fortified distributed key-value storage system for the cloud. Tonglin Li, Ke Wang, Dongfang Zhao, Kan Qiao, Iman Sadooghi, Xiaobing Zhou, Ioan Raicu. 515-522 [doi]

TPS: A task placement strategy for big data workflows. Mahdi Ebrahimi, Aravind Mohan, Shiyong Lu, Robert G. Reynolds. 523-530 [doi]

Improving transaction processing performance by consensus reduction. Yuqing Zhu, Yilei Wang. 531-538 [doi]

Benchmarking key-value stores on high-performance storage and interconnects for web-scale workloads. Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dhabaleswar K. Panda. 539-544 [doi]

An iterative methodology for big data management, analysis and visualization. Roberto Tardío, Alejandro Maté, Juan Trujillo. 545-550 [doi]

Bandwidth-efficient distributed k-nearest-neighbor search with dynamic time warping. Chin-Chi Hsu, Perng-Hwa Kung, Mi-Yen Yeh, Shou-de Lin, Phillip B. Gibbons. 551-560 [doi]

Dynamic theme tracking in Twitter. Liang Zhao, Feng Chen, Chang-Tien Lu, Naren Ramakrishnan. 561-570 [doi]

SyntacticDiff: Operator-based transformation for comparative text mining. Sean Massung, ChengXiang Zhai. 571-580 [doi]

Visual analysis of bi-directional movement behavior. Yixian Zheng, Wenchao Wu, Huamin Qu, Chunyan Ma, Lionel M. Ni. 581-590 [doi]

User-curated image collections: Modeling and recommendation. Yuncheng Li, Tao Mei, Yang Cong, Jiebo Luo. 591-600 [doi]

Angular quantization based affinity propagation clustering and its application to astronomical big spectra data. Ke Wang, Ping Guo, A.-li Luo. 601-608 [doi]

Scalable classification for large dynamic networks. Yibo Yao, Lawrence B. Holder. 609-618 [doi]

CINTIA: A distributed, low-latency index for big interval data. Ruslan Mavlyutov, Philippe Cudré-Mauroux. 619-628 [doi]

Revealing the fog-of-war: A visualization-directed, uncertainty-aware approach for exploring high-dimensional data. Yang Wang, Kwan-Liu Ma. 629-638 [doi]

Inferring crowd-sourced venues for tweets. Bokai Cao, Francine Chen, Dhiraj Joshi, Philip S. Yu. 639-648 [doi]

Core decomposition in large temporal graphs. Huanhuan Wu, James Cheng, Yi Lu, Yiping Ke, Yuzhen Huang, Da Yan, Hejun Wu. 649-658 [doi]

Recommending forum posts to designated experts. Jason H. D. Cho, Yanen Li, Roxana Girju, ChengXiang Zhai. 659-666 [doi]

Accelerating collaborative filtering using concepts from high performance computing. Mark Gates, Hartwig Anzt, Jakub Kurzak, Jack Dongarra. 667-676 [doi]

Modelling cascades over time in microblogs. Wei Xie, Feida Zhu, Siyuan Liu, Ke Wang. 677-686 [doi]

CSFinder: A cold-start friend finder in large-scale social networks. Yasser Salem, Jun Hong, Weiru Liu. 687-696 [doi]

Effectively crowdsourcing the acquisition and analysis of visual data for disaster response. Hien To, Seon Ho Kim, Cyrus Shahabi. 697-706 [doi]

Full diffusion history reconstruction in networks. Zhen Chen, Hanghang Tong, Lei Ying. 707-716 [doi]

AdaM: An adaptive monitoring framework for sampling and filtering on IoT devices. Demetris Trihinas, George Pallis, Marios D. Dikaiakos. 717-726 [doi]

Modeling graphs using a mixture of Kronecker models. Suchismit Mahapatra, Varun Chandola. 727-736 [doi]

Data quality assessment and anomaly detection via map/reduce and linked data: A case study in the medical domain. Stephen Bonner, Andrew Stephen McGough, Ibad Kureshi, John Brennan, Georgios Theodoropoulos, Laura Moss, David Corsar, Grigoris Antoniou. 737-746 [doi]

SigCO: Mining significant correlations via a distributed real-time computation engine. Tian Guo, Jean-Paul Calbimonte, Hao Zhuang, Karl Aberer. 747-756 [doi]

Identifying smallest unique subgraphs in a heterogeneous social network. Yen-Kai Wang, Wei-Ming Chen, Cheng-Te Li, Shou-de Lin. 757-766 [doi]

Toward precise user-topic alignment in online social media. JieJun Xu, Tsai-Ching Lu. 767-775 [doi]

Visual interface for exploring caution spots from vehicle recorder big data. Masahiko Itoh, Daisaku Yokoyama, Masashi Toyoda, Masaru Kitsuregawa. 776-784 [doi]

ACURDION: An adaptive clustering-based algorithm for tracing large-scale MPI applications. Amir Bahmani, Frank Mueller. 785-792 [doi]

Time maps: A tool for visualizing many discrete events across multiple timescales. Max C. Watson. 793-800 [doi]

Learning relevance from click data via neural network based similarity models. Xugang Ye, Zijie Qi, Dan Massey. 801-806 [doi]

Matisse: A visual analytics system for exploring emotion trends in social media text streams. Chad A. Steed, Margaret Drouhard, Justin M. Beaver, Joshua Pyle, Paul Logasa Bogen. 807-814 [doi]

Robust crowd bias correction via dual knowledge transfer from multiple overlapping sources. Sihong Xie, Qingbo Hu, Jingyuan Zhang, Jing Gao, Wei Fan, Philip S. Yu. 815-820 [doi]

A community driven social recommendation system. Deepika Lalwani, Durvasula V. L. N. Somayajulu, P. Radha Krishna. 821-826 [doi]

Task-based recommendation on a web-scale. Yongfeng Zhang, Min Zhang, Yiqun Liu, Tat-Seng Chua, Yi Zhang, Shaoping Ma. 827-836 [doi]

Multi-modal learning for video recommendation based on mobile application usage. Xiaowei Jia, Aosen Wang, Xiaoyi Li, Guangxu Xun, Wenyao Xu, Aidong Zhang. 837-842 [doi]

Improving EEG feature learning via synchronized facial video. Xiaoyi Li, Xiaowei Jia, Guangxu Xun, Aidong Zhang. 843-848 [doi]

MMC-margin: Identification of maximum frequent subgraphs by Metropolis Monte Carlo sampling. Muyi Liu, Michael Gribskov. 849-856 [doi]

KeyLabel algorithms for keyword search in large graphs. Yue Wang, Ke Wang, Ada Wai-Chee Fu, Raymond Chi-Wing Wong. 857-864 [doi]

Spatio-temporal asynchronous co-occurrence pattern for big climate data towards long-lead flood prediction. Chung-Hsien Yu, Dong Luo, Wei Ding, Joseph Paul Cohen, David L. Small, Shafiqul Islam. 865-870 [doi]

Using big data to study the link between human mobility and socio-economic development. Luca Pappalardo, Dino Pedreschi, Zbigniew Smoreda, Fosca Giannotti. 871-878 [doi]

Cluster-based aggregate forecasting for residential electricity demand using smart meter data. Tri Kurniawan Wijaya, Matteo Vasirani, Samuel Humeau, Karl Aberer. 879-887 [doi]

A scalable approach for data-driven taxi ride-sharing simulation. Masayo Ota, Huy T. Vo, Cláudio T. Silva, Juliana Freire. 888-897 [doi]

EveryoneCounts: Data-driven digital advertising with uncertain demand model in metro networks. Desheng Zhang, Ruobing Jiang, Shuai Wang, Yanmin Zhu, Bo Yang, Jian Cao, Fan Zhang, Tian He. 898-907 [doi]

Fast decentralized gradient descent method and applications to in-situ seismic tomography. Liang Zhao, Wen-Zhan Song, Xiaojing Ye. 908-917 [doi]

Scientific computing meets big data technology: An astronomy use case. Zhao Zhang, Kyle Barbary, Frank Austin Nothaft, Evan R. Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, Saul Perlmutter. 918-927 [doi]

An interactive learning framework for scalable classification of pathology images. Michael Nalisnik, David A. Gutman, Jun Kong, Lee A. D. Cooper. 928-935 [doi]

America Tweets China: A fine-grained analysis of the state and individual characteristics regarding attitudes towards China. Yu Wang, Jianbo Yuan, Jiebo Luo. 936-943 [doi]

A data-driven approach to extract connectivity structures from diffusion tensor imaging data. Yu Jin, Joseph F. JáJá, Rong Chen, Edward H. Herskovits. 944-951 [doi]

A MapReduce based k-NN joins probabilistic classifier. Georgios Chatzigeorgakidis, Sophia Karagiorgou, Spiros Athanasiou, Spiros Skiadopoulos. 952-957 [doi]

Scalable k-NN based text clustering. Alessandro Lulli, Thibault Debatty, Matteo Dell'Amico, Pietro Michiardi, Laura Ricci. 958-963 [doi]

An ensemble learning based approach for building airfare forecast service. Yuwen Chen, Jian Cao, Shanshan Feng, Yudong Tan. 964-969 [doi]

Next-term student grade prediction. Mack Sweeney, Jaime Lester, Huzefa Rangwala. 970-975 [doi]

Predicting the location of users on Twitter from low density graphs. Sofia Apreleva, Alejandro Cantarero. 976-983 [doi]

How not to drown in a sea of information: An event recognition approach. Elias Alevizos, Alexander Artikis, Kostas Patroumpas, Marios Vodas, Yannis Theodoridis, Nikos Pelekis. 984-990 [doi]

Smog disaster forecasting using social web data and physical sensor data. Jiaoyan Chen, Huajun Chen, Daning Hu, Jeff Z. Pan, Yalin Zhou. 991-998 [doi]

Large scale support vector regression for aviation safety. Kamalika Das, Kanishka Bhaduri, Bryan L. Matthews, Nikunj C. Oza. 999-1006 [doi]

City users' classification with mobile phone data. Lorenzo Gabrielli, Barbara Furletti, Roberto Trasarti, Fosca Giannotti, Dino Pedreschi. 1007-1012 [doi]

Spaler: Spark and GraphX based de novo genome assembler. Anas Abu-Doleh, Ümit V. Çatalyürek. 1013-1018 [doi]

Traffic forecasting in complex urban networks: Leveraging big data and machine learning. Florin Schimbinschi, Xuan Vinh Nguyen, James Bailey, Christopher Leckie, Hai Vu, Rao Kotagiri. 1019-1024 [doi]

Prediction of physiological subsystem failure and its impact in the prediction of patient mortality. Karla L. Caballero Barajas, Ram Akella. 1025-1030 [doi]

Efficient distributed maximum matching for solving the container exchange problem in the maritime industry. Fei Shao, Li-Yung Ho, Jan-Jan Wu, Pangfeng Liu. 1031-1036 [doi]

Cell analytics in compound hit selection of bacterial inhibitors. Robert P. Trevino, Steve A. Kawamoto, Thomas J. Lamkin, Huan Liu. 1037-1042 [doi]

Mining target users for online marketing based on App Store data. Xiuqiang He, Wenyuan Dai, Guoxiang Cao, Ruiming Tang, Mingxuan Yuan, Qiang Yang. 1043-1052 [doi]

Scalable community discovery from multi-faceted graphs. Ahmed Metwally, Jia-Yu Pan, Minh Doan, Christos Faloutsos. 1053-1062 [doi]

Towards real-time customer experience prediction for telecommunication operators. Ernesto Diaz-Aviles, Fabio Pinelli, Karol Lynch, Zubair Nabi, Yiannis Gkoufas, Eric Bouillet, Francesco Calabrese, Eoin Coughlan, Peter Holland, Jason Salzwedel. 1063-1072 [doi]

Early experience with optimizing I/O performance using high-performance SSDs for in-memory cluster computing. I. Stephen Choi, Weiqing Yang, Yang-Suk Kee. 1073-1083 [doi]

An evaluation of alternative shared-nothing architecture for analytical processing systems. Hyunsik Choi, Jongyoung Park, Yong In Lee, Kangho Roh, Kwanghyun La. 1084-1093 [doi]

Controlled experiments for decision-making in e-Commerce search. Anjan Goswami, Wei Han, Zhenrui Wang, Angela Jiang. 1094-1102 [doi]

Semantics for Big Data access & integration: Improving industrial equipment design through increased data usability. Jenny Weisenberg Williams, Paul Cuddihy, Justin McHugh, Kareem S. Aggour, Arvind Menon, Steven M. Gustafson, Timothy Healy. 1103-1112 [doi]

Online anomaly detection over Big Data streams. Laura Rettig, Mourad Khayati, Philippe Cudré-Mauroux, Michal Piórkowski. 1113-1122 [doi]

Contextual verification for false alarm reduction in maritime anomaly detection. Aungon Nag Radon, Ke Wang, Uwe Glässer, Hans Wehn, Andrew Westwell-Roper. 1123-1133 [doi]

Batch-mode active learning for technology-assisted review. Tanay Kumar Saha, Mohammad Al Hasan, Chandler Burgess, Md. Ahsan Habib, Jeff Johnson. 1134-1143 [doi]

A pipeline for extracting and deduplicating domain-specific knowledge bases. Mayank Kejriwal, Qiaoling Liu, Ferosh Jacob, Faizan Javed. 1144-1153 [doi]

EXOS: Expansion on session for enhancing effectiveness of query auto-completion. Fang-Hsiang Su, Manas Somaiya, Shrish Mishra, Rajyashree Mukherjee. 1154-1163 [doi]

Probabilistic k^m-anonymity: Efficient anonymization of large set-valued datasets. Gergely Ács, Jagdish Prasad Achara, Claude Castelluccia. 1164-1173 [doi]

ADMM based scalable machine learning on Spark. Sauptik Dhar, Congrui Yi, Naveen Ramakrishnan, Mohak Shah. 1174-1182 [doi]

Record-aware compression for big textual data analysis acceleration. Dapeng Dong, John Herbert. 1183-1190 [doi]

Graph analytics using Vertica relational database. Alekh Jindal, Samuel Madden, Malú Castellanos, Meichun Hsu. 1191-1200 [doi]

Automotive big data: Applications, workloads and infrastructures. André Luckow, Ken Kennedy, Fabian Manhardt, Emil Djerekarov, Bennie Vorster, Amy W. Apon. 1201-1210 [doi]

Cost-sensitive optimization of automated inspection. Goktug T. Cinar, Jeffrey Thompson, Soundar Srinivasan. 1211-1219 [doi]

From performance profiling to predictive analytics while evaluating Hadoop cost-efficiency in ALOJA. Nicolás Poggi, Josep Lluis Berral, David Carrera, Aaron Call, Fabrizio Gagliardi, Rob Reinauer, Nikola Vujic, Daron Green, José A. Blakeley. 1220-1229 [doi]

Query sense disambiguation leveraging large scale user behavioral data. Mohammed Korayem, Camilo Ortiz, Khalifeh AlJadda, Trey Grainger. 1230-1237 [doi]

Personalized expertise search at LinkedIn. Viet Ha-Thuc, Ganesh Venkataraman, Mario Rodriguez, Shakti Sinha, Senthil Sundaram, Lin Guo. 1238-1247 [doi]

How valuable is your data? A quantitative approach using data mining. Vinay Deolalikar. 1248-1253 [doi]

Mining lifestyle personas at scale in e-commerce. Kang Li, Vinay Deolalikar, Neeraj Pradhan. 1254-1261 [doi]

SDFS: Secure distributed file system for data-at-rest security for Hadoop-as-a-service. Petros Zerfos, Hangu Yeo, Brent D. Paulovicks, Vadim Sheinin. 1262-1271 [doi]

Open research challenges with Big Data - A data-scientist's perspective. Sreenivas R. Sukumar. 1272-1278 [doi]

Maritime situation analysis framework: Vessel interaction classification and anomaly detection. Hamed Yaghoubi Shahir, Uwe Glässer, Amir Yaghoubi Shahir, Hans Wehn. 1279-1289 [doi]

PAIRS: A scalable geo-spatial data analytics platform. Levente J. Klein, Fernando J. Marianno, Conrad M. Albrecht, Marcus Freitag, Siyuan Lu, Nigel Hinds, Xiaoyan Shao, Sergio Bermudez Rodriguez, Hendrik F. Hamann. 1290-1298 [doi]

Post-purchase recommendations in large-scale online marketplaces. Jayasimha Katukuri, Tolga Könik, Rajyashree Mukherjee, Santanu Kolay. 1299-1305 [doi]

Revenue maximization for telecommunications company with social viral marketing. Hong-Han Shuai, Chih-Ya Shen, Hsiang-Chun Hsu, De-Nian Yang, Chung-Kuang Chou, Jihg-Hong Lin, Ming-Syan Chen. 1306-1310 [doi]

Developer toolchains for large-scale analytics: Two case studies. Stephanie Rosenthal, Scott McMillan, Matthew E. Gaston. 1311-1316 [doi]

Enterprise subscription churn prediction. Ramakrishna Vadakattu, Bibek Panda, Swarnim Narayan, Harshal Godhia. 1317-1321 [doi]

Data deidentification in medical transcriptions using regular expressions and machine learning. Joshua Seeger, Aron Culotta, Jason Keller, Patrick van Kessel, Michael Jugovich. 1322-1323 [doi]

Macau: Large-scale skill sense disambiguation in the online recruitment domain. Qinlong Luo, Meng Zhao, Faizan Javed, Ferosh Jacob. 1324-1329 [doi]

Genomic analysis with MapReduce. Wei-Yi Liu, Hui-I. Hsiao, Shih-Yao Dai. 1330-1335 [doi]

Eagle: User profile-based anomaly detection for securing Hadoop clusters. Chaitali Gupta, Ranjan Sinha, Yong Zhang. 1336-1343 [doi]

Investigating insurance fraud using social media. Manuel Diaz-Granados, Javier Diaz Montes, Manish Parashar. 1344-1349 [doi]

A document-based data model for large scale computational maritime situational awareness. Luca Cazzanti, Leonardo M. Millefiori, Gianfranco Arcieri. 1350-1356 [doi]

Modeling social influences from call records and mobile web browsing histories. Jhao-Yin Li, Mi-Yen Yeh, Ming-Syan Chen, Jihg-Hong Lin. 1357-1361 [doi]

Next generation biobanks. Christian Seebode, Matthias Ort, Peter Hufnagl, Christian R. A. Regenbrecht. 1362-1367 [doi]

Business understanding, challenges and issues of Big Data Analytics for the servitization of a capital equipment manufacturer. Mikel Nino, José Miguel Blanco, Arantza Illarramendi. 1368-1377 [doi]

Data driven predictive analytics for a spindle's health. Divya Sardana, Raj Bhatnagar, Radu Pavel, Jonathan Iverson. 1378-1387 [doi]

A "smart component" data model in PLM. Yunpeng Li, Utpal Roy, Seung-Jun Shin, Y. Tina Lee. 1388-1397 [doi]

Big data process analytics for continuous process improvement in manufacturing. Nenad Stojanovic, Marko Dinic, Ljiljana Stojanovic. 1398-1407 [doi]

Automated uncertainty quantification analysis using a system model and data. Saideep Nannapaneni, Sankaran Mahadevan, David Lechevalier, Anantha Narayanan, Sudarsan Rachuri. 1408-1417 [doi]

Analysis and optimization in smart manufacturing based on a reusable knowledge base for process performance models. Alexander Brodsky, Guodong Shao, Mohan Krishnamoorthy, Anantha Narayanan, Daniel A. Menascé, Ronay Ak. 1418-1427 [doi]

A neural network meta-model and its application for manufacturing. David Lechevalier, Steven Hudak, Ronay Ak, Y. Tina Lee, Sebti Foufou. 1428-1435 [doi]

Performance assessment and uncertainty quantification of predictive models for smart manufacturing systems. Luca Oneto, Ilenia Orlandi, Davide Anguita. 1436-1445 [doi]

Time complexity and architecture of a cloud based prognostics system for a multi-client condition monitoring activity. Ashwin K. Thillai Natarajan, Sagar Kamarthi. 1446-1450 [doi]

Real-time energy prediction for a milling machine tool using sparse Gaussian process regression. Jinkyoo Park, Kincho H. Law, Raunak Bhinge, Mason Chen, David Dornfeld, Sudarsan Rachuri. 1451-1460 [doi]

Parallel Particle Swarm Optimization (PPSO) clustering for learning analytics. Kannan Govindarajan, David Boulanger, Vivekanandan Suresh Kumar, Kinshuk. 1461-1465 [doi]

Analysis and prediction of E-customers' behavior by mining clickstream data. Gokhan Silahtaroglu, Hale Donertasli. 1466-1472 [doi]

High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm. Jeyhun Karimov, A. Murat Ozbayoglu. 1473-1478 [doi]

Data decomposition and dual clustering for clinical care management. Shusaku Tsumoto, Shoji Hirano, Haruko Iwata. 1575-1584 [doi]

Agile text mining with Sherlok. Renaud Richardet, Jean-Cédric Chappelier, Shreejoy Tripathy, Sean L. Hill. 1479-1484 [doi]

Scalable adaptive label propagation in Grappa. Golnoosh Farnadi, Zeinab Mahdavifar, Ivan Keller, Jacob Nelson, Ankur Teredesai, Marie-Francine Moens, Martine De Cock. 1485-1491 [doi]

Profiling subscribers according to their internet usage characteristics and behaviors. Kasim Oztoprak. 1492-1499 [doi]

QueRIE reloaded: Using matrix factorization to improve database query recommendations. Magdalini Eirinaki, Sweta Patel. 1500-1508 [doi]

Monitoring adolescent alcohol use via multimodal analysis in social multimedia. Ran Pang, Agustin Baretto, Henry A. Kautz, Jiebo Luo. 1509-1518 [doi]

An efficient map-reduce algorithm for computing formal concepts from binary data. Raj Bhatnagar, Lalit Kumar. 1519-1528 [doi]

Learning relaxed 3-clusters from pairs of related datasets. Jagadeesh Patchala, Raj Bhatnagar. 1529-1538 [doi]

Parallel information fusion method for microarray data analysis. Jun Meng, Rui Li, Jing Zhang. 1539-1544 [doi]

A-Star algorithm based on-demand routing protocol for hierarchical LEO/MEO satellite networks. Xuezhi Ji, Lixiang Liu, Pei Zhao, Dapeng Wang. 1545-1549 [doi]

Granular modeling with fuzzy comparators. Lukasz Sosnowski, Marcin S. Szczuka, Dominik Slezak. 1550-1555 [doi]

Agglomerative algorithm to discover semantics from unstructured big data. I. Jen Chiang. 1556-1563 [doi]

A granular approach for identifying user knowledge. Alexander Denzler, Marcel Wehrle, Andreas Meier. 1564-1569 [doi]

Twitter opinion mining for adverse drug reactions. Liang Wu, Teng-Sheng Moh, Natalia Khuri. 1570-1574 [doi]

Holistic entity matching across knowledge graphs. Maria Pershina, Mohamed Yakout, Kaushik Chakrabarti. 1585-1590 [doi]

GrC-based statistic optimization algorithm for big truth table. Zehua Chen, He Ma, Yu Zhang. 1591-1596 [doi]

Mining incomplete data with many attribute-concept values and "do not care" conditions. Patrick G. Clark, Jerzy W. Grzymala-Busse. 1597-1602 [doi]

Chinese wall security policies information flows in business cloud. Tsau-Young T. Y. Lin. 1603-1607 [doi]

Granular formalization of medical diagnostic process. Shusaku Tsumoto, Shoji Hirano. 1608-1614 [doi]

Mobile gesture-based iPhone user authentication. Karan Khare, Teng-Sheng Moh. 1615-1621 [doi]

Cost and data exploration considerations for big data prediction on the cloud. Chris Tseng, Tien Nguyen, Chetan Sharma. 1622-1628 [doi]

Mining local gazetteers of literary Chinese with CRF and pattern based methods for biographical information in Chinese history. Chao-Lin Liu, Chih-Kai Huang, Hongsu Wang, Peter K. Bol. 1629-1638 [doi]

Towards a mobile social data commons. Giles Greenway, Leonard Mack, Tobias Blanke, Mark Cote, Tom Heath. 1639-1642 [doi]

Scaling out for extreme scale corpus data. Matthew Coole, Paul Rayson, John A. Mariani. 1643-1649 [doi]

Metaphor mining in historical German novels: An unsupervised learning approach. Stefan Pernes. 1650-1652 [doi]

Predicting social trends from non-photographic images on Twitter. Mehrdad Yazdani, Lev Manovich. 1653-1660 [doi]

The coding of literary form: Data mining and the information structure of historical texts. Dallas Liddle. 1661-1666 [doi]

Plot arceology: A vector-space model of narrative structure. Benjamin M. Schmidt. 1667-1672 [doi]

A method for cross-document narrative alignment of a two-hundred-sixty-million word corpus. Ben Miller, Jennifer Olive, Shakthidhar Reddy Gopavaram, Yanjun Zhao, Ayush Shrestha, Cynthia Berger. 1673-1677 [doi]

Mixed-initiative social media analytics at the World Bank: Observations of citizen sentiment in Twitter data to explore "trust" of political actors and state institutions and its relationship to social protest. Nadya A. Calderón, Brian D. Fisher, Jeff Hemsley, Billy Ceskavich, Greg Jansen, Richard Marciano, Victoria L. Lemieux. 1678-1687 [doi]

Workload-driven adaptive data partitioning and distribution - The Cumulus approach. Ilir Fetai, Damian Murezzan, Heiko Schuldt. 1688-1697 [doi]

Account clustering in multi-tenant storage management environments. Gabor Madl, Ramani Routray, Yang Song, Rakesh Jain. 1698-1707 [doi]

Fine-tuning the consistency-latency trade-off in quorum-replicated distributed storage systems. Marlon McKenzie, Hua Fan, Wojciech M. Golab. 1708-1717 [doi]

Priority register: Application-defined replacement orderings for ad hoc reconciliation. Sathiya Prabhu Kumar, Sylvain Lefebvre, Minyoung Kim, Mark-Oliver Stehr. 1718-1727 [doi]

A generalized flow for multi-class and binary classification tasks: An Azure ML approach. Matthew Bihis, Sohini Roychowdhury. 1728-1737 [doi]

Comparison of eager and quorum-based replication in a cloud environment. Alexander Stiemer, Ilir Fetai, Heiko Schuldt. 1738-1748 [doi]

Towards a taxonomy of standards in smart data. Alexander Lenk, Leif Bonorden, Astrid Hellmanns, Nico Rödder, Stefan Jähnichen. 1749-1754 [doi]

Marlin: Taming the big streaming data in large scale video similarity search. Nan Zhu, Wenbo He, Yu Hua, Yixin Chen. 1755-1764 [doi]

Indexing historical spatio-temporal data in the cloud. Chong Zhang, Xiaoying Chen, Bin Ge, Weidong Xiao. 1765-1774 [doi]

Push-based system for molecular simulation data analysis. Vladimir Grupcev, Yi-Cheng Tu, Joseph C. Fogarty, Sagar Pandit. 1775-1784 [doi]

Challenges and opportunities on network resource management in DCN with SDN. Guan Xu, Jun Yang, Bin Dai. 1785-1790 [doi]

On the implementation of Zigzag codes for distributed storage system. Lijia Lu, Hui Li, Jun Chen, Bing Zhu, Weijuan Yin. 1791-1796 [doi]

A comprehensive evaluation of NoSQL datastores in the context of historians and sensor data analysis. Arun Kumar Kalakanti, Vinay Sudhakaran, Varsha Raveendran, Nisha Menon. 1797-1806 [doi]

Learning classifiers from remote RDF data stores augmented with RDFS subclass hierarchies. Harris T. Lin, Ngot Bui, Vasant Honavar. 1807-1813 [doi]

DISTINGER: A distributed graph data structure for massive dynamic graph processing. Guoyao Feng, Xiao Meng, Khaled Ammar. 1814-1822 [doi]

LiteMat: A scalable, cost-efficient inference encoding scheme for large RDF graphs. Olivier Curé, Hubert Naacke, Tendry Randriamalala, Bernd Amann. 1823-1830 [doi]

MQuery: A query language for scientific meshes. Alireza Rezaei Mahdiraji, Peter Baumann. 1831-1838 [doi]

A fast parallel algorithm for counting triangles in graphs using dynamic load balancing. Shaikh Arifuzzaman, Maleq Khan, Madhav V. Marathe. 1839-1847 [doi]

Scalable storage structure for pattern matching on big graph data. Janani Balaji, Rajshekhar Sunderraman. 1848-1855 [doi]

Employing in-memory data grids for distributed graph processing. Serafettin Tasci, Murat Demirbas. 1856-1864 [doi]

Current security threats and prevention measures relating to cloud services, Hadoop concurrent processing, and big data. Ather Sharif, Sarah Cooney, Shengqi Gong, Drew Vitek. 1865-1870 [doi]

Security for the scientific data services framework. Jinoh Kim, Bin Dong, Surendra Byna, Kesheng Wu. 1871-1875 [doi]

A novel framework for mitigating insider attacks in big data systemsSantosh Aditham, Nagarajan Ranganathan. 1876-1885 [doi]

Heterogeneous k-anonymization with high utilityKaterina Doka, Mingqiang Xue, Dimitrios Tsoumakos, Panagiotis Karras, Alfredo Cuzzocrea, Nectarios Koziris. 1886-1890 [doi]

Multi-probe random projection clustering to secure very large distributed datasetsLee A. Carraher, Philip A. Wilsey, Anindya Moitra, Sayantan Dey. 1891-1900 [doi]

Fast summarization and anonymization of multivariate big time seriesDymitr Ruta, Ling Cen, Ernesto Damiani. 1901-1904 [doi]

Toward big data risk analysisErnesto Damiani. 1905-1909 [doi]

A distributed framework for supporting adaptive ensemble-based intrusion detectionAlfredo Cuzzocrea, Gianluigi Folino, Pietro Sabatino. 1910-1916 [doi]

Simplifying web analytics for digital marketingAndy Bengel, Amin Shawki, Dippy Aggarwal. 1917-1918 [doi]

PAUSE: A privacy architecture for heterogeneous big data environmentsDawn N. Jutla, Peter Bodorik. 1919-1928 [doi]

Spatio-temporal queries in HBaseXiaoying Chen, Chong Zhang, Bin Ge, Weidong Xiao. 1929-1937 [doi]

Component based dataflow processing frameworkV. Gyurjyan, A. Bartle, Constantine Lukashin, S. Mancilla, R. Oyarzun, A. Vakhnin. 1938-1942 [doi]

Earth science data fusion with event building approachConstantine Lukashin, A. Bartle, E. Callaway, V. Gyurjyan, S. Mancilla, R. Oyarzun, A. Vakhnin. 1943-1947 [doi]

Climate model diagnostic analyzerSeungWon Lee, Lei Pan, Chengxing Zhai, Benyang Tang, Terry Kubar, Jia Zhang, Wei Wang. 1948-1952 [doi]

High performance analysis of big spatial dataDavid Haynes, Suprio Ray, Steven M. Manson, Ankit Soni. 1953-1957 [doi]

International standard "OGC® moving features" to address "4Vs" on locational bigdataAkinori Asahara, Hideki Hayashi, Nobuhiro Ishimaru, Ryosuke Shibasaki, Hiroshi Kanasugi. 1958-1966 [doi]

Optimizing Apache Nutch for domain specific crawling at large scaleLuis A. Lopez, Ruth E. Duerr, Siri Jodha Singh Khalsa. 1967-1971 [doi]

A Hadoop-based visualization and diagnosis framework for earth science dataShujia Zhou, Xi Yang, Xiaowen Li, Toshihisa Matsui, Si Liu, Xian-He Sun, Wei-Kuo Tao. 1972-1977 [doi]

Enabling scientific data storage and processing on big-data systemsSaman Biookaghazadeh, Yiqi Xu, Shujia Zhou, Ming Zhao 0002. 1978-1984 [doi]

Light-weight parallel Python tools for earth system modeling workflowsKevin Paul, Sheri A. Mickelson, John M. Dennis, Haiying Xu, David Brown. 1985-1994 [doi]

WDCloud: An end to end system for large-scale watershed delineation on cloudIn Kee Kim, Jacob Steele, Anthony M. Castronova, Jonathan L. Goodall, Marty Humphrey. 1995-2004 [doi]

Integrating 'Big' geoscience data into the petascale national environmental research interoperability platform (NERDIP): Successes and unforeseen challengesLesley Wyborn, Benjamin J. K. Evans. 2005-2009 [doi]

An optimized interestingness hotspot discovery framework for large gridded spatio-temporal datasetsFatih Akdag, Christoph F. Eick. 2010-2019 [doi]

SciSpark: Applying in-memory distributed computing to weather event detection and trackingRahul Palamuttam, Renato Marroquin Mogrovejo, Chris Mattmann, Brian Wilson, Kim Whitehall, Rishi Verma, Lewis J. McGibbney, Paul M. Ramirez. 2020-2026 [doi]

Detecting environmental disasters in digital news archivesAmelia Yzaguirre, Robert Warren, Mike Smit. 2027-2035 [doi]

Is Apache Spark scalable to seismic data analytics and computations?Yuzhong Yan, Lei Huang, Liqi Yi. 2036-2045 [doi]

On the efficient evaluation of array joinsPeter Baumann, Vlad Merticariu. 2046-2055 [doi]

Business information modeling: A methodology for data-intensive projects, data science and big data governanceTorsten Priebe, Stefan Markus. 2056-2065 [doi]

The need for new processes, methodologies and tools to support big data teams and improve big data project effectivenessJeffrey S. Saltz. 2066-2071 [doi]

Towards methods for systematic research on big dataManirupa Das, Renhao Cui, David R. Campbell, Gagan Agrawal, Rajiv Ramnath. 2072-2081 [doi]

Towards a big data theory modelMarco Pospiech, Carsten Felden. 2082-2090 [doi]

Three critical matters in big data projects for e-science: Different user groups, the mutually constitutive perspective, and virtual organizational capacityKerk F. Kee. 2091-2097 [doi]

Exploring the process of doing data science via an ethnographic study of a media advertising companyJeffrey S. Saltz, Ivan Shamshurin. 2098-2105 [doi]

Forecast UPC-level FMCG demand, Part I: Exploratory analysis and visualizationDazhi Yang, Gary S. W. Goh, Chi Xu, Allan N. Zhang, Orkan Akcan. 2106-2112 [doi]

Forecast UPC-level FMCG demand, Part II: Hierarchical reconciliationDazhi Yang, Gary S. W. Goh, Siwei Jiang, Allan N. Zhang, Orkan Akcan. 2113-2121 [doi]

Sparsity adjusted information gain for feature selection in sentiment analysisB. Y. Ong, S. W. Goh, Chi Xu. 2122-2128 [doi]

Dynamic aggregation for time series forecastingS. Iosevich, G. Arutyunyants, Z. Hou. 2129-2131 [doi]

Big data analytics for empowering milk yield prediction in dairy supply chainsW. J. Yan, X. Chen, O. Akcan, J. Lim, D. Yang. 2132-2137 [doi]

Profit estimation error analysis in recommender systems based on association rulesGürdal Ertek, Xu Chi, Gabriel Yee, Ong Boon Yong, Byung Geun Choi. 2138-2142 [doi]

Graph-based analysis of resource dependencies in project networksGürdal Ertek, Byung Geun Choi, Xu Chi, Dazhi Yang, Ong Boon Yong. 2143-2149 [doi]

A data fusion framework for large-scale measurement platformsPrapa Rattadilok, John McCall, Trevor Burbridge, Andrea Soppera, Philip Eardley. 2150-2158 [doi]

Sensor event mining with hybrid ensemble learning and evolutionary feature subset selection modelNijat Mehdiyev, Julian Krumeich, Dirk Werth, Peter Loos. 2159-2168 [doi]

Optimization of system architecture for Big Data analysis in climate scienceHuikyo Lee, Luca Cinquini, Daniel J. Crichton, Amy Braverman. 2169-2172 [doi]

In-situ analytics for tomographic imaging in sensor networkGoutham Kamath, Wen-Zhan Song. 2173-2176 [doi]

Ontology-driven data access at the NASA earth exchangeBeth Huffer, Marc Cotnoir, Jonathan Gleason. 2177-2181 [doi]

Strategic roadmap for the earth system grid federationDean N. Williams, Michael Lautenschlager, Venkatramani Balaji, Luca Cinquini, Cecelia DeLuca, Sebastien Denvil, Daniel Duffy, Benjamin J. K. Evans, Robert Ferraro, Martin Juckes, Claire Trenham. 2182-2190 [doi]

Constrained region selection method based on configuration space for visualization in scientific dataset searchShin'ichi Takeuchi, Komei Sugiura, Yuhei Akahoshi, Koji Zettsu. 2191-2200 [doi]

Enhancing science support in SQLPeter Baumann, Dimitar Misev. 2201-2204 [doi]

Modeling community detection using slow mixing random walksRamezan Paravi Torghabeh, Narayana Prasad Santhanam. 2205-2211 [doi]

Dimensional scalability of supervised and unsupervised concept drift detection: An empirical studyJorge David Destephen Lavaire, Anshuman Singh, Mahmoud Yousef, Sumi Singh, Xiaodong Yue. 2212-2218 [doi]

Efficient change detection for high dimensional data streamsSpiros V. Georgakopoulos, Sotiris K. Tasoulis, Vassilis P. Plagianakos. 2219-2222 [doi]

Big data analytics for demand response: Clustering over space and timeCharalampos Chelmis, Jahanvi Kolte, Viktor K. Prasanna. 2223-2232 [doi]

Finding banded patterns in big data using samplingFatimah Binta Abdullahi, Frans Coenen, Russell Martin. 2233-2242 [doi]

Scalable preference queries for high-dimensional data using map-reduceGheorghi Guzun, Joel E. Tosado, Guadalupe Canahuate. 2243-2252 [doi]

Discovering time-evolving influence from dynamic heterogeneous graphsChuan Hu, Huiping Cao. 2253-2262 [doi]

Combining activity-evaluation information with NMF for trust-link prediction in social mediaKanji Matsutani, Masahito Kumano, Masahiro Kimura, Kazumi Saito, Kouzou Ohara, Hiroshi Motoda. 2263-2272 [doi]

Identifying actionable messages on social mediaNemanja Spasojevic, Adithya Rao. 2273-2281 [doi]

Klout score: Measuring influence across multiple social networksAdithya Rao, Nemanja Spasojevic, Zhisheng Li, Trevor DSouza. 2282-2289 [doi]

Top (k1, k2) Distance-based outliers detection in an uncertain datasetFei Liu 0016, Yan Jia. 2290-2299 [doi]

Understanding the time characteristic of user behavior on online forumsGuirong Chen, Ning Wang, Fengqin Zhang, Hua Jiang. 2300-2306 [doi]

Characterizing super spreading in microblog: An epidemic-based modelYu Liu, Bin Wu, Bai Wang. 2307-2313 [doi]

A community detection method based on K-shellYang Wang, Liutong Xu, Bin Wu. 2314-2319 [doi]

How much is your information worth - A method for revenue generation for your informationDivya Rao, Wee Keong Ng. 2320-2326 [doi]

Efficient large scale distributed matrix computation with SparkRong Gu, Yun Tang, Zhaokang Wang, Shuai Wang, Xusen Yin, Chunfeng Yuan, Yihua Huang. 2327-2336 [doi]

A collaborative filtering algorithm fusing user-based, item-based and social networksBaiLing Wang, Junheng Huang, Libing Ou, Rui Wang. 2337-2343 [doi]

Mining the relation between dorm arrangement and student performanceMan Li, Ruisheng Shi. 2344-2347 [doi]

A proactive discovery and filtering solution on phishing websitesFang Lv, BaiLing Wang, Junheng Huang, Yushan Sun, Yuliang Wei. 2348-2355 [doi]

Finding community structure via rough K-means in social networkYunlei Zhang, Bin Wu. 2356-2361 [doi]

A survey of semantic similarity and its application to social network analysisShuang Zhang, Xuefeng Zheng, Changjun Hu. 2362-2367 [doi]

Dynamic community detection based on game theory in social networksFei Jiang, Jin Xu. 2368-2373 [doi]

The value of analytical queries on Social NetworksMichel de Rougemont, Guillaume Vimont. 2374-2383 [doi]

A collaborative filtering algorithm based on social network informationRui Wang, BaiLing Wang, Junheng Huang. 2384-2389 [doi]

Efficient approximation algorithms to determine minimum partial dominating sets in social networksAlina Campan, Traian Marius Truta, Matthew Beckerich. 2390-2397 [doi]

Ties that matterGarisha Chowdhary, Sanghamitra Bandyopadhyay. 2398-2403 [doi]

Sentiment expression via emoticons on social mediaHao Wang, Jorge A. Castanon. 2404-2408 [doi]

On compressing massive streaming graphs with QuadtreesMichael Nelson, Sridhar Radhakrishnan, Amlan Chatterjee, Chandra N. Sekharan. 2409-2417 [doi]

Social set visualizer: A set theoretical approach to big social data analytics of real-world eventsBenjamin Flesch, Ravi Vatrapu, Raghava Rao Mukkamala, Abid Hussain. 2418-2427 [doi]

A novel symbolization technique for time-series outlier detectionGavin Smith, James Goulding. 2428-2436 [doi]

Volatility matrix inference in high-frequency finance with regularization and efficient computationsJian Zou, Yunbo An, Hong Yan. 2437-2444 [doi]

Shaping data: Visualization under constructionOliver Bieh-Zimmert, Carsten Felden. 2445-2452 [doi]

Immersive visualization for materials science data analysis using the Oculus RiftMargaret Drouhard, Chad A. Steed, Steven Hahn, Thomas Proffen, Jamison Daniel, Michael Matheson. 2453-2461 [doi]

Spatio-temporal similarity search method for disaster estimationHideki Hayashi, Akinori Asahara, Natsuko Sugaya, Yuichi Ogawa, Hitoshi Tomita. 2462-2469 [doi]

Scalable dental computing on cyberinfrastructureHui Zhang 0006, Riqing Chen, Guangchen Ruan, Masatoshi Ando. 2470-2478 [doi]

Wrangler's user environment: A software framework for management of data-intensive computing systemChristopher Jordan, David Walling, Weijia Xu, Stephen A. Mock, Niall Gaffney, Dan Stanzione. 2479-2486 [doi]

Visual analysis of large-scale LiDAR point cloudsWanbo Luo, Hui Zhang. 2487-2492 [doi]

A database-based distributed computation architecture with Accumulo and D4M: An application of eigensolver for large sparse matrixYin Huang, Yelena Yesha, Shujia Zhou. 2493-2500 [doi]

Texture-based edge bundling: A web-based approach for interactively visualizing large graphsJieting Wu, Lina Yu, Hongfeng Yu. 2501-2508 [doi]

Big data provenance: Challenges, state of the art and opportunitiesJianwu Wang, Daniel Crawl, Shweta Purawat, Mai Nguyen, Ilkay Altintas. 2509-2516 [doi]

Performance evaluation of enabling logistic regression for big data with RRuizhu Huang, Weijia Xu. 2517-2524 [doi]

Skill grouping method: Mining and clustering skill differences from body movement BigDataShinichi Yamagiwa, Yoshinobu Kawahara, Noriyuki Tabuchi, Yoshinobu Watanabe, Takeshi Naruo. 2525-2534 [doi]

Regularized and sparse stochastic k-means for distributed large-scale clusteringVilen Jumutc, Rocco Langone, Johan A. K. Suykens. 2535-2540 [doi]

Join algorithms on GPUs: A revisit after seven yearsRan Rui, Hao Li, Yi-Cheng Tu. 2541-2550 [doi]

A data-driven approach towards patient identification for telehealth programsMartha Ganser, Sauptik Dhar, Unmesh Kurup, Carlos Cunha, Aca Gacic. 2551-2559 [doi]

Ensemble prediction of vascular injury in Trauma care: Initial efforts towards data-driven, low-cost screeningMax Metzger, Michael Howard, Lee Kellogg, Rishi Kundi. 2560-2568 [doi]

M-SEQ: Early detection of anxiety and depression via temporal orders of diagnoses in electronic health dataJinghe Zhang, Haoyi Xiong, Yu Huang, Hao Wu, Kevin Leach, Laura E. Barnes. 2569-2577 [doi]

Using clinical data, hypothesis generation tools and PubMed trends to discover the association between diabetic retinopathy and antihypertensive drugsKatherine Senter, Sreenivas R. Sukumar, Robert M. Patton, Edward Chaum. 2578-2582 [doi]

Enabling graph appliance for genome assemblyRina Singh, Jeffrey A. Graves, SangKeun Lee, Sreenivas R. Sukumar, Mallikarjun Shankar. 2583-2590 [doi]

A framework for consensual and online privacy preserving record linkage in real-timeDaniel Muller, Stefan Mau, Irena Pletikosa Cvijikj. 2591-2599 [doi]

A memory capacity model for high performing data-filtering applications in Samza frameworkTao Feng, Zhenyun Zhuang, Yi Pan, Haricharan Ramachandra. 2600-2605 [doi]

Robust and distributed web-scale near-dup document conflation in Microsoft Academic ServiceChieh-Han Wu, Yang Song. 2606-2611 [doi]

Evaluation of data quality of multisite electronic health record data for secondary analysisAlicia L. Nobles, Ketki Vilankar, Hao Wu, Laura E. Barnes. 2612-2620 [doi]

CrowdMD: Crowdsourcing-based approach for deduplicationAsma Abboura, Soror Sahri, Mourad Ouziri, Salima Benbernou. 2621-2627 [doi]

Data veracity estimation with ensembling truth discovery methodsLaure Berti-Equille. 2628-2636 [doi]

Distributed life cycle scheduling for cascaded data processingLavanya Sainik. 2637-2643 [doi]

Big data, big data quality problemDavid Becker, Trish Dunn King, Bill McMullen. 2644-2653 [doi]

Data quality issues in big dataDhana Rao, Venkat N. Gudivada, Vijay V. Raghavan 0001. 2654-2660 [doi]

Machine learning for stress detection from ECG signals in automobile driversN. Keshan, P. V. Parimi, Isabelle Bichindaritz. 2661-2669 [doi]

Sequential pattern mining of electronic healthcare reimbursement claims: Experiences and challenges in uncovering how patients are treated by physiciansKunal Malhotra, Tanner C. Hobson, Silvia Valkova, Laura L. Pullum, Arvind Ramanathan. 2670-2679 [doi]

SQL-like big data environments: Case study in clinical trial analyticsAkshay Grover, Jay Gholap, Vandana P. Janeja, Yelena Yesha, Raghu Chintalapati, Harsh Marwaha, Kunal Modi. 2680-2689 [doi]

Exploring spatio-temporal-theme correlation between physical and social streaming data for event detection and pattern interpretation from heterogeneous sensorsMinh-Son Dao, Koji Zettsu, Siripen Pongpaichet, Laleh Jalali, Ramesh Jain. 2690-2699 [doi]

Microdata analysis of the accommodation survey in Japanese tourism statisticsAki-Hiro Sato. 2700-2708 [doi]

Detecting rumor patterns in streaming social mediaShihan Wang, Takao Terano. 2709-2715 [doi]

A collaborative framework for annotating energy datasetsHông-Ân Cao, Tri Kurniawan Wijaya, Karl Aberer, Nuno Nunes. 2716-2725 [doi]

The relation between firm age distributions and the decay rate of firm activities in the United States and JapanAtushi Ishikawa, Shouji Fujimoto, Takayuki Mizuno, Tsutomu Watanabe. 2726-2731 [doi]

An epidemic simulation with a delayed stochastic SIR model based on international socioeconomic-technological databasesAki-Hiro Sato, Isao Ito, Hidefumi Sawai, Kentaro Iwata. 2732-2741 [doi]

A spatio-temporal multimedia big data framework for a large crowdBilal Sadiq, Faizan Ur Rehman, Akhlaq Ahmad, Md. Abdur Rahman, Sohaib Ghani, Abdullah Murad, Saleh Basalamah, Ahmed Lbath. 2742-2751 [doi]

Distributed dynamic elastic nets: A scalable approach for regularization in dynamic manufacturing environmentsNaveen Ramakrishnan, Rumi Ghosh. 2752-2761 [doi]

Directional decision listsMarc Goessling, Shan Kang. 2762-2766 [doi]

Analysis of key operation performance data in manufacturing systemsNingxuan Kang, Cong Zhao, Jingshan Li, John A. Horst. 2767-2770 [doi]

Outlier detection for large scale manufacturing processesAbhinav Jauhri, Bradley McDanel, Chris Connor. 2771-2774 [doi]

Fast detection of material deformation through structural dissimilarityDaniela Ushizima, Talita Perciano, Dilworth Parkinson. 2775-2781 [doi]

Data analytics and uncertainty quantification for energy prediction in manufacturingRonay Ak, Raunak Bhinge. 2782-2784 [doi]

Lambda architecture for cost-effective batch and speed big data processingMariam Kiran, Peter Murphy, Inder Monga, Jon Dugan, Sartaj Singh Baveja. 2785-2792 [doi]

Network-aware resource management for scalable data analytics frameworksThomas Renner, Lauritz Thamsen, Odej Kao. 2793-2800 [doi]

On a new approach to the index selection problem using mining algorithmsParinaz Ameri, Jörg Meyer, Achim Streit. 2801-2810 [doi]

Preparing, storing, and distributing multi-dimensional scientific dataRanjeet Devarakonda, Yaxing Wei, Michele Thornton, Ben Mayer, Peter Thornton, Bob Cook. 2811-2813 [doi]

Use of a metadata documentation and search tool for large data volumes: The NGEE arctic exampleRanjeet Devarakonda, Les Hook, Terri Killeffer, Misha Krassovski, Tom Boden, Stan Wullschleger. 2814-2816 [doi]

Data optimised computing for heterogeneous big data computing applicationsErica Yang, Derek Ross, Srikanth Nagella, Martin Turner, Winfried Kockelmann, Genoveva Burca, Federico Montesino-Pouzols. 2817-2819 [doi]

Top-k computations in MapReduce: A case study on recommendationsVasilis Efthymiou, Kostas Stefanidis, Eirini Ntoutsi. 2820-2822 [doi]

A LSTM-based method for stock returns prediction: A case study of China stock marketKai Chen 0006, Yi Zhou, Fangyan Dai. 2823-2824 [doi]

Predicting various types of user attributes in Twitter by using personalized pagerankKazuya Uesato, Hiroki Asai, Hayato Yamana. 2825-2827 [doi]

Large-scale learning with AdaGrad on SparkAsmelash Teka Hadgu, Aastha Nigam, Ernesto Diaz-Aviles. 2828-2830 [doi]

Parallelizing natural language techniques for knowledge extraction from cloud service level agreementsSudip Mittal, Karuna P. Joshi, Claudia Pearce, Anupam Joshi. 2831-2833 [doi]

Gradient-based signatures for big multimedia dataChristian Beecks, Merih Seran Uysal, Thomas Seidl 0001. 2834-2835 [doi]

Indexing media storms on FlinkDimitrios Rafailidis, Stefanos Antaris. 2836-2838 [doi]

Scaling NLP algorithms to meet high demandConnor Stokes, Anoop Kumar, Frederick Choi, Ralph M. Weischedel. 2839 [doi]

The NIST data science evaluation series: Part of the NIST information access division data science initiativeBonnie J. Dorr, Craig S. Greenberg, Peter Fontana, Mark A. Przybocki, Marion Le Bras, Cathryn A. Ploehn, Oleg Aulov, Wo Chang. 2840-2842 [doi]

Flexible ingest framework: A scalable architecture for dynamic routing through composable pipelinesAlexei Samoylov, Jason Schlachter. 2843-2845 [doi]

A scalable solution for group feature selectionPriya Govindan, Ruobing Chen, Katya Scheinberg, Soundararajan Srinivasan. 2846-2848 [doi]

Genetic deep neural networks using different activation functions for financial data miningLuna M. Zhang. 2849-2851 [doi]

Performance of graph reconstruction method for large-scale web graph analysisRyota Takei, Ayahiko Niimi. 2852-2854 [doi]

Low latency analytics for streaming traffic data with Apache SparkAltti Ilari Maarala, Mika Rautiainen, Miikka Salmi, Susanna Pirttikangas, Jukka Riekki. 2855-2858 [doi]

How to make money from your information and keep your privacyDivya Rao, Wee Keong Ng. 2859-2861 [doi]

Scheduling of Big Data application workflows in cloud and inter-cloud environmentsB. Kezia Rani, A. Vinaya Babu. 2862-2864 [doi]

Patient-like-mine: A real time, visual analytics tool for clinical decision supportPeter Li, Simon N. Yates, Jenna K. Lovely, David W. Larson. 2865-2867 [doi]

A pricing mechanism using social media and web data to infer dynamic consumer valuationsSamuel D. Johnson, Kang-Yu Ni. 2868-2870 [doi]

Efficient keyword search on graphs using MapReduceYifan Hao, Huiping Cao, Yan Qi 0002, Chuan Hu, Sukumar Brahma, Jingyu Han. 2871-2873 [doi]

Non-blocking one-phase commit made possible for distributed transactions over replicated dataYuqing Zhu. 2874-2876 [doi]

A large scale examination of vehicle recorder data to understand relationship between drivers' behaviors and their past driving historiesDaisaku Yokoyama, Masashi Toyoda. 2877-2879 [doi]

Online pattern mining for high-dimensional data streamsYoshitaka Yamamoto, Koji Iwanuma. 2880-2882 [doi]

Modeling the learning behaviors of massive open online coursesZhenhui Liu, Jingjing He, Yufei Xue, Zhenzhong Huang, Manli Li, Zhihui Du. 2883-2885 [doi]

Data confidentiality challenges in big data applicationsJian Yin, Dongfang Zhao. 2886-2888 [doi]

Factorization machines with follow-the-regularized-leader for CTR prediction in display advertisingAnh-Phuong Ta. 2889-2891 [doi]

Taxi trip time prediction using similar trips and road network dataAakash Deep Singh, Wei Wu, Shili Xiang, Shonali Krishnaswamy. 2892-2894 [doi]

Using Word2Vec to process big text dataLong Ma, Yanqing Zhang. 2895-2897 [doi]

Inferring bike trip patterns from bike sharing system open dataLongbiao Chen, Jérémie Jakubowicz. 2898-2900 [doi]

MHT: A light-weight scalable zero-hop MPI enabled distributed key-value storeXiaobing Zhou, Tonglin Li, Ke Wang, Dongfang Zhao, Iman Sadooghi, Ioan Raicu. 2901-2903 [doi]

Big Data: Cloud computing in genomics applicationsHangu Yeo, Catherine H. Crawford. 2904-2906 [doi]

Integrating semantic knowledge into Tag-LDA model through cloud modelMaoyuan Zhang, Fang Yuan, Jianping Zhu. 2907-2909 [doi]

A case study to apply mobile technology into individual's local communityYunkai Liu, Christopher Magno. 2910-2912 [doi]

Clairvoyant-push: A real-time news personalized push notifier using topic modeling and social scoring for enhanced reader engagementBiying Tan, Sangaralingam Kajanan, Vivek Kumar Singh, Chandra Sekhar Saripaka, Giuseppe Manai. 2913-2915 [doi]

Using probabilistic approach to joint clustering and statistical inference: Analytics for big investment dataHua Fang, Honggang Wang, Chonggang Wang, Mahmoud Daneshmand. 2916-2918 [doi]

Towards a subgraph/supergraph cached query-graph indexJing Wang, Nikos Ntarmos, Peter Triantafillou. 2919-2921 [doi]

30 Day hospital readmission analysisRatna Madhuri Maddipatla, Mirsad Hadzikadic, Dipti Patel Misra, Lixia Yao. 2922-2924 [doi]

Using pairwise difference features to measure temporal changes in the microbial ecologyM. Yazdani, L. Smarr. 2925-2927 [doi]

A timeline visualization system for road traffic big dataArdi Imawan, Joonho Kwon. 2928-2929 [doi]

A new area tourist ranking methodGaël Chareyron, Bérengère Branchet, Sebastien Jacquot. 2930-2932 [doi]

Text retrieval based on the feature conversion of vector spaceMaoyuan Zhang, Jianping Zhu, Lijun Hua, Fang Yuan. 2933-2935 [doi]

Big data gathering and mining pipelines for CRM using open-sourceKang Li, Vinay Deolalikar, Neeraj Pradhan. 2936-2938 [doi]

Unified framework for clinical data analytics (U-CDA)Jay Gholap, Vandana P. Janeja, Yelena Yesha. 2939-2941 [doi]

A novel initialization method for particle swarm optimization-based FCM in big biomedical dataChanpaul Jin Wang, Hua Fang, Chonggang Wang, Mahmoud Daneshmand, Honggang Wang. 2942-2944 [doi]

Algorithmic content generation for productsChandra Khatri, Suman Voleti, Sathish Veeraraghavan, Nish Parikh, Atiq Islam, Shifa Mahmood, Neeraj Garg, Vivek Singh. 2945-2947 [doi]

Hotspots of news articles: Joint mining of news text & social media to discover controversial points in newsIsmini Lourentzou, Graham Dyer, Abhishek Sharma, ChengXiang Zhai. 2948-2950 [doi]

Improving the quality of semantic relationships extracted from massive user behavioral dataKhalifeh AlJadda, Mohammed Korayem, Trey Grainger. 2951-2953 [doi]

Analysis of star ratings in consumer reviews: A case study of YelpMaruthi Prithivirajan, Vivian Lai, Kyong Jin Shim, Koo Ping Shung. 2954-2956 [doi]

From stars to patients: Lessons from space science and astrophysics for health care informaticsS. George Djorgovski, Ashish Mahabal, Daniel J. Crichton, B. Chaudhry. 2957-2959 [doi]
