Proceedings of the 2013 IEEE International Conference on Big Data, 6-9 October 2013, Santa Clara, CA, USA

Xiaohua Hu, Tsau Young Lin, Vijay Raghavan, Benjamin W. Wah, Ricardo A. Baeza-Yates, Geoffrey Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang, Rayid Ghani, Wei Fan, Ronny Lempel, Raghunath Nambiar, editors, Proceedings of the 2013 IEEE International Conference on Big Data, 6-9 October 2013, Santa Clara, CA, USA. IEEE, 2013.

Optimizing a MapReduce module of preprocessing high-throughput DNA sequencing data. Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee, Jan-Ming Ho. 1-6

Knowledge cubes - A proposal for scalable and semantically-guided management of Big Data. Amgad Madkour, Walid G. Aref, Saleh Basalamah. 1-7

Enterprise pre-sales forums: A preliminary study of metadata and content. Vinay Deolalikar. 1-4

A big data analytics framework for scientific data management. Sandro Fiore, Cosimo Palazzo, Alessandro D'Anca, Ian T. Foster, Dean N. Williams, Giovanni Aloisio. 1-8

Re-projection of terabyte-sized images. Peter Bajcsy, Antoine Vandecreme, Mary Brady. 1

The Code rebalancing problem for a storage-flexible Data Center Network. Iryna Andriyanova, Alan Jule, Emina Soljanin. 1-6

Managing massive graphs in relational DBMS. Ruiwen Chen. 1-8

Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization. Mohammadreza Babaee, Mihai Datcu, Gerhard Rigoll. 1-6

Fast solution of load shedding problems via a sequence of linear programs. Harish S. Bhat, Garnet J. Vaz, Juan C. Meza. 1-6

Modeling and querying data in NoSQL databases. Karamjit Kaur, Rinkle Rani. 1-7

th century English books. Alberto Acerbi, Vasileios Lampos, R. Alexander Bentley. 1-8

Dynamic reduction of query result sets for interactive visualization. Leilani Battle, Michael Stonebraker, Remco Chang. 1-8

The Microsoft Academic Search challenges at KDD Cup 2013. Martine De Cock, Senjuti Basu Roy, Swapna Savvana, Vani Mandava, Brian Dalessandro, Claudia Perlich, William Cukierski, Benjamin Hamner. 1-4

Lung transplant outcome prediction using UNOS data. Ankit Agrawal, Reda Al-Bahrani, Mark J. Russo, Jaishankar Raman, Alok N. Choudhary. 1-8

Tile based visual analytics for Twitter big data exploratory analysis. Daniel Cheng, Peter Schretlen, Nathan Kronenfeld, Neil Bozowsky, William Wright. 2-4

Bibliometric-enhanced retrieval models for big scholarly information systems. Philipp Mayr, Peter Mutschke. 5-8

Advancing value creation and value capture in data-intensive contexts. Roman Ferrando-Llopis, David López-Berzosa, Catherine Mulligan. 5-9

Optimizing queries over semantically integrated datasets on MapReduce platforms. HyeongSik Kim, Kemafor Anyanwu. 5-6

On-line learning gossip algorithm in multi-agent systems with local decision rules. Pascal Bianchi, Stéphan Clémençon, Gemma Morral, Jérémie Jakubowicz. 6-14

Secure Decoupled Linkage (SDLink) system for building a social genome. Hye-Chung Kum, Ashok Krishnamurthy, Darshana Pathak, Michael K. Reiter, Stanley C. Ahalt. 7-11

suvfs: A virtual file system in userspace that supports large files. Wasim Ahmad Bhat, S. M. K. Quadri. 7-11

Hierarchical feature learning from sensorial data by spherical clustering. Bonny Banerjee, Jayanta K. Dutta. 7-13

Hash in a flash: Hash tables for flash devices. Tyler Clemons, S. M. Faisal, Shirish Tatikonda, Charu C. Aggarwal, Srinivasan Parthasarathy. 7-14

Alarm prediction in large-scale sensor networks - A case study in railroad. Hongfei Li, Buyue Qian, Dhaivat Parikh, Arun Hampapur. 7-14

Elastic data partitioning for cloud-based SQL processing systems. Lipyeow Lim. 8-16

Searching inter-disciplinary scientific big data based on latent correlation analysis. Eloy Gonzales, Bun Theang Ong, Koji Zettsu. 9-12

Overplotting: Unified solutions under Abstract Rendering. Joseph A. Cottam, Andrew Lumsdaine, Peter Wang. 9-16

Colon cancer survival prediction using ensemble data mining on SEER data. Reda Al-Bahrani, Ankit Agrawal, Alok N. Choudhary. 9-16

Academic publishing as a social media paradigm. Michael E. Payne, Linh Bao Ngo, Amy W. Apon. 9-12

VisualPage: Towards large scale analysis of nineteenth-century print culture. Neal Audenaert, Natalie M. Houston. 9-16

A distributed approach for graph-oriented multidimensional analysis. Benoît Denis, Amine Ghrab, Sabri Skhiri. 9-16

A cloud service for the evaluation of company's financial health using XBRL-based financial statements. Wen-Chiao Hsu, Jyun-Yao Huang, Chi-Hao Chen, Chien-Yu Su, Hsiao-Chen Shih, Tzu-Ya Liao, I-En Liao. 10-14

Risk adjustment of patient expenditures: A big data analytics approach. Lin Li, Saeed Bagheri, Helena Goote, Asif Hasan, Gregg Hazard. 12-14

Reliability of erasure coded storage systems: A geometric approach. Antonio Campello, Vinay A. Vaishampayan. 12-16

Complete storm identification algorithms from big raw rainfall data using MapReduce framework. Kulsawasd Jitkajornwanich, Upa Gupta, Sakthi Kumaran Shanmuganathan, Ramez Elmasri, Leonidas Fegaras, John McEnery. 13-20

Big spatial data mining. Shuliang Wang, Gangyi Ding, Ming Zhong. 13-21

Efficient learning from explanation of prediction errors in streaming data. Bonny Banerjee, Jayanta K. Dutta. 14-20

Real-time data analysis in ClowdFlows. Janez Kranjc, Vid Podpecan, Nada Lavrac. 15-22

Memory system characterization of big data workloads. Martin Dimitrov, Karthik Kumar, Patrick Lu, Vish Viswanathan, Thomas Willhalm. 15-22

Communication efficient algorithms for fundamental big data problems. Peter Sanders, Sebastian Schlag, Ingo Müller. 15-23

Parallel auto-encoder for efficient outlier detection. Yunlong Ma, Peng Zhang, Yanan Cao, Li Guo. 15-17

MiSTRAL: An architecture for low-latency analytics on MasSive time series. Alice Marascu, Pascal Pompey, Eric Bouillet, Olivier Verscheure, Michael Wurst, Martin Grund, Philippe Cudré-Mauroux. 15-21

Constructing E-Tourism platform based on service value broker: A knowledge management perspective. Yucong Duan, Yongzhi Wang, Jinpeng Wei, Ajay Kattepur, Wencai Du. 17-24

A look at challenges and opportunities of Big Data analytics in healthcare. Raghunath Nambiar, Ruchie Bhardwaj, Adhiraaj Sethi, Rajesh Vargheese. 17-22

Parallel SECONDO: Practical and efficient mobility data processing in the cloud. Jiamin Lu, Ralf Hartmut Güting. 17-25

Typograph: Multiscale spatial exploration of text documents. Alex Endert, Russ Burtner, Nick Cramer, Ralph Perko, Shawn Hampton, Kristin Cook. 17-24

Back to our data - Experiments with NoSQL technologies in the Humanities. Tobias Blanke, Michael Bryant, Mark Hedges. 17-20

Distributed storage evaluation on a three-wide inter-data center deployment. Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian, Vinay A. Vaishampayan. 17-22

New factors for identifying influential bloggers. Teng-Sheng Moh, SivaNaga Prasad Shola. 18-27

The human face of crowdsourcing: A citizen-led crowdsourcing case study. Sheryl Grant, Richard Marciano, Priscilla Ndiaye, Kristan E. Shawgo, Jeff Heard. 21-24

Distributed Pivot Clustering with arbitrary distance functions. Karl Branting. 21-27

A scalable data analysis platform for metagenomics. Wei Tang, Jared Wilkening, Narayan Desai, Wolfgang Gerlach, Andreas Wilke, Folker Meyer. 21-26

Yellow cabs as red corpuscles. Timothy H. Savage, Huy T. Vo. 22-28

Performance evaluation of R with Intel Xeon Phi coprocessor. Yaakoub El Khamra, Niall Gaffney, David Walling, Eric Wernert, Weijia Xu, Hui Zhang. 23-30

Multidimensional analysis of fetal growth curves. Mario A. Bochicchio, Antonella Longo, Lucia Vaira, Antonio Malvasi, Andrea Tinelli. 23-28

Paired-replicas with constant repair time: Loss functions and memorylessness. Vinay Deolalikar. 23-27

3tch: Privacy and knowledge: 'Dynamic networked collective intelligence'. Udo Kroon. 23-31

Map-based graph analysis on MapReduce. Upa Gupta, Leonidas Fegaras. 24-30

ADraw: A novel social network visualization tool with attribute-based layout and coloring. Zhenwen Wang, Weidong Xiao, Bin Ge, Hao Xu. 25-32

Visualization and rhetoric: Key concerns for utilizing big data in humanities research: A case study of vaccination discourses: 1918-1919. Kathleen Kerr, Bernice L. Hausman, Samah Gad, Waqas Javen. 25-32

VisReduce: Fast and responsive incremental information visualization of large datasets. Jean-Francois Im, Felix Giguere Villegas, Michael J. McGuffin. 25-32

Index-based join operations in Hive. Mahsa Mofidpoor, Nematollaah Shiri, T. Radhakrishnan. 26-33

Rethinking data management for big data scientific workflows. Karan Vahi, Mats Rynge, Gideon Juve, Rajiv Mayani, Ewa Deelman. 27-35

Nearest neighbor classification using bottom-k sketches. Soren Dahlgaard, Christian Igel, Mikkel Thorup. 28-34

A scalable infrastructure of interactive evolutionary computation to evolve services online with data. Masaharu Munetomo, Shintaro Bando. 28

Efficient updates in cross-object erasure-coded storage systems. Kyumars Sheykh Esmaili, Aatish Chiniah, Anwitaman Datta. 28-32

Scalable prediction of energy consumption using incremental time series clustering. Yogesh Simmhan, Muhammad Usman Noor. 29-36

OWL reasoning over big biomedical data. Xi Chen, Huajun Chen, Ningyu Zhang, Jiaoyan Chen, Zhaohui Wu. 29-36

Big data for business managers - Bridging the gap between potential and value. Anmol Rajpurohit. 29-31

The implications from benchmarking three big data systems. Jing Quan, Yingjie Shi, Ming Zhao, Wei Yang. 31-38

P-DOT: A model of computation for big data. Tao Luo, Yin Liao, Guoliang Chen, Yunquan Zhang. 31-37

Granularity-based temporal data mining in hospital information system. Shusaku Tsumoto, Shoji Hirano, Haruko Iwata. 32-40

Business model canvas perspective on big data applications. F. Canari Pembe Muhtaroglu, Seniz Demir, Murat Obali, Canan Girgin. 32-37

IntegrityMR: Integrity assurance framework for big data analytics and management applications. Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du. 33-40

Humanities 'big data': Myths, challenges, and lessons. Amalia S. Levi. 33-36

Construction of exact-BASIC codes for distributed storage systems at the MSR point. Hanxu Hou, Kenneth W. Shum, Hui Li. 33-38

A system for large-scale visualization of streaming Doppler data. Peter Kristof, Bedrich Benes, Carol X. Song, Lan Zhao. 33-40

SLA data management criteria. Katerina Stamou, Verena Kantere, Jean-Henry Morin. 34-42

Feature selection strategies for classifying high dimensional astronomical data sets. Ciro Donalek, S. George Djorgovski, Ashish Mahabal, Matthew J. Graham, Andrew J. Drake, Arun Kumar A., N. Sajeeth Philip, Thomas J. Fuchs, Michael J. Turmon, Michael Ting-Chang Yang, Giuseppe Longo. 35-41

SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop. Pengfei Xuan, Yueli Zheng, Sapna Sarupria, Amy W. Apon. 36-44

KUChemBio: A repository of computational chemical biology data sets. Aaron Smalter Hall, Jun Huan. 37-42

Digging into human rights violations: Data modelling and collective memory. Ben Miller, Ayush Shrestha, Jason Derby, Jennifer Olive, Karthikeyan Umapathy, Fuxin Li, Yanjun Zhao. 37-45

A big data driven model for taxi drivers' airport pick-up decisions in New York City. M. Anil Yazici, Camille Kamga, Abhishek Singhal. 37-44

Transparent composite model for large scale image/video processing. En-Hui Yang, Xiang Yu. 38-44

Understanding the value of (big) data. Pantelis Koutroumpis, Aija Leiponen. 38-42

A performance evaluation of Hive for scientific data management. Taoying Liu, Jing Liu, Hong Liu, Wei Li. 39-46

Minimum storage BASIC codes: A system perspective. Xianxia Huang, Hui Li, Tai Zhou, Yumeng Zhang, Han Guo, Hanxu Hou, Huayu Zhang, Kai Lei. 39-43

Observation of Matthew Effects in Sina Weibo microblogger. Mengmeng Yang, Yi Zhou, Qu Zhou, Kai Chen, Jianhua He, Xiaokang Yang. 41-43

Visualization of streaming data: Observing change and context in information visualization techniques. Milos Krstajic, Daniel A. Keim. 41-47

Local join optimization over a heterogeneously distributed scientific database. Helen X. Xiang. 41-45

How data partitioning strategies and subset size influence the performance of an ensemble? Majed Farrash, Wenjia Wang. 42-49

Parallel and memory-efficient Burrows-Wheeler transform. Shinya Hayashi, Kenjiro Taura. 43-50

OpenFridge: A platform for data economy for energy efficiency data. Slobodanka Dana Kathrin Tomic, Anna Fensel. 43-47

Layout-aware I/O Scheduling for terabits data movement. Youngjae Kim, Scott Atchley, Geoffroy Vallée, Galen M. Shipman. 44-51

A framework of spatial co-location mining on MapReduce. Jin Soung Yoo, Douglas Boulware. 44

Elastic algorithms for guaranteeing quality monotonicity in big data mining. Rui Han, Lei Nie, Moustafa Ghanem, Yike Guo. 45-50

Access control for big data using data content. Wenrong Zeng, Yuhao Yang, Bo Luo. 45-47

The royal birth of 2013: Analysing and visualising public sentiment in the UK using Twitter. Vu Dung Nguyen, Blesson Varghese, Adam Barker. 46-54

Core-based community evolution in mobile social networks. Hao Xu, Weidong Xiao, Daquan Tang, Jiuyang Tang, Zhenwen Wang. 46-51

Evaluating task scheduling in hadoop-based cloud systems. Shengyuan Liu, Jungang Xu, Zongzhen Liu, Xu Liu. 47-53

CompactMap: A mental map preserving visual interface for streaming text data. Xiaotong Liu, Yifan Hu, Stephen C. North, Han-Wei Shen. 48-55

A study of innovation network database Construction by using big data and an enterprise strategy model. Zhou Wen, Ye Shu-Tao, Lu Xiao-Long. 48-52

Fast Change Point Detection for electricity market analysis. William Gu, Jaesik Choi, Ming Gu, Horst D. Simon, Kesheng Wu. 50-57

HFSP: Size-based scheduling for Hadoop. Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, Pietro Michiardi. 51-59

Content-based assessment of the credibility of online healthcare information. Meeyoung Park, Hariprasad Sampathkumar, Bo Luo, Xue-wen Chen. 51-58

Super-sequence frequent pattern mining on sequential dataset. Xinran Yu, Turgay Korkmaz. 52-59

Enhanced user data privacy with pay-by-data model. Chao Wu, Yike Guo. 53-57

Efficient near-duplicate document detection using FPGAs. Xi Luo, Walid A. Najjar, Vagelis Hristidis. 54-61

Bibliographic records as humanities big data. Andrew Prescott. 55-58

Egocentric storylines for visual analysis of large dynamic graphs. Chris Muelder, Tarik Crnovrsanin, Arnaud Sallaberry, Kwan-Liu Ma. 56-62

Query optimization over a heterogeneously distributed scientific database. Helen X. Xiang. 58-64

A novel integrated method for human multiplex protein subcellular localization prediction. Hong Gu, Junzhe Cao. 58-62

BIG DATA infrastructures for pharmaceutical research. Christian Seebode, Matthias Ort, Christian Regenbrecht, Martin Peuker. 59-63

Customising geoparsing and georeferencing for historical texts. C. J. Rupp, Paul Rayson, Alistair Baron, Christopher Donaldson, Ian N. Gregory, Andrew Hardie, Patricia Murrieta-Flores. 59-62

Exploring big data in small forms: A multi-layered knowledge extraction of social networks. Yun Wei Zhao, Willem-Jan van den Heuvel, Xiaojun Ye. 60-67

An evaluation study of BigData frameworks for graph processing. Benedikt Elser, Alberto Montresor. 60-67

Workload-aware aggregate maintenance in columnar in-memory databases. Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner. 62-69

Learning from multiple data sets with different missing attributes and privacy policies: Parallel distributed fuzzy genetics-based machine learning approach. Hisao Ishibuchi, Masakazu Yamane, Yusuke Nojima. 63-70

A concept of Generic Workspace for Big Data Processing in Humanities. Jedrzej Rybicki, Benedikt von St. Vieth, Daniel Mallmann. 63-70

GPU-accelerated incremental correlation clustering of large data with visual feedback. Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Dan Imre, Klaus Mueller. 63-70

Big data solutions for predicting risk-of-readmission for congestive heart failure patients. Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin, Brian Muckian. 64-71

Enterprise data economy: A hadoop-driven model and strategy. Wuheng Luo. 65-70

Storing and manipulating environmental big data with JASMIN. B. N. Lawrence, V. L. Bennett, J. Churchill, M. Juckes, Philip Kershaw, Stephen Pascoe, Sam Pepler, M. Pritchard, A. Stephens. 68-75

Provenance comparison for large-scale knowledge discovery. Xiang Zhao, Bin Ge, Jiuyang Tang, Weidong Xiao, Haichuan Shang. 68-75

Virtualization I/O optimization based on shared memory. Fengfeng Ning, Chuliang Weng, Yuan Luo. 70-77

From assets to stories via the Google Cultural Institute Platform. W. Brent Seales, Steve Crossan, Mark Yoshitake, Sertan Girgin. 71-76

Data chaos: An entropy based MapReduce framework for scalable learning. Jiaoyan Chen, Huajun Chen, Xi Chen, Guozhou Zheng, Zhaohui Wu. 71-78

Visualization of big SPH simulations via compressed octree grids. Florian Reichl, Marc Treib, Rüdiger Westermann. 71-78

Efficient gear-shifting for a power-proportional distributed data-placement method. Hieu Hanh Le, Satoshi Hikida, Haruo Yokota. 76-84

The curious identity of Michael Field and its implications for humanities research with the semantic web. Susan Brown, John Simpson. 77-85

An ensemble MIC-based approach for performance diagnosis in big data platform. Pengfei Chen, Yong Qi, Xinyi Li, Li Su. 78-85

A novel visual analytics approach for clustering large-scale social data. Zhangye Wang, Chang Chen, Juanxia Zhou, Jiyuan Liao, Wei Chen, Ross Maciejewski. 79-86

Exploring sketches for probability estimation with sublinear memory. Anthony Kleerekoper, Mikel Luján, Gavin Brown. 79-86

Agrios: A hybrid approach to big array analytics. Patrick Leyshock, David Maier, Kristin Tufte. 85-93

A reconfigurable stream compression hardware based on static symbol-lookup table. Shinichi Yamagiwa, Hiroshi Sakamoto. 86-93

Infectious texts: Modeling text reuse in nineteenth-century newspapers. David A. Smith, Ryan Cordell, Elizabeth Maddock Dillon. 86-94

DriveSense: Contextual handling of large-scale route map data for the automobile. Frederik Wiehr, Vidya Setlur, Alark Joshi. 87-94

Agglomerative co-clustering for synonymous phrases based on common effects and influences. Koji Kumanami, Kazuhiro Seki, Kuniaki Uehara. 87-94

NativeTask: A Hadoop compatible framework for high performance. Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian, Zhongliang Zhu, Weihua Jiang, Gansha Wu. 94-101

Building a generic platform for big sensor data application. Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis, Yang Li, Shulin Yan, Moustafa Ghanem, Yike Guo. 94-102

Leveraging memory mapping for fast and scalable graph computation on a PC. Zhiyuan Lin, Duen Horng (Polo) Chau, U. Kang. 95-98

Mapping mutable genres in structurally complex volumes. Ted Underwood, Michael L. Black, Loretta Auvil, Boris Capitanu. 95-103

Scalable sentiment classification for Big Data analysis using Naïve Bayes Classifier. Bingwei Liu, Erik Blasch, Yu Chen, Dan Shen, Genshe Chen. 99-104

On mixing high-speed updates and in-memory queries: A big-data architecture for real-time analytics. Tao Zhong, Kshitij A. Doshi, Xi Tang, Ting Lou, Zhongyan Lu, Hong Li. 102-109

Locality-driven high-level I/O aggregation for processing scientific datasets. Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen. 103-111

CKM: A shared visual analytical tool for large-scale analysis of audio-video interviews. Lu Xiao, Yan Luo, Steven High. 104-112

Meta-learning for large scale machine learning with MapReduce. Xuan Liu, Xiaoguang Wang, Stan Matwin, Nathalie Japkowicz. 105-110

AxPUE: Application level metrics for power usage effectiveness in data centers. Runlin Zhou, Yingjie Shi, Chunge Zhu. 110-117

Frequent Itemset Mining for Big Data. Sandy Moens, Emin Aksehirli, Bart Goethals. 111-118

clusiVAT: A mixed visual/numerical clustering algorithm for big data. Dheeraj Kumar, Marimuthu Palaniswami, Sutharshan Rajasegarar, Christopher Leckie, James C. Bezdek, Timothy C. Havens. 112-117

A case study on entity Resolution for Distant Processing of big Humanities data. Weijia Xu, Maria Esteva, Jessica Trelogan, Todd Swinson. 113-120

Hardware acceleration of Hadoop MapReduce. Toshimori Honjo, Kazuki Oikawa. 118-124

A characterization of big data benchmarks. Wen Xiong, Zhibin Yu, Zhendong Bei, Juanjuan Zhao, Fan Zhang, Yubin Zou, Xue Bai, Ye Li, Cheng-Zhong Xu. 118-125

Evaluating parallel logistic regression models. Haoruo Peng, Ding Liang, Cyrus Choi. 119-126

Optimizing the MapReduce framework on Intel Xeon Phi coprocessor. Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. 125-130

Approximate triangle counting algorithms on multi-cores. Mahmudur Rahman, Mohammad Al Hasan. 127-133

On the performance and energy efficiency of Hadoop deployment models. Eugen Feller, Lavanya Ramakrishnan, Christine Morin. 131-136

Tree Labeled LDA: A Hierarchical model for web summaries. Anton Slutsky, Xiaohua Hu, Yuan An. 134-140

Optimizing throughput on guaranteed-bandwidth WAN networks for the Large Synoptic Survey Telescope (LSST). D. Michael Freemon. 137-142

Nearest neighbour regression outperforms model-based prediction of specific star formation rate. Kristoffer Stensbo-Smidt, Christian Igel, Andrew Zirm, Kim Steenstrup Pedersen. 141-144

Feliss: Flexible distributed computing framework with light-weight checkpointing. Takuya Araki, Kazuyo Narita, Hiroshi Tamano. 143-149

MapReduce implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm. Naveen C. Tewari, Hari M. Koduvely, Sarbendu Guha, Arun Yadav, Gladbin David. 145-152

Algebraic dataflows for big data analysis. Jonas Dias, Eduardo S. Ogasawara, Daniel de Oliveira, Fabio Porto, Patrick Valduriez, Marta Mattoso. 150-155

A unified framework for predicting attributes and links in social networks. Xusen Yin, Bin Wu, Xiuqin Lin. 153-160

Scalable and robust key group size estimation for reducer load balancing in MapReduce. Wei Yan, Yuan Xue, Bradley Malin. 156-162

Scalable approximation of kernel fuzzy c-means. Zijian Zhang, Timothy C. Havens. 161-168

Robot: An efficient model for big data storage systems based on erasure coding. Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, Changlin Long, Wenjuan Bi. 163-168

Large-scale restricted boltzmann machines on single GPU. Yun Zhu, Yanqing Zhang, Yi Pan. 169-174

Multilevel Active Storage for big data applications in high performance computing. Chao Chen, Michael Lang, Yong Chen. 169-174

GPU accelerated item-based collaborative filtering for big-data applications. Chandima Hewa Nadungodage, Yuni Xia, Jaehwan John Lee, Myungcheol Lee, Choon Seo Park. 175-180

GPU-accelerated adaptive compression framework for genomics data. GuiXin Guo, Shuang Qiu, Zhiqiang Ye, Bingqiang Wang, Lin Fang, Mian Lu, Simon See, Rui Mao. 181-186

An infrastructure for automating large-scale performance studies and data processing. Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, Calton Pu. 187-192

Kylin: An efficient and scalable graph data processing system. Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, Pangfeng Liu. 193-198

Towards hybrid online on-demand querying of realtime data with stateful complex event processing. Qunzhi Zhou, Yogesh Simmhan, Viktor K. Prasanna. 199-205

DDSN: Duplicate detection to reduce both storage and bandwidth consumption. Jiaran Zhang, Xiaohui Yu, Yang Liu, Liwei Lin. 206-211

A reconfigurable computing architecture for semantic information filtering. Aalap Tripathy, Ka Chon Ieong, Atish Patra, Rabi N. Mahapatra. 212-218

Iteration aware prefetching for unstructured grids. Oyindamola O. Akande, Philip J. Rhodes. 219-227

Measuring inter-site engagement. Elad Yom-Tov, Mounia Lalmas, Ricardo A. Baeza-Yates, Georges Dupret, Janette Lehmann, Pinar Donmez. 228-236

A selective checkpointing mechanism for query plans in a parallel database system. Ting Chen, Kenjiro Taura. 237-245

CORE: Cross-object redundancy for efficient data repair in storage systems. Kyumars Sheykh Esmaili, Lluis Pamies-Juarez, Anwitaman Datta. 246-254

H2RDF+: High-performance distributed joins over large-scale RDF graphs. Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos, Panagiotis Karras, Nectarios Koziris. 255-263

Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. Austin R. Benson, David F. Gleich, James Demmel. 264-272

Adaptive file management for scientific workflows on the Azure cloud. Radu Tudoran, Alexandru Costan, Ramin Rezai Rad, Goetz Brasche, Gabriel Antoniu. 273-281

Model-view sensor data management in the cloud. Tian Guo, Thanasis G. Papaioannou, Karl Aberer. 282-290

Spatio-temporal indexing in non-relational distributed databases. Anthony Fox, Chris Eichelberger, James Hughes, Skylar Lyon. 291-299

Scientific discovery through weighted sampling. Lefteris Sidirourgos, Martin L. Kersten, Peter A. Boncz. 300-306

Scalable data citation in dynamic, large databases: Model and reference implementation. Stefan Pröll, Andreas Rauber. 307-312

On the use of shared storage in shared-nothing environments. Krish K. R., Aleksandr Khasymski, Guanying Wang, Ali Raza Butt, Gaurav Makkar. 313-318

Self-adaptive event recognition for intelligent transport management. Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, Dimitrios Gunopulos. 319-325

Improving floating point compression through binary masks. Leonardo Arturo Bautista Gomez, Franck Cappello. 326-331

Using pattern-models to guide SSD deployment for Big Data applications in HPC systems. Junjie Chen, Philip C. Roth, Yong Chen. 332-337

Robust crowdsourced learning. Zhiquan Liu, Luo Luo, Wu-Jun Li. 338-343

Segmented analysis for reducing data movement. Jialin Liu, Surendra Byna, Yong Chen. 344-349

Continuous hyperparameter optimization for large-scale recommender systems. Simon Chan, Philip C. Treleaven, Licia Capra. 350-358

4S: Scalable subspace search scheme overcoming traditional Apriori processing. Hoang Vu Nguyen, Emmanuel Müller, Klemens Böhm. 359-367

Computing betweenness centrality in external memory. Lars Arge, Michael T. Goodrich, Freek van Walderveen. 368-375

A parallel computing platform for training large scale neural networks. Rong Gu, Furao Shen, Yihua Huang. 376-384

Self-tuned kernel spectral clustering for large scale networks. Raghvendra Mall, Rocco Langone, Johan A. K. Suykens. 385-393

NUMA-optimized parallel breadth-first search on multicore single-node system. Yuichiro Yasui, Katsuki Fujisawa, Kazushige Goto. 394-402

A distributed vertex-centric approach for pattern matching in massive graphs. Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, Matthew Saltz. 403-411

Fast scalable selection algorithms for large scale data. Lee Parnell Thompson, Weijia Xu, Daniel P. Miranker. 412-420

An NML-based model selection criterion for general relational data modeling. Yoshiki Sakai, Kenji Yamanishi. 421-429

Parallel matrix factorization for binary response. Rajiv Khanna, Liang Zhang, Deepak Agarwal, Bee-Chung Chen. 430-438

CallCab: A unified recommendation system for carpooling and regular taxicab services. Desheng Zhang, Tian He, Yunhuai Liu, John A. Stankovic. 439-447

Top-K aggregation over a large graph using shared-nothing systems. Abhirup Chakraborty. 448-457

Distributed confidence-weighted classification on MapReduce. Nemanja Djuric, Mihajlo Grbovic, Slobodan Vucetic. 458-466

Scalable context-aware role mining with MapReduce. Zhiwei Yu, Raymond K. Wong, Chi-Hung Chi. 467-474

Elver: Recommending Facebook pages in cold start situation without content features. Yusheng Xie, Zhengzhang Chen, Kunpeng Zhang, Chen Jin, Yu Cheng, Ankit Agrawal, Alok N. Choudhary. 475-479

Massively scalable near duplicate detection in streams of documents using MDSH. Paul Logasa Bogen, Christopher T. Symons, Amber McKenzie, Robert M. Patton, Robert E. Gillen. 480-486

Incremental algorithms for closeness centrality. Ahmet Erdem Sariyüce, Kamer Kaya, Erik Saule, Ümit V. Çatalyürek. 487-492

Classification of big velocity data via cross-domain Canonical Correlation Analysis. Bo Zhang, Zhongzhi Shi. 493-498

A distributed tree data structure for real-time OLAP on cloud architectures. Frank K. H. A. Dehne, Q. Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, R. Zhou. 499-505

DL-MPI: Enabling data locality computation for MPI-based data-intensive applications. Jiangling Yin, Andrew Foran, Jun Wang. 506-511

Sparse Poisson coding for high dimensional document clustering. Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, Michael R. Lyu. 512-517

Fast OLAP query execution in main memory on large data in a cluster. Martin Weidner, Jonathan Dees, Peter Sanders. 518-524

Group-Scheme: SIMD-based compression algorithms for web text data. Xudong Zhang, Wayne Xin Zhao, Dongdong Shan, Hongfei Yan. 525-530

Efficient large graph pattern mining for big data in the cloud. Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, Ming-Syan Chen. 531-536

A stream partitioning approach to processing large scale distributed graph datasets. Rui Wang, Kenneth Chiu. 537-542

Scalable distributed event detection for Twitter. Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, Sasa Petrovic. 543-549

Analysis of GSM calls data for understanding user mobility behavior. Barbara Furletti, Lorenzo Gabrielli, Chiara Renso, Salvatore Rinzivillo. 550-555

Scaling concurrency of personalized Semantic search over Large RDF data. Haizhou Fu, HyeongSik Kim, Kemafor Anyanwu. 556-562

A hypergraph-partitioned vertex programming approach for large-scale consensus optimization. Hui Miao, Xiangyang Liu, Bert Huang, Lise Getoor. 563-568

A Higher-order data flow model for heterogeneous Big Data. Simon Price, Peter A. Flach. 569-574

Parallel subgroup discovery on computing clusters - First results. Daniel Trabold, Henrik Grosskreutz. 575-579

DP-WHERE: Differentially private modeling of human mobility. Darakhshan J. Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, Rebecca N. Wright. 580-588

Malicious URL filtering - A big data application. Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, Hsing-Kuo Pao. 589-596

Zero-knowledge private graph summarization. Maryam Shoaran, Alex Thomo, Jens H. Weber-Jahnke. 597-605

Scalable network traffic visualization using compressed graphs. Lei Shi, Qi Liao, Xiaohua Sun, Yarui Chen, Chuang Lin. 606-612

Breaking the Arc: Risk control for Big Data. Duncan Hodges, Sadie Creese. 613-621

The BTWorld use case for big data analytics: Description, MapReduce logical workflow, and empirical evaluation. Tim Hegeman, Bogdan Ghit, Mihai Capota, Jan Hidders, Dick H. J. Epema, Alexandru Iosup. 622-630

Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems. Bin Liu, Haifeng Chen, Abhishek B. Sharma, Guofei Jiang, Hui Xiong. 631-638

Efficiently extracting frequent subgraphs using MapReduce. Wei Lu, Gang Chen, Anthony K. H. Tung, Feng Zhao. 639-647

Explaining the product range effect in purchase data. Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino Pedreschi, Fosca Giannotti. 648-656

Large Scale predictive analytics for real-time energy management. Natasha Balac, Tamara Sipes, Nicole Wolter, Kenneth Nunes, Robert S. Sinkovits, Homa Karimabadi. 657-664

Parallel deterministic annealing clustering and its application to LC-MS data analysis. Geoffrey Fox, D. R. Mani, Saumyadipta Pyne. 665-673

Terabyte-scale image similarity search: Experience and best practice. Diana Moise, Denis Shestakov, Gylfi Þór Gudmundsson, Laurent Amsaleg. 674-682

Demand response targeting using big data analytics. Jungsuk Kwac, Ram Rajagopal. 683-690

HIG - An in-memory database platform enabling real-time analyses of genome data. Matthieu-P. Schapranow, Hasso Plattner. 691-696

Real-time streaming mobility analytics. András Garzó, András A. Benczúr, Csaba István Sidló, Daniel Tahara, Erik Francis Wyatt. 697-702

QuPARA: Query-driven large-scale portfolio aggregate risk analysis on MapReduce. Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, Norbert Zeh. 703-709

Constructing consumer profiles from social media data. Mauricio A. Hernández, Kirsten Hildrum, Prateek Jain, Rohit Wagle, Bogdan Alexe, Rajasekar Krishnamurthy, Ioana Roxana Stanoi, Chitra Venkatramani. 710-716

CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework. Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, Jan-Ming Ho. 717-722

Building dynamic thermal profiles of energy consumption for individuals and neighborhoods. Adrian Albert, Ram Rajagopal. 723-728

Terabyte-sized image computations on Hadoop cluster platforms. Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe Chalfoun, Mary Brady. 729-737

A fast and scalable method for threat detection in large-scale DNS logs. Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach, Roi Saltzman. 738-741

Hourglass: A library for incremental processing on Hadoop. Matthew Hayes, Sam Shah. 742-752

Correlation-based performance analysis for full-system MapReduce optimization. Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, Wentao Tang. 753-761

Large scale ad latency analysis. Mihajlo Grbovic, Jon Malkin, Hirakendu Das. 762-767

Accelerating semantic graph databases on commodity clusters. Alessandro Morari, Vito Giovanni Castellana, David Haglin, John Feo, Jesse Weaver, Antonino Tumeo, Oreste Villa. 768-772

Practical distributed classification using the Alternating Direction Method of Multipliers algorithm. Peter Lubell-Doughtie, Jon Sondag. 773-776

Scaling deep social feeds at Pinterest. Varun Sharma, Jeremy Carroll, Abhi Khune. 777-783

Big data analytics on high Velocity streams: A case study. Thibaud Chardonnens, Philippe Cudré-Mauroux, Martin Grund, Benoit Perroud. 784-787
